Currently, PMP unconfiguration is implemented directly inside
switch_to_next_domain_context(), whereas the rest of the PMP programming
is done via functions implemented in sbi_hart.c.
Introduce a separate sbi_hart_pmp_unconfigure() function so that
all PMP programming is in one place.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20251209135235.423391-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, platforms do not provide complete memory region information
to OpenSBI. Generally, memory regions are only created for the few MMIO
devices that have M-mode drivers. As a result, most MMIO devices fall
inside the default S-mode RWX memory region, which does _not_ have the
MMIO flag set.
In fact, OpenSBI relies on certain S-mode MMIO devices being inside
non-MMIO memory regions. Both fdt_domain_based_fixup_one() and
mpxy_rpmi_sysmis_xfer() call sbi_domain_check_addr() with the MMIO flag
cleared, and that function currently requires an exact flag match. Those
access checks will thus erroneously fail if the platform creates memory
regions with the correct flags for these devices (or for a larger MMIO
region containing these devices).
We should not ignore the MMIO flag entirely, because
sbi_domain_check_addr() is also used to check the permissions of S-mode
shared memory buffers, and S-mode should not be using MMIO device
addresses as memory buffers. But when checking if S-mode is allowed to
do MMIO accesses, we need to recognize that MMIO devices appear in
memory regions both with and without the MMIO flag set.
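
The distinction drawn above can be sketched in C as two separate checks. This is only an illustration: the flag names, the struct, and both helper functions are hypothetical and are not OpenSBI's actual API, and the sketch looks at a single region instead of walking a domain's region list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags, for illustration only (not OpenSBI's definitions). */
#define REGION_READ  (1u << 0)
#define REGION_WRITE (1u << 1)
#define REGION_MMIO  (1u << 2)

struct region {
	uint64_t base;
	uint64_t size;
	uint32_t flags;
};

static bool region_contains(const struct region *r, uint64_t addr)
{
	return addr >= r->base && (addr - r->base) < r->size;
}

/*
 * Shared memory buffer check: the caller wants ordinary memory, so a
 * region marked as MMIO is rejected.
 */
static bool check_buffer_addr(const struct region *r, uint64_t addr,
			      uint32_t perms)
{
	if (!region_contains(r, addr))
		return false;
	if (r->flags & REGION_MMIO)
		return false;
	return (r->flags & perms) == perms;
}

/*
 * MMIO access check: the device may sit in a region with or without the
 * MMIO flag, so only the read/write permissions are compared.
 */
static bool check_mmio_addr(const struct region *r, uint64_t addr,
			    uint32_t perms)
{
	if (!region_contains(r, addr))
		return false;
	return (r->flags & perms) == perms;
}
```

With these helpers, a device address inside a plain RWX region and one inside a region flagged as MMIO both pass check_mmio_addr(), while only the former passes check_buffer_addr().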
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Yu-Chien Peter Lin <peter.lin@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251121193808.1528050-2-samuel.holland@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The QoS Identifiers extension (Ssqosid) introduces the srmcfg register,
which configures a hart with two identifiers: a Resource Control ID
(RCID) and a Monitoring Counter ID (MCID). These identifiers accompany
each request issued by the hart to shared resource controllers.
If extension Smstateen is implemented together with Ssqosid, then
Ssqosid also requires the SRMCFG bit in mstateen0 to be implemented. If
mstateen0.SRMCFG is 0, attempts to access srmcfg in privilege modes less
privileged than M-mode raise an illegal-instruction exception. If
mstateen0.SRMCFG is 1 or if extension Smstateen is not implemented,
attempts to access srmcfg when V=1 raise a virtual-instruction exception.
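
The access rules in the paragraph above can be written down as a small decision function. This is a sketch that models only what is stated here; the enum and function names are hypothetical, U-mode is left out, and it is not code from OpenSBI or from the specification.

```c
#include <stdbool.h>

/* Only M-mode and S-mode are modelled; S-mode with virt == true has V=1. */
enum priv_mode { MODE_M, MODE_S };

enum access_result {
	ACCESS_OK,
	ACCESS_ILLEGAL_INSN,
	ACCESS_VIRTUAL_INSN,
};

static enum access_result srmcfg_access(enum priv_mode mode, bool virt,
					bool has_smstateen, bool srmcfg_bit)
{
	/* M-mode access is not restricted by mstateen0 */
	if (mode == MODE_M)
		return ACCESS_OK;

	/* Smstateen implemented and mstateen0.SRMCFG == 0 */
	if (has_smstateen && !srmcfg_bit)
		return ACCESS_ILLEGAL_INSN;

	/* SRMCFG == 1 or no Smstateen: accesses with V=1 are trapped */
	if (virt)
		return ACCESS_VIRTUAL_INSN;

	return ACCESS_OK;
}
```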
This extension can be found in the RISC-V Instruction Set Manual:
https://github.com/riscv/riscv-isa-manual
Changes in v5:
- Remove SBI_HART_EXT_SSQOSID dependency on SBI_HART_PRIV_VER_1_12
Changes in v4:
- Remove extraneous parentheses around SMSTATEEN0_SRMCFG
Changes in v3:
- Check SBI_HART_EXT_SSQOSID when swapping SRMCFG
Changes in v2:
- Remove trap-n-detect
- Context switch CSR_SRMCFG
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Link: https://lore.kernel.org/r/20251114115722.1831-1-cp0613@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
In sanitize_domain(), the code that checks whether one memory region is
covered by another was never executed. Quote:

	/* Sort the memory regions */
	for (i = 0; i < (count - 1); i++) {
		<snip>
	}

	/* Remove covered regions */
	while(i < (count - 1)) {

Here the "while" loop is never executed because the condition
"i < (count - 1)" is always false after the "for" loop just above.

In addition, when clearing a region, "root_memregs_count" should be
adjusted as well. Otherwise the code that adds a memory region in
"root_add_memregion" will use the wrong position:

	/* Append the memregion to root memregions */
	nreg = &root.regions[root_memregs_count];

An empty entry will be created in the middle of the regions array, and new
regions will be added after this empty entry, while the sanitizing code
will stop when it reaches the empty entry.
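
The loop-index problem can be reproduced in isolation with a short C sketch. The function below is hypothetical and only counts iterations; resetting the index before the second loop is shown as one possible fix, which is an assumption and may differ from the actual patch. It assumes count is at least 1.

```c
/*
 * Returns how many times the body of the second loop runs.
 * Assumes count >= 1 so that (count - 1) does not wrap around.
 */
static unsigned int covered_pass_iterations(unsigned int count,
					    int reset_index)
{
	unsigned int i, iterations = 0;

	/* First pass: stands in for the sorting loop (body elided) */
	for (i = 0; i < (count - 1); i++) {
	}

	/* At this point i == count - 1, so the next condition is false */
	if (reset_index)
		i = 0;

	/* Second pass: stands in for the covered-region removal loop */
	while (i < (count - 1)) {
		iterations++;
		i++;
	}

	return iterations;
}
```

Without the reset, the second loop body never runs for any count; with the reset it runs count - 1 times.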
Fixes: 3b03cdd60c ("lib: sbi: Add regions merging when sanitizing domain region")
Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@mobileye.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251111104327.1170919-2-vladimir.kondratiev@mobileye.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Add fw_smepmp_ids bitmap to track PMP entries that protect firmware
regions. This allows us to preserve these critical entries across domain
transitions and to detect inconsistent firmware entry allocation.
Also add sbi_hart_smepmp_is_fw_region() helper function to query
whether a given SmePMP entry protects firmware regions.
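
A bitmap of this kind can be sketched in a few lines of C. The struct and function names below are hypothetical stand-ins (only fw_smepmp_ids and sbi_hart_smepmp_is_fw_region() are named in this commit), and a single 64-bit word is assumed to be enough because RISC-V defines at most 64 PMP entries.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PMP_ENTRIES 64u

/* Hypothetical tracker: bit n set means PMP entry n protects firmware */
struct fw_pmp_tracker {
	uint64_t fw_ids;
};

static void fw_pmp_mark(struct fw_pmp_tracker *t, unsigned int idx)
{
	if (idx < MAX_PMP_ENTRIES)
		t->fw_ids |= UINT64_C(1) << idx;
}

/* Query helper, analogous in spirit to the one added by this commit */
static bool fw_pmp_is_fw_region(const struct fw_pmp_tracker *t,
				unsigned int idx)
{
	return idx < MAX_PMP_ENTRIES && ((t->fw_ids >> idx) & 1u);
}
```

A domain switch could then skip every index for which fw_pmp_is_fw_region() returns true when it clears PMP entries.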
Signed-off-by: Yu-Chien Peter Lin <peter.lin@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251008084444.3525615-8-peter.lin@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
During domain context switches, all PMP entries are reconfigured,
which can clear firmware access permissions and cause M-mode access
faults under SmePMP.
Sort domain regions to place firmware regions first, ensuring
consistent firmware PMP entries so they won't be revoked during
domain context switches.
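
The ordering idea can be illustrated with a comparator that puts firmware regions ahead of all others. This is a sketch under assumptions: the REGION_FW flag and the struct are hypothetical, qsort() stands in for whatever sort the firmware uses, and the tie-break on base address is chosen here only to make the result deterministic.

```c
#include <stdint.h>
#include <stdlib.h>

#define REGION_FW (1u << 0)	/* hypothetical "firmware region" flag */

struct region {
	uint64_t base;
	uint32_t flags;
};

static int fw_first_cmp(const void *a, const void *b)
{
	const struct region *ra = a, *rb = b;
	int fa = !!(ra->flags & REGION_FW);
	int fb = !!(rb->flags & REGION_FW);

	/* Firmware regions sort before non-firmware regions */
	if (fa != fb)
		return fb - fa;

	/* Within each group, order by base address (an assumption) */
	if (ra->base < rb->base)
		return -1;
	if (ra->base > rb->base)
		return 1;
	return 0;
}

static void sort_fw_first(struct region *regs, size_t n)
{
	qsort(regs, n, sizeof(*regs), fw_first_cmp);
}
```

Because firmware regions always occupy the lowest indices after sorting, they map to the same PMP entries in every domain.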
Signed-off-by: Yu-Chien Peter Lin <peter.lin@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251008084444.3525615-7-peter.lin@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Previously, when the memory regions exceeded the available PMP entries,
some regions were silently ignored. If the last entry that covers
the full 64-bit address space is not added to a domain, the next
stage S-mode software won't have permission to access and fetch
instructions from its memory. So return early with an error message
to catch such a situation.
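
The early-return behaviour described above amounts to a count check before any PMP entry is programmed. The sketch below is illustrative only: the function name and error code are hypothetical, and fprintf() stands in for the firmware's own console output.

```c
#include <stdio.h>

#define ERR_TOO_MANY_REGIONS (-1)	/* hypothetical error code */

static int configure_regions(unsigned int region_count,
			     unsigned int pmp_count)
{
	if (region_count > pmp_count) {
		/* Fail loudly instead of silently dropping regions */
		fprintf(stderr, "%u regions exceed %u PMP entries\n",
			region_count, pmp_count);
		return ERR_TOO_MANY_REGIONS;
	}

	/* Programming of the individual PMP entries is elided here */
	return 0;
}
```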
Signed-off-by: Yu-Chien Peter Lin <peter.lin@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251008084444.3525615-5-peter.lin@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Previously, in sbi_pmu_ctr_cfg_match(), ctr_idx was used immediately
after the pmu_ctr_find_fw() or pmu_ctr_find_hw() call. In the first case,
the array index was (ctr_idx - num_hw_ctrs); in the second, ctr_idx. But
pmu_ctr_find_fw() and pmu_ctr_find_hw() can return a negative value, in
which case writing to arrays with such indexes would corrupt the
sbi_pmu_hart_state structure.
To avoid this, add a direct check of the ctr_idx value.
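
The pattern of checking the index before using it can be sketched as follows. All names, array sizes, and the struct layout are hypothetical and do not mirror sbi_pmu_hart_state; the upper-bound checks are an extra precaution added for this illustration and are not described in the commit.

```c
#define NUM_HW_CTRS 4
#define NUM_FW_CTRS 4

/* Hypothetical stand-in for the per-hart PMU state */
struct pmu_state {
	int hw_events[NUM_HW_CTRS];
	int fw_events[NUM_FW_CTRS];
};

/*
 * ctr_idx is the value returned by a counter search; a negative value
 * is an error code and must never be used as an array index.
 */
static int record_counter(struct pmu_state *s, int ctr_idx, int is_fw,
			  int event)
{
	if (ctr_idx < 0)
		return ctr_idx;	/* propagate the error unchanged */

	if (is_fw) {
		int fw_idx = ctr_idx - NUM_HW_CTRS;

		if (fw_idx < 0 || fw_idx >= NUM_FW_CTRS)
			return -1;
		s->fw_events[fw_idx] = event;
	} else {
		if (ctr_idx >= NUM_HW_CTRS)
			return -1;
		s->hw_events[ctr_idx] = event;
	}

	return ctr_idx;
}
```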
Signed-off-by: Alexander Chuprunov <alexander.chuprunov@syntacore.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250918090706.2217603-4-alexander.chuprunov@syntacore.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Generally, hardware performance counters can only be started, stopped,
or configured from machine-mode using mcountinhibit and mhpmeventX CSRs.
Also, in OpenSBI only sbi_pmu_ctr_cfg_match() manages mhpmeventX. But
in the generic Linux driver, when perf starts, Linux calls both
sbi_pmu_ctr_cfg_match() and sbi_pmu_ctr_start(), while after hart suspend
only sbi_pmu_ctr_start() is called through the SBI interface. This doesn't
work properly when the suspend state resets the HPM registers. To
keep counter integrity, sbi_pmu_ctr_start() is modified. First,
hw_counters_data is saved; after hart suspend, this value is restored if
the event is currently active.
Signed-off-by: Alexander Chuprunov <alexander.chuprunov@syntacore.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250918090706.2217603-2-alexander.chuprunov@syntacore.com
Signed-off-by: Anup Patel <anup@brainfault.org>
A platform can have multiple IPI devices (such as ACLINT MSWI,
AIA IMSIC, etc). Currently, OpenSBI relies on the platform calling
the sbi_ipi_set_device() function in the correct order and prefers
the first available IPI device, which is fragile.
Instead of the above, introduce IPI device rating and prefer
the highest rated IPI device. This further allows extending
the sbi_ipi_raw_clear() to clear all available IPI devices.
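
Rating-based selection can be sketched in C as below. The struct, the registry, the function names, and the rating values used in the usage note are all hypothetical; the "clears" counter merely stands in for a per-device clear callback so the effect is observable.

```c
#include <stddef.h>

#define MAX_IPI_DEVICES 8

/* Hypothetical device descriptor */
struct ipi_device {
	const char *name;
	int rating;		/* higher is preferred */
	unsigned int clears;	/* stands in for a clear callback */
};

static struct ipi_device *ipi_devices[MAX_IPI_DEVICES];
static size_t ipi_device_count;
static struct ipi_device *ipi_best;

static int ipi_add_device(struct ipi_device *dev)
{
	if (!dev || ipi_device_count >= MAX_IPI_DEVICES)
		return -1;

	ipi_devices[ipi_device_count++] = dev;

	/* Registration order no longer matters: keep the highest rating */
	if (!ipi_best || dev->rating > ipi_best->rating)
		ipi_best = dev;

	return 0;
}

static struct ipi_device *ipi_get_device(void)
{
	return ipi_best;
}

/* Clear pending IPIs on every registered device, not just the best one */
static void ipi_clear_all(void)
{
	size_t i;

	for (i = 0; i < ipi_device_count; i++)
		ipi_devices[i]->clears++;
}
```

For example, registering a device with rating 100 and then one with rating 200 makes the second one the preferred device, and the result is the same if the two are registered in the opposite order.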
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Nick Hu <nick.hu@sifive.com>
Link: https://lore.kernel.org/r/20250904052410.546818-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently the heap has a fixed housekeeping factor of 16, which means
1/16 of the heap is reserved for list nodes. But this is not enough when
there are many small allocations; in the worst case, 1/3 of the heap is
needed for list nodes (32 byte heap_node for each 64 byte allocation).
This has caused allocation failures on some platforms.
Let's avoid trying to guess the best ratio. Instead, allocate more nodes
as needed. To avoid recursion, the nodes are permanent allocations. So
to minimize fragmentation, allocate them in small batches from the end
of the last free space node. Bootstrap the free space list by embedding
one node in the heap control struct.
Some error paths are avoided because the nodes are allocated up front.
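
The 1/3 worst case and the shortfall of the fixed 1/16 reserve can be checked with a little arithmetic. The helpers below are hypothetical and only model the ratio; they assume, for illustration, that every allocation consumes its own size plus one node from the same heap.

```c
#include <stddef.h>

/* Bytes of list nodes needed when the heap is filled with equal allocations */
static size_t node_bytes_needed(size_t heap_size, size_t node_size,
				size_t alloc_size)
{
	size_t allocs = heap_size / (alloc_size + node_size);

	return allocs * node_size;
}

/* Number of nodes available when a fixed 1/factor of the heap is reserved */
static size_t nodes_with_fixed_factor(size_t heap_size, size_t factor,
				      size_t node_size)
{
	return heap_size / factor / node_size;
}
```

For a 96 KiB heap with 32-byte nodes and 64-byte allocations, node_bytes_needed() gives 32 KiB, which is one third of the heap and corresponds to 1024 nodes, whereas a factor of 16 reserves room for only 192 nodes.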
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250617032306.1494528-3-samuel.holland@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Using hsm stop in the hsm wait loop causes secondary harts to be stuck
forever in OpenSBI on RISC-V platforms where HSM hart hotplug is
available and all harts come up at the same time during system
power-on.
For example, let's say we have two harts A and B on a RISC-V platform
with HSM hart hotplug which come up at the same time during system
power-on. Hart A enters OpenSBI before hart B, hence it becomes
the primary (or cold-boot) hart whereas hart B becomes the secondary
(or warm-boot) hart. Hart A follows the OpenSBI cold-boot path
and registers the hsm device before hart B enters OpenSBI. Hart B
eventually enters OpenSBI and follows the OpenSBI warm-boot path,
so it will increment its own entry_count before entering the hsm wait
loop, where it sees the hsm device and stops itself. Later, as part of
the Linux boot-up sequence, hart A issues an SBI HSM start call to
bring up hart B, but OpenSBI sees entry_count != init_count for
hart B in sbi_hsm_hart_start(), hence hsm_device_hart_start() is
not called for hart B, resulting in hart B being stuck forever in OpenSBI.
To fix the above issue, revert entry_count before doing hsm stop
in hsm wait loop.
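
The sequence above can be modelled with two counters. This is a simplified sketch, not OpenSBI code: the struct and both functions are hypothetical, and the only behaviour taken from the description is that the start path reaches the hsm device only when entry_count equals init_count.

```c
#include <stdbool.h>

/* Hypothetical per-hart bookkeeping */
struct hart_state {
	unsigned long entry_count;
	unsigned long init_count;
	bool stopped;
};

/* Warm-boot path of a secondary hart that arrives in the hsm wait loop */
static void warm_boot_wait(struct hart_state *h, bool hsm_device_present,
			   bool revert_entry_count)
{
	h->entry_count++;

	if (hsm_device_present) {
		/* The fix: undo the increment before stopping the hart */
		if (revert_entry_count)
			h->entry_count--;
		h->stopped = true;
	}
}

/* True if a later start request would be forwarded to the hsm device */
static bool hart_start_reaches_device(const struct hart_state *h)
{
	return h->stopped && h->entry_count == h->init_count;
}
```

Without the revert, the stopped hart is left with entry_count one ahead of init_count and the start request never reaches the device; with the revert, the counters match again.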
Fixes: d844deadec ("lib: sbi: Use hsm stop for hsm wait")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Nick Hu <nick.hu@sifive.com>
Link: https://lore.kernel.org/r/20250527124821.2113467-1-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>