sbi_domain_check_addr_range() computes `max = addr + size` without
checking for integer overflow. When a caller passes a size large enough
to wrap around (e.g. addr=0x80000000, size=0xFFFFFFFF80000000), max
becomes less than addr, causing the while(addr < max) validation loop
to be skipped entirely. The function then returns true without
performing any permission checks.
This allows an S-mode caller to bypass domain memory protection and
access M-mode memory through SBI extensions that use address range
validation (e.g. DBCN console write/read).
Add an overflow check after computing max: if size is non-zero and
max wrapped to a value <= addr, reject the request.
Signed-off-by: Takumi Hara <takumihara1226@gmail.com>
Reviewed-by: Rahul Pathak <rahul@summations.net>
Link: https://lore.kernel.org/r/20260319132232.51572-1-takumihara1226@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently when searching for a hardware counter for an event, if no
programmable counter is available, the code falls back to using a fixed
counter (mcycle/minstret) if one matches the event.
However the fallback is incorrect when sscofpmf is present but
smcntrpmf is not. That's because with sscofpmf, programmable counters
support mode filtering, but the fixed counters do not (without
smcntrpmf). Even if the caller didn't configure mode filtering, by
default programmable counters don't count M mode when sscofpmf is
present, whereas mcycle/minstret do.
Fix the logic to not fallback to a fixed counter if sscofpmf is present
but smcntrpmf is not.
Fixes: 0c304b6619 ("lib: sbi: Allow programmable counters to monitor cycle/instret events")
Signed-off-by: Michael Ellerman <mpe@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260324-mcycle-fix-v1-1-1444e9fe5c32@kernel.org
Signed-off-by: Anup Patel <anup@brainfault.org>
The location of the RNMI/E trap vectors in the Smrnmi extension is
implementation-defined, so platforms with vendor-specific NMI vector
mechanisms must install the firmware's NMI entry points themselves.
Add an smrnmi_handlers_init() callback to sbi_platform_operations that
receives the firmware entry points and lets platform code install them
at the hardware-specific vector locations. Two pointers are passed:
- _trap_rnmi_handler: the dedicated RNMI entry point that saves
context using the Smrnmi MN* CSRs and returns via mnret.
- _trap_handler: the regular M-mode trap entry since RNME is taken
as a regular M-mode trap with NMIE=0.
When Smrnmi is present, install the platform's NMI vectors via the new
callback, initialize MNSCRATCH with the per-hart scratch pointer, and
set MNSTATUS.NMIE.
Smrnmi-enabled platforms must register smrnmi_handlers_init; if the
extension is detected but no callback is registered, sbi_panic() is
called since enabling NMIs without handlers in place would route
subsequent traps into nowhere.
Signed-off-by: Evgeny Voevodin <evvoevod@tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/88b1470e1e3348d454b4b995a11a85c01914f7af.1778176768.git.evvoevod@tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Current placement of entropy initialization via Zkr extension requires a
trap-based mechanism to handle absent Zkr extension case. In presence of
Smrnmi extension no trap-based mechanisms should be used before Smrnmi is
detected and enabled otherwise trap will jump to undefined location.
Move stack guard initialization into init_coldboot function body after
device tree has been parsed so we know if Zkr extension is implemented by
the platform which helps to avoid trap-based discovery.
init_coldboot() is a safe place to initialize entropy because it doesn't
return so no check of __stack_chk_guard against value on entry
will be done.
Signed-off-by: Evgeny Voevodin <evvoevod@tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/acd52b0f3468758bc5f09e6a45662341b31d4d87.1778176768.git.evvoevod@tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Implement basic Resumable NMI (RNMI) handler support for the RISC-V
Smrnmi extension.
The new _trap_rnmi_handler assembly entry point saves context using the
Smrnmi MN* CSRs (MNSCRATCH, MNEPC, MNSTATUS, MNCAUSE) and returns via
mnret. It dispatches to sbi_trap_rnmi_handler(), which optionally calls
a platform-specific ops->rnmi_handler callback for actual NMI
processing. If no platform handler is registered or it fails, the
event is reported as an unhandled NMI.
The RNMI handler reuses the generic trap context structure but stores MN*
CSR values (MNEPC, MNSTATUS, MNCAUSE) into the corresponding generic
fields (mepc, mstatus, cause) for compatibility with existing trap
infrastructure.
Signed-off-by: Evgeny Voevodin <evvoevod@tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/050ae6d2762ba8d5b9dfb3cc1960a23aa3d6c549.1778176768.git.evvoevod@tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The original sbi_strchr implementation did not conform to the C standard
behavior. According to the C standard and POSIX specification, strchr(s, 0)
should return a pointer to the null terminator at the end of string s.
The previous implementation used a while loop that would terminate when
either reaching the end of string or finding the character, but it would
return NULL when searching for the null terminator instead of returning
a pointer to the null terminator itself.
The fixed implementation uses a do-while loop that ensures even when
searching for the null terminator, the function correctly returns a
pointer to the null terminator position rather than NULL.
This fix ensures sbi_strchr behavior aligns with standard library
function semantics, making it more predictable and safe for users
expecting standard C library behavior.
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260306094425.1918-3-cp0613@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
It is possible to have platform where an irqchip device targets
a subset of harts and there are multiple irqchip devices to cover
all harts.
To support this scenario:
1) Add target_harts hartmask to struct sbi_irqchip_device which
represents the set of harts targetted by the irqchip device
2) Call warm_init() and process_hwirqs() callbacks of an irqchip
device on a hart only if irqchip device targets that particular
hart
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260213055342.3124872-6-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
UBSan detected undefined behavior in sbi_hart.c and sbi_fwft.c (in
the case of sbi_fwft.c, the bug comes from a macro call defined at
sbi_ecall_interface.h) caused by shifting a signed integer into the
sign bit (1 << 31)
This can be fixed by using the 1UL literal, ensuring defined arithmetic.
Please let me know if there’s any other most suitable solution for
this bug.
Signed-off-by: Marcos Oduardo <marcos.oduardo@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260223001202.284612-1-marcos.oduardo@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, when we attempt to read the upper 32 bits of a firmware
counter on RV64 or higher, we just set `sbiret.value` to 0 without
validating the counter index. The SBI specification requires us to set
`sbiret.error` to `SBI_ERR_INVALID_PARAM` if the counter index points to
a hardware counter or an invalid counter. Add a validation check to
ensure compliance with the specification on RV64 or higher.
Fixes: 51951d9e9a ("lib: sbi_pmu: Implement sbi_pmu_counter_fw_read_hi")
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260125090643.190748-1-jamestiotio@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, we immediately return the result of `fw_counter_start` if the
event code is 0xFFFF. However, this skips setting the bit in the
`fw_counters_started` bitmap even if the platform-specific call
succeeds. Restore the original behavior of returning early only on an
error so that we still set the bit in the bitmap. This prevents multiple
starts of the same FW counter. This also aligns the expectations of
`pmu_ctr_start_fw` with `pmu_ctr_stop_fw` since we cannot assume that
the platform-specific functions to start and stop FW counters will
modify the bitmap state.
Fixes: 57d3aa3b0d ("lib: sbi_pmu: Introduce fw_counter_write_value API")
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260116165304.180441-1-jamestiotio@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The "RISC-V C API" [1] defines architecture extension test macros
says naming rule for the test macros is __riscv_<ext_name>, where
<ext_name> is all lower-case.
Three extensions dealing with atomics implementation are:
"zaamo" consists of AMO instructions,
"zalrsc" - LR/SC,
"a" extension means both "zaamo" and "zalrsc"
Built-in test macros are __riscv_a, __riscv_zaamo and __riscv_zalrsc.
Alternative to the __riscv_a macro name, __riscv_atomic, is deprecated.
Use correct test macro __riscv_zaamo for the AMO variant of atomics.
It used to be __riscv_atomic that is both deprecated and incorrect
because it tests for the "a" extension; i.e. both "zaamo" and "zalrsc"
If ISA enables only zaamo but not zalrsc, code as it was would not compile.
Older toolchains may have neither __riscv_zaamo nor __riscv_zalrsc, so
query __riscv_atomic - it should be treated as both __riscv_zaamo and
__riscv_zalrsc, in all present cases __riscv_zaamo is more favorable
so take is as alternative for __riscv_zaamo
[1] https://github.com/riscv-non-isa/riscv-c-api-doc
Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@mobileye.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251228073321.1533844-1-vladimir.kondratiev@mobileye.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, OpenSBI returns SBI_ERR_ALREADY_STARTED when attempting to
start a HW counter that is already started and SBI_ERR_ALREADY_STOPPED
when attempting to stop a HW counter that is already stopped. However,
this is not yet implemented for FW counters.
Add the necessary checks to return the same error codes when attempting
the same actions on FW counters.
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251213104146.422972-1-jamestiotio@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
By default the OpenSBI itself is covered by 2 memregions for RX/RW
sections. This is required by platforms with Smepmp to enforce
proper permissions in M mode. Note: M-mode only regions can't
have RWX permissions with Smepmp. Platforms with traditional PMPs
won't be able to benefit from it, as both regions are effectively
RWX in M mode, but usually it's harmless to so. Now we provide
these platforms with an option to disable this logic. It saves 1
PMP entry. For platforms really in short of PMPs, it does make a
difference.
Note: Platform requesting single OpenSBI memregion must be using
traditional (old) PMP. We expect the platform code to do
the right thing.
Signed-off-by: Bo Gan <ganboing@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251218104243.562667-5-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, the unconfiguring PMP is implemented directly inside
switch_to_next_domain_context() whereas rest of the PMP programming
is done via functions implemented in sbi_hart.c.
Introduce a separate sbi_hart_pmp_unconfigure() function so that
all PMP programming is in one place.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20251209135235.423391-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Currently, platforms do not provide complete memory region information
to OpenSBI. Generally, memory regions are only created for the few MMIO
devices that have M-mode drivers. As a result, most MMIO devices fall
inside the default S-mode RWX memory region, which does _not_ have the
MMIO flag set.
In fact, OpenSBI relies on certain S-mode MMIO devices being inside
non-MMIO memory regions. Both fdt_domain_based_fixup_one() and
mpxy_rpmi_sysmis_xfer() call sbi_domain_check_addr() with the MMIO flag
cleared, and that function currently requires an exact flag match. Those
access checks will thus erroneously fail if the platform creates memory
regions with the correct flags for these devices (or for a larger MMIO
region containing these devices).
We should not ignore the MMIO flag entirely, because
sbi_domain_check_addr() is also used to check the permissions of S-mode
shared memory buffers, and S-mode should not be using MMIO device
addresses as memory buffers. But when checking if S-mode is allowed to
do MMIO accesses, we need to recognize that MMIO devices appear in
memory regions both with and without the MMIO flag set.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Yu-Chien Peter Lin <peter.lin@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251121193808.1528050-2-samuel.holland@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The QoS Identifiers extension (Ssqosid) introduces the srmcfg register,
which configures a hart with two identifiers: a Resource Control ID
(RCID) and a Monitoring Counter ID (MCID). These identifiers accompany
each request issued by the hart to shared resource controllers.
If extension Smstateen is implemented together with Ssqosid, then
Ssqosid also requires the SRMCFG bit in mstateen0 to be implemented. If
mstateen0.SRMCFG is 0, attempts to access srmcfg in privilege modes less
privileged than M-mode raise an illegal-instruction exception. If
mstateen0.SRMCFG is 1 or if extension Smstateen is not implemented,
attempts to access srmcfg when V=1 raise a virtual-instruction exception.
This extension can be found in the RISC-V Instruction Set Manual:
https://github.com/riscv/riscv-isa-manual
Changes in v5:
- Remove SBI_HART_EXT_SSQOSID dependency SBI_HART_PRIV_VER_1_12
Changes in v4:
- Remove extraneous parentheses around SMSTATEEN0_SRMCFG
Changes in v3:
- Check SBI_HART_EXT_SSQOSID when swapping SRMCFG
Changes in v2:
- Remove trap-n-detect
- Context switch CSR_SRMCFG
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Link: https://lore.kernel.org/r/20251114115722.1831-1-cp0613@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
In the sanitize_domain, code that checks for the case when one
memory region covered by the other, was never executed. Quote:
/* Sort the memory regions */
for (i = 0; i < (count - 1); i++) {
<snip>
}
/* Remove covered regions */
while(i < (count - 1)) {
Here "while" loop never executed because condition "i < (count - 1)"
is always false after the "for" loop just above.
In addition, when clearing region, "root_memregs_count"
should be adjusted as well, otherwise code that adds memory region
in the "root_add_memregion" will use wrong position:
/* Append the memregion to root memregions */
nreg = &root.regions[root_memregs_count];
empty entry will be created in the middle of regions array, new
regions will be added after this empty entry while sanitizing code
will stop when reaching empty entry.
Fixes: 3b03cdd60c ("lib: sbi: Add regions merging when sanitizing domain region")
Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@mobileye.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251111104327.1170919-2-vladimir.kondratiev@mobileye.com
Signed-off-by: Anup Patel <anup@brainfault.org>