Rehaul instruction decoding to fix the following issues:
- We assume the XLEN of previous mode is the same as MXLEN. However,
RVC instructions decodes differently in RV32 and RV64, so shouldn't
have assumed that.
- We assume it's a misaligned fault and the load/store offset is 0,
i.e., base address == fault address, but access faults can have
non-0 offset (on HW supporting misaligned accesses), so platform
specific load/store fault handler gets the wrong base address.
- No checking of [63:32] of tinst in RV64, which is explicitly
required by Privileged ISA 19.6.3. Must reject tinst with non-0
high 32 bits.
Thus, fix all the above. For misaligned load/store fault, the address
offset should be 0, but we'll validate that on a DEBUG build. On an
optmized build, we kill the use of base address, and use trap address
instead (same as before), which lets the compiler optimize out imm
parsing and other calculations.
I also analyzed the behavior of misaligned fault handler before fix.
With the following conditions met, it can trigger data corruption:
- HW doesn't transform instruction into tinst.
- HW doesn't support misaligned load/store, and OS doesn't enable
misaligned delegation, thus OpenSBI handler is in effect
- HW supports mixed XLEN, and M mode is running RV64, and the trapping
mode (U/VS/VU) is running RV32.
- The trapping instruction is c.f{l|s}w(sp).
Due to the incorrect insn decoding, the trapping instruction would
mistakenly be decoded as c.{l|s}d(sp). With this fix, c.f{l|s}w(sp)
in RV32 is now emulated correctly.
Validation:
The patch is validated to have fixed the issue with test cases running
on a modified version of QEMU that exposes misaligned faults [1], and
a further modified version that removes tinst transformation [2]. The
S-mode OS is a local build of Debian Trixie 6.12 kernel that enables
COMPAT (RV32), and the U-mode test application exercises all integer
and floating-point load/store (RVIFD64/32+RVC64/32) instructions with
all possible imm values. The patch is also tested on real HW (Sifive
P550/ESWIN EIC7700), which only supports RV64. On P550, the same test
was validated both in U mode and VU mode, where the host runs a 6.12
ESWIN vendor kernel that has some ESWIN SoC device driver patches [3]
applied, and the guest runs the exact same Debian Trixie 6.12 kernel
mentioned above.
[1] https://github.com/ganboing/qemu/tree/ganboing-misalign
[2] https://github.com/ganboing/qemu/tree/ganboing-misalign-no-tinst
[3] https://github.com/sifiveinc/riscv-linux/tree/rel/kernel-6.12/hifive-premier-p550
Fixes: 7219477f7b ("lib: Use MTINST CSR in misaligned load/store emulation")
Fixes: b5ae8e8a65 ("lib: Add misaligned load/store trap handling")
Fixes: 4c112650bb ("lib: sbi: abstract out insn decoding to unify mem fault handlers")
Signed-off-by: Bo Gan <ganboing@gmail.com>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260605113214.242-8-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
It's wrong to override the emulator callback in sbi_trap_emulate_load/
store. The function must respect the callback function passed in the
parameter. Hence, let the misaligned emulator callback decide when to
use sbi_misaligned_v_ld/st_emulator. To clean up things, also make the
following changes:
- Add the `insn` parameter to the callback. The trapping insn has been
fetched by the caller already, whether transformed or directly loaded,
thus saving the trouble in the callback. Note that you must not rely
on the length of the `insn`, as it can be a transformed one from tinst
- Also the `tcntx` is added, providing the callback with register values
to handle vector insn or other customized insns.
- Clarify that the read/write length (rlen/wlen) can be 0, in which
case it could be a vector load/store or some customized instruction.
The callback is responsible to handle it accordingly.
Also fixed issues in the sbi_misaligned_v_ld/st_emulator:
a. Redirect the trap when OPENSBI_CC_SUPPORT_VECTOR is not available.
b. Ensure the return code is >0 when no faults are redirected.
Fixes: c2acc5e5b0 ("lib: sbi_misaligned_ldst: Add handling of vector load/store")
Signed-off-by: Bo Gan <ganboing@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Link: https://lore.kernel.org/r/20260605113214.242-6-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
During warm reset, my EIC770X/Hifive Premier P550 can sometimes
encounter memory corruption issue crashing Linux boot. Currently the
issue is mitigated by having a sbi_printf before writing to the reset
register. I analyzed the issue further since then. From the SoC
datasheet[1], it's recommended to implement power-down flow as:
a. Designate a primary core, and let it broadcast requests to other
cores to execute a CEASE insn. Primary core also notifies an
"Externel Agent" to start monitoring.
b. Primary core waits for other cores to CEASE before it CEASEs.
c. "External Agent" waits for primary core to CEASE before resets
the Core Complex.
It's possible that EIC770X can trigger undefined behavior if the core
complex is reset while the harts are actively running. The sbi_printf
in the reset handler effectively hides the problem by delaying the
reset -- by the time sbi_printf finishes, all other harts will have
already landed in the loop in sbi_hsm_hart_wait(), which parks the hart.
Without the sbi_printf, I confirmed that other harts haven't reached
sbi_hsm_hart_wait yet before current hart resets the SoC. (by debugging)
To safely reset, and inspired by the datasheet, the warm reset logic
in eic770x.c now use the current hart as both primary core and the
"External Agent", and other harts as secondary cores. It leverages
the HSM framework and a new eic770x_hsm device to CEASE other harts,
and wait for them to CEASE before resets the SoC. with the sbi_printf
before reset removed, and this logic in place, stress test shows that
the memory corruption issue no longer occurs.
The new eic770x_hsm device is only used for the reset-CEASE logic at
the moment, and may be extended to a fully functional HSM device in
the future.
[1] https://github.com/eswincomputing/EIC7700X-SoC-Technical-Reference-Manual
Fixes: e5797e0688 ("platform: generic: eswin: add EIC7700")
Signed-off-by: Bo Gan <ganboing@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260605075708.96-3-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
When SBI is built with a compiler that doesn't support vector, the
misaligned vector load/store emulation is not built in, the handlers are
just stubs.
Currently the stubs just return 0, causing sbi_trap_emulate_load() to
return without incrementing mepc, meaning the instruction will just
fault again, an infinite loop.
Fix the stubs to use sbi_trap_redirect(), which forwards the trap to the
previous mode, allowing it to be handled there.
Fixes: c2acc5e5 ("lib: sbi_misaligned_ldst: Add handling of vector load/store")
Signed-off-by: Michael Ellerman <mpe@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260530-trap-redirect-v1-1-45d4d333d8c9@kernel.org
Signed-off-by: Anup Patel <anup@brainfault.org>
sbi_pmu_ctr_cfg_match() only acts on SBI_PMU_CFG_FLAG_CLEAR_VALUE and
SBI_PMU_CFG_FLAG_AUTO_START when the event type is SBI_PMU_EVENT_TYPE_HW.
However, pmu_ctr_find_hw() allocates a hardware counter from the same
hw_event_map for SBI_PMU_EVENT_TYPE_HW_CACHE, SBI_PMU_EVENT_TYPE_HW_RAW,
and SBI_PMU_EVENT_TYPE_HW_RAW_V2 as well, and the start/clear helpers
(pmu_ctr_start_hw, pmu_ctr_write_hw) operate on the counter index alone
and are agnostic to the event type. As a result, when a supervisor
configures a HW_CACHE/HW_RAW/HW_RAW_V2 event with these flags, the
counter is programmed and recorded in active_events[] but is never
cleared or started, requiring an extra SBI call to make it count.
Extend the check to cover all hardware-counter event types so that
the configuration flags take effect for HW_CACHE and raw events too.
Deliberately avoiding using "not FW" logic to be explicit about
HW-backed events only.
Fixes: 13d40f21 ("lib: sbi: Add PMU support")
Signed-off-by: David E. Garcia Porras <david.garcia@aheadcomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260522144608.3433470-1-david.garcia@aheadcomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Zkr architecture doesn't define a time limit on state transitions
which results in hanging on unresponsive or event-driven platforms.
To prevent this, we need to limit polling iterations and fall back
in case the budget is over, and stack guard keeps its initial value.
The budget is configurable with CONFIG_ZKR_POLL_BUDGET, defaulting
to 1000 iterations. Successful reads do not consume a try.
Signed-off-by: Evgeny Voevodin <evvoevod@tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260519225014.244672-1-evvoevod@tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Instead of parsing ISA extensions separately for each hart in the
generic_extensions_init() function, it is better to parse ISA extensions
for all available harts in the cold boot path. Also, this allows us
to remove fdt_isa_bitmap from scratch space and directly initialize
"extensions" in struct sbi_hart_features for each hart.
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260521082625.1520870-3-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
UBSan (Undefined Behavior Sanitizer) is a tool implemented using
compiler instrumentation at runtime that allows checking for
statements whose output is not deterministic or defined by the C
standard. Compiling and running OpenSBI with UBSan instrumentation
will print a message in the console if any sentence performs such
an action.
Support involves two main components:
1. The UBSan implementation hooks (derived from NetBSD),
used by the compiler to handle the check output.
2. A test suite integrated with the SBI unit test framework to
verify correct operation at runtime.
Usage:
make UBSAN=y PLATFORM=generic ...
The test suite is built when both UBSAN=y and CONFIG_SBIUNIT=y are
enabled.
When UBSan is enabled, FW_PAYLOAD_OFFSET may need to be increased
due to the size increase added by the instrumentation. A
value of 0x400000 has been tested.
UBSan adds runtime overhead and is intended for development builds
only, not for production.
Note: This patch marks __stack_chk_guard in sbi_init.c as a weak
symbol to prevent multiple definition errors at compile time with
UBSan instrumentation enabled. This resolves the conflict
between the .globl definitions in sbi_init.c and test_head.S.
Signed-off-by: Marcos Oduardo <marcos.oduardo@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260515163321.2038366-1-marcos.oduardo@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The following LLVM compile error is observed in sbi_mpxy.c:
CC lib/sbi/sbi_mpxy.o
lib/sbi/sbi_mpxy.c:535:36: error: result of comparison of constant 18446744073709551615 with
expression of type 'u32' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
535 | (attrs->msi_info.msi_addr_hi == INVALID_ADDR))
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~
lib/sbi/sbi_mpxy.c:534:36: error: result of comparison of constant 18446744073709551615 with
expression of type 'u32' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
534 | (attrs->msi_info.msi_addr_lo == INVALID_ADDR) &&
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~
2 errors generated.
To address the above issue, add u32 typecast to INVALID_ADDR.
Fixes: e92c8fd083 ("sbi: mpxy: define INVALID_ADDR using unsigned long width")
Fixes: 7939bf1329 ("lib: sbi: Add SBI Message Proxy (MPXY) framework")
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260612062218.172726-1-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The payload for FW_PAYLOAD needs to be placed at a 2M/4M aligned address
(for 64/32 bit systems) and the current makefile uses FW_PAYLOAD_OFFSET
to achieve this. This only works if FW_TEXT_START is already 2M/4M
aligned. Most existing physical/virtual platforms have used a
FW_TEXT_START of 0x0 or 0x80000000, so this hasn't been an issue so far.
If, for example, FW_TEXT_START is 0x80000, the payload would end up
placed at 0x280000 on a 64 bit system, which isn't a 2M aligned
address.
Update the makefile to use FW_PAYLOAD_ALIGN instead. This will ensure
that the address picked for the payload is 2M/4M aligned irrespective of
where FW_TEXT_START is.
Signed-off-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260508-payload_alignment-v1-1-6628b4ec1ed3@oss.tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Add rpmi_event_notification_state enum with disable, enable, and return
current state IDs. Also, add req_state field to rpmi_enable_notification_req
and current_state field to rpmi_enable_notification_resp for RPMI specification
compliance.
Add notification event ID enum and data structures for RPMI Performance
service group events: power change, limit change, and level change.
Signed-off-by: David E. Garcia Porras <david.garcia@aheadcomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260608125257.3220114-4-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
PMP functions that deal with hart PMP CSRs are given a sbi_hart_ prefix,
to distinguish from RISC-V PMP encoding functions.
The is_pmp_entry_mapped() function is changed a little more, to align
with other PMP conventions, and made to return a bool to make it more
obvious that it returns a bool and not an SBI_ return code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260430045528.420437-8-npiggin@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
When switching between domains with different satp.MODE values (e.g.
Sv39 to Sv48), the RISC-V ISA permits hardware to use cached translations
from the old virtual-address width if no SFENCE.VMA intervenes. This
constrained-unpredictable behavior is clarified in riscv-isa-manual
PR #2219.
The hart protection re-configuration will anyway do full SFENCE / HFENCE
so move the hart protection re-configuration after register context switch
in switch_to_next_domain_context() to ensure translations from the new
domain's address width are used.
Link: https://github.com/riscv/riscv-isa-manual/pull/2219
Signed-off-by: Zishun Yi <vulab@iscas.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260429181747.160033-1-vulab@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
Add bit-field defines for the RPMI performance domain attributes flags
and fast-channel attributes flags as specified in the RPMI specification.
These are needed by platform implementations that provide RPMI
performance services (e.g. DVFS controllers).
Also add the missing db_write_value field to
rpmi_perf_get_fast_chn_attr_resp to match the RPMI spec layout.
Signed-off-by: David E. Garcia Porras <david.garcia@aheadcomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260328054347.3706029-5-david.garcia@aheadcomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
This patch adds proper support for per-domain floating-point (FP) and
vector (V) contexts in the domain context switch logic. Each domain
now maintains its own FP and vector state, which is saved and restored
during domain switches.
Conditionalize FP and Vector save/restore based on extensions, unconditional
save and restore of floating-point (FP) and Vector registers fails on
generic platform firmware. This firmware must run on multiple platforms
that may lack these extensions.
Address this by conditionally executing FP save/restore only if the underlying
hart supports the F or D extensions. Similarly, perform Vector save/restore
only if the hart supports the Vector extension.
This improves support for multi-domain systems with FP and Vector
extensions, and prevents corruption of FP/Vector state during domain
switches.
Signed-off-by: Dave Patel <dave.patel@riscstar.com>
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518083023.997323-4-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Add support for saving and restoring RISC-V floating-point (F/D) extension
state in OpenSBI. This introduces a floating-point context structure and
helper routines to perform full context save and restore.
The floating-point context includes storage for all 32 FPi registers (f0–f31)
along with the fcsr control and status register. The register state is saved
and restored using double-precision load/store instructions (fsd/fld), and
single-precision load/store instructions (fsw/flw) on an RV64 system with
F and D-extension support.
The implementation follows an eager context switching model where the entire
FP state is saved and restored on every context switch. This avoids the need
for trap-based lazy management and keeps the design simple and deterministic.
Signed-off-by: Dave Patel <dave.patel@riscstar.com>
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518083023.997323-3-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Eager context switch: Add support for saving and restoring RISC-V vector
extension state in OpenSBI. This introduces a per-hart vector context
structure and helper routines to perform full context save and restore.
The vector context includes vcsr CSRs along with storage for all 32 vector
registers. The register state is saved and restored using byte-wise vector
load/store instructions (vs8r/vl8r).
The implementation follows an eager context switching model where the entire
vector state is saved and restored on every context switch. This provides a
simple and deterministic mechanism without requiring lazy trap-based
management.
Signed-off-by: Dave Patel <dave.patel@riscstar.com>
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260518083023.997323-2-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
sbi_domain_check_addr_range() computes `max = addr + size` without
checking for integer overflow. When a caller passes a size large enough
to wrap around (e.g. addr=0x80000000, size=0xFFFFFFFF80000000), max
becomes less than addr, causing the while(addr < max) validation loop
to be skipped entirely. The function then returns true without
performing any permission checks.
This allows an S-mode caller to bypass domain memory protection and
access M-mode memory through SBI extensions that use address range
validation (e.g. DBCN console write/read).
Add an overflow check after computing max: if size is non-zero and
max wrapped to a value <= addr, reject the request.
Signed-off-by: Takumi Hara <takumihara1226@gmail.com>
Reviewed-by: Rahul Pathak <rahul@summations.net>
Link: https://lore.kernel.org/r/20260319132232.51572-1-takumihara1226@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Add a simple SpacemiT I2C driver for basic byte transfers over the I2C
bus, prioritizing simplicity over performance. The driver operates in
PIO mode and does not use interrupts, FIFO, or DMA.
The controller is reset at the start of each transaction to ensure a
known initial state, regardless of prior configuration by the kernel.
This also avoids the need for additional error recovery code.
This will be used for communication with onboard PMIC to reset and
power-off the board.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Anand Moon <linux.amoon@gmail.com>
Link: https://lore.kernel.org/r/20260419150857.2705843-2-aurelien@aurel32.net
Signed-off-by: Anup Patel <anup@brainfault.org>
The reg_stride field represents the address stride in bytes between
consecutive registers. The Linux kernel regmap framework validates
register accesses using IS_ALIGNED(reg, map->reg_stride) as an address
alignment check (drivers/base/regmap/regmap.c). The Linux kernel syscon
driver (drivers/mfd/syscon.c) sets reg_stride directly to reg_io_width:
syscon_config.reg_stride = reg_io_width;
The current OpenSBI code incorrectly multiplies reg_io_width by 8,
converting a byte value to bits. Fix this by using reg_io_width directly
as the stride value, consistent with the Linux kernel.
Fixes: f21d8f7d59 ("lib: utils/regmap: Add simple FDT based syscon regmap driver")
Signed-off-by: David E. Garcia Porras <david.garcia@aheadcomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260403202903.3407945-1-david.garcia@aheadcomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>