Fix the following issues with misaligned vector load/store:
a. Stack overflow: the mask[VLEN_MAX / 8] variable consumes 8K stack
space, given VLEN_MAX=65536, overflowing the default-sized stack.
There's no need to fetch the whole mask in one go, instead, make it
on-demand. Use a 128-byte mask as local buffer to hold the sliding
window of mask. For rvv load, this is allowed -- from the spec:
"The destination vector register group for a masked vector
instruction cannot overlap the source mask register (v0),
unless the destination vector register is being written with
a mask value (e.g., compares) or the scalar result of a reduction"
We don't need to worry about the mask getting overwritten.
b. Maintain the value of vstart upon abort (uptrap) to avoid duplicate
work. After fault resolution, the instruction can restart from the
faulting vstart. For Fault-Only-First loads, reset vstart to 0, as
previously done so, to conform to spec.
c. Explicitly set VS dirty in VSSTATUS with SET_VS_DIRTY() if faulting
from V=1, and if any vector register, including vstart/vl/vtype, gets
changed in the handler. It can add 1 unnecessary op to set VS dirty
in M/SSTATUS (not VSSTATUS), where the HW already did, but for code
simplicity, do it anyway. The overhead should be negligible.
Signed-off-by: Bo Gan <ganboing@gmail.com>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260609060024.706-5-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
sbi_load/store_loop read/write variable-length buffer unprivileged.
Both function use the widest aligned 8/4/2/1 byte load/stores in each
loop to reduce the total number of iterations.
Also switch the scalar/vector misaligned handlers to make use of such
functions to simplify code.
Miscellaneous: remove the unnecessary [taddr] in inline assembly
Signed-off-by: Bo Gan <ganboing@gmail.com>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260609060024.706-4-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
The load/store address offset between the uptrap and the orig_trap
can be derived by orig_trap->tval - uptrap->tval, thus refactor
the function prototype for simplicity.
For vector load, sbi_misaligned_v_tinst_fixup is introduced. There's
no transformed instruction for vector load/store, so null out tinst
if the fault is not a guest-page fault.
Signed-off-by: Bo Gan <ganboing@gmail.com>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260609060024.706-3-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
It's wrong to override the emulator callback in sbi_trap_emulate_load/
store. The function must respect the callback function passed in the
parameter. Hence, let the misaligned emulator callback decide when to
use sbi_misaligned_v_ld/st_emulator. To clean up things, also make the
following changes:
- Add the `insn` parameter to the callback. The trapping insn has been
fetched by the caller already, whether transformed or directly loaded,
thus saving the trouble in the callback. Note that you must not rely
on the length of the `insn`, as it can be a transformed one from tinst
- Also the `tcntx` is added, providing the callback with register values
to handle vector insn or other customized insns.
- Clarify that the read/write length (rlen/wlen) can be 0, in which
case it could be a vector load/store or some customized instruction.
The callback is responsible to handle it accordingly.
Also fixed issues in the sbi_misaligned_v_ld/st_emulator:
a. Redirect the trap when OPENSBI_CC_SUPPORT_VECTOR is not available.
b. Ensure the return code is >0 when no faults are redirected.
Fixes: c2acc5e5b0 ("lib: sbi_misaligned_ldst: Add handling of vector load/store")
Signed-off-by: Bo Gan <ganboing@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Link: https://lore.kernel.org/r/20260605113214.242-6-ganboing@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
When SBI is built with a compiler that doesn't support vector, the
misaligned vector load/store emulation is not built in, the handlers are
just stubs.
Currently the stubs just return 0, causing sbi_trap_emulate_load() to
return without incrementing mepc, meaning the instruction will just
fault again, an infinite loop.
Fix the stubs to use sbi_trap_redirect(), which forwards the trap to the
previous mode, allowing it to be handled there.
Fixes: c2acc5e5 ("lib: sbi_misaligned_ldst: Add handling of vector load/store")
Signed-off-by: Michael Ellerman <mpe@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260530-trap-redirect-v1-1-45d4d333d8c9@kernel.org
Signed-off-by: Anup Patel <anup@brainfault.org>
Enabling V-extension using -march option causes OpenSBI boot-time
hang with LLVM compiler.
As a work-around, don't enable V-extension using -march option and
instead use a custom OpenSBI specific define inform availability of
V-extension to lib/sbi/sbi_trap_v_ldst.c.
Fixes: c2acc5e5b0 ("lib: sbi_misaligned_ldst: Add handling of vector load/store")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Add misaligned load/store handling for the vector extension
to the sbi_misaligned_ldst library.
This implementation is inspired from the misaligned_vec_ldst
implementation in the riscv-pk project.
Co-developed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>