2024-03-28 00:30:14

by Vineet Gupta

[permalink] [raw]
Subject: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

Hi Will, Marc,

In the RISC-V land we are hitting an issue and need some help
understanding the SVE ABI about dropping the state on syscalls (and its
implications etc - in hindsight)

If I'm reading the arm64 code correctly, SVE state is unconditionally
(for any syscall whatsoever) dropped in following code path:

el0_svc
    fp_user_discard

The RISC-V Vector ABI mandates something similar and kernel implements
something similar.

    2023-06-29 9657e9b7d253 riscv: Discard vector state on syscalls  

However in recent testing with RISC-V vector builds we are running into
an issue when this just doesn't work.

Just for some background, RISC-V vector instructions relies on
additional state in a VTYPE register which is setup using an apriori
VSETVLI insn.
So consider the following piece of code:

   3ff80:    cc787057              vsetivli    zero,16,e8,mf2,ta,ma    
<-- sets up VTYPE
   3ff84:    44d8                    lw    a4,12(s1)
   3ff86:    449c                    lw    a5,8(s1)
   3ff88:    06f75563              bge    a4,a5,3fff2
   3ff8c:    02010087              vle8.v    v1,(sp)
   3ff90:    020980a7              vse8.v    v1,(s3)   <-- Vector store
instruction
Here's the sequence of events that's causing the issue 1. The vector
store instruction (in say bash) takes a page fault, enters kernel.
2. In PF return path, a SIGCHLD signal is pending (a bash sub-shell
which exited, likely on different cpu).
3. kernel resumes in userspace signal handler which ends up making an
rt_sigreturn syscall - and which as specified discards the V state (and
makes VTYPE reg invalid).
4. When sigreturn finally returns to original Vector store instruction,
invalid VTYPE triggers an Illegal instruction which causes a SIGILL (as
state was discarded above).

So there is no way dropping syscall state would work here.

How do you guys handle this for SVE/SME ? One way would be to not do the
discard in rt_sigreturn codepath, but I don't see that - granted I'm not
too familiar with arch/arm64/*/**

Other thing I wanted to ask is, have there been any perf implications of
this ABI decision: as in if this was other way around, userspace (and/or
compilers) could potentially leverage the fact that SVE/SME state would
still be valid past a syscall - and won't have to reload/resetup etc.

Thanks,
-Vineet


2024-04-02 18:11:59

by Mark Rutland

[permalink] [raw]
Subject: Re: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

On Wed, Mar 27, 2024 at 05:30:00PM -0700, Vineet Gupta wrote:
> Hi Will, Marc,
>
> In the RISC-V land we are hitting an issue and need some help
> understanding the SVE ABI about dropping the state on syscalls (and its
> implications etc - in hindsight)
>
> If I'm reading the arm64 code correctly, SVE state is unconditionally
> (for any syscall whatsoever) dropped in following code path:
>
> el0_svc
> fp_user_discard
>
> The RISC-V Vector ABI mandates something similar and kernel implements
> something similar.
>
> 2023-06-29 9657e9b7d253 riscv: Discard vector state on syscalls
>
> However in recent testing with RISC-V vector builds we are running into
> an issue when this just doesn't work.
>
> Just for some background, RISC-V vector instructions relies on
> additional state in a VTYPE register which is setup using an apriori
> VSETVLI insn.
> So consider the following piece of code:
>
> 3ff80: cc787057 vsetivli zero,16,e8,mf2,ta,ma
> <-- sets up VTYPE
> 3ff84: 44d8 lw a4,12(s1)
> 3ff86: 449c lw a5,8(s1)
> 3ff88: 06f75563 bge a4,a5,3fff2
> 3ff8c: 02010087 vle8.v v1,(sp)
> 3ff90: 020980a7 vse8.v v1,(s3) <-- Vector store
> instruction
> Here's the sequence of events that's causing the issue
>
> 1. The vector store instruction (in say bash) takes a page fault, enters
> kernel.
> 2. In PF return path, a SIGCHLD signal is pending (a bash sub-shell
> which exited, likely on different cpu).

At this point, surely you need to save the VTYPE into a sigframe before
delivering the signal?

> 3. kernel resumes in userspace signal handler which ends up making an
> rt_sigreturn syscall - and which as specified discards the V state (and
> makes VTYPE reg invalid).

The state is discarded at syscall entry, but rt_sigreturn() runs *after* the
discard. If you saved the original VTYPE prior to delivering the signal, it
should be able to restore it regardless of whether it'd be clobbered at syscall
entry.

Surely you *must* save/restore VTYPE in the signal frame? Otherwise the signal
handler can't make any syscall whatsoever, or it's responsible for saving and
restoring VTYPE in userspace, which doesn't seem right.

> 4. When sigreturn finally returns to original Vector store instruction,
> invalid VTYPE triggers an Illegal instruction which causes a SIGILL (as
> state was discarded above).
>
> So there is no way dropping syscall state would work here.

As above, I don't think that's quite true. It sounds to me like that the actual
bug is that you don't save+restore VTYPE in the signal frame?

> How do you guys handle this for SVE/SME ? One way would be to not do the
> discard in rt_sigreturn codepath, but I don't see that - granted I'm not
> too familiar with arch/arm64/*/**

IIUC this works on arm64 because we'll save all the original state when we
deliver the signal, then restore that state *after* entry to the rt_sigreturn()
syscall.

I can go dig into that tomorrow, but I don't see how this can work unless we
save *all* state prior to delivering the signal, and restoring *all* that state
from the sigframe.

> Other thing I wanted to ask is, have there been any perf implications of
> this ABI decision: as in if this was other way around, userspace (and/or
> compilers) could potentially leverage the fact that SVE/SME state would
> still be valid past a syscall - and won't have to reload/resetup etc.

I believe Mark Brown has made some changes recently to try to avoid some of
that impact. He might be able to comment on that.

Mark.

2024-04-02 19:22:29

by Mark Brown

[permalink] [raw]
Subject: Re: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

On Tue, Apr 02, 2024 at 07:11:43PM +0100, Mark Rutland wrote:
> On Wed, Mar 27, 2024 at 05:30:00PM -0700, Vineet Gupta wrote:

> > If I'm reading the arm64 code correctly, SVE state is unconditionally
> > (for any syscall whatsoever) dropped in following code path:

> > el0_svc
> > fp_user_discard

Yes.

> Surely you *must* save/restore VTYPE in the signal frame? Otherwise the signal
> handler can't make any syscall whatsoever, or it's responsible for saving and
> restoring VTYPE in userspace, which doesn't seem right.

The signal handler would also be unable to use any instructions that
affect the vector state (we've got SVE based memcpy() implementations on
arm64...), and there would be issues with anything that returns to a
context other than the one where the signal was originally deliverered.

> > How do you guys handle this for SVE/SME ? One way would be to not do the
> > discard in rt_sigreturn codepath, but I don't see that - granted I'm not
> > too familiar with arch/arm64/*/**

> IIUC this works on arm64 because we'll save all the original state when we
> deliver the signal, then restore that state *after* entry to the rt_sigreturn()
> syscall.

Yes, that's it exactly. All the process state including the SVE and SME
state is written into the signal frame along when the signal is
delivered, then on signal return we restore everything from the signal
frame overwriting any changes that happened while the signal handler was
running including the effects of entering the kernel to do the signal
return.

> > Other thing I wanted to ask is, have there been any perf implications of
> > this ABI decision: as in if this was other way around, userspace (and/or
> > compilers) could potentially leverage the fact that SVE/SME state would
> > still be valid past a syscall - and won't have to reload/resetup etc.

> I believe Mark Brown has made some changes recently to try to avoid some of
> that impact. He might be able to comment on that.

The optimisation work I've been doing has all been around avoiding
enabling traps when we discard the SVE state rather than to do with the
discarding of SVE state, the initial implementation for SVE disabled
traps very eagerly which wasn't ideal.

For arm64 our PCS specifies that the SVE specific state becomes unknown
over any standard function call, this means that any optimisation in
userspace that relied on state being preserved would need to be outside
the PCS which would be *relatively* niche though it's definitely
something that could be done especially for non-C code. The advantage
is that we can stop saving and restoring the full SVE state for
processes that have stopped using SVE, avoiding overhead there. The
kernel ABI basically models syscalls as C function calls, though it is
stricter in that it specifies the values after a syscall rather than
allowing them to be undefined which does impose a small performance
overhead.

Our SME ABI similarly follows the PCS for standard C functions. Our
standard PCS for SME requires that functions be called with SME disabled
so when we exit streaming mode over syscall we're just enforcing that,
we will preserve ZA (the matrix register) since it is controlled
separately and that's broadly how the PCS works.


Attachments:
(No filename) (3.24 kB)
signature.asc (499.00 B)
Download all attachments

2024-04-02 21:01:52

by Vineet Gupta

[permalink] [raw]
Subject: Re: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

+CC Bjorn

On 4/2/24 11:11, Mark Rutland wrote:
> On Wed, Mar 27, 2024 at 05:30:00PM -0700, Vineet Gupta wrote:
>> Hi Will, Marc,

Thx for the reply and apologies for fat-fingering your name above.

>> 1. The vector store instruction (in say bash) takes a page fault, enters
>> kernel.
>> 2. In PF return path, a SIGCHLD signal is pending (a bash sub-shell
>> which exited, likely on different cpu).
> At this point, surely you need to save the VTYPE into a sigframe before
> delivering the signal?

Yes we do.

>> 3. kernel resumes in userspace signal handler which ends up making an
>> rt_sigreturn syscall - and which as specified discards the V state (and
>> makes VTYPE reg invalid).
> The state is discarded at syscall entry, but rt_sigreturn() runs *after* the
> discard. If you saved the original VTYPE prior to delivering the signal, it
> should be able to restore it regardless of whether it'd be clobbered at syscall
> entry.
>
> Surely you *must* save/restore VTYPE in the signal frame? Otherwise the signal
> handler can't make any syscall whatsoever, or it's responsible for saving and
> restoring VTYPE in userspace, which doesn't seem right.

Indeed I later realized that sigreturn is special as it has its own
state to restore. The discard prior drops the state during signal
handler which is anyways transient / throw-away so doesn't hurt this
specific case.

>> 4. When sigreturn finally returns to original Vector store instruction,
>> invalid VTYPE triggers an Illegal instruction which causes a SIGILL (as
>> state was discarded above).
>>
>> So there is no way dropping syscall state would work here.
> As above, I don't think that's quite true. It sounds to me like that the actual
> bug is that you don't save+restore VTYPE in the signal frame?

We do, but there was indeed a different bug which Bjorn found, in
sigreturn V state restore where we were (re)clobbering the V state by
using V-regs in copy-from-user and returning back with that ill restored
state.

>> How do you guys handle this for SVE/SME ? One way would be to not do the
>> discard in rt_sigreturn codepath, but I don't see that - granted I'm not
>> too familiar with arch/arm64/*/**
> IIUC this works on arm64 because we'll save all the original state when we
> deliver the signal, then restore that state *after* entry to the rt_sigreturn()
> syscall.
>
> I can go dig into that tomorrow, but I don't see how this can work unless we
> save *all* state prior to delivering the signal, and restoring *all* that state
> from the sigframe.

You don't have to, Bjorn found the bug and he'll post a fix to lists soon.

Thx,
-Vineet

2024-04-03 08:23:04

by Björn Töpel

[permalink] [raw]
Subject: Re: ARM SVE ABI: kernel dropping SVE/SME state on syscalls

Vineet Gupta <[email protected]> writes:

> You don't have to, Bjorn found the bug and he'll post a fix to lists soon.

FWIW, my patch is here:
https://lore.kernel.org/linux-riscv/[email protected]/


Björn