2021-02-26 17:32:48

Subject: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

SIGSEGV fills si_addr only for memory access faults. Add a note to clarify.

Signed-off-by: Yu-cheng Yu <[email protected]>
Cc: Alejandro Colomar <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: "H.J. Lu" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/linux-api/[email protected]/
---
man2/sigaction.2 | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/man2/sigaction.2 b/man2/sigaction.2
index 49a30f11e..bea884a23 100644
--- a/man2/sigaction.2
+++ b/man2/sigaction.2
@@ -467,7 +467,7 @@ and
.BR SIGTRAP
fill in
.I si_addr
-with the address of the fault.
+with the address of the fault (see notes).
On some architectures,
these signals also fill in the
.I si_trapno
@@ -955,6 +955,11 @@ It is not possible to block
.IR sa_mask ).
Attempts to do so are silently ignored.
.PP
+In a
+.B SIGSEGV,
+if the fault is a memory access fault, si_addr is filled with the address
+causing the fault, otherwise it is not filled.
+.PP
See
.BR sigsetops (3)
for details on manipulating signal sets.
--
2.21.0

2021-03-08 21:38:40

by Borislav Petkov

Subject: Re: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

On Fri, Feb 26, 2021 at 09:26:34AM -0800, Yu-cheng Yu wrote:
> SIGSEGV fills si_addr only for memory access faults. Add a note to clarify.
>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Cc: Alejandro Colomar <[email protected]>
> Cc: Michael Kerrisk <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Florian Weimer <[email protected]>
> Cc: "H.J. Lu" <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Link: https://lore.kernel.org/linux-api/[email protected]/
> ---
> man2/sigaction.2 | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/man2/sigaction.2 b/man2/sigaction.2
> index 49a30f11e..bea884a23 100644
> --- a/man2/sigaction.2
> +++ b/man2/sigaction.2
> @@ -467,7 +467,7 @@ and
> .BR SIGTRAP
> fill in
> .I si_addr
> -with the address of the fault.
> +with the address of the fault (see notes).
> On some architectures,
> these signals also fill in the
> .I si_trapno
> @@ -955,6 +955,11 @@ It is not possible to block
> .IR sa_mask ).
> Attempts to do so are silently ignored.
> .PP
> +In a
> +.B SIGSEGV,
> +if the fault is a memory access fault, si_addr is filled with the address
> +causing the fault, otherwise it is not filled.

"... otherwise it is uninitialized." or "zeroed" or whatever...

And I'm having trouble figuring out why do you need to clarify this?

Because of this sentence:

* SIGILL, SIGFPE, SIGSEGV, SIGBUS, and SIGTRAP fill in si_addr with the address
of the fault. On some architectures, these signals also fill in the si_trapno
field.

?

If so, did you audit all architectures whether si_addr is populated only
on memory access faults or is this something POSIX dictates or what's
up? Because the sigaction(2) manpage is arch-agnostic and this is a
rather strong assertion.

What am I missing?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-08 21:49:10

by Yu-cheng Yu

Subject: Re: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

On 3/8/2021 1:30 PM, Borislav Petkov wrote:
> On Fri, Feb 26, 2021 at 09:26:34AM -0800, Yu-cheng Yu wrote:
>> SIGSEGV fills si_addr only for memory access faults. Add a note to clarify.
>>
>> Signed-off-by: Yu-cheng Yu <[email protected]>
>> Cc: Alejandro Colomar <[email protected]>
>> Cc: Michael Kerrisk <[email protected]>
>> Cc: Andy Lutomirski <[email protected]>
>> Cc: Borislav Petkov <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: Florian Weimer <[email protected]>
>> Cc: "H.J. Lu" <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Link: https://lore.kernel.org/linux-api/[email protected]/
>> ---
>> man2/sigaction.2 | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/man2/sigaction.2 b/man2/sigaction.2
>> index 49a30f11e..bea884a23 100644
>> --- a/man2/sigaction.2
>> +++ b/man2/sigaction.2
>> @@ -467,7 +467,7 @@ and
>> .BR SIGTRAP
>> fill in
>> .I si_addr
>> -with the address of the fault.
>> +with the address of the fault (see notes).
>> On some architectures,
>> these signals also fill in the
>> .I si_trapno
>> @@ -955,6 +955,11 @@ It is not possible to block
>> .IR sa_mask ).
>> Attempts to do so are silently ignored.
>> .PP
>> +In a
>> +.B SIGSEGV,
>> +if the fault is a memory access fault, si_addr is filled with the address
>> +causing the fault, otherwise it is not filled.
>
> "... otherwise it is uninitialized." or "zeroed" or whatever...
>
> And I'm having trouble figuring out why do you need to clarify this?
>
> Because of this sentence:
>
> * SIGILL, SIGFPE, SIGSEGV, SIGBUS, and SIGTRAP fill in si_addr with the address
> of the fault. On some architectures, these signals also fill in the si_trapno
> field.
>
> ?

I think the sentence above is vague, but probably for the reason that
each arch is different. Maybe this patch is unnecessary and can be dropped?

>
> If so, did you audit all architectures whether si_addr is populated only
> on memory access faults or is this something POSIX dictates or what's
> up? Because the sigaction(2) manpage is arch-agnostic and this is a
> rather strong assertion.
>
> What am I missing?
>
> Thx.
>

2021-03-09 14:33:41

by Borislav Petkov

Subject: Re: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

On Mon, Mar 08, 2021 at 01:46:07PM -0800, Yu, Yu-cheng wrote:
> I think the sentence above is vague, but probably for the reason that each
> arch is different. Maybe this patch is unnecessary and can be dropped?

Maybe.

If you want to clarify it, you should audit every arch. But what
would that bring? IOW, is it that important to specify when si_addr
is populated and when not...? I don't know of an example but I'm
no userspace programmer anyway, to know when this info would be
beneficial...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-11 17:34:50

by Yu-cheng Yu

Subject: Re: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

On 3/11/2021 9:17 AM, Stefan Puiu wrote:
> Hi,
>
> My 2 cents below.
>
> On Tue, Mar 9, 2021, 16:33 Borislav Petkov <[email protected]
> <mailto:[email protected]>> wrote:
>
> On Mon, Mar 08, 2021 at 01:46:07PM -0800, Yu, Yu-cheng wrote:
> > I think the sentence above is vague, but probably for the reason
> that each
> > arch is different. Maybe this patch is unnecessary and can be
> dropped?
>
> Maybe.
>
> If you want to clarify it, you should audit every arch. But what
> would that bring? IOW, is it that important to specify when si_addr
> is populated and when not...? I don't know of an example but I'm
> no userspace programmer anyway, to know when this info would be
> beneficial...
>
>
> I've worked on projects where the SIGSEGV sig handler would also print
> si_addr. When diagnosing a crash, the address that triggered the fault
> is useful to know. If you can't reproduce the crash in a debugger, or
> there's no core dump, at least you have an idea if it's a NULL pointer
> dereference or some naked pointer dereferencing. So I think it's useful
> to know when si_addr can be used to infer such information and when not.

At least on x86, the faulting ip is already in ucontext, and si_addr is
mostly the memory address being accessed if that was the reason for the
fault (i.e. the memory is not supposed to be accessed). That way, the
signal handler has both the instruction pointer and the memory address.

For a shadow stack violation, for example, the fault is not caused by the
memory being accessed; it is the instruction itself that causes the
violation. It is unnecessary to duplicate the ip in si_addr. Setting
si_addr to zero also indicates that this is not a memory-type fault.

--
Yu-cheng

2021-03-12 12:57:20

by Stefan Puiu

Subject: Re: [PATCH 2/2] sigaction.2: wfix - Clarify si_addr description.

On Thu, Mar 11, 2021 at 7:33 PM Yu, Yu-cheng <[email protected]> wrote:
>
> On 3/11/2021 9:17 AM, Stefan Puiu wrote:
> > Hi,
> >
> > My 2 cents below.
> >
> > On Tue, Mar 9, 2021, 16:33 Borislav Petkov <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> > On Mon, Mar 08, 2021 at 01:46:07PM -0800, Yu, Yu-cheng wrote:
> > > I think the sentence above is vague, but probably for the reason
> > that each
> > > arch is different. Maybe this patch is unnecessary and can be
> > dropped?
> >
> > Maybe.
> >
> > If you want to clarify it, you should audit every arch. But what
> > would that bring? IOW, is it that important to specify when si_addr
> > is populated and when not...? I don't know of an example but I'm
> > no userspace programmer anyway, to know when this info would be
> > beneficial...
> >
> >
> > I've worked on projects where the SIGSEGV sig handler would also print
> > si_addr. When diagnosing a crash, the address that triggered the fault
> > is useful to know. If you can't reproduce the crash in a debugger, or
> > there's no core dump, at least you have an idea if it's a NULL pointer
> > dereference or some naked pointer dereferencing. So I think it's useful
> > to know when si_addr can be used to infer such information and when not.
>
> At least on x86, the faulting ip is already in ucontext, and si_addr is
> mostly the memory address being accessed if that was the reason for the
> fault (i.e. the memory is not supposed to be accessed). That way, the
> signal handler has both the instruction pointer and the memory address.

Interesting that you mention ucontext. I think the ability to fetch
the IP from it is not that well documented. See for example the
sigaction man page
(https://man7.org/linux/man-pages/man2/sigaction.2.html):

Further information about the ucontext_t structure can be
found in getcontext(3) and signal(7). Commonly, the
handler function doesn't make any use of the third
argument.

Michael's book ("The Linux Programming Interface") has similar text on
ucontext ("This information is rarely used in signal handlers, so we
don’t go into further details"). I could find one example on Google
for fetching the IP at
https://www.oracle.com/technical-resources/articles/it-infrastructure/dev-signal-handlers-studio.html,
but it pertains to SPARC. Also, I've found one of our older projects
that uses this, and it seems each architecture has its own layout (the
code handles ppc, mips and x86-64). Is this documented somewhere?
Outside of the arch-specific kernel definition of the uc_mcontext
member in the code, I mean :).
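
To illustrate the per-architecture layout problem, here is a rough
sketch assuming glibc on Linux. It covers only x86-64, i386 and aarch64,
whose field names are well established; ppc and mips are deliberately
left to the fallback rather than guessed. The names ctx_pc, HAVE_CTX_PC
and capture_pc_once are hypothetical, and this is not taken from the
project mentioned above.

```c
#define _GNU_SOURCE             /* glibc exposes REG_RIP/REG_EIP with this */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

/* Extract the program counter from a signal context, per architecture. */
#if defined(__x86_64__) && defined(REG_RIP)
#  define HAVE_CTX_PC 1
static uintptr_t ctx_pc(const ucontext_t *uc)
{
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
}
#elif defined(__i386__) && defined(REG_EIP)
#  define HAVE_CTX_PC 1
static uintptr_t ctx_pc(const ucontext_t *uc)
{
    return (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
}
#elif defined(__aarch64__)
#  define HAVE_CTX_PC 1
static uintptr_t ctx_pc(const ucontext_t *uc)
{
    return (uintptr_t)uc->uc_mcontext.pc;
}
#else
#  define HAVE_CTX_PC 0         /* unknown layout: report "no pc" */
static uintptr_t ctx_pc(const ucontext_t *uc)
{
    (void)uc;
    return 0;
}
#endif

static volatile uintptr_t captured_pc;

static void on_usr1(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
    captured_pc = ctx_pc((const ucontext_t *)ctx);
}

/* Deliver SIGUSR1 to ourselves once and record the interrupted pc. */
static int capture_pc_once(void)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    raise(SIGUSR1);             /* handler runs before raise() returns */

    sigaction(SIGUSR1, &old, NULL);
    return 0;
}
```

Keeping the #if ladder inside one small accessor means the handler
itself stays architecture-neutral, and an unsupported target degrades to
"pc unknown" instead of failing to build.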

Thanks,
Stefan.

>
> For a shadow stack violation, for example, the fault is not caused by the
> memory being accessed; it is the instruction itself that causes the
> violation. It is unnecessary to duplicate the ip in si_addr. Setting
> si_addr to zero also indicates that this is not a memory-type fault.
>
> --
> Yu-cheng