2004-01-02 19:49:39

by Joe Korty

Subject: siginfo_t fracturing, especially for 64/32-bit compatibility mode

siginfo_t processing is fragile when in 32 bit compatibility mode on
a 64 bit processor. The kernel does conversions between 32 and 64
bit versions of siginfo_t and to do this, it must always know
which of the (unioned) fields of siginfo are actually being used. I
believe this is the original purpose of the si_code field -- the
value in it should directly or indirectly indicate, unambiguously,
which of the fields in siginfo_t hold useful values.

rt_sigqueueinfo(2) subverts this by reserving a range of si_code
values for users, and there is nothing about them to indicate to the
kernel which fields of siginfo_t are actually in use. This is not a
problem in native mode as rt_sigqueueinfo simply passes the siginfo_t
argument through from the sending to the receiving process without
looking at or caring about the individual field values. But as
compatibility mode must convert and pass individual argument values,
it must know what fields are being used in each and every call to
rt_sigqueueinfo from a compatibility-mode 32 bit process, and each
time a siginfo structure is delivered to a compatibility-mode 32 bit
process (whether generated by a rt_sigqueueinfo from a 64 or 32 bit
process, or internally by the kernel).

A partial solution is to grep all uses of si_code in the kernel and
in glibc and tailor the architecture-specific 64 <-> 32 bit siginfo
kernel transform routines to current use. But this is fragile as it
does not take into account future glibc growth nor other users of
rt_sigqueueinfo outside of glibc, such as applications invoking
rt_sigqueueinfo directly.

Worse, in 2.6.0 and glibc-2.3.2, there are conflicts in current
si_code value assignments which affect both compatibility and native
mode users. When an application receives one of these siginfo_t's,
it cannot in general determine why it got it or which fields in the
siginfo_t it should extract and act upon. And when in compatibility
mode, the kernel cannot always determine which fields need to be
converted and passed on.

The current conflicts are:

SI_TKILL (used by the 2.6 kernel) and SI_ASYNCNL (used by
glibc-2.3.2) both have the same value (-6). If an application is
both threaded and using async IO, then when it gets a siginfo with
the SI_TKILL / SI_ASYNCNL value it cannot reliably decide which type
it is. This may be solvable by changing glibc to define another value
for SI_ASYNCNL, as this appears to be used only by glibc in some
process that is talking to itself.

SI_ASYNCIO is used by the kernel USB drivers to return the field
si_addr. Glibc also uses SI_ASYNCIO, but this time the fields
si_pid, si_uid, and si_value are used. As si_addr is unioned with
si_pid and si_uid, it is not possible for both sets to be correctly
converted when in compatibility mode, and even if they could be
converted, it still is not possible for applications to safely
determine, when they receive one of these siginfo_t's, which set of
fields to extract and use. As glibc's use of SI_ASYNCIO appears
entirely internal to itself, it may be best to fix this problem by
defining a new SI_ASYNCIO_GLIBC value in glibc and have glibc use
that instead of SI_ASYNCIO. Or, if the USB driver usage is new and
no applications or libraries are using it yet, a new si_code value
could instead be defined for the USB drivers.
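
The overlap described above can be modelled in a few lines of C. This is only a rough sketch: the two structs are simplified stand-ins invented for illustration (not the kernel's siginfo definitions), and the helper names to32_as_rt and to32_as_addr are hypothetical. It shows that the same 64-bit payload yields two different 32-bit results depending on which set of fields the converter assumes is in use.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the 32-bit and 64-bit layouts.
 * Illustrative models only, not the kernel's definitions. */
struct model_siginfo32 {
    int32_t si_signo, si_errno, si_code;
    union {
        struct { int32_t pid; uint32_t uid; uint32_t value; } rt;
        struct { uint32_t addr; } fault;   /* overlaps pid */
    } u;
};

struct model_siginfo64 {
    int32_t si_signo, si_errno, si_code;
    union {
        struct { int32_t pid; uint32_t uid; uint64_t value; } rt;
        struct { uint64_t addr; } fault;   /* overlaps pid and uid */
    } u;
};

/* Convert assuming the sender filled si_pid, si_uid and si_value. */
static struct model_siginfo32 to32_as_rt(const struct model_siginfo64 *s)
{
    struct model_siginfo32 d;

    memset(&d, 0, sizeof d);
    d.si_signo = s->si_signo;
    d.si_errno = s->si_errno;
    d.si_code = s->si_code;
    d.u.rt.pid = s->u.rt.pid;
    d.u.rt.uid = s->u.rt.uid;
    d.u.rt.value = (uint32_t)s->u.rt.value;
    return d;
}

/* Convert assuming the sender filled si_addr instead. */
static struct model_siginfo32 to32_as_addr(const struct model_siginfo64 *s)
{
    struct model_siginfo32 d;

    memset(&d, 0, sizeof d);
    d.si_signo = s->si_signo;
    d.si_errno = s->si_errno;
    d.si_code = s->si_code;
    d.u.fault.addr = (uint32_t)s->u.fault.addr;  /* truncating copy */
    return d;
}
```

With si_code alone (here SI_ASYNCIO, -4) the converter has no way to choose between the two functions, which is the ambiguity at issue.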

The 32-bit compatibility mode issues can be prevented from getting
any worse by eliminating the existing 'user' si_code value range
(grandfathering in all sensible current uses), then creating a new
'user' si_code system in which the caller declares which data fields
in siginfo_t are in use. This would operate much like _IO, _IOR, etc
do for ioctls. A possible definition:

    #define __SI_USERMASK   0xff000000
    #define __SI_USERCODE   0x20000000

    #define __SI_BITMASK    0x00ffff00
    #define __SI_PID        0x00800000  /* si_pid */
    #define __SI_UID        0x00400000  /* si_uid */
    #define __SI_TID        0x00200000  /* si_tid */
    #define __SI_OVRUN      0x00100000  /* si_overrun */
    #define __SI_PRIV       0x00080000  /* si_sys_private */
    #define __SI_INCR       0x00040000  /* si_overrun_incr */
    #define __SI_STAT       0x00020000  /* si_status */
    #define __SI_UTIME      0x00010000  /* si_utime */
    #define __SI_STIME      0x00008000  /* si_stime */
    #define __SI_INT        0x00004000  /* si_int */
    #define __SI_PTR        0x00002000  /* si_ptr */
    #define __SI_ADDR       0x00001000  /* si_addr */
    #define __SI_TRAPNO     0x00000800  /* si_trapno */
    #define __SI_BAND       0x00000400  /* si_band */
    #define __SI_FD         0x00000200  /* si_fd */

    #define __SI_CMDMASK    0x000000ff

For example, glibc and other users would define any new user si_code
values they need by doing something like:

#define SI_FASYNCXX (__SI_USERCODE | __SI_UID | __SI_PID | __SI_INT | 17)

To work well, si_code values which match neither these new user values
nor the grandfathered old user values would cause rt_sigqueueinfo to
return EINVAL. Also, copy_siginfo_to_user should be changed to replace
the old blanket copy when si_code < 0 with one that copies just the
specified fields (except for the grandfathered values).
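
To make the encoding concrete, here is a minimal sketch of how a receiver (or the kernel's conversion code) might decode such a value. The constants are a subset of the proposed definitions above; the three helper functions are hypothetical names introduced only for this illustration and are not part of any existing kernel or glibc interface.

```c
/* Subset of the proposed constants, copied from the definition above. */
#define __SI_USERMASK   0xff000000
#define __SI_USERCODE   0x20000000
#define __SI_BITMASK    0x00ffff00
#define __SI_PID        0x00800000  /* si_pid */
#define __SI_UID        0x00400000  /* si_uid */
#define __SI_INT        0x00004000  /* si_int */
#define __SI_ADDR       0x00001000  /* si_addr */
#define __SI_CMDMASK    0x000000ff

#define SI_FASYNCXX (__SI_USERCODE | __SI_UID | __SI_PID | __SI_INT | 17)

/* Hypothetical helper: does this si_code use the new self-describing
 * user encoding? */
static int si_is_new_user_code(unsigned int code)
{
    return (code & __SI_USERMASK) == __SI_USERCODE;
}

/* Hypothetical helper: bitmask of the siginfo_t fields the sender
 * declared as in use. */
static unsigned int si_declared_fields(unsigned int code)
{
    return code & __SI_BITMASK;
}

/* Hypothetical helper: the sender-private command number. */
static unsigned int si_command(unsigned int code)
{
    return code & __SI_CMDMASK;
}
```

A converter could then test individual bits, for example `si_declared_fields(code) & __SI_INT`, and copy only those fields, much as an ioctl handler inspects the direction bits packed in by _IOR and friends.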

Regards,
Joe


2004-01-02 20:29:15

by Linus Torvalds

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode



On Fri, 2 Jan 2004, Joe Korty wrote:
>
> siginfo_t processing is fragile when in 32 bit compatibility mode on
> a 64 bit processor.

It shouldn't be.

Inside the kernel, we should always use the "native" format (ie 64-bit).
The fact that 64-bit architectures are broken is their bug, and the proper
way to fix it is to make sure that everything always uses the native
format.

We should _not_ play games with si_code etc. There is no reason to do so,
since every entry point knows _statically_ whether it is given a 32-bit or
64-bit version. That's a lot less fragile than depending on a field that
is filled in by the user.

Linus

2004-01-02 20:38:31

by Joe Korty

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

[ resend, accidentally sent originally from a broken email account ]

> On Fri, 2 Jan 2004, Joe Korty wrote:
>> siginfo_t processing is fragile when in 32 bit compatibility mode on
>> a 64 bit processor.
>
> It shouldn't be.
> Inside the kernel, we should always use the "native" format (ie 64-bit).

Indeed we do, and that is the problem. 32 bit apps by definition use
the 32 bit version of siginfo_t and the first act the kernel has to do
on receiving one of these is convert it to 64 bit for consumption by
the rest of the kernel. In order to do that, the kernel must know what
fields in siginfo_t the user has set.

Joe

2004-01-02 20:47:47

by Linus Torvalds

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode



On Fri, 2 Jan 2004, Joe Korty wrote:
>
> Indeed we do, and that is the problem. 32 bit apps by definition use
> the 32 bit version of siginfo_t and the first act the kernel has to do
> on receiving one of these is convert it to 64 bit for consumption by
> the rest of the kernel. In order to do that, the kernel must know what
> fields in siginfo_t the user has set.

Ahh, a light goes on. Yeah, that's broken. Argh.

Linus

2004-01-03 00:24:38

by Andi Kleen

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

On Fri, 2 Jan 2004 14:49:09 -0500
Joe Korty wrote:

> siginfo_t processing is fragile when in 32 bit compatibility mode on
> a 64 bit processor. The kernel does conversions between 32 and 64
> bit versions of siginfo_t and to do this, it must always know
> which of the (unioned) fields of siginfo are actually being used. I
> believe this is the original purpose of the si_code field -- the
> value in it should directly or indirectly indicate, unambiguously,
> which of the fields in siginfo_t hold useful values.
>
> rt_sigqueueinfo(2) subverts this by reserving a range of si_code
> values for users, and there is nothing about them to indicate to the
> kernel which fields of siginfo_t are actually in use. This is not a

My understanding was that the syscall always only supports si_int/si_ptr.
Only the kernel can pass other values. The original idea was to
detect if the code comes from user space, then convert si_int/si_ptr,
otherwise do the kernel conversion.

More for compatibility, the emulation layer has been copying the
rest of the 128-byte siginfo too, but it didn't do any alignment
adjustment. So if somebody passes some arbitrary structure
in there from user space, it will likely only work if he sends
it to another 32bit or another 64bit process. Otherwise the alignment
will be messed up. There is nothing that can be done about that.

> A partial solution is to grep all uses of si_code in the kernel and
> in glibc and tailor the architecture-specific 64 <-> 32 bit siginfo
> kernel transform routines to current use. But this is fragile as it
> does not take into account future glibc growth nor other users of
> rt_sigqueueinfo outside of glibc, such as applications invoking
> rt_sigqueueinfo directly.

Basically it was supposed to be:

any signal queuing system calls:
	reject any codes that can be generated by the kernel

conversion:
	if (code generated by the kernel)
		do appropriate conversion
	else
		fix si_int/si_ptr alignment and copy the rest
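
In C, the user-originated branch of that scheme might look roughly like the sketch below. Everything here is an assumption made for illustration: the two flat structs are simplified models rather than the real native and compat siginfo layouts, the function names are hypothetical, and "generated by the kernel" is approximated as si_code > 0. The per-code kernel conversion is deliberately left out.

```c
#include <stdint.h>
#include <string.h>

/* Simplified models of the native (64-bit) and compat (32-bit)
 * views of a queued signal. Not the kernel's definitions. */
struct model_native_siginfo {
    int32_t  si_signo, si_errno, si_code;
    int32_t  si_pid;
    uint32_t si_uid;
    uint64_t si_value;   /* si_int / si_ptr share this slot */
};

struct model_compat_siginfo {
    int32_t  si_signo, si_errno, si_code;
    int32_t  si_pid;
    uint32_t si_uid;
    uint32_t si_value;   /* 32-bit si_int / si_ptr */
};

/* Assumption for this sketch: positive codes come from the kernel. */
static int model_code_from_kernel(int code)
{
    return code > 0;
}

/* Returns 0 when the generic user-space path handled the conversion,
 * 1 when the caller must apply a per-code kernel conversion instead. */
static int model_native_to_compat(struct model_compat_siginfo *dst,
                                  const struct model_native_siginfo *src)
{
    memset(dst, 0, sizeof *dst);
    dst->si_signo = src->si_signo;
    dst->si_errno = src->si_errno;
    dst->si_code = src->si_code;

    if (model_code_from_kernel(src->si_code))
        return 1;   /* per-code conversion not modelled here */

    /* User-originated: carry sender identity and narrow the payload. */
    dst->si_pid = src->si_pid;
    dst->si_uid = src->si_uid;
    dst->si_value = (uint32_t)src->si_value;
    return 0;
}
```

Note that narrowing si_value silently drops the upper 32 bits, so a 64-bit pointer sent to a 32-bit receiver does not survive the trip in this model.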


>
> Worse, in 2.6.0 and glibc-2.3.2, there are conflicts in current
> si_code value assignments which affect both compatibility and native
> mode users. When an application receives one of these siginfo_t's,
> it cannot in general determine why it got it or which fields in the
> siginfo_t it should extract and act upon. And when in compatibility
> mode, the kernel cannot always determine which fields need to be
> converted and passed on.

If glibc uses other values than si_int/si_ptr for non kernel generated
signals it is IMHO broken.

>
> The current conflicts are:

[...SI_TKILL, SI_ASYNCIO...] that's broken. We just cannot support that. This aspect of
SuS just cannot be emulated in user space, glibc was misguided about attempting
it.

I think it is reasonable to just not support this in emulation. We should actually
reject these codes in sigqueueinfo when coming from user space.

-Andi

2004-01-03 00:44:20

by Jakub Jelinek

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

On Sat, Jan 03, 2004 at 01:24:33AM +0100, Andi Kleen wrote:
> > rt_sigqueueinfo(2) subverts this by reserving a range of si_code
> > values for users, and there is nothing about them to indicate to the
> > kernel which fields of siginfo_t are actually in use. This is not a
>
> My understanding was that the syscall always only supports si_int/si_ptr.

No, why?

> >
> > The current conflicts are:
>
> [...SI_TKILL, SI_ASYNCIO...] that's broken. We just cannot support that. This aspect of
> SuS just cannot be emulated in user space, glibc was misguided about attempting
> it.

SI_ASYNCNL is -60, not -6.
Negative si_code values are reserved for userspace, while positive ones are for
kernel space.

Jakub

2004-01-03 01:07:29

by Andi Kleen

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

On Fri, 2 Jan 2004 19:44:06 -0500
Jakub Jelinek wrote:

> On Sat, Jan 03, 2004 at 01:24:33AM +0100, Andi Kleen wrote:
> > > rt_sigqueueinfo(2) subverts this by reserving a range of si_code
> > > values for users, and there is nothing about them to indicate to the
> > > kernel which fields of siginfo_t are actually in use. This is not a
> >
> > My understanding was that the syscall always only supports si_int/si_ptr.
>
> No, why?

Because otherwise it cannot be supported in the 32bit emulation. Or rather you
won't get any conversion.

>
> > >
> > > The current conflicts are:
> >
> > [...SI_TKILL, SI_ASYNCIO...] that's broken. We just cannot support that. This aspect of
> > SuS just cannot be emulated in user space, glibc was misguided about attempting
> > it.
>
> SI_ASYNCNL is -60, not -6.
> Negative si_code values are reserved for userspace, while positive ones are for
> kernel space.

Ok, if the kernel generates that, that's broken too then.

-Andi

2004-01-03 02:13:23

by Daniel Jacobowitz

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

On Sat, Jan 03, 2004 at 02:07:26AM +0100, Andi Kleen wrote:
> On Fri, 2 Jan 2004 19:44:06 -0500
> Jakub Jelinek wrote:
>
> > On Sat, Jan 03, 2004 at 01:24:33AM +0100, Andi Kleen wrote:
> > > > rt_sigqueueinfo(2) subverts this by reserving a range of si_code
> > > > values for users, and there is nothing about them to indicate to the
> > > > kernel which fields of siginfo_t are actually in use. This is not a
> > >
> > > My understanding was that the syscall always only supports si_int/si_ptr.
> >
> > No, why?
>
> Because otherwise it cannot be supported in the 32bit emulation. Or rather you
> won't get any conversion.

That's probably an acceptable limitation. I think what the OP pointed
out is that they _are_ converted, from 32-bit to 64-bit, when a 32-bit
process sends a siginfo to another 32-bit process, thereby garbling it.

--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer

2004-01-03 20:15:25

by Joe Korty

Subject: Re: siginfo_t fracturing, especially for 64/32-bit compatibility mode

I decided to do a more systematic search.

The below table summarizes all user-mode si_code values declared
and sent by either the kernel or by glibc.

Method was simple grep. Therefore tricky uses were not accounted
for.

code name     value   glibc-2.3.2 send-usage      2.6.1-rc2 send-usage
------------  ------  --------------------------  --------------------
SI_ASYNCNL      -6    si_pid, si_uid, si_value    never sent
SI_TKILL        -6    never sent                  si_pid, si_uid
SI_SIGIO        -5    never sent                  never sent
SI_ASYNCIO      -4    si_pid, si_uid, si_value    si_addr
SI_MESGQ        -3    never sent                  never sent
SI_QUEUE        -1    si_pid, si_uid, si_value    never sent

Observations:

glibc only sends siginfo_t's with si_pid, si_uid, and si_value set.
This makes trivial the conversion of 32bit user-space-originated
siginfo_t's to the 64-bit form.
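
For reference, the si_pid / si_uid / si_value pattern is what an application sees when a signal is queued with sigqueue(3). The following is a small sketch using only standard POSIX calls (sigprocmask, sigqueue, sigwaitinfo); the wrapper name queue_and_receive is made up for this example, and it assumes a single-threaded process on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Queue SIGUSR1 to ourselves with an integer payload, then read the
 * resulting siginfo_t back synchronously. Returns 0 on success. */
int queue_and_receive(int payload, siginfo_t *info)
{
    sigset_t set;
    union sigval value;

    /* Block the signal so it stays pending until sigwaitinfo(). */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    value.sival_int = payload;
    if (sigqueue(getpid(), SIGUSR1, value) != 0)
        return -1;

    return sigwaitinfo(&set, info) == SIGUSR1 ? 0 : -1;
}
```

On return, info->si_code should be SI_QUEUE, info->si_pid and info->si_uid identify the sender, and info->si_value carries the payload.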

The SI_ASYNCNL and SI_TKILL values collide but this collision is
conversion-safe, as the fields used are compatible. However
applications may on occasion have trouble determining which
subsystem sent a received siginfo_t of this type.

SI_ASYNCIO uses are incompatible. This prevents the kernel from
being able to determine which fields to convert when a 64-bit
siginfo_t of this type is to be sent to a 32-bit application.

SI_SIGIO is not used by either the kernel or glibc. This was
somewhat surprising given the extensive coverage of SI_SIGIO in the
man pages.

The kernel likes to send user siginfo_t's to applications, rather
than restrict itself to kernel siginfo_t types. This is a misuse of
the user-siginfo_t concept, though (so far) largely harmless.

Joe
