2010-02-12 16:48:33

by Jeff Mahoney

[permalink] [raw]
Subject: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

prepare_reply sets up an skb for the response. If I understand it correctly,
the payload contains:

+--------------------------------+
| genlmsghdr - 4 bytes |
+--------------------------------+
| NLA header - 4 bytes | /* Aggregate header */
+-+------------------------------+
| | NLA header - 4 bytes | /* PID header */
| +------------------------------+
| | pid/tgid - 4 bytes |
| +------------------------------+
| | NLA header - 4 bytes | /* stats header */
| + -----------------------------+ <- oops. aligned on 4 byte boundary
| | struct taskstats - 328 bytes |
+-+------------------------------+

The start of the taskstats struct must be 8 byte aligned on IA64 (and other
systems with 8 byte alignment rules for 64-bit types) or runtime alignment
warnings will be issued.

This patch pads the pid/tgid field out to sizeof(long), which forces
the alignment of taskstats. The getdelays userspace code is ok with this
since it assumes 32-bit pid/tgid and then honors that header's length field.

An array is used to avoid exposing kernel memory contents to userspace in the
response.

Signed-off-by: Jeff Mahoney <[email protected]>
---
kernel/taskstats.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
struct nlattr *na, *ret;
int aggr;

+ /* If we don't pad, we end up with alignment on a 4 byte boundary.
+ * This causes lots of runtime warnings on systems requiring 8 byte
+ * alignment */
+ u32 pids[2] = { pid, 0 };
+ int pid_size = ALIGN(sizeof(pid), sizeof(long));
+
aggr = (type == TASKSTATS_TYPE_PID)
? TASKSTATS_TYPE_AGGR_PID
: TASKSTATS_TYPE_AGGR_TGID;
@@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
na = nla_nest_start(skb, aggr);
if (!na)
goto err;
- if (nla_put(skb, type, sizeof(pid), &pid) < 0)
+ if (nla_put(skb, type, pid_size, pids) < 0)
goto err;
ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
if (!ret)
--
Jeff Mahoney
SUSE Labs


2010-02-12 18:20:34

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

On Fri, 12 Feb 2010 11:48:27 -0500
Jeff Mahoney <[email protected]> wrote:

> prepare_reply sets up an skb for the response. If I understand it correctly,
> the payload contains:
>
> +--------------------------------+
> | genlmsghdr - 4 bytes |
> +--------------------------------+
> | NLA header - 4 bytes | /* Aggregate header */
> +-+------------------------------+
> | | NLA header - 4 bytes | /* PID header */
> | +------------------------------+
> | | pid/tgid - 4 bytes |

So we put another four zero bytes in here and add four to the "PID header".

> | +------------------------------+
> | | NLA header - 4 bytes | /* stats header */
> | + -----------------------------+ <- oops. aligned on 4 byte boundary
> | | struct taskstats - 328 bytes |
> +-+------------------------------+
>
> The start of the taskstats struct must be 8 byte aligned on IA64 (and other
> systems with 8 byte alignment rules for 64-bit types) or runtime alignment
> warnings will be issued.
>
> This patch pads the pid/tgid field out to sizeof(long), which forces
> the alignment of taskstats. The getdelays userspace code is ok with this
> since it assumes 32-bit pid/tgid and then honors that header's length field.
>
> An array is used to avoid exposing kernel memory contents to userspace in the
> response.
>
> Signed-off-by: Jeff Mahoney <[email protected]>
> ---
> kernel/taskstats.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
> struct nlattr *na, *ret;
> int aggr;
>
> + /* If we don't pad, we end up with alignment on a 4 byte boundary.
> + * This causes lots of runtime warnings on systems requiring 8 byte
> + * alignment */
> + u32 pids[2] = { pid, 0 };
> + int pid_size = ALIGN(sizeof(pid), sizeof(long));
> +
> aggr = (type == TASKSTATS_TYPE_PID)
> ? TASKSTATS_TYPE_AGGR_PID
> : TASKSTATS_TYPE_AGGR_TGID;
> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
> na = nla_nest_start(skb, aggr);
> if (!na)
> goto err;
> - if (nla_put(skb, type, sizeof(pid), &pid) < 0)
> + if (nla_put(skb, type, pid_size, pids) < 0)
> goto err;
> ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
> if (!ret)

So any code which assumes that the pid/tgid field is four bytes long
will break. Code which takes that length from the netlink message
header will work OK.

32-bit architectures are unaltered.

Seems safe enough. We'd be safer still if we didn't do this on 64-bit
architectures which don't need it. ie: x86_64. But if we do that we
add a risk that people will develop shoddy code which works on x86_64
and doesn't work on ia64.

hmm.

2010-02-12 19:20:29

by Jeff Mahoney

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

On 02/12/2010 01:19 PM, Andrew Morton wrote:
> On Fri, 12 Feb 2010 11:48:27 -0500
> Jeff Mahoney <[email protected]> wrote:
>
>> prepare_reply sets up an skb for the response. If I understand it correctly,
>> the payload contains:
>>
>> +--------------------------------+
>> | genlmsghdr - 4 bytes |
>> +--------------------------------+
>> | NLA header - 4 bytes | /* Aggregate header */
>> +-+------------------------------+
>> | | NLA header - 4 bytes | /* PID header */
>> | +------------------------------+
>> | | pid/tgid - 4 bytes |
>
> So we put another four zero bytes in here and add four to the "PID header".
>
>> | +------------------------------+
>> | | NLA header - 4 bytes | /* stats header */
>> | + -----------------------------+ <- oops. aligned on 4 byte boundary
>> | | struct taskstats - 328 bytes |
>> +-+------------------------------+
>>
>> The start of the taskstats struct must be 8 byte aligned on IA64 (and other
>> systems with 8 byte alignment rules for 64-bit types) or runtime alignment
>> warnings will be issued.
>>
>> This patch pads the pid/tgid field out to sizeof(long), which forces
>> the alignment of taskstats. The getdelays userspace code is ok with this
>> since it assumes 32-bit pid/tgid and then honors that header's length field.
>>
>> An array is used to avoid exposing kernel memory contents to userspace in the
>> response.
>>
>> Signed-off-by: Jeff Mahoney <[email protected]>
>> ---
>> kernel/taskstats.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> --- a/kernel/taskstats.c
>> +++ b/kernel/taskstats.c
>> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
>> struct nlattr *na, *ret;
>> int aggr;
>>
>> + /* If we don't pad, we end up with alignment on a 4 byte boundary.
>> + * This causes lots of runtime warnings on systems requiring 8 byte
>> + * alignment */
>> + u32 pids[2] = { pid, 0 };
>> + int pid_size = ALIGN(sizeof(pid), sizeof(long));
>> +
>> aggr = (type == TASKSTATS_TYPE_PID)
>> ? TASKSTATS_TYPE_AGGR_PID
>> : TASKSTATS_TYPE_AGGR_TGID;
>> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
>> na = nla_nest_start(skb, aggr);
>> if (!na)
>> goto err;
>> - if (nla_put(skb, type, sizeof(pid), &pid) < 0)
>> + if (nla_put(skb, type, pid_size, pids) < 0)
>> goto err;
>> ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
>> if (!ret)
>
> So any code which assumes that the pid/tgid field is four bytes long
> will break. Code which takes that length from the netlink message
> header will work OK.
>
> 32-bit architectures are unaltered.
>
> Seems safe enough. We'd be safer still if we didn't do this on 64-bit
> architectures which don't need it. ie: x86_64. But if we do that we
> add a risk that people will develop shoddy code which works on x86_64
> and doesn't work on ia64.

Is there a way to do that without needlessly complicating things? I
didn't see any existing infrastructure to do that.

Another option was to put an empty attribute in with a garbage type,
which would add a 4 byte header - but even the getdelays code included
with the kernel can't deal with that.

It's ugly all the way around.

-Jeff

--
Jeff Mahoney
SUSE Labs

2010-02-12 19:29:47

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

On Fri, 12 Feb 2010 14:20:23 -0500
Jeff Mahoney <[email protected]> wrote:

> > Seems safe enough. We'd be safer still if we didn't do this on 64-bit
> > architectures which don't need it. ie: x86_64. But if we do that we
> > add a risk that people will develop shoddy code which works on x86_64
> > and doesn't work on ia64.
>
> Is there a way to do that without needlessly complicating things? I
> didn't see any existing infrastructure to do that.

ifdefs? I don't think it's worth doing, really. Probably anyone who
wrote an application for this copied the getdelays.c code anyway.

2010-02-12 19:34:19

by Jeff Mahoney

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

On 02/12/2010 02:29 PM, Andrew Morton wrote:
> On Fri, 12 Feb 2010 14:20:23 -0500
> Jeff Mahoney <[email protected]> wrote:
>
>>> Seems safe enough. We'd be safer still if we didn't do this on 64-bit
>>> architectures which don't need it. ie: x86_64. But if we do that we
>>> add a risk that people will develop shoddy code which works on x86_64
>>> and doesn't work on ia64.
>>
>> Is there a way to do that without needlessly complicating things? I
>> didn't see any existing infrastructure to do that.
>
> ifdefs? I don't think it's worth doing, really. Probably anyone who
> wrote an application for this copied the getdelays.c code anyway.

Yeah I didn't want to get into #if defined(LIST OF ARCHES) for this. I
was hoping for a way to get the alignment rules for the arch. In the
absence of that, this is good enough. You're probably right about people
just lifting the getdelays.c code.

-Jeff

--
Jeff Mahoney
SUSE Labs

2010-02-13 02:14:28

by Balbir Singh

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

* Andrew Morton <[email protected]> [2010-02-12 10:19:57]:

> On Fri, 12 Feb 2010 11:48:27 -0500
> Jeff Mahoney <[email protected]> wrote:
>
> > prepare_reply sets up an skb for the response. If I understand it correctly,
> > the payload contains:
> >
> > +--------------------------------+
> > | genlmsghdr - 4 bytes |
> > +--------------------------------+
> > | NLA header - 4 bytes | /* Aggregate header */
> > +-+------------------------------+
> > | | NLA header - 4 bytes | /* PID header */
> > | +------------------------------+
> > | | pid/tgid - 4 bytes |
>
> So we put another four zero bytes in here and add four to the "PID header".
>

Do you know if these breaks existing applications?

> > | +------------------------------+
> > | | NLA header - 4 bytes | /* stats header */
> > | + -----------------------------+ <- oops. aligned on 4 byte boundary
> > | | struct taskstats - 328 bytes |
> > +-+------------------------------+
> >
> > The start of the taskstats struct must be 8 byte aligned on IA64 (and other
> > systems with 8 byte alignment rules for 64-bit types) or runtime alignment
> > warnings will be issued.
> >
> > This patch pads the pid/tgid field out to sizeof(long), which forces
> > the alignment of taskstats. The getdelays userspace code is ok with this
> > since it assumes 32-bit pid/tgid and then honors that header's length field.
> >

Could you define OK above? Does the application work without
recompiling? Have you checked for endianness issues?

> > An array is used to avoid exposing kernel memory contents to userspace in the
> > response.
> >
> > Signed-off-by: Jeff Mahoney <[email protected]>
> > ---
> > kernel/taskstats.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
> > struct nlattr *na, *ret;
> > int aggr;
> >
> > + /* If we don't pad, we end up with alignment on a 4 byte boundary.
> > + * This causes lots of runtime warnings on systems requiring 8 byte
> > + * alignment */
> > + u32 pids[2] = { pid, 0 };

Shouldn't this be endianness dependent?

> > + int pid_size = ALIGN(sizeof(pid), sizeof(long));
> > +
> > aggr = (type == TASKSTATS_TYPE_PID)
> > ? TASKSTATS_TYPE_AGGR_PID
> > : TASKSTATS_TYPE_AGGR_TGID;
> > @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
> > na = nla_nest_start(skb, aggr);
> > if (!na)
> > goto err;
> > - if (nla_put(skb, type, sizeof(pid), &pid) < 0)
> > + if (nla_put(skb, type, pid_size, pids) < 0)
> > goto err;
> > ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
> > if (!ret)
>
> So any code which assumes that the pid/tgid field is four bytes long
> will break. Code which takes that length from the netlink message
> header will work OK.
>

I think we assume that PID/TGID is 32 bits long, we use the length to
extract the rest of the message. We cast NLA_DATA to int* in
getdelays.c.

> 32-bit architectures are unaltered.
>
> Seems safe enough. We'd be safer still if we didn't do this on 64-bit
> architectures which don't need it. ie: x86_64. But if we do that we
> add a risk that people will develop shoddy code which works on x86_64
> and doesn't work on ia64.
>

May be, this deserves an ABI and version bump in taskstats. I'll test
this patch soon.

--
Three Cheers,
Balbir

2010-02-17 21:47:51

by Jeff Mahoney

[permalink] [raw]
Subject: Re: [PATCH] delayacct: align to 8 byte boundary on 64-bit systems

On 02/12/2010 09:14 PM, Balbir Singh wrote:
> * Andrew Morton <[email protected]> [2010-02-12 10:19:57]:
>
>> On Fri, 12 Feb 2010 11:48:27 -0500
>> Jeff Mahoney <[email protected]> wrote:
>>
>>> prepare_reply sets up an skb for the response. If I understand it correctly,
>>> the payload contains:
>>>
>>> +--------------------------------+
>>> | genlmsghdr - 4 bytes |
>>> +--------------------------------+
>>> | NLA header - 4 bytes | /* Aggregate header */
>>> +-+------------------------------+
>>> | | NLA header - 4 bytes | /* PID header */
>>> | +------------------------------+
>>> | | pid/tgid - 4 bytes |
>>
>> So we put another four zero bytes in here and add four to the "PID header".
>>
>
> Do you know if these breaks existing applications?

AFAIK it doesn't. The initial problem was reported by a partner and
there have been no reports of breakage since the fix was shipped in a
maintenance update for SLE11 in October. I think Andrew's guess about
users just copying the getdelays.c code is probably pretty accurate.

>>> | +------------------------------+
>>> | | NLA header - 4 bytes | /* stats header */
>>> | + -----------------------------+ <- oops. aligned on 4 byte boundary
>>> | | struct taskstats - 328 bytes |
>>> +-+------------------------------+
>>>
>>> The start of the taskstats struct must be 8 byte aligned on IA64 (and other
>>> systems with 8 byte alignment rules for 64-bit types) or runtime alignment
>>> warnings will be issued.
>>>
>>> This patch pads the pid/tgid field out to sizeof(long), which forces
>>> the alignment of taskstats. The getdelays userspace code is ok with this
>>> since it assumes 32-bit pid/tgid and then honors that header's length field.
>>>
>
> Could you define OK above? Does the application work without
> recompiling? Have you checked for endianness issues?

Yes, it seemed to work as before. I just used getdelays.c to test it.

>>> An array is used to avoid exposing kernel memory contents to userspace in the
>>> response.
>>>
>>> Signed-off-by: Jeff Mahoney <[email protected]>
>>> ---
>>> kernel/taskstats.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> --- a/kernel/taskstats.c
>>> +++ b/kernel/taskstats.c
>>> @@ -362,6 +362,12 @@ static struct taskstats *mk_reply(struct
>>> struct nlattr *na, *ret;
>>> int aggr;
>>>
>>> + /* If we don't pad, we end up with alignment on a 4 byte boundary.
>>> + * This causes lots of runtime warnings on systems requiring 8 byte
>>> + * alignment */
>>> + u32 pids[2] = { pid, 0 };
>
> Shouldn't this be endianness dependent?

It could if the user casts to a 64-bit type instead of a 32-bit type,
but that assumption was covered in the comment about the pids and tgids
being 32-bit.

>>> + int pid_size = ALIGN(sizeof(pid), sizeof(long));
>>> +
>>> aggr = (type == TASKSTATS_TYPE_PID)
>>> ? TASKSTATS_TYPE_AGGR_PID
>>> : TASKSTATS_TYPE_AGGR_TGID;
>>> @@ -369,7 +375,7 @@ static struct taskstats *mk_reply(struct
>>> na = nla_nest_start(skb, aggr);
>>> if (!na)
>>> goto err;
>>> - if (nla_put(skb, type, sizeof(pid), &pid) < 0)
>>> + if (nla_put(skb, type, pid_size, pids) < 0)
>>> goto err;
>>> ret = nla_reserve(skb, TASKSTATS_TYPE_STATS, sizeof(struct taskstats));
>>> if (!ret)
>>
>> So any code which assumes that the pid/tgid field is four bytes long
>> will break. Code which takes that length from the netlink message
>> header will work OK.
>>
>
> I think we assume that PID/TGID is 32 bits long, we use the length to
> extract the rest of the message. We cast NLA_DATA to int* in
> getdelays.c.
>
>> 32-bit architectures are unaltered.
>>
>> Seems safe enough. We'd be safer still if we didn't do this on 64-bit
>> architectures which don't need it. ie: x86_64. But if we do that we
>> add a risk that people will develop shoddy code which works on x86_64
>> and doesn't work on ia64.
>>
>
> May be, this deserves an ABI and version bump in taskstats. I'll test
> this patch soon.

That could be too. Though if an ABI bump is in the works, then it's
probably better to not fix it this way and have an empty attribute type
instead.

-Jeff

--
Jeff Mahoney
SuSE Labs