On Fri, 2007-06-15 at 01:59 +0000, Linux Kernel Mailing List wrote:
> Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9b01bd5b284bbf519b726b39f1352023cb5e9e69
> Commit: 9b01bd5b284bbf519b726b39f1352023cb5e9e69
> Parent: dc7a93190c21edbf3ed23e678ad04f852b9cff28
> Author: Dave Airlie <[email protected]>
> AuthorDate: Sun Jun 10 16:00:27 2007 +1000
> Committer: Dave Airlie <[email protected]>
> CommitDate: Sun Jun 10 16:00:27 2007 +1000
>
> drm: fix radeon setparam on 32/64 bit systems.
>
> The alignment on 64-bit is different for 64-bit values.
Only on i386. I think you just broke ppc32-on-ppc64.
> Signed-off-by: Dave Airlie <[email protected]>
> ---
> drivers/char/drm/radeon_ioc32.c | 26 ++++++++++++++++++++++++++
> 1 files changed, 26 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/char/drm/radeon_ioc32.c b/drivers/char/drm/radeon_ioc32.c
> index 1f1f9cc..04126c2 100644
> --- a/drivers/char/drm/radeon_ioc32.c
> +++ b/drivers/char/drm/radeon_ioc32.c
> @@ -349,6 +349,31 @@ static int compat_radeon_irq_emit(struct file *file, unsigned int cmd,
> DRM_IOCTL_RADEON_IRQ_EMIT, (unsigned long)request);
> }
>
> +typedef struct drm_radeon_setparam32 {
> + int param;
> + u64 value;
> +} __attribute__((packed)) drm_radeon_setparam32_t;
> +
> +static int compat_radeon_cp_setparam(struct file *file, unsigned int cmd,
> + unsigned long arg)
> +{
> + drm_radeon_setparam32_t req32;
> + drm_radeon_setparam_t __user *request;
> +
> + if (copy_from_user(&req32, (void __user *) arg, sizeof(req32)))
> + return -EFAULT;
> +
> + request = compat_alloc_user_space(sizeof(*request));
> + if (!access_ok(VERIFY_WRITE, request, sizeof(*request))
> + || __put_user(req32.param, &request->param)
> + || __put_user((void __user *)(unsigned long)req32.value,
> + &request->value))
> + return -EFAULT;
> +
> + return drm_ioctl(file->f_dentry->d_inode, file,
> + DRM_IOCTL_RADEON_SETPARAM, (unsigned long) request);
> +}
> +
> drm_ioctl_compat_t *radeon_compat_ioctls[] = {
> [DRM_RADEON_CP_INIT] = compat_radeon_cp_init,
> [DRM_RADEON_CLEAR] = compat_radeon_cp_clear,
> @@ -357,6 +382,7 @@ drm_ioctl_compat_t *radeon_compat_ioctls[] = {
> [DRM_RADEON_VERTEX2] = compat_radeon_cp_vertex2,
> [DRM_RADEON_CMDBUF] = compat_radeon_cp_cmdbuf,
> [DRM_RADEON_GETPARAM] = compat_radeon_cp_getparam,
> + [DRM_RADEON_SETPARAM] = compat_radeon_cp_setparam,
> [DRM_RADEON_ALLOC] = compat_radeon_mem_alloc,
> [DRM_RADEON_IRQ_EMIT] = compat_radeon_irq_emit,
> };
> -
> To unsubscribe from this list: send the line "unsubscribe git-commits-head" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
dwmw2
On Fri, 2007-06-15 at 09:44 +0100, David Woodhouse wrote:
> On Fri, 2007-06-15 at 01:59 +0000, Linux Kernel Mailing List wrote:
> > Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9b01bd5b284bbf519b726b39f1352023cb5e9e69
> > Commit: 9b01bd5b284bbf519b726b39f1352023cb5e9e69
> > Parent: dc7a93190c21edbf3ed23e678ad04f852b9cff28
> > Author: Dave Airlie <[email protected]>
> > AuthorDate: Sun Jun 10 16:00:27 2007 +1000
> > Committer: Dave Airlie <[email protected]>
> > CommitDate: Sun Jun 10 16:00:27 2007 +1000
> >
> > drm: fix radeon setparam on 32/64 bit systems.
> >
> > The alignment on 64-bit is different for 64-bit values.
>
> Only on i386. I think you just broke ppc32-on-ppc64.
Indeed... I'm pretty sure the 64 bits quantity will be naturally aligned
on ppc32 (and possibly others).
Ben.
On Friday 15 June 2007, Benjamin Herrenschmidt wrote:
>
> > Only on i386. I think you just broke ppc32-on-ppc64.
>
> Indeed... I'm pretty sure the 64 bits quantity will be naturally aligned
> on ppc32 (and possibly others).
>
Ok, I'll bite. I've been telling people for ages what the right solution
for this is, but apparently sometimes you need to do it yourself if you
want to have it done well ;-)
Patch follows to introduce compat_u64. Please redefine your data
structure as
typedef struct drm_radeon_setparam32 {
int param;
compat_u64 value;
} drm_radeon_setparam32_t;
Arnd <><
One common problem with 32 bit system call and ioctl emulation
is the different alignment rules between i386 and 64 bit machines.
A number of drivers work around this by marking the compat
structures as 'attribute((packed))', which is not the right
solution because it breaks all the non-x86 architectures that
want to use the same compat code.
Hopefully, this patch improves the situation, it introduces two
new types, compat_u64 and compat_s64. These are defined on all
architectures to have the same size and alignment as the 32 bit
version of u64 and s64.
Signed-off-by: Arnd Bergmann <[email protected]>
---
See the discussions about 'Re: drm: fix radeon setparam on
32/64 bit systems.' and 'diskquota: 32bit quota tools on
64bit architectures' about where this comes from.
Index: linux-2.6/include/asm-ia64/compat.h
===================================================================
--- linux-2.6.orig/include/asm-ia64/compat.h
+++ linux-2.6/include/asm-ia64/compat.h
@@ -31,8 +31,10 @@ typedef s32 compat_timer_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 __attribute__((aligned(4))) compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 __attribute__((aligned(4))) compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-mips/compat.h
===================================================================
--- linux-2.6.orig/include/asm-mips/compat.h
+++ linux-2.6/include/asm-mips/compat.h
@@ -37,8 +37,10 @@ typedef s32 compat_key_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-parisc/compat.h
===================================================================
--- linux-2.6.orig/include/asm-parisc/compat.h
+++ linux-2.6/include/asm-parisc/compat.h
@@ -31,8 +31,10 @@ typedef s32 compat_timer_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-powerpc/compat.h
===================================================================
--- linux-2.6.orig/include/asm-powerpc/compat.h
+++ linux-2.6/include/asm-powerpc/compat.h
@@ -33,8 +33,10 @@ typedef s32 compat_timer_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-s390/compat.h
===================================================================
--- linux-2.6.orig/include/asm-s390/compat.h
+++ linux-2.6/include/asm-s390/compat.h
@@ -60,8 +60,10 @@ typedef s32 compat_timer_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-sparc64/compat.h
===================================================================
--- linux-2.6.orig/include/asm-sparc64/compat.h
+++ linux-2.6/include/asm-sparc64/compat.h
@@ -31,8 +31,10 @@ typedef s32 compat_timer_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
Index: linux-2.6/include/asm-x86_64/compat.h
===================================================================
--- linux-2.6.orig/include/asm-x86_64/compat.h
+++ linux-2.6/include/asm-x86_64/compat.h
@@ -33,8 +33,10 @@ typedef s32 compat_key_t;
typedef s32 compat_int_t;
typedef s32 compat_long_t;
+typedef s64 __attribute__((aligned(4))) compat_s64;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u64 __attribute__((aligned(4))) compat_u64;
struct compat_timespec {
compat_time_t tv_sec;
From: Arnd Bergmann <[email protected]>
Date: Fri, 15 Jun 2007 11:31:37 +0200
> One common problem with 32 bit system call and ioctl emulation
> is the different alignment rules between i386 and 64 bit machines.
> A number of drivers work around this by marking the compat
> structures as 'attribute((packed))', which is not the right
> solution because it breaks all the non-x86 architectures that
> want to use the same compat code.
>
> Hopefully, this patch improves the situation, it introduces two
> new types, compat_u64 and compat_s64. These are defined on all
> architectures to have the same size and alignment as the 32 bit
> version of u64 and s64.
>
> Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: David S. Miller <[email protected]>
On Friday 15 June 2007 11:31:37 Arnd Bergmann wrote:
> One common problem with 32 bit system call and ioctl emulation
> is the different alignment rules between i386 and 64 bit machines.
> A number of drivers work around this by marking the compat
> structures as 'attribute((packed))', which is not the right
> solution because it breaks all the non-x86 architectures that
> want to use the same compat code.
Why does it break them? It should just make them a little slower.
The network code requires unaligned accesses to work
anyways so if your architecture doesn't support them it is already
remotely crashable.
-Andi
On Fri, 2007-06-15 at 13:55 +0200, Andi Kleen wrote:
> On Friday 15 June 2007 11:31:37 Arnd Bergmann wrote:
> > One common problem with 32 bit system call and ioctl emulation
> > is the different alignment rules between i386 and 64 bit machines.
> > A number of drivers work around this by marking the compat
> > structures as 'attribute((packed))', which is not the right
> > solution because it breaks all the non-x86 architectures that
> > want to use the same compat code.
>
> Why does it break them? It should just make them a little slower.
>
> The network code requires unaligned accesses to work
> anyways so if your architecture doesn't support them it is already
> remotely crashable.
alignof(uint64_t) is 8 on just about every 32-bit architecture except
i386. Using __attribute__((packed)) for the 32-on-64 compat code is thus
wrong on every 64-bit architecture except x86_64 and ia64.
It's the _location_ which is wrong; the handling of unaligned loads is
irrelevant (and Linux actually supports a bunch of architecture on which
fixups are impossible now, btw).
--
dwmw2
On Friday 15 June 2007, Andi Kleen wrote:
>
> On Friday 15 June 2007 11:31:37 Arnd Bergmann wrote:
> > One common problem with 32 bit system call and ioctl emulation
> > is the different alignment rules between i386 and 64 bit machines.
> > A number of drivers work around this by marking the compat
> > structures as 'attribute((packed))', which is not the right
> > solution because it breaks all the non-x86 architectures that
> > want to use the same compat code.
>
> Why does it break them? It should just make them a little slower.
>
> The network code requires unaligned accesses to work
> anyways so if your architecture doesn't support them it is already
> remotely crashable.
>
It doesn't break in all cases, but quite often, you have
something like:
struct foo {
__u32 a;
__u64 b;
};
If you define a
struct compat_foo {
__u32 a;
__u64 b;
} __attribute__((packed));
That is broken on all non-x86 architectures, because it removes the
padding that is inserted on the respective 32 bit platforms, while
struct compat_foo {
__u32 a;
compat_u64 b;
};
Is a correct definition on all architectures. It also produces
somewhat better code if the architecture does not support unaligned
data access, but that is just an unintended side-effect.
Arnd <><
Andi Kleen <[email protected]> wrote:
> Why does it break them? It should just make them a little slower.
Not all CPUs deliver recoverable misalignment exceptions. This is probably
particularly true of NOMMU-mode archs where the CPU designed may have taken
the view that if a data exception is delivered, then the whole system is kaput
anyway and must be restarted.
> The network code requires unaligned accesses to work anyways so if your
> architecture doesn't support them it is already remotely crashable.
I thought we'd fixed all that.
David
On Fri, Jun 15, 2007 at 11:31:37AM +0200, Arnd Bergmann wrote:
> One common problem with 32 bit system call and ioctl emulation
> is the different alignment rules between i386 and 64 bit machines.
> A number of drivers work around this by marking the compat
> structures as 'attribute((packed))', which is not the right
> solution because it breaks all the non-x86 architectures that
> want to use the same compat code.
>
> Hopefully, this patch improves the situation, it introduces two
> new types, compat_u64 and compat_s64. These are defined on all
> architectures to have the same size and alignment as the 32 bit
> version of u64 and s64.
You're relying on compat_[us]64 being only used in structures which are
already packed. If someone uses them in a non-packed struct, they won't
decrease the alignment. I think it would be more effective to specify
it as:
__attribute__((aligned(4), packed))
The other problem is that if someone defines a struct like this:
struct foo {
short bar;
compat_s64 baz;
} __attribute__((packed))
it'll have different definitions on x86 and ia64.
So I think we should be aiming for the ((aligned, packed)) definition and
remove the __attribute__((packed)) from the struct definitions. What do
you think?
On Fri, 2007-06-15 at 06:11 -0600, Matthew Wilcox wrote:
> You're relying on compat_[us]64 being only used in structures which are
> already packed.
In which case the whole exercise is pointless, on account of the
structure being already packed. It was _already_ laid out the same on
32-bit and 64-bit builds.
> If someone uses them in a non-packed struct, they won't
> decrease the alignment. I think it would be more effective to specify
> it as:
> __attribute__((aligned(4), packed))
Good point. Yes, it looks like we need the additional 'packed' in order
for the aligned(4) to be anything other than a no-op.
--
dwmw2
> It's the _location_ which is wrong;
As long as everybody using it sees the attribute packed it is not wrong at
all. ABIs are just a convention, not a law.
-Andi
On Friday 15 June 2007, Matthew Wilcox wrote:
> You're relying on compat_[us]64 being only used in structures which are
> already packed. ?If someone uses them in a non-packed struct, they won't
> decrease the alignment. ?I think it would be more effective to specify
> it as:
>
> __attribute__((aligned(4), packed))
That's what I thought as well at first, since this is how the gcc
documentation seems to describe it. However, recent version of gcc
complain about this:
gcc-4.1 -Wall -O2 test.c -c
test.c:1: warning: 'packed' attribute ignored
I have tested versions 2.95, 3.3 and 4.1, an they all ignore do the
right thing when you do not specify the packed attribute.
> The other problem is that if someone defines a struct like this:
>
> struct foo {
> ????????short bar;
> ????????compat_s64 baz;
> } __attribute__((packed))
>
> it'll have different definitions on x86 and ia64.
>
> So I think we should be aiming for the ((aligned, packed)) definition and
> remove the __attribute__((packed)) from the struct definitions. ?What do
> you think?
There should never be an __attribute__((packed)) to solve this alignment
problem, neither in the definition of compat_s64 nor in the definition of
a data structure using it.
We might ask the gcc developers to clarify the documentation, which as of 4.1
states:
The `aligned' attribute can only increase the alignment; but you
can decrease it by specifying `packed' as well. See below.
My understanding is that this only applies to statically allocated variables,
but not to automatic stack variables and to usage of the type inside of
a data structure.
Arnd <><
> That is broken on all non-x86 architectures,
It cannot be broken, it just might be somewhat slower
-Andi
On Friday 15 une 2007 14:09, David Howells wrote:
> Andi Kleen <[email protected]> wrote:
> > Why does it break them? It should just make them a little slower.
>
> Not all CPUs deliver recoverable misalignment exceptions.
These CPUs are too broken to run Linux then.
> > The network code requires unaligned accesses to work anyways so if your
> > architecture doesn't support them it is already remotely crashable.
>
> I thought we'd fixed all that.
Did you audit the complete network stack?
-Andi
On Fri, 2007-06-15 at 14:40 +0200, Andi Kleen wrote:
> > That is broken on all non-x86 architectures,
> It cannot be broken, it just might be somewhat slower
No, Andi. It's broken.
We're speaking of a 32-bit ioctl compat routine. I would say it's more
than 'convention' that the structure used by the compat_ioctl routine
actually matches the 'real' structure as it gets laid out on the 32-bit
architecture.
The 'real' structure as used by the 32-bit userspace does not have the
'packed' attribute. Thus the u64 member of the structure is aligned to 8
bytes on _all_ relevant 32-bit architectures except for i386.
By adding 'packed' in the compat_ioctl routine, you cause it to expect a
structure which does not match what userspace is using, for all non-x86
architectures. That's kind of not very compatible. I call that 'broken'.
--
dwmw2
> > > Why does it break them? It should just make them a little slower.
> > Not all CPUs deliver recoverable misalignment exceptions.
> These CPUs are too broken to run Linux then.
People fixed that up
>
> > > The network code requires unaligned accesses to work anyways so if your
> > > architecture doesn't support them it is already remotely crashable.
> >
> > I thought we'd fixed all that.
>
> Did you audit the complete network stack?
For the parts used by the processors in question yes people have done
that work so using the types without unaligned.
Alan
On Friday 15 June 2007 14:54:20 Alan Cox wrote:
> > > > The network code requires unaligned accesses to work anyways so if your
> > > > architecture doesn't support them it is already remotely crashable.
> > >
> > > I thought we'd fixed all that.
> >
> > Did you audit the complete network stack?
>
> For the parts used by the processors in question yes
That means? They're expected to run only a subset of the network stack?
Is that expressed in Kconfig? Is it documented that the rest is dangerous?
> people have done
> that work so using the types without unaligned.
Very brave; we're talking about around half a million lines
of non trivial source code here.
-Andi
On Fri, 2007-06-15 at 13:54 +0100, Alan Cox wrote:
> For the parts used by the processors in question yes people have done
> that work so using the types without unaligned.
I still think we could use a 'get_maybe_but_probably_not_unaligned()',
although it could have a better name.
People now use get_unaligned() for two reasons -- firstly because we
_expect_ something to be unaligned, and we get a performance improvement
if we just emit code to directly handle that instead of taking the trap.
And more recently we've also started to use it because there's a
_slight_ chance it could be unaligned, and some architectures can't
cope.
Where the unalignment is possible but unlikely, many architectures
probably shouldn't be emitting code to cope with it; they should just
take the trap in the rare case.
In fact, the probability threshold at which it makes sense to just take
the trap will vary from architecture to architecture. We should probably
just have a get_probably_unaligned(...) macro which has its first arg as
an _estimated_ constant probability. Then the architecture code could
emit what it likes according to the probability.
--
dwmw2
> > For the parts used by the processors in question yes
>
> That means? They're expected to run only a subset of the network stack?
> Is that expressed in Kconfig? Is it documented that the rest is dangerous?
Have you audited every other line of code for every other kind of bug ?
> > people have done
> > that work so using the types without unaligned.
>
> Very brave; we're talking about around half a million lines
> of non trivial source code here.
And debug simulators that can be made to trap such accesses, and in most
cases processors which fault such an access (so you find it) but don't
provide enough information to restart.
The testing isn't that hard for a given embedded system and having done
work Linux does not need other changes re-breaking things.
Alan
On Fri, 2007-06-15 at 11:31 +0200, Arnd Bergmann wrote:
> One common problem with 32 bit system call and ioctl emulation
> is the different alignment rules between i386 and 64 bit machines.
> A number of drivers work around this by marking the compat
> structures as 'attribute((packed))', which is not the right
> solution because it breaks all the non-x86 architectures that
> want to use the same compat code.
>
> Hopefully, this patch improves the situation, it introduces two
> new types, compat_u64 and compat_s64. These are defined on all
> architectures to have the same size and alignment as the 32 bit
> version of u64 and s64.
>
> Signed-off-by: Arnd Bergmann <[email protected]>
Sounds good
Acked-by: Benjamin Herrenschmidt <[email protected]>
On Fri, 2007-06-15 at 14:40 +0200, Andi Kleen wrote:
> > That is broken on all non-x86 architectures,
>
> It cannot be broken, it just might be somewhat slower
It is because there is already code out there that expects the padding
(32 bits userland code that is).
Ben.
> And debug simulators that can be made to trap such accesses, and in most
> cases processors which fault such an access (so you find it) but don't
> provide enough information to restart.
>
> The testing isn't that hard for a given embedded system and having done
> work Linux does not need other changes re-breaking things.
Hopefully everybody who deploys these systems knows this. It seems
like a open death trap to me, especially since the consequences
are so severe: remote packet of death, could be a recall for
a network conntected embedded device that doesn't easily allow firmware
update. And they would rightfully blame Linux.
It would be much safer if the parts of the stack that weren't
audited/tested were marked this way and check for BROKEN_UNALIGNED or similar.
Also frankly I'm surprised that whoever designed these systems
didn't learn from the old M68000 who made this mistake the first time.
-Andi
On Fri, 2007-06-15 at 15:27 +0200, Andi Kleen wrote:
> Hopefully everybody who deploys these systems knows this. It seems
> like a open death trap to me, especially since the consequences
> are so severe: remote packet of death, could be a recall for
> a network conntected embedded device that doesn't easily allow
> firmware update. And they would rightfully blame Linux.
>
> It would be much safer if the parts of the stack that weren't
> audited/tested were marked this way and check for BROKEN_UNALIGNED or
> similar.
If we did the get_probably_unaligned() thing, then on architectures
where we _do_ trap and fix up, we could actually spit out a warning
whenever we hit a trap and it _wasn't_ marked with
get_probably_unaligned().
> Also frankly I'm surprised that whoever designed these systems
> didn't learn from the old M68000 who made this mistake the first time.
Mostly, the machines which can't handle traps are the machines which
don't even have an MMU. Why add hardware when you don't really need it?
Fundamentally, it isn't a hardware design flaw. You could perhaps ask
interesting questions about whether Linux is _really_ the right software
to be using on them, but the fact remains that Linux _is_ the software
people are using on them.
--
dwmw2
On Fri, Jun 15, 2007 at 02:42:23PM +0200, Arnd Bergmann wrote:
> That's what I thought as well at first, since this is how the gcc
> documentation seems to describe it. However, recent version of gcc
> complain about this:
>
> gcc-4.1 -Wall -O2 test.c -c
> test.c:1: warning: 'packed' attribute ignored
>
> I have tested versions 2.95, 3.3 and 4.1, an they all ignore do the
> right thing when you do not specify the packed attribute.
...
> We might ask the gcc developers to clarify the documentation, which as of 4.1
> states:
>
> The `aligned' attribute can only increase the alignment; but you
> can decrease it by specifying `packed' as well. See below.
>
> My understanding is that this only applies to statically allocated variables,
> but not to automatic stack variables and to usage of the type inside of
> a data structure.
Here's a program which illustrates the source of confusion:
#include <stdio.h>
#include <stddef.h>
typedef unsigned long long __attribute__((aligned(4))) compat_u64;
struct foo {
int y;
unsigned long long __attribute__((aligned(4))) x;
};
struct bar {
int y;
compat_u64 x;
};
int main(void)
{
printf("offset of foo->x is %lu\n", offsetof(struct foo, x));
printf("offset of bar->x is %lu\n", offsetof(struct bar, x));
return 0;
}
output (on ia64, and I'm told other 64-bit platforms) is:
$ ./test
offset of foo->x is 8
offset of bar->x is 4
I'll try and come up with some wording that works for the GCC manual.
On Friday 15 June 2007, Matthew Wilcox wrote:
> Here's a program which illustrates the source of confusion:
>
> #include <stdio.h>
> #include <stddef.h>
>
> typedef unsigned long long __attribute__((aligned(4))) compat_u64;
>
> struct foo {
> ? ? ? ? int y;
> ? ? ? ? unsigned long long __attribute__((aligned(4))) x;
> };
>
> struct bar {
> ? ? ? ? int y;
> ? ? ? ? compat_u64 x;
> };
>
> int main(void)
> {
> ? ? ? ? printf("offset of foo->x is %lu\n", offsetof(struct foo, x));
> ? ? ? ? printf("offset of bar->x is %lu\n", offsetof(struct bar, x));
> ? ? ? ? return 0;
> }
>
> output (on ia64, and I'm told other 64-bit platforms) is:
>
> $ ./test
> offset of foo->x is 8
> offset of bar->x is 4
>
> I'll try and come up with some wording that works for the GCC manual.
I just talked to Ulrich Weigand, who explained to me that
__attribute__((packed)) should not be specified on a typedef that is
not also a struct/union/enum definition, because it can not change the
type anyway.
Also, the attribute((aligned(x))) works differently in a typedef than
in a field or variable declaration:
In your struct foo, __attribute__((aligned(4))) does not have any
effect because the attribute on a field declaration will only increase
the alignment if you specify a larger value than the default alignment
for the member type.
In struct bar, you have two members that both have type with a default
alignment of 4, because the typedef overwrote the default alignment
for the compat_u64 type.
Arnd <><
On Fri 15 Jun 2007 08:54, Andi Kleen pondered:
> On Friday 15 June 2007 14:54:20 Alan Cox wrote:
>
> > > > > The network code requires unaligned accesses to work anyways so if
your
> > > > > architecture doesn't support them it is already remotely crashable.
> > > >
> > > > I thought we'd fixed all that.
> > >
> > > Did you audit the complete network stack?
> >
> > For the parts used by the processors in question yes
>
> That means? They're expected to run only a subset of the network stack?
> Is that expressed in Kconfig? Is it documented that the rest is dangerous?
For the architecture we use (Blackfin), it does not support unaligned
accesses, and we purposely never put in the trap/fixup code - we trap, and
printk("fix your source");
We have run into a few kernel issues (never networking) this way - but a fixup
of the source is normally the best solution - since it doesn't impose a
hidden performance issue by trapping everything all the time.
> > people have done
> > that work so using the types without unaligned.
>
> Very brave; we're talking about around half a million lines
> of non trivial source code here.
Is there something specific that you can think of that we should be testing?
We have done alot of testing, and people have shipped alot of systems
connected to a varity of networks, and have run all kinds of protocols on
them.
-Robin
> For the architecture we use (Blackfin), it does not support unaligned
> accesses, and we purposely never put in the trap/fixup code - we trap, and
> printk("fix your source");
For the kernel you should fix up too in addition to the printk. Otherwise
you risk a ping of death in the field with some more obscure protocol.
Also the printk should be load limited.
> > > people have done
> > > that work so using the types without unaligned.
> >
> > Very brave; we're talking about around half a million lines
> > of non trivial source code here.
>
> Is there something specific that you can think of that we should be
> testing?
A lot of protocols have variable length headers. Normally everything
is aligned, but an attacker could purposefully misalign some headers
using variable length padding, causing fields in later headers to be
misaligned.
e.g. the original reason this was forbidden was because the TCP
option parser could be tricked into reading time stamps unaligned.
That code is now using get_unaligned(), but there are probably
other culprits too (there is a lot of code in the network stack)
Unfortunately it is hard to test all combinations, so the only safe
alternative would be to audit source code. Another possibility would be to
use a tainted data scheme similar sparse's __user/__iomem annotations with
a static code checker (or extending sparse), but that would also require a lot
of work to do properly and add the necessarily annotations.
Also watch out for netfilter modules -- they make all this much more
complex. And drivers possibly too.
> We have done alot of testing, and people have shipped alot of systems
> connected to a varity of networks, and have run all kinds of protocols on
> them.
Well behaved peers normally don't generate unaligned headers. But
that doesn't mean it couldn't be exploited by someone malicious.
-Andi
On Fri, 2007-06-15 at 07:45 -0600, Matthew Wilcox wrote:
>
> $ ./test
> offset of foo->x is 8
> offset of bar->x is 4
Looks to me like bar (that is compat_s64) is doing the right thing, no ?
Ben.
On Fri, 2007-06-15 at 23:22 +0200, Andi Kleen wrote:
> > For the architecture we use (Blackfin), it does not support unaligned
> > accesses, and we purposely never put in the trap/fixup code - we trap, and
> > printk("fix your source");
>
> For the kernel you should fix up too in addition to the printk. Otherwise
> you risk a ping of death in the field with some more obscure protocol.
> Also the printk should be load limited.
Sometimes you can't. Some architectures don't even take a trap, and on
others (FR-V without MMU) it's an imprecise exception.
--
dwmw2
On Fri, 2007-06-15 at 11:31 +0200, Arnd Bergmann wrote:
> Hopefully, this patch improves the situation, it introduces two
> new types, compat_u64 and compat_s64. These are defined on all
> architectures to have the same size and alignment as the 32 bit
> version of u64 and s64.
Will GCC know that it needs to emit code to handle that (mis)alignment?
Preliminary tests show that it does load the value bytewise if we use
the 'packed' attribute structure on ppc64, but doesn't if we use
compat_u64. But then, I don't think it actually _needs_ to handle it n
ppc64 anyway, so maybe that's not such a good test.
I'll send a simple patch to fix the bug for now, just making the new
wrapper conditional on X86_64||IA64.
--
dwmw2
On Saturday 16 June 2007, David Woodhouse wrote:
> Will GCC know that it needs to emit code to handle that (mis)alignment?
I've tested this with gcc-4.0.3, and it does the right thing, which
is to split a 4 byte aligned 64 bit load/store into two 32 bit accesses,
if you pass -mstrict-align.
> Preliminary tests show that it does load the value bytewise if we use
> the 'packed' attribute structure on ppc64, but doesn't if we use
> compat_u64. But then, I don't think it actually _needs_ to handle it n
> ppc64 anyway, so maybe that's not such a good test.
Right. Note that the behaviour of compat_u64 is the same as when you
pass attribute((packed,aligned(4))) to the structure, where it also won't
do byte accesses, and split the 64 bit access only if you pass
-mstrict-align.
Arnd <><
On Saturday 16 June 2007, Arnd Bergmann wrote:
> On Saturday 16 June 2007, David Woodhouse wrote:
> > Will GCC know that it needs to emit code to handle that (mis)alignment?
>
> I've tested this with gcc-4.0.3, and it does the right thing, which
> is to split a 4 byte aligned 64 bit load/store into two 32 bit accesses,
> if you pass -mstrict-align.
I just realized this was correct but slightly misleading. On powerpc, we
don't set the 'attribute((aligned(4)))' on compat_64, so there is never
a reason to handle the misalignment, even though it would work.
On x86_64, misaligned loads are always ok, so gcc never needs to
care about this, even attribute((packed)) does not cause byte access
here.
Arnd <><
On Sat, 2007-06-16 at 13:21 +0200, Arnd Bergmann wrote:
> On Saturday 16 June 2007, Arnd Bergmann wrote:
> > On Saturday 16 June 2007, David Woodhouse wrote:
> > > Will GCC know that it needs to emit code to handle that (mis)alignment?
> >
> > I've tested this with gcc-4.0.3, and it does the right thing, which
> > is to split a 4 byte aligned 64 bit load/store into two 32 bit accesses,
> > if you pass -mstrict-align.
>
> I just realized this was correct but slightly misleading. On powerpc, we
> don't set the 'attribute((aligned(4)))' on compat_64, so there is never
> a reason to handle the misalignment, even though it would work.
You're right. My question was probably not relevant -- all these 64-bit
architectures cope with misaligned loads anyway. If we ever have to deal
with 32-bit compat on a 64-bit architecture which can't handle
misalignment, I'm just going to hide under my desk and never come out.
> On x86_64, misaligned loads are always ok, so gcc never needs to
> care about this, even attribute((packed)) does not cause byte access
> here.
IA64 too, but it'll be handled there too -- either naturally or by
fixups; it doesn't matter.
--
dwmw2
On Sat, Jun 16, 2007 at 12:34:11PM +0100, David Woodhouse wrote:
> You're right. My question was probably not relevant -- all these 64-bit
> architectures cope with misaligned loads anyway. If we ever have to deal
> with 32-bit compat on a 64-bit architecture which can't handle
> misalignment, I'm just going to hide under my desk and never come out.
... 32-bit compat on a 64-bit architecture where the 32-bit architecture
aligned 64-bit quantities to 32-bit boundaries ...
> > On x86_64, misaligned loads are always ok, so gcc never needs to
> > care about this, even attribute((packed)) does not cause byte access
> > here.
>
> IA64 too, but it'll be handled there too -- either naturally or by
> fixups; it doesn't matter.
Yes. iirc, McKinley and later handle misaligned loads within a cacheline
without interrupting. Merced would interrupt on every misaligned load.