2008-10-08 07:26:25

by Harvey Harrison

[permalink] [raw]
Subject: FRV/ARM unaligned access question

David, Russell,

I noticed that frv/arm are the only two arches that currently use open-coded
byteshifting routines for both the cpu endianness and the other endianness
whereas just about all the other arches use a packed-struct version for the
cpu-endian and then the byteshifting versions (lifted from arm) for the other
endianness.

For frv, this was due to the consolidation of unaligned access helpers which
removed the assembly version. For ARM, I think it was done this way as the
kernel could be compiled for either endianness and so it was just easiest to
always have the byteshifting versions around.

>From the comments in the ARM version, time was spent to make sure the byteshifting
helpers used a small amount or registers. Was there a known problem with the
packed-struct implementation on ARM that made you choose to use byte-shifting
or was it just done this way to more easily support compiling for different
endianness?

If there isn't an issue I'm missing, could ARM/FRV move over to the packed-struct
version?

Cheers,

Harvey




2008-10-08 07:36:08

by Russell King

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

On Wed, Oct 08, 2008 at 12:26:13AM -0700, Harvey Harrison wrote:
> I noticed that frv/arm are the only two arches that currently use open-coded
> byteshifting routines for both the cpu endianness and the other endianness
> whereas just about all the other arches use a packed-struct version for the
> cpu-endian and then the byteshifting versions (lifted from arm) for the other
> endianness.

I'm sorry, I think you're mistaken. I've looked at x86, m68k and
parisc, and they all use assembly for their swab functions in
asm/byteorder.h.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2008-10-08 07:36:34

by Harvey Harrison

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

On Wed, 2008-10-08 at 08:35 +0100, Russell King wrote:
> On Wed, Oct 08, 2008 at 12:26:13AM -0700, Harvey Harrison wrote:
> > I noticed that frv/arm are the only two arches that currently use open-coded
> > byteshifting routines for both the cpu endianness and the other endianness
> > whereas just about all the other arches use a packed-struct version for the
> > cpu-endian and then the byteshifting versions (lifted from arm) for the other
> > endianness.
>
> I'm sorry, I think you're mistaken. I've looked at x86, m68k and
> parisc, and they all use assembly for their swab functions in
> asm/byteorder.h.
>

Sorry, not talking about byteorder at the moment, talking about
unaligned.h.

Cheers,

Harvey

2008-10-08 09:11:36

by Russell King

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

On Wed, Oct 08, 2008 at 12:36:19AM -0700, Harvey Harrison wrote:
> On Wed, 2008-10-08 at 08:35 +0100, Russell King wrote:
> > On Wed, Oct 08, 2008 at 12:26:13AM -0700, Harvey Harrison wrote:
> > > I noticed that frv/arm are the only two arches that currently use open-coded
> > > byteshifting routines for both the cpu endianness and the other endianness
> > > whereas just about all the other arches use a packed-struct version for the
> > > cpu-endian and then the byteshifting versions (lifted from arm) for the other
> > > endianness.
> >
> > I'm sorry, I think you're mistaken. I've looked at x86, m68k and
> > parisc, and they all use assembly for their swab functions in
> > asm/byteorder.h.
> >
>
> Sorry, not talking about byteorder at the moment, talking about
> unaligned.h.

At the moment, I've no idea what effect it'll have. I'd need to run
some tests to discover what the effect will be. Not sure when I'll
get around to that.

If someone else can be found to evaluate what the effect would be...

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2008-10-08 09:34:44

by Harvey Harrison

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

On Wed, 2008-10-08 at 10:10 +0100, Russell King wrote:
> On Wed, Oct 08, 2008 at 12:36:19AM -0700, Harvey Harrison wrote:
> > On Wed, 2008-10-08 at 08:35 +0100, Russell King wrote:
> > > On Wed, Oct 08, 2008 at 12:26:13AM -0700, Harvey Harrison wrote:
> > > > I noticed that frv/arm are the only two arches that currently use open-coded
> > > > byteshifting routines for both the cpu endianness and the other endianness
> > > > whereas just about all the other arches use a packed-struct version for the
> > > > cpu-endian and then the byteshifting versions (lifted from arm) for the other
> > > > endianness.
> > >
> > > I'm sorry, I think you're mistaken. I've looked at x86, m68k and
> > > parisc, and they all use assembly for their swab functions in
> > > asm/byteorder.h.
> > >
> >
> > Sorry, not talking about byteorder at the moment, talking about
> > unaligned.h.
>
> At the moment, I've no idea what effect it'll have. I'd need to run
> some tests to discover what the effect will be. Not sure when I'll
> get around to that.
>
> If someone else can be found to evaluate what the effect would be...
>

I don't have hardware to test with, but I'll do some cross-compiles to
investigate a bit. I was just curious if there was any known issues on
arm, or a specific arm compiler that made you choose the implementation
you did.

Cheers,

Harvey

2008-10-08 11:17:41

by David Howells

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

Harvey Harrison <[email protected]> wrote:

> If there isn't an issue I'm missing, could ARM/FRV move over to the
> packed-struct version?

Using this source:

typedef unsigned char u8;
typedef unsigned int u32;

struct __una_u32 { u32 x __attribute__((packed)); };

#if 0 // packed struct
static inline u32 __get_unaligned_cpu32(const void *p)
{
const struct __una_u32 *ptr = (const struct __una_u32 *)p;
return ptr->x;
}

static inline u32 get_unaligned_be32(const void *p)
{
return __get_unaligned_cpu32((const u8 *)p);
}

#else // manual byteshift
static inline u32 __get_unaligned_be32(const u8 *p)
{
return p[0] << 24 | p[1] << 16 | p[2] << 8 | p[3];
}

static inline u32 get_unaligned_be32(const void *p)
{
return __get_unaligned_be32((const u8 *)p);
}
#endif

u32 jump(u32 *p)
{
return get_unaligned_be32(p);
}

I see the packed struct version compile (with -O2) to:

jump:
ldub @(gr8,gr0),gr6
ldubi @(gr8,1),gr4
ldubi @(gr8,2),gr5
ldubi @(gr8,3),gr8
slli gr6,#24,gr6
slli gr4,#16,gr4
or.p gr4, gr6, gr4
slli gr5,#8,gr5
or gr5, gr4, gr5
or.p gr8, gr5, gr8
ret

and the byteshift version compile to:

jump:
ldubi.p @(gr8,1),gr7
mov gr8, gr5
ldubi @(gr5,2),gr6
ldub @(gr8,gr0),gr8
ldubi @(gr5,3),gr9
slli gr7,#16,gr7
slli gr6,#8,gr6
slli.p gr8,#24,gr8
or gr6, gr7, gr6
or gr8, gr9, gr8
or.p gr8, gr6, gr8
ret

so they're more or less equivalent, give or take the compiler using an extra
instruction unnecessarily.


Switching to the packed struct algorithms also reduces the kernel size very
slightly. Before:

warthog>size vmlinux
text data bss dec hex filename
2207836 66588 150189 2424613 24ff25 vmlinux

After:

warthog>size vmlinux
text data bss dec hex filename
2207804 66588 150189 2424581 24ff05 vmlinux

The attached patch boots okay over NFS.

David
---
[PATCH] FRV: Use packed-struct unalignment rather than manual-shift

From: David Howells <[email protected]>

Use the packed-struct unalignment algorithms rather than the manual-shift
unalignment algorithms.

This makes the kernel very slightly smaller.

Signed-off-by: David Howells <[email protected]>
---

include/asm-frv/unaligned.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/include/asm-frv/unaligned.h b/include/asm-frv/unaligned.h
index 839a2fb..d06b9bc 100644
--- a/include/asm-frv/unaligned.h
+++ b/include/asm-frv/unaligned.h
@@ -12,8 +12,8 @@
#ifndef _ASM_UNALIGNED_H
#define _ASM_UNALIGNED_H

-#include <linux/unaligned/le_byteshift.h>
-#include <linux/unaligned/be_byteshift.h>
+#include <linux/unaligned/le_struct.h>
+#include <linux/unaligned/be_struct.h>
#include <linux/unaligned/generic.h>

#define get_unaligned __get_unaligned_be

2008-10-08 20:23:15

by Harvey Harrison

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question

On Wed, 2008-10-08 at 12:16 +0100, David Howells wrote:
> Harvey Harrison <[email protected]> wrote:
>
> Switching to the packed struct algorithms also reduces the kernel size very
> slightly. Before:
>
> warthog>size vmlinux
> text data bss dec hex filename
> 2207836 66588 150189 2424613 24ff25 vmlinux
>
> After:
>
> warthog>size vmlinux
> text data bss dec hex filename
> 2207804 66588 150189 2424581 24ff05 vmlinux

Probably smaller as now the le values aren't being byteswapped anymore,
only the native endianess can use the struct version. Care to look at
the kernel size with the following instead?

From: Harvey Harrison <[email protected]>
Subject: [PATCH] frv: switch unaligned access to the packed-struct implementation

Signed-off-by: Harvey Harrison <[email protected]>
---
include/asm-frv/unaligned.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-frv/unaligned.h b/include/asm-frv/unaligned.h
index 839a2fb..6c61c05 100644
--- a/include/asm-frv/unaligned.h
+++ b/include/asm-frv/unaligned.h
@@ -13,7 +13,7 @@
#define _ASM_UNALIGNED_H

#include <linux/unaligned/le_byteshift.h>
-#include <linux/unaligned/be_byteshift.h>
+#include <linux/unaligned/be_struct.h>
#include <linux/unaligned/generic.h>

#define get_unaligned __get_unaligned_be
--
1.6.0.2.471.g47a76


2008-10-09 11:35:57

by David Howells

[permalink] [raw]
Subject: Re: FRV/ARM unaligned access question


Harvey Harrison <[email protected]> wrote:

> Probably smaller as now the le values aren't being byteswapped anymore,
> only the native endianess can use the struct version. Care to look at
> the kernel size with the following instead?

For some unknown reason, the data segment shrank by four bytes. The text
segment stayed the same though:

warthog>size vmlinux
text data bss dec hex filename
2207804 66584 150189 2424577 24ff01 vmlinux

I would guess that in that kernel almost nothing uses LE unalignment.

The kernel works, so feel free to add my Acked-by to the patch.

David