Hi,
Patch 70674f95c0a2ea694d5c39f4e514f538a09be36f
[PATCH] Optimize select/poll by putting small data sets on the stack
resulted in the poll stack being 4-byte aligned on 64-bit
architectures, causing misaligned accesses to elements in the array.
This patch fixes it by declaring the stack in terms of 'long' instead
of 'char'.
Cheers,
Jes
Force alignment of poll stack to long to avoid unaligned access on 64
bit architectures.
Signed-off-by: Jes Sorensen <[email protected]>
---
fs/select.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Index: linux-2.6/fs/select.c
===================================================================
--- linux-2.6.orig/fs/select.c
+++ linux-2.6/fs/select.c
@@ -639,8 +639,10 @@
struct poll_list *walk;
struct fdtable *fdt;
int max_fdset;
- /* Allocate small arguments on the stack to save memory and be faster */
- char stack_pps[POLL_STACK_ALLOC];
+ /* Allocate small arguments on the stack to save memory and be
+ faster - use long to make sure the buffer is aligned properly
+ on 64 bit archs to avoid unaligned access */
+ long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
struct poll_list *stack_pp = NULL;
/* Do a sanity check on nfds ... */
On Friday 31 March 2006 17:38, Jes Sorensen wrote:
> Hi,
>
> Patch 70674f95c0a2ea694d5c39f4e514f538a09be36f
> [PATCH] Optimize select/poll by putting small data sets on the stack
> resulted in the poll stack being 4-byte aligned on 64-bit
> architectures, causing misaligned accesses to elements in the array.
>
> This patch fixes it by declaring the stack in terms of 'long' instead
> of 'char'.
You should do that for poll too then.
-Andi
>>>>> "Andi" == Andi Kleen <[email protected]> writes:
Andi> On Friday 31 March 2006 17:38, Jes Sorensen wrote:
>> Hi,
>>
>> Patch 70674f95c0a2ea694d5c39f4e514f538a09be36f [PATCH] Optimize
>> select/poll by putting small data sets on the stack resulted in the
>> poll stack being 4-byte aligned on 64-bit architectures, causing
>> misaligned accesses to elements in the array.
>>
>> This patch fixes it by declaring the stack in terms of 'long'
>> instead of 'char'.
Andi> You should do that for poll too then.
I assume you mean select().
Updated patch attached.
Jes
Force alignment of poll and select stacks to long to avoid unaligned
access on 64 bit architectures.
Signed-off-by: Jes Sorensen <[email protected]>
---
fs/select.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
Index: linux-2.6/fs/select.c
===================================================================
--- linux-2.6.orig/fs/select.c
+++ linux-2.6/fs/select.c
@@ -314,7 +314,7 @@
int ret, size, max_fdset;
struct fdtable *fdt;
/* Allocate small arguments on the stack to save memory and be faster */
- char stack_fds[SELECT_STACK_ALLOC];
+ long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
ret = -EINVAL;
if (n < 0)
@@ -639,8 +639,10 @@
struct poll_list *walk;
struct fdtable *fdt;
int max_fdset;
- /* Allocate small arguments on the stack to save memory and be faster */
- char stack_pps[POLL_STACK_ALLOC];
+ /* Allocate small arguments on the stack to save memory and be
+ faster - use long to make sure the buffer is aligned properly
+ on 64 bit archs to avoid unaligned access */
+ long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
struct poll_list *stack_pp = NULL;
/* Do a sanity check on nfds ... */
Jes Sorensen <[email protected]> writes:
> struct poll_list *walk;
> struct fdtable *fdt;
> int max_fdset;
> - /* Allocate small arguments on the stack to save memory and be faster */
> - char stack_pps[POLL_STACK_ALLOC];
> + /* Allocate small arguments on the stack to save memory and be
> + faster - use long to make sure the buffer is aligned properly
> + on 64 bit archs to avoid unaligned access */
> + long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
> struct poll_list *stack_pp = NULL;
struct poll_list stack_pps[POLL_STACK_ALLOC / sizeof(struct poll_list)];
is more readable, and probably gcc align it rightly?
--
OGAWA Hirofumi <[email protected]>
On Friday 31 March 2006 18:37, OGAWA Hirofumi wrote:
> Jes Sorensen <[email protected]> writes:
>
> > struct poll_list *walk;
> > struct fdtable *fdt;
> > int max_fdset;
> > - /* Allocate small arguments on the stack to save memory and be faster */
> > - char stack_pps[POLL_STACK_ALLOC];
> > + /* Allocate small arguments on the stack to save memory and be
> > + faster - use long to make sure the buffer is aligned properly
> > + on 64 bit archs to avoid unaligned access */
> > + long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
> > struct poll_list *stack_pp = NULL;
>
> struct poll_list stack_pps[POLL_STACK_ALLOC / sizeof(struct poll_list)];
>
> is more readable, and probably gcc align it rightly?
Yes, but it would be wrong
-Andi
Andi Kleen <[email protected]> writes:
> On Friday 31 March 2006 18:37, OGAWA Hirofumi wrote:
>> Jes Sorensen <[email protected]> writes:
>>
>> > struct poll_list *walk;
>> > struct fdtable *fdt;
>> > int max_fdset;
>> > - /* Allocate small arguments on the stack to save memory and be faster */
>> > - char stack_pps[POLL_STACK_ALLOC];
>> > + /* Allocate small arguments on the stack to save memory and be
>> > + faster - use long to make sure the buffer is aligned properly
>> > + on 64 bit archs to avoid unaligned access */
>> > + long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
>> > struct poll_list *stack_pp = NULL;
>>
>> struct poll_list stack_pps[POLL_STACK_ALLOC / sizeof(struct poll_list)];
>>
>> is more readable, and probably gcc align it rightly?
>
> Yes, but it would be wrong
OK. So how about this?
char stack_pps[POLL_STACK_ALLOC]
__attribute__((aligned (sizeof(struct poll_list))));
--
OGAWA Hirofumi <[email protected]>
On Friday 31 March 2006 19:16, OGAWA Hirofumi wrote:
>
> OK. So how about this?
>
> char stack_pps[POLL_STACK_ALLOC]
> __attribute__((aligned (sizeof(struct poll_list))));
This would require much more alignment than really necessary. Probably you meant
s/sizeof/alignof/. But Jes' patch is fine I think.
-Andi
OGAWA Hirofumi <[email protected]> writes:
> char stack_pps[POLL_STACK_ALLOC]
> __attribute__((aligned (sizeof(struct poll_list))));
Ugh, just wrong. Please ignore.
--
OGAWA Hirofumi <[email protected]>
Andi Kleen <[email protected]> writes:
>> char stack_pps[POLL_STACK_ALLOC]
>> __attribute__((aligned (sizeof(struct poll_list))));
>
> This would require much more alignment than really necessary. Probably you meant
> s/sizeof/alignof/. But Jes' patch is fine I think.
I read your this patch now, so, I agree. Sorry for noise.
--
OGAWA Hirofumi <[email protected]>
Jes Sorensen <[email protected]> wrote:
>
> [PATCH] Optimize select/poll by putting small data sets on the stack
> resulted in the poll stack being 4-byte aligned on 64-bit
> architectures, causing misaligned accesses to elements in the array.
How come you noticed this but I did not?
Jes Sorensen wrote:
> I assume you mean select().
>
> Updated patch attached.
This fixes a few problems introduced by this patch.
* Fixes two warnings caused by mixing "char *" and "long *" pointers
* If SELECT_STACK_ALLOC is not a multiple of sizeof(long) then stack_fds[]
would be less than SELECT_STACK_ALLOC bytes and could overflow later in
the function. Fixed by simply rearranging the test later to work on
sizeof(stack_fds)
Currently SELECT_STACK_ALLOC is 256 so this doesn't happen, but it's
nasty to have things like this hidden in the code. What if later
someone decides to change SELECT_STACK_ALLOC to 300?
* I also changed "size" to be unsigned since that makes more sense and
is less prone to overflow bugs. I'm also a little scared by the
"kmalloc(6 * size)" since that type of call is a classic multiply-overflow
security hole (hence libc's calloc() API). However "size" is constrained
by fdt->max_fdset so I think it isn't exploitable. The kernel doesn't
have an overflow-safe API for kmalloc'ing arrays, does it?
Patch is vs current git HEAD. Only compile/boot tested.
Signed-off-by: Mitchell Blank Jr <[email protected]>
diff --git a/fs/select.c b/fs/select.c
index 071660f..bd9c7db 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -311,7 +311,8 @@ static int core_sys_select(int n, fd_set
{
fd_set_bits fds;
char *bits;
- int ret, size, max_fdset;
+ int ret, max_fdset;
+ unsigned int size;
struct fdtable *fdt;
/* Allocate small arguments on the stack to save memory and be faster */
long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
@@ -335,8 +336,8 @@ static int core_sys_select(int n, fd_set
*/
ret = -ENOMEM;
size = FDS_BYTES(n);
- if (6*size < SELECT_STACK_ALLOC)
- bits = stack_fds;
+ if (size < sizeof(stack_fds) / 6)
+ bits = (char *) stack_fds;
else
bits = kmalloc(6 * size, GFP_KERNEL);
if (!bits)
@@ -373,7 +374,7 @@ static int core_sys_select(int n, fd_set
ret = -EFAULT;
out:
- if (bits != stack_fds)
+ if (bits != (char *) stack_fds)
kfree(bits);
out_nofds:
return ret;
On Saturday 01 April 2006 04:35, Mitchell Blank Jr wrote:
> * I also changed "size" to be unsigned since that makes more sense and
> is less prone to overflow bugs. I'm also a little scared by the
> "kmalloc(6 * size)" since that type of call is a classic multiply-overflow
> security hole (hence libc's calloc() API). However "size" is constrained
> by fdt->max_fdset so I think it isn't exploitable. The kernel doesn't
> have an overflow-safe API for kmalloc'ing arrays, does it?
kcalloc. But it's slow since it memsets.
-Andi
Here's a slightly updated version of my patch: it changes the
if (size < sizeof(stack_fds) / 6)
to:
if (size <= sizeof(stack_fds) / 6)
Otherwise this is exactly the same as the version I just posted. The old
code had this problem too but before it only mattered if SELECT_STACK_ALLOC
was a multiple of six.
Signed-off-by: Mitchell Blank Jr <[email protected]>
diff --git a/fs/select.c b/fs/select.c
index 071660f..c46d40c 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -311,7 +311,8 @@ static int core_sys_select(int n, fd_set
{
fd_set_bits fds;
char *bits;
- int ret, size, max_fdset;
+ int ret, max_fdset;
+ unsigned int size;
struct fdtable *fdt;
/* Allocate small arguments on the stack to save memory and be faster */
long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
@@ -335,8 +336,8 @@ static int core_sys_select(int n, fd_set
*/
ret = -ENOMEM;
size = FDS_BYTES(n);
- if (6*size < SELECT_STACK_ALLOC)
- bits = stack_fds;
+ if (size <= sizeof(stack_fds) / 6)
+ bits = (char *) stack_fds;
else
bits = kmalloc(6 * size, GFP_KERNEL);
if (!bits)
@@ -373,7 +374,7 @@ static int core_sys_select(int n, fd_set
ret = -EFAULT;
out:
- if (bits != stack_fds)
+ if (bits != (char *) stack_fds)
kfree(bits);
out_nofds:
return ret;