2009-03-14 16:07:16

by David Newall

[permalink] [raw]
Subject: Screwy arch/x86/include/asm/string_32.h

There's a lot of weirdness in __constant_memcpy in
arch/x86/include/asm/string_32.h (2.6.29-rc7).

1. The first switch is missing a case for n equal 7.
2. When n < 20, the four tests (n > 16, n > 12, n > 8 & n > 4)
execute identical statements; always 'asm volatile("movsl" :
"=&D"(edi), "=&S"(esi) : "0"(edi), "1"(esi) : "memory")'.


I think it copies less than 4 bytes for values of n between 9 and 19.


2009-03-14 22:53:18

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: Screwy arch/x86/include/asm/string_32.h

David Newall wrote:
> There's a lot of weirdness in __constant_memcpy in
> arch/x86/include/asm/string_32.h (2.6.29-rc7).
>
> 1. The first switch is missing a case for n equal 7.
>
That's because there's no way to copy 7 bytes in two operations of 1/2/4
bytes; it would need to be 3 (4+2+1), which I guess the author deemed
overly complex.

> 2. When n < 20, the four tests (n > 16, n > 12, n > 8 & n > 4)
> execute identical statements; always 'asm volatile("movsl" :
> "=&D"(edi), "=&S"(esi) : "0"(edi), "1"(esi) : "memory")'.
>

movs is a string instruction which has the side-effect of incrementing
esi and edi, so each if() falls through to copy the next 4 bytes.

However, the tail switch should probably use plain "mov" rather than
"movs" because nobody uses esi/edi afterwards.

J