2000-11-13 23:03:39

by Yann Dirson

[permalink] [raw]
Subject: Inconsistencies in 3dNOW handling


Looking at what the CONFIG_X86_USE_3DNOW config option in 2.40.-test10
enables, I find a couple of strange things. This led me through a
small high-level audit of 3DNOW/MMX stuff.

Hopefully someone will be able to explain or to confirm whether the
points I highlight are indeed bugs. I'd value some comments before
starting to change this :}


- CONFIG_MK6 is described as "K6/K6-II/K6-III", and CONFIG_MK7 as
"Athlon/K7". Of these two, only the latter defines
CONFIG_X86_USE_3DNOW, although K6-II and K6-III do provide 3DNOW
instructions. We even find in strings-486.h a comment saying:

This CPU favours 3DNow strongly (eg AMD K6-II, K6-III, Athlon)

OTOH, string.h only says:

* This CPU favours 3DNow strongly (eg AMD Athlon)

page.h says:

* On older X86 processors its not a win to use MMX here it seems.
* Maybe the K6-III ?

Gasp. Would it or not in the end be useful to add a CONFIG_MK6II
option that would enable 3DNOW ?


- In all places where 3DNOW is tested (strings-486.h, page.h), only
MMX-specific funcs are used (_mmx_memcpy mostly, mmx_{clear,copy}_page)

page.h says:

* On older X86 processors its not a win to use MMX here it seems.
* Maybe the K6-III ?

mmx.[ch] say:

* MMX 3Dnow! helper operations

So do they use MMX or 3Dnow after all ? They are distinct processor
features, aren't they ?

If this option is really just meant to enable MMX stuff, let's just
call it by its name

Some doc about that should be written - I'll gladly do that once I've
gathered the info. Some Documentation/i386/MMX file maybe, or in
mmx.c. But this info won't be easily found anyway if we keep using
3DNOW as a name for the config option... Objections ?


- mmx.c says:

Checksums are not a win with MMX on any CPU
tested so far for any MMX solution figured

This would be better to have the list of tested CPUs here. Does
someone have this list ?


- there is even a CONFIG_X86_USE_3DNOW_AND_WORKS option, that would
enable MMX in __generic_copy-{to,from}_user
(arch/i386/lib/usercopy.c). There is no comment about why this code
was disabled.

This code uses the following test to trigger MMX use:

if(n<512)
__copy_user(to,from,n);
else
mmx_copy_user(to,from,n);

... whereas string{,-486}.h use the following test to trigger MMX use:

if(len<512 || in_interrupt())
return __constant_memcpy(to, from, len);
return _mmx_memcpy(to, from, len);

Could this be the cause of the problem ?


You'll also have noticed the inconsistent naming in 2 highly similar
pieces of code.


- BTW, what does this 512 stand for ? Especially as it's used in
several places, a #define would seem nice at first glance.


- In mmx.c, function naming and ordering really seems inconsistent.
Or is there a reason for a "zero/clear" duality ?


- drivers/md/xor.c says:

certain CPU features like MMX can only be detected runtime

I'm not sure how much this relates to the above, but I'd say a MMX
config option could be used for this ? Or a common detection routine
that other drivers could use ?

--
Yann Dirson <[email protected]> | Why make M$-Bill richer & richer ?
debian-email: <[email protected]> | Support Debian GNU/Linux:
| Cheaper, more Powerful, more Stable !
http://ydirson.free.fr/ | Check <http://www.debian.org/>


2000-11-14 08:33:21

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Inconsistencies in 3dNOW handling

>
> - CONFIG_MK6 is described as "K6/K6-II/K6-III", and CONFIG_MK7 as
> "Athlon/K7". Of these two, only the latter defines
> CONFIG_X86_USE_3DNOW, although K6-II and K6-III do provide 3DNOW
> instructions.

The Athlon has an extended version of 3DNOW, which the kernel uses as of
test11-pre2. The entire 3DNOW option has nothing to do with userspace;
unlike the Screaming Sindy Extensions, 3DNOW requires no (extra) kernel
support.

The 3DNOW option is about the kernel using extended instructions for
internal functions such as zero_page() and copy_page().
This is no advantage on K6 processors, but on Athlon processors (well, most
of them anyway) it is a gain of more than a factor 2 for these functions.
The test11pre2 code will also not run on a K6-II/III, but this valid due to the
fact that this new code is > 2x faster than the old code, and the old code
was no win on the K6-II/III anyway.


> * On older X86 processors its not a win to use MMX here it seems.
> * Maybe the K6-III ?
>
> Gasp. Would it or not in the end be useful to add a CONFIG_MK6II
> option that would enable 3DNOW ?

Won't work. (see previous comment).



> - In all places where 3DNOW is tested (strings-486.h, page.h), only
> MMX-specific funcs are used (_mmx_memcpy mostly, mmx_{clear,copy}_page)
> page.h says:

> So do they use MMX or 3Dnow after all ? They are distinct processor
> features, aren't they ?

Some of the "MMX" instructions are part of "3Dnow" according to AMD
publications. This is especially true for the "prefetch" instructions which
have a different memnonic/opcode on Intel CPU's.


> - BTW, what does this 512 stand for ? Especially as it's used in
> several places, a #define would seem nice at first glance.

The 512 is a rough estimate of the minimum size of the copy that makes it
worth saving and restoring the extra processor-state for using mmx.


> - drivers/md/xor.c says:
>
> certain CPU features like MMX can only be detected runtime
>
> I'm not sure how much this relates to the above, but I'd say a MMX
> config option could be used for this ? Or a common detection routine
> that other drivers could use ?

The way it is done now, even a generic i386 compiled kernel (think
distributions) will use the fast MMX raid code. For the copy_page and
friends this extra test is probably too much overhead. (the < 512 test is
inlined and often constant...)

Greetings,
Arjan van de Ven

2000-11-14 13:52:17

by Petko Manolov

[permalink] [raw]
Subject: Re: Inconsistencies in 3dNOW handling

You already have good answer from Arjan van de Ven.
I'm about to submit a patch for string-486.h where
3Dnow! support will be removed. It is not needed as
routines in this file expect 486 or older 586.

Anyway, i'm in doubt if this file will be ever used.


Petkan




Yann Dirson wrote:
>
> Looking at what the CONFIG_X86_USE_3DNOW config option in 2.40.-test10
> enables, I find a couple of strange things. This led me through a
> small high-level audit of 3DNOW/MMX stuff.
>
> Hopefully someone will be able to explain or to confirm whether the
> points I highlight are indeed bugs. I'd value some comments before
> starting to change this :}
>
> - CONFIG_MK6 is described as "K6/K6-II/K6-III", and CONFIG_MK7 as
> "Athlon/K7". Of these two, only the latter defines
> CONFIG_X86_USE_3DNOW, although K6-II and K6-III do provide 3DNOW
> instructions. We even find in strings-486.h a comment saying:
>
> This CPU favours 3DNow strongly (eg AMD K6-II, K6-III, Athlon)
>
> OTOH, string.h only says:
>
> * This CPU favours 3DNow strongly (eg AMD Athlon)
>
> page.h says:
>
> * On older X86 processors its not a win to use MMX here it seems.
> * Maybe the K6-III ?
>
> Gasp. Would it or not in the end be useful to add a CONFIG_MK6II
> option that would enable 3DNOW ?
>
> - In all places where 3DNOW is tested (strings-486.h, page.h), only
> MMX-specific funcs are used (_mmx_memcpy mostly, mmx_{clear,copy}_page)
>
> page.h says:
>
> * On older X86 processors its not a win to use MMX here it seems.
> * Maybe the K6-III ?
>
> mmx.[ch] say:
>
> * MMX 3Dnow! helper operations
>
> So do they use MMX or 3Dnow after all ? They are distinct processor
> features, aren't they ?
>
> If this option is really just meant to enable MMX stuff, let's just
> call it by its name
>
> Some doc about that should be written - I'll gladly do that once I've
> gathered the info. Some Documentation/i386/MMX file maybe, or in
> mmx.c. But this info won't be easily found anyway if we keep using
> 3DNOW as a name for the config option... Objections ?
>
> - mmx.c says:
>
> Checksums are not a win with MMX on any CPU
> tested so far for any MMX solution figured
>
> This would be better to have the list of tested CPUs here. Does
> someone have this list ?
>
> - there is even a CONFIG_X86_USE_3DNOW_AND_WORKS option, that would
> enable MMX in __generic_copy-{to,from}_user
> (arch/i386/lib/usercopy.c). There is no comment about why this code
> was disabled.
>
> This code uses the following test to trigger MMX use:
>
> if(n<512)
> __copy_user(to,from,n);
> else
> mmx_copy_user(to,from,n);
>
> ... whereas string{,-486}.h use the following test to trigger MMX use:
>
> if(len<512 || in_interrupt())
> return __constant_memcpy(to, from, len);
> return _mmx_memcpy(to, from, len);
>
> Could this be the cause of the problem ?
>
> You'll also have noticed the inconsistent naming in 2 highly similar
> pieces of code.
>
> - BTW, what does this 512 stand for ? Especially as it's used in
> several places, a #define would seem nice at first glance.
>
> - In mmx.c, function naming and ordering really seems inconsistent.
> Or is there a reason for a "zero/clear" duality ?
>
> - drivers/md/xor.c says:
>
> certain CPU features like MMX can only be detected runtime
>
> I'm not sure how much this relates to the above, but I'd say a MMX
> config option could be used for this ? Or a common detection routine
> that other drivers could use ?
>
> --
> Yann Dirson <[email protected]> | Why make M$-Bill richer & richer ?
> debian-email: <[email protected]> | Support Debian GNU/Linux:
> | Cheaper, more Powerful, more Stable !
> http://ydirson.free.fr/ | Check <http://www.debian.org/>

2000-11-14 19:45:23

by Yann Dirson

[permalink] [raw]
Subject: Re: Inconsistencies in 3dNOW handling

On Tue, Nov 14, 2000 at 08:57:29AM +0100, Arjan van de Ven wrote:
> The test11pre2 code will also not run on a K6-II/III

I'll look at this.


> Some of the "MMX" instructions are part of "3Dnow" according to AMD
> publications. This is especially true for the "prefetch" instructions which
> have a different memnonic/opcode on Intel CPU's.

If we don't use them on pure MMX-enabled machine, but only on 3Dnow
ones, what about renaming those *mmx* {files,funcs} to *3dnow* ?


> > > - BTW, what does this 512 stand for ? Especially as it's used in
> > several places, a #define would seem nice at first glance.
>
> The 512 is a rough estimate of the minimum size of the copy that makes it
> worth saving and restoring the extra processor-state for using mmx.

What about "#define MMX_MIN_ACCELERATED_COPYSIZE 512" in mmx.h ?

Er... s/MMX/3DNOW/ :)



Hm, noone commented my note about usercopy.c's 3dnow code being
possibly fixed by cut-and paste from elsewhere ?


Regards,
--
Yann Dirson <[email protected]> | Why make M$-Bill richer & richer ?
debian-email: <[email protected]> | Support Debian GNU/Linux:
| Cheaper, more Powerful, more Stable !
http://ydirson.free.fr/ | Check <http://www.debian.org/>