2002-09-19 22:52:13

by Mala Anand

Andrew Morton wrote ...

>It seems that movsl works acceptably with all alignments on AMD
>hardware, although this needs to be checked with more recent machines.

>movsl is a (bad) loss on PII and PIII for all alignments except 8&8.
>Don't know about P4 - I can test that in a day or two.

>I expect that a minimal, 90% solution would be just:

>fancy_copy_to_user(dst, src, count)
>	if (arch_has_sane_movsl || ((dst|src) & 7) == 0)
>		movsl_copy_to_user(dst, src, count);
>	else
>		movl_copy_to_user(dst, src, count);


>#define fancy_copy_to_user copy_to_user

>and we really only need fancy_copy_to_user in a handful of
>places - the bulk copies in networking and filemap.c. For all
>the other call sites it's probably more important to keep the
>code footprint down than it is to squeeze the last few drops out
>of the copy speed.

>Mala Anand has done some work on this. See

><searches> Yes, I have a copy of Mala's patch here which works
>against 2.5.current. Mala's patch will cause quite an expansion
>of kernel size; we would need an implementation which did not
>use inlining. This work was discussed at OLS2002. See

I will move the code from uaccess.h (inline) to usercopy.c (an
out-of-line routine) and will post the patch soon. It is on my list of
things to do.
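
For reference, the alignment dispatch Andrew outlines above could look
roughly like the following as a plain, non-inlined C routine. This is
only a userspace sketch, not the patch itself: movsl_copy() and
movl_copy() are hypothetical stand-ins that both call memcpy(), the
last_path variable exists only to show which branch was taken, and
arch_has_sane_movsl is assumed to be a flag set once at boot from the
CPU type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed flag: nonzero on CPUs where "rep movsl" is fast at any alignment. */
static int arch_has_sane_movsl = 0;

/* Illustration only: records which copy path the last call took. */
enum { PATH_NONE = 0, PATH_MOVSL = 1, PATH_MOVL = 2 };
static int last_path = PATH_NONE;

/* Hypothetical stand-in for the string-instruction ("rep movsl") copy. */
static void movsl_copy(void *dst, const void *src, size_t count)
{
    last_path = PATH_MOVSL;
    memcpy(dst, src, count);
}

/* Hypothetical stand-in for the unrolled movl load/store copy. */
static void movl_copy(void *dst, const void *src, size_t count)
{
    last_path = PATH_MOVL;
    memcpy(dst, src, count);
}

/*
 * Out-of-line dispatcher: use the movsl path when the CPU handles it well
 * at any alignment, or when both pointers are 8-byte aligned; otherwise
 * fall back to the movl path.
 */
void fancy_copy(void *dst, const void *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (arch_has_sane_movsl || ((d | s) & 7) == 0)
        movsl_copy(dst, src, count);
    else
        movl_copy(dst, src, count);
}
```

Because the check ORs the two addresses before masking, a single test
covers both pointers: any low bit set in either one selects the movl
path. Keeping the dispatcher as one ordinary function means each call
site costs only a call instruction rather than a full inlined copy.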


Mala Anand
IBM Linux Technology Center - Kernel Performance
E-mail:[email protected]
Phone:838-8088; Tie-line:678-8088