Hello Andi,
The following emails contain the patches to convert x86-64 to store current
in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
provides a significant amount of code savings (~43KB) over the current
use of the per cpu data area. I also tested using r15, but that generated
code that was larger than that generated with r10. This code seems to be
working well for me now (it stands up to 32 and 64 bit processes and ptrace
users) and would be a good candidate for further exposure.
-ben
arch/i386/oprofile/nmi_int.c | 1
arch/x86_64/Makefile | 1
arch/x86_64/crypto/aes-x86_64-asm.S | 27 +++++++++++----------
arch/x86_64/ia32/ia32entry.S | 17 +++++++++----
arch/x86_64/kernel/asm-offsets.c | 2 -
arch/x86_64/kernel/entry.S | 44 +++++++++++++++--------------------
arch/x86_64/kernel/genapic_cluster.c | 1
arch/x86_64/kernel/genapic_flat.c | 1
arch/x86_64/kernel/i387.c | 2 -
arch/x86_64/kernel/process.c | 8 ++++--
arch/x86_64/kernel/setup64.c | 16 +++++++-----
arch/x86_64/kernel/smpboot.c | 6 +++-
arch/x86_64/lib/copy_user.S | 16 ++++++------
arch/x86_64/lib/csum-copy.S | 24 ++++++++++---------
arch/x86_64/lib/getuser.S | 12 +++------
arch/x86_64/lib/putuser.S | 12 +++------
include/asm-x86_64/current.h | 8 ------
include/asm-x86_64/desc.h | 1
include/asm-x86_64/i387.h | 8 +++---
include/asm-x86_64/processor.h | 10 ++-----
include/asm-x86_64/system.h | 6 +---
include/asm-x86_64/thread_info.h | 31 +++++++++++-------------
include/linux/seccomp.h | 15 ++++-------
include/linux/smp.h | 25 ++++++++++---------
24 files changed, 145 insertions(+), 149 deletions(-)
--
"You know, I've seen some crystals do some pretty trippy shit, man."
Don't Email: <[email protected]>.
Benjamin LaHaise wrote:
> The following emails contain the patches to convert x86-64 to store current
> in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/).
[snip]
> No benchmarks that I am aware of show regressions with this change.
Ben,
Your patch breaks all out-of-tree amd64 assembler code used in kernel. r10
register is one of those registers that does not need to be preserved across
function calls, and reserving that register for other purpose means that all
assembler code using r10 in kernel must be rewritten. This is deeply
unfunny.
Andi,
Please don't apply Ben's patch. It is already bad enough having to deal with
two incompatible calling conventions on 32 bit x86.
--
Jari Ruusu 1024R/3A220F51 5B 4B F9 BB D3 3F 52 E9 DB 1D EB E3 24 0E A9 DD
On Wed, 2005-11-30 at 08:39 +0200, Jari Ruusu wrote:
> Benjamin LaHaise wrote:
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/).
> [snip]
> > No benchmarks that I am aware of show regressions with this change.
>
> Ben,
> Your patch breaks all out-of-tree amd64 assembler code used in kernel.
so what?
Arjan van de Ven wrote:
> On Wed, 2005-11-30 at 08:39 +0200, Jari Ruusu wrote:
>
>>Benjamin LaHaise wrote:
>>
>>>The following emails contain the patches to convert x86-64 to store current
>>>in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/).
>>
>>[snip]
>>
>>>No benchmarks that I am aware of show regressions with this change.
>>
>>Ben,
>>Your patch breaks all out-of-tree amd64 assembler code used in kernel.
>
>
> so what?
>
Sounds like a trick question - I don't think the kernel does use any
out-of-tree amd64 assembler code, does it? ;)
> Your patch breaks all out-of-tree amd64 assembler code used in kernel. r10
> register is one of those registers that does not need to be preserved across
> function calls, and reserving that register for other purpose means that all
> assembler code using r10 in kernel must be rewritten. This is deeply
> unfunny.
Well, the changes should be minor.
>
> Please don't apply Ben's patch. It is already bad enough having to deal with
> two incompatible calling conventions on 32 bit x86.
43KB .text savings are hard to argue against. There is no guarantee
for a stable kernel ABI. If you maintain out of tree code you
will need to live with the occasional changes.
-Andi
On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> Hello Andi,
>
> The following emails contain the patches to convert x86-64 to store current
> in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> provides a significant amount of code savings (~43KB) over the current
> use of the per cpu data area. I also tested using r15, but that generated
> code that was larger than that generated with r10. This code seems to be
> working well for me now (it stands up to 32 and 64 bit processes and ptrace
> users) and would be a good candidate for further exposure.
Looks good thanks. It will need longer testing though.
-Andi
On Wed, 2005-11-30 at 14:02 +0100, Andi Kleen wrote:
> On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> > Hello Andi,
> >
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> > provides a significant amount of code savings (~43KB) over the current
> > use of the per cpu data area. I also tested using r15, but that generated
> > code that was larger than that generated with r10. This code seems to be
> > working well for me now (it stands up to 32 and 64 bit processes and ptrace
> > users) and would be a good candidate for further exposure.
>
> Looks good thanks. It will need longer testing though.
is it -mm ready?
P.S.: The correct mailing list for x86-64 patches is [email protected]
Please at least cc that list always.
-Andi
On Wed, Nov 30, 2005 at 02:32:51PM +0100, Arjan van de Ven wrote:
> > Looks good thanks. It will need longer testing though.
>
> is it -mm ready?
I'd like to hear back on if suspend/resume works with it, as that is one
of the areas I couldn't test. The patch set is completely incremental,
so we could merge the bits of it in a couple of steps. The mb() in
smpboot.c is an important bug fix and should be considered for immediate
inclusion.
-ben
On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> Date: Tue, 29 Nov 2005 23:21:18 -0500
> From: Benjamin LaHaise <[email protected]>
> To: Andi Kleen <[email protected]>
> Cc: [email protected]
> Subject: [PATCH 0/9] x86-64 put current in r10
>
> Hello Andi,
>
> The following emails contain the patches to convert x86-64 to store current
> in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> provides a significant amount of code savings (~43KB) over the current
> use of the per cpu data area. I also tested using r15, but that generated
> code that was larger than that generated with r10. This code seems to be
> working well for me now (it stands up to 32 and 64 bit processes and ptrace
> users) and would be a good candidate for further exposure.
I would rather prefer NOT to introduce this at this time.
My primary concern is that during "even numbered series" there
should not be radical internal ABI/API changes, like this one.
In 2.7 it can be introduced, by all means.
Indeed at the moment my thinking is, that X86-64 is way more UNSTABLE,
than it should be. (And Linux kernel overall, but that is another story.)
> -ben
/Matti Aarnio
On Wed, Nov 30, 2005 at 05:18:47PM +0200, Matti Aarnio wrote:
> I would rather prefer NOT to introduce this at this time.
> My primary concern is that during "even numbered series" there
> should not be radical internal ABI/API changes, like this one.
Any modules built by the official Makefile method will automatically
pick up the necessary changes to the compiler flags, and the ABI
presented to userspace is unchanged.
Also, part of the patch series is needed in order to introduce colouring
of the kernel stack (namely divorcing the relationship of thread_info
with the stack pointer).
> In 2.7 it can be introduced, by all means.
As far as I am aware, there is no plan for a 2.7 at this time.
> Indeed at the moment my thinking is, that X86-64 is way more UNSTABLE,
> than it should be. (And Linux kernel overall, but that is another story.)
At least I found one problem that was impacting the boot stability of my
test box, so it can't all be bad. =-)
-ben
On Wed, Nov 30, 2005 at 05:18:47PM +0200, Matti Aarnio wrote:
> On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> > Date: Tue, 29 Nov 2005 23:21:18 -0500
> > From: Benjamin LaHaise <[email protected]>
> > To: Andi Kleen <[email protected]>
> > Cc: [email protected]
> > Subject: [PATCH 0/9] x86-64 put current in r10
> >
> > Hello Andi,
> >
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> > provides a significant amount of code savings (~43KB) over the current
> > use of the per cpu data area. I also tested using r15, but that generated
> > code that was larger than that generated with r10. This code seems to be
> > working well for me now (it stands up to 32 and 64 bit processes and ptrace
> > users) and would be a good candidate for further exposure.
>
> I would rather prefer NOT to introduce this at this time.
> My primary concern is that during "even numbered series" there
> should not be radical internal ABI/API changes, like this one.
Hmm? I am not aware of such a constraint.
It's not very invasive anyway, in that it would not require changing
a lot of code.
> In 2.7 it can be introduced, by all means.
>
> Indeed at the moment my thinking is, that X86-64 is way more UNSTABLE,
> than it should be. (And Linux kernel overall, but that is another story.)
The x86-64 kernel itself is actually quite stable; most of the reported
problems (including yours) come from various hardware or firmware
issues, mostly in new platforms. If you use a trusty old chipset
(e.g. AMD 8111 or Intel E7505) and a proven motherboard you're usually OK.
-Andi
On Wed, 2005-11-30 at 17:18 +0200, Matti Aarnio wrote:
> On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> > Date: Tue, 29 Nov 2005 23:21:18 -0500
> > From: Benjamin LaHaise <[email protected]>
> > To: Andi Kleen <[email protected]>
> > Cc: [email protected]
> > Subject: [PATCH 0/9] x86-64 put current in r10
> >
> > Hello Andi,
> >
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> > provides a significant amount of code savings (~43KB) over the current
> > use of the per cpu data area. I also tested using r15, but that generated
> > code that was larger than that generated with r10. This code seems to be
> > working well for me now (it stands up to 32 and 64 bit processes and ptrace
> > users) and would be a good candidate for further exposure.
>
> I would rather prefer NOT to introduce this at this time.
> My primary concern is that during "even numbered series" there
> should not be radical internal ABI/API changes, like this one.
this isn't a radical API change actually, and.. well there is no kernel
ABI. If you care about ABI.. that breaks all the time anyway. Not just a
little bit, but highly radical.
> In 2.7 it can be introduced, by all means.
There is no 2.7, and the current development model also is that there
won't be one until the development model changes.
> Indeed at the moment my thinking is, that X86-64 is way more UNSTABLE,
> than it should be. (And Linux kernel overall, but that is another story.)
the funny thing is that this is a very localized change compared to many
of the other things going on, and unlikely to cause any major issues
with the kernel code. And the changes have clear gains in size and
probably also in speed (segment accesses are not cheap)
So personally I don't think your objections make sense in today's
reality.
Greetings,
Arjan van de Ven
On 11/30/05, Matti Aarnio <[email protected]> wrote:
> On Tue, Nov 29, 2005 at 11:21:18PM -0500, Benjamin LaHaise wrote:
> > Date: Tue, 29 Nov 2005 23:21:18 -0500
> > From: Benjamin LaHaise <[email protected]>
> > To: Andi Kleen <[email protected]>
> > Cc: [email protected]
> > Subject: [PATCH 0/9] x86-64 put current in r10
> >
> > Hello Andi,
> >
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/). This
> > provides a significant amount of code savings (~43KB) over the current
> > use of the per cpu data area. I also tested using r15, but that generated
> > code that was larger than that generated with r10. This code seems to be
> > working well for me now (it stands up to 32 and 64 bit processes and ptrace
> > users) and would be a good candidate for further exposure.
>
> I would rather prefer NOT to introduce this at this time.
> My primary concern is that during "even numbered series" there
> should not be radical internal ABI/API changes, like this one.
>
http://sosdg.org/~coywolf/lxr/source/Documentation/stable_api_nonsense.txt
> In 2.7 it can be introduced, by all means.
>
As many others have pointed out, there's not likely to be a 2.7 series
anytime in the foreseeable future. 2.6.x is where development happens.
Which in large part is why we have the -mm tree to test new stuff
before it hits mainline.
Check the sections on the various trees in
http://sosdg.org/~coywolf/lxr/source/Documentation/applying-patches.txt
--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
On Wed, 30 Nov 2005, Jari Ruusu wrote:
> Benjamin LaHaise wrote:
> > The following emails contain the patches to convert x86-64 to store current
> > in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/).
> [snip]
> > No benchmarks that I am aware of show regressions with this change.
>
> Ben,
> Your patch breaks all out-of-tree amd64 assembler code used in kernel. r10
> register is one of those registers that does not need to be preserved across
> function calls, and reserving that register for other purpose means that all
> assembler code using r10 in kernel must be rewritten. This is deeply
> unfunny.
>
> Andi,
> Please don't apply Ben's patch. It is already bad enough having to deal with
> two incompatible calling conventions on 32 bit x86.
Just for the sake of understanding the current kernel release
process, when would something like this be acceptable/possible?
Would it require a Linux 3.0 version, or at least a 2.8?
--
~Randy
On Nov 30, 2005, at 11:22:48, Randy.Dunlap wrote:
> On Wed, 30 Nov 2005, Jari Ruusu wrote:
>
>> Benjamin LaHaise wrote:
>>> The following emails contain the patches to convert x86-64 to
>>> store current
>>> in r10 (also at http://www.kvack.org/~bcrl/patches/v2.6.15-rc3/).
>> [snip]
>>> No benchmarks that I am aware of show regressions with this change.
>>
>> Ben,
>> Your patch breaks all out-of-tree amd64 assembler code used in
>> kernel. r10 register is one of those registers that does not need
>> to be preserved across function calls, and reserving that register
>> for other purpose means that all assembler code using r10 in
>> kernel must be rewritten. This is deeply unfunny.
>>
>> Andi,
>> Please don't apply Ben's patch. It is already bad enough having to
>> deal with two incompatible calling conventions on 32 bit x86.
>
> Just for the sake of understanding the current kernel release
> process, when would something like this be acceptable/possible?
> Would it require a Linux 3.0 version, or at least a 2.8?
It's perfectly acceptable in 2.6, assuming it's properly divided up
into small discrete changes and spends a bit of time in -mm first to
work out the bugs. If people want to maintain out-of-tree drivers,
especially those using assembly, when things break they get to keep
both pieces. This patch produces a rather large space savings and
speeds things up to boot, and I would support it being pushed to
Linus during the 2.6.16 merge window assuming it stands up to abuse
in -mm for a bit.
Cheers,
Kyle Moffett
Nick Piggin wrote:
> Sounds like a trick question - I don't think the kernel does use any
> out-of-tree amd64 assembler code, does it? ;)
Out-of-tree amd64 assembler code is being run in kernel space. For example:
http://loop-aes.sourceforge.net/
Calling convention change that breaks existing assembler code that has been
field proven and is believed to be entirely free of bugs for long time, does
NOT belong in a STABLE kernel series.
OTOH, if your business model requires breaking stuff and then milking your
customers for "fixing" the breakage, then this type of change is
understandable. </sarcasm>
--
Jari Ruusu 1024R/3A220F51 5B 4B F9 BB D3 3F 52 E9 DB 1D EB E3 24 0E A9 DD
On Wed, Nov 30, 2005 at 07:29:39PM +0200, Jari Ruusu wrote:
> Nick Piggin wrote:
> > Sounds like a trick question - I don't think the kernel does use any
> > out-of-tree amd64 assembler code, does it? ;)
>
> Out-of-tree amd64 assembler code is being run in kernel space. For example:
> http://loop-aes.sourceforge.net/
>
> Calling convention change that breaks existing assembler code that has been
> field proven and is believed to be entirely free of bugs for long time, does
> NOT belong in a STABLE kernel series.
I don't think you understand the policies of linux kernel development
very well.
If you want your code to be maintained, it's best to submit it to mainline.
Otherwise you're on your own.
Anyways - as long as your assembly code doesn't call any other kernel
services it should be enough to just save/restore R10 at the beginning/end.
Interrupts reload it automatically.
-Andi
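
The save/restore suggestion above can be sketched in a few lines. This is an
illustrative userspace example only, not code from the patch set or from
loop-AES. The helper name add_using_r10 is hypothetical, the fixed-register
operand constraints are a choice made here so that the compiler never places
an operand in r10, and the asm path assumes GCC or Clang on x86-64 (a plain C
fallback is used elsewhere). The routine uses r10 as scratch and puts the
original value back before it returns.

```c
#include <stdint.h>

/*
 * Hypothetical helper: computes a + b using r10 as a scratch register,
 * then restores whatever value r10 held on entry.
 */
static uint64_t add_using_r10(uint64_t a, uint64_t b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t saved, result;

    /*
     * Operands are pinned to rcx, rax, rdi and rsi so none of them can
     * alias r10. A register copy is used instead of push/pop because
     * userspace code may keep live data in the red zone below the stack
     * pointer.
     */
    __asm__ volatile(
        "movq %%r10, %[saved]\n\t"   /* save the incoming r10 */
        "movq %[a], %%r10\n\t"       /* r10 is now free as scratch */
        "addq %[b], %%r10\n\t"
        "movq %%r10, %[result]\n\t"
        "movq %[saved], %%r10\n\t"   /* restore r10 before leaving */
        : [saved] "=&c"(saved), [result] "=&a"(result)
        : [a] "D"(a), [b] "S"(b)
        : "cc");
    return result;
#else
    /* Assumption: on other targets there is no r10 to preserve. */
    return a + b;
#endif
}
```

In a routine written directly in assembler, the equivalent would presumably be
saving r10 on entry and restoring it before ret. As noted above, that only
holds as long as the routine calls no other kernel services while r10 holds
something other than current.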
On Wed, Nov 30, 2005 at 07:29:39PM +0200, Jari Ruusu wrote:
> Calling convention change that breaks existing assembler code that has been
> field proven and is believed to be entirely free of bugs for long time, does
> NOT belong in a STABLE kernel series.
>
> OTOH, if your business model requires breaking stuff and then milking your
> customers for "fixing" the breakage, then this type of change is
> understandable. </sarcasm>
Please stop spreading bullshit. Calling conventions can change all the
time. The kernel only exports a C API and not ABI at all to modules.
And even the API is rather volatile and can change with every release.
Just because you're too much of a dickhead to work with others on the
inkernel crypto implementation we don't have to care about your unsupported
out of tree code.
Randy.Dunlap wrote:
>Just for the sake of understanding the current kernel release
>process, when would something like this be acceptable/possible?
>Would it require a Linux 3.0 version, or at least a 2.8?
>
>
No. It has been stated many times that there is no guarantee about
binary compatibility. So this sort of change (breaking out-of-tree
assembly or other code) can happen at any time, even in a stable series,
even if the reason for the change isn't very strong. You will even find
those who want to break binary compatibility occasionally on purpose, just
to get people firmly off the idea that they can depend on such things.
The reason for the last attitude is the preference for open source.
Vendors may like binary drivers, but they have a history of bad
maintainership, especially when the product no longer sells. Open source
is then useful in that any interested programmer can fix things. That's
almost impossible with binary stuff. Sort of "if they want to be
difficult to us, then we'll be difficult to them."
Getting out-of-tree code into the kernel tree is one way of avoiding
trouble, because then the people making changes will try hard not
to break anything. This is obviously not an option for non-GPL code;
search the mail archives for how many times kernel changes
broke the binary modules of vmware, nvidia and others.
Policy is that those who keep their code to themselves get
to play catchup - a lot. Their trouble is a non-issue.
Exceptions have sometimes been made in order to not break the kernel
for large numbers of people. Apparently, the number of people using
nvidia/vmware/out-of-tree assembly isn't considered large enough, or
at least the changes have been more important than their troubles.
Helge Hafting
Ben,
Do you have these patches maintained on a website somewhere too? I'd be
willing to test these patches out, without testing all of -mm (my x86_64
box needs to be pretty stable). I'm confident I can use your
patches without breaking too much ;-)
I'll apply the ones you posted to LKML, but I would like to also know
where any updates are.
Thanks,
-- Steve