2007-12-20 01:08:34

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 0/5] x86: another attempt at x86 pagetable unification

Hi Ingo,

Here's another round of the pagetable unification patches. I've done a
few dozen rounds of randconfig builds on both 32- and 64-bit, so I hope
that will prevent compile problems in your test environment.

I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both paravirt
and non-paravirt).

Thanks,
J
--


2007-12-20 09:14:22

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Jeremy Fitzhardinge <[email protected]> wrote:

> Hi Ingo,
>
> Here's another round of the pagetable unification patches. I've done
> a few dozen rounds of randconfig builds on both 32- and 64-bit, so I
> hope that will prevent compile problems in your test environment.
>
> I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both
> paravirt and non-paravirt).

thanks, applied.

Ingo

2007-12-20 09:50:25

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Jeremy Fitzhardinge <[email protected]> wrote:

> Here's another round of the pagetable unification patches. I've done
> a few dozen rounds of randconfig builds on both 32- and 64-bit, so I
> hope that will prevent compile problems in your test environment.
>
> I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both
> paravirt and non-paravirt).

i've done a dozen random tests too and it's looking good so far. Nice
work!

pgtable_32.h and pgtable_64.h still look a tiny bit messy from the
include file dependencies POV. For example pgtable_32.h:

#include <asm/processor.h>
#include <asm/fixmap.h>
#include <linux/threads.h>
#include <asm/paravirt.h>

#include <linux/bitops.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/spinlock.h>

that asm/paravirt.h include is already present in pgtable.h, in a
somewhat quirky way:

#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
#else /* !CONFIG_PARAVIRT */

also, most of the:

scripts/checkpatch.pl --file include/asm-x86/pgtable*.h

complaints are real ones and should be fixed.

would you be interested in cleaning up that stuff once and forever? It
would be a fine approach if you just tried to quickly push for a "high
quality" end result in a series of patches and sent that series to me,
without having tested it fully through - i can figure out whatever build
breakages and dependencies there still are. So there would be no
expectation of getting such a cleanup series right in the first (or
second, or third) attempt, this is spaghetti code that has been
accumulated up for years. The important thing would be to be careful to
not introduce runtime breakages accidentally - build breakages due to
some include file dependency we can sort out just fine. Hm?

Ingo

2007-12-20 11:23:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Ingo Molnar <[email protected]> wrote:

> > Here's another round of the pagetable unification patches. I've
> > done a few dozen rounds of randconfig builds on both 32- and 64-bit,
> > so I hope that will prevent compile problems in your test
> > environment.
> >
> > I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both
> > paravirt and non-paravirt).
>
> i've done a dozen random tests too and it's looking good so far. Nice
> work!

found a couple of bugs.

firstly, 64-bit wasnt so lucky, you broke iounmap()/change_page_attr()
:-)

The crash is here:

[ 0.000000] PCI: Calling quirk ffffffff804625d0 for 0000:00:02.0
[ 0.000000] PCI: Calling quirk ffffffff80565b10 for 0000:00:02.0
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at arch/x86/mm/pageattr_64.c:176!
[ 0.000000] invalid opcode: 0000 [1] SMP
[ 0.000000] CPU 1
[ 0.000000] Modules linked in:
[ 0.000000] Pid: 1, comm: swapper Not tainted 2.6.24-rc5 #5
[ 0.000000] RIP: 0010:[<ffffffff802229b9>] [<ffffffff802229b9>] __change_page_attr+0x189/0x2e0
[ 0.000000] RSP: 0018:ffff81003f9c3d30 EFLAGS: 00010282
[ 0.000000] RAX: 0000000000000400 RBX: 00000000000da103 RCX: ffffe20000000370
[ 0.000000] RDX: 000000000000006e RSI: 00003ffffffff000 RDI: ffff81000000a000
[ 0.000000] RBP: ffff81003f9c3d90 R08: 80000000da0001e3 R09: 0000000000000001
[ 0.000000] R10: 0000000000000001 R11: 00000000ffffffff R12: ffff81000000a680
[ 0.000000] R13: 0000000000000001 R14: 8000000000000163 R15: ffff8100da103000
[ 0.000000] FS: 0000000000000000(0000) GS:ffff81003f8014b0(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 0.000000] CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.000000] Process swapper (pid: 1, threadinfo ffff81003f9c2000, task ffff81003f9c0000)
[ 0.000000] Stack: ffff8100da102000 8000000000000163 8000000000000163 0000000000000000
[ 0.000000] ffffffff809320e0 ffff8100da102000 ffff81003f9c3d70 00000000000da103
[ 0.000000] ffff8100da103000 0000000000000001 8000000000000163 0000000000000000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff80222bb6>] change_page_attr_addr+0xa6/0x150
[ 0.000000] [<ffffffff8022243b>] ioremap_change_attr+0x5b/0x70
[ 0.000000] [<ffffffff8022265f>] iounmap+0xbf/0xe0
[ 0.000000] [<ffffffff80565e23>] quirk_usb_early_handoff+0x313/0x410
[ 0.000000] [<ffffffff8044a109>] kobject_put+0x19/0x20
[ 0.000000] [<ffffffff804a45d5>] put_device+0x15/0x20
[ 0.000000] [<ffffffff804611bd>] pci_fixup_device+0x8d/0xe0
[ 0.000000] [<ffffffff8045f53d>] pci_init+0x1d/0x40
[ 0.000000] [<ffffffff80a04794>] kernel_init+0x164/0x350
[ 0.000000] [<ffffffff8025cd9f>] trace_hardirqs_on+0xbf/0x160
[ 0.000000] [<ffffffff806fc824>] trace_hardirqs_on_thunk+0x35/0x3a
[ 0.000000] [<ffffffff8025cd9f>] trace_hardirqs_on+0xbf/0x160
[ 0.000000] [<ffffffff8020cba8>] child_rip+0xa/0x12
[ 0.000000] [<ffffffff8020c2bf>] restore_args+0x0/0x30
[ 0.000000] [<ffffffff80a04630>] kernel_init+0x0/0x350
[ 0.000000] [<ffffffff8020cb9e>] child_rip+0x0/0x12
[ 0.000000]
[ 0.000000]
[ 0.000000] Code: 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 48 b8 7f 0f
[ 0.000000] RIP [<ffffffff802229b9>] __change_page_attr+0x189/0x2e0
[ 0.000000] RSP <ffff81003f9c3d30>
[ 0.000000] Kernel panic - not syncing: Attempted to kill init!

secondly, as i tried to bisect it, it would build and boot at this
patch:

Subject: x86: clean up asm-x86/page*.h

but wouldnt build at this patch:

Subject: x86: unify pgtable*.h

nor at this patch:

Subject: x86: fix up formatting in pgtable*.h

it finally built at:

Subject: x86: use a uniform structure for pte_t

but crashed. So the bug is in one of those 3 patches. Please make them
bisect-friendly.

full bootlog and config attached. (The crash happens reliably here - if
you cannot reproduce it then i can try any fix from you. I've removed
the 5 patches of yours for now from x86.git.)

Ingo


Attachments:
(No filename) (4.14 kB)
config (47.91 kB)
crash.log (66.38 kB)
Download all attachments

2007-12-20 21:03:45

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification

Ingo Molnar wrote:
> * Jeremy Fitzhardinge <[email protected]> wrote:
>
>
>> Here's another round of the pagetable unification patches. I've done
>> a few dozen rounds of randconfig builds on both 32- and 64-bit, so I
>> hope that will prevent compile problems in your test environment.
>>
>> I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both
>> paravirt and non-paravirt).
>>
>
> i've done a dozen random tests too and it's looking good so far. Nice
> work!
>
> pgtable_32.h and pgtable_64.h still look a tiny bit messy from the
> include file dependencies POV. For example pgtable_32.h:
>
> #include <asm/processor.h>
> #include <asm/fixmap.h>
> #include <linux/threads.h>
> #include <asm/paravirt.h>
>
> #include <linux/bitops.h>
> #include <linux/slab.h>
> #include <linux/list.h>
> #include <linux/spinlock.h>
>
> that asm/paravirt.h include is already present in pgtable.h, in a
> somewhat quirky way:
>
> #ifdef CONFIG_PARAVIRT
> #include <asm/paravirt.h>
> #else /* !CONFIG_PARAVIRT */
>
> also, most of the:
>
> scripts/checkpatch.pl --file include/asm-x86/pgtable*.h
>
> complaints are real ones and should be fixed.
>
> would you be interested in cleaning up that stuff once and forever? It
> would be a fine approach if you just tried to quickly push for a "high
> quality" end result in a series of patches and sent that series to me,
> without having tested it fully through - i can figure out whatever build
> breakages and dependencies there still are. So there would be no
> expectation of getting such a cleanup series right in the first (or
> second, or third) attempt, this is spaghetti code that has been
> accumulated up for years. The important thing would be to be careful to
> not introduce runtime breakages accidentally - build breakages due to
> some include file dependency we can sort out just fine. Hm?

Yep, I'm happy to do a cleanup pass. I just wanted to get this out the
door while it seemed to work for me. But I'll fix the bug first.

J

2007-12-20 21:09:20

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification

Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
>
>>> Here's another round of the pagetable unification patches. I've
>>> done a few dozen rounds of randconfig builds on both 32- and 64-bit,
>>> so I hope that will prevent compile problems in your test
>>> environment.
>>>
>>> I've also boot-tested 64-bit and 32-bit PAE/non-PAE configs (both
>>> paravirt and non-paravirt).
>>>
>> i've done a dozen random tests too and it's looking good so far. Nice
>> work!
>>
>
> found a couple of bugs.
>
> firstly, 64-bit wasnt so lucky, you broke iounmap()/change_page_attr()
> :-)
>

Crap. Worked for me. I'll look into it.

> The crash is here:
>
> [ 0.000000] PCI: Calling quirk ffffffff804625d0 for 0000:00:02.0
> [ 0.000000] PCI: Calling quirk ffffffff80565b10 for 0000:00:02.0
> [ 0.000000] ------------[ cut here ]------------
> [ 0.000000] kernel BUG at arch/x86/mm/pageattr_64.c:176!
> [ 0.000000] invalid opcode: 0000 [1] SMP
> [ 0.000000] CPU 1
> [ 0.000000] Modules linked in:
> [ 0.000000] Pid: 1, comm: swapper Not tainted 2.6.24-rc5 #5
> [ 0.000000] RIP: 0010:[<ffffffff802229b9>] [<ffffffff802229b9>] __change_page_attr+0x189/0x2e0
> [ 0.000000] RSP: 0018:ffff81003f9c3d30 EFLAGS: 00010282
> [ 0.000000] RAX: 0000000000000400 RBX: 00000000000da103 RCX: ffffe20000000370
> [ 0.000000] RDX: 000000000000006e RSI: 00003ffffffff000 RDI: ffff81000000a000
> [ 0.000000] RBP: ffff81003f9c3d90 R08: 80000000da0001e3 R09: 0000000000000001
> [ 0.000000] R10: 0000000000000001 R11: 00000000ffffffff R12: ffff81000000a680
> [ 0.000000] R13: 0000000000000001 R14: 8000000000000163 R15: ffff8100da103000
> [ 0.000000] FS: 0000000000000000(0000) GS:ffff81003f8014b0(0000) knlGS:0000000000000000
> [ 0.000000] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 0.000000] CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
> [ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 0.000000] Process swapper (pid: 1, threadinfo ffff81003f9c2000, task ffff81003f9c0000)
> [ 0.000000] Stack: ffff8100da102000 8000000000000163 8000000000000163 0000000000000000
> [ 0.000000] ffffffff809320e0 ffff8100da102000 ffff81003f9c3d70 00000000000da103
> [ 0.000000] ffff8100da103000 0000000000000001 8000000000000163 0000000000000000
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff80222bb6>] change_page_attr_addr+0xa6/0x150
> [ 0.000000] [<ffffffff8022243b>] ioremap_change_attr+0x5b/0x70
> [ 0.000000] [<ffffffff8022265f>] iounmap+0xbf/0xe0
> [ 0.000000] [<ffffffff80565e23>] quirk_usb_early_handoff+0x313/0x410
> [ 0.000000] [<ffffffff8044a109>] kobject_put+0x19/0x20
> [ 0.000000] [<ffffffff804a45d5>] put_device+0x15/0x20
> [ 0.000000] [<ffffffff804611bd>] pci_fixup_device+0x8d/0xe0
> [ 0.000000] [<ffffffff8045f53d>] pci_init+0x1d/0x40
> [ 0.000000] [<ffffffff80a04794>] kernel_init+0x164/0x350
> [ 0.000000] [<ffffffff8025cd9f>] trace_hardirqs_on+0xbf/0x160
> [ 0.000000] [<ffffffff806fc824>] trace_hardirqs_on_thunk+0x35/0x3a
> [ 0.000000] [<ffffffff8025cd9f>] trace_hardirqs_on+0xbf/0x160
> [ 0.000000] [<ffffffff8020cba8>] child_rip+0xa/0x12
> [ 0.000000] [<ffffffff8020c2bf>] restore_args+0x0/0x30
> [ 0.000000] [<ffffffff80a04630>] kernel_init+0x0/0x350
> [ 0.000000] [<ffffffff8020cb9e>] child_rip+0x0/0x12
> [ 0.000000]
> [ 0.000000]
> [ 0.000000] Code: 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 48 b8 7f 0f
> [ 0.000000] RIP [<ffffffff802229b9>] __change_page_attr+0x189/0x2e0
> [ 0.000000] RSP <ffff81003f9c3d30>
> [ 0.000000] Kernel panic - not syncing: Attempted to kill init!
>
> secondly, as i tried to bisect it, it would build and boot at this
> patch:
>
> Subject: x86: clean up asm-x86/page*.h
>
> but wouldnt build at this patch:
>
> Subject: x86: unify pgtable*.h
>
> nor at this patch:
>
> Subject: x86: fix up formatting in pgtable*.h
>
> it finally built at:
>
> Subject: x86: use a uniform structure for pte_t
>
> but crashed. So the bug is in one of those 3 patches. Please make them
> bisect-friendly.
>

Hm, I tried to.

> full bootlog and config attached. (The crash happens reliably here - if
> you cannot reproduce it then i can try any fix from you. I've removed
> the 5 patches of yours for now from x86.git.)
>

OK. Will beat on this some more. Unfortunately I can only test 64-bit
stuff under kvm, and I'm a little bit skeptical of how solid it is,
since i get sporadic random oopses even with base x86.git. But a nice
simple repeatable BUG_ON should be easy to fix regardless.

J

2007-12-20 21:39:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Jeremy Fitzhardinge <[email protected]> wrote:

> > found a couple of bugs.
> >
> > firstly, 64-bit wasnt so lucky, you broke
> > iounmap()/change_page_attr()
> > :-)
>
> Crap. Worked for me. I'll look into it.

well, there's an easy solution for unification patches: the resulting
object files must have _exactly the same_ content as without the
unification patches. (Modulo strings as WARN_ON()s referring to
include-file names.)

If they differ then the unification did something wrong. With your
patchset and the config i sent, the difference is visible in the image
size already:

text data bss dec hex filename
7763766 967330 5812328 14543424 ddea40 vmlinux.after
7763811 967330 5812328 14543469 ddea6d vmlinux.before

also, reducing the size and scope of changes helps as well - because
that way it can be bisected down to specific changes. Mistakes
inevitably happen, especially if you do not enforce a rigid
byte-for-byte correctness along the way. You did 5 rather large patches,
and it's not testable because your unification steps were too coarse.

In other words: you were asking for trouble and you got it :-)

Ingo

2007-12-20 22:08:44

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification

Ingo Molnar wrote:
> * Jeremy Fitzhardinge <[email protected]> wrote:
>
>
>>> found a couple of bugs.
>>>
>>> firstly, 64-bit wasnt so lucky, you broke
>>> iounmap()/change_page_attr()
>>> :-)
>>>
>> Crap. Worked for me. I'll look into it.
>>
>
> well, there's an easy solution for unification patches: the resulting
> object files must have _exactly the same_ content as without the
> unification patches. (Modulo strings as WARN_ON()s referring to
> include-file names.)
>
> If they differ then the unification did something wrong. With your
> patchset and the config i sent, the difference is visible in the image
> size already:
>
> text data bss dec hex filename
> 7763766 967330 5812328 14543424 ddea40 vmlinux.after
> 7763811 967330 5812328 14543469 ddea6d vmlinux.before
>
> also, reducing the size and scope of changes helps as well - because
> that way it can be bisected down to specific changes. Mistakes
> inevitably happen, especially if you do not enforce a rigid
> byte-for-byte correctness along the way. You did 5 rather large patches,
> and it's not testable because your unification steps were too coarse.
>

But byte-for-byte identity isn't (necessarily) possible when actually
unifying. If the same function exists in different forms on 32- and
64-bit, then unifying requires I pick one of them (or perhaps a new
superset) to use in the unified form. That function may generate
different code compared to the one that it replaced...

But you're right, I can do the patches in a more piecemeal form. I'll
see if I can rework them.

J

2007-12-20 22:24:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Jeremy Fitzhardinge <[email protected]> wrote:

> But byte-for-byte identity isn't (necessarily) possible when actually
> unifying. If the same function exists in different forms on 32- and
> 64-bit, then unifying requires I pick one of them (or perhaps a new
> superset) to use in the unified form. That function may generate
> different code compared to the one that it replaced...

it's still possible: you can do preparatory patches that bring one
architecture in sync with the other one, in small, per function steps.
Then the actual unification is still an identity transformation. (and
all the preparatory patches are small and bisectable)

it's also a lot less frustrating and a lot more enjoyable that way IMO.
If it's 50 small patches, then so be it ... 50 patches only take ~2
seconds more for me to apply to x86.git (which time is immediately saved
by the vastly improved reviewability and testability of a 50 patches
set), so dont worry about any overhead on the maintainers side. And
you'll end up moving up on the v2.6.25 contributors top-list on LWN as
well ;-) The worst aspect of it is writing up the 50 changelogs (i use
pre-created templates for that) and figuring out how to script a
patch-bomb to lkml. In every other aspect it's a win-win scenario for
everyone involved.

Ingo

2007-12-21 00:53:12

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification

Ingo Molnar wrote:
> it's also a lot less frustrating and a lot more enjoyable that way IMO.
> If it's 50 small patches, then so be it ... 50 patches only take ~2
> seconds more for me to apply to x86.git (which time is immediately saved
> by the vastly improved reviewability and testability of a 50 patches
> set), so dont worry about any overhead on the maintainers side. And
> you'll end up moving up on the v2.6.25 contributors top-list on LWN as
> well ;-) The worst aspect of it is writing up the 50 changelogs (i use
> pre-created templates for that) and figuring out how to script a
> patch-bomb to lkml. In every other aspect it's a win-win scenario for
> everyone involved.

Well, testing for bisectability requires compiling each patch as its
applied, which gets painful for something like this where any change
will rebuild the world. And dealing with patch conflicts caused by
changing early patches in the series is never fun.

But I'm refactoring the series into smaller pieces now. Knowing what
the outcome should look like, and there the pitfalls are, makes it
fairly easy.

J

2007-12-21 00:59:12

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification


* Jeremy Fitzhardinge <[email protected]> wrote:

> Well, testing for bisectability requires compiling each patch as its
> applied, which gets painful for something like this where any change
> will rebuild the world. And dealing with patch conflicts caused by
> changing early patches in the series is never fun.

that's true. The 'rej' tool helps alot though. ( Plus a distcc cluster
that builds a distro kernel in 45-50 seconds from scratch ;)

> But I'm refactoring the series into smaller pieces now. Knowing what
> the outcome should look like, and there the pitfalls are, makes it
> fairly easy.

ok, great :-)

Ingo

2007-12-21 01:03:50

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 0/5] x86: another attempt at x86 pagetable unification

On Dec 20, 2007 10:58 PM, Ingo Molnar <[email protected]> wrote:
>
> * Jeremy Fitzhardinge <[email protected]> wrote:
>
> > Well, testing for bisectability requires compiling each patch as its
> > applied, which gets painful for something like this where any change
> > will rebuild the world. And dealing with patch conflicts caused by
> > changing early patches in the series is never fun.
>
> that's true. The 'rej' tool helps alot though. ( Plus a distcc cluster
> that builds a distro kernel in 45-50 seconds from scratch ;)


So that's your secret!


--
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."