2009-03-06 06:47:14

by Rob Landley

Subject: [PATCH] Fix ARCH=um segfault on x86-64.

Apparently, nobody other than me has ever attempted to use User Mode Linux built from 2.6.28 on x86-64, because it doesn't work. It still doesn't work in current git. I complained about it not working back in January:
http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
And today, I bothered to track down why.
This is the commit that broke it, when Peter Anvin merged x86 and x86-64 for ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
Here's a patch that fixes it for me:
Signed-off-by: Rob Landley <[email protected]>
diff -r 178a096e9e38 arch/um/Kconfig.x86
--- a/arch/um/Kconfig.x86 Fri Feb 27 16:49:46 2009 -0800
+++ b/arch/um/Kconfig.x86 Thu Mar 05 23:35:55 2009 -0600
@@ -26,9 +26,8 @@
 	def_bool !X86_XADD
 
 config 3_LEVEL_PGTABLES
-	bool "Three-level pagetables (EXPERIMENTAL)" if !64BIT
+	bool
 	default 64BIT
-	depends on EXPERIMENTAL
 	help
 	  Three-level pagetables will let UML have more than 4G of physical
 	  memory. All the memory that can't be mapped directly will be treated

What changed is that the resulting .config no longer contains the line "CONFIG_3_LEVEL_PGTABLES=y" (it's not visible, and thus not written out into the config file). Without that symbol defined, x86-64 dies trying to boot. If you tweak the Kconfig so the symbol gets written out, it starts working again.
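A quick way to tell whether a given .config is affected is to look for the symbol directly. The following is only a sketch: the helper name `config_value` and the two sample config fragments are made up for illustration and are not taken from any real build.

```python
import re


def config_value(config_text, symbol):
    """Return the value assigned to CONFIG_<symbol>, or None if no assignment exists.

    Lines such as '# CONFIG_FOO is not set' are comments and do not count
    as an assignment.
    """
    pattern = re.compile(r"^CONFIG_" + re.escape(symbol) + r"=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Hypothetical fragments: one without the symbol, one with it.
broken = "CONFIG_64BIT=y\n# CONFIG_EXPERIMENTAL is not set\n"
working = "CONFIG_64BIT=y\nCONFIG_EXPERIMENTAL=y\nCONFIG_3_LEVEL_PGTABLES=y\n"

print(config_value(broken, "3_LEVEL_PGTABLES"))
print(config_value(working, "3_LEVEL_PGTABLES"))
```

Run against a real .config (read with `open(path).read()`), a `None` result for `3_LEVEL_PGTABLES` on an x86-64 build would match the failing case described above.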
I have no idea how ANYBODY has EVER managed to use 2.6.28 User Mode Linux on an x86-64 host. My theory is that nobody ever did. I suspect that very few people use UML anymore now that KVM and the rustyvisor and such are available, and those legacy users still fiddling with it are apparently all either using old versions or 32-bit hosts. (I still like being able to stick printfs into the kernel.)
Here's the panic, in case you're wondering:
$ ./linux rw init=/bin/bash rootfstype=hostfs
Core dump limits :
 soft - 0
 hard - NONE
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...OK
Checking advanced syscall emulation patch for ptrace...OK
Checking for tmpfs mount on /dev/shm...OK
Checking PROT_EXEC mmap in /dev/shm/...OK
Checking for the skas3 patch in the host:
 - /proc/mm...not found: No such file or directory
 - PTRACE_FAULTINFO...not found
 - PTRACE_LDT...not found
UML running in SKAS0 mode
Adding 4390912 bytes to physical memory to account for exec-shield gap
Linux version 2.6.29-rc7 (landley@driftwood) (gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) ) #1 Thu Mar 5 21:20:14 CST 2009
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 9137
Kernel command line: rw init=/bin/bash rootfstype=hostfs root=98:0
PID hash table entries: 256 (order: 8, 2048 bytes)
Dentry cache hash table entries: 8192 (order: 4, 65536 bytes)
Inode-cache hash table entries: 4096 (order: 3, 32768 bytes)
Memory: 29244k available
SLUB: Genslabs=12, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Calibrating delay loop... 209.30 BogoMIPS (lpj=1046528)
Mount-cache hash table entries: 256
Checking that host ptys support output SIGIO...Yes
Checking that host ptys support SIGIO on close...No, enabling workaround
Using 2.6 host AIO
bio: create slab <bio-0> at 0
Switched to NOHz mode on CPU #0
io scheduler noop registered (default)
loop: module loaded
Initialized stdio console driver
Using a channel type which is configured out of UML
parse_chan_pair failed for device 1 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 2 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 3 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 4 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 5 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 6 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 7 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 8 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 9 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 10 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 11 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 12 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 13 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 14 : Configuration failed
Using a channel type which is configured out of UML
parse_chan_pair failed for device 15 : Configuration failed
Console initialized on /dev/tty0
console [tty0] enabled
VFS: Mounted root (hostfs filesystem) on device 0:8.
IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs

Pid: 1, comm: swapper Not tainted 2.6.29-rc7
RIP: 0033:[<000000006001b342>]
RSP: 0000000062029dd0 EFLAGS: 00010216
RAX: 00000000622af800 RBX: 00000000621b0000 RCX: 0000000003ffc09f
RDX: fffffffffff02800 RSI: 0000000060313900 RDI: 00000000622af800
RBP: 0000000060d10048 R08: 0000000000000000 R09: 0000000000100000
R10: 0000000000000000 R11: 0000000060197a00 R12: 000000006211f300
R13: 000000006211f300 R14: 0000000060206440 R15: 0000000062020300
Call Trace:
602058f8: [<600160c5>] timer_one_shot+0x55/0x80
60205908: [<6000e4b9>] segv+0x2a9/0x2d0
60205918: [<6001b342>] __memcpy+0xe/0xac
60205928: [<6003f376>] tick_dev_program_event+0x36/0xb0
60205958: [<6003f5c4>] tick_check_oneshot_change+0xf4/0x100
60205968: [<6002bc6d>] run_timer_softirq+0x1cd/0x210
602059e8: [<6000e530>] segv_handler+0x50/0xe0
60205a08: [<6003f250>] tick_handle_periodic+0x10/0x60
60205a48: [<60026a7d>] do_softirq+0x4d/0x70
60205a68: [<60026bf2>] irq_exit+0x42/0xa0
60205a88: [<6000aecf>] do_IRQ+0x2f/0x50
60205aa8: [<600154e4>] sig_handler_common+0x64/0xe0
60205b30: [<6001b342>] __memcpy+0xe/0xac
60205b50: [<600ae3de>] sysfs_new_dirent+0xfe/0x120
60205bd8: [<600156aa>] sig_handler+0x1a/0x40
60205be8: [<60015983>] handle_signal+0x73/0xb0
60205c28: [<60100140>] __restore_rt+0x0/0x10
60205cd8: [<6001b342>] __memcpy+0xe/0xac

Kernel panic - not syncing: Segfault with no mm

Pid: 1, comm: swapper Not tainted 2.6.29-rc7
RIP: 0033:[<00000000601003a7>]
RSP: 00007fff8026e2a8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000001cc4 RCX: ffffffffffffffff
RDX: 0000000000000000 RSI: 0000000000000013 RDI: 0000000000001cc4
RBP: 0000000000001cc0 R08: 00007fff8026e1f0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff8026e3b8
R13: 0000000000000004 R14: 00007fff8026e580 R15: 00007fff8026e414
Call Trace:
602057b8: [<6003aacd>] up+0x1d/0x50
602057c8: [<6000e77d>] panic_exit+0x2d/0x50
602057d8: [<600214ac>] release_console_sem+0x19c/0x1e0
602057e8: [<6003ac87>] notifier_call_chain+0x37/0x70
60205818: [<60167cd7>] panic+0xd0/0x164
60205858: [<60100370>] __sigprocmask+0x10/0x40
60205878: [<60167df6>] printk+0x8b/0x95
60205898: [<6001604e>] os_nsecs+0xe/0x30
602058b8: [<6001b342>] __memcpy+0xe/0xac
602058c8: [<6000d080>] show_trace+0x60/0xc0
602058e8: [<6001b148>] show_regs+0x28/0x30
60205908: [<6000e4c5>] segv+0x2b5/0x2d0
60205918: [<6001b342>] __memcpy+0xe/0xac
60205928: [<6003f376>] tick_dev_program_event+0x36/0xb0
60205958: [<6003f5c4>] tick_check_oneshot_change+0xf4/0x100
60205968: [<6002bc6d>] run_timer_softirq+0x1cd/0x210
602059e8: [<6000e530>] segv_handler+0x50/0xe0
60205a08: [<6003f250>] tick_handle_periodic+0x10/0x60
60205a48: [<60026a7d>] do_softirq+0x4d/0x70
60205a68: [<60026bf2>] irq_exit+0x42/0xa0
60205a88: [<6000aecf>] do_IRQ+0x2f/0x50
60205aa8: [<600154e4>] sig_handler_common+0x64/0xe0
60205b30: [<6001b342>] __memcpy+0xe/0xac
60205b50: [<600ae3de>] sysfs_new_dirent+0xfe/0x120
60205bd8: [<600156aa>] sig_handler+0x1a/0x40
60205be8: [<60015983>] handle_signal+0x73/0xb0
60205c28: [<60100140>] __restore_rt+0x0/0x10
60205cd8: [<6001b342>] __memcpy+0xe/0xac

Segmentation fault

Rob


2009-03-06 08:48:45

by Cong Wang

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 06, 2009 at 12:42:14AM -0600, Rob Landley wrote:
>Apparently, nobody other than me has ever attempted to use User Mode Linux
>built from 2.6.28 on x86-64, because it doesn't work. It still doesn't work
>in current git. I complained about it not working back in January:
>
>http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
>http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
>
>And today, I bothered to track down why.
>
>This is the commit that broke it, when Peter Anvin merged x86 and x86-64 for
>ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
>
>Here's a patch that fixes it for me:

Thanks, Bob!

>
>Signed-off-by: Rob Landley <[email protected]>
>
>diff -r 178a096e9e38 arch/um/Kconfig.x86
>--- a/arch/um/Kconfig.x86 Fri Feb 27 16:49:46 2009 -0800
>+++ b/arch/um/Kconfig.x86 Thu Mar 05 23:35:55 2009 -0600
>@@ -26,9 +26,8 @@
> def_bool !X86_XADD
>
> config 3_LEVEL_PGTABLES
>- bool "Three-level pagetables (EXPERIMENTAL)" if !64BIT
>+ bool
> default 64BIT
>- depends on EXPERIMENTAL


So, on i386, it will not depend on EXPERIMENTAL any more, right?

How about changing it to the following?

depends on 64BIT || EXPERIMENTAL

> help
> Three-level pagetables will let UML have more than 4G of physical
> memory. All the memory that can't be mapped directly will be treated
>
>What changed is that the resulting .config no longer contains the line
>"CONFIG_3_LEVEL_PGTABLES=y" (it's not visible, and thus not written out into
>the config file file). Without that symbol defined, x86-64 dies trying to
>boot. If you tweak the Kconfig so the symbol gets written out, it starts
>working again.
>
>I have no idea how ANYBODY has EVER managed to use 2.6.28 User Mode Linux on
>an x86-64 host. My theory is that nobody ever did. I suspect that very few
>people use UML anymore now that KVM and the rustyvisor and such are available,
>and those legacy users still fiddling with it are apparently all either using
>old versions or 32-bit hosts. (I still like being able to stick printfs into
>the kernel.)
>

I am sorry that I have never had an x86_64 machine to use. :(

--
Do what you love, f**k the rest! F**k the regulations!

2009-03-06 09:50:54

by Geert Uytterhoeven

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 6, 2009 at 09:48, Américo Wang <[email protected]> wrote:
> On Fri, Mar 06, 2009 at 12:42:14AM -0600, Rob Landley wrote:
>>Apparently, nobody other than me has ever attempted to use User Mode Linux
>>built from 2.6.28 on x86-64, because it doesn't work.  It still doesn't work
>>in current git.  I complained about it not working back in January:
>>
>>http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
>>http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
>>
>>And today, I bothered to track down why.
>>
>>This is the commit that broke it, when Peter Anvin merged x86 and x86-64 for
>>ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
>>
>>Here's a patch that fixes it for me:
>
> Thanks, Bob!

I just did a build of plain v2.6.28 on amd64 aka x86-64. The
resulting image ran fine.

I attached my .config.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Attachments:
config (16.39 kB)

2009-03-06 09:51:29

by Geert Uytterhoeven

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 6, 2009 at 10:50, Geert Uytterhoeven <[email protected]> wrote:
> On Fri, Mar 6, 2009 at 09:48, Américo Wang <[email protected]> wrote:
>> On Fri, Mar 06, 2009 at 12:42:14AM -0600, Rob Landley wrote:
>>>Apparently, nobody other than me has ever attempted to use User Mode Linux
>>>built from 2.6.28 on x86-64, because it doesn't work.  It still doesn't work
>>>in current git.  I complained about it not working back in January:
>>>
>>>http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
>>>http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
>>>
>>>And today, I bothered to track down why.
>>>
>>>This is the commit that broke it, when Peter Anvin merged x86 and x86-64 for
>>>ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
>>>
>>>Here's a patch that fixes it for me:
>>
>> Thanks, Bob!
>
> I've just did a build of plain v2.6.28 on amd64 aka x86-64. The
> resulting image ran fine.

So I'm wondering: why does it work for me and not for you?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2009-03-06 11:18:31

by Cong Wang

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 06, 2009 at 10:51:12AM +0100, Geert Uytterhoeven wrote:
>On Fri, Mar 6, 2009 at 10:50, Geert Uytterhoeven <[email protected]> wrote:
>> On Fri, Mar 6, 2009 at 09:48, Américo Wang <[email protected]> wrote:
>>> On Fri, Mar 06, 2009 at 12:42:14AM -0600, Rob Landley wrote:
>>>>Apparently, nobody other than me has ever attempted to use User Mode Linux
>>>>built from 2.6.28 on x86-64, because it doesn't work.  It still doesn't work
>>>>in current git.  I complained about it not working back in January:
>>>>
>>>>http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
>>>>http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
>>>>
>>>>And today, I bothered to track down why.
>>>>
>>>>This is the commit that broke it, when Peter Anvin merged x86 and x86-64 for
>>>>ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
>>>>
>>>>Here's a patch that fixes it for me:
>>>
>>> Thanks, Bob!
>>
>> I've just did a build of plain v2.6.28 on amd64 aka x86-64. The
>> resulting image ran fine.
>
>So I'm wondering: why does it work for me and not for you?
>

It does work well on x86_64, but my question is whether this will
break i386 or not, since CONFIG_3_LEVEL_PGTABLES previously depended on
EXPERIMENTAL on i386 and this patch removes that dependency.

--
Do what you love, f**k the rest! F**k the regulations!

2009-03-06 15:14:31

by Jeff Dike

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 06, 2009 at 07:18:34PM +0800, Américo Wang wrote:
> It does work well x86_64, but my question is that whether this will
> break i386 or not, since before, CONFIG_3_LEVEL_PGTABLES depends on
> EXPERIMENTAL on i386, this patch removes it.

As long as CONFIG_3_LEVEL_PGTABLES is off on 32-bit, it should be fine.

It did work, last I checked, but 3-level page tables on 32-bit is a
very rarely used combination, and not useful.

Jeff

--
Work email - jdike at linux dot intel dot com

2009-03-06 22:20:27

by Rob Landley

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Friday 06 March 2009 03:50:38 Geert Uytterhoeven wrote:
> On Fri, Mar 6, 2009 at 09:48, Américo Wang <[email protected]> wrote:
> > On Fri, Mar 06, 2009 at 12:42:14AM -0600, Rob Landley wrote:
> >>Apparently, nobody other than me has ever attempted to use User Mode
> >> Linux built from 2.6.28 on x86-64, because it doesn't work.  It still
> >> doesn't work in current git.  I complained about it not working back in
> >> January:
> >>
> > >>http://sourceforge.net/mailarchive/forum.php?thread_name=200901130159.04389.rob%40landley.net&forum_name=user-mode-linux-devel
> >>http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/00669.html
> >>
> >>And today, I bothered to track down why.
> >>
> >>This is the commit that broke it, when Peter Anvin merged x86 and x86-64
> >> for ARCH=um: http://kernel.org/hg/linux-2.6/rev/117978
> >>
> >>Here's a patch that fixes it for me:
> >
> > Thanks, Bob!
>
> I've just did a build of plain v2.6.28 on amd64 aka x86-64. The
> resulting image ran fine.
>
> I attached my .config.

Which contains:
CONFIG_3_LEVEL_PGTABLES=y

So the question is, why is your config saving that value, and mine isn't?

Ah, I found it. You enabled CONFIG_EXPERIMENTAL, and I didn't. That's the
difference.

Ok, CONFIG_EXPERIMENTAL is required in order for UML to initialize its memory
management. That makes a bit more sense why other people haven't seen this...
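The comparison described here (two .config files, find the symbols that differ) can be sketched in a few lines. This is an illustration only: `parse_config`, `config_diff` and the sample fragments are hypothetical, and the parser deliberately ignores "# CONFIG_FOO is not set" comment lines, treating them as absent.

```python
def parse_config(text):
    """Map symbol name -> value for every 'CONFIG_X=value' line."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Comment lines ('# CONFIG_X is not set') start with '#', so they are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            values[name] = value
    return values


def config_diff(a_text, b_text):
    """Return {symbol: (value_in_a, value_in_b)} for symbols whose values differ."""
    a, b = parse_config(a_text), parse_config(b_text)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Made-up fragments standing in for the failing and the working config.
failing = "CONFIG_64BIT=y\n# CONFIG_EXPERIMENTAL is not set\n"
working = "CONFIG_64BIT=y\nCONFIG_EXPERIMENTAL=y\nCONFIG_3_LEVEL_PGTABLES=y\n"

for name, (mine, theirs) in config_diff(failing, working).items():
    print(name, mine, theirs)
```

On real files the list of differences will be much longer, but filtering it for symbols that the broken symbol depends on narrows things down quickly.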

Rob

2009-03-06 22:22:35

by Rob Landley

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Friday 06 March 2009 08:35:43 Jeff Dike wrote:
> On Fri, Mar 06, 2009 at 07:18:34PM +0800, Américo Wang wrote:
> > It does work well x86_64, but my question is that whether this will
> > break i386 or not, since before, CONFIG_3_LEVEL_PGTABLES depends on
> > EXPERIMENTAL on i386, this patch removes it.
>
> As long as CONFIG_3_LEVEL_PGTABLES is off on 32-bit, it should be fine.
>
> It did work, last I checked, but 3-level page tables on 32-bit is a
> very rarely used combination, and not useful.

I have no idea if my patch is the _right_ fix, I just know I couldn't use UML
for 2 months and now I can.

If you enable CONFIG_EXPERIMENTAL, then it writes it out to the config file.
If you don't, it hides it and doesn't write it out even though the value would
be y. (The visibility predicates affect the resulting data.)
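As a rough illustration of the behaviour discussed in this thread, here is a toy model of the three variants of the entry: the one before the patch ("depends on EXPERIMENTAL"), the patched one (no dependency), and the suggested "depends on 64BIT || EXPERIMENTAL". It is not the real Kconfig implementation. It assumes bool symbols only, assumes that a default takes effect only while the dependency holds, and ignores the user-visible prompt the pre-patch entry offered on 32-bit. All names in it are hypothetical.

```python
from itertools import product


def default_written(default, depends):
    """Simplified bool-only rule: the default applies only while the dependency holds."""
    return bool(default and depends)


# Dependency expression for each variant discussed in the thread.
VARIANTS = {
    "before patch": lambda is_64bit, experimental: experimental,
    "patched": lambda is_64bit, experimental: True,
    "64BIT || EXPERIMENTAL": lambda is_64bit, experimental: is_64bit or experimental,
}


def outcome(variant, is_64bit, experimental):
    """True if CONFIG_3_LEVEL_PGTABLES=y would end up in .config with no user input."""
    depends = VARIANTS[variant](is_64bit, experimental)
    # 'default 64BIT' is the same in all three variants.
    return default_written(default=is_64bit, depends=depends)


for variant, is_64bit, experimental in product(VARIANTS, (True, False), (True, False)):
    print(
        variant,
        "64-bit" if is_64bit else "32-bit",
        "EXPERIMENTAL=y" if experimental else "EXPERIMENTAL=n",
        outcome(variant, is_64bit, experimental),
    )
```

Under these assumptions the only failing combination on 64-bit is the pre-patch entry with EXPERIMENTAL off, and none of the variants turns the symbol on by default on 32-bit.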

Rob

2009-03-10 14:29:18

by Cong Wang

Subject: Re: [PATCH] Fix ARCH=um segfault on x86-64.

On Fri, Mar 06, 2009 at 09:35:43AM -0500, Jeff Dike wrote:
>On Fri, Mar 06, 2009 at 07:18:34PM +0800, Américo Wang wrote:
>> It does work well x86_64, but my question is that whether this will
>> break i386 or not, since before, CONFIG_3_LEVEL_PGTABLES depends on
>> EXPERIMENTAL on i386, this patch removes it.
>
>As long as CONFIG_3_LEVEL_PGTABLES is off on 32-bit, it should be fine.
>
>It did work, last I checked, but 3-level page tables on 32-bit is a
>very rarely used combination, and not useful.
>

So this patch should be fine, right? :)

Thanks for your comments!

--
Do what you love, f**k the rest! F**k the regulations!