2008-07-10 14:20:11

by Toralf Förster

Subject: UML kernel failed to start between 2.6.26-rc8 and current git sources

Hm,

from v2.6.26-rc8 to v2.6.26-rc9-56-g6329d30 something affected UML, because my user mode linux kernel produces this:

...
VFS: Mounted root (ext3 filesystem) readonly.
Registers -
0 0x0
1 0x0
2 0x0
3 0x0
4 0x0
5 0x0
6 0x0
7 0x0
8 0x0
9 0x0
10 0x0
11 0x0
12 0x0
13 0x0
14 0x0
15 0x0
16 0x0
Kernel panic - not syncing: do_syscall_stub : PTRACE_SETREGS failed, errno = 5


EIP: 0073:[<b7eef424>] CPU: 0 Not tainted ESP: 007b:bfc0acf0 EFLAGS: 00200246
Not tainted
EAX: 00000000 EBX: 00002e8f ECX: 00000013 EDX: 00002e8f
ESI: 00002e8b EDI: 00000000 EBP: 00000002 DS: 007b ES: 007b
17c6bd00: [<080927c3>] notifier_call_chain+0x43/0x90
17c6bd24: [<0809286e>] atomic_notifier_call_chain+0x2e/0x40
17c6bd3c: [<08078eb6>] panic+0x76/0x110
17c6bd5c: [<0806e4d3>] run_syscall_stub+0x2f3/0x300
17c6bd90: [<0807044d>] write_ldt_entry+0x1ad/0x1c0
17c6bdd8: [<08070c0e>] init_new_ldt+0x18e/0x3d0
17c6bdf0: [<080a55a7>] __alloc_pages_internal+0x87/0x450
17c6be1c: [<0806f105>] start_userspace+0x1a5/0x1f0
17c6be50: [<0805ede1>] init_new_context+0x91/0x140
17c6be6c: [<080c6789>] bprm_mm_init+0x49/0x1c0
17c6be78: [<0806c485>] set_signals+0x25/0x30
17c6be80: [<080be18a>] kmem_cache_alloc+0x4a/0xa0
17c6be98: [<080c7e73>] do_execve+0x83/0x1e0
17c6bebc: [<0805a685>] execve1+0x35/0x60
17c6bedc: [<0805a714>] um_execve+0x14/0x50
17c6bee0: [<080ce4cb>] dupfd+0xfb/0x130
17c6beec: [<0805ce8f>] kernel_execve+0x3f/0x50
17c6bf08: [<0805a302>] run_init_process+0x22/0x30
17c6bf18: [<0805a3c6>] init_post+0xb6/0x100
17c6bf2c: [<080496e1>] kernel_init+0x251/0x2b0
17c6bf54: [<08055400>] tcp_congestion_default+0x0/0x20
17c6bfa4: [<08075b36>] finish_task_switch+0x26/0x80
17c6bfbc: [<0806b0b3>] run_kernel_thread+0x43/0x50
17c6bfd8: [<0806b08f>] run_kernel_thread+0x1f/0x50
17c6bfe4: [<0805be11>] new_thread_handler+0x61/0x90
17c6bfe8: [<08049490>] kernel_init+0x0/0x2b0

Terminated

--
MfG/Sincerely

Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3



2008-07-10 16:35:42

by Jeff Dike

Subject: Re: UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thu, Jul 10, 2008 at 04:19:51PM +0200, Toralf Förster wrote:
> from v2.6.26-rc8 to v2.6.26-rc9-56-g6329d30 something affected UML, because my user mode linux kernel produces this:
>
> ...
> VFS: Mounted root (ext3 filesystem) readonly.
> Registers -
> 0 0x0
> 1 0x0
> 2 0x0
> 3 0x0
> 4 0x0
> 5 0x0
> 6 0x0
> 7 0x0
> 8 0x0
> 9 0x0
> 10 0x0
> 11 0x0
> 12 0x0
> 13 0x0
> 14 0x0
> 15 0x0
> 16 0x0
> Kernel panic - not syncing: do_syscall_stub : PTRACE_SETREGS failed, errno = 5
>

I have 2.6.26-rc9-00010-g3bc5ab9 and it seems fine. I'll pull the latest git
and see if that's still OK.

What's the host?

Jeff

--
Work email - jdike at linux dot intel dot com

2008-07-10 16:58:32

by Jeff Dike

Subject: Re: [uml-devel] UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thu, Jul 10, 2008 at 04:19:51PM +0200, Toralf Förster wrote:
> from v2.6.26-rc8 to v2.6.26-rc9-56-g6329d30 something affected UML, because my user mode linux kernel produces this:

That UML runs fine here. Did you also upgrade the host kernel between
rc8 and now?

Jeff

--
Work email - jdike at linux dot intel dot com

2008-07-10 17:34:52

by Toralf Förster

Subject: Re: [uml-devel] UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thursday 10 July 2008 18:58:11 Jeff Dike wrote:
> On Thu, Jul 10, 2008 at 04:19:51PM +0200, Toralf Förster wrote:
> > from v2.6.26-rc8 to v2.6.26-rc9-56-g6329d30 something affected UML, because my user mode linux kernel produces this:
>
> That UML runs fine here. Did you also upgrade the host kernel between
> rc8 and now?
>
Yes, but that's not the reason. The UML crashes both under host kernel
2.6.24-gentoo-r8 and 2.6.25-gentoo-r6. The last (available) working UML kernel
is linux-v2.6.26-rc8-227 (tested under host kernel 2.6.25-gentoo-r6).

--
MfG/Sincerely

Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3



2008-07-10 18:53:11

by Jeff Dike

Subject: Re: [uml-devel] UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thu, Jul 10, 2008 at 07:34:29PM +0200, Toralf Förster wrote:
> On Thursday 10 July 2008 18:58:11 Jeff Dike wrote:
> > On Thu, Jul 10, 2008 at 04:19:51PM +0200, Toralf Förster wrote:
> > > from v2.6.26-rc8 to v2.6.26-rc9-56-g6329d30 something affected UML, because my user mode linux kernel produces this:
> >
> > That UML runs fine here. Did you also upgrade the host kernel between
> > rc8 and now?
> >
> Yes, but that's not the reason. The UML crashes both under host kernel
> 2.6.24-gentoo-r8 and 2.6.25-gentoo-r6. The last (available) working UML kernel
> is linux-v2.6.26-rc8-227 (tested under host kernel 2.6.25-gentoo-r6).

Can you send me your crashing UML binary?

Also, can you bisect this and see where something broke?

Jeff

--
Work email - jdike at linux dot intel dot com

2008-07-10 19:30:58

by Jeff Dike

Subject: Re: UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thu, Jul 10, 2008 at 08:09:59PM +0200, Toralf Förster wrote:
> On Thursday 10 July 2008 18:35:15 Jeff Dike wrote:
>
> > What's the host?
>
> tfoerste@n22 ~ $ uname -a
> Linux n22 2.6.25-gentoo-r6 #4 Thu Jul 10 19:49:36 CEST 2008 i686 Intel(R) Pentium(R) M processor 1700MHz GenuineIntel GNU/Linux
>
> BTW, this works fine and shows only the config:
> n22 ~ # linux-v2.6.26-rc8-227 --showconfig
>
> whereas this failed:
>
> n22 ~ # linux-v2.6.26-rc9-56 --showconfig
> Locating the bottom of the address space ... 0x0
> Locating the top of the address space ... 0xc0000000

OK, I believe you're seeing the same bug as Uli saw here:

http://marc.info/?l=linux-kernel&m=121011518003727&w=2

More precise symptoms are here:

http://marc.info/?l=linux-kernel&m=121011722806093&w=2

I never did figure that one out. I had the same kernel version, same
toolchain, same everything as far as I could see, and I couldn't
reproduce it, except with a binary that he gave me.

You're seeing it on i386, whereas he saw it on x86_64.

The underlying problem is that somehow the UML initcalls aren't being
run, which is why you're seeing all zeros in the register dump.

If you bisect this, I bet you end up at the no-unit-at-a-time patch
that he ended up at. And I have no idea what that has to do with
anything.

Jeff

--
Work email - jdike at linux dot intel dot com

2008-07-11 08:43:59

by Toralf Förster

Subject: Re: UML kernel failed to start between 2.6.26-rc8 and current git sources

On Thursday 10 July 2008 21:30:19 Jeff Dike wrote:
> On Thu, Jul 10, 2008 at 08:09:59PM +0200, Toralf Förster wrote:
> > On Thursday 10 July 2008 18:35:15 Jeff Dike wrote:
> >
> > > What's the host?
> >
> > tfoerste@n22 ~ $ uname -a
> > Linux n22 2.6.25-gentoo-r6 #4 Thu Jul 10 19:49:36 CEST 2008 i686 Intel(R) Pentium(R) M processor 1700MHz GenuineIntel GNU/Linux
> >
> > BTW, this works fine and shows only the config:
> > n22 ~ # linux-v2.6.26-rc8-227 --showconfig
> >
> > whereas this failed:
> >
> > n22 ~ # linux-v2.6.26-rc9-56 --showconfig
> > Locating the bottom of the address space ... 0x0
> > Locating the top of the address space ... 0xc0000000
>
> OK, I believe you're seeing the same bug as Uli saw here:
>
> http://marc.info/?l=linux-kernel&m=121011518003727&w=2
>
> More precise symptoms are here:
>
> http://marc.info/?l=linux-kernel&m=121011722806093&w=2
>
> I never did figure that one out. I had the same kernel version, same
> toolchain, same everything as far as I could see, and I couldn't
> reproduce it, except with a binary that he gave me.
>
> You're seeing it on i386, whereas he saw it on x86_64.
>
> The underlying problem is that somehow the UML initcalls aren't being
> run, which is why you're seeing all zeros in the register dump.
>
> If you bisect this, I bet you end up at the no-unit-at-a-time patch
> that he ended up at. And I have no idea what that has to do with
> anything.
>
You're right:

tfoerste@n22 ~/devel/linux-2.6 $ git bisect good
4f81c5350b44bcc501ab6f8a089b16d064b4d2f6 is first bad commit
commit 4f81c5350b44bcc501ab6f8a089b16d064b4d2f6
Author: Jeff Dike <[email protected]>
Date: Mon Jul 7 13:36:56 2008 -0400

[UML] fix gcc ICEs and unresolved externs

There are various constraints on the use of unit-at-a-time:
- i386 uses no-unit-at-a-time for pre-4.0 (not 4.3)
- x86_64 uses unit-at-a-time always

...
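
The two constraints quoted from the commit message can be read as a small decision rule. The sketch below only illustrates those two bullets; it is not the actual UML Makefile logic. The function name `unit_at_a_time_flag` and its arguments are hypothetical, and the part of the commit message not quoted here may impose further conditions.

```shell
#!/bin/sh
# Hypothetical helper: print the GCC flag implied by the two quoted rules.
#   $1 = target architecture (i386 or x86_64)
#   $2 = GCC major version (e.g. from: gcc -dumpversion | cut -d. -f1)
unit_at_a_time_flag() {
    arch=$1
    gcc_major=$2
    case "$arch" in
        x86_64)
            # x86_64 uses unit-at-a-time always
            echo "-funit-at-a-time"
            ;;
        i386)
            # i386 uses no-unit-at-a-time only for GCC older than 4.0
            if [ "$gcc_major" -lt 4 ]; then
                echo "-fno-unit-at-a-time"
            else
                # Assumption: leave the compiler default untouched
                echo ""
            fi
            ;;
        *)
            echo "unknown architecture: $arch" >&2
            return 1
            ;;
    esac
}

unit_at_a_time_flag i386 3
unit_at_a_time_flag x86_64 4
```

Passing the architecture and compiler version as arguments, rather than probing them inside the function, keeps the rule easy to check against each combination mentioned in the commit message.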


--
MfG/Sincerely

Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

