LinuxLists.cc - 2.6.{26.2,27-rc} oops on virtualbox

2008-08-20 19:31:26

Subject: 2.6.{26.2,27-rc} oops on virtualbox

Hi there,

Users of different Linux distros (including Ubuntu, Mandriva, ArchLinux and
possibly Fedora) are reporting that kernel 2.6.26.2 is OOPSing in the
Virtual Box emulator[1].

It is not clear if this is a kernel or Virtual Box bug, but as the kernel
is also OOPSing in QEMU (although with different behavior) I have decided
to post my debug results here in case someone is interested in debugging
the kernel part further.

I have done a bisection by hand among kernel versions and found that
the commit which triggers the oops in _Virtual Box_ was introduced in
2.6.26-rc1 and the problem also happens with latest Linus tree.

By using git bisect I found that the commit is this:

"""
commit e587cadd8f47e202a30712e2906a65a0606d5865
Author: Mathieu Desnoyers <[email protected]>
Date: Thu Mar 6 08:48:49 2008 -0500

x86: enhance DEBUG_RODATA support - alternatives

[...]

"""

By reverting this commit I don't get the OOPS anymore. I have
tested with 2.6.26-rc1 and latest Linus tree (2.6.27-rc3).
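
For readers unfamiliar with bisection, the search above boils down to a binary search over the history. The following is only a minimal sketch of that idea, not what git bisect does internally: the commit list, the `oopses` predicate and the helper names are hypothetical, and it assumes a linear history with a single good-to-bad transition.

```python
def first_bad(commits, is_bad):
    """Binary search for the first commit where is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # culprit is at mid or earlier
        else:
            good = mid     # culprit is after mid
    return commits[bad]


# Hypothetical linear history; "e587cadd" stands in for the culprit commit.
history = ["a1", "b2", "c3", "e587cadd", "f5", "g6", "h7"]
culprit_index = history.index("e587cadd")
tested = []  # records every commit that had to be built and booted


def oopses(commit):
    """Hypothetical test: pretend to boot the kernel and report an oops."""
    tested.append(commit)
    return history.index(commit) >= culprit_index


print(first_bad(history, oopses), "found after", len(tested), "test boots")
```

The sketch only gives a meaningful answer when the "bad" state, once introduced, persists in every later commit.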

What puzzles me, though, is that a similar problem happens with
QEMU, but there it also OOPSes with kernels before 2.6.26-rc1,
reverting the patch above makes no difference, and it works with
the current Linus tree.

Does this look like a kernel bug?

All my tests have been done with vanilla kernels, but I have
built .iso installation images with them and I'm not sure of
what the build script does.

[1] http://en.wikipedia.org/wiki/Virtual_box

Thanks for reading this.

--
Luiz Fernando N. Capitulino

2008-08-21 21:34:58

On Fri, 22 Aug 2008 08:50:12 +0200
Ingo Molnar <[email protected]> wrote:

|
| * H. Peter Anvin <[email protected]> wrote:
|
| > H. Peter Anvin wrote:
| >>>
| >>> Does this look like a kernel bug?
| >>>
| >>
| >> No, it looks like a very common virtualizer bug. Does the attached
| >> patch work for you?
| >>
| >
| > Also, in addition to this, please try tip:master. There is a patch in
| > tip:master which I hope should fix this problem, but the details are
| > important.
|
| access coordinates would be at:
|
| http://people.redhat.com/mingo/tip.git/README

As I already have Linus tree downloaded I have cloned it in
the usual way.

Got the same results: OOPS in virtualbox but it works on QEMU.

The OOPS's output follows and I have attached the .config I'm using
to reproduce the problem.

"""
BUG: unable to handle kernel NULL pointer dereference at 00000246
IP: [<c01310f1>] vprintk+0x181/0x440
*pde = 00000000
Oops: 0002 [#1] SMP
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test24-tip #3)
EIP: 0060:[<c01310f1>] EFLAGS: 00010246 CPU: 0
EIP is at vprintk+0x181/0x440
EAX: 00000246 EBX: 00000000 ECX: c0130ca9 EDX: 0000dedd
ESI: c0474ae3 EDI: c04cf6bc EBP: c7435f24 ESP: c7435eb0
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
Stack: 0000dedd c0130ca9 c7435f40 00000000 a026104f a026106c c7434000 c7435ee6
00000006 00000246 00000000 a0260cf3 0000001c c7434000 00000282 00000046
c11a85a0 c7435efc c0135c6f c7435f14 c0115fcb a0296e91 c0104c2c 00000000
Call Trace:
[<c0130ca9>] ? release_console_sem+0x199/0x1e0
[<c0135c6f>] ? irq_exit+0x3f/0x90
[<c0115fcb>] ? smp_apic_timer_interrupt+0x5b/0x90
[<c0104c2c>] ? apic_timer_interrupt+0x28/0x30
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0346ed9>] ? printk+0x18/0x1f
[<c0474b00>] ? net_ns_init+0x1d/0x1ad
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0101116>] ? do_one_initcall+0x26/0x170
[<c0128f66>] ? try_to_wake_up+0xc6/0x240
[<c012910f>] ? wake_up_process+0xf/0x20
[<c014192d>] ? start_workqueue_thread+0x1d/0x20
[<c0141d4b>] ? __create_workqueue_key+0x1eb/0x240
[<c0141820>] ? worker_thread+0x0/0xf0
[<c044b387>] ? kernel_init+0x141/0x214
[<c044b246>] ? kernel_init+0x0/0x214
[<c0104dc7>] ? kernel_thread_helper+0x7/0x10
=======================
Code: c0 0f 84 0b 01 00 00 b8 50 f1 41 c0 c7 05 ec f1 41 c0 ff ff ff ff e8 cf 8b 21 00 e8 ea 04 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 ec f1 41 c0 e8
EIP: [<c01310f1>] vprintk+0x181/0x440 SS:ESP 0069:c7435eb0
---[ end trace 4eaa2a86a8e2da22 ]---
"""

--
Luiz Fernando N. Capitulino

Attachments:

(No filename) (2.64 kB)
config (55.60 kB)

2008-08-22 15:35:06

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> On Fri, 22 Aug 2008 08:50:12 +0200
> Ingo Molnar <[email protected]> wrote:
>
> |
> | * H. Peter Anvin <[email protected]> wrote:
> |
> | > H. Peter Anvin wrote:
> | >>>
> | >>> Does this look like a kernel bug?
> | >>>
> | >>
> | >> No, it looks like a very common virtualizer bug. Does the attached
> | >> patch work for you?
> | >>
> | >
> | > Also, in addition to this, please try tip:master. There is a patch in
> | > tip:master which I hope should fix this problem, but the details are
> | > important.
> |
> | access coordinates would be at:
> |
> | http://people.redhat.com/mingo/tip.git/README
>
> As I already have Linus tree downloaded I have cloned it in
> the usual way.
>
> Got the same results: OOPS in virtualbox but it works on QEMU.
>
> The OOPS's output follows and I have attached the .config I'm using
> to reproduce the problem.
>

Can you try booting with the kernel argument:
debug_alternative

The dmesg of the kernel bootup up to the oops would be helpful.

My guess is that there may be something wrong with irq disabling which
protects text_poke_early in apply_alternatives().

Mathieu

> """
> BUG: unable to handle kernel NULL pointer dereference at 00000246
> IP: [<c01310f1>] vprintk+0x181/0x440
> *pde = 00000000
> Oops: 0002 [#1] SMP
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test24-tip #3)
> EIP: 0060:[<c01310f1>] EFLAGS: 00010246 CPU: 0
> EIP is at vprintk+0x181/0x440
> EAX: 00000246 EBX: 00000000 ECX: c0130ca9 EDX: 0000dedd
> ESI: c0474ae3 EDI: c04cf6bc EBP: c7435f24 ESP: c7435eb0
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
> Stack: 0000dedd c0130ca9 c7435f40 00000000 a026104f a026106c c7434000 c7435ee6
> 00000006 00000246 00000000 a0260cf3 0000001c c7434000 00000282 00000046
> c11a85a0 c7435efc c0135c6f c7435f14 c0115fcb a0296e91 c0104c2c 00000000
> Call Trace:
> [<c0130ca9>] ? release_console_sem+0x199/0x1e0
> [<c0135c6f>] ? irq_exit+0x3f/0x90
> [<c0115fcb>] ? smp_apic_timer_interrupt+0x5b/0x90
> [<c0104c2c>] ? apic_timer_interrupt+0x28/0x30
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0346ed9>] ? printk+0x18/0x1f
> [<c0474b00>] ? net_ns_init+0x1d/0x1ad
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0101116>] ? do_one_initcall+0x26/0x170
> [<c0128f66>] ? try_to_wake_up+0xc6/0x240
> [<c012910f>] ? wake_up_process+0xf/0x20
> [<c014192d>] ? start_workqueue_thread+0x1d/0x20
> [<c0141d4b>] ? __create_workqueue_key+0x1eb/0x240
> [<c0141820>] ? worker_thread+0x0/0xf0
> [<c044b387>] ? kernel_init+0x141/0x214
> [<c044b246>] ? kernel_init+0x0/0x214
> [<c0104dc7>] ? kernel_thread_helper+0x7/0x10
> =======================
> Code: c0 0f 84 0b 01 00 00 b8 50 f1 41 c0 c7 05 ec f1 41 c0 ff ff ff ff e8 cf 8b 21 00 e8 ea 04 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 ec f1 41 c0 e8
> EIP: [<c01310f1>] vprintk+0x181/0x440 SS:ESP 0069:c7435eb0
> ---[ end trace 4eaa2a86a8e2da22 ]---
> """
>
> --
> Luiz Fernando N. Capitulino

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-22 16:29:52

On Fri, 22 Aug 2008 12:35:20 -0400
Mathieu Desnoyers <[email protected]> wrote:

| * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > On Fri, 22 Aug 2008 11:34:52 -0400
| > Mathieu Desnoyers <[email protected]> wrote:
| >
| > | * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > | > On Fri, 22 Aug 2008 08:50:12 +0200
| > | > Ingo Molnar <[email protected]> wrote:
| > | >
| > | > |
| > | > | * H. Peter Anvin <[email protected]> wrote:
| > | > |
| > | > | > H. Peter Anvin wrote:
| > | > | >>>
| > | > | >>> Does this look like a kernel bug?
| > | > | >>>
| > | > | >>
| > | > | >> No, it looks like a very common virtualizer bug. Does the attached
| > | > | >> patch work for you?
| > | > | >>
| > | > | >
| > | > | > Also, in addition to this, please try tip:master. There is a patch in
| > | > | > tip:master which I hope should fix this problem, but the details are
| > | > | > important.
| > | > |
| > | > | access coordinates would be at:
| > | > |
| > | > | http://people.redhat.com/mingo/tip.git/README
| > | >
| > | > As I already have Linus tree downloaded I have cloned it in
| > | > the usual way.
| > | >
| > | > Got the same results: OOPS in virtualbox but it works on QEMU.
| > | >
| > | > The OOPS's output follows and I have attached the .config I'm using
| > | > to reproduce the problem.
| > | >
| > |
| > | Can you try booting with the kernel argument :
| > | debug_alternative
| > |
| > | The dmesg of the kernel bootup up to the oops would be helpful.
| > |
| > | My guess is that there may be something wrong with irq disabling which
| > | protects text_poke_early in apply_alternatives().
| >
| > I have attached two files:
| >
| > - normal.txt: normal boot with no debug options
| > - debug-alternative.txt: boot with the ignore_loglevel and
| > debug-alternative options
| >
| > I had to pass ignore_loglevel otherwise it wouldn't print
| > anything.
| >
|
| Ok, now can you try booting with either of those args :
|
| noreplace-paravirt
| noreplace-smp
|
| And see which one(s) works ?

noreplace-paravirt works, the other one causes no change.

You will find the full boot log (with debug-alternative enabled)
attached.

--
Luiz Fernando N. Capitulino

Attachments:

(No filename) (2.23 kB)
working.txt (8.12 kB)

2008-08-22 17:45:42

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* H. Peter Anvin ([email protected]) wrote:
> Was looking at the code stream, and noticed this:
>
> Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff e8
> 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45
> bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8
>
> The EIP is in the *MIDDLE* of a NOPL instruction:
>
> C012FC46 C00F84 ror byte [edi],0x84
> C012FC49 0B01 or eax,[ecx]
> C012FC4B 0000 add [eax],al
> C012FC4D B8D0BF41C0 mov eax,0xc041bfd0
> C012FC52 C7056CC041C0FFFF mov dword [dword 0xc041c06c],0xffffffff
> -FFFF
> C012FC5C E87F822100 call dword 0xc0347ee0
> C012FC61 E81A030200 call dword 0xc014ff80
> C012FC66 8B45B0 mov eax,[ebp-0x50]
> C012FC69 50 push eax
> C012FC6A 9D popfd
> C012FC6B 0F1F840000000000 nop dword [eax+eax+0x0]
> C012FC73 8B45BC mov eax,[ebp-0x44]
> C012FC76 83C460 add esp,byte +0x60
> C012FC79 5B pop ebx
> C012FC7A 5E pop esi
> C012FC7B 5F pop edi
> C012FC7C 5D pop ebp
> C012FC7D C3 ret
> C012FC7E 6690 xchg ax,ax
> C012FC80 A16CC041C0 mov eax,[0xc041c06c]
>
> There are two possibilities: VirtualBox mis-executes (not merely traps,
> which is what tip:master looks for) the NOPL instruction, or something is
> jumping into the middle of the sequence that is then replaced by the NOPL.
>
> So, Luiz: the DEBUG_INFO version of vmlinux would be helpful. It would
> also help to know the exact version of VirtualBox you're running, what
> source you got it from, and what your host system looks like.
>
> -hpa

The patch which turns on this bug makes this important change to
apply_paravirt(): it disables interrupts _near_ the code patching,
_within_ the loop. Before, interrupts were disabled outside of the loop.
It needs to disable interrupts within the loop to be able to use vmap in
text_poke().

So I bet VirtualBox has a race in the way it handles interrupt
disabling.
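
To make the difference between the two shapes concrete, here is a toy model in Python. It is only a sketch under loud assumptions: `irqs_off`, `patch`, `patch_all_outer` and `patch_all_inner` are hypothetical stand-ins and not the kernel's code. All it counts is how many times interrupts become enabled again during the patching pass, that is, how many chances a pending interrupt gets to run between patched sites.

```python
from contextlib import contextmanager

events = []          # trace of what the toy "CPU" did, in order
irqs_enabled = True  # toy interrupt flag


@contextmanager
def irqs_off():
    """Toy stand-in for saving, disabling and restoring the interrupt flag."""
    global irqs_enabled
    saved, irqs_enabled = irqs_enabled, False
    try:
        yield
    finally:
        irqs_enabled = saved
        if irqs_enabled:
            events.append("irq-window")  # a pending interrupt may run here


def patch(site):
    """Hypothetical code-patching step; must run with interrupts off."""
    assert not irqs_enabled, "patching with interrupts on"
    events.append(f"patch:{site}")


def patch_all_outer(sites):
    """Old shape: one irq-off section around the whole loop."""
    with irqs_off():
        for s in sites:
            patch(s)


def patch_all_inner(sites):
    """New shape: one irq-off section per site, inside the loop."""
    for s in sites:
        with irqs_off():
            patch(s)


def windows(fn, sites):
    """Count how often interrupts were re-enabled while fn patched sites."""
    events.clear()
    fn(sites)
    return events.count("irq-window")


sites = ["site0", "site1", "site2"]
print(windows(patch_all_outer, sites), windows(patch_all_inner, sites))
```

With three sites the outer shape re-enables interrupts once, after everything is patched, while the inner shape re-enables them after every site. The model says nothing about which side, kernel or VirtualBox, mishandles those windows.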

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-22 17:58:13

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Mathieu Desnoyers wrote:
>
> The patch which turns on this bug makes this important change to
> apply_paravirt(): it disables interrupts _near_ the code patching,
> _within_ the loop. Before, interrupts were disabled outside of the loop.
> It needs to disable interrupts within the loop to be able to use vmap in
> text_poke().
>
> So I bet VirtualBox has a race in the way it handles interrupt
> disabling.
>

That seems a bit far-fetched. The fault is in an initcall, and there
are no interrupts involved. Perhaps VirtualBox doesn't manage its
tcache correctly, but I don't see this as being interrupt-related.

-hpa

I have been unable to replicate this on my own hardware mostly because
my testing machine decided to blow its DVD drive in some very strange
way, but I did pick apart the data from Luiz, and found it very interesting:

The code sequence before patching looks like:

c012fc69: 51 push %ecx
c012fc6a: 52 push %edx
c012fc6b: ff 15 40 b9 41 c0 call *0xc041b940
c012fc71: 5a pop %edx
c012fc72: 59 pop %ecx

After patching:

50 9d 0f 1f 84 00 00 00 <00> 00

... which disassembles to (in Intel notation):

C012FC69 50 push eax
C012FC6A 9D popfd
C012FC6B 0F1F840000000000 nop dword [eax+eax+0x0]

We do, indeed, have a return point that falls in the *middle* of a
patched instruction, and if the patching happens in the middle of the
instruction call, then, well, bad things happen.

Furthermore, why on Earth is %ecx/%edx pushed and popped in-line here?
Surely it should be the responsibility of the PV call to present a
no-clobber interface (using an assembly wrapper if necessary[*]), rather
than bloating every callsite like this?

-hpa

[*] One can compile gcc code with -fcall-saved-* to use nonstandard
register conventions. Unfortunately stock gcc only lets you do this
with a file parameter, and doesn't support doing this with attributes.
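
The address arithmetic behind the "middle of a patched instruction" observation can be checked with a few lines of Python. This is a small sketch that only uses the bytes and addresses quoted above; the helper `parse_code` and the constant names are made up for illustration.

```python
# Bytes of the patched site as printed in the oops "Code:" line; the byte
# wrapped in <> is the one EIP pointed at when the fault happened.
CODE_TAIL = "50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc"

# The 8-byte NOPL shown in the disassembly of the patched code.
NOPL = bytes.fromhex("0f1f840000000000")

SITE_START = 0xC012FC69  # address of the first byte in CODE_TAIL
CALL_ADDR = 0xC012FC6B   # address of the original indirect call
CALL_LEN = 6             # ff 15 plus a 4-byte absolute address


def parse_code(text):
    """Return (raw bytes, index of the byte marked with <>)."""
    raw = bytearray()
    marked = None
    for i, tok in enumerate(text.split()):
        if tok.startswith("<") and tok.endswith(">"):
            marked = i
            tok = tok[1:-1]
        raw.append(int(tok, 16))
    return bytes(raw), marked


code, eip_index = parse_code(CODE_TAIL)
nopl_index = code.find(NOPL)         # where the NOPL starts in the dump
eip = SITE_START + eip_index         # faulting address
nopl_addr = SITE_START + nopl_index  # first byte of the NOPL
return_addr = CALL_ADDR + CALL_LEN   # where the original call returns

print(hex(eip), hex(nopl_addr), hex(return_addr), eip - nopl_addr)
# prints: 0xc012fc71 0xc012fc6b 0xc012fc71 6
```

The faulting EIP equals the return address of the original call (the `pop %edx` at c012fc71), and it lies 6 bytes into the 8-byte NOPL, which matches the disassembly above.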

2008-08-26 18:02:42

by Luiz Fernando N. Capitulino

On Sun, Aug 31, 2008 at 03:28:22PM +0200, Stefan Lippers-Hollmann wrote:
> Hi
>
> On Sunday, 31 August 2008, Gerhard Brauer wrote:
> [...]
> > Ok, some news from archlinux side:
> > Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
> > this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
> > My virtual machines always boot fine with this, and I have one
> > confirmation from a user about this.
>
> Sorry, I can't confirm this here on Debian unstable (with virtualbox-ose
> 1.6.2 or 1.6.4), are you sure that other configuration options didn't
> change between the different kernel versions? Preemption and paravirt can
> influence the probability of the early boot panic seriously, without really
> avoiding it altogether.

The only changes between our 2.6.26.2-1 and 2.6.26.3-1 are some minor
framebuffer changes in the config. Looking at the patchsets of the two
versions, I don't see anything that could explain the difference between
working and not working.

> Actually I still get the same issues with implanting
> ftp://ftp5.gwdg.de/pub/linux/archlinux/core/os/i686/kernel26-2.6.26.3-1-i686.pkg.tar.gz
> into the test vm using virtualbox-ose 1.6.4.

Hmm, one user also reports that he has no problem when using a vanilla
2.6.26 as guest kernel. But there must be some reason why different
distributions notice a major problem between 2.6.25 and 2.6.26 with
their stock kernels. Although I don't even know if our few reports here
are very representative...

> Regards
> Stefan Lippers-Hollmann

Gerhard

--
Today is the tomorrow you were afraid of yesterday...

2008-08-31 14:09:57

by Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Sun, 31 Aug 2008 11:29:23 +0200
Gerhard Brauer <[email protected]> wrote:

| On Thu, Aug 28, 2008 at 10:30:13AM -0300, Luiz Fernando N. Capitulino wrote:
| > On Wed, 27 Aug 2008 19:33:28 -0400
| > Mathieu Desnoyers <[email protected]> wrote:
| > |
| > | Since this problem appears while we are using a simple memcpy (the
| > | text_poke_early version), but disappears when we disable interrupts for
| > | a longer period of time, I suspect a problem with irq disabling in
| > | Virtualbox.
| > |
| > | We could try to add some nsleep() or msleep() calls within text_poke and
| > | text_poke_early before and after the code modification to see if the
| > | problem disappears. If it does, then that would somewhat confirm the
| > | racy irq disable thesis.
| >
| > Well, an Ubuntu kernel guy has reported in the virtualbox's ticket[1]
| > that the oops doesn't happen if he puts a printk() in the crash site.
| >
| > The funny thing is that someone (who might be a virtualbox developer)
| > used the same race argument to say that this is a bug in the kernel.
| >
| > What concerns me, though, is how virtualbox can be worth using
| > in the Linux community if it's probably not working for various distros
| > (currently Fedora, Ubuntu, Mandriva and ArchLinux).
| >
| > Thanks for the effort, guys.
| >
| > [1] http://www.virtualbox.org/ticket/1875
|
| Ok, some news from archlinux side:
| Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
| this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
| My virtual machines always boot fine with this, and I have one
| confirmation from a user about this.
|
| The kernel upgrade does not solve the kernel panic that occurs while
| working with the VM under heavy disk IO. I could reproduce this by
| untarring 2 big files in separate dirs: bsdtar -x -f VirtualBox-1.6.2-OSE.tar.bz2.
| Doing this simultaneously always crashed the VM.
| Screenshot:
| http://users.archlinux.de/~gerbra/tmp/2008-08-31-110449_724x456_scrot.png
|
| This heavy IO oops does not occur under 2.6.26.2 when using the
| "3-changes-patch" against alternatives.c, which we have tested in the
| other mails. There must be something irq-related which this
| 3-changes-patch fixes and which was not fixed in 2.6.26.3.
| On the other hand, I have never stressed a VM like this before
| researching this problem. So it could also be that the heavy-IO
| problem is a totally separate problem from the one we're talking about here.
| Doing my "normal" work in the VM (it's my devel VM for compiling and
| testing), I have not had this IO oops so far.

Mandriva kernel was 2.6.26.3 based at the time I started testing
this and all my last tests have been done on 2.6.27-rc4. I think it's
very unusual to have a change in a -stable kernel not present in the
latest -rc.

Also note that the CPU settings in the VM have a big influence on the
problem, so I'm pretty sure 2.6.26.3 doesn't fix the problem.

--
Luiz Fernando N. Capitulino