Subject: 2.6.{26.2,27-rc} oops on virtualbox


Hi there,

Users of different Linux distros (including Ubuntu, Mandriva, ArchLinux and
possibly Fedora) are reporting that kernel 2.6.26.2 is OOPSing in the
Virtual Box emulator[1].

It is not clear if this is a kernel or Virtual Box bug, but as the kernel
is also OOPSing in QEMU (although with different behavior) I have decided
to post my debug results here in case someone is interested in debugging
the kernel part further.

I have done a bisection by hand among kernel versions and found that
the commit which triggers the oops in _Virtual Box_ was introduced in
2.6.26-rc1 and the problem also happens with latest Linus tree.

By using git bisect I found that the commit is this:

"""
commit e587cadd8f47e202a30712e2906a65a0606d5865
Author: Mathieu Desnoyers <[email protected]>
Date: Thu Mar 6 08:48:49 2008 -0500

x86: enhance DEBUG_RODATA support - alternatives

[...]

"""

By reverting this commit I don't get the OOPS anymore. I have
tested with 2.6.26-rc1 and latest Linus tree (2.6.27-rc3).

What puzzles me, though, is that a similar problem happens with
QEMU, but there it also OOPSes with kernels before 2.6.26-rc1,
reverting the patch above makes no difference, and it works with
the current Linus tree.

Does this look like a kernel bug?

All my tests have been done with vanilla kernels, but I have
built .iso installation images with them and I'm not sure of
what the build script does.

[1] http://en.wikipedia.org/wiki/Virtual_box

Thanks for reading this.

--
Luiz Fernando N. Capitulino


2008-08-21 21:34:58

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2763cb3..33193fe 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -157,8 +157,6 @@ static const struct nop {
 } noptypes[] = {
 	{ X86_FEATURE_K8, k8_nops },
 	{ X86_FEATURE_K7, k7_nops },
-	{ X86_FEATURE_P4, p6_nops },
-	{ X86_FEATURE_P3, p6_nops },
 	{ -1, NULL }
 };
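
To see what removing these two entries changes, here is a minimal user-space sketch of a first-match lookup over a table shaped like this one. It is not the kernel's code: the feature IDs, the `nop_kind` enum and `pick_nops()` are hypothetical names made up for illustration, a CPU is modelled as having a single feature, and the fallback to a generic table when nothing matches is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature IDs; the real X86_FEATURE_* values differ. */
enum { FEAT_K8 = 1, FEAT_K7, FEAT_P4, FEAT_P3 };

/* Hypothetical labels for the NOP tables named in the patch. */
enum nop_kind { GENERIC_NOPS, K8_NOPS, K7_NOPS, P6_NOPS };

struct nop_entry {
	int cpuid;            /* feature selecting this table; -1 ends the list */
	enum nop_kind table;
};

/* Table as it looked before the patch. */
static const struct nop_entry before_patch[] = {
	{ FEAT_K8, K8_NOPS },
	{ FEAT_K7, K7_NOPS },
	{ FEAT_P4, P6_NOPS },
	{ FEAT_P3, P6_NOPS },
	{ -1, GENERIC_NOPS }
};

/* Table with the P4/P3 entries removed, as in the patch. */
static const struct nop_entry after_patch[] = {
	{ FEAT_K8, K8_NOPS },
	{ FEAT_K7, K7_NOPS },
	{ -1, GENERIC_NOPS }
};

/* First matching entry wins; otherwise use the generic table (assumption). */
static enum nop_kind pick_nops(const struct nop_entry *types, int cpu_feature)
{
	size_t i;

	assert(types != NULL);
	for (i = 0; types[i].cpuid >= 0; i++)
		if (types[i].cpuid == cpu_feature)
			return types[i].table;
	return GENERIC_NOPS;
}
```

In this model, `pick_nops(before_patch, FEAT_P4)` selects `P6_NOPS`, while `pick_nops(after_patch, FEAT_P4)` falls through to `GENERIC_NOPS`, so a P4-class guest would no longer be padded with the p6 NOP encodings.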


Attachments:
diff (386.00 B)

2008-08-22 06:42:44

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

H. Peter Anvin wrote:
>>
>> Does this look like a kernel bug?
>>
>
> No, it looks like a very common virtualizer bug. Does the attached
> patch work for you?
>

Also, in addition to this, please try tip:master. There is a patch in
tip:master which I hope should fix this problem, but the details are
important.

-hpa

2008-08-22 06:50:42

by Ingo Molnar

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox


* H. Peter Anvin <[email protected]> wrote:

> H. Peter Anvin wrote:
>>>
>>> Does this look like a kernel bug?
>>>
>>
>> No, it looks like a very common virtualizer bug. Does the attached
>> patch work for you?
>>
>
> Also, in addition to this, please try tip:master. There is a patch in
> tip:master which I hope should fix this problem, but the details are
> important.

access coordinates would be at:

http://people.redhat.com/mingo/tip.git/README

Ingo

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Em Thu, 21 Aug 2008 14:34:07 -0700
"H. Peter Anvin" <[email protected]> escreveu:

| >
| > Does this look like a kernel bug?
| >
|
| No, it looks like a very common virtualizer bug. Does the attached
| patch work for you?

Unfortunately it does not.

--
Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Em Fri, 22 Aug 2008 08:50:12 +0200
Ingo Molnar <[email protected]> escreveu:

|
| * H. Peter Anvin <[email protected]> wrote:
|
| > H. Peter Anvin wrote:
| >>>
| >>> Does this look like a kernel bug?
| >>>
| >>
| >> No, it looks like a very common virtualizer bug. Does the attached
| >> patch work for you?
| >>
| >
| > Also, in addition to this, please try tip:master. There is a patch in
| > tip:master which I hope should fix this problem, but the details are
| > important.
|
| access coordinates would be at:
|
| http://people.redhat.com/mingo/tip.git/README

As I already have Linus tree downloaded I have cloned it in
the usual way.

Got the same results: OOPS in virtualbox but it works on QEMU.

The OOPS's output follows and I have attached the .config I'm using
to reproduce the problem.

"""
BUG: unable to handle kernel NULL pointer dereference at 00000246
IP: [<c01310f1>] vprintk+0x181/0x440
*pde = 00000000
Oops: 0002 [#1] SMP
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test24-tip #3)
EIP: 0060:[<c01310f1>] EFLAGS: 00010246 CPU: 0
EIP is at vprintk+0x181/0x440
EAX: 00000246 EBX: 00000000 ECX: c0130ca9 EDX: 0000dedd
ESI: c0474ae3 EDI: c04cf6bc EBP: c7435f24 ESP: c7435eb0
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
Stack: 0000dedd c0130ca9 c7435f40 00000000 a026104f a026106c c7434000 c7435ee6
00000006 00000246 00000000 a0260cf3 0000001c c7434000 00000282 00000046
c11a85a0 c7435efc c0135c6f c7435f14 c0115fcb a0296e91 c0104c2c 00000000
Call Trace:
[<c0130ca9>] ? release_console_sem+0x199/0x1e0
[<c0135c6f>] ? irq_exit+0x3f/0x90
[<c0115fcb>] ? smp_apic_timer_interrupt+0x5b/0x90
[<c0104c2c>] ? apic_timer_interrupt+0x28/0x30
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0346ed9>] ? printk+0x18/0x1f
[<c0474b00>] ? net_ns_init+0x1d/0x1ad
[<c0474ae3>] ? net_ns_init+0x0/0x1ad
[<c0101116>] ? do_one_initcall+0x26/0x170
[<c0128f66>] ? try_to_wake_up+0xc6/0x240
[<c012910f>] ? wake_up_process+0xf/0x20
[<c014192d>] ? start_workqueue_thread+0x1d/0x20
[<c0141d4b>] ? __create_workqueue_key+0x1eb/0x240
[<c0141820>] ? worker_thread+0x0/0xf0
[<c044b387>] ? kernel_init+0x141/0x214
[<c044b246>] ? kernel_init+0x0/0x214
[<c0104dc7>] ? kernel_thread_helper+0x7/0x10
=======================
Code: c0 0f 84 0b 01 00 00 b8 50 f1 41 c0 c7 05 ec f1 41 c0 ff ff ff ff e8 cf 8b 21 00 e8 ea 04 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 ec f1 41 c0 e8
EIP: [<c01310f1>] vprintk+0x181/0x440 SS:ESP 0069:c7435eb0
---[ end trace 4eaa2a86a8e2da22 ]---
"""

--
Luiz Fernando N. Capitulino
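
For reference, the "Oops: 0002" line in the report above is an x86 page-fault error code. A minimal sketch of decoding its low three bits follows; the macro, struct and function names are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

struct fault_info {
	int not_present;
	int write;
	int kernel_mode;
};

/* Split an error code into the three properties above. */
static struct fault_info decode_fault(unsigned err)
{
	struct fault_info f;

	f.not_present = !(err & PF_PROT);
	f.write = !!(err & PF_WRITE);
	f.kernel_mode = !(err & PF_USER);
	return f;
}

/* Sanity check against the value reported in the oops. */
static void check_reported_oops(void)
{
	struct fault_info f = decode_fault(0x0002);

	assert(f.not_present && f.write && f.kernel_mode);
}
```

So 0002 reads as a kernel-mode write to a page that is not present, which fits the "unable to handle kernel NULL pointer dereference at 00000246" line.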


Attachments:
(No filename) (2.64 kB)
config (55.60 kB)

2008-08-22 15:35:06

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> Em Fri, 22 Aug 2008 08:50:12 +0200
> Ingo Molnar <[email protected]> escreveu:
>
> |
> | * H. Peter Anvin <[email protected]> wrote:
> |
> | > H. Peter Anvin wrote:
> | >>>
> | >>> Does this look like a kernel bug?
> | >>>
> | >>
> | >> No, it looks like a very common virtualizer bug. Does the attached
> | >> patch work for you?
> | >>
> | >
> | > Also, in addition to this, please try tip:master. There is a patch in
> | > tip:master which I hope should fix this problem, but the details are
> | > important.
> |
> | access coordinates would be at:
> |
> | http://people.redhat.com/mingo/tip.git/README
>
> As I already have Linus tree downloaded I have cloned it in
> the usual way.
>
> Got the same results: OOPS in virtualbox but it works on QEMU.
>
> The OOPS's output follows and I have attached the .config I'm using
> to reproduce the problem.
>

Can you try booting with the kernel argument:
debug-alternative

The dmesg of the kernel bootup up to the oops would be helpful.

My guess is that there may be something wrong with irq disabling which
protects text_poke_early in apply_alternatives().

Mathieu


> """
> BUG: unable to handle kernel NULL pointer dereference at 00000246
> IP: [<c01310f1>] vprintk+0x181/0x440
> *pde = 00000000
> Oops: 0002 [#1] SMP
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test24-tip #3)
> EIP: 0060:[<c01310f1>] EFLAGS: 00010246 CPU: 0
> EIP is at vprintk+0x181/0x440
> EAX: 00000246 EBX: 00000000 ECX: c0130ca9 EDX: 0000dedd
> ESI: c0474ae3 EDI: c04cf6bc EBP: c7435f24 ESP: c7435eb0
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
> Stack: 0000dedd c0130ca9 c7435f40 00000000 a026104f a026106c c7434000 c7435ee6
> 00000006 00000246 00000000 a0260cf3 0000001c c7434000 00000282 00000046
> c11a85a0 c7435efc c0135c6f c7435f14 c0115fcb a0296e91 c0104c2c 00000000
> Call Trace:
> [<c0130ca9>] ? release_console_sem+0x199/0x1e0
> [<c0135c6f>] ? irq_exit+0x3f/0x90
> [<c0115fcb>] ? smp_apic_timer_interrupt+0x5b/0x90
> [<c0104c2c>] ? apic_timer_interrupt+0x28/0x30
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0346ed9>] ? printk+0x18/0x1f
> [<c0474b00>] ? net_ns_init+0x1d/0x1ad
> [<c0474ae3>] ? net_ns_init+0x0/0x1ad
> [<c0101116>] ? do_one_initcall+0x26/0x170
> [<c0128f66>] ? try_to_wake_up+0xc6/0x240
> [<c012910f>] ? wake_up_process+0xf/0x20
> [<c014192d>] ? start_workqueue_thread+0x1d/0x20
> [<c0141d4b>] ? __create_workqueue_key+0x1eb/0x240
> [<c0141820>] ? worker_thread+0x0/0xf0
> [<c044b387>] ? kernel_init+0x141/0x214
> [<c044b246>] ? kernel_init+0x0/0x214
> [<c0104dc7>] ? kernel_thread_helper+0x7/0x10
> =======================
> Code: c0 0f 84 0b 01 00 00 b8 50 f1 41 c0 c7 05 ec f1 41 c0 ff ff ff ff e8 cf 8b 21 00 e8 ea 04 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 ec f1 41 c0 e8
> EIP: [<c01310f1>] vprintk+0x181/0x440 SS:ESP 0069:c7435eb0
> ---[ end trace 4eaa2a86a8e2da22 ]---
> """
>
> --
> Luiz Fernando N. Capitulino



--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Em Fri, 22 Aug 2008 11:34:52 -0400
Mathieu Desnoyers <[email protected]> escreveu:

| * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > Em Fri, 22 Aug 2008 08:50:12 +0200
| > Ingo Molnar <[email protected]> escreveu:
| >
| > |
| > | * H. Peter Anvin <[email protected]> wrote:
| > |
| > | > H. Peter Anvin wrote:
| > | >>>
| > | >>> Does this look like a kernel bug?
| > | >>>
| > | >>
| > | >> No, it looks like a very common virtualizer bug. Does the attached
| > | >> patch work for you?
| > | >>
| > | >
| > | > Also, in addition to this, please try tip:master. There is a patch in
| > | > tip:master which I hope should fix this problem, but the details are
| > | > important.
| > |
| > | access coordinates would be at:
| > |
| > | http://people.redhat.com/mingo/tip.git/README
| >
| > As I already have Linus tree downloaded I have cloned it in
| > the usual way.
| >
| > Got the same results: OOPS in virtualbox but it works on QEMU.
| >
| > The OOPS's output follows and I have attached the .config I'm using
| > to reproduce the problem.
| >
|
| Can you try booting with the kernel argument :
| debug_alternative
|
| The dmesg of the kernel bootup up to the oops would be helpful.
|
| My guess is that there may be something wrong with irq disabling which
| protects text_poke_early in apply_alternatives().

I have attached two files:

- normal.txt: normal boot with no debug options
- debug-alternative.txt: boot with the ignore_loglevel and
debug-alternative options

I had to pass ignore_loglevel, otherwise it wouldn't print
anything.

--
Luiz Fernando N. Capitulino


Attachments:
(No filename) (1.59 kB)
normal.txt (7.10 kB)
debug-alternative.txt (6.69 kB)

2008-08-22 16:35:33

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> Em Fri, 22 Aug 2008 11:34:52 -0400
> Mathieu Desnoyers <[email protected]> escreveu:
>
> | * Luiz Fernando N. Capitulino ([email protected]) wrote:
> | > Em Fri, 22 Aug 2008 08:50:12 +0200
> | > Ingo Molnar <[email protected]> escreveu:
> | >
> | > |
> | > | * H. Peter Anvin <[email protected]> wrote:
> | > |
> | > | > H. Peter Anvin wrote:
> | > | >>>
> | > | >>> Does this look like a kernel bug?
> | > | >>>
> | > | >>
> | > | >> No, it looks like a very common virtualizer bug. Does the attached
> | > | >> patch work for you?
> | > | >>
> | > | >
> | > | > Also, in addition to this, please try tip:master. There is a patch in
> | > | > tip:master which I hope should fix this problem, but the details are
> | > | > important.
> | > |
> | > | access coordinates would be at:
> | > |
> | > | http://people.redhat.com/mingo/tip.git/README
> | >
> | > As I already have Linus tree downloaded I have cloned it in
> | > the usual way.
> | >
> | > Got the same results: OOPS in virtualbox but it works on QEMU.
> | >
> | > The OOPS's output follows and I have attached the .config I'm using
> | > to reproduce the problem.
> | >
> |
> | Can you try booting with the kernel argument :
> | debug_alternative
> |
> | The dmesg of the kernel bootup up to the oops would be helpful.
> |
> | My guess is that there may be something wrong with irq disabling which
> | protects text_poke_early in apply_alternatives().
>
> I have attached two files:
>
> - normal.txt: normal boot with no debug options
> - debug-alternative.txt ignore_loglevel and debug-alternative boot
> options
>
> I had to pass ignore_loglevel otherwise it wouldn't print
> anything.
>

Ok, now can you try booting with each of these args:

noreplace-paravirt
noreplace-smp

and see which one(s) work?

Thanks,

Mathieu

> --
> Luiz Fernando N. Capitulino

> Linux version 2.6.27-rc4-test25 ([email protected]) (gcc version 4.3.1 20080626 (prerelease) (GCC) ) #89 SMP Fri Aug 22 12:47:34 BRT 2008
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 0000000007ff0000 (usable)
> BIOS-e820: 0000000007ff0000 - 0000000008000000 (ACPI data)
> BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
> last_pfn = 0x7ff0 max_arch_pfn = 0x100000
> RAMDISK: 07b9b000 - 07fbf89d
> DMI 2.5 present.
> ACPI: RSDP 000E0000, 0024 (r2 VBOX )
> ACPI: XSDT 07FF0030, 002C (r1 VBOX VBOXXSDT 1 ASL 61)
> ACPI: FACP 07FF0060, 00F4 (r4 VBOX VBOXFACP 1 ASL 61)
> ACPI: DSDT 07FF01A0, 1064 (r1 VBOX VBOXBIOS 2 INTL 20080213)
> ACPI: FACS 07FF0160, 0040
> 0MB HIGHMEM available.
> 127MB LOWMEM available.
> mapped low ram: 0 - 07ff0000
> low ram: 00000000 - 07ff0000
> bootmap 00002000 - 00003000
> (9 early reservations) ==> bootmem [0000000000 - 0007ff0000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000]
> #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000]
> #3 [0000100000 - 0000814b10] TEXT DATA BSS ==> [0000100000 - 0000814b10]
> #4 [0007b9b000 - 0007fbf89d] RAMDISK ==> [0007b9b000 - 0007fbf89d]
> #5 [0000815000 - 0000819000] INIT_PG_TABLE ==> [0000815000 - 0000819000]
> #6 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000]
> #7 [0000007000 - 0000009000] PGTABLE ==> [0000007000 - 0000009000]
> #8 [0000002000 - 0000003000] BOOTMAP ==> [0000002000 - 0000003000]
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> Normal 0x00001000 -> 0x00007ff0
> HighMem 0x00007ff0 -> 0x00007ff0
> Movable zone start PFN for each node
> early_node_map[2] active PFN ranges
> 0: 0x00000000 -> 0x0000009f
> 0: 0x00000100 -> 0x00007ff0
> ACPI: PM-Timer IO Port: 0x4008
> SMP: Allowing 1 CPUs, 0 hotplug CPUs
> Found and enabled local APIC!
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
> PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
> Allocating PCI resources starting at 10000000 (gap: 8000000:f7fc0000)
> PERCPU: Allocating 40224 bytes of per cpu data
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32239
> Kernel command line: initrd=alt0/all.rdz vga=788 splash=silent BOOT_IMAGE=alt0/vmlinuz vga=0 console=ttyS0,9600 console=tty0
> Enabling fast FPU save and restore... done.
> Enabling unmasked SIMD FPU exception support... done.
> Initializing CPU#0
> PID hash table entries: 512 (order: 9, 2048 bytes)
> TSC calibrated against PM_TIMER
> Detected 2410.453 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> ... MAX_LOCKDEP_SUBCLASSES: 8
> ... MAX_LOCK_DEPTH: 48
> ... MAX_LOCKDEP_KEYS: 8191
> ... CLASSHASH_SIZE: 4096
> ... MAX_LOCKDEP_ENTRIES: 8192
> ... MAX_LOCKDEP_CHAINS: 16384
> ... CHAINHASH_SIZE: 8192
> memory used by lock dependency info: 2335 kB
> per task-struct memory footprint: 1152 bytes
> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> Memory: 117088k/131008k available (2340k kernel code, 13364k reserved, 1027k data, 308k init, 0k highmem)
> virtual kernel memory layout:
> fixmap : 0xffe18000 - 0xfffff000 (1948 kB)
> pkmap : 0xff800000 - 0xffc00000 (4096 kB)
> vmalloc : 0xc8800000 - 0xff7fe000 ( 879 MB)
> lowmem : 0xc0000000 - 0xc7ff0000 ( 127 MB)
> .init : 0xc0451000 - 0xc049e000 ( 308 kB)
> .data : 0xc03493b8 - 0xc044a040 (1027 kB)
> .text : 0xc0100000 - 0xc03493b8 (2340 kB)
> Checking if this processor honours the WP bit even in supervisor mode...Ok.
> SLUB: Genslabs=12, HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Calibrating delay loop (skipped), value calculated using timer frequency.. 4820.90 BogoMIPS (lpj=2410453)
> Security Framework initialized
> Mount-cache hash table entries: 512
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Checking 'hlt' instruction... OK.
> SMP alternatives: switching to UP code
> Freeing SMP alternatives: 10k freed
> ACPI: Core revision 20080609
> ACPI: setting ELCR to 0200 (from 0c00)
> weird, boot CPU (#0) not listedby the BIOS.
> SMP motherboard not detected.
> SMP disabled
> Brought up 1 CPUs
> Total of 1 processors activated (4820.90 BogoMIPS).
> khelper used greatest stack depth: 7108 bytes left
> net_namespace: 384 bytes
> Booting paravirtualized kernel on bare hardware
> NET: Registered protocol family 16
> ACPI: bus type pci registered
> PCI: PCI BIOS revision 2.10 entry at 0xfadb0, last bus=0
> PCI: Using configuration type 1 for base access
> ACPI: Interpreter enabled
> ACPI: (supports S0 S5)
> ACPI: Using PIC for interrupt routing
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> ACPI: PCI Interrupt Link [LNKA] (IRQs 5 9 10 11) *0, disabled.
> ACPI: PCI Interrupt Link [LNKB] (IRQs 5 9 10 11) *0, disabled.
> ACPI: PCI Interrupt Link [LNKC] (IRQs 5 9 10 *11)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 5 9 *10 11)
> Linux Plug and Play Support v0.97 (c) Adam Belay
> pnp: PnP ACPI init
> ACPI: bus type pnp registered
> BUG: unable to handle kernel NULL pointer dereference at 00000246
> IP: [<c012fc71>] vprintk+0x181/0x440
> *pde = 00000000
> Oops: 0002 [#1] SMP
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test25 #89)
> EIP: 0060:[<c012fc71>] EFLAGS: 00010246 CPU: 0
> EIP is at vprintk+0x181/0x440
> EAX: 00000246 EBX: 00000000 ECX: c012f8a9 EDX: 00003b3a
> ESI: 00000000 EDI: c04d76c1 EBP: c7435f20 ESP: c7435eac
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
> Stack: 00003b3a c012f8a9 c7435f3c c02948b1 c7435f18 c02957f6 00000074 c7435ee2
> 00000006 00000246 00000000 00000000 00000021 00000000 00000001 00000000
> a027c4ab a027c4c8 00000001 00000297 00000246 00000000 00000001 00000000
> Call Trace:
> [<c012f8a9>] ? release_console_sem+0x1c9/0x1e0
> [<c02948b1>] ? put_device+0x11/0x20
> [<c02957f6>] ? device_add+0x26/0x610
> [<c0471c5c>] ? pnpacpi_init+0x0/0x89
> [<c03450f4>] ? printk+0x18/0x1c
> [<c0266f87>] ? register_acpi_bus_type+0x58/0x69
> [<c0471ca5>] ? pnpacpi_init+0x49/0x89
> [<c0101116>] ? do_one_initcall+0x26/0x170
> [<c01e1d14>] ? create_proc_entry+0x54/0xa0
> [<c016ef86>] ? register_irq_proc+0xb6/0xd0
> [<c016efea>] ? init_irq_proc+0x4a/0x60
> [<c045132d>] ? kernel_init+0x10f/0x166
> [<c045121e>] ? kernel_init+0x0/0x166
> [<c0104b67>] ? kernel_thread_helper+0x7/0x10
> =======================
> Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff e8 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8
> EIP: [<c012fc71>] vprintk+0x181/0x440 SS:ESP 0069:c7435eac
> ---[ end trace 4eaa2a86a8e2da22 ]---
> Kernel panic - not syncing: Attempted to kill init!

> Linux version 2.6.27-rc4-test25 ([email protected]) (gcc version 4.3.1 20080626 (prerelease) (GCC) ) #89 SMP Fri Aug 22 12:47:34 BRT 2008
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 0000000007ff0000 (usable)
> BIOS-e820: 0000000007ff0000 - 0000000008000000 (ACPI data)
> BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
> debug: ignoring loglevel setting.
> last_pfn = 0x7ff0 max_arch_pfn = 0x100000
> kernel direct mapping tables up to 7ff0000 @ 7000-d000
> RAMDISK: 07b9b000 - 07fbf89d
> DMI 2.5 present.
> ACPI: RSDP 000E0000, 0024 (r2 VBOX )
> ACPI: XSDT 07FF0030, 002C (r1 VBOX VBOXXSDT 1 ASL 61)
> ACPI: FACP 07FF0060, 00F4 (r4 VBOX VBOXFACP 1 ASL 61)
> ACPI: DSDT 07FF01A0, 1064 (r1 VBOX VBOXBIOS 2 INTL 20080213)
> ACPI: FACS 07FF0160, 0040
> 0MB HIGHMEM available.
> 127MB LOWMEM available.
> mapped low ram: 0 - 07ff0000
> low ram: 00000000 - 07ff0000
> bootmap 00002000 - 00003000
> (9 early reservations) ==> bootmem [0000000000 - 0007ff0000]
> #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000]
> #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000]
> #3 [0000100000 - 0000814b10] TEXT DATA BSS ==> [0000100000 - 0000814b10]
> #4 [0007b9b000 - 0007fbf89d] RAMDISK ==> [0007b9b000 - 0007fbf89d]
> #5 [0000815000 - 0000819000] INIT_PG_TABLE ==> [0000815000 - 0000819000]
> #6 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 0000100000]
> #7 [0000007000 - 0000009000] PGTABLE ==> [0000007000 - 0000009000]
> #8 [0000002000 - 0000003000] BOOTMAP ==> [0000002000 - 0000003000]
> Zone PFN ranges:
> DMA 0x00000000 -> 0x00001000
> Normal 0x00001000 -> 0x00007ff0
> HighMem 0x00007ff0 -> 0x00007ff0
> Movable zone start PFN for each node
> early_node_map[2] active PFN ranges
> 0: 0x00000000 -> 0x0000009f
> 0: 0x00000100 -> 0x00007ff0
> On node 0 totalpages: 32655
> free_area_init_node: node 0, pgdat c041f600, node_mem_map c1000000
> DMA zone: 3947 pages, LIFO batch:0
> Normal zone: 28292 pages, LIFO batch:7
> ACPI: PM-Timer IO Port: 0x4008
> SMP: Allowing 1 CPUs, 0 hotplug CPUs
> Found and enabled local APIC!
> mapped APIC to ffffb000 (fee00000)
> PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
> PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
> PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
> Allocating PCI resources starting at 10000000 (gap: 8000000:f7fc0000)
> PERCPU: Allocating 40224 bytes of per cpu data
> NR_CPUS: 32, nr_cpu_ids: 1, nr_node_ids 1
> Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32239
> Kernel command line: initrd=alt0/all.rdz vga=788 splash=silent BOOT_IMAGE=alt0/vmlinuz vga=0 console=ttyS0,9600 console=tty0 ignore_loglevel debug-alternative
> Enabling fast FPU save and restore... done.
> Enabling unmasked SIMD FPU exception support... done.
> Initializing CPU#0
> PID hash table entries: 512 (order: 9, 2048 bytes)
> TSC calibrated against PM_TIMER
> Detected 2410.976 MHz processor.
> Console: colour VGA+ 80x25
> console [tty0] enabled
> console [ttyS0] enabled
> Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> ... MAX_LOCKDEP_SUBCLASSES: 8
> ... MAX_LOCK_DEPTH: 48
> ... MAX_LOCKDEP_KEYS: 8191
> ... CLASSHASH_SIZE: 4096
> ... MAX_LOCKDEP_ENTRIES: 8192
> ... MAX_LOCKDEP_CHAINS: 16384
> ... CHAINHASH_SIZE: 8192
> memory used by lock dependency info: 2335 kB
> per task-struct memory footprint: 1152 bytes
> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> Memory: 117088k/131008k available (2340k kernel code, 13364k reserved, 1027k data, 308k init, 0k highmem)
> virtual kernel memory layout:
> fixmap : 0xffe18000 - 0xfffff000 (1948 kB)
> pkmap : 0xff800000 - 0xffc00000 (4096 kB)
> vmalloc : 0xc8800000 - 0xff7fe000 ( 879 MB)
> lowmem : 0xc0000000 - 0xc7ff0000 ( 127 MB)
> .init : 0xc0451000 - 0xc049e000 ( 308 kB)
> .data : 0xc03493b8 - 0xc044a040 (1027 kB)
> .text : 0xc0100000 - 0xc03493b8 (2340 kB)
> Checking if this processor honours the WP bit even in supervisor mode...Ok.
> CPA: page pool initialized 1 of 1 pages preallocated
> SLUB: Genslabs=12, HWalign=128, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> Calibrating delay loop (skipped), value calculated using timer frequency.. 4821.95 BogoMIPS (lpj=2410976)
> Security Framework initialized
> Mount-cache hash table entries: 512
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> Checking 'hlt' instruction... OK.
> apply_alternatives: alt table c048afa4 -> c048f11f
> SMP alternatives: switching to UP code
> Freeing SMP alternatives: 10k freed
> ACPI: Core revision 20080609
> ACPI: setting ELCR to 0200 (from 0c00)
> weird, boot CPU (#0) not listedby the BIOS.
> SMP motherboard not detected.
> SMP disabled
> Brought up 1 CPUs
> Total of 1 processors activated (4821.95 BogoMIPS).
> BUG: unable to handle kernel NULL pointer dereference at 00000246
> IP: [<c012fc71>] vprintk+0x181/0x440
> *pde = 00000000
> Oops: 0002 [#1] SMP
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.27-rc4-test25 #89)
> EIP: 0060:[<c012fc71>] EFLAGS: 00010246 CPU: 0
> EIP is at vprintk+0x181/0x440
> EAX: 00000246 EBX: 00000000 ECX: c012f8a9 EDX: 00009695
> ESI: 00000000 EDI: c04d76d7 EBP: c7435f98 ESP: c7435f24
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process swapper (pid: 1, ti=c7434000 task=c7438000 task.ti=c7434000)
> Stack: 00009695 c012f8a9 c7435fb4 0000007b c012007b 000000d8 ffffff10 c7435f5a
> 00000006 00000246 00000000 c046902c 00000037 00000246 c03438ac c7435f7a
> 00000006 00000246 00000000 00000000 00000015 c7435f94 c045e900 00000030
> Call Trace:
> [<c012f8a9>] ? release_console_sem+0x1c9/0x1e0
> [<c012007b>] ? resched_task+0x4b/0x70
> [<c046902c>] ? relay_init+0xd/0x11
> [<c03438ac>] ? end_local_APIC_setup+0xb9/0xf2
> [<c045e900>] ? prefill_possible_map+0x7/0x8a
> [<c03450f4>] ? printk+0x18/0x1c
> [<c045eacd>] ? native_smp_cpus_done+0x93/0xe9
> [<c04512f3>] ? kernel_init+0xd5/0x166
> [<c045121e>] ? kernel_init+0x0/0x166
> [<c0104b67>] ? kernel_thread_helper+0x7/0x10
> =======================
> Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff e8 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8
> EIP: [<c012fc71>] vprintk+0x181/0x440 SS:ESP 0069:c7435f24
> ---[ end trace 4eaa2a86a8e2da22 ]---
> Kernel panic - not syncing: Attempted to kill init!


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-22 17:17:27

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Was looking at the code stream, and noticed this:

Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff
e8 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00
8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8

Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff
e8 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00
8b 45 bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8

The EIP is in the *MIDDLE* of a NOPL instruction:

C012FC46 C00F84 ror byte [edi],0x84
C012FC49 0B01 or eax,[ecx]
C012FC4B 0000 add [eax],al
C012FC4D B8D0BF41C0 mov eax,0xc041bfd0
C012FC52 C7056CC041C0FFFF mov dword [dword 0xc041c06c],0xffffffff
-FFFF
C012FC5C E87F822100 call dword 0xc0347ee0
C012FC61 E81A030200 call dword 0xc014ff80
C012FC66 8B45B0 mov eax,[ebp-0x50]
C012FC69 50 push eax
C012FC6A 9D popfd
C012FC6B 0F1F840000000000 nop dword [eax+eax+0x0]
C012FC73 8B45BC mov eax,[ebp-0x44]
C012FC76 83C460 add esp,byte +0x60
C012FC79 5B pop ebx
C012FC7A 5E pop esi
C012FC7B 5F pop edi
C012FC7C 5D pop ebp
C012FC7D C3 ret
C012FC7E 6690 xchg ax,ax
C012FC80 A16CC041C0 mov eax,[0xc041c06c]

There are two possibilities: VirtualBox mis-executes (not merely traps,
which is what tip:master looks for) the NOPL instruction, or something
is jumping into the middle of the sequence that is then replaced by the
NOPL.

So, Luiz: the DEBUG_INFO version of vmlinux would be helpful. It would
also help to know the exact version of VirtualBox you're running, what
source you got it from, and what your host system looks like.

-hpa

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Em Fri, 22 Aug 2008 12:35:20 -0400
Mathieu Desnoyers <[email protected]> escreveu:

| * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > Em Fri, 22 Aug 2008 11:34:52 -0400
| > Mathieu Desnoyers <[email protected]> escreveu:
| >
| > | * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > | > Em Fri, 22 Aug 2008 08:50:12 +0200
| > | > Ingo Molnar <[email protected]> escreveu:
| > | >
| > | > |
| > | > | * H. Peter Anvin <[email protected]> wrote:
| > | > |
| > | > | > H. Peter Anvin wrote:
| > | > | >>>
| > | > | >>> Does this look like a kernel bug?
| > | > | >>>
| > | > | >>
| > | > | >> No, it looks like a very common virtualizer bug. Does the attached
| > | > | >> patch work for you?
| > | > | >>
| > | > | >
| > | > | > Also, in addition to this, please try tip:master. There is a patch in
| > | > | > tip:master which I hope should fix this problem, but the details are
| > | > | > important.
| > | > |
| > | > | access coordinates would be at:
| > | > |
| > | > | http://people.redhat.com/mingo/tip.git/README
| > | >
| > | > As I already have Linus tree downloaded I have cloned it in
| > | > the usual way.
| > | >
| > | > Got the same results: OOPS in virtualbox but it works on QEMU.
| > | >
| > | > The OOPS's output follows and I have attached the .config I'm using
| > | > to reproduce the problem.
| > | >
| > |
| > | Can you try booting with the kernel argument :
| > | debug_alternative
| > |
| > | The dmesg of the kernel bootup up to the oops would be helpful.
| > |
| > | My guess is that there may be something wrong with irq disabling which
| > | protects text_poke_early in apply_alternatives().
| >
| > I have attached two files:
| >
| > - normal.txt: normal boot with no debug options
| > - debug-alternative.txt ignore_loglevel and debug-alternative boot
| > options
| >
| > I had to pass ignore_loglevel otherwise it wouldn't print
| > anything.
| >
|
| Ok, now can you try booting with either of those args :
|
| noreplace-paravirt
| noreplace-smp
|
| And see which one(s) works ?

noreplace-paravirt works, the other one causes no change.

You will find the full boot log (with debug-alternative enabled)
attached.

--
Luiz Fernando N. Capitulino


Attachments:
(No filename) (2.23 kB)
working.txt (8.12 kB)

2008-08-22 17:45:42

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* H. Peter Anvin ([email protected]) wrote:
> Was looking at the code stream, and noticed this:
>
> Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff e8
> 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45
> bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8
>
> Code: c0 0f 84 0b 01 00 00 b8 d0 bf 41 c0 c7 05 6c c0 41 c0 ff ff ff ff e8
> 7f 82 21 00 e8 1a 03 02 00 8b 45 b0 50 9d 0f 1f 84 00 00 00 <00> 00 8b 45
> bc 83 c4 60 5b 5e 5f 5d c3 66 90 a1 6c c0 41 c0 e8
>
> The EIP is in the *MIDDLE* of a NOPL instruction:
>
> C012FC46 C00F84 ror byte [edi],0x84
> C012FC49 0B01 or eax,[ecx]
> C012FC4B 0000 add [eax],al
> C012FC4D B8D0BF41C0 mov eax,0xc041bfd0
> C012FC52 C7056CC041C0FFFF mov dword [dword 0xc041c06c],0xffffffff
> -FFFF
> C012FC5C E87F822100 call dword 0xc0347ee0
> C012FC61 E81A030200 call dword 0xc014ff80
> C012FC66 8B45B0 mov eax,[ebp-0x50]
> C012FC69 50 push eax
> C012FC6A 9D popfd
> C012FC6B 0F1F840000000000 nop dword [eax+eax+0x0]
> C012FC73 8B45BC mov eax,[ebp-0x44]
> C012FC76 83C460 add esp,byte +0x60
> C012FC79 5B pop ebx
> C012FC7A 5E pop esi
> C012FC7B 5F pop edi
> C012FC7C 5D pop ebp
> C012FC7D C3 ret
> C012FC7E 6690 xchg ax,ax
> C012FC80 A16CC041C0 mov eax,[0xc041c06c]
>
> There are two possibilities: VirtualBox mis-executes (not merely traps,
> which is what tip:master looks for) the NOPL instruction, or something is
> jumping into the middle of the sequence that is then replaced by the NOPL.
>
> So, Luiz: the DEBUG_INFO version of vmlinux would be helpful. It would
> also help to know the exact version of VirtualBox you're running, what
> source you got it from, and what your host system looks like.
>
> -hpa

The patch which triggers this bug makes one important change to
apply_paravirt(): it disables interrupts _near_ the code patching,
_within_ the loop. Before, interrupts were disabled outside of the loop.
It needs to disable interrupts within the loop to be able to use vmap in
text_poke().

So I bet VirtualBox has a race in the way it handles interrupt
disabling.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-22 17:58:13

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Mathieu Desnoyers wrote:
>
> The patch which turns on this bug this this important change to the
> apply paravirt : it disables interrupts _near_ the code patching,
> _within_ the loop. Before, interrupts were disabled outside of the loop.
> It needs to disable interrupts within the loop to be able to use vmap in
> text_poke().
>
> So I bet VirtualBox has a race in the way it handles interrupt
> disabling.
>

That seems a bit far-fetched. The fault is in an initcall, and there
are no interrupts involved. Perhaps VirtualBox doesn't manage its
tcache correctly, but I don't see this as being interrupt-related.

-hpa

2008-08-22 18:12:16

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2763cb3..33193fe 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -157,8 +157,6 @@ static const struct nop {
} noptypes[] = {
{ X86_FEATURE_K8, k8_nops },
{ X86_FEATURE_K7, k7_nops },
- { X86_FEATURE_P4, p6_nops },
- { X86_FEATURE_P3, p6_nops },
{ -1, NULL }
};


Attachments:
nopl.c (263.00 B)
diff (386.00 B)
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, 22 Aug 2008 10:16:07 -0700
"H. Peter Anvin" <[email protected]> wrote:

| So, Luiz: the DEBUG_INFO version of vmlinux would be helpful. It would
| also help to know the exact version of VirtualBox you're running, what
| source you got it from, and what your host system looks like.

You will find vmlinux with DEBUG_INFO enabled at:

http://users.mandriva.com.br/~lcapitulino/virtualbox-oops/

I'm running Mandriva's VirtualBox 1.6.4 OSE, my host kernel is 2.6.26-3mnb
(patched).

I could try with upstream's VirtualBox just to be sure it's not
something else, but I don't think it is since there are reports for
ArchLinux and Ubuntu as well:

https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/246067

--
Luiz Fernando N. Capitulino

2008-08-22 19:15:27

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Luiz Fernando N. Capitulino wrote:
> Em Fri, 22 Aug 2008 10:16:07 -0700
> "H. Peter Anvin" <[email protected]> escreveu:
>
> | So, Luiz: the DEBUG_INFO version of vmlinux would be helpful. It would
> | also help to know the exact version of VirtualBox you're running, what
> | source you got it from, and what your host system looks like.
>
> You will find vmlinux with DEBUG_INFO enabled at:
>
> http://users.mandriva.com.br/~lcapitulino/virtualbox-oops/
>
> I'm running Mandriva's VirtualBox 1.6.4 OSE, my host kernel is 2.6.26-3mnb
> (patched).
>
> I could try with upstream's VirtualBox just to be sure it's not
> something else, but I don't think it is since there are reports for
> ArchLinux and Ubuntu as well:
>
> https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/246067
>

Not necessary, but I wanted to get the information so I can try to
reproduce locally.

-hpa

2008-08-22 19:19:13

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Luiz Fernando N. Capitulino wrote:
>
> I'm running Mandriva's VirtualBox 1.6.4 OSE, my host kernel is 2.6.26-3mnb
> (patched).
>

What is your host *system* like -- CPU especially, and is your host
kernel 32 or 64 bits?

-hpa

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, 22 Aug 2008 11:11:25 -0700
"H. Peter Anvin" <[email protected]> wrote:

| Hi Luiz, two more tests:
|
| 1. a small program to run in userspace and tell us what you get;

88776655:44332211

It is the same output in the virtualized system and the host
system.

| 2. a patch against -linus for testing.

I tried this patch with Linus' tree earlier today; should I try
it with Ingo's tree too?

--
Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, 22 Aug 2008 12:18:21 -0700
"H. Peter Anvin" <[email protected]> wrote:

| Luiz Fernando N. Capitulino wrote:
| >
| > I'm running Mandriva's VirtualBox 1.6.4 OSE, my host kernel is 2.6.26-3mnb
| > (patched).
| >
|
| What is your host *system* like -- CPU especially, and is your host
| kernel 32 or 64 bits?

32 bits, /proc/cpuinfo output:

"""
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 1
cpu MHz : 2410.462
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pebs bts pni monitor ds_cpl cid xtpr
bogomips : 4825.33
clflush size : 64
power management:
"""

I have 1G of RAM and a VIA mobo.

--
Luiz Fernando N. Capitulino

2008-08-22 20:32:39

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Luiz Fernando N. Capitulino wrote:
>
> | 2. a patch against -linus for testing.
>
> I have tried this patch with Linus tree early today, should I try
> it with Ingo's tree too?
>

It doesn't apply to tip. This did not fix the problem?

-hpa

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, 22 Aug 2008 13:31:49 -0700
"H. Peter Anvin" <[email protected]> wrote:

| Luiz Fernando N. Capitulino wrote:
| >
| > | 2. a patch against -linus for testing.
| >
| > I have tried this patch with Linus tree early today, should I try
| > it with Ingo's tree too?
| >
|
| It doesn't apply to tip. This did not fix the problem?

No, it did not. :(

--
Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, 22 Aug 2008 14:20:54 -0300
"Luiz Fernando N. Capitulino" <[email protected]> wrote:

| On Fri, 22 Aug 2008 12:35:20 -0400
| Mathieu Desnoyers <[email protected]> wrote:
|
| | * Luiz Fernando N. Capitulino ([email protected]) wrote:
| | > On Fri, 22 Aug 2008 11:34:52 -0400
| | > Mathieu Desnoyers <[email protected]> wrote:
| | >
| | > | * Luiz Fernando N. Capitulino ([email protected]) wrote:
| | > | > On Fri, 22 Aug 2008 08:50:12 +0200
| | > | > Ingo Molnar <[email protected]> wrote:
| | > | >
| | > | > |
| | > | > | * H. Peter Anvin <[email protected]> wrote:
| | > | > |
| | > | > | > H. Peter Anvin wrote:
| | > | > | >>>
| | > | > | >>> Does this look like a kernel bug?
| | > | > | >>>
| | > | > | >>
| | > | > | >> No, it looks like a very common virtualizer bug. Does the attached
| | > | > | >> patch work for you?
| | > | > | >>
| | > | > | >
| | > | > | > Also, in addition to this, please try tip:master. There is a patch in
| | > | > | > tip:master which I hope should fix this problem, but the details are
| | > | > | > important.
| | > | > |
| | > | > | access coordinates would be at:
| | > | > |
| | > | > | http://people.redhat.com/mingo/tip.git/README
| | > | >
| | > | > As I already have Linus tree downloaded I have cloned it in
| | > | > the usual way.
| | > | >
| | > | > Got the same results: OOPS in virtualbox but it works on QEMU.
| | > | >
| | > | > The OOPS's output follows and I have attached the .config I'm using
| | > | > to reproduce the problem.
| | > | >
| | > |
| | > | Can you try booting with the kernel argument :
| | > | debug_alternative
| | > |
| | > | The dmesg of the kernel bootup up to the oops would be helpful.
| | > |
| | > | My guess is that there may be something wrong with irq disabling which
| | > | protects text_poke_early in apply_alternatives().
| | >
| | > I have attached two files:
| | >
| | > - normal.txt: normal boot with no debug options
| | > - debug-alternative.txt ignore_loglevel and debug-alternative boot
| | > options
| | >
| | > I had to pass ignore_loglevel otherwise it wouldn't print
| | > anything.
| | >
| |
| | Ok, now can you try booting with either of those args :
| |
| | noreplace-paravirt
| | noreplace-smp
| |
| | And see which one(s) works ?
|
| noreplace-paravirt works, the other one causes no change.

I have asked Mandriva and Ubuntu users to test this and all of
them so far are saying that noreplace-paravirt works.

It makes the system slower, but it works.

--
Luiz Fernando N. Capitulino

2008-08-22 21:09:20

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Luiz Fernando N. Capitulino wrote:
>
> I have asked Mandriva and Ubuntu users to test this and all of
> them so far are saying that noreplace-paravirt works.
>
> It makes the system slower, but it works.
>

Yes, the big issue is exactly what VirtualBox screws up in this matter,
how to detect it, and how to work around it.

It's pretty clear it's a VirtualBox f*ckup at this point, but the
failure mechanism isn't at all obvious and so far the workaround is elusive.

I strongly suspect this is a VirtualBox tcache management failure, but
that doesn't help the situation without knowing how it happens.

-hpa

2008-08-26 14:19:06

by Gerhard Brauer

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Fri, Aug 22, 2008 at 02:08:13PM -0700, H. Peter Anvin wrote:
> Luiz Fernando N. Capitulino wrote:
>>
>> I have asked Mandriva and Ubuntu users to test this and all of
>> them so far are saying that noreplace-paravirt works.
>>
>> It makes the system slower, but it works.
>>
>
> Yes, the big issue is exactly what VirtualBox screws up in this matter,
> how to detect it, and how to work around it.
>
> It's pretty clear it's a VirtualBox f*ckup at this point, but the failure
> mechanism isn't at all obvious and so far the workaround is elusive.
>
> I'm strongly suspect this is a VirtualBox tcache management failure, but
> that doesn't help the situation without knowing how it happens.

On Archlinux we have the same problem. We have a bugreport here:
http://bugs.archlinux.org/task/11141

I tested it myself with a LiveCD/install ISO which has 2.6.26 as its
install kernel. The guest oopses with both virtualbox-ose and
virtualbox-sun, on both i686 and x86_64 hosts.

Some things I noticed:
- The system always boots when I either enable VT-x in the guest settings
or disable ACPI and run the guest with acpi=off.
- The oops always occurs on (disk) I/O, no matter which file system I
use.
- Once the oops has occurred and the guest has to be closed and restarted,
then, unless I use VT-x or acpi=off, I always get an oops right when
the initrd/kernel is starting. The last screen message before the oops is
"Freeing SMP alternatives".

Here is also an archive with the guest dmesg and messages.log from an
oops triggered by heavy disk I/O:
http://bugs.archlinux.org/task/11141?getfile=2445


> -hpa

Gerhard

--
Standards are a great thing.
I think everyone should have one.

2008-08-26 14:53:50

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Gerhard Brauer ([email protected]) wrote:
> On Fri, Aug 22, 2008 at 02:08:13PM -0700, H. Peter Anvin wrote:
> > Luiz Fernando N. Capitulino wrote:
> >>
> >> I have asked Mandriva and Ubuntu users to test this and all of
> >> them so far are saying that noreplace-paravirt works.
> >>
> >> It makes the system slower, but it works.
> >>
> >
> > Yes, the big issue is exactly what VirtualBox screws up in this matter,
> > how to detect it, and how to work around it.
> >
> > It's pretty clear it's a VirtualBox f*ckup at this point, but the failure
> > mechanism isn't at all obvious and so far the workaround is elusive.
> >
> > I'm strongly suspect this is a VirtualBox tcache management failure, but
> > that doesn't help the situation without knowing how it happens.
>
> On Archlinux we have the same problem. We have a bugreport here:
> http://bugs.archlinux.org/task/11141
>
> Myself test it with a LiveCD/Install-ISO which has 2.6.26 as install
> kernel. We have the guest oops on virtualbox-ose, virtualbox-sun and both on
> i686 or x86_64 hosts.
>
> Some things i noticed:
> - The system boots always when i either enable VT-x in guest settings or
> disable acpi and run the guest with acpi=off.
> - The oops occurs always on (disk)-io, no matter which file system i
> use.
> - When the oops has occured and the guest has to close and restart then,
> if i don't use VT-x or acpi=off, i always get an oops directly when
> initrd/kernel is starting. Last screen message before the oops then is
> "Freeing SMP alternatives".
>
> Here is also an archive with guest dmesg and messages.log from such an
> oops when heavy disk io leads to the oops:
> http://bugs.archlinux.org/task/11141?getfile=2445
>

Hrm, can you try this ?

1 - Make sure your kernel is not built with CONFIG_DEBUG_RODATA

2 - Change the whole text_poke implementation in
arch/x86/kernel/alternative.c to this :

void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
{
return text_poke_early(addr, opcode, len);
}

If this works, I suspect that the problem comes from a vmap/vunmap
problem. If it still fails, the problem would likely come from a race
with interrupt disabling probably due to missing data/instruction cache
flush.

Then, after having tested (2), try this on top of it :

In arch/x86/kernel/alternative.c, alternatives_smp_switch()

Add unsigned long flags;
Change
spin_lock -> spin_lock_irqsave(&smp_alt, flags);
spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);

This will help test whether there is a problem with interrupts coming
shortly after the modification. If it fixes the problem, my guess is
that we should flush the instruction cache (and maybe the data cache?)
in text_poke and text_poke_early when interrupts are off.

Mathieu


>
> > -hpa
>
> Gerhard
>
> --
> Standards sind eine tolle Sache.
> Ich finde, jeder sollte einen haben.

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, 26 Aug 2008 16:18:51 +0200
Gerhard Brauer <[email protected]> wrote:

| On Fri, Aug 22, 2008 at 02:08:13PM -0700, H. Peter Anvin wrote:
| > Luiz Fernando N. Capitulino wrote:
| >>
| >> I have asked Mandriva and Ubuntu users to test this and all of
| >> them so far are saying that noreplace-paravirt works.
| >>
| >> It makes the system slower, but it works.
| >>
| >
| > Yes, the big issue is exactly what VirtualBox screws up in this matter,
| > how to detect it, and how to work around it.
| >
| > It's pretty clear it's a VirtualBox f*ckup at this point, but the failure
| > mechanism isn't at all obvious and so far the workaround is elusive.
| >
| > I'm strongly suspect this is a VirtualBox tcache management failure, but
| > that doesn't help the situation without knowing how it happens.
|
| On Archlinux we have the same problem. We have a bugreport here:
| http://bugs.archlinux.org/task/11141
|
| Myself test it with a LiveCD/Install-ISO which has 2.6.26 as install
| kernel. We have the guest oops on virtualbox-ose, virtualbox-sun and both on
| i686 or x86_64 hosts.
|
| Some things i noticed:
| - The system boots always when i either enable VT-x in guest settings or
| disable acpi and run the guest with acpi=off.

Yes, many Ubuntu users have reported the same, but many others
have reported that the trick didn't work.

Thanks for joining!

--
Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, 26 Aug 2008 10:53:38 -0400
Mathieu Desnoyers <[email protected]> wrote:

| * Gerhard Brauer ([email protected]) wrote:
| > On Fri, Aug 22, 2008 at 02:08:13PM -0700, H. Peter Anvin wrote:
| > > Luiz Fernando N. Capitulino wrote:
| > >>
| > >> I have asked Mandriva and Ubuntu users to test this and all of
| > >> them so far are saying that noreplace-paravirt works.
| > >>
| > >> It makes the system slower, but it works.
| > >>
| > >
| > > Yes, the big issue is exactly what VirtualBox screws up in this matter,
| > > how to detect it, and how to work around it.
| > >
| > > It's pretty clear it's a VirtualBox f*ckup at this point, but the failure
| > > mechanism isn't at all obvious and so far the workaround is elusive.
| > >
| > > I'm strongly suspect this is a VirtualBox tcache management failure, but
| > > that doesn't help the situation without knowing how it happens.
| >
| > On Archlinux we have the same problem. We have a bugreport here:
| > http://bugs.archlinux.org/task/11141
| >
| > Myself test it with a LiveCD/Install-ISO which has 2.6.26 as install
| > kernel. We have the guest oops on virtualbox-ose, virtualbox-sun and both on
| > i686 or x86_64 hosts.
| >
| > Some things i noticed:
| > - The system boots always when i either enable VT-x in guest settings or
| > disable acpi and run the guest with acpi=off.
| > - The oops occurs always on (disk)-io, no matter which file system i
| > use.
| > - When the oops has occured and the guest has to close and restart then,
| > if i don't use VT-x or acpi=off, i always get an oops directly when
| > initrd/kernel is starting. Last screen message before the oops then is
| > "Freeing SMP alternatives".
| >
| > Here is also an archive with guest dmesg and messages.log from such an
| > oops when heavy disk io leads to the oops:
| > http://bugs.archlinux.org/task/11141?getfile=2445
| >
|
| Hrm, can you try this ?
|
| 1 - Make sure you kernel is not CONFIG_DEBUG_RODATA

"""
$ grep CONFIG_DEBUG_RODATA .config
# CONFIG_DEBUG_RODATA is not set
$
"""

| 2 - Change the whole text_poke implementation in
| arch/x86/kernel/alternative.c to this :
|
| void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
| {
| return text_poke_early(addr, opcode, len);
| }
|
| If this works, I suspect that the problem comes from a vmap/vunmap
| problem. If it still fails, the problem would likely come from a race
| with interrupt disabling probably due to missing data/instruction cache
| flush.

I still get the oops with this change. :((

| Then, after having tested (2), try this on top of it :
|
| In arch/x86/kernel/alternative.c, alternatives_smp_switch()
|
| Add unsigned long flags;
| Change
| spin_lock -> spin_lock_irqsave(&smp_alt, flags);
| spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);
|
| This will help testing if there is a problem with interrupts coming
| shortly after the modification. If it fixes the problem, my guess is
| that we should flush the instruction cache (and maybe the data cache ?)
| in text_poke and text_poke early when interrupts are off.

By 'on top of it' you mean I should make these changes together with the
text_poke() version above, right?

By the way, I have added a comment in VirtualBox's bugzilla
pointing to this thread, but there has been no feedback from them so far.

--
Luiz Fernando N. Capitulino

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, 26 Aug 2008 10:53:38 -0400
Mathieu Desnoyers <[email protected]> wrote:

| Then, after having tested (2), try this on top of it :
|
| In arch/x86/kernel/alternative.c, alternatives_smp_switch()
|
| Add unsigned long flags;
| Change
| spin_lock -> spin_lock_irqsave(&smp_alt, flags);
| spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);

Hmm, I can't find any spin_lock calls in alternatives_smp_switch();
it looks like the current implementation is now using mutexes.

What tree are you referring to?

--
Luiz Fernando N. Capitulino

2008-08-26 16:40:50

by Gerhard Brauer

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 01:02:16PM -0300, Luiz Fernando N. Capitulino wrote:
> Em Tue, 26 Aug 2008 16:18:51 +0200
> Gerhard Brauer <[email protected]> escreveu:
>
> | Some things i noticed:
> | - The system boots always when i either enable VT-x in guest settings or
> | disable acpi and run the guest with acpi=off.
>
> Yes, lots of ubuntu users have reported the same but another "lots"
> of them have reported that the trick didn't work.

I have to qualify the above: I have two test environments. One is our
LiveCD/install ISO with 2.6.26, which we made specially for a Linux
conference last weekend (our official ISO still comes with 2.6.25). With
this ISO the "trick" with VT-x or noacpi works.
But on an installed Arch Linux (with distribution kernel 2.6.26) it
doesn't work. Sometimes it works when I restart the VirtualBox
application, but mostly not. So on this installed guest system the only
working solution seems to be adding noreplace-paravirt as a kernel
parameter. But this makes the system terribly slow (mostly during udev
processing).

I am currently trying Mathieu's hints by building a new distribution
kernel with the changes.
But I think the biggest obstacle to solving this from the kernel devs'
point of view is that we all have different "test" environments (vbox
versions, architectures, distribution kernels, ...) where the oops (I
think) does not appear in the same place for everyone.
On the other hand, the more we try such patches in different
environments, the better the chance of getting a real fix, whether from
the kernel dev side or from the VirtualBox/Sun side...

> Thanks for joining!
> Luiz Fernando N. Capitulino

Gerhard

--
www.archlinux.de

2008-08-26 17:18:36

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> Em Tue, 26 Aug 2008 10:53:38 -0400
> Mathieu Desnoyers <[email protected]> escreveu:
>
> | Then, after having tested (2), try this on top of it :
> |
> | In arch/x86/kernel/alternative.c, alternatives_smp_switch()
> |
> | Add unsigned long flags;
> | Change
> | spin_lock -> spin_lock_irqsave(&smp_alt, flags);
> | spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);
>
> Hmm, I can't find spin_lock functions in alternatives_smp_switch()
> looks like the current implementation is now using mutexes.
>

Sorry, I was looking directly at the commit which caused the problem.
Yes, these modifications should go on top of the text_poke ->
text_poke_early change.

So in current mainline, change, in alternatives_smp_switch() :

mutex_lock(&smp_alt);
...

mutex_unlock(&smp_alt);

to

mutex_lock(&smp_alt);
local_irq_save(flags);
...

local_irq_restore(flags);
mutex_unlock(&smp_alt);

Thanks,

Mathieu

> What tree are you referring to?
>
> --
> Luiz Fernando N. Capitulino

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-26 17:33:21

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

I have been unable to replicate this on my own hardware mostly because
my testing machine decided to blow its DVD drive in some very strange
way, but I did pick apart the data from Luiz, and found it very interesting:

The code sequence before patching looks like:

c012fc69: 51 push %ecx
c012fc6a: 52 push %edx
c012fc6b: ff 15 40 b9 41 c0 call *0xc041b940
c012fc71: 5a pop %edx
c012fc72: 59 pop %ecx

After patching:

50 9d 0f 1f 84 00 00 00 <00> 00

... which disassembles to (in Intel notation):

C012FC69 50 push eax
C012FC6A 9D popfd
C012FC6B 0F1F840000000000 nop dword [eax+eax+0x0]

We do, indeed, have a return point that falls in the *middle* of a
patched instruction, and if the patching happens while the call is still
in progress, then, well, bad things happen.
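
The address arithmetic behind this observation can be checked
mechanically. The following is a minimal sketch in plain userspace C, not
kernel code. The struct and helper names (struct insn, insn_end,
lands_mid_insn, marked_byte_addr) are made up for illustration; only the
addresses, lengths and byte values come from the listings above.

```c
#include <stdint.h>

/* One decoded instruction: start address and encoded length in bytes. */
struct insn {
    uint32_t addr;
    uint32_t len;
};

/* Address of the first byte after the instruction.
 * For a call instruction this is the return address. */
static uint32_t insn_end(struct insn i)
{
    return i.addr + i.len;
}

/* True if addr points strictly inside i: past its first byte,
 * but before the start of the next instruction. */
static int lands_mid_insn(struct insn i, uint32_t addr)
{
    return addr > i.addr && addr < insn_end(i);
}

/* Before patching: call *0xc041b940, encoded as ff 15 40 b9 41 c0. */
static const struct insn old_call = { 0xc012fc6bu, 6 };

/* After patching: NOPL, encoded as 0f 1f 84 00 00 00 00 00. */
static const struct insn new_nopl = { 0xc012fc6bu, 8 };

/* In "50 9d 0f 1f 84 00 00 00 <00> 00" the marked byte has index 8,
 * and the sequence starts at c012fc69. */
static uint32_t marked_byte_addr(void)
{
    return 0xc012fc69u + 8;
}
```

With these values, the return address of the old indirect call
(0xc012fc6b + 6 = 0xc012fc71) equals the address of the byte marked <00>
in the oops, and it lies inside the 8-byte NOPL that now starts at
0xc012fc6b.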

Furthermore, why on Earth are %ecx/%edx pushed and popped in-line here?
Surely it should be the responsibility of the PV call to present a
no-clobber interface (using an assembly wrapper if necessary[*]), rather
than bloating every callsite like this?

-hpa


[*] One can compile gcc code with -fcall-saved-* to use nonstandard
register conventions. Unfortunately stock gcc only lets you do this
with a file parameter, and doesn't support doing this with attributes.

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, 26 Aug 2008 13:18:22 -0400
Mathieu Desnoyers <[email protected]> wrote:

| * Luiz Fernando N. Capitulino ([email protected]) wrote:
| > On Tue, 26 Aug 2008 10:53:38 -0400
| > Mathieu Desnoyers <[email protected]> wrote:
| >
| > | Then, after having tested (2), try this on top of it :
| > |
| > | In arch/x86/kernel/alternative.c, alternatives_smp_switch()
| > |
| > | Add unsigned long flags;
| > | Change
| > | spin_lock -> spin_lock_irqsave(&smp_alt, flags);
| > | spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);
| >
| > Hmm, I can't find spin_lock functions in alternatives_smp_switch()
| > looks like the current implementation is now using mutexes.
| >
|
| Sorry, I was looking directly at the commit which caused the problem.
| Yes, these modif should go on top of the text_poke -> text_poke_early.
|
| So in current mainline, change, in alternatives_smp_switch() :
|
| mutex_lock(&smp_alt);
| ...
|
| mutex_unlock(&smp_alt);
|
| to
|
| mutex_lock(&smp_alt);
| local_irq_save(flags);
| ...
|
| local_irq_restore(flags);
| mutex_unlock(&smp_alt);

Did not help, same oops here.

--
Luiz Fernando N. Capitulino

2008-08-26 18:16:16

by Mathieu Desnoyers

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> Em Tue, 26 Aug 2008 13:18:22 -0400
> Mathieu Desnoyers <[email protected]> escreveu:
>
> | * Luiz Fernando N. Capitulino ([email protected]) wrote:
> | > Em Tue, 26 Aug 2008 10:53:38 -0400
> | > Mathieu Desnoyers <[email protected]> escreveu:
> | >
> | > | Then, after having tested (2), try this on top of it :
> | > |
> | > | In arch/x86/kernel/alternative.c, alternatives_smp_switch()
> | > |
> | > | Add unsigned long flags;
> | > | Change
> | > | spin_lock -> spin_lock_irqsave(&smp_alt, flags);
> | > | spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);
> | >
> | > Hmm, I can't find spin_lock functions in alternatives_smp_switch()
> | > looks like the current implementation is now using mutexes.
> | >
> |
> | Sorry, I was looking directly at the commit which caused the problem.
> | Yes, these modif should go on top of the text_poke -> text_poke_early.
> |
> | So in current mainline, change, in alternatives_smp_switch() :
> |
> | mutex_lock(&smp_alt);
> | ...
> |
> | mutex_unlock(&smp_alt);
> |
> | to
> |
> | mutex_lock(&smp_alt);
> | local_irq_save(flags);
> | ...
> |
> | local_irq_restore(flags);
> | mutex_unlock(&smp_alt);
>
> Did not help, same oops here.
>

Ok, it might still be caused by paravirt and alternatives instruction
patching. What if you also do :

alternative_instructions()

+ unsigned long flags;
/* The patching is not fully atomic, so try to avoid local interruptions
that might execute the to be patched code.
Other CPUs are not running. */
stop_nmi();
#ifdef CONFIG_X86_MCE
stop_mce();
#endif
+ local_irq_save(flags);


...
+ local_irq_restore(flags);
restart_nmi();
#ifdef CONFIG_X86_MCE
restart_mce();
#endif

?

Hrm,

Since those local_irq_save/restore occur _before_ the paravirt patching
is done, I wonder if there could be a race in the way cli/sti traps are
handled by VirtualBox with respect to incoming interrupts?
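
The save/patch/restore ordering proposed above can be illustrated with a
rough userspace analogy. This is only a sketch under loud assumptions:
POSIX signals stand in for interrupts, sigprocmask() stands in for
local_irq_save()/local_irq_restore(), and a byte array stands in for the
10-byte call site. It is not kernel code and says nothing about how
VirtualBox handles cli/sti. All names (text, patched, on_signal,
patch_with_signals_blocked) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Stand-in for the 10-byte call site, holding the unpatched bytes. */
static unsigned char text[10] = {
    0x51, 0x52, 0xff, 0x15, 0x40, 0xb9, 0x41, 0xc0, 0x5a, 0x59
};

/* The bytes the site should contain once patching is complete. */
static const unsigned char patched[10] = {
    0x50, 0x9d, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00
};

static volatile sig_atomic_t handler_ran;
static volatile sig_atomic_t saw_half_patched;

/* The "interrupt handler": records whether it observed a torn site. */
static void on_signal(int sig)
{
    (void)sig;
    handler_ran = 1;
    if (memcmp(text, patched, sizeof(text)) != 0)
        saw_half_patched = 1;
}

/* Patch the site in two steps with all signals blocked.
 * Returns 0 on success, -1 on error. */
static int patch_with_signals_blocked(void)
{
    sigset_t all, saved;
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)  /* ~ local_irq_save */
        return -1;

    memcpy(text, patched, 5);          /* first half written */
    raise(SIGUSR1);                    /* arrives mid-patch, stays pending */
    memcpy(text + 5, patched + 5, 5);  /* second half written */

    /* ~ local_irq_restore: the pending signal is delivered only now. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}
```

In this analogy the handler runs only after the mask is restored, so it
never sees the half-written site. Without the blocking calls, the handler
would run right at the raise() and observe five new bytes followed by
five old ones.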

Thanks,

Mathieu

> --
> Luiz Fernando N. Capitulino

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-26 19:27:48

by Gerhard Brauer

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 10:53:38AM -0400, Mathieu Desnoyers wrote:
> * Gerhard Brauer ([email protected]) wrote:
> >
> > Here is also an archive with guest dmesg and messages.log from such an
> > oops when heavy disk io leads to the oops:
> > http://bugs.archlinux.org/task/11141?getfile=2445
> >
>
> Hrm, can you try this ?

Sorry for the delay, but I needed to build a complete distribution kernel
and my machine is not the fastest.

My host:
archlinux 2.6.26 P4 2Ghz
VirtualBox: Sun xVM 1.6.4
gcc 4.4.1-3

My guest:
archlinux 2.6.26

My "tests":
I could sometimes boot the guest with the "tricks" (VT-x enabled, ACPI
off, ...). But I always get an oops if I compile something bigger on this
guest (e.g. virtualbox-modules, where the tarball must be unpacked with
bsdtar -> disk I/O).
If this happens, the next reboot always leads to the early oops (Freeing
smp....), and so does every reboot after it. Then I close the VirtualBox
application, unload/reload vboxdrv on the host and start vbox again. After
that I can boot the guest again most of the time. But the next heavy disk
I/O leads to the oops again.
If I can boot without an oops, and reboot or halt the guest, then the
next boots are clean.


> 1 - Make sure you kernel is not CONFIG_DEBUG_RODATA
Not set.

> 2 - Change the whole text_poke implementation in
> arch/x86/kernel/alternative.c to this :
With these changes I also get the oops, in all of the tests mentioned above.

> Then, after having tested (2), try this on top of it :
>
> In arch/x86/kernel/alternative.c, alternatives_smp_switch()
>
> Add unsigned long flags;
> Change
> spin_lock -> spin_lock_irqsave(&smp_alt, flags);
> spin_unlock(&smp_alt); -> spin_unlock_irqrestore(&smp_alt, flags);

With our distribution kernel I could change these spin_lock/unlock calls in
alternative.c. My first thought was that the behavior was slightly better
(the first boot went fine and I could compile something, but with the next
package I built the oops (heavy I/O oops) came again, and then, after
reboot, the early oops too (freeing smp...)).
Here is a screenshot of the oops when building something:
http://users.archlinux.de/~gerbra/tmp/2008-08-26-210724_724x456_scrot.png

Sometimes (I could not reproduce it) the VirtualBox app also traps with
an error dialog (Guru message), which offers a log from the VM and a
screenshot. Maybe this could be helpful. The log and screenshot can be
found here:
http://users.archlinux.de/~gerbra/tmp/vbox-guru/

>
> This will help testing if there is a problem with interrupts coming
> shortly after the modification. If it fixes the problem, my guess is
> that we should flush the instruction cache (and maybe the data cache ?)
> in text_poke and text_poke early when interrupts are off.

From my side I would say: neither change solves the oops.

> Mathieu

Regards
Gerhard

2008-08-26 19:53:43

by H. Peter Anvin

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

One thing that I think really needs to be considered is that the current
PV stubs are (a) large, and (b) non-atomic.

In the case at hand we have:

c012fc69: 51 push %ecx
c012fc6a: 52 push %edx
c012fc6b: ff 15 40 b9 41 c0 call *0xc041b940
c012fc71: 5a pop %edx
c012fc72: 59 pop %ecx

Ten bytes replacing a two-byte native sequence.

If this was done as a call to an out-of-line stub, it would be only five
bytes, which would reduce native icache overhead from 400% to 150%, but
perhaps more importantly, it would not be subject to returns inside the
sequence itself (since the out-of-line stub would still exist.) As an
optional bonus, at least on 32 bits the indirect call could be replaced
with a direct call in the out-of-line stub.
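
As a sketch of the arithmetic behind those figures (the helper names below are made up for illustration and are not kernel code), the overhead is just the extra bytes at the patch site relative to the two-byte native sequence:

```c
#include <assert.h>
#include <stddef.h>

/* Instruction sizes of the PV stub shown above: two 1-byte pushes,
 * one 6-byte indirect call (ff 15 + 32-bit address), two 1-byte pops. */
static const size_t pv_stub_parts[] = { 1, 1, 6, 1, 1 };

/* Sum the sizes of the instructions that make up a patch site. */
static size_t site_bytes(const size_t *parts, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += parts[i];
    return total;
}

/* Extra bytes relative to the native sequence, in percent. */
static unsigned int overhead_percent(size_t site, size_t native)
{
    assert(native > 0 && site >= native);
    return (unsigned int)((site - native) * 100 / native);
}
```

With a native size of 2, the inline stub (10 bytes) and a five-byte direct call give the 400% and 150% quoted above.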

-hpa

2008-08-26 20:35:04

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 02:15:58PM -0400, Mathieu Desnoyers wrote:
>
> Ok, it might still be caused by paravirt and alternatives instruction
> patching. What if you also do :
>
> alternative_instructions()
>
> + unsigned long flags;
> /* The patching is not fully atomic, so try to avoid local interruptions
> that might execute the to be patched code.
> Other CPUs are not running. */
> stop_nmi();
> #ifdef CONFIG_X86_MCE
> stop_mce();
> #endif
> + local_irq_save(flags);
>
>
> ...
> + local_irq_restore(flags);
> restart_nmi();
> #ifdef CONFIG_X86_MCE
> restart_mce();
> #endif
>
> ?

Hey! These last changes (in addition to the others you mentioned) seem
to be a good shot. I could reboot the guest 8 times and compile several
packages (something which always led to the oops), and I am currently
building two big packages simultaneously. So this is heavy IO.

I will try heavier build tests tomorrow (to regain the confidence in the
vbox+guest kernel that I had with 2.6.25), but I think your
changes go in the right direction.

Here is the diff of what I changed following your hints:

,----[ arch/x86/kernel/alternative.c ]
| --- alternative.c.org 2008-07-13 23:51:29.000000000 +0200
| +++ alternative.c 2008-08-26 21:35:20.000000000 +0200
| @@ -343,6 +343,7 @@
| void alternatives_smp_switch(int smp)
| {
| struct smp_alt_module *mod;
| + unsigned long flags;
|
| #ifdef CONFIG_LOCKDEP
| /*
| @@ -359,7 +360,7 @@
| return;
| BUG_ON(!smp && (num_online_cpus() > 1));
|
| - spin_lock(&smp_alt);
| + spin_lock_irqsave(&smp_alt, flags);
|
| /*
| * Avoid unnecessary switches because it forces JIT based VMs to
| @@ -383,7 +384,7 @@
| mod->text, mod->text_end);
| }
| smp_mode = smp;
| - spin_unlock(&smp_alt);
| + spin_unlock_irqrestore(&smp_alt, flags);
| }
|
| #endif
| @@ -420,6 +421,7 @@
|
| void __init alternative_instructions(void)
| {
| + unsigned long flags;
| /* The patching is not fully atomic, so try to avoid local interruptions
| that might execute the to be patched code.
| Other CPUs are not running. */
| @@ -427,6 +429,7 @@
| #ifdef CONFIG_X86_MCE
| stop_mce();
| #endif
| + local_irq_save(flags);
|
| apply_alternatives(__alt_instructions, __alt_instructions_end);
|
| @@ -465,6 +468,7 @@
| (unsigned long)__smp_locks,
| (unsigned long)__smp_locks_end);
|
| + local_irq_restore(flags);
| restart_nmi();
| #ifdef CONFIG_X86_MCE
| restart_mce();
| @@ -508,33 +512,5 @@
| */
| void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
| {
| - unsigned long flags;
| - char *vaddr;
| - int nr_pages = 2;
| - struct page *pages[2];
| - int i;
| -
| - if (!core_kernel_text((unsigned long)addr)) {
| - pages[0] = vmalloc_to_page(addr);
| - pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
| - } else {
| - pages[0] = virt_to_page(addr);
| - WARN_ON(!PageReserved(pages[0]));
| - pages[1] = virt_to_page(addr + PAGE_SIZE);
| - }
| - BUG_ON(!pages[0]);
| - if (!pages[1])
| - nr_pages = 1;
| - vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
| - BUG_ON(!vaddr);
| - local_irq_save(flags);
| - memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
| - local_irq_restore(flags);
| - vunmap(vaddr);
| - sync_core();
| - /* Could also do a CLFLUSH here to speed up CPU recovery; but
| - that causes hangs on some VIA CPUs. */
| - for (i = 0; i < len; i++)
| - BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
| - return addr;
| + return text_poke_early(addr, opcode, len);
| }
`----
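
For anyone reading the removed text_poke() body above: it maps up to two pages and writes at the offset of addr within the first one. A minimal user-space sketch of that offset arithmetic, assuming 4 KiB pages (the SKETCH_ names and helper functions are illustrative only; the kernel uses PAGE_SIZE and PAGE_MASK, and it maps the second page whenever one exists rather than checking the length):

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL                     /* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))  /* mirrors the kernel's PAGE_MASK */

/* Offset of addr inside its page, i.e. "addr & ~PAGE_MASK" in the diff. */
static unsigned long page_offset_of(uintptr_t addr)
{
    return (unsigned long)(addr & ~SKETCH_PAGE_MASK);
}

/* Non-zero if writing len bytes at addr runs into the following page. */
static int write_spans_two_pages(uintptr_t addr, size_t len)
{
    return len > 0 && page_offset_of(addr) + len > SKETCH_PAGE_SIZE;
}
```

This only shows why the second page can be needed when an instruction sits close to a page boundary.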

So if Luiz and others could also try all 3 of the changes mentioned, maybe we
have a solution. Tomorrow I will also build a new LiveCD/Install ISO
with these patches to see whether the error is gone there too.

> Thanks,
>
> Mathieu

Gerhard


--
What we know is a drop.
What we do not know is an ocean. (Newton)

2008-08-26 20:48:25

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Gerhard Brauer ([email protected]) wrote:
> On Tue, Aug 26, 2008 at 02:15:58PM -0400, Mathieu Desnoyers wrote:
> >
> > Ok, it might still be caused by paravirt and alternatives instruction
> > patching. What if you also do :
> >
> > alternative_instructions()
> >
> > + unsigned long flags;
> > /* The patching is not fully atomic, so try to avoid local interruptions
> > that might execute the to be patched code.
> > Other CPUs are not running. */
> > stop_nmi();
> > #ifdef CONFIG_X86_MCE
> > stop_mce();
> > #endif
> > + local_irq_save(flags);
> >
> >
> > ...
> > + local_irq_restore(flags);
> > restart_nmi();
> > #ifdef CONFIG_X86_MCE
> > restart_mce();
> > #endif
> >
> > ?
>
> Hey! These last changes (in addition to the others you mentioned) seem
> to be a good shot. I could reboot the guest 8 times and compile several
> packages (something which always led to the oops), and I am currently
> building two big packages simultaneously. So this is heavy IO.
>
> I will try heavier build tests tomorrow (to regain the confidence in the
> vbox+guest kernel that I had with 2.6.25), but I think your
> changes go in the right direction.
>

OK, so we have a problem with interrupts coming while we are doing the
alternatives patching.

First thing, I wonder if VirtualBox expects the OS to patch all its
paravirt instructions in one go?

Also, could you then try to:
- revert all those changes
- do this to text_poke_early and text_poke:

  - put the sync_core() within the irq-off critical section
    (test)
  - add a wbinvd(); just after the sync_core() in both functions
    (test).

Thanks,

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-26 21:36:17

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Gerhard Brauer ([email protected]) wrote:
> On Tue, Aug 26, 2008 at 04:48:14PM -0400, Mathieu Desnoyers wrote:
> >
> > OK, so we have a problem with interrupts coming while we are doing the
> > alternatives patching.
> >
> > First thing, I wonder if VirtualBox expects the OS to patch all its
> > paravirt instructions in one go?
> >
> > Also, could you then try to:
> > - revert all those changes
> > - do this to text_poke_early and text_poke:
> >
> >   - put the sync_core() within the irq-off critical section
> >     (test)
>
> Could you please explain in more detail what to change? I don't see where to
> put sync_core(); I could not find this section in either function. (I'm not a developer.)
>

Sure,

First patch to test :

x86 alternative text_poke move sync_core

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
arch/x86/kernel/alternative.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-08-26 17:26:41.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-08-26 17:26:58.000000000 -0400
@@ -488,8 +488,8 @@ void *text_poke_early(void *addr, const
unsigned long flags;
local_irq_save(flags);
memcpy(addr, opcode, len);
- local_irq_restore(flags);
sync_core();
+ local_irq_restore(flags);
/* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */
return addr;
@@ -529,9 +529,9 @@ void *__kprobes text_poke(void *addr, co
BUG_ON(!vaddr);
local_irq_save(flags);
memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
+ sync_core();
local_irq_restore(flags);
vunmap(vaddr);
- sync_core();
/* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */
for (i = 0; i < len; i++)


> > - add a wbinvd(); just after the sync_core() in both functions
> > (test).
>
> Please spell this one out too...
>

Second patch to apply on top of the first one :


x86 alternative text_poke add wbinvd

Add a cache flush instruction before reenabling interrupts in text_poke.

If this works, we could use clflush() (which is sadly buggy on some archs),
which is faster since it only clears a cacheline instead of the entire cache.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
arch/x86/kernel/alternative.c | 2 ++
1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-08-26 17:27:33.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-08-26 17:27:53.000000000 -0400
@@ -489,6 +489,7 @@ void *text_poke_early(void *addr, const
local_irq_save(flags);
memcpy(addr, opcode, len);
sync_core();
+ wbinvd();
local_irq_restore(flags);
/* Could also do a CLFLUSH here to speed up CPU recovery; but
that causes hangs on some VIA CPUs. */
@@ -530,6 +531,7 @@ void *__kprobes text_poke(void *addr, co
local_irq_save(flags);
memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
sync_core();
+ wbinvd();
local_irq_restore(flags);
vunmap(vaddr);
/* Could also do a CLFLUSH here to speed up CPU recovery; but



Thanks,

Mathieu

> > Thanks,
> >
> > Mathieu
>
> Thank you
> Gerhard
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-08-26 21:39:27

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 04:48:14PM -0400, Mathieu Desnoyers wrote:
>
> OK, so we have a problem with interrupts coming while we are doing the
> alternatives patching.
>
> First thing, I wonder if VirtualBox expects the OS to patch all its
> paravirt instructions in one go?
>
> Also, could you then try to:
> - revert all those changes
> - do this to text_poke_early and text_poke:
>
>   - put the sync_core() within the irq-off critical section
>     (test)

Could you please explain in more detail what to change? I don't see where to
put sync_core(); I could not find this section in either function. (I'm not a developer.)

>   - add a wbinvd(); just after the sync_core() in both functions
>     (test).

Please spell this one out too...

> Thanks,
>
> Mathieu

Thank you
Gerhard

2008-08-26 21:52:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Mathieu Desnoyers wrote:
>
> x86 alternative text_poke add wbinvd
>
> Add a cache flush instruction before reenabling interrupts in text_poke.
>
> If this works, we could use clflush() (which is sadly buggy on some archs),
> which is faster since it only clears a cacheline instead of the entire cache.
>

Well, in this case it's VirtualBox we're talking about, a virtual
architecture. It's hard to know what it will do under *any* circumstances.

-hpa

2008-08-27 00:13:36

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, Aug 26, 2008 at 05:35:59PM -0400, Mathieu Desnoyers wrote:
> * Gerhard Brauer ([email protected]) wrote:
>
> First patch to test :
>
> x86 alternative text_poke move sync_core

With this I got the oops again when compiling in the guest. A reboot
afterwards led to the early oops.

> Second patch to apply on top of the first one :
>
> x86 alternative text_poke add wbinvd

With the second patch I get the early oops after "freeing smp". There seems
to be no way to get the guest booted normally (only with replace-paravirt).

So the changes from the other mail had more effect, IMHO.
I can test more tomorrow if you have more ideas.

> Mathieu

Gerhard

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Tue, 26 Aug 2008 22:34:49 +0200
Gerhard Brauer <[email protected]> wrote:

| Hey! These last changes (in addition to the others you mentioned) seem
| to be a good shot. I could reboot the guest 8 times and compile several
| packages (something which always led to the oops), and I am currently
| building two big packages simultaneously. So this is heavy IO.

Yeah, it works for me too and it's good to know that you are doing
additional tests. I'm doing only boot tests... I was testing lots of
kernels and doing additional tests would take a lot of time.

Now, what does this mean? Is VirtualBox issuing interrupts when it
shouldn't or should this section of the code be better protected?

--
Luiz Fernando N. Capitulino

2008-08-27 23:33:39

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

* Luiz Fernando N. Capitulino ([email protected]) wrote:
> Yeah, it works for me too and it's good to know that you are doing
> additional tests. I'm doing only boot tests... I was testing lots of
> kernels and doing additional tests would take a lot of time.
>
> Now, what does this mean? Is VirtualBox issuing interrupts when it
> shouldn't or should this section of the code be better protected?
>

Since this problem appears while we are using a simple memcpy (the
text_poke_early version), but disappears when we disable interrupts for
a longer period around it, I suspect a problem with irq disabling in
VirtualBox.

We could try to add some nsleep() or msleep() calls within text_poke and
text_poke_early before and after the code modification to see if the
problem disappears. If it does, then that would somewhat confirm the
racy irq disable thesis.
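
To illustrate why an interrupt landing in the middle of the modification would matter, here is a user-space sketch only (the function name and buffers are illustrative, and this is not how the kernel patches code): a multi-byte copy that is cut short leaves a site that matches neither the old nor the new instruction bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy only the first `done` bytes of the new code over the old code,
 * as if the memcpy() had been interrupted part-way through.
 * Returns 1 if the site then matches neither the complete old nor the
 * complete new byte sequence (a "torn" instruction), 0 otherwise. */
static int torn_after(unsigned char *site, const unsigned char *old_code,
                      const unsigned char *new_code, size_t len, size_t done)
{
    assert(done <= len);
    memcpy(site, old_code, len);    /* start from the unpatched state */
    memcpy(site, new_code, done);   /* partial patch */
    return memcmp(site, old_code, len) != 0 &&
           memcmp(site, new_code, len) != 0;
}
```

If code running with interrupts supposedly off can still be preempted at such a point, whatever executes the site next sees the torn bytes.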

Mathieu

> --
> Luiz Fernando N. Capitulino

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Wed, 27 Aug 2008 19:33:28 -0400
Mathieu Desnoyers <[email protected]> wrote:

| Since this problem appears while we are using a simple memcpy (the
| text_poke_early version), but disappears when we disable interrupts for
| a longer period around it, I suspect a problem with irq disabling in
| VirtualBox.
|
| We could try to add some nsleep() or msleep() calls within text_poke and
| text_poke_early before and after the code modification to see if the
| problem disappears. If it does, then that would somewhat confirm the
| racy irq disable thesis.

Well, an Ubuntu kernel guy has reported in the VirtualBox ticket[1]
that the oops doesn't happen if he puts a printk() at the crash site.

The funny thing is that someone (who might be a VirtualBox developer)
used the same race argument to say that this is a bug in the kernel.

What concerns me, though, is how VirtualBox can be worth using
in the Linux community if it probably does not work on various distros
(currently Fedora, Ubuntu, Mandriva and ArchLinux).

Thanks for the effort, guys.

[1] http://www.virtualbox.org/ticket/1875

--
Luiz Fernando N. Capitulino

2008-08-28 13:50:27

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Wed, Aug 27, 2008 at 07:33:28PM -0400, Mathieu Desnoyers wrote:
>
> We could try to add some nsleep() or msleep() calls within text_poke and
> text_poke_early before and after the code modification to see if the
> problem disappears. If it does, then that would somewhat confirm the
> racy irq disable thesis.

nsleep isn't known here as a function; the only references I found are
possibly in posix-timers.c.

msleep() is known, but each time I add, for example,
msleep(100);
anywhere in text_poke and/or text_poke_early I get a kernel panic
on boot. Here's a screenshot:
http://users.archlinux.de/~gerbra/tmp/2008-08-28-132337_724x456_scrot.png

I also tried working with the changes we last made in isolation, but it
seems that only the 3 changes together work.
I also tried to go back to older versions of alternative.c referenced
in:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.26.y.git;a=history;f=arch/x86/kernel/alternative.c;h=65c7857a90ddfc6ff084c6817baba045ced0ad71;hb=v2.6.26

But with my limited knowledge I ran into too many errors.

So, do you have any further ideas, or code that I/we could test?
Or (maybe I'm naive) are the "3 changes" we made ready to go into the kernel
without harming something more important than VirtualBox?

> Mathieu

Gerhard

2008-08-31 09:29:35

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Thu, Aug 28, 2008 at 10:30:13AM -0300, Luiz Fernando N. Capitulino wrote:
> On Wed, 27 Aug 2008 19:33:28 -0400
> Mathieu Desnoyers <[email protected]> wrote:
> |
> | Since this problem appears while we are using a simple memcpy (the
> | text_poke_early version), but disappears when we disable interrupts for
> | a longer period around it, I suspect a problem with irq disabling in
> | VirtualBox.
> |
> | We could try to add some nsleep() or msleep() calls within text_poke and
> | text_poke_early before and after the code modification to see if the
> | problem disappears. If it does, then that would somewhat confirm the
> | racy irq disable thesis.
>
> Well, an Ubuntu kernel guy has reported in the VirtualBox ticket[1]
> that the oops doesn't happen if he puts a printk() at the crash site.
>
> The funny thing is that someone (who might be a VirtualBox developer)
> used the same race argument to say that this is a bug in the kernel.
>
> What concerns me, though, is how VirtualBox can be worth using
> in the Linux community if it probably does not work on various distros
> (currently Fedora, Ubuntu, Mandriva and ArchLinux).
>
> Thanks for the effort, guys.
>
> [1] http://www.virtualbox.org/ticket/1875

OK, some news from the ArchLinux side:
Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
My virtual machines now always boot fine, and I have one
confirmation of this from a user.

The kernel upgrade does not solve the kernel panic while working in the VM
under heavy disk IO. I tested and could reproduce this by untarring 2
big files in separate dirs: bsdtar -x -f VirtualBox-1.6.2-OSE.tar.bz2.
Doing this simultaneously always crashed the VM.
Screenshot:
http://users.archlinux.de/~gerbra/tmp/2008-08-31-110449_724x456_scrot.png

This heavy-IO oops does not occur under 2.6.26.2 when using the
"3-changes-patch" against alternative.c, which we tested in the
other mails. There must be something irq-related that the
3-changes-patch fixes and that was not fixed in 2.6.26.3.
On the other hand: I had never stressed a VM like this before
researching this problem. So it could also be that the heavy-IO
problem is a totally separate problem from the one we're talking about here.
Doing my "normal" work in the VM now (it's my devel VM for compiling and
testing), I have not had this IO oops so far.

We use a mostly unpatched kernel as distribution kernel.

So, a short summary from my side:
a) With the "3-changes-patch" I get a rock-solid VM.
b) 2.6.26.2 has the early oops on boot, and the IO oops on the occasions
when it does boot.
c) 2.6.26.3 has only the heavy-IO oops.

I'll try a fresh VM, where I will test:
a) Using SATA controller emulation as the bus (currently I have ide(piix3)).
b) Using different filesystems (with 2.6.26.2 the early oops and heavy-IO
oops could be reproduced with any filesystem).


Regards
Gerhard

2008-08-31 13:28:42

by Stefan Lippers-Hollmann

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

Hi

On Sunday, 31 August 2008, Gerhard Brauer wrote:
[...]
> OK, some news from the ArchLinux side:
> Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
> this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
> My virtual machines now always boot fine, and I have one
> confirmation of this from a user.

Sorry, I can't confirm this here on Debian unstable (with virtualbox-ose
1.6.2 or 1.6.4). Are you sure that other configuration options didn't
change between the different kernel versions? Preemption and paravirt can
seriously influence the probability of the early boot panic, without really
avoiding it altogether.

In fact I still get the same issues after installing
ftp://ftp5.gwdg.de/pub/linux/archlinux/core/os/i686/kernel26-2.6.26.3-1-i686.pkg.tar.gz
into the test vm using virtualbox-ose 1.6.4.

Regards
Stefan Lippers-Hollmann

2008-08-31 14:03:58

by Gerhard Brauer

[permalink] [raw]
Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Sun, Aug 31, 2008 at 03:28:22PM +0200, Stefan Lippers-Hollmann wrote:
> Hi
>
> On Sunday, 31 August 2008, Gerhard Brauer wrote:
> [...]
> > OK, some news from the ArchLinux side:
> > Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
> > this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
> > My virtual machines now always boot fine, and I have one
> > confirmation of this from a user.
>
> Sorry, I can't confirm this here on Debian unstable (with virtualbox-ose
> 1.6.2 or 1.6.4). Are you sure that other configuration options didn't
> change between the different kernel versions? Preemption and paravirt can
> seriously influence the probability of the early boot panic, without really
> avoiding it altogether.

The only changes between our 2.6.26.2-1 and 2.6.26.3-1 are some minor
framebuffer changes in the config. Looking at the
patchsets between the two versions, I don't see anything that could
explain the difference between working and not working.

> In fact I still get the same issues after installing
> ftp://ftp5.gwdg.de/pub/linux/archlinux/core/os/i686/kernel26-2.6.26.3-1-i686.pkg.tar.gz
> into the test vm using virtualbox-ose 1.6.4.

Hmm, one user also reports that he has no problem when using a vanilla
2.6.26 as guest kernel. But there must be some reason why different
distributions notice a major problem between 2.6.25 and 2.6.26 with
their stock kernels. Although I don't even know whether our few reports here
are very representative...

> Regards
> Stefan Lippers-Hollmann

Gerhard

--
Today is the tomorrow you were afraid of yesterday...

Subject: Re: 2.6.{26.2,27-rc} oops on virtualbox

On Sun, 31 Aug 2008 11:29:23 +0200
Gerhard Brauer <[email protected]> wrote:

| OK, some news from the ArchLinux side:
| Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
| this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
| My virtual machines now always boot fine, and I have one
| confirmation of this from a user.
|
| The kernel upgrade does not solve the kernel panic while working in the VM
| under heavy disk IO. I tested and could reproduce this by untarring 2
| big files in separate dirs: bsdtar -x -f VirtualBox-1.6.2-OSE.tar.bz2.
| Doing this simultaneously always crashed the VM.
| Screenshot:
| http://users.archlinux.de/~gerbra/tmp/2008-08-31-110449_724x456_scrot.png
|
| This heavy-IO oops does not occur under 2.6.26.2 when using the
| "3-changes-patch" against alternative.c, which we tested in the
| other mails. There must be something irq-related that the
| 3-changes-patch fixes and that was not fixed in 2.6.26.3.
| On the other hand: I had never stressed a VM like this before
| researching this problem. So it could also be that the heavy-IO
| problem is a totally separate problem from the one we're talking about here.
| Doing my "normal" work in the VM now (it's my devel VM for compiling and
| testing), I have not had this IO oops so far.

The Mandriva kernel was 2.6.26.3-based at the time I started testing
this, and all my latest tests have been done on 2.6.27-rc4. I think it's
very unusual to have a change in a -stable kernel that is not present in the
latest -rc.

Also note that the CPU settings in the VM have a big influence on the
problem, so I'm pretty sure 2.6.26.3 doesn't fix the problem.


--
Luiz Fernando N. Capitulino