2013-03-04 15:15:46

by Tetsuo Handa

Subject: [3.9-rc1] Bug in bootup code or debug code?

Tetsuo Handa wrote:
> Hello.
>
> I can boot linux-next-20130205 using kernel config at
> http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
>
> Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> kernel config generated by "make allnoconfig", I guess something is wrong with
> code which is executed at very early stage of bootup.
>
> Any clue?
>
> Regards.
>

This bug is not yet fixed as of 3.9-rc1.
Should I run git bisect?

Regards.


2013-03-05 11:31:25

by Tetsuo Handa

Subject: [3.9-rc1 x86/microcode] Bug in CONFIG_MICROCODE_INTEL_EARLY=y

Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Hello.
> >
> > I can boot linux-next-20130205 using kernel config at
> > http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> > But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> > as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
> >
> > Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> > kernel config generated by "make allnoconfig", I guess something is wrong with
> > code which is executed at very early stage of bootup.
> >
> > Any clue?
> >
> > Regards.
> >
>
> This bug is not yet fixed as of 3.9-rc1.
> Should I run git bisect?
>
> Regards.
>
I couldn't find the exact commit because of build failures during bisection,
but I guess that this problem is triggered by the early microcode loading
changes, because it happens only with CONFIG_MICROCODE_INTEL_EARLY=y on an
x86_32 kernel.

Candidate commits are:

086fc8f8 "x86/tlbflush.h: Define __native_flush_tlb_global_irq_disabled()"
e666dfa2 "x86/microcode_intel_lib.c: Early update ucode on Intel's CPU"
a8ebf6d1 "x86/microcode_core_early.c: Define interfaces for early loading ucode"
ec400dde "x86/microcode_intel_early.c: Early update ucode on Intel's CPU"
63b553c6 "x86/head_32.S: Early update ucode in 32-bit"
e6ebf5de "x86/common.c: load ucode in 64 bit or show loading ucode info in 32 bit on AP"
d288e1cf "x86/common.c: Make have_cpuid_p() a global function"
feddc9de "x86/head64.c: Early update ucode in 64-bit"
9cd4d78e "x86/microcode_intel.h: Define functions and macros for early loading ucode"
cd745be8 "x86/mm/init.c: Copy ucode from initrd image to kernel memory"
da76f64e "x86/Kconfig: Make early microcode loading a configuration feature"

I'm using Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz on VMware Player 4.0.5.

Regards.

2013-03-05 15:41:14

by Tetsuo Handa

Subject: [3.9-rc1 x86] Bug in ioremap code?

Another problem

[ 0.021748] Mount-cache hash table entries: 512
[ 0.036341] Disabled fast string operations
[ 0.037760] mce: CPU supports 0 MCE banks
[ 0.039813] Last level iTLB entries: 4KB 128, 2MB 4, 4MB 4
[ 0.039813] Last level dTLB entries: 4KB 256, 2MB 0, 4MB 32
[ 0.039813] tlb_flushall_shift: -1
[ 0.074005] debug: unmapping init [mem 0xc186a000-0xc186efff]
[ 0.077005] ACPI: Core revision 20121018
[ 0.083350] ------------[ cut here ]------------
[ 0.084000] kernel BUG at arch/x86/mm/physaddr.c:79!
[ 0.084000] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 0.084000] Modules linked in:
[ 0.084000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc5-00105-g68d00bb #47 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 0.084000] EIP: 0060:[<c102fa12>] EFLAGS: 00010206 CPU: 0
[ 0.084000] EIP is at __phys_addr+0x42/0x90
[ 0.084000] EAX: 00000000 EBX: 1fef0000 ECX: 0000000c EDX: 00000000
[ 0.084000] ESI: c1657edc EDI: 0000000f EBP: c1657dcc ESP: c1657dc8
[ 0.084000] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 0.084000] CR0: 8005003b CR2: ffe13000 CR3: 01872000 CR4: 000006d0
[ 0.084000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.084000] DR6: ffff0ff0 DR7: 00000400
[ 0.084000] Process swapper/0 (pid: 0, ti=c1656000 task=c1661180 task.ti=c1656000)
[ 0.084000] Stack:
[ 0.084000] c1657e90 c1657dec c102ca3e c1661608 c166f700 00000000 c166f700 c1657df0
[ 0.084000] 00000000 c1657e70 c102ceee c10d7899 00000002 c1655000 00000000 c10d7632
[ 0.084000] 00000001 00000dfc 000002f0 c1657e60 c14ab000 0000000f 00000110 c1657e90
[ 0.084000] Call Trace:
[ 0.084000] [<c102ca3e>] __cpa_process_fault+0x3e/0x80
[ 0.084000] [<c102ceee>] __change_page_attr_set_clr+0x3de/0x6d0
[ 0.084000] [<c10d7899>] ? __purge_vmap_area_lazy+0x2a9/0x360
[ 0.084000] [<c10d7632>] ? __purge_vmap_area_lazy+0x42/0x360
[ 0.084000] [<c10d913c>] ? vm_unmap_aliases+0x2bc/0x300
[ 0.084000] [<c10d8ee4>] ? vm_unmap_aliases+0x64/0x300
[ 0.084000] [<c102d2c5>] change_page_attr_set_clr+0xe5/0x390
[ 0.084000] [<c102d5a2>] _set_memory_wb+0x32/0x40
[ 0.084000] [<c102c46f>] ioremap_change_attr+0xf/0x40
[ 0.084000] [<c102e857>] kernel_map_sync_memtype+0x87/0xf0
[ 0.084000] [<c102c29b>] __ioremap_caller+0x21b/0x2f0
[ 0.084000] [<c103d32a>] ? walk_system_ram_range+0xca/0xf0
[ 0.084000] [<c102c3a3>] ioremap_cache+0x13/0x20
[ 0.084000] [<c149a231>] ? acpi_os_map_memory+0xb6/0x112
[ 0.084000] [<c149a231>] acpi_os_map_memory+0xb6/0x112
[ 0.084000] [<c12ce038>] acpi_tb_verify_table+0x20/0x49
[ 0.084000] [<c12cea67>] acpi_load_tables+0x35/0x13e
[ 0.084000] [<c16c78c6>] acpi_early_init+0x67/0xeb
[ 0.084000] [<c16a7b14>] start_kernel+0x30e/0x319
[ 0.084000] [<c16a7677>] ? repair_env_string+0x5b/0x5b
[ 0.084000] [<c16a7356>] i386_start_kernel+0x12c/0x12f
[ 0.084000] Code: 0c db c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 be cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 8c 0b db
[ 0.084000] EIP: [<c102fa12>] __phys_addr+0x42/0x90 SS:ESP 0068:c1657dc8
[ 0.085033] ---[ end trace bd778c4c9eceaf67 ]---
[ 0.088242] Kernel panic - not syncing: Attempted to kill the idle task!

was found using http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1 and was bisected
to commit 68d00bbe "Merge remote-tracking branch 'origin/x86/mm' into x86/mm2".

Regards.

2013-03-05 18:07:06

by Borislav Petkov

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

+ Dave.

This still says 3.8.0-rc5-00105-g68d00bb. Can you still trigger this
with 3.9-rc1?

And also, this is Linux running as a 32-bit guest in vmware, correct?

On Wed, Mar 06, 2013 at 12:41:10AM +0900, Tetsuo Handa wrote:
> Another problem
>
> [ 0.021748] Mount-cache hash table entries: 512
> [ 0.036341] Disabled fast string operations
> [ 0.037760] mce: CPU supports 0 MCE banks
> [ 0.039813] Last level iTLB entries: 4KB 128, 2MB 4, 4MB 4
> [ 0.039813] Last level dTLB entries: 4KB 256, 2MB 0, 4MB 32
> [ 0.039813] tlb_flushall_shift: -1
> [ 0.074005] debug: unmapping init [mem 0xc186a000-0xc186efff]
> [ 0.077005] ACPI: Core revision 20121018
> [ 0.083350] ------------[ cut here ]------------
> [ 0.084000] kernel BUG at arch/x86/mm/physaddr.c:79!
> [ 0.084000] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> [ 0.084000] Modules linked in:
> [ 0.084000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc5-00105-g68d00bb #47 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
> [ 0.084000] EIP: 0060:[<c102fa12>] EFLAGS: 00010206 CPU: 0
> [ 0.084000] EIP is at __phys_addr+0x42/0x90
> [ 0.084000] EAX: 00000000 EBX: 1fef0000 ECX: 0000000c EDX: 00000000
> [ 0.084000] ESI: c1657edc EDI: 0000000f EBP: c1657dcc ESP: c1657dc8
> [ 0.084000] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 0.084000] CR0: 8005003b CR2: ffe13000 CR3: 01872000 CR4: 000006d0
> [ 0.084000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [ 0.084000] DR6: ffff0ff0 DR7: 00000400
> [ 0.084000] Process swapper/0 (pid: 0, ti=c1656000 task=c1661180 task.ti=c1656000)
> [ 0.084000] Stack:
> [ 0.084000] c1657e90 c1657dec c102ca3e c1661608 c166f700 00000000 c166f700 c1657df0
> [ 0.084000] 00000000 c1657e70 c102ceee c10d7899 00000002 c1655000 00000000 c10d7632
> [ 0.084000] 00000001 00000dfc 000002f0 c1657e60 c14ab000 0000000f 00000110 c1657e90
> [ 0.084000] Call Trace:
> [ 0.084000] [<c102ca3e>] __cpa_process_fault+0x3e/0x80
> [ 0.084000] [<c102ceee>] __change_page_attr_set_clr+0x3de/0x6d0
> [ 0.084000] [<c10d7899>] ? __purge_vmap_area_lazy+0x2a9/0x360
> [ 0.084000] [<c10d7632>] ? __purge_vmap_area_lazy+0x42/0x360
> [ 0.084000] [<c10d913c>] ? vm_unmap_aliases+0x2bc/0x300
> [ 0.084000] [<c10d8ee4>] ? vm_unmap_aliases+0x64/0x300
> [ 0.084000] [<c102d2c5>] change_page_attr_set_clr+0xe5/0x390
> [ 0.084000] [<c102d5a2>] _set_memory_wb+0x32/0x40
> [ 0.084000] [<c102c46f>] ioremap_change_attr+0xf/0x40
> [ 0.084000] [<c102e857>] kernel_map_sync_memtype+0x87/0xf0
> [ 0.084000] [<c102c29b>] __ioremap_caller+0x21b/0x2f0
> [ 0.084000] [<c103d32a>] ? walk_system_ram_range+0xca/0xf0
> [ 0.084000] [<c102c3a3>] ioremap_cache+0x13/0x20
> [ 0.084000] [<c149a231>] ? acpi_os_map_memory+0xb6/0x112
> [ 0.084000] [<c149a231>] acpi_os_map_memory+0xb6/0x112
> [ 0.084000] [<c12ce038>] acpi_tb_verify_table+0x20/0x49
> [ 0.084000] [<c12cea67>] acpi_load_tables+0x35/0x13e
> [ 0.084000] [<c16c78c6>] acpi_early_init+0x67/0xeb
> [ 0.084000] [<c16a7b14>] start_kernel+0x30e/0x319
> [ 0.084000] [<c16a7677>] ? repair_env_string+0x5b/0x5b
> [ 0.084000] [<c16a7356>] i386_start_kernel+0x12c/0x12f
> [ 0.084000] Code: 0c db c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 be cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 8c 0b db
> [ 0.084000] EIP: [<c102fa12>] __phys_addr+0x42/0x90 SS:ESP 0068:c1657dc8
> [ 0.085033] ---[ end trace bd778c4c9eceaf67 ]---
> [ 0.088242] Kernel panic - not syncing: Attempted to kill the idle task!
>
> was found using http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1 and was bisected
> to commit 68d00bbe "Merge remote-tracking branch 'origin/x86/mm' into x86/mm2".
>
> Regards.
>

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-03-05 21:36:38

by Tetsuo Handa

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Borislav Petkov wrote:
> + Dave.
>
> This still says 3.8.0-rc5-00105-g68d00bb. Can you still trigger this
> with 3.9-rc1?

Yes, since I saw it in 3.9-rc1, I ran "git bisect" starting from 3.9-rc1
and below is the output from the first bad commit.

>
> And also, this is Linux running as a 32-bit guest in vmware, correct?
>

2013-03-05 22:15:14

by H. Peter Anvin

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/05/2013 01:28 PM, Tetsuo Handa wrote:
> Borislav Petkov wrote:
>> + Dave.
>>
>> This still says 3.8.0-rc5-00105-g68d00bb. Can you still trigger this
>> with 3.9-rc1?
>
> Yes, since I saw it in 3.9-rc1, I ran "git bisect" starting from 3.9-rc1
> and below is the output from the first bad commit.
>

How reliable is this?

-hpa

2013-03-05 22:26:29

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Just booted a qemu-kvm guest with this .config. It didn't trip over
anything, so I'm looking for some more ACPI tables to feed in to it.

Looking through the code, it looks like this is the __pa() that's
hitting the BUG_ON():

static int __cpa_process_fault(struct cpa_data *cpa, unsigned long
...
	if (within(vaddr, PAGE_OFFSET,
		   PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT))) {
		cpa->numpages = 1;
		cpa->pfn = __pa(vaddr) >> PAGE_SHIFT;
		return 0;
	} else {

The within() check should ensure that we're not doing __pa() on
vmalloc() addresses. So, either somebody managed to remap part of the
kernel identity mapping, or that within() check is failing us somehow.
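
The range check and the translation discussed above can be modeled in
user space. This is only a sketch, not the kernel code: it assumes a 32-bit
kernel with the default 3G/1G split (PAGE_OFFSET = 0xC0000000, 4 KB pages),
and in_direct_mapping() and direct_map_pa() are hypothetical names introduced
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for a 32-bit x86 kernel with the default 3G/1G split. */
#define PAGE_SHIFT  12
#define PAGE_OFFSET UINT32_C(0xC0000000)

/* Half-open range test, modeled on the within() helper quoted above. */
static bool within(uint32_t addr, uint32_t start, uint32_t end)
{
	return addr >= start && addr < end;
}

/*
 * Hypothetical helper: true if vaddr lies inside the direct (lowmem)
 * mapping that covers max_pfn_mapped page frames.
 */
static bool in_direct_mapping(uint32_t vaddr, uint32_t max_pfn_mapped)
{
	uint32_t end = PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT);

	return within(vaddr, PAGE_OFFSET, end);
}

/*
 * Hypothetical helper: the plain subtraction that a direct-map
 * translation such as __pa() boils down to on this layout.
 */
static uint32_t direct_map_pa(uint32_t vaddr)
{
	assert(vaddr >= PAGE_OFFSET);
	return vaddr - PAGE_OFFSET;
}
```

An address that passes in_direct_mapping() is one for which the subtraction
in direct_map_pa() is expected to be meaningful. Whether the page tables
really map that address to the same physical page is a separate question.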

What kind of hardware is this?

2013-03-05 22:44:37

by Borislav Petkov

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On Tue, Mar 05, 2013 at 02:26:12PM -0800, Dave Hansen wrote:
> Just booted a qemu-kvm guest with this .config. It didn't trip over
> anything, so I'm looking for some more ACPI tables to feed in to it.
>
> Looking through the code, it looks like this is the __pa() that's
> hitting the BUG_ON():

Shouldn't it be this one:

#ifdef CONFIG_DEBUG_VIRTUAL
unsigned long __phys_addr(unsigned long x)
{
	unsigned long phys_addr = x - PAGE_OFFSET;
	/* VMALLOC_* aren't constants */
	VIRTUAL_BUG_ON(x < PAGE_OFFSET);
	VIRTUAL_BUG_ON(__vmalloc_start_set && is_vmalloc_addr((void *) x));
	/* max_low_pfn is set early, but not _that_ early */
	if (max_low_pfn) {
		VIRTUAL_BUG_ON((phys_addr >> PAGE_SHIFT) > max_low_pfn);
		BUG_ON(slow_virt_to_phys((void *)x) != phys_addr); <--- **
	}
	return phys_addr;
}
EXPORT_SYMBOL(__phys_addr);
#endif

?

At least this is what the oops says:

> [ 0.083350] ------------[ cut here ]------------
> [ 0.084000] kernel BUG at arch/x86/mm/physaddr.c:79!
> [ 0.084000] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
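
For readers who want to experiment with the order of the checks in the
__phys_addr() snippet above, here is a user-space model. It is a sketch under
assumptions, not kernel code: check_phys_addr() and the enum are hypothetical
names, the vmalloc test is simplified to "at or above vmalloc_start", and
walked_phys stands in for whatever slow_virt_to_phys() would return from the
page-table walk.

```c
#include <stdint.h>

/* Assumed values for a 32-bit x86 kernel with the default 3G/1G split. */
#define PAGE_SHIFT  12
#define PAGE_OFFSET UINT32_C(0xC0000000)

/* Which of the modeled checks rejected the address, if any. */
enum phys_check {
	PHYS_OK,
	PHYS_BELOW_PAGE_OFFSET,
	PHYS_IN_VMALLOC,
	PHYS_ABOVE_MAX_LOW_PFN,
	PHYS_WALK_MISMATCH,
};

/*
 * Hypothetical model of the debug checks, in the same order as the
 * snippet above. vmalloc_start == 0 means "not set yet"; max_low_pfn == 0
 * means "too early to check", as in the original comment.
 */
static enum phys_check check_phys_addr(uint32_t x, uint32_t max_low_pfn,
				       uint32_t vmalloc_start,
				       uint32_t walked_phys)
{
	uint32_t phys_addr = x - PAGE_OFFSET;

	if (x < PAGE_OFFSET)
		return PHYS_BELOW_PAGE_OFFSET;
	if (vmalloc_start && x >= vmalloc_start)
		return PHYS_IN_VMALLOC;
	if (max_low_pfn) {
		if ((phys_addr >> PAGE_SHIFT) > max_low_pfn)
			return PHYS_ABOVE_MAX_LOW_PFN;
		/* Models the line marked with <--- ** above. */
		if (walked_phys != phys_addr)
			return PHYS_WALK_MISMATCH;
	}
	return PHYS_OK;
}
```

In this model the last check fails whenever the page-table walk disagrees
with the simple subtraction, even though the address itself passed every
range test before it.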


Tetsuo says in some of the earlier mails:

"But I get VMware's virtual machine kernel stack fault (hardware reset)
as soon as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the
config above."

> What kind of hardware is this?

[ 0.084000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc5-00105-g68d00bb #47 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform

I asked Tetsuo to confirm but it looks like 32-bit guest running in
vmware.

Ok, before we continue guessing stuff, Tetsuo, can you please explain
how exactly you're triggering this. More specifically, we need .config,
hypervisor version, I'm assuming kernel is 3.9-rc1, Linux is guest/host
etc, etc.

Basically everything one would need to know if one would like to
reproduce this bug in his environment.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-03-05 23:46:48

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Not sure if it's related, but 3.9-rc1 gets into a reboot loop for me. I
assume it's triple-faulting. The last line on the console I see is:

[ 0.085702] SMP alternatives: lockdep: fixing up alternatives
[ 0.086859] smpboot: Booting Node 0, Processors [ 0.086859]
smpboot: Booting Node 0, Processors #1 OK
#1 OK

I bisected it down to the neighborhood of c47f39e. After that, I get
some compile errors:

> arch/x86/built-in.o: In function `generic_load_microcode':
> microcode_intel.c:(.text+0x28195): undefined reference to `microcode_sanity_check'
> microcode_intel.c:(.text+0x281ab): undefined reference to `get_matching_microcode'

Turning off CONFIG_MICROCODE:

-CONFIG_MICROCODE=y
-CONFIG_MICROCODE_INTEL=y
-# CONFIG_MICROCODE_AMD is not set
-CONFIG_MICROCODE_OLD_INTERFACE=y
-CONFIG_MICROCODE_INTEL_LIB=y
-CONFIG_MICROCODE_INTEL_EARLY=y
-CONFIG_MICROCODE_EARLY=y
+# CONFIG_MICROCODE is not set

lets it boot again and fixes those compile errors. This is with this
config:

http://i-love.sakura.ne.jp/tmp/config-3.9-rc1

running under a kvm guest:

qemu-system-x86_64 -append 'earlyprintk=ttyS0,115200,keep
console=ttyS0,115200 nmi_watchdog=0 root=/dev/sda1 bootmem_debug'
-kernel vmlinuz -usbdevice tablet -vnc :1 -net user -net
nic,model=e1000 -hda sarge-amd64-runme-1G.img -m 10240 -smp 2

2013-03-05 23:59:03

by H. Peter Anvin

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Fenghua,

Could you look at this thread and see if you can see what the problem is?

-hpa

On 03/05/2013 03:46 PM, Dave Hansen wrote:
> Not sure if it's related, but 3.9-rc1 gets into a reboot loop for me. I
> assume it's triple-faulting. The last line on the console I see is:
>
> [ 0.085702] SMP alternatives: lockdep: fixing up alternatives
> [ 0.086859] smpboot: Booting Node 0, Processors [ 0.086859]
> smpboot: Booting Node 0, Processors #1 OK
> #1 OK
>
> I bisected it down to the neighborhood of c47f39e. After that, I get
> some compile errors:
>
>> arch/x86/built-in.o: In function `generic_load_microcode':
>> microcode_intel.c:(.text+0x28195): undefined reference to `microcode_sanity_check'
>> microcode_intel.c:(.text+0x281ab): undefined reference to `get_matching_microcode'
>
> Turning off CONFIG_MICROCODE:
>
> -CONFIG_MICROCODE=y
> -CONFIG_MICROCODE_INTEL=y
> -# CONFIG_MICROCODE_AMD is not set
> -CONFIG_MICROCODE_OLD_INTERFACE=y
> -CONFIG_MICROCODE_INTEL_LIB=y
> -CONFIG_MICROCODE_INTEL_EARLY=y
> -CONFIG_MICROCODE_EARLY=y
> +# CONFIG_MICROCODE is not set
>
> lets it boot again and fixes those compile errors. This is with this
> config:
>
> http://i-love.sakura.ne.jp/tmp/config-3.9-rc1
>
> running under a kvm guest:
>
> qemu-system-x86_64 -append 'earlyprintk=ttyS0,115200,keep
> console=ttyS0,115200 nmi_watchdog=0 root=/dev/sda1 bootmem_debug'
> -kernel vmlinuz -usbdevice tablet -vnc :1 -net user -net
> nic,model=e1000 -hda sarge-amd64-runme-1G.img -m 10240 -smp 2
>

2013-03-06 00:22:38

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Could you also add the following to your .config:

CONFIG_ACPI_DEBUG=y

and boot with these on the kernel command-line:

acpi.debug_layer=0xffffffff acpi.debug_level=0x2

I _think_ that'll shed some light on exactly which ACPI table is being
parsed when the BUG_ON() trips. That will hopefully let other folks
reproduce it more easily.

2013-03-06 00:39:45

by Fenghua Yu

Subject: RE: [3.9-rc1 x86] Bug in ioremap code?

>
> Fenghua,
>
> Could you look at this thread and see if you can see what the problem
> is?
>
> -hpa
>
> On 03/05/2013 03:46 PM, Dave Hansen wrote:
> > Not sure if it's related, but 3.9-rc1 gets into a reboot loop for me. I
> > assume it's triple-faulting. The last line on the console I see is:
> >
> > [ 0.085702] SMP alternatives: lockdep: fixing up alternatives
> > [ 0.086859] smpboot: Booting Node 0, Processors [ 0.086859]
> > smpboot: Booting Node 0, Processors #1 OK
> > #1 OK
> >
> > I bisected it down to the neighborhood of c47f39e. After that, I get
> > some compile errors:
> >
> >> arch/x86/built-in.o: In function `generic_load_microcode':
> >> microcode_intel.c:(.text+0x28195): undefined reference to `microcode_sanity_check'
> >> microcode_intel.c:(.text+0x281ab): undefined reference to `get_matching_microcode'

The build errors hit during bisection are there because the early microcode loading patchset is not split into patches that build cleanly at every bisect point.

> >
> > Turning off CONFIG_MICROCODE:

With CONFIG_MICROCODE turned off, none of the microcode code is compiled, including the early microcode loading code. With this configuration, the microcode code should be out of scope for debugging.

> >
> > -CONFIG_MICROCODE=y
> > -CONFIG_MICROCODE_INTEL=y
> > -# CONFIG_MICROCODE_AMD is not set
> > -CONFIG_MICROCODE_OLD_INTERFACE=y
> > -CONFIG_MICROCODE_INTEL_LIB=y
> > -CONFIG_MICROCODE_INTEL_EARLY=y
> > -CONFIG_MICROCODE_EARLY=y
> > +# CONFIG_MICROCODE is not set
> >
> > lets it boot again and fixes those compile errors. This is with this
> > config:
> >
> > http://i-love.sakura.ne.jp/tmp/config-3.9-rc1
> >
> > running under a kvm guest:
> >
> > qemu-system-x86_64 -append 'earlyprintk=ttyS0,115200,keep
> > console=ttyS0,115200 nmi_watchdog=0 root=/dev/sda1 bootmem_debug'
> > -kernel vmlinuz -usbdevice tablet -vnc :1 -net user -net
> > nic,model=e1000 -hda sarge-amd64-runme-1G.img -m 10240 -smp 2
> >

2013-03-06 01:12:19

by Fenghua Yu

Subject: RE: [3.9-rc1 x86] Bug in ioremap code?

> Fenghua,
>
> Could you look at this thread and see if you can see what the problem
> is?
>
> -hpa
>
> On 03/05/2013 03:46 PM, Dave Hansen wrote:
> > Not sure if it's related, but 3.9-rc1 gets into a reboot loop for me. I
> > assume it's triple-faulting. The last line on the console I see is:
> >
> > [ 0.085702] SMP alternatives: lockdep: fixing up alternatives
> > [ 0.086859] smpboot: Booting Node 0, Processors [ 0.086859]
> > smpboot: Booting Node 0, Processors #1 OK
> > #1 OK
> >
> > I bisected it down to the neighborhood of c47f39e. After that, I get
> > some compile errors:
> >
> >> arch/x86/built-in.o: In function `generic_load_microcode':
> >> microcode_intel.c:(.text+0x28195): undefined reference to `microcode_sanity_check'
> >> microcode_intel.c:(.text+0x281ab): undefined reference to `get_matching_microcode'
> >
> > Turning off CONFIG_MICROCODE:
> >
> > -CONFIG_MICROCODE=y
> > -CONFIG_MICROCODE_INTEL=y
> > -# CONFIG_MICROCODE_AMD is not set
> > -CONFIG_MICROCODE_OLD_INTERFACE=y
> > -CONFIG_MICROCODE_INTEL_LIB=y
> > -CONFIG_MICROCODE_INTEL_EARLY=y
> > -CONFIG_MICROCODE_EARLY=y
> > +# CONFIG_MICROCODE is not set
> >
> > lets it boot again and fixes those compile errors. This is with this
> > config:
> >
> > http://i-love.sakura.ne.jp/tmp/config-3.9-rc1
> >
> > running under a kvm guest:
> >
> > qemu-system-x86_64 -append 'earlyprintk=ttyS0,115200,keep
> > console=ttyS0,115200 nmi_watchdog=0 root=/dev/sda1 bootmem_debug'
> > -kernel vmlinuz -usbdevice tablet -vnc :1 -net user -net
> > nic,model=e1000 -hda sarge-amd64-runme-1G.img -m 10240 -smp 2
> >

In the 32-bit kernel boot path, we load microcode before paging is set up. To access global variables by linear address at that stage, we use __pa_symbol(). This causes no problem on native hardware. Is it a problem when booting a 32-bit guest kernel?

Related code is as follows:

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 8e7f655..2f70530 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -144,6 +144,11 @@ ENTRY(startup_32)
movl %eax, pa(olpc_ofw_pgd)
#endif

+#ifdef CONFIG_MICROCODE_EARLY
+ /* Early load ucode on BSP. */
+ call load_ucode_bsp
+#endif
+
/*
* Initialize page tables. This creates a PDE and a set of page
* tables, which are located immediately beyond __brk_base. The variable
@@ -299,6 +304,12 @@ ENTRY(startup_32_smp)
+void __init
+load_ucode_intel_bsp(void)
+{
+ u64 ramdisk_image, ramdisk_size;
+ unsigned long initrd_start_early, initrd_end_early;
+ struct ucode_cpu_info uci;
+#ifdef CONFIG_X86_32
+ struct boot_params *boot_params_p;
+
+ boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
+ ramdisk_image = boot_params_p->hdr.ramdisk_image;
+ ramdisk_size = boot_params_p->hdr.ramdisk_size;
+ initrd_start_early = ramdisk_image;
+ initrd_end_early = initrd_start_early + ramdisk_size;
+
+ _load_ucode_intel_bsp(
+ (struct mc_saved_data *)__pa_symbol(&mc_saved_data),
+ (unsigned long *)__pa_symbol(&mc_saved_in_initrd),
+ initrd_start_early, initrd_end_early, &uci);
+ #else
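
The point about accessing globals before paging is enabled can be shown with
a small user-space model. This is a sketch under assumptions: a non-relocated
32-bit kernel with PAGE_OFFSET = 0xC0000000, where a symbol linked at virtual
address V sits at physical address V - PAGE_OFFSET. pa_symbol() and
symbol_access_addr() are hypothetical stand-ins, not the kernel's
__pa_symbol() macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 3G/1G split for a 32-bit x86 kernel. */
#define PAGE_OFFSET UINT32_C(0xC0000000)

/* Hypothetical stand-in: link-time virtual address -> physical address. */
static uint32_t pa_symbol(uint32_t link_vaddr)
{
	return link_vaddr - PAGE_OFFSET;
}

/*
 * Address that code has to dereference to reach a global symbol.
 * Before paging is on, only the physical address works; afterwards
 * the link-time virtual address is valid.
 */
static uint32_t symbol_access_addr(uint32_t link_vaddr, bool paging_enabled)
{
	return paging_enabled ? link_vaddr : pa_symbol(link_vaddr);
}
```

This is why the early loader in the diff above casts the result of
__pa_symbol() back to a pointer before dereferencing it.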

2013-03-06 01:35:03

by H. Peter Anvin

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/05/2013 05:12 PM, Yu, Fenghua wrote:
>
> Before setting page in 32-bit kernel boot, we load microcode. To access global variables in linear addr, we use __pa_symbol(). There is no problem in native. Is this a problem in 32-bit guest kernel boot?
>

Or is this tripping up the new debugging code?

-hpa

2013-03-06 10:18:58

by Tetsuo Handa

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Dave Hansen wrote:
> Could you also add the following to your .config:
>
> CONFIG_ACPI_DEBUG=y
>
> and boot with these on the kernel command-line:
>
> acpi.debug_layer=0xffffffff acpi.debug_level=0x2
>
> I _think_ that'll shed some light on exactly which ACPI table is being
> parsed when the BUG_ON() trips. That will hopefully let other folks
> reproduce it more easily.
>
Using CONFIG_ACPI_DEBUG=y and adding acpi.debug_layer=0xffffffff acpi.debug_level=0x2
changed nothing.

But I found that this bug occurs only when the system has little RAM.

With 892MB RAM where /proc/meminfo would show HighTotal > 0,
this bug does not occur.

HighTotal: 4040 kB
LowTotal: 873960 kB

With 888MB RAM where /proc/meminfo would show HighTotal == 0,
this bug occurs.

[ 0.005852] ------------[ cut here ]------------
[ 0.007043] kernel BUG at arch/x86/mm/physaddr.c:79!
[ 0.008203] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 0.009546] Modules linked in:
[ 0.010303] Pid: 0, comm: swapper/0 Not tainted 3.9.0-rc1 #38 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 0.013023] EIP: 0060:[<c1030082>] EFLAGS: 00210206 CPU: 0
[ 0.014294] EIP is at __phys_addr+0x42/0x90
[ 0.015270] EAX: 00000000 EBX: 376f0000 ECX: 0000000c EDX: 00000000
[ 0.016686] ESI: 00000000 EDI: c1665e90 EBP: c1665dc8 ESP: c1665dc4
[ 0.018161] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 0.019422] CR0: 80050033 CR2: ffe13000 CR3: 01882000 CR4: 000406d0
[ 0.020911] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.022363] DR6: ffff0ff0 DR7: 00000400
[ 0.023242] Process swapper/0 (pid: 0, ti=c1664000 task=c166f140 task.ti=c1664000)
[ 0.024957] Stack:
[ 0.025427] c1665e90 c1665de8 c102d02e c10d7b49 c166f5c8 c167dd00 00000000 c167dd00
[ 0.027391] f72b9bc0 c1665e70 c102d565 c1665e30 c10d7b49 00000002 c1664000 00000000
[ 0.029372] c10d78e2 c1665e18 c167dce8 00000001 c1665e60 c14b7000 c1665e18 000002f0
[ 0.031373] Call Trace:
[ 0.031947] [<c102d02e>] __cpa_process_fault+0x3e/0x80
[ 0.033165] [<c10d7b49>] ? __purge_vmap_area_lazy+0x2a9/0x360
[ 0.034500] [<c102d565>] __change_page_attr_set_clr+0x2c5/0x5b0
[ 0.035879] [<c10d7b49>] ? __purge_vmap_area_lazy+0x2a9/0x360
[ 0.037211] [<c10d78e2>] ? __purge_vmap_area_lazy+0x42/0x360
[ 0.038538] [<c10d9194>] ? vm_unmap_aliases+0x64/0x300
[ 0.039717] [<c102d935>] change_page_attr_set_clr+0xe5/0x390
[ 0.041059] [<c102dc12>] _set_memory_wb+0x32/0x40
[ 0.042143] [<c102ca5f>] ioremap_change_attr+0xf/0x40
[ 0.043330] [<c102eec7>] kernel_map_sync_memtype+0x87/0xf0
[ 0.044610] [<c102c88b>] __ioremap_caller+0x21b/0x2f0
[ 0.045813] [<c103db8a>] ? walk_system_ram_range+0xca/0xf0
[ 0.047072] [<c102c993>] ioremap_cache+0x13/0x20
[ 0.048183] [<c14a5eb1>] ? acpi_os_map_memory+0xb6/0x112
[ 0.049405] [<c14a5eb1>] acpi_os_map_memory+0xb6/0x112
[ 0.050620] [<c12d1af8>] acpi_tb_verify_table+0x20/0x49
[ 0.051840] [<c12d2527>] acpi_load_tables+0x35/0x156
[ 0.053009] [<c16d7be7>] acpi_early_init+0x67/0xeb
[ 0.054117] [<c16b7b17>] start_kernel+0x30e/0x319
[ 0.055203] [<c16b767a>] ? repair_env_string+0x5b/0x5b
[ 0.056422] [<c16b7356>] i386_start_kernel+0x12c/0x12f
[ 0.057599] Code: 0e dc c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 3e cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 4c 0e dc
[ 0.063695] EIP: [<c1030082>] __phys_addr+0x42/0x90 SS:ESP 0068:c1665dc4
[ 0.065336] ---[ end trace ddccf428d5f1e08d ]---
[ 0.066391] Kernel panic - not syncing: Attempted to kill the idle task!
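
The HighTotal condition reported at the top of this message can be checked
mechanically on a booted guest. The following is a minimal sketch, assuming
the /proc/meminfo text has already been read into a string; meminfo_kb() and
lacks_highmem() are hypothetical helpers written for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan /proc/meminfo-style text for a line that
 * starts with "<key>:" and store its value (in kB) in *kb.
 * Returns 0 on success, -1 if the key is missing or malformed.
 */
static int meminfo_kb(const char *text, const char *key, unsigned long *kb)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return sscanf(line + klen + 1, "%lu", kb) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/*
 * Returns 1 if the text reports HighTotal == 0 (the configuration in
 * which the BUG was reported), 0 if highmem is present, and -1 if the
 * HighTotal field is absent.
 */
static int lacks_highmem(const char *text)
{
	unsigned long high_kb;

	if (meminfo_kb(text, "HighTotal", &high_kb) != 0)
		return -1;
	return high_kb == 0;
}
```

Fed the two lines quoted above for the 892MB case, lacks_highmem() returns 0.
A guest in the failing configuration would make it return 1.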

2013-03-06 11:16:59

by H. Peter Anvin

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/06/2013 02:16 AM, Tetsuo Handa wrote:
>
> But I found that this bug occurs only when the system has little RAM.
>
> With 892MB RAM where /proc/meminfo would show HighTotal > 0,
> this bug does not occur.
>
> HighTotal: 4040 kB
> LowTotal: 873960 kB
>
> With 888MB RAM where /proc/meminfo would show HighTotal == 0,
> this bug occurs.
>

OK, this makes sense. I guess we have a bug in the case where we do not
exhaust lowmem.

-hpa

2013-03-06 15:00:23

by Tetsuo Handa

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Borislav Petkov wrote:
> Ok, before we continue guessing stuff, Tetsuo, can you please explain
> how exactly you're triggering this. More specifically, we need .config,
> hypervisor version, I'm assuming kernel is 3.9-rc1, Linux is guest/host
> etc, etc.

I'm using CentOS 6.3 x86_32 guest running on VMware Workstation 6.5.5 for
Windows XP x86_32 host and VMware Player 4.0.5 for Windows 7 x86_64 host.

Kernel version is 3.9-rc1 x86_32. This bug can be triggered only when the
guest has little RAM such that /proc/meminfo reports that HighTotal == 0.
Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-acpi .

I don't know why but changing kernel config to CONFIG_ACPI=n
( http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-noacpi ) solves this bug.
Well, should I run bisection on ACPI code?

Regards.

2013-03-06 15:45:19

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/06/2013 06:58 AM, Tetsuo Handa wrote:
> I don't know why but changing kernel config to CONFIG_ACPI=n
> ( http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-noacpi ) solves this bug.
> Well, should I run bisection on ACPI code?

The ACPI code definitely _triggers_ it. However, if you bisect, I bet
you'll just bisect down to the new debugging check that you're hitting:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a25b9316841c5afa226f8f70a457861b35276a92

So, if you're able to bisect it, you'll need to apply that patch
whenever you're at a bisect point _before_ it was applied. That might
also mean doing some merging because I'm sure the code around there has
changed. IOW, bisecting isn't going to be super-easy.

2013-03-06 17:52:03

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/06/2013 06:58 AM, Tetsuo Handa wrote:
> Borislav Petkov wrote:
>> Ok, before we continue guessing stuff, Tetsuo, can you please explain
>> how exactly you're triggering this. More specifically, we need .config,
>> hypervisor version, I'm assuming kernel is 3.9-rc1, Linux is guest/host
>> etc, etc.
>
> I'm using CentOS 6.3 x86_32 guest running on VMware Workstation 6.5.5 for
> Windows XP x86_32 host and VMware Player 4.0.5 for Windows 7 x86_64 host.
>
> Kernel version is 3.9-rc1 x86_32. This bug can be triggered only when the
> guest has little RAM such that /proc/meminfo reports that HighTotal == 0.
> Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-acpi .
>
> I don't know why but changing kernel config to CONFIG_ACPI=n
> ( http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-noacpi ) solves this bug.
> Well, should I run bisection on ACPI code?

I was able to reproduce and got some better debugging out of this:

[ 0.193170] __cpa_process_fault(c1673e90, e4afa000, 1)
[ 0.208752] max_pfn_mapped: 150528
[ 0.218886] PAGE_OFFSET: c0000000
[ 0.228597] PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT)): e4c00000
[ 0.247837] slow_virt_to_phys(e4afa000): 0

The pte looks to actually _be_ empty:

[ 44.038145] slow_virt_to_phys() pte: 0000000000000000 level: 1

Not sure what's going on in the end, but it does appear this is another
win for the new BUG_ON(). There really does look to be a real bug here.

BTW, the BUG_ON() is proving to be woefully inadequate. We need some
better diagnostic messages out of there, and probably a nice dump of the
pagetable walk too.
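
The arithmetic in the debug output above can be checked by hand. Below is a minimal sketch in C, assuming x86_32 with 4 KiB pages (PAGE_SHIFT of 12) and the 3G/1G split implied by the printed PAGE_OFFSET. The function names are hypothetical and only mirror the expressions in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 4 KiB pages and the PAGE_OFFSET printed in the log. */
#define PAGE_SHIFT  12
#define PAGE_OFFSET UINT32_C(0xc0000000)

/* First virtual address past the direct mapping of max_pfn_mapped pages. */
uint32_t direct_map_end(uint32_t max_pfn_mapped)
{
	/* Reject inputs whose mapping would wrap past 4 GiB. */
	assert(max_pfn_mapped <= (UINT32_MAX - PAGE_OFFSET) >> PAGE_SHIFT);
	return PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT);
}

/* Purely linear translation: virtual address minus PAGE_OFFSET, no checks
 * against the page tables. */
uint32_t linear_pa(uint32_t vaddr)
{
	assert(vaddr >= PAGE_OFFSET);
	return vaddr - PAGE_OFFSET;
}

/* Nonzero when vaddr lies inside [PAGE_OFFSET, direct_map_end). */
int in_direct_map(uint32_t vaddr, uint32_t max_pfn_mapped)
{
	return vaddr >= PAGE_OFFSET && vaddr < direct_map_end(max_pfn_mapped);
}
```

With max_pfn_mapped = 150528 this gives direct_map_end() = 0xe4c00000, matching the log. The faulting address 0xe4afa000 is inside that range and translates linearly to 0x24afa000, whereas slow_virt_to_phys() reported 0. That disagreement is what the new BUG_ON() catches.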

2013-03-06 18:15:55

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

Does anybody know what this business is in __cpa_process_fault()?

> /*
> * Ignore the NULL PTE for kernel identity mapping, as it is expected
> * to have holes.
> * Also set numpages to '1' indicating that we processed cpa req for
> * one virtual address page and its pfn. TBD: numpages can be set based
> * on the initial value and the level returned by lookup_address().
> */

If this is expecting the identity mapping to have holes, then the
BUG_ON() is wrong. But how would it _get_ holes?

2013-03-06 23:19:31

by Dave Hansen

Subject: Re: [3.9-rc1 x86] Bug in ioremap code?

On 03/06/2013 10:15 AM, Dave Hansen wrote:
> Does anybody know what this business is in __cpa_process_fault()?
>
>> /*
>> * Ignore the NULL PTE for kernel identity mapping, as it is expected
>> * to have holes.
>> * Also set numpages to '1' indicating that we processed cpa req for
>> * one virtual address page and its pfn. TBD: numpages can be set based
>> * on the initial value and the level returned by lookup_address().
>> */
>
> If this is expecting the identity mapping to have holes, then the
> BUG_ON() is wrong. But how would it _get_ holes?

The holes in question were part of the PCI space. The ioremap code was
trying to keep the kernel identity map in sync with the mappings that it
had made. But, in the process, it called __pa() on a virtual address
that was part of one of those holes. I'm not _quite_ sure what the
fallout is from trying to sync a nonexistent map, but it surely wasn't
an intended or good thing to be doing.

A patch to fix it should be on the way. Tetsuo, can you apply it and
see if it helps you?
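
To illustrate the failure mode described above (this is not the actual patch), here is a toy model in C. Everything in it is hypothetical: the "page table" is just an array of present flags, and toy_virt_to_phys() stands in for a translation that reports a hole to its caller instead of assuming every direct-map address is backed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12
#define TOY_PAGE_OFFSET UINT32_C(0xc0000000)

/* Toy stand-in for the kernel page tables: one "present" flag per page frame. */
struct toy_direct_map {
	const unsigned char *present;	/* present[pfn] != 0 means mapped */
	uint32_t nr_pfns;		/* number of entries in present[] */
};

/*
 * Translate vaddr only if the toy table really maps it.
 * Returns 0 and stores the physical address on success; returns -1 for a
 * hole or for an address outside the modeled direct map, so the caller can
 * skip that page rather than translate it blindly.
 */
int toy_virt_to_phys(const struct toy_direct_map *map, uint32_t vaddr,
		     uint32_t *paddr)
{
	uint32_t pfn;

	assert(map != NULL && map->present != NULL && paddr != NULL);

	if (vaddr < TOY_PAGE_OFFSET)
		return -1;

	pfn = (vaddr - TOY_PAGE_OFFSET) >> TOY_PAGE_SHIFT;
	if (pfn >= map->nr_pfns || !map->present[pfn])
		return -1;

	*paddr = vaddr - TOY_PAGE_OFFSET;
	return 0;
}
```

In this model, code that wants to keep a second mapping in sync would call toy_virt_to_phys() first and do nothing for pages that come back as holes.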

2013-03-13 13:24:39

by Tetsuo Handa

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Hello.
> >
> > I can boot linux-next-20130205 using kernel config at
> > http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> > But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> > as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
> >
> > Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> > kernel config generated by "make allnoconfig", I guess something is wrong with
> > code which is executed at very early stage of bootup.
> >
> > Any clue?
> >
> > Regards.
> >
>
> This bug is not yet fixed as of 3.9-rc1.
> Should I run git bisect?
>
> Regards.
>

I found the location of "hardware reset" trigger.

It is __pa_symbol(&boot_params) call, for I don't encounter "hardware reset" if
I remove the "//" from below debug patch.

This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .

--- a/arch/x86/kernel/microcode_intel_early.c
+++ b/arch/x86/kernel/microcode_intel_early.c
@@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
 #ifdef CONFIG_X86_32
 	struct boot_params *boot_params_p;

+	//while (1);
 	boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
+	while (1);
 	ramdisk_image = boot_params_p->hdr.ramdisk_image;
 	ramdisk_size = boot_params_p->hdr.ramdisk_size;
 	initrd_start_early = ramdisk_image;

2013-03-13 15:23:07

by Fenghua Yu

Subject: RE: [3.9-rc1] Bug in bootup code or debug code?

> -----Original Message-----
> From: Tetsuo Handa [mailto:[email protected]]
> Sent: Wednesday, March 13, 2013 6:24 AM
> To: Yu, Fenghua; [email protected]
> Cc: [email protected]
> Subject: Re: [3.9-rc1] Bug in bootup code or debug code?
>
> Tetsuo Handa wrote:
> > Tetsuo Handa wrote:
> > > Hello.
> > >
> > > I can boot linux-next-20130205 using kernel config at
> > > http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> > > But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> > > as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
> > >
> > > Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> > > kernel config generated by "make allnoconfig", I guess something is wrong with
> > > code which is executed at very early stage of bootup.
> > >
> > > Any clue?
> > >
> > > Regards.
> > >
> >
> > This bug is not yet fixed as of 3.9-rc1.
> > Should I run git bisect?
> >
> > Regards.
> >
>
> I found the location of "hardware reset" trigger.
>
> It is __pa_symbol(&boot_params) call, for I don't encounter "hardware reset" if
> I remove the "//" from below debug patch.
>
> This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .
>
> --- a/arch/x86/kernel/microcode_intel_early.c
> +++ b/arch/x86/kernel/microcode_intel_early.c
> @@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
> #ifdef CONFIG_X86_32
> struct boot_params *boot_params_p;
>
> + //while (1);
> boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
> + while (1);
> ramdisk_image = boot_params_p->hdr.ramdisk_image;
> ramdisk_size = boot_params_p->hdr.ramdisk_size;
> initrd_start_early = ramdisk_image;

Tetsuo and Dave,

That's the place we suspected was causing the problem.

My question is: how should a global variable be accessed in linear mode under virtualization? __pa_symbol() is not a problem on native hardware.

Thanks.

-Fenghua

2013-03-13 17:44:56

by H. Peter Anvin

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

On 03/13/2013 08:22 AM, Yu, Fenghua wrote:
>>
>> I found the location of "hardware reset" trigger.
>>
>> It is __pa_symbol(&boot_params) call, for I don't encounter "hardware
>> reset" if
>> I remove the "//" from below debug patch.
>>
>> This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .
>>
>> --- a/arch/x86/kernel/microcode_intel_early.c
>> +++ b/arch/x86/kernel/microcode_intel_early.c
>> @@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
>> #ifdef CONFIG_X86_32
>> struct boot_params *boot_params_p;
>>
>> + //while (1);
>> boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
>> + while (1);
>> ramdisk_image = boot_params_p->hdr.ramdisk_image;
>> ramdisk_size = boot_params_p->hdr.ramdisk_size;
>> initrd_start_early = ramdisk_image;
>
> Tetsuo and Dave,
>
> That's the place where we suspected to cause the problem.
>
> My question is: how to access global variable in linear mode in virtualization? __pa_symbol() is not a problem for native.
>

What kind of virtualization are we talking about here? We should not be
running this code under any paravirtualized code path -- this is the
hypervisor's job to take care of this. For HVM, this should just work
the same way.

-hpa

2013-03-13 17:45:56

by H. Peter Anvin

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

This is a CONFIG_DEBUG_VIRTUAL configuration, isn't it?

-hpa

2013-03-13 20:46:09

by Tetsuo Handa

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

H. Peter Anvin wrote:
> On 03/13/2013 08:22 AM, Yu, Fenghua wrote:
> >>
> >> I found the location of "hardware reset" trigger.
> >>
> >> It is __pa_symbol(&boot_params) call, for I don't encounter "hardware
> >> reset" if
> >> I remove the "//" from below debug patch.
> >>
> >> This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .
> >>
> >> --- a/arch/x86/kernel/microcode_intel_early.c
> >> +++ b/arch/x86/kernel/microcode_intel_early.c
> >> @@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
> >> #ifdef CONFIG_X86_32
> >> struct boot_params *boot_params_p;
> >>
> >> + //while (1);
> >> boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
> >> + while (1);
> >> ramdisk_image = boot_params_p->hdr.ramdisk_image;
> >> ramdisk_size = boot_params_p->hdr.ramdisk_size;
> >> initrd_start_early = ramdisk_image;
> >
> > Tetsuo and Dave,
> >
> > That's the place where we suspected to cause the problem.
> >
> > My question is: how to access global variable in linear mode in virtualization? __pa_symbol() is not a problem for native.
> >
>
> What kind of virtualization are we talking about here? We should not be
> running this code under any paravirtualized code path -- this is the
> hypervisor's job to take care of this. For HVM, this should just work
> the same way.
>
> -hpa
>

H. Peter Anvin wrote:
> This is a CONFIG_DEBUG_VIRTUAL configuration, isn't it?

Yes. CONFIG_MICROCODE_INTEL_EARLY=y && CONFIG_64BIT=n && CONFIG_DEBUG_VIRTUAL=y
on VMware Workstation/Player environment.

2013-03-19 22:12:42

by Fenghua Yu

Subject: RE: [3.9-rc1] Bug in bootup code or debug code?

> From: Tetsuo Handa [mailto:[email protected]]
> H. Peter Anvin wrote:
> > On 03/13/2013 08:22 AM, Yu, Fenghua wrote:
> > >>
> > >> I found the location of "hardware reset" trigger.
> > >>
> > > >> It is __pa_symbol(&boot_params) call, for I don't encounter "hardware
> > > >> reset" if
> > > >> I remove the "//" from below debug patch.
> > >>
> > >> This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .
> > >>
> > >> --- a/arch/x86/kernel/microcode_intel_early.c
> > >> +++ b/arch/x86/kernel/microcode_intel_early.c
> > >> @@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
> > >> #ifdef CONFIG_X86_32
> > >> struct boot_params *boot_params_p;
> > >>
> > >> + //while (1);
> > > >> boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
> > >> + while (1);
> > >> ramdisk_image = boot_params_p->hdr.ramdisk_image;
> > >> ramdisk_size = boot_params_p->hdr.ramdisk_size;
> > >> initrd_start_early = ramdisk_image;
> > >
> > > Tetsuo and Dave,
> > >
> > > That's the place where we suspected to cause the problem.
> > >
> > > > My question is: how to access global variable in linear mode in virtualization? __pa_symbol() is not a problem for native.
> > >
> >
> > > What kind of virtualization are we talking about here? We should not be
> > > running this code under any paravirtualized code path -- this is the
> > > hypervisor's job to take care of this. For HVM, this should just work
> > > the same way.
> >
> > -hpa
> >
> H. Peter Anvin wrote:
> > This is a CONFIG_DEBUG_VIRTUAL configuration, isn't it?
>
> Yes. CONFIG_MICROCODE_INTEL_EARLY=y && CONFIG_64BIT=n && CONFIG_DEBUG_VIRTUAL=y
> on VMware Workstation/Player environment.

Tetsuo,

I just now sent out a patch to fix this issue and you are in the list.

Could you please verify if it fixes the issue you saw?

Thanks.

-Fenghua

2013-03-20 14:44:37

by Shaun Ruffell

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

On Tue, Mar 19, 2013 at 10:12:39PM +0000, Yu, Fenghua wrote:
> > From: Tetsuo Handa [mailto:[email protected]]
> > H. Peter Anvin wrote:
> > > This is a CONFIG_DEBUG_VIRTUAL configuration, isn't it?
> >
> > Yes. CONFIG_MICROCODE_INTEL_EARLY=y && CONFIG_64BIT=n &&
> > CONFIG_DEBUG_VIRTUAL=y
> > on VMware Workstation/Player environment.
>
> Tetsuo,
>
> I just now sent out a patch to fix this issue and you are in the list.
>
> Could you please verify if it fixes the issue you saw?

Hi Fenghua,

I ran into the same issue on a test system I use (not a virtual
machine) and went through basically the same process as Dave
Hansen w/bisecting before finding this thread.

Any chance you could send the patch to the mailing list and I could
also throw it on my test system?

Thanks,
Shaun

2013-03-20 16:32:23

by Fenghua Yu

Subject: RE: [3.9-rc1] Bug in bootup code or debug code?

> From: Shaun Ruffell [mailto:[email protected]]
> On Tue, Mar 19, 2013 at 10:12:39PM +0000, Yu, Fenghua wrote:
> > > From: Tetsuo Handa [mailto:[email protected]]
> > > H. Peter Anvin wrote:

> Hi Fenghua,
>
> I ran into the same issue on a test system I use (not a virtual
> machine) and went through basically the same process as Dave
> Hansen w/bisecting before finding this thread.
>
> Any chance you could send the patch to the mailing list and I could
> also throw it on my test system?

Hi, Shaun,

The patch is in tip.git tree now. You can get it from:
http://git.kernel.org/tip/c83a9d5e425d4678b05ca058fec6254f18601474

Please let us know if the patch fixes the issue you saw.

Thanks.

-Fenghua

2013-03-20 17:02:59

by Shaun Ruffell

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

On Wed, Mar 20, 2013 at 04:32:14PM +0000, Yu, Fenghua wrote:
> > From: Shaun Ruffell [mailto:[email protected]]
> > On Tue, Mar 19, 2013 at 10:12:39PM +0000, Yu, Fenghua wrote:
> > > > From: Tetsuo Handa [mailto:[email protected]]
> > > > H. Peter Anvin wrote:
>
> > Hi Fenghua,
> >
> > I ran into the same issue on a test system I use (not a virtual
> > machine) and went through basically the same process as Dave
> > Hansen w/bisecting before finding this thread.
> >
> > Any chance you could send the patch to the mailing list and I could
> > also throw it on my test system?
> Hi, Shaun,
>
> The patch is in tip.git tree now. You can get it from:
> http://git.kernel.org/tip/c83a9d5e425d4678b05ca058fec6254f18601474
>
> Please let us know if the patch fixes the issue you saw.

Thanks for the link. That patch applied on 3.9-rc3 did allow me to boot with my
default kernel config.

Not related to this patch, and not sure it really matters, but FYI: I just
noticed the following warning when building the patched kernel:

WARNING: vmlinux.o(.text+0x2a1a7): Section mismatch in reference from the function apply_microcode_early() to the function .cpuinit.text:print_ucode()
The function apply_microcode_early() references
the function __cpuinit print_ucode().
This is often because apply_microcode_early lacks a __cpuinit
annotation or the annotation of print_ucode is wrong.

Thanks,
Shaun
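
For readers unfamiliar with section mismatch warnings, the mechanism can be sketched in user space. The following is a minimal illustration in C, assuming GCC or Clang on an ELF target. DEMO_CPUINIT and both function names are hypothetical stand-ins for __cpuinit, print_ucode() and apply_microcode_early(); the real macros live in the kernel headers.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the kernel's __cpuinit marker: place the
 * function's code in a separate ELF section instead of plain .text.
 * (Assumes GCC/Clang on an ELF target.)
 */
#define DEMO_CPUINIT __attribute__((section(".demo.cpuinit.text")))

/* Callee lives in the special section, like print_ucode() in the warning. */
int DEMO_CPUINIT demo_print_ucode(int rev)
{
	return rev + 1;
}

/*
 * The caller carries the same annotation, so both functions end up in the
 * same section. Dropping DEMO_CPUINIT here would recreate the situation in
 * the warning: code in .text referencing code in a section that the kernel
 * may discard in some configurations.
 */
int DEMO_CPUINIT demo_apply_microcode_early(int rev)
{
	assert(rev >= 0);
	return demo_print_ucode(rev);
}
```

In user space nothing discards the section, so either variant runs. The sketch only shows where the annotation goes and why caller and callee are expected to agree.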

2013-03-20 17:05:49

by Fenghua Yu

Subject: RE: [3.9-rc1] Bug in bootup code or debug code?

> From: Shaun Ruffell [mailto:[email protected]]
> On Wed, Mar 20, 2013 at 04:32:14PM +0000, Yu, Fenghua wrote:
> > > From: Shaun Ruffell [mailto:[email protected]]
> > > On Tue, Mar 19, 2013 at 10:12:39PM +0000, Yu, Fenghua wrote:
> > > > > From: Tetsuo Handa [mailto:[email protected]]
> > > > > H. Peter Anvin wrote:
> Thanks for the link. That patch applied on 3.9-rc3 did allow me to boot with my
> default kernel config.

Great!

>
> Not related to this patch, and not sure it really matters, but FYI: I just
> noticed the following warning when building the patched kernel:
>
> WARNING: vmlinux.o(.text+0x2a1a7): Section mismatch in reference from the function apply_microcode_early() to the function .cpuinit.text:print_ucode()
> The function apply_microcode_early() references
> the function __cpuinit print_ucode().
> This is often because apply_microcode_early lacks a __cpuinit
> annotation or the annotation of print_ucode is wrong.

Could you please send your .config to me?

Thanks.

-Fenghua

2013-03-20 17:14:37

by Shaun Ruffell

Subject: Re: [3.9-rc1] Bug in bootup code or debug code?

On Wed, Mar 20, 2013 at 05:05:40PM +0000, Yu, Fenghua wrote:
> From: Shaun Ruffell [mailto:[email protected]]
> >
> > Not related to this patch, and not sure it really matters, but FYI: I just
> > noticed the following warning when building the patched kernel:
> >
> > WARNING: vmlinux.o(.text+0x2a1a7): Section mismatch in reference from the function apply_microcode_early() to the function .cpuinit.text:print_ucode()
> > The function apply_microcode_early() references
> > the function __cpuinit print_ucode().
> > This is often because apply_microcode_early lacks a __cpuinit
> > annotation or the annotation of print_ucode is wrong.
>
> Could you please send your .config to me?

Attached.

--
Shaun Ruffell
Digium, Inc. | Linux Kernel Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://www.digium.com & http://www.asterisk.org


Attachments:
(No filename) (872.00 B)
config (77.92 kB)
Subject: [tip:x86/urgent] x86, microcode_intel_early: Mark apply_microcode_early() as cpuinit

Commit-ID: f564c24103f87dc740c1c293c975565ac46b12ef
Gitweb: http://git.kernel.org/tip/f564c24103f87dc740c1c293c975565ac46b12ef
Author: H. Peter Anvin <[email protected]>
AuthorDate: Thu, 21 Mar 2013 17:32:36 -0700
Committer: H. Peter Anvin <[email protected]>
CommitDate: Thu, 21 Mar 2013 17:32:36 -0700

x86, microcode_intel_early: Mark apply_microcode_early() as cpuinit

Add missing __cpuinit annotation to apply_microcode_early().

Reported-by: Shaun Ruffell <[email protected]>
Cc: Fenghua Yu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/microcode_intel_early.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/microcode_intel_early.c b/arch/x86/kernel/microcode_intel_early.c
index 5992ee8..d893e8e 100644
--- a/arch/x86/kernel/microcode_intel_early.c
+++ b/arch/x86/kernel/microcode_intel_early.c
@@ -659,8 +659,8 @@ static inline void __cpuinit print_ucode(struct ucode_cpu_info *uci)
 }
 #endif

-static int apply_microcode_early(struct mc_saved_data *mc_saved_data,
-				 struct ucode_cpu_info *uci)
+static int __cpuinit apply_microcode_early(struct mc_saved_data *mc_saved_data,
+					   struct ucode_cpu_info *uci)
 {
 	struct microcode_intel *mc_intel;
 	unsigned int val[2];