_______________________________________________
LKP mailing list
[email protected]
On 7 November 2014 06:47, LKP <[email protected]> wrote:
> FYI, we noticed the below changes on
>
> https://git.linaro.org/people/ard.biesheuvel/linux-arm efi-for-3.19
> commit aacdce6e880894acb57d71dcb2e3fc61b4ed4e96 ("dmi: add support for SMBIOS 3.0 64-bit entry point")
>
>
> +-----------------------+------------+------------+
> | | 2fa165a26c | aacdce6e88 |
> +-----------------------+------------+------------+
> | boot_successes | 20 | 10 |
> | early-boot-hang | 1 | |
> | boot_failures | 0 | 5 |
> | PANIC:early_exception | 0 | 5 |
> +-----------------------+------------+------------+
>
>
> [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000036fffffff] usable
> [ 0.000000] bootconsole [earlyser0] enabled
> [ 0.000000] NX (Execute Disable) protection: active
> PANIC: early exception 0e rip 10:ffffffff81899e6b error 9 cr2 ffffffffff240000
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-gc5221e6 #1
> [ 0.000000] 0000000000000000 ffffffff82203d30 ffffffff819f0a6e 00000000000003f8
> [ 0.000000] ffffffffff240000 ffffffff82203e18 ffffffff823701b0 ffffffff82511401
> [ 0.000000] 0000000000000000 0000000000000ba3 0000000000000000 ffffffffff240000
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff819f0a6e>] dump_stack+0x4e/0x68
> [ 0.000000] [<ffffffff823701b0>] early_idt_handler+0x90/0xb7
> [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> [ 0.000000] [<ffffffff81899e6b>] ? dmi_table+0x3f/0x94
> [ 0.000000] [<ffffffff81899e42>] ? dmi_table+0x16/0x94
> [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> [ 0.000000] [<ffffffff823c7eff>] dmi_walk_early+0x44/0x69
> [ 0.000000] [<ffffffff823c88a2>] dmi_present+0x180/0x1ff
> [ 0.000000] [<ffffffff823c8ab3>] dmi_scan_machine+0x144/0x191
> [ 0.000000] [<ffffffff82370702>] ? loglevel+0x31/0x31
> [ 0.000000] [<ffffffff82377f52>] setup_arch+0x490/0xc73
> [ 0.000000] [<ffffffff819eef73>] ? printk+0x4d/0x4f
> [ 0.000000] [<ffffffff82370b90>] start_kernel+0x9c/0x43f
> [ 0.000000] [<ffffffff82370120>] ? early_idt_handlers+0x120/0x120
> [ 0.000000] [<ffffffff823704a2>] x86_64_start_reservations+0x2a/0x2c
> [ 0.000000] [<ffffffff823705df>] x86_64_start_kernel+0x13b/0x14a
> [ 0.000000] RIP 0x4
>
This is most puzzling. Could anyone decode the exception?
This looks like the non-EFI path through dmi_scan_machine(), which
calls dmi_present() /after/ calling dmi_smbios3_present(), which
apparently has not found the _SM3_ header tag. Or could the call stack
be inaccurate?
Anyway, it would be good to know the exact type of the platform, and
perhaps we could find out if there is an inadvertent _SM3_ tag
somewhere in the 0xF0000 - 0xFFFFF range?
--
Ard.
On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
> This is most puzzling. Could anyone decode the exception?
> This looks like the non-EFI path through dmi_scan_machine(), which
> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> apparently has not found the _SM3_ header tag. Or could the call stack
> be inaccurate?
>
> Anyway, it would be good to know the exact type of the platform,
It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
> and
> perhaps we could find out if there is an inadvertent _SM3_ tag
> somewhere in the 0xF0000 - 0xFFFFF range?
Sorry, how?
--yliu
On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
> On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
>> This is most puzzling. Could anyone decode the exception?
>> This looks like the non-EFI path through dmi_scan_machine(), which
>> calls dmi_present() /after/ calling dmi_smbios3_present(), which
>> apparently has not found the _SM3_ header tag. Or could the call stack
>> be inaccurate?
>>
>> Anyway, it would be good to know the exact type of the platform,
>
> It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
>
>> and
>> perhaps we could find out if there is an inadvertent _SM3_ tag
>> somewhere in the 0xF0000 - 0xFFFFF range?
>
> Sorry, how?
>
That's not a brand new machine, so I suppose there wouldn't be a
SMBIOS 3.0 header lurking in there.
Anyway, if you are in a position to try things, could you apply this
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
                 memset(buf, 0, 16);
                 for (q = p; q < p + 0x10000; q += 16) {
                         memcpy_fromio(buf + 16, q, 16);
-                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
+                        if (!dmi_present(buf)) {
                                 dmi_available = 1;
                                 dmi_early_unmap(p, 0x10000);
                                 goto out;
and try again? That is the only change that is relevant to the non-EFI
code path which this machine appears to take, so if this fixes things,
that would be valuable information even if it doesn't tell us exactly
what is going wrong.
Thanks,
Ard.
On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
> >> This is most puzzling. Could anyone decode the exception?
> >> This looks like the non-EFI path through dmi_scan_machine(), which
> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> >> apparently has not found the _SM3_ header tag. Or could the call stack
> >> be inaccurate?
> >>
> >> Anyway, it would be good to know the exact type of the platform,
> >
> > It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
> >
> >> and
> >> perhaps we could find out if there is an inadvertent _SM3_ tag
> >> somewhere in the 0xF0000 - 0xFFFFF range?
> >
> > Sorry, how?
> >
>
> That's not a brand new machine, so I suppose there wouldn't be a
> SMBIOS 3.0 header lurking in there.
>
> Anyway, if you are in a position to try things, could you apply this
>
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
>                  memset(buf, 0, 16);
>                  for (q = p; q < p + 0x10000; q += 16) {
>                          memcpy_fromio(buf + 16, q, 16);
> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
> +                        if (!dmi_present(buf)) {
>                                  dmi_available = 1;
>                                  dmi_early_unmap(p, 0x10000);
>                                  goto out;
>
> and try again?
kernel boots perfectly with this patch applied.
--yliu
> That is the only change that is relevant to the non-EFI
> code path which this machine appears to take, so if this fixes things,
> that would be valuable information even if it doesn't tell us exactly
> what is going wrong.
>
> Thanks,
> Ard.
On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
> On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
>> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
>> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
>> >> This is most puzzling. Could anyone decode the exception?
>> >> This looks like the non-EFI path through dmi_scan_machine(), which
>> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
>> >> apparently has not found the _SM3_ header tag. Or could the call stack
>> >> be inaccurate?
>> >>
>> >> Anyway, it would be good to know the exact type of the platform,
>> >
>> > It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
>> >
>> >> and
>> >> perhaps we could find out if there is an inadvertent _SM3_ tag
>> >> somewhere in the 0xF0000 - 0xFFFFF range?
>> >
>> > Sorry, how?
>> >
>>
>> That's not a brand new machine, so I suppose there wouldn't be a
>> SMBIOS 3.0 header lurking in there.
>>
>> Anyway, if you are in a position to try things, could you apply this
>>
>> --- a/drivers/firmware/dmi_scan.c
>> +++ b/drivers/firmware/dmi_scan.c
>> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
>>                  memset(buf, 0, 16);
>>                  for (q = p; q < p + 0x10000; q += 16) {
>>                          memcpy_fromio(buf + 16, q, 16);
>> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
>> +                        if (!dmi_present(buf)) {
>>                                  dmi_available = 1;
>>                                  dmi_early_unmap(p, 0x10000);
>>                                  goto out;
>>
>> and try again?
>
> kernel boots perfectly with this patch applied.
>
> --yliu
>
Thank you! Very useful to know
Sorry to keep you busy, but could you please apply this on top of the
previous patch
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -617,6 +617,8 @@ void __init dmi_scan_machine(void)
                 memset(buf, 0, 16);
                 for (q = p; q < p + 0x10000; q += 16) {
                         memcpy_fromio(buf + 16, q, 16);
+                        if (memcmp(buf, "_SM3_", 5) == 0)
+                                pr_warn("DMI: Ignoring SMBIOS 3.0 header at %p\n", buf);
                         if (!dmi_present(buf)) {
                                 dmi_available = 1;
                                 dmi_early_unmap(p, 0x10000);
and check if there is any output?
Thanks again,
Ard.
On Fri, Nov 07, 2014 at 09:23:56AM +0100, Ard Biesheuvel wrote:
> On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
> > On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
> >> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
> >> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
> >> >> This is most puzzling. Could anyone decode the exception?
> >> >> This looks like the non-EFI path through dmi_scan_machine(), which
> >> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> >> >> apparently has not found the _SM3_ header tag. Or could the call stack
> >> >> be inaccurate?
> >> >>
> >> >> Anyway, it would be good to know the exact type of the platform,
> >> >
> >> > It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
> >> >
> >> >> and
> >> >> perhaps we could find out if there is an inadvertent _SM3_ tag
> >> >> somewhere in the 0xF0000 - 0xFFFFF range?
> >> >
> >> > Sorry, how?
> >> >
> >>
> >> That's not a brand new machine, so I suppose there wouldn't be a
> >> SMBIOS 3.0 header lurking in there.
> >>
> >> Anyway, if you are in a position to try things, could you apply this
> >>
> >> --- a/drivers/firmware/dmi_scan.c
> >> +++ b/drivers/firmware/dmi_scan.c
> >> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
> >>                  memset(buf, 0, 16);
> >>                  for (q = p; q < p + 0x10000; q += 16) {
> >>                          memcpy_fromio(buf + 16, q, 16);
> >> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
> >> +                        if (!dmi_present(buf)) {
> >>                                  dmi_available = 1;
> >>                                  dmi_early_unmap(p, 0x10000);
> >>                                  goto out;
> >>
> >> and try again?
> >
> > kernel boots perfectly with this patch applied.
> >
> > --yliu
> >
>
> Thank you! Very useful to know
>
Sigh, I made a silly error: I specified the wrong commit while testing your
patch. Sorry for that.
I tested it again with your former patch and, sorry, the panic still
happens.
--yliu
> Sorry to keep you busy, but could you please apply this on top of the
> previous patch
>
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -617,6 +617,8 @@ void __init dmi_scan_machine(void)
>                  memset(buf, 0, 16);
>                  for (q = p; q < p + 0x10000; q += 16) {
>                          memcpy_fromio(buf + 16, q, 16);
> +                        if (memcmp(buf, "_SM3_", 5) == 0)
> +                                pr_warn("DMI: Ignoring SMBIOS 3.0 header at %p\n", buf);
>                          if (!dmi_present(buf)) {
>                                  dmi_available = 1;
>                                  dmi_early_unmap(p, 0x10000);
>
>
> and check if there is any output?
>
> Thanks again,
> Ard.
On 7 November 2014 09:46, Yuanhan Liu <[email protected]> wrote:
> On Fri, Nov 07, 2014 at 09:23:56AM +0100, Ard Biesheuvel wrote:
>> On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
>> > On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
>> >> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
>> >> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
>> >> >> This is most puzzling. Could anyone decode the exception?
>> >> >> This looks like the non-EFI path through dmi_scan_machine(), which
>> >> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
>> >> >> apparently has not found the _SM3_ header tag. Or could the call stack
>> >> >> be inaccurate?
>> >> >>
>> >> >> Anyway, it would be good to know the exact type of the platform,
>> >> >
>> >> > It's a Nehalem-EP machine, with 16 CPUs and 12G of memory.
>> >> >
>> >> >> and
>> >> >> perhaps we could find out if there is an inadvertent _SM3_ tag
>> >> >> somewhere in the 0xF0000 - 0xFFFFF range?
>> >> >
>> >> > Sorry, how?
>> >> >
>> >>
>> >> That's not a brand new machine, so I suppose there wouldn't be a
>> >> SMBIOS 3.0 header lurking in there.
>> >>
>> >> Anyway, if you are in a position to try things, could you apply this
>> >>
>> >> --- a/drivers/firmware/dmi_scan.c
>> >> +++ b/drivers/firmware/dmi_scan.c
>> >> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
>> >>                  memset(buf, 0, 16);
>> >>                  for (q = p; q < p + 0x10000; q += 16) {
>> >>                          memcpy_fromio(buf + 16, q, 16);
>> >> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
>> >> +                        if (!dmi_present(buf)) {
>> >>                                  dmi_available = 1;
>> >>                                  dmi_early_unmap(p, 0x10000);
>> >>                                  goto out;
>> >>
>> >> and try again?
>> >
>> > kernel boots perfectly with this patch applied.
>> >
>> > --yliu
>> >
>>
>> Thank you! Very useful to know
>>
>
> Sigh, I made a silly error: I specified the wrong commit while testing your
> patch. Sorry for that.
>
> I tested it again with your former patch and, sorry, the panic still
> happens.
>
> --yliu
>
OK, no worries.
Could you please try the attached patch? On my ARM system, it produces
something like this
====== Decoding _DMI_ header:
5f 44 4d 49 5f 89 62 02 00 c0 8a fe 0c 00 27 cf
====== Remapped SMBIOS table 0xfe8ac000 at ffffff800001e000, size 0x262, num 0xc
====== Processing SMBIOS table entry at ffffff800001e000, type 0x0, length 0x18
====== Processing SMBIOS table entry at ffffff800001e043, type 0x1, length 0x1b
====== Processing SMBIOS table entry at ffffff800001e09d, type 0x2, length 0x11
====== Processing SMBIOS table entry at ffffff800001e105, type 0x3, length 0x18
====== Processing SMBIOS table entry at ffffff800001e155, type 0x4, length 0x2a
====== Processing SMBIOS table entry at ffffff800001e19a, type 0x7, length 0x13
====== Processing SMBIOS table entry at ffffff800001e1b5, type 0x9, length 0x11
====== Processing SMBIOS table entry at ffffff800001e1cf, type 0x10, length 0x17
====== Processing SMBIOS table entry at ffffff800001e1e8, type 0x11, length 0x28
====== Processing SMBIOS table entry at ffffff800001e22e, type 0x13, length 0x1f
====== Processing SMBIOS table entry at ffffff800001e24f, type 0x20, length 0xb
====== Processing SMBIOS table entry at ffffff800001e25c, type 0x7f, length 0x4
SMBIOS 2.7 present.
DMI: ARM Arm Versatile Express/Arm Versatile Express, BIOS 16:20:46 Oct 28 2014
That should help us pinpoint what is going on here.
--
Ard.
On Fri, 2014-11-07 at 08:17 +0100, Ard Biesheuvel wrote:
> This is most puzzling. Could anyone decode the exception?
> This looks like the non-EFI path through dmi_scan_machine(), which
> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> apparently has not found the _SM3_ header tag. Or could the call stack
> be inaccurate?
The code triggered a page fault while trying to access
0xffffffffff240000, caused because the reserved bit was set in the page
table and no page was found. Looks like it jumped through a bogus
pointer.

And yes, the callstack may definitely be wrong - the stack dumper is
just scraping addresses from the stack, as indicated by the '?' symbol.

Yuanhan, what symbol does 0xffffffff81899e6b (the faulting instruction)
translate to?
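
For reference, the "early exception 0e ... error 9" line can be read field by field: vector 0x0e is the x86 page-fault vector, and the error code is a small bit mask. The sketch below splits the code into its architecturally defined bits. The struct and function names are illustrative only and are not kernel APIs. Bit 0 being set normally marks a fault on a present translation, so the reserved-bit flag is probably the informative part here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded x86 page-fault error code (struct and field names are illustrative). */
struct pf_error {
    bool present;      /* bit 0: fault on a present translation, not a missing page */
    bool write;        /* bit 1: access was a write (clear means read) */
    bool user;         /* bit 2: access came from user mode (clear means kernel) */
    bool reserved_bit; /* bit 3: a reserved bit was set in a paging-structure entry */
    bool instr_fetch;  /* bit 4: access was an instruction fetch */
};

/* Split a raw page-fault error code into its individual flags. */
static struct pf_error decode_pf_error(uint32_t code)
{
    struct pf_error e = {
        .present      = (code & (1u << 0)) != 0,
        .write        = (code & (1u << 1)) != 0,
        .user         = (code & (1u << 2)) != 0,
        .reserved_bit = (code & (1u << 3)) != 0,
        .instr_fetch  = (code & (1u << 4)) != 0,
    };
    return e;
}
```

Calling decode_pf_error(9) sets only present and reserved_bit, which would correspond to a kernel-mode data read that hit a reserved bit in a paging-structure entry.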
On Fri, Nov 07, 2014 at 10:03:55AM +0100, Ard Biesheuvel wrote:
> On 7 November 2014 09:46, Yuanhan Liu <[email protected]> wrote:
> > On Fri, Nov 07, 2014 at 09:23:56AM +0100, Ard Biesheuvel wrote:
> >> On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
> >> > On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
> >> >> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
> >> >> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
> >> >> >> This is most puzzling. Could anyone decode the exception?
> >> >> >> This looks like the non-EFI path through dmi_scan_machine(), which
> >> >> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> >> >> >> apparently has not found the _SM3_ header tag. Or could the call stack
> >> >> >> be inaccurate?
> >> >> >>
> >> >> >> Anyway, it would be good to know the exact type of the platform,
> >> >> >
> >> >> > It's a Nehalem-EP machine, with 16 CPUs and 12G memory.
> >> >> >
> >> >> >> and
> >> >> >> perhaps we could find out if there is an inadvertent _SM3_ tag
> >> >> >> somewhere in the 0xF0000 - 0xFFFFF range?
> >> >> >
> >> >> > Sorry, how?
> >> >> >
> >> >>
> >> >> That's not a brand new machine, so I suppose there wouldn't be a
> >> >> SMBIOS 3.0 header lurking in there.
> >> >>
> >> >> Anyway, if you are in a position to try things, could you apply this
> >> >>
> >> >> --- a/drivers/firmware/dmi_scan.c
> >> >> +++ b/drivers/firmware/dmi_scan.c
> >> >> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
> >> >> memset(buf, 0, 16);
> >> >> for (q = p; q < p + 0x10000; q += 16) {
> >> >> memcpy_fromio(buf + 16, q, 16);
> >> >> - if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
> >> >> + if (!dmi_present(buf)) {
> >> >> dmi_available = 1;
> >> >> dmi_early_unmap(p, 0x10000);
> >> >> goto out;
> >> >>
> >> >> and try again?
> >> >
> >> > kernel boots perfectly with this patch applied.
> >> >
> >> > --yliu
> >> >
> >>
> >> Thank you! Very useful to know
> >>
> >
> > Sigh, I made a silly error: I specified the wrong commit while testing your
> > patch. Sorry for that.
> >
> > I tested it again with your former patch and, sorry, the panic still
> > happens.
> >
> > --yliu
> >
>
> OK, no worries.
>
> Could you please try the attached patch? On my ARM system, it produces
> something like this
>
> ====== Decoding _DMI_ header:
> 5f 44 4d 49 5f 89 62 02 00 c0 8a fe 0c 00 27 cf
> ====== Remapped SMBIOS table 0xfe8ac000 at ffffff800001e000, size 0x262, num 0xc
> ====== Processing SMBIOS table entry at ffffff800001e000, type 0x0, length 0x18
> ====== Processing SMBIOS table entry at ffffff800001e043, type 0x1, length 0x1b
> ====== Processing SMBIOS table entry at ffffff800001e09d, type 0x2, length 0x11
> ====== Processing SMBIOS table entry at ffffff800001e105, type 0x3, length 0x18
> ====== Processing SMBIOS table entry at ffffff800001e155, type 0x4, length 0x2a
> ====== Processing SMBIOS table entry at ffffff800001e19a, type 0x7, length 0x13
> ====== Processing SMBIOS table entry at ffffff800001e1b5, type 0x9, length 0x11
> ====== Processing SMBIOS table entry at ffffff800001e1cf, type 0x10, length 0x17
> ====== Processing SMBIOS table entry at ffffff800001e1e8, type 0x11, length 0x28
> ====== Processing SMBIOS table entry at ffffff800001e22e, type 0x13, length 0x1f
> ====== Processing SMBIOS table entry at ffffff800001e24f, type 0x20, length 0xb
> ====== Processing SMBIOS table entry at ffffff800001e25c, type 0x7f, length 0x4
> SMBIOS 2.7 present.
> DMI: ARM Arm Versatile Express/Arm Versatile Express, BIOS 16:20:46 Oct 28 2014
>
> That should help us pinpoint what is going on here.
>
Here is the output:
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] ====== Decoding _DMI_ header:
[ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
[ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
PANIC: early exception 0e rip 10:ffffffff8167aa1a error 9 cr2 ffffffffff240001
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-00008-g4d3a0be #66
[ 0.000000] 0000000000000ba3 ffffffff81bcfd10 ffffffff818010a4 00000000000003f8
[ 0.000000] 000000000000003e ffffffff81bcfdf8 ffffffff81d801b0 617420534f49424d
[ 0.000000] 000000000000001f ffffffffff240000 0000000000000000 ffffffffff240000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff818010a4>] dump_stack+0x46/0x58
[ 0.000000] [<ffffffff81d801b0>] early_idt_handler+0x90/0xb7
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff8167aa1a>] ? dmi_table+0x4a/0xf0
[ 0.000000] [<ffffffff817fa71b>] ? printk+0x61/0x63
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff81dd49dc>] dmi_walk_early+0x6b/0x90
[ 0.000000] [<ffffffff81dd52fc>] dmi_present+0x1b4/0x23f
[ 0.000000] [<ffffffff81dd55ab>] dmi_scan_machine+0x1d4/0x23a
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d883a2>] setup_arch+0x462/0xcc6
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80167>] ? early_idt_handler+0x47/0xb7
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80cf0>] start_kernel+0x97/0x456
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d805ee>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d8072e>] x86_64_start_kernel+0x13e/0x14d
[ 0.000000] RIP 0xba2
--yliu
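
The 16 header bytes in the debug output can be decoded by hand using the offsets that dmi_present() reads (6 for the length, 8 for the base, 12 for the count). The sketch below does that for both headers quoted in this thread. The struct and helper names are made up for illustration and are not the kernel's own. For the x86 header the 32-bit base comes out as 0x8f602000, whereas the "Remapped" line prints 0xffffffff8f602000, so the upper 32 bits do not come from the header itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the 16-byte "_DMI_" anchor (struct name is illustrative). */
struct dmi_anchor {
    uint16_t table_len;   /* offset 6,  little-endian */
    uint32_t table_base;  /* offset 8,  little-endian */
    uint16_t num_structs; /* offset 12, little-endian */
};

/* Read unaligned little-endian values without any signed intermediates. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Sum of the first len bytes modulo 256; a valid anchor sums to 0 over 15 bytes. */
static uint8_t byte_sum(const uint8_t *p, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + p[i]);
    return sum;
}

static struct dmi_anchor decode_dmi_anchor(const uint8_t *buf)
{
    struct dmi_anchor a = {
        .table_len   = rd_le16(buf + 6),
        .table_base  = rd_le32(buf + 8),
        .num_structs = rd_le16(buf + 12),
    };
    return a;
}

/* Header bytes copied from the two debug logs in this thread. */
static const uint8_t x86_hdr[16] = {
    0x5f, 0x44, 0x4d, 0x49, 0x5f, 0x48, 0xa3, 0x0b,
    0x00, 0x20, 0x60, 0x8f, 0x3e, 0x00, 0x25, 0x00,
};
static const uint8_t arm_hdr[16] = {
    0x5f, 0x44, 0x4d, 0x49, 0x5f, 0x89, 0x62, 0x02,
    0x00, 0xc0, 0x8a, 0xfe, 0x0c, 0x00, 0x27, 0xcf,
};
```

Decoding x86_hdr gives length 0xba3 and count 0x3e, matching the "size 0xba3, num 0x3e" in the log, and decoding arm_hdr gives base 0xfe8ac000, length 0x262 and count 0xc.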
On Fri, Nov 07, 2014 at 09:16:02AM +0000, Matt Fleming wrote:
> On Fri, 2014-11-07 at 08:17 +0100, Ard Biesheuvel wrote:
> >
> > This is most puzzling. Could anyone decode the exception?
> > This looks like the non-EFI path through dmi_scan_machine(), which
> > calls dmi_present() /after/ calling dmi_smbios3_present(), which
> > apparently has not found the _SM3_ header tag. Or could the call stack
> > be inaccurate?
>
> The code triggered a page fault while trying to access
> 0xffffffffff240000, caused because the reserved bit was set in the page
> table and no page was found. Looks like it jumped through a bogus
> pointer.
>
> And yes, the callstack may definitely be wrong - the stack dumper is
> just scraping addresses from the stack, as indicated by the '?' symbol.
>
> Yuanhan, what symbol does 0xffffffff81899e6b (the faulting instruction)
> translate to?
I found no System.map for that kernel, so I switched to another kernel.
Here is the new panic dmesg:
PANIC: early exception 0e rip 10:ffffffff8167aa1a error 9 cr2 ffffffffff240001
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-00008-g4d3a0be #66
[ 0.000000] 0000000000000ba3 ffffffff81bcfd10 ffffffff818010a4 00000000000003f8
[ 0.000000] 000000000000003e ffffffff81bcfdf8 ffffffff81d801b0 617420534f49424d
[ 0.000000] 000000000000001f ffffffffff240000 0000000000000000 ffffffffff240000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff818010a4>] dump_stack+0x46/0x58
[ 0.000000] [<ffffffff81d801b0>] early_idt_handler+0x90/0xb7
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff8167aa1a>] ? dmi_table+0x4a/0xf0
[ 0.000000] [<ffffffff817fa71b>] ? printk+0x61/0x63
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
[ 0.000000] [<ffffffff81dd49dc>] dmi_walk_early+0x6b/0x90
[ 0.000000] [<ffffffff81dd52fc>] dmi_present+0x1b4/0x23f
[ 0.000000] [<ffffffff81dd55ab>] dmi_scan_machine+0x1d4/0x23a
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d883a2>] setup_arch+0x462/0xcc6
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80167>] ? early_idt_handler+0x47/0xb7
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80cf0>] start_kernel+0x97/0x456
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81d805ee>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d8072e>] x86_64_start_kernel+0x13e/0x14d
[ 0.000000] RIP 0xba2
The faulting RIP changes to 10:ffffffff8167aa1a, and System.map has these
neighbouring symbols:
ffffffff8167a9d0 t dmi_table
ffffffff8167aac0 T dmi_name_in_vendors
Sorry, I don't know how to dig further.
--yliu
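
The two System.map lines are already enough to place the faulting RIP: it belongs to the last symbol whose address is not above it, and the offset is the difference. The sketch below does that lookup over just the two quoted entries. The struct, table and function names are hypothetical. The result, dmi_table+0x4a with a symbol size of 0xf0, agrees with the "dmi_table+0x4a/0xf0" line in the call trace.

```c
#include <stddef.h>
#include <stdint.h>

/* One System.map entry (struct name is illustrative). */
struct map_sym {
    uint64_t addr;
    const char *name;
};

/* The two entries quoted from System.map, in ascending address order. */
static const struct map_sym map_excerpt[] = {
    { 0xffffffff8167a9d0ULL, "dmi_table" },
    { 0xffffffff8167aac0ULL, "dmi_name_in_vendors" },
};

/*
 * Return the index of the last symbol whose address is <= addr,
 * or -1 if addr lies below the first entry. The table must be sorted.
 */
static int find_symbol(const struct map_sym *tab, size_t n, uint64_t addr)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].addr > addr)
            break;
        best = (int)i;
    }
    return best;
}
```

With the full System.map the same lookup works for any address in the trace. To get from dmi_table+0x4a to a source line, a disassembly of the matching vmlinux would still be needed.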
On 7 November 2014 10:26, Yuanhan Liu <[email protected]> wrote:
> Here is the output:
>
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] ====== Decoding _DMI_ header:
> [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
> [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
OK, so that looks like more type promotion silliness.
Could you apply this, and retry?
On 7 November 2014 10:35, Ard Biesheuvel <[email protected]> wrote:
> On 7 November 2014 10:26, Yuanhan Liu <[email protected]> wrote:
>> Here is the output:
>>
>> [ 0.000000] NX (Execute Disable) protection: active
>> [ 0.000000] ====== Decoding _DMI_ header:
>> [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
>> [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
>
> OK, so that looks like more type promotion silliness.
>
> Could you apply this, and retry?
>
(I hit 'send' by accident)
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -497,7 +497,7 @@ static int __init dmi_present(const u8 *buf)
if (memcmp(buf, "_DMI_", 5) == 0 && dmi_checksum(buf, 15)) {
dmi_num = get_unaligned_le16(buf + 12);
dmi_len = get_unaligned_le16(buf + 6);
- dmi_base = get_unaligned_le32(buf + 8);
+ dmi_base = (u32)get_unaligned_le32(buf + 8);
if (dmi_walk_early(dmi_decode) == 0) {
if (smbios_ver) {
On Fri, 2014-11-07 at 17:26 +0800, Yuanhan Liu wrote:
>
> Here is the output:
>
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] ====== Decoding _DMI_ header:
> [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
> [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
Smells like a sign extension issue. Previously how could dmi_base (u32)
hold 0xffffffff8f602000?
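
To make the suspicion concrete: a 32-bit value with its top bit set widens to 64 bits differently depending on whether it is treated as unsigned or signed on the way. The sketch below only models that effect with two hypothetical helpers. The thread does not show where a signed intermediate would arise in dmi_scan.c, so this is an illustration of the symptom, not a diagnosis.

```c
#include <stdint.h>

/* Intended widening of a 32-bit physical address: zero-extension. */
static uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}

/*
 * Widening through a signed 32-bit intermediate, as happens when the
 * value passes through an int-typed expression first. The conversion
 * to int32_t is implementation-defined before C23 for out-of-range
 * values; mainstream compilers wrap in two's complement, after which
 * the conversion to 64 bits replicates the sign bit.
 */
static uint64_t widen_via_signed(uint32_t v)
{
    return (uint64_t)(int32_t)v;
}
```

For 0x8f602000 the first helper returns 0x000000008f602000 and the second 0xffffffff8f602000, which is the value seen in the "Remapped SMBIOS table" line. Values below 0x80000000 come out the same either way, which would explain why such a bug only shows on machines whose table sits above 2 GiB.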
On 7 November 2014 10:36, Matt Fleming <[email protected]> wrote:
> On Fri, 2014-11-07 at 17:26 +0800, Yuanhan Liu wrote:
>>
>> Here is the output:
>>
>> [ 0.000000] NX (Execute Disable) protection: active
>> [ 0.000000] ====== Decoding _DMI_ header:
>> [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
>> [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
>
> Smells like a sign extension issue. Previously how could dmi_base (u32)
> hold 0xffffffff8f602000?
>
Exactly. And note that we already found (and fixed, or so we thought)
this issue.
I.e., on ARM you get
>>> ====== Remapped SMBIOS table 0xfe8ac000 at ffffff800001e000, size 0x262, num 0xc
which has the top bit set as well, but is handled correctly, whereas
with the original code (i.e., before adding the get_unaligned_le32()),
we were hitting the same problem.
--
Ard.
On Fri, Nov 07, 2014 at 10:35:44AM +0100, Ard Biesheuvel wrote:
> On 7 November 2014 10:26, Yuanhan Liu <[email protected]> wrote:
> > On Fri, Nov 07, 2014 at 10:03:55AM +0100, Ard Biesheuvel wrote:
> >> On 7 November 2014 09:46, Yuanhan Liu <[email protected]> wrote:
> >> > On Fri, Nov 07, 2014 at 09:23:56AM +0100, Ard Biesheuvel wrote:
> >> >> On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
> >> >> > On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
> >> >> >> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
> >> >> >> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
> >> >> >> >> On 7 November 2014 06:47, LKP <[email protected]> wrote:
> >> >> >> >> > FYI, we noticed the below changes on
> >> >> >> >> >
> >> >> >> >> > https://git.linaro.org/people/ard.biesheuvel/linux-arm efi-for-3.19
> >> >> >> >> > commit aacdce6e880894acb57d71dcb2e3fc61b4ed4e96 ("dmi: add support for SMBIOS 3.0 64-bit entry point")
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > +-----------------------+------------+------------+
> >> >> >> >> > | | 2fa165a26c | aacdce6e88 |
> >> >> >> >> > +-----------------------+------------+------------+
> >> >> >> >> > | boot_successes | 20 | 10 |
> >> >> >> >> > | early-boot-hang | 1 | |
> >> >> >> >> > | boot_failures | 0 | 5 |
> >> >> >> >> > | PANIC:early_exception | 0 | 5 |
> >> >> >> >> > +-----------------------+------------+------------+
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000036fffffff] usable
> >> >> >> >> > [ 0.000000] bootconsole [earlyser0] enabled
> >> >> >> >> > [ 0.000000] NX (Execute Disable) protection: active
> >> >> >> >> > PANIC: early exception 0e rip 10:ffffffff81899e6b error 9 cr2 ffffffffff240000
> >> >> >> >> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-gc5221e6 #1
> >> >> >> >> > [ 0.000000] 0000000000000000 ffffffff82203d30 ffffffff819f0a6e 00000000000003f8
> >> >> >> >> > [ 0.000000] ffffffffff240000 ffffffff82203e18 ffffffff823701b0 ffffffff82511401
> >> >> >> >> > [ 0.000000] 0000000000000000 0000000000000ba3 0000000000000000 ffffffffff240000
> >> >> >> >> > [ 0.000000] Call Trace:
> >> >> >> >> > [ 0.000000] [<ffffffff819f0a6e>] dump_stack+0x4e/0x68
> >> >> >> >> > [ 0.000000] [<ffffffff823701b0>] early_idt_handler+0x90/0xb7
> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> >> >> >> >> > [ 0.000000] [<ffffffff81899e6b>] ? dmi_table+0x3f/0x94
> >> >> >> >> > [ 0.000000] [<ffffffff81899e42>] ? dmi_table+0x16/0x94
> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
> >> >> >> >> > [ 0.000000] [<ffffffff823c7eff>] dmi_walk_early+0x44/0x69
> >> >> >> >> > [ 0.000000] [<ffffffff823c88a2>] dmi_present+0x180/0x1ff
> >> >> >> >> > [ 0.000000] [<ffffffff823c8ab3>] dmi_scan_machine+0x144/0x191
> >> >> >> >> > [ 0.000000] [<ffffffff82370702>] ? loglevel+0x31/0x31
> >> >> >> >> > [ 0.000000] [<ffffffff82377f52>] setup_arch+0x490/0xc73
> >> >> >> >> > [ 0.000000] [<ffffffff819eef73>] ? printk+0x4d/0x4f
> >> >> >> >> > [ 0.000000] [<ffffffff82370b90>] start_kernel+0x9c/0x43f
> >> >> >> >> > [ 0.000000] [<ffffffff82370120>] ? early_idt_handlers+0x120/0x120
> >> >> >> >> > [ 0.000000] [<ffffffff823704a2>] x86_64_start_reservations+0x2a/0x2c
> >> >> >> >> > [ 0.000000] [<ffffffff823705df>] x86_64_start_kernel+0x13b/0x14a
> >> >> >> >> > [ 0.000000] RIP 0x4
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> This is most puzzling. Could anyone decode the exception?
> >> >> >> >> This looks like the non-EFI path through dmi_scan_machine(), which
> >> >> >> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
> >> >> >> >> apparently has not found the _SM3_ header tag. Or could the call stack
> >> >> >> >> be inaccurate?
> >> >> >> >>
> >> >> >> >> Anyway, it would be good to know the exact type of the platform,
> >> >> >> >
> >> >> >> > It's a Nehalem-EP machine, with 16 CPUs and 12G memory.
> >> >> >> >
> >> >> >> >> and
> >> >> >> >> perhaps we could find out if there is an inadvertent _SM3_ tag
> >> >> >> >> somewhere in the 0xF0000 - 0xFFFFF range?
> >> >> >> >
> >> >> >> > Sorry, how?
> >> >> >> >
> >> >> >>
> >> >> >> That's not a brand new machine, so I suppose there wouldn't be a
> >> >> >> SMBIOS 3.0 header lurking in there.
> >> >> >>
> >> >> >> Anyway, if you are in a position to try things, could you apply this
> >> >> >>
> >> >> >> --- a/drivers/firmware/dmi_scan.c
> >> >> >> +++ b/drivers/firmware/dmi_scan.c
> >> >> >> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
> >> >> >>                  memset(buf, 0, 16);
> >> >> >>                  for (q = p; q < p + 0x10000; q += 16) {
> >> >> >>                          memcpy_fromio(buf + 16, q, 16);
> >> >> >> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
> >> >> >> +                        if (!dmi_present(buf)) {
> >> >> >>                                  dmi_available = 1;
> >> >> >>                                  dmi_early_unmap(p, 0x10000);
> >> >> >>                                  goto out;
> >> >> >>
> >> >> >> and try again?
> >> >> >
> >> >> > kernel boots perfectly with this patch applied.
> >> >> >
> >> >> > --yliu
> >> >> >
> >> >>
> >> >> Thank you! Very useful to know
> >> >>
> >> >
> >> > Sigh, I made a silly error: I specified the wrong commit while testing your
> >> > patch. Sorry for that.
> >> >
> >> > And I tested it again, with your former patch, sorry, the panic still
> >> > happens.
> >> >
> >> > --yliu
> >> >
> >>
> >> OK, no worries.
> >>
> >> Could you please try the attached patch? On my ARM system, it produces
> >> something like this
> >>
> >> ====== Decoding _DMI_ header:
> >> 5f 44 4d 49 5f 89 62 02 00 c0 8a fe 0c 00 27 cf
> >> ====== Remapped SMBIOS table 0xfe8ac000 at ffffff800001e000, size 0x262, num 0xc
> >> ====== Processing SMBIOS table entry at ffffff800001e000, type 0x0, length 0x18
> >> ====== Processing SMBIOS table entry at ffffff800001e043, type 0x1, length 0x1b
> >> ====== Processing SMBIOS table entry at ffffff800001e09d, type 0x2, length 0x11
> >> ====== Processing SMBIOS table entry at ffffff800001e105, type 0x3, length 0x18
> >> ====== Processing SMBIOS table entry at ffffff800001e155, type 0x4, length 0x2a
> >> ====== Processing SMBIOS table entry at ffffff800001e19a, type 0x7, length 0x13
> >> ====== Processing SMBIOS table entry at ffffff800001e1b5, type 0x9, length 0x11
> >> ====== Processing SMBIOS table entry at ffffff800001e1cf, type 0x10, length 0x17
> >> ====== Processing SMBIOS table entry at ffffff800001e1e8, type 0x11, length 0x28
> >> ====== Processing SMBIOS table entry at ffffff800001e22e, type 0x13, length 0x1f
> >> ====== Processing SMBIOS table entry at ffffff800001e24f, type 0x20, length 0xb
> >> ====== Processing SMBIOS table entry at ffffff800001e25c, type 0x7f, length 0x4
> >> SMBIOS 2.7 present.
> >> DMI: ARM Arm Versatile Express/Arm Versatile Express, BIOS 16:20:46 Oct 28 2014
> >>
> >> That should help us pinpoint what is going on here.
> >>
> >
> > Here is the output:
> >
> > [ 0.000000] NX (Execute Disable) protection: active
> > [ 0.000000] ====== Decoding _DMI_ header:
> > [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
> > [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
>
> OK, so that looks like more type promotion silliness.
>
> Could you apply this, and retry?
Despite the long output like the following, it fixes the hang: the kernel
boots perfectly this time. Is that expected? ;)
....
[ 12.568459] ====== Processing SMBIOS table entry at ffffc900018ee1a2, type 0x8, length 0x9
[ 12.577941] ====== Processing SMBIOS table entry at ffffc900018ee1ba, type 0x8, length 0x9
[ 12.587433] ====== Processing SMBIOS table entry at ffffc900018ee1cf, type 0x8, length 0x9
[ 12.596918] ====== Processing SMBIOS table entry at ffffc900018ee1e4, type 0x8, length 0x9
[ 12.606400] ====== Processing SMBIOS table entry at ffffc900018ee1f9, type 0x8, length 0x9
[ 12.615904] ====== Processing SMBIOS table entry at ffffc900018ee20e, type 0x8, length 0x9
[ 12.625389] ====== Processing SMBIOS table entry at ffffc900018ee22c, type 0x8, length 0x9
[ 12.634871] ====== Processing SMBIOS table entry at ffffc900018ee24a, type 0x8, length 0x9
[ 12.644359] ====== Processing SMBIOS table entry at ffffc900018ee268, type 0x8, length 0x9
[ 12.653842] ====== Processing SMBIOS table entry at ffffc900018ee286, type 0x8, length 0x9
[ 12.663324] ====== Processing SMBIOS table entry at ffffc900018ee2a4, type 0x8, length 0x9
[ 12.672821] ====== Processing SMBIOS table entry at ffffc900018ee2c2, type 0x9, length 0xd
[ 12.682307] ====== Processing SMBIOS table entry at ffffc900018ee2e1, type 0x9, length 0xd
[ 12.691788] ====== Processing SMBIOS table entry at ffffc900018ee300, type 0x9, length 0xd
[ 12.701276] ====== Processing SMBIOS table entry at ffffc900018ee31f, type 0x9, length 0xd
[ 12.710757] ====== Processing SMBIOS table entry at ffffc900018ee33e, type 0xa, length 0x6
[ 12.720241] ====== Processing SMBIOS table entry at ffffc900018ee35c, type 0xa, length 0x6
[ 12.729729] ====== Processing SMBIOS table entry at ffffc900018ee37a, type 0xa, length 0x6
[ 12.739218] ====== Processing SMBIOS table entry at ffffc900018ee3a2, type 0xb, length 0x5
[ 12.748705] ====== Processing SMBIOS table entry at ffffc900018ee3b2, type 0xc, length 0x5
[ 12.758197] ====== Processing SMBIOS table entry at ffffc900018ee3da, type 0xc, length 0x5
[ 12.767687] ====== Processing SMBIOS table entry at ffffc900018ee401, type 0xc, length 0x5
[ 12.777173] ====== Processing SMBIOS table entry at ffffc900018ee429, type 0xc, length 0x5
[ 12.786634] ====== Processing SMBIOS table entry at ffffc900018ee458, type 0xd, length 0x16
[ 12.796220] ====== Processing SMBIOS table entry at ffffc900018ee47f, type 0x18, length 0x5
[ 12.805800] ====== Processing SMBIOS table entry at ffffc900018ee486, type 0x20, length 0x14
[ 12.815483] ====== Processing SMBIOS table entry at ffffc900018ee49c, type 0x10, length 0xf
[ 12.825066] ====== Processing SMBIOS table entry at ffffc900018ee4ad, type 0x13, length 0xf
[ 12.834630] ====== Processing SMBIOS table entry at ffffc900018ee4be, type 0x11, length 0x1b
[ 12.844321] ====== Processing SMBIOS table entry at ffffc900018ee527, type 0x14, length 0x13
[ 12.854000] ====== Processing SMBIOS table entry at ffffc900018ee53c, type 0x11, length 0x1b
[ 12.863688] ====== Processing SMBIOS table entry at ffffc900018ee598, type 0x11, length 0x1b
[ 12.873375] ====== Processing SMBIOS table entry at ffffc900018ee601, type 0x14, length 0x13
...
And there are more of them; if you need it, I can attach the whole dmesg.
--yliu
>
> > PANIC: early exception 0e rip 10:ffffffff8167aa1a error 9 cr2 ffffffffff240001
> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-00008-g4d3a0be #66
> > [ 0.000000] 0000000000000ba3 ffffffff81bcfd10 ffffffff818010a4 00000000000003f8
> > [ 0.000000] 000000000000003e ffffffff81bcfdf8 ffffffff81d801b0 617420534f49424d
> > [ 0.000000] 000000000000001f ffffffffff240000 0000000000000000 ffffffffff240000
> > [ 0.000000] Call Trace:
> > [ 0.000000] [<ffffffff818010a4>] dump_stack+0x46/0x58
> > [ 0.000000] [<ffffffff81d801b0>] early_idt_handler+0x90/0xb7
> > [ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
> > [ 0.000000] [<ffffffff8167aa1a>] ? dmi_table+0x4a/0xf0
> > [ 0.000000] [<ffffffff817fa71b>] ? printk+0x61/0x63
> > [ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
> > [ 0.000000] [<ffffffff81dd4cfc>] ? dmi_format_ids.constprop.9+0x13c/0x13c
> > [ 0.000000] [<ffffffff81dd49dc>] dmi_walk_early+0x6b/0x90
> > [ 0.000000] [<ffffffff81dd52fc>] dmi_present+0x1b4/0x23f
> > [ 0.000000] [<ffffffff81dd55ab>] dmi_scan_machine+0x1d4/0x23a
> > [ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
> > [ 0.000000] [<ffffffff81d883a2>] setup_arch+0x462/0xcc6
> > [ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
> > [ 0.000000] [<ffffffff81d80167>] ? early_idt_handler+0x47/0xb7
> > [ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
> > [ 0.000000] [<ffffffff81d80cf0>] start_kernel+0x97/0x456
> > [ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
> > [ 0.000000] [<ffffffff81d80120>] ? early_idt_handlers+0x120/0x120
> > [ 0.000000] [<ffffffff81d805ee>] x86_64_start_reservations+0x2a/0x2c
> > [ 0.000000] [<ffffffff81d8072e>] x86_64_start_kernel+0x13e/0x14d
> > [ 0.000000] RIP 0xba2
> >
> >
> > --yliu
On 7 November 2014 11:14, Yuanhan Liu <[email protected]> wrote:
> On Fri, Nov 07, 2014 at 10:35:44AM +0100, Ard Biesheuvel wrote:
>> On 7 November 2014 10:26, Yuanhan Liu <[email protected]> wrote:
>> > On Fri, Nov 07, 2014 at 10:03:55AM +0100, Ard Biesheuvel wrote:
>> >> On 7 November 2014 09:46, Yuanhan Liu <[email protected]> wrote:
>> >> > On Fri, Nov 07, 2014 at 09:23:56AM +0100, Ard Biesheuvel wrote:
>> >> >> On 7 November 2014 09:13, Yuanhan Liu <[email protected]> wrote:
>> >> >> > On Fri, Nov 07, 2014 at 08:44:40AM +0100, Ard Biesheuvel wrote:
>> >> >> >> On 7 November 2014 08:37, Yuanhan Liu <[email protected]> wrote:
>> >> >> >> > On Fri, Nov 07, 2014 at 08:17:36AM +0100, Ard Biesheuvel wrote:
>> >> >> >> >> On 7 November 2014 06:47, LKP <[email protected]> wrote:
>> >> >> >> >> > FYI, we noticed the below changes on
>> >> >> >> >> >
>> >> >> >> >> > https://git.linaro.org/people/ard.biesheuvel/linux-arm efi-for-3.19
>> >> >> >> >> > commit aacdce6e880894acb57d71dcb2e3fc61b4ed4e96 ("dmi: add support for SMBIOS 3.0 64-bit entry point")
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > +-----------------------+------------+------------+
>> >> >> >> >> > | | 2fa165a26c | aacdce6e88 |
>> >> >> >> >> > +-----------------------+------------+------------+
>> >> >> >> >> > | boot_successes | 20 | 10 |
>> >> >> >> >> > | early-boot-hang | 1 | |
>> >> >> >> >> > | boot_failures | 0 | 5 |
>> >> >> >> >> > | PANIC:early_exception | 0 | 5 |
>> >> >> >> >> > +-----------------------+------------+------------+
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000036fffffff] usable
>> >> >> >> >> > [ 0.000000] bootconsole [earlyser0] enabled
>> >> >> >> >> > [ 0.000000] NX (Execute Disable) protection: active
>> >> >> >> >> > PANIC: early exception 0e rip 10:ffffffff81899e6b error 9 cr2 ffffffffff240000
>> >> >> >> >> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc2-gc5221e6 #1
>> >> >> >> >> > [ 0.000000] 0000000000000000 ffffffff82203d30 ffffffff819f0a6e 00000000000003f8
>> >> >> >> >> > [ 0.000000] ffffffffff240000 ffffffff82203e18 ffffffff823701b0 ffffffff82511401
>> >> >> >> >> > [ 0.000000] 0000000000000000 0000000000000ba3 0000000000000000 ffffffffff240000
>> >> >> >> >> > [ 0.000000] Call Trace:
>> >> >> >> >> > [ 0.000000] [<ffffffff819f0a6e>] dump_stack+0x4e/0x68
>> >> >> >> >> > [ 0.000000] [<ffffffff823701b0>] early_idt_handler+0x90/0xb7
>> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
>> >> >> >> >> > [ 0.000000] [<ffffffff81899e6b>] ? dmi_table+0x3f/0x94
>> >> >> >> >> > [ 0.000000] [<ffffffff81899e42>] ? dmi_table+0x16/0x94
>> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
>> >> >> >> >> > [ 0.000000] [<ffffffff823c80da>] ? dmi_save_one_device+0x81/0x81
>> >> >> >> >> > [ 0.000000] [<ffffffff823c7eff>] dmi_walk_early+0x44/0x69
>> >> >> >> >> > [ 0.000000] [<ffffffff823c88a2>] dmi_present+0x180/0x1ff
>> >> >> >> >> > [ 0.000000] [<ffffffff823c8ab3>] dmi_scan_machine+0x144/0x191
>> >> >> >> >> > [ 0.000000] [<ffffffff82370702>] ? loglevel+0x31/0x31
>> >> >> >> >> > [ 0.000000] [<ffffffff82377f52>] setup_arch+0x490/0xc73
>> >> >> >> >> > [ 0.000000] [<ffffffff819eef73>] ? printk+0x4d/0x4f
>> >> >> >> >> > [ 0.000000] [<ffffffff82370b90>] start_kernel+0x9c/0x43f
>> >> >> >> >> > [ 0.000000] [<ffffffff82370120>] ? early_idt_handlers+0x120/0x120
>> >> >> >> >> > [ 0.000000] [<ffffffff823704a2>] x86_64_start_reservations+0x2a/0x2c
>> >> >> >> >> > [ 0.000000] [<ffffffff823705df>] x86_64_start_kernel+0x13b/0x14a
>> >> >> >> >> > [ 0.000000] RIP 0x4
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >> This is most puzzling. Could anyone decode the exception?
>> >> >> >> >> This looks like the non-EFI path through dmi_scan_machine(), which
>> >> >> >> >> calls dmi_present() /after/ calling dmi_smbios3_present(), which
>> >> >> >> >> apparently has not found the _SM3_ header tag. Or could the call stack
>> >> >> >> >> be inaccurate?
>> >> >> >> >>
>> >> >> >> >> Anyway, it would be good to know the exact type of the platform,
>> >> >> >> >
>> >> >> >> > It's a Nehalem-EP machine, with 16 CPUs and 12G memory.
>> >> >> >> >
>> >> >> >> >> and
>> >> >> >> >> perhaps we could find out if there is an inadvertent _SM3_ tag
>> >> >> >> >> somewhere in the 0xF0000 - 0xFFFFF range?
>> >> >> >> >
>> >> >> >> > Sorry, how?
>> >> >> >> >
>> >> >> >>
>> >> >> >> That's not a brand new machine, so I suppose there wouldn't be a
>> >> >> >> SMBIOS 3.0 header lurking in there.
>> >> >> >>
>> >> >> >> Anyway, if you are in a position to try things, could you apply this
>> >> >> >>
>> >> >> >> --- a/drivers/firmware/dmi_scan.c
>> >> >> >> +++ b/drivers/firmware/dmi_scan.c
>> >> >> >> @@ -617,7 +617,7 @@ void __init dmi_scan_machine(void)
>> >> >> >>                  memset(buf, 0, 16);
>> >> >> >>                  for (q = p; q < p + 0x10000; q += 16) {
>> >> >> >>                          memcpy_fromio(buf + 16, q, 16);
>> >> >> >> -                        if (!dmi_smbios3_present(buf) || !dmi_present(buf)) {
>> >> >> >> +                        if (!dmi_present(buf)) {
>> >> >> >>                                  dmi_available = 1;
>> >> >> >>                                  dmi_early_unmap(p, 0x10000);
>> >> >> >>                                  goto out;
>> >> >> >>
>> >> >> >> and try again?
>> >> >> >
>> >> >> > kernel boots perfectly with this patch applied.
>> >> >> >
>> >> >> > --yliu
>> >> >> >
>> >> >>
>> >> >> Thank you! Very useful to know
>> >> >>
>> >> >
>> >> > Sigh, I made a silly error: I specified the wrong commit while testing your
>> >> > patch. Sorry for that.
>> >> >
>> >> > And I tested it again, with your former patch, sorry, the panic still
>> >> > happens.
>> >> >
>> >> > --yliu
>> >> >
>> >>
>> >> OK, no worries.
>> >>
>> >> Could you please try the attached patch? On my ARM system, it produces
>> >> something like this
>> >>
>> >> ====== Decoding _DMI_ header:
>> >> 5f 44 4d 49 5f 89 62 02 00 c0 8a fe 0c 00 27 cf
>> >> ====== Remapped SMBIOS table 0xfe8ac000 at ffffff800001e000, size 0x262, num 0xc
>> >> ====== Processing SMBIOS table entry at ffffff800001e000, type 0x0, length 0x18
>> >> ====== Processing SMBIOS table entry at ffffff800001e043, type 0x1, length 0x1b
>> >> ====== Processing SMBIOS table entry at ffffff800001e09d, type 0x2, length 0x11
>> >> ====== Processing SMBIOS table entry at ffffff800001e105, type 0x3, length 0x18
>> >> ====== Processing SMBIOS table entry at ffffff800001e155, type 0x4, length 0x2a
>> >> ====== Processing SMBIOS table entry at ffffff800001e19a, type 0x7, length 0x13
>> >> ====== Processing SMBIOS table entry at ffffff800001e1b5, type 0x9, length 0x11
>> >> ====== Processing SMBIOS table entry at ffffff800001e1cf, type 0x10, length 0x17
>> >> ====== Processing SMBIOS table entry at ffffff800001e1e8, type 0x11, length 0x28
>> >> ====== Processing SMBIOS table entry at ffffff800001e22e, type 0x13, length 0x1f
>> >> ====== Processing SMBIOS table entry at ffffff800001e24f, type 0x20, length 0xb
>> >> ====== Processing SMBIOS table entry at ffffff800001e25c, type 0x7f, length 0x4
>> >> SMBIOS 2.7 present.
>> >> DMI: ARM Arm Versatile Express/Arm Versatile Express, BIOS 16:20:46 Oct 28 2014
>> >>
>> >> That should help us pinpoint what is going on here.
>> >>
>> >
>> > Here is the output:
>> >
>> > [ 0.000000] NX (Execute Disable) protection: active
>> > [ 0.000000] ====== Decoding _DMI_ header:
>> > [ 0.000000] 5f 44 4d 49 5f 48 a3 0b 00 20 60 8f 3e 00 25 00
>> > [ 0.000000] ====== Remapped SMBIOS table 0xffffffff8f602000 at ffffffffff240000, size 0xba3, num 0x3e
>>
>> OK, so that looks like more type promotion silliness.
>>
>> Could you apply this, and retry?
>
> Despite the long output like the following, it fixes the hang: the kernel
> boots perfectly this time. Is that expected? ;)
>
> ....
> [ 12.568459] ====== Processing SMBIOS table entry at ffffc900018ee1a2, type 0x8, length 0x9
> [ 12.577941] ====== Processing SMBIOS table entry at ffffc900018ee1ba, type 0x8, length 0x9
> [ 12.587433] ====== Processing SMBIOS table entry at ffffc900018ee1cf, type 0x8, length 0x9
> [ 12.596918] ====== Processing SMBIOS table entry at ffffc900018ee1e4, type 0x8, length 0x9
> [ 12.606400] ====== Processing SMBIOS table entry at ffffc900018ee1f9, type 0x8, length 0x9
> [ 12.615904] ====== Processing SMBIOS table entry at ffffc900018ee20e, type 0x8, length 0x9
> [ 12.625389] ====== Processing SMBIOS table entry at ffffc900018ee22c, type 0x8, length 0x9
> [ 12.634871] ====== Processing SMBIOS table entry at ffffc900018ee24a, type 0x8, length 0x9
> [ 12.644359] ====== Processing SMBIOS table entry at ffffc900018ee268, type 0x8, length 0x9
> [ 12.653842] ====== Processing SMBIOS table entry at ffffc900018ee286, type 0x8, length 0x9
> [ 12.663324] ====== Processing SMBIOS table entry at ffffc900018ee2a4, type 0x8, length 0x9
> [ 12.672821] ====== Processing SMBIOS table entry at ffffc900018ee2c2, type 0x9, length 0xd
> [ 12.682307] ====== Processing SMBIOS table entry at ffffc900018ee2e1, type 0x9, length 0xd
> [ 12.691788] ====== Processing SMBIOS table entry at ffffc900018ee300, type 0x9, length 0xd
> [ 12.701276] ====== Processing SMBIOS table entry at ffffc900018ee31f, type 0x9, length 0xd
> [ 12.710757] ====== Processing SMBIOS table entry at ffffc900018ee33e, type 0xa, length 0x6
> [ 12.720241] ====== Processing SMBIOS table entry at ffffc900018ee35c, type 0xa, length 0x6
> [ 12.729729] ====== Processing SMBIOS table entry at ffffc900018ee37a, type 0xa, length 0x6
> [ 12.739218] ====== Processing SMBIOS table entry at ffffc900018ee3a2, type 0xb, length 0x5
> [ 12.748705] ====== Processing SMBIOS table entry at ffffc900018ee3b2, type 0xc, length 0x5
> [ 12.758197] ====== Processing SMBIOS table entry at ffffc900018ee3da, type 0xc, length 0x5
> [ 12.767687] ====== Processing SMBIOS table entry at ffffc900018ee401, type 0xc, length 0x5
> [ 12.777173] ====== Processing SMBIOS table entry at ffffc900018ee429, type 0xc, length 0x5
> [ 12.786634] ====== Processing SMBIOS table entry at ffffc900018ee458, type 0xd, length 0x16
> [ 12.796220] ====== Processing SMBIOS table entry at ffffc900018ee47f, type 0x18, length 0x5
> [ 12.805800] ====== Processing SMBIOS table entry at ffffc900018ee486, type 0x20, length 0x14
> [ 12.815483] ====== Processing SMBIOS table entry at ffffc900018ee49c, type 0x10, length 0xf
> [ 12.825066] ====== Processing SMBIOS table entry at ffffc900018ee4ad, type 0x13, length 0xf
> [ 12.834630] ====== Processing SMBIOS table entry at ffffc900018ee4be, type 0x11, length 0x1b
> [ 12.844321] ====== Processing SMBIOS table entry at ffffc900018ee527, type 0x14, length 0x13
> [ 12.854000] ====== Processing SMBIOS table entry at ffffc900018ee53c, type 0x11, length 0x1b
> [ 12.863688] ====== Processing SMBIOS table entry at ffffc900018ee598, type 0x11, length 0x1b
> [ 12.873375] ====== Processing SMBIOS table entry at ffffc900018ee601, type 0x14, length 0x13
>
> ...
>
> And there are more of them; if you need it, I can attach the whole dmesg.
>
Yes, that is expected. Congratulations, we found the bug!
Thanks for helping me out here.
Regards,
Ard.
On Fri, 2014-11-07 at 17:30 +0800, Yuanhan Liu wrote:
>
>
> The address changes to 10:ffffffff8167aa1a, and in the System.map, it has:
>
> ffffffff8167a9d0 t dmi_table
> ffffffff8167aac0 T dmi_name_in_vendors
>
> Sorry, I don't know how to dig further.
Could you build a kernel without Ard's u32 cast fix and do,
  objdump -dr vmlinux > /tmp/1
Then apply Ard's u32 cast fix, and rebuild the kernel and do,
  objdump -dr vmlinux > /tmp/2
Then send me the output of,
  diff -u /tmp/1 /tmp/2