Hi,
Somebody reported the following BUG while testing kdump. Please find the
fix attached.
------------[ cut here ]------------
kernel BUG at arch/i386/kernel/apic.c:447!
invalid opcode: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c100acc3>] Not tainted VLI
EFLAGS: 00010246 (2.6.17-rc4-16M #5)
EIP is at setup_local_APIC+0x20/0x1a3
eax: 00000000 ebx: c4e61f88 ecx: 00000000 edx: 00000020
esi: 00050014 edi: c4e61f88 ebp: 00000000 esp: c4e61f5c
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=c4e60000 task=c1389a10)
Stack: <0>c4e61f88 c133dd44 c13264c0 00000001 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000
00000000 00000000 00000000 c4e60000 00000000 c1000284 c1002342 c12a6260
Call Trace:
<c13264c0> APIC_init_uniprocessor+0xa9/0xd8 <c1000284> init+0x28/0x1f6
<c1002342> ret_from_fork+0x6/0x14 <c100025c> init+0x0/0x1f6
<c100025c> init+0x0/0x1f6 <c1000ae5> kernel_thread_helper+0x5/0xb
Code: 9d 5b 5e c3 a1 18 1c 35 c1 eb e0 56 53 8b 35 30 d0 ff ff a1 20 d0 ff ff c1 e8 18 83 e0 0f 0f a3 05 80 d2 34 c1 19 c0 85 c0 75 08 <0f> 0b bf 01 9b 78 24 c1 c7 05 e0 d0 ff ff ff ff ff ff a1 d0 d0
EIP: [<c100acc3>] setup_local_APIC+0x20/0x1a3 SS:ESP 0068:c4e61f5c
<0>Kernel panic - not syncing: Attempted to kill init!
=============================================================================
o Booting the kdump second kernel fails after a system crash if the second
kernel is UP, acpi=off is given, and the crash occurred on a non-boot CPU.
o The issue is that the MP tables report the boot CPU's LAPIC ID as 0, but
the second kernel is booting on a different processor, so the MP table data
is stale in this context. Hence the apic_id_registered() check fails in
setup_local_APIC() when called from APIC_init_uniprocessor().
o The problem is not seen if ACPI is enabled, because in that case
boot_cpu_physical_apicid is read from the LAPIC.
o The problem is not seen with SMP kernels either, because in that case
boot_cpu_physical_apicid is also read from the LAPIC (smp_boot_cpus()).
o This patch fixes the problem by reading boot_cpu_physical_apicid from
the LAPIC in all cases, which also brings uniformity. Reading from the
LAPIC should also be more reliable than trusting the MP tables. My
understanding is that MP tables are becoming a thing of the past anyway.
Signed-off-by: Vivek Goyal <[email protected]>
---
arch/i386/kernel/apic.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff -puN arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix arch/i386/kernel/apic.c
--- linux-2.6.17-rc4-16M/arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix 2006-05-17 13:27:44.000000000 -0400
+++ linux-2.6.17-rc4-16M-root/arch/i386/kernel/apic.c 2006-05-18 05:11:44.000000000 -0400
@@ -860,12 +860,7 @@ void __init init_apic_mappings(void)
 	printk(KERN_DEBUG "mapped APIC to %08lx (%08lx)\n", APIC_BASE,
 			apic_phys);
 
-	/*
-	 * Fetch the APIC ID of the BSP in case we have a
-	 * default configuration (or the MP table is broken).
-	 */
-	if (boot_cpu_physical_apicid == -1U)
-		boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+	boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
 
 #ifdef CONFIG_X86_IO_APIC
 	{
_
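To make the failure mode described in the changelog concrete, here is a
minimal user-space sketch in C. It is only a model, not kernel code: the
physid_mask_t typedef and the two mask helpers are simplified stand-ins
named after the kernel's phys_cpu_present_map machinery, and
would_hit_bug() is a hypothetical name introduced purely for illustration.
It also assumes APIC IDs small enough to fit in a single unsigned long.

```c
/* Simplified stand-in for the kernel's physical APIC ID bitmap. */
typedef unsigned long physid_mask_t;

/* Build a map that contains only the given physical APIC ID. */
static physid_mask_t physid_mask_of_physid(unsigned int id)
{
	return 1UL << id;
}

/* Test whether the given physical APIC ID is present in the map. */
static int physid_isset(unsigned int id, physid_mask_t map)
{
	return (int)((map >> id) & 1UL);
}

/*
 * Hypothetical helper: returns 1 if the registered-ID check would fail,
 * i.e. the ID the running CPU's LAPIC reports is not in a present map
 * built from the recorded boot CPU ID.
 */
static int would_hit_bug(unsigned int boot_cpu_physical_apicid,
			 unsigned int running_lapic_id)
{
	physid_mask_t present_map =
		physid_mask_of_physid(boot_cpu_physical_apicid);

	return !physid_isset(running_lapic_id, present_map);
}
```

With a stale boot ID of 0 from the MP table and the kdump kernel running
on the CPU whose LAPIC ID is 1, would_hit_bug(0, 1) is true. If the boot
ID is taken from the LAPIC of the running CPU instead, the two values
match and the check passes.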
Vivek Goyal <[email protected]> wrote:
>
> o Booting the kdump second kernel fails after a system crash if the second
> kernel is UP, acpi=off is given, and the crash occurred on a non-boot CPU.
>
> o The issue is that the MP tables report the boot CPU's LAPIC ID as 0, but
> the second kernel is booting on a different processor, so the MP table data
> is stale in this context. Hence the apic_id_registered() check fails in
> setup_local_APIC() when called from APIC_init_uniprocessor().
>
> o The problem is not seen if ACPI is enabled, because in that case
> boot_cpu_physical_apicid is read from the LAPIC.
>
> o The problem is not seen with SMP kernels either, because in that case
> boot_cpu_physical_apicid is also read from the LAPIC (smp_boot_cpus()).
>
> o This patch fixes the problem by reading boot_cpu_physical_apicid from
> the LAPIC in all cases, which also brings uniformity. Reading from the
> LAPIC should also be more reliable than trusting the MP tables. My
> understanding is that MP tables are becoming a thing of the past anyway.
>
Oh dear. The APIC code is a teetering wreck already.
>
> diff -puN arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix arch/i386/kernel/apic.c
> --- linux-2.6.17-rc4-16M/arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix 2006-05-17 13:27:44.000000000 -0400
> +++ linux-2.6.17-rc4-16M-root/arch/i386/kernel/apic.c 2006-05-18 05:11:44.000000000 -0400
> @@ -860,12 +860,7 @@ void __init init_apic_mappings(void)
> printk(KERN_DEBUG "mapped APIC to %08lx (%08lx)\n", APIC_BASE,
> apic_phys);
>
> - /*
> - * Fetch the APIC ID of the BSP in case we have a
> - * default configuration (or the MP table is broken).
> - */
> - if (boot_cpu_physical_apicid == -1U)
> - boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
> + boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
>
I just don't think we can do this sort of thing. The workaround for broken
MP tables is one of those nasty things we've gained through many years of
hard-won real-world experience.
If we just go and toss it away like this, all those machines which used to
work will break and there'll be a sad little dribble of regression reports
which everyone cheerily ignores as usual.
So, sorry, nope. Please find a way to fix kdump while retaining the
broken-MP-table workaround.
On Thu, May 18, 2006 at 12:36:55PM -0700, Andrew Morton wrote:
> I just don't think we can do this sort of thing. The workaround for broken
> MP tables is one of those nasty things we've gained through many years of
> hard-won real-world experience.
>
> If we just go and toss it away like this, all those machines which used to
> work will break and there'll be a sad little dribble of regression reports
> which everyone cheerily ignores as usual.
>
> So, sorry, nope. Please find a way to fix kdump while retaining the
> broken-MP-table workaround.
OK. I thought that if overriding the MP tables works in the SMP case, it
should work in the UP case too. Anyway, here is take 2. It is a little ugly
and hackish, but it limits the impact to UP kernels, and only when
CONFIG_CRASH_DUMP is enabled.
o Booting the kdump second kernel fails after a system crash if the second
kernel is UP, acpi=off is given, and the crash occurred on a non-boot CPU.
o The issue is that the MP tables report the boot CPU's LAPIC ID as 0, but
the second kernel is booting on a different processor, so the MP table data
is stale in this context. Hence the apic_id_registered() check fails in
setup_local_APIC() when called from APIC_init_uniprocessor().
o The problem is not seen if ACPI is enabled, because in that case
boot_cpu_physical_apicid is read from the LAPIC.
o The problem is not seen with SMP kernels either, because in that case
boot_cpu_physical_apicid is also read from the LAPIC (smp_boot_cpus()).
o The problem is fixed by reading boot_cpu_physical_apicid from the LAPIC
if it is a UP kernel and CONFIG_CRASH_DUMP is enabled.
Signed-off-by: Vivek Goyal <[email protected]>
---
arch/i386/kernel/apic.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff -puN arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix-take2 arch/i386/kernel/apic.c
--- linux-2.6.17-rc4-16M/arch/i386/kernel/apic.c~kdump-i386-boot-cpu-physical-apicid-fix-take2 2006-05-18 11:26:45.000000000 -0400
+++ linux-2.6.17-rc4-16M-root/arch/i386/kernel/apic.c 2006-05-18 11:26:45.000000000 -0400
@@ -1341,6 +1341,14 @@ int __init APIC_init_uniprocessor (void)
 
 	connect_bsp_APIC();
 
+	/*
+	 * Hack: In case of kdump, after a crash, kernel might be booting
+	 * on a cpu with non-zero lapic id. But boot_cpu_physical_apicid
+	 * might be zero if read from MP tables. Get it from LAPIC.
+	 */
+#ifdef CONFIG_CRASH_DUMP
+	boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
+#endif
 	phys_cpu_present_map = physid_mask_of_physid(boot_cpu_physical_apicid);
 
 	setup_local_APIC();
_
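For readers comparing the two versions of the fix, the difference on the
UP init path can be modelled roughly as below. This is a hedged user-space
sketch in C, not kernel code: boot_apicid(), the fix_version enum, the
NO_APICID macro and the crash_dump flag (standing in for whether
CONFIG_CRASH_DUMP is set) are all hypothetical names introduced for
illustration. It covers only the value chosen for
boot_cpu_physical_apicid and ignores everything else the init path does.

```c
#define NO_APICID (-1U)	/* sentinel: no boot CPU APIC ID recorded yet */

enum fix_version { ORIGINAL, TAKE1, TAKE2 };

/*
 * Hypothetical model of which value ends up in boot_cpu_physical_apicid.
 * mp_id is the value recorded from the MP table (or NO_APICID if none),
 * lapic_id is what the local APIC of the running CPU reports.
 */
static unsigned int boot_apicid(enum fix_version v, int crash_dump,
				unsigned int mp_id, unsigned int lapic_id)
{
	/* Existing fallback: consult the LAPIC only if nothing was recorded. */
	unsigned int id = (mp_id == NO_APICID) ? lapic_id : mp_id;

	if (v == TAKE1)
		return lapic_id;	/* first patch: always use the LAPIC */
	if (v == TAKE2 && crash_dump)
		return lapic_id;	/* take 2: override only for crash dump kernels */
	return id;
}
```

In this model, boot_apicid(TAKE2, 0, 0, 1) still returns the MP table
value, so kernels built without crash dump support keep the existing
behaviour, while boot_apicid(TAKE2, 1, 0, 1) returns the LAPIC value that
the kdump kernel needs.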