Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1034138AbcJ1SbV convert rfc822-to-8bit (ORCPT );
        Fri, 28 Oct 2016 14:31:21 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:20961 "EHLO
        aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S966105AbcJ1SaX (ORCPT );
        Fri, 28 Oct 2016 14:30:23 -0400
MIME-Version: 1.0
Message-ID:
Date: Fri, 28 Oct 2016 11:30:09 -0700 (PDT)
From: Michal Necasek
To:
Cc:
Subject: Re: 4.8.2 not booting in 32-bit VM without I/O-APIC
X-Mailer: Zimbra on Oracle Beehive
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Content-Disposition: inline
X-Source-IP: aserv0022.oracle.com [141.146.126.234]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4137
Lines: 92

   Hi Thomas,

 In case you haven't had a chance to take a look yet...

 We had to dig a bit because the problem introduced by commit 2a51fe08
(arch/x86: Handle non enumerated CPU after physical hotplug) <1> is not
fixed for us by commit ff856051 (arch/x86: Handle non enumerated CPU after
physical hotplug) <2>.

 To recap, after the initial commit, systems with no local APIC panicked <4>
early during boot. That showed up for us in VirtualBox, but not
surprisingly, physical systems are also affected <3>. The second patch fixes
systems with no local APIC, but not systems which have no ACPI MADT (or no
ACPI), no MP tables, yet do have an APIC.

 The core problem is init ordering. In setup_arch() in
arch/x86/kernel/setup.c, prefill_possible_map() is called *before*
init_apic_mappings(). On typical modern systems, the local APIC will be set
up either through ACPI or MP tables by the time prefill_possible_map() runs,
but it is incorrect to assume that the APIC must be initialized by the time
prefill_possible_map() is entered. That's why the APIC callbacks aren't
no-ops there, they simply haven't been set up yet.

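 For illustration only, the ordering hazard described above can be modelled
in user space. This is a toy sketch under stated assumptions, not kernel
code: every toy_-prefixed name is hypothetical, a NULL pointer stands in for
the not-yet-mapped local APIC window, and a -1 return stands in for the page
fault seen in the oops. The register offset 0x20 with the ID in bits 31:24
follows the xAPIC ID register layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model, not kernel code: every name below is hypothetical. */

#define TOY_APIC_ID_REG 0x20u   /* byte offset of the ID register in this model */

/* NULL stands for "local APIC MMIO window not mapped yet". */
static const volatile uint32_t *toy_apic_base = NULL;

/* Backing store that plays the role of the APIC register page. */
static uint32_t toy_apic_page[1024];

struct toy_apic_driver {
    const char *name;
    /* Returns the CPU's APIC ID, or -1 where a real access would fault. */
    int (*read_id)(void);
};

static int toy_mem_read_id(void)
{
    if (toy_apic_base == NULL)
        return -1;              /* a real MMIO read here would page-fault */
    return (int)(toy_apic_base[TOY_APIC_ID_REG / 4] >> 24);
}

static int toy_noop_read_id(void)
{
    return 0;                   /* no APIC: report the boot CPU as ID 0 */
}

static const struct toy_apic_driver toy_apic_default = { "default", toy_mem_read_id };
static const struct toy_apic_driver toy_apic_noop    = { "noop",    toy_noop_read_id };

/* Early boot starts with the memory-mapped driver selected. */
static const struct toy_apic_driver *toy_apic = &toy_apic_default;

/* Stand-in for init_apic_mappings(): map the window, or switch to no-ops. */
static void toy_init_apic_mappings(int apic_present, unsigned boot_cpu_id)
{
    if (!apic_present) {
        toy_apic = &toy_apic_noop;
        return;
    }
    toy_apic_page[TOY_APIC_ID_REG / 4] = (uint32_t)boot_cpu_id << 24;
    toy_apic_base = toy_apic_page;
}

/* Stand-in for prefill_possible_map(): reads the boot CPU's ID only when
 * no CPU was enumerated by firmware tables. */
static int toy_prefill_possible_map(int num_processors)
{
    if (num_processors > 0)
        return num_processors;  /* firmware tables already enumerated CPUs */
    if (toy_apic->read_id() < 0)
        return -1;              /* models the early-boot oops */
    return 1;                   /* register the boot CPU only */
}
```

 In this model, calling toy_prefill_possible_map(0) before
toy_init_apic_mappings() returns -1 (the failing order), while calling it
afterwards returns 1, whether the APIC was mapped or the no-op driver was
selected.
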
 I suspect that either init_apic_mappings() needs to be called earlier or
the initial fix from commit 2a51fe08 needs to be done later.

    Regards,
      Michal

<1> https://patchwork.kernel.org/patch/9366095/
<2> https://patchwork.kernel.org/patch/9390349/
<3> https://bugs.archlinux.org/task/51506
<4>
Using APIC driver default
ACPI: PM-Timer IO Port: 0x4008
BUG: unable to handle kernel paging request at ffffc020
IP: [] native_apic_mem_read+0xd/0x10
*pde = 08b8a063 *pte = 00000000
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-040900rc1-generic #201610151630
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: c89fda80 task.stack: c89f8000
EIP: 0060:[] EFLAGS: 00210046 CPU: 0
EIP is at native_apic_mem_read+0xd/0x10
EAX: ffffc020 EBX: ffffffff ECX: c89f9f40 EDX: fffff000
ESI: c8b8d000 EDI: c8b89400 EBP: c89f9f88 ESP: c89f9f84
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 80050033 CR2: ffffc020 CR3: 08b8c000 CR4: 00040690
Stack:
 c8040eb6 c89f9fb8 c8accc5e c89f9fb8 c8b8d000 c8ac424a 33120000 00000000
 35888000 00000000 00033120 00000000 c80ba5f7 00000000 00000000 00000000
 00000000 00174f46 00174f46 0008f800 c8b8d800 08e34003 c8abe7f5 c88f4b62
Call Trace:
 [] ? hard_smp_processor_id+0x16/0x30
 [] ? prefill_possible_map+0x16/0x137
 [] ? setup_arch+0xaf3/0xbdf
 [] ? vprintk_default+0x37/0x40
 [] ? start_kernel+0x8d/0x3d7
Code: a1 d8 89 b9 c8 5d c3 66 90 66 90 66 90 90 8b 0d b0 f5 a0 c8 8d 84 08 00 d0 ff ff 89 10 c3 8b 15 b0 f5 a0 c8 8d 84 10 00 d0 ff ff <8b> 00 c3 8b 15 20 94 9a c8 53 89 c3 b8 30 00 00 00 ff 52 78 3c
EIP: [] native_apic_mem_read+0xd/0x10 SS:ESP 0068:c89f9f84
CR2: 00000000ffffc020
---[ end trace f68728a0d3053b52 ]---

----- Original Message -----
From: tglx@linutronix.de
To: michal.necasek@oracle.com
Cc: michael.thayer@oracle.com, frank.mehnert@oracle.com, knut.osmundsen@oracle.com
Sent: Monday, October 24, 2016 9:39:45 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: 4.8.2 not booting in 32-bit VM without I/O-APIC

On Mon, 24 Oct 2016, Michal Necasek wrote:
>
> To explain a bit, disabling the I/O APIC also prevents the MP tables
> from being created in the VirtualBox VM (historical reasons) and there
> will likewise be no ACPI MADT.
>
> I believe the panic is triggered when neither ACPI nor MPS does any CPU
> discovery. Then the local APIC isn't mapped and prefill_possible_map()
> will page fault and panic because num_processors is zero and it just
> assumes that the local APIC is present and accessible.
> On systems with no MP tables, 'acpi=off' or 'nolapic' kernel arguments
> trigger the same panic. I didn't find a way to prevent Linux from looking
> at the MP tables if they're present.

Hmm. In both cases we should end up with apic == apic_noop() so any access
to the apic should not result in a panic.

I'll have a look.

Thanks,

	tglx