Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp7752575imu; Fri, 28 Dec 2018 04:21:50 -0800 (PST) X-Google-Smtp-Source: ALg8bN7ocM7LNhPtWiecO5cSHvnO/pOOx1+JpuE8iHgIsLOIPxEFhdK1ph0VSvRRCJGUSe9Ngce8 X-Received: by 2002:a17:902:4324:: with SMTP id i33mr26854479pld.227.1545999710442; Fri, 28 Dec 2018 04:21:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1545999710; cv=none; d=google.com; s=arc-20160816; b=EL8/6ZH3IP1ayj3vkzAZ2fM0nlGV0OiGiy2gGalFDnE4rVcMoWAOGRyAoOKuT9AUh7 Kx9xbZlOdxTTqsVtVtX3DsE3pmVBXhV+o8mLRODA//ZbR540yO56LE/SWnm10vsIPlZv wzG0JiSJDJjqE9X3PAMqa4JmeuP2m5UULVtQvewAjM7CmTqvHUWrT3WQfqBiox3KkO7O pZMn9MRgQunerPMZypLYTKJxMjW41weiiLIc2guj4OqjT5ahXkl4CcLLWCxaRPqz+T1t 9CfVWxAGvaHwz2Lmy1E15gssX+165kxK2GHr8xbp+nk2VmPIXhLBigAjZrcpBxs0M/56 4R4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :mime-version:dkim-signature; bh=D36GjWMZ7ePAhmpAjQk40TSwf5XhT2qu7iQC9vePce8=; b=gMXaqdiFyhuDl6fRhpMQGUIGVknZ4CLmGf245/zaLCC/9JnWwuovzeIpQmFneuuXHm S9+xHqo6aEc/3n0LpnXJq7raGkXe/OvEkrc5ZW07Rsb9XG+q8y5aOc2yZkBGmHEwfdOE S9A4Pv3sEpmvdjPq21MSuNIWDcfEZLF7SyP4SGX+c6Lcwge4IOLT5d/Amamx9CA/zr+0 gQtyOPGcBTtIUkdLBB/5FiBPUkMAJY4hTo2K19E4YvfP7P5mnfOJy4IuxGtqjlIjcPhf 3vT6WGmd75G5PqqCAY7bDPsa60GvA8+NdJbapcdwK2EiX0vc7uxxkzfpHt1VJCJ9cu8I fb8A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@endlessm-com.20150623.gappssmtp.com header.s=20150623 header.b=MvgVh3Vj; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 193si9745906pgc.220.2018.12.28.04.21.35; Fri, 28 Dec 2018 04:21:50 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@endlessm-com.20150623.gappssmtp.com header.s=20150623 header.b=MvgVh3Vj; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728865AbeL1GLZ (ORCPT + 99 others); Fri, 28 Dec 2018 01:11:25 -0500 Received: from mail-qk1-f178.google.com ([209.85.222.178]:34710 "EHLO mail-qk1-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728319AbeL1GLY (ORCPT ); Fri, 28 Dec 2018 01:11:24 -0500 Received: by mail-qk1-f178.google.com with SMTP id q8so12119730qke.1 for ; Thu, 27 Dec 2018 22:11:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=endlessm-com.20150623.gappssmtp.com; s=20150623; h=mime-version:from:date:message-id:subject:to:cc; bh=D36GjWMZ7ePAhmpAjQk40TSwf5XhT2qu7iQC9vePce8=; b=MvgVh3VjwVAcICPIgGSj1CTh+UVdR+4VU3e9IexVZ4h86MbKikonWnJQQ8E0Uvuiom vVxZasgal924i4O93asfQpTgUNlZIsM9yedg6SvTzA05YIDNDa0/y+hE1w9Efg5dm1jj iL6EPU5N7WEeqUI9e/zGJa+F6Tm8i0pfoR47cfFuAiKtxYAc4ql4WSmoAlmTuKm5FRJB ZULsp3u6zk+mwOlqHRvKWuQRrWZ5uV58hPWeCHY9E3y3rHPEj/hTUeDakkxrDBjb2ngT ZFb+MGEE2s7NPXGObyFDTnyBVmuzy32Iz2ZiOHYzaLvwgxE6J/lW73oqivSREkYDAU4V 8b8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to:cc; bh=D36GjWMZ7ePAhmpAjQk40TSwf5XhT2qu7iQC9vePce8=; b=ks/MM+3oT/UZqQmAW5hyksgANFBal/yvjMMTG2mGPJgvGnYOL+/bN0oMyE+IuOmH1c 104RBiBXJLb37vSBOq7x+CuNjThiv2FOTtsMnPVISG7ORAmFkfhsD8f213TzR/CtjVFM S/N1s3gWYii8S+iIYDPp2Xf3yehrh2JmLw4UoQyhlaD8RV5N1QIskRvzVNecQ9OQUtI2 SsSNGsETFXN5ouvz5sXzzfXTYMOyiV95bfWU3huMqEx6uOTe10BGyzDHaFZY5iz9wy9u fIffC200E3rLV+LoCe7JG6YGafCK35WEs8TVDnOk+Z99OyAc7U/09JUjm2TeUtfNkL3G seew== X-Gm-Message-State: AJcUukeQO0t/cLt2K3m6cDXjj5jPe2ySOBgEGpCHFh6sDn3GgR7H87Gt Tt0xn0YvCfvWnf9hTp9+PcOHf+kc9MWtK0taAITrSg== X-Received: by 2002:a37:34f:: with SMTP id 76mr23924509qkd.347.1545977483095; Thu, 27 Dec 2018 22:11:23 -0800 (PST) MIME-Version: 1.0 From: Daniel Drake Date: Fri, 28 Dec 2018 14:11:12 +0800 Message-ID: Subject: APIC timer checked before it is set up, boot fails on Connex L1430 To: Thomas Gleixner , suresh.b.siddha@intel.com Cc: Linux Upstreaming Team , Linux Kernel Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On the Connex L1430 laptop based on Intel Apollo Lake N3350, Linux doesn't boot. It hangs early on a blank screen. Reproduced with Linus git, 4.18 and 4.19 (there is no previous known working kernel version). EFI earlyprintk shows: APIC: switch to symmetric I/O mode setup x2apic: IRQ remapping doesn't support X2APIC mode ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 ..MP-BIOS bug: 8254 timer not connected to IO-APIC ...tryign to set up timer (IRQ0) through the 8259A ... ..... (found apic 0 pin 2) ... ....... failed. ...trying to set up timer as Virtual Wire IRQ... ..... failed. ...trying to set up timer as ExtINT IRQ... do_IRQ: 0.55 No irq handler for vector ..... failed :(. Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Looking closer, check_timer() is observing that the IOAPIC timer doesn't tick, so it then tries some other approaches but doesn't manage to get them working either. The strange thing is, I booted with the no_timer_check parameter and the system works fine! With this parameter it assumes the IOAPIC timer is ticking and just continues the boot sequence anyway. Here is the boot log with apic=debug no_timer_check: https://gist.github.com/dsd/6f40d8ecc7102dd5dcb90c5dedc69214#file-dmesg-txt /proc/interrupts shows that APIC Local timer interrupts are working fine on both CPUs: https://gist.github.com/dsd/6f40d8ecc7102dd5dcb90c5dedc69214#file-interrupts-txt So, check_timer() is incorrectly deducing that the IOAPIC timer isn't working. The way it checks this is to do a delay loop and then check if jiffies has advanced. I presume the expectation here is that during this delay, the hardware IRQ will invoke local_apic_timer_interrupt() which will then increment jiffies. Indeed, during check_timer() execution this interrupt does not fire, however by using no_timer_check and adding a log message I can see that it fires for the first time quite some time after check_timer() is done: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0xfc66f4fc7c, max_idle_ns: 440795224246 ns Calibrating delay loop (skipped), value calculated using timer frequency.. 2188.80 BogoMIPS (lpj=1094400) pid_max: default: 32768 minimum: 301 LSM: Security Framework initializing SELinux: Initializing. Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes) Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes) Mount-cache hash table entries: 4096 (order: 3, 32768 bytes) Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes) mce: CPU supports 7 MCE banks mce: CPU0: Thermal monitoring enabled (TM1) Last level iTLB entries: 4KB 48, 2MB 0, 4MB 0 Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 Spectre V2 : Spectre mitigation: kernel not compiled with retpoline; no mitigation available! Freeing SMP alternatives memory: 44K TSC deadline timer enabled smpboot: CPU0: Intel(R) Celeron(R) CPU N3350 @ 1.10GHz (family: 0x6, model: 0x5c, stepping: 0x9) Performance Events: PEBS fmt3+, Goldmont events, 32-deep LBR, full-width counters, Intel PMU driver. ... version: 4 ... bit width: 48 ... generic registers: 4 ... value mask: 0000ffffffffffff ... max period: 00007fffffffffff ... fixed-purpose events: 3 ... event mask: 000000070000000f rcu: Hierarchical SRCU implementation. smp: Bringing up secondary CPUs ... !!! local_apic_timer_interrupt for the first time cpu0 !!! Experimenting further, I used the same approach of adding delays and checking for the interrupt during the delay to figure out at which precise point during the boot sequence the timer interrupt starts working. It's here: static void setup_APIC_timer(void) { [...] if (this_cpu_has(X86_FEATURE_TSC_DEADLINE_TIMER)) { [...] clockevents_config_and_register(levt, tsc_khz * (1000 / TSC_DIVISOR), 0xF, ~0UL); } } We reach clockevents_register_device() which does: 1. Take a spinlock and disable IRQs 2. lapic_set_oneshot() which leads to "TSC deadline timer enabled" message 3. lapic_next_deadline() 4. Spin unlock & re-enable IRQs At the exact point where IRQs are re-enabled above, which is at the time of return from clockevents_config_and_register(), timer interrupts start working. The overall ordering here seems surprising. check_timer() is probing whether the APIC timer works well before setup_APIC_timer() has been called. Shouldn't the timer be checked only after it has been set up? Or is Linux assuming that the BIOS will boot with the APIC timer already running? Any other debugging suggestions much appreciated. Thanks Daniel