Date: Wed, 3 Apr 2019 13:21:42 +0200 (CEST)
From: Thomas Gleixner
To: Daniel Drake
cc: Linux Kernel, Ingo Molnar, Borislav Petkov, Hans de Goede,
    david.e.box@linux.intel.com, Endless Linux Upstreaming Team,
    "Rafael J. Wysocki", x86@kernel.org
Subject: Re: No 8254 PIT & no HPET on new Intel N3350 platforms causes
    kernel panic during early boot
X-Mailing-List: linux-kernel@vger.kernel.org

Daniel,

On Wed, 3 Apr 2019, Daniel Drake wrote:

> After encountering this on Connex L1430 last time, we have now
> encountered another affected product, from a different vendor (SCOPE
> SN116PYA). They both have Intel Apollo Lake N3350 and AMI BIOS.
>
> The code in question is making sure that the IRQ0 timer works, by
> waiting for an interrupt. In this case there is no interrupt.

Right.

> The x86 platform code in hpet_time_init() tries to enable the HPET
> timer for this, however that is not available on these affected
> platforms (no HPET ACPI table). So it then falls back on the 8253/8254
> legacy PIT. The i8253.c driver is invoked to program the PIT
> accordingly, however in this case it does not result in any IRQ0
> interrupts being generated --> panic.

Correct.

> I found a relevant setting in the BIOS: Chipset -> South Cluster
> Configuration -> Miscellaneous Configuration -> 8254 Clock Gating
> This option is set to Enabled by default. Setting it to Disabled makes
> the PIT tick and Linux boot finally works.

Well, your BIOS at least has this switch ...

> As another data point, Windows 10 boots fine in this no-PIT no-HPET
> configuration.

We have support for HPET/PIT less systems already. We just need to figure
out how to switch to that mode automagically at early boot. ACPI obviously
does not switch to it with the ACPI_FADT_HW_REDUCED flag.

> Going deeper, I found the clock_gate_8254 option in the coreboot
> source code. This pointed me to the ITSSPRC register, which is
> documented on page 1694 of
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf
>
> "8254 Static Clock Gating Enable (CGE8254): When set, the 8254 timer
> is disabled statically. This bit shall be set by BIOS if the 8254
> feature is not needed in the system or before BIOS hands off the
> system that supports C11. Normal operation of 8254 requires this bit
> to 0."
>
> (what's C11?)

Don't know. Some magic new C-State perhaps? Rafael?

Btw, one of those links you provided

  https://www.manualslib.com/manual/1316475/Ecs-Ed20pa2.html?page=23

claims that you have to disable MWAIT as well. No idea why. Is MWAIT
disabled on your platform?
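
One way to answer the MWAIT question from userspace is to look at the
MONITOR/MWAIT feature bit in CPUID leaf 1 (ECX bit 3). The following is a
minimal sketch, assuming GCC or Clang on x86 with their <cpuid.h> header.
The helper names are made up for illustration, and whether the BIOS option
in that manual actually hides this bit is an assumption, not something
verified on the affected machines.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define CPUID_01_ECX_MONITOR (1u << 3)

/* Pure helper (hypothetical name): decode the bit from a raw ECX value. */
static bool ecx_has_mwait(uint32_t ecx)
{
	return (ecx & CPUID_01_ECX_MONITOR) != 0;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/*
 * Hypothetical wrapper: returns 1 if MWAIT is advertised, 0 if it is not,
 * and -1 if CPUID leaf 1 is not available.
 */
static int cpu_advertises_mwait(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return -1;
	return ecx_has_mwait(ecx) ? 1 : 0;
}
#endif
```

On Linux the same bit shows up as the "monitor" flag in /proc/cpuinfo, so
checking for that flag is a quicker first test.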

> I verified that the BIOS setting controls this specific bit value, and
> I also created and verified a workaround that unsets this bit - now
> Linux boots fine regardless of the BIOS setting:
>
> #define INTEL_APL_PSR_BASE 0xd0000000
> #define INTEL_APL_PID_ITSS 0xd0
> #define INTEL_PCR_PORTID_SHIFT 16
> #define INTEL_APL_PCR_ITSSPRC 0x3300
> static void quirk_intel_apl_8254(void)
> {
>     u32 addr = INTEL_APL_PSR_BASE | \
>                (INTEL_APL_PID_ITSS << INTEL_PCR_PORTID_SHIFT) | \
>                INTEL_APL_PCR_ITSSPRC;
>     u32 value;
>     void __iomem *itssprc = ioremap_nocache(addr, 4);
>
>     if (!itssprc)
>         return;
>
>     value = readl(itssprc);
>     if (value & 4) {
>         value &= ~4;
>         writel(value, itssprc);
>     }
>     iounmap(itssprc);
> }
>
> I was hoping I could send a workaround patch here, but I'm not sure of
> an appropriate way to detect that we are on an Intel Apollo Lake
> platform. This timer stuff happens during early boot, the early quirks
> in pci/quirks.c run too late for this. Suggestions appreciated.

We have early-quirks.c in arch/x86/kernel/ for that.

> Poking at other angles, I tried taking the HPET ACPI table from
> another (working) Intel N3350 system and putting it in the initrd as
> an override. This makes the HPET work fine, at which point Linux boots
> OK without having to touch the (BIOS-crippled) PIT.

We already have quirks for force enabling HPET, so that could be added.

> I'm at the limit of my current knowledge here, but there's an open
> question of whether Linux could be made to work without a working PIT
> and no HPET, in the same way that grub and Windows seem to manage.
> Even though it is currently essential for boot, the PIT (or HPET) is
> usually only needed to tick a few times before being replaced with the
> APIC timer as a clocksource (when setup_APIC_timer() happens, the
> clocksource layer disables the previous timer source).
> However, Thomas Gleixner gave some hints at the importance of the
> PIT/HPET here:
>
> > Well, [avoiding the PIT/HPET ticking requirement] would be trivial if we
> > could rely on the APIC timer being functional on all CPUs and if we could
> > figure out the APIC timer frequency without calibrating it against the
> > PIT/HPET on older CPUs. Plus a gazillion of other issues (e.g. APIC stops
> > in C states ....)
> > [...]
> > Under certain conditions we actually might avoid touching PIT/HPET and
> > solely rely on the CPUID/MSR calibration values. Needs quite some thought
> > though.

For newer CPUs we might assume that:

 1) The TSC and APIC timer are actually usable

 2) The frequencies can be retrieved from CPUID or MSRs

If #1 and #2 are reliable we can avoid the whole calibration and interrupt
delivery mess.

That means we need the following decision logic:

  1) If HPET is available in ACPI, boot normal.

  2) If HPET is not available, verify that the PIT actually counts. If it
     does, boot normal. If it does not, either:

     2A) Verify that this is a PCH 300/C240 and fiddle with that ITSS bit.

         But that means that we need to chase PCH ids forever...

     2B) Shrug and just avoid the whole PIT/HPET magic all over the place:

         - Avoid the interrupt delivery check in the IOAPIC code as it's
           uninteresting in that case. Trivial to do.

         - Prevent the TSC calibration code from touching PIT/HPET. It
           should do that already when the TSC frequency can be retrieved
           via CPUID or MSR. Should work, emphasis on should ...

           See the mess in: native_calibrate_tsc() and the magic tables in
           tsc_msr.c how well that stuff works.

           The cpu_khz_from_cpuid() case at least seems to not have these
           issues. Knock on wood!

         - Prevent the APIC calibration code from touching PIT/HPET. That's
           only happening right now when the TSC frequency comes from the
           MSRs. No idea why the CPUID method does not provide that.

           CPUID leaf 0x16 provides the bus frequency, so we can deduce the
           APIC timer frequency from there and spare the whole APIC timer
           calibration mess:

              ECX Bits 15 - 00: Bus (Reference) Frequency (in MHz).

           It's usually not required on these newer CPUs because they
           support TSC deadline timer, but you can disable that on the
           kernel command line and some implementations of that were
           broken. With that we are back to square one.

So we need to make sure that these things work under all circumstances.

Rafael?

Thanks,

	tglx
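
The decision logic in the mail can be written down as a small pure
function, which makes the branches easier to review. This is only a sketch
under stated assumptions: every name below is hypothetical and none of it
is existing kernel code. The mail lists 2A and 2B as alternatives; trying
2A first and falling back to 2B is an arbitrary choice made here. The last
helper just extracts the CPUID leaf 0x16 ECX field quoted in the mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; none of them exist in the kernel. */
enum timer_boot_mode {
	BOOT_NORMAL,       /* steps 1 and 2: HPET present or the PIT counts */
	BOOT_UNGATE_PIT,   /* option 2A: known PCH, clear the 8254 gating bit */
	BOOT_NO_PIT_HPET,  /* option 2B: skip PIT/HPET, use CPUID/MSR values */
};

struct timer_probe {
	bool hpet_in_acpi; /* an HPET table was found in ACPI */
	bool pit_counts;   /* the PIT counter was observed to change */
	bool known_pch;    /* PCH id matched a (hypothetical) quirk list */
};

static enum timer_boot_mode choose_timer_boot_mode(struct timer_probe p)
{
	/* 1) HPET is available in ACPI: boot normal. */
	if (p.hpet_in_acpi)
		return BOOT_NORMAL;

	/* 2) No HPET, but the PIT actually counts: boot normal. */
	if (p.pit_counts)
		return BOOT_NORMAL;

	/* Assumption: prefer 2A when the PCH is known, otherwise use 2B. */
	if (p.known_pch)
		return BOOT_UNGATE_PIT;

	return BOOT_NO_PIT_HPET;
}

/* CPUID leaf 0x16, ECX bits 15:0: bus (reference) frequency in MHz. */
static uint32_t cpuid16_bus_mhz(uint32_t ecx)
{
	return ecx & 0xffffu;
}
```

Keeping the probe results in a plain struct means the policy can be
exercised without hardware, e.g. by feeding it the "no HPET, PIT gated"
combination seen on the N3350 machines.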