Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp770119ybv; Fri, 7 Feb 2020 08:13:11 -0800 (PST) X-Google-Smtp-Source: APXvYqyatq2HfDTUfPtZGvuarGCVybZErorM32+W4CYgd+GG7hENyLH3gkI/ITlz/yNator4YI2W X-Received: by 2002:a05:6830:128e:: with SMTP id z14mr99630otp.184.1581091991228; Fri, 07 Feb 2020 08:13:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581091991; cv=none; d=google.com; s=arc-20160816; b=nKXyRInoVfC33YL5rqNCYGKuzXqv9vg9AwunB+3MsoMtvVewZ6zbRypR8jN+lWuyWV /H5R4gpyUrOwv2ywnzA5TQK29m0dPrCvSAreHfxPgQGzJlHEhWz46VECOcm07FQ4wHq2 XS7yag7HTdMpc5bxHgZmMhCcjuSBbv5guKtmDF0rGLG9ktxtpcb3kd+3ik1s75fmYz/C qcPddeoleQ583+lFCKQuy0c8o/LazaU2Neg2L5ejKznZnM9JbmUZhjoOpw9zHmla2YzG IMQQLUAuN0fR8VtymFtnv3l/3k+YzTPieKSr0BIMSgJaRFjyWB9RDCHA/IV07dYqnDDy w6GA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:message-id:references :in-reply-to:subject:cc:to:from:date:content-transfer-encoding :mime-version:dkim-signature; bh=3ukaLCHwo4aNytwZ4yeIN+CO9DOHAaEizyQcI4J6z70=; b=DWoe7CJYIQNxPcukaQfTv3ROaatF9Fo/pL7WDfxaa7/VY21q/Yb7+ru5spy4Mji9QG l+129hkcnSpLoFFPiXBT5aAf2y/IQYxeGgY+2cKzMVPhDYk9QPgHsLFG3S8MDl6Mfysf EElTQ8cPUHUeIwC72V2AP8p/BVc/C/x7Bo4TRKd/cej6qj8JYLlfdQ9uXvfpfxvL28xS 12Iv8euOzHsKcal7bpUTDNSaSJswDSDjMxD8/gTY13J5CG9bl9X2X4z+mXqDTnVs5Wo7 Yy4W9AfHYhxlIg/zVTmZRFkO0nq2OsxySp0RI0GGJLbQ4Uwu65QQXdE2XREn4K1NTExO xPEA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=EQv4vxbk; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c189si4237161oig.74.2020.02.07.08.12.58; Fri, 07 Feb 2020 08:13:11 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=EQv4vxbk; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727028AbgBGQMD (ORCPT + 99 others); Fri, 7 Feb 2020 11:12:03 -0500 Received: from mail.kernel.org ([198.145.29.99]:33058 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726874AbgBGQMC (ORCPT ); Fri, 7 Feb 2020 11:12:02 -0500 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1235920838; Fri, 7 Feb 2020 16:12:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1581091921; bh=BXPbsIHGdvZu4lhUnpl6V3vmhO/S2FUUVrZo1qXjLK0=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=EQv4vxbkRG7xf/fz7MLweFKPorp7j3grKdfmjEgkIBrmzVwGQdk7ibLX+iVrqrLRx xOD/yb8ye9ZJLyELukYZ45G5/KG/qsX+N+XpkLhOlroBU15jX/wEiVHaV9wmtem7pq vngPmuKpcSMukUfwpRssLANB/iejPsxG84Xxa4BU= Received: from disco-boy.misterjones.org ([51.254.78.96] helo=www.loen.fr) by disco-boy.misterjones.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1j06EZ-003ZIA-CI; Fri, 07 Feb 2020 16:11:59 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Fri, 07 Feb 2020 16:11:59 +0000 From: Marc Zyngier To: Lukas Wunner Cc: Thomas Gleixner , Jason Cooper , Nicolas Saenz Julienne , Florian Fainelli , Ray Jui , Scott Branden , bcm-kernel-feedback-list@broadcom.com, linux-kernel@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Serge Schneider , Kristina Brooks , Stefan Wahren , Matthias Brugger , Martin Sperl , Phil Elwell Subject: Re: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader In-Reply-To: <988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de> References: <988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de> Message-ID: <713627a200d9c8fd7cac424d69e98166@kernel.org> X-Sender: maz@kernel.org User-Agent: Roundcube Webmail/1.3.8 X-SA-Exim-Connect-IP: 51.254.78.96 X-SA-Exim-Rcpt-To: lukas@wunner.de, tglx@linutronix.de, jason@lakedaemon.net, nsaenzjulienne@suse.de, f.fainelli@gmail.com, rjui@broadcom.com, sbranden@broadcom.com, bcm-kernel-feedback-list@broadcom.com, linux-kernel@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, serge@raspberrypi.org, notstina@gmail.com, wahrenst@gmx.net, mbrugger@suse.com, kernel@martin.sperl.org, phil@raspberrypi.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Lukas, On 2020-02-07 15:46, Lukas Wunner wrote: > Customers of our "Revolution Pi" open source PLCs (which are based on > the Raspberry Pi) have reported random lockups as well as jittery eMMC, > UART and SPI latency. We were able to reproduce the lockups in our lab > and hooked up a JTAG debugger: > > It turns out that the USB controller's interrupt is already enabled > when > the kernel boots. All interrupts are disabled when the chip comes out > of power-on reset, according to the spec. So apparently the bootloader > enables the interrupt but neglects to disable it before handing over > control to the kernel. > > The bootloader is a closed source blob provided by the Raspberry Pi > Foundation. Development of an alternative open source bootloader was > begun by Kristina Brooks but it's not fully functional yet. Usage of > the blob is thus without alternative for the time being. > > The Raspberry Pi Foundation's downstream kernel has a performance- > optimized USB driver (which we use on our Revolution Pi products). > The driver takes advantage of the FIQ fast interrupt. Because the > regular USB interrupt was left enabled by the bootloader, both the > FIQ and the normal interrupt is enabled once the USB driver probes. > > The spec has the following to say on simultaneously enabling the FIQ > and the normal interrupt of a peripheral: > > "One interrupt source can be selected to be connected to the ARM FIQ > input. An interrupt which is selected as FIQ should have its normal > interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt > will be fired at the same time. Not a good idea!" Or to spell it out more clearly: Braindead hardware. Really. > ^^^^^^^^^^^^^^^ > https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf > page 110 > > On a multicore Raspberry Pi, the Foundation's kernel routes all normal > interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the > normal interrupt is enabled, a USB interrupt causes CPU 0 to spin in > bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it. > Interrupts with a lower priority than USB are starved as long. > > That explains the jittery eMMC, UART and SPI latency: On one occasion > I've seen CPU 0 blocked for no less than 2.9 msec. Basically, > everything not USB takes a performance hit: Whereas eMMC throughput > on a Compute Module 3 remains relatively constant at 23.5 MB/s with > this commit, it irregularly dips to 23.0 MB/s without this commit. > > The lockups occur when CPU 0 receives a USB interrupt while holding a > lock which CPU 1 is trying to acquire while the FIQ is temporarily > disabled on CPU 1. > > I've tested old releases of the Foundation's bootloader as far back as > 1.20160202-1 and they all leave the USB interrupt enabled. Still older > releases fail to boot a contemporary kernel on a Compute Module 1 or 3, > which are the only Raspberry Pi variants I have at my disposal for > testing. > > Fix by disabling IRQs left enabled by the bootloader. Although the > impact is most pronounced on the Foundation's downstream kernel, > it seems prudent to apply the fix to the upstream kernel to guard > against such mistakes in any present and future bootloader. > > An alternative, though more convoluted approach would be to clear the > IRQD_IRQ_MASKED flag on all interrupts left enabled on boot. Then the > first invocation of handle_level_irq() would mask and thereby quiesce > those interrupts. Nah, that's terrible. The right thing to do is indeed to mop up the mess that the bootloader is bound to leave and start with a clean slate. > > Signed-off-by: Lukas Wunner > Cc: Serge Schneider > Cc: Kristina Brooks > Cc: stable@vger.kernel.org > --- > drivers/irqchip/irq-bcm2835.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/irqchip/irq-bcm2835.c > b/drivers/irqchip/irq-bcm2835.c > index 418245d31921..0d9a5a7ebe2c 100644 > --- a/drivers/irqchip/irq-bcm2835.c > +++ b/drivers/irqchip/irq-bcm2835.c > @@ -150,6 +150,13 @@ static int __init armctrl_of_init(struct > device_node *node, > intc.enable[b] = base + reg_enable[b]; > intc.disable[b] = base + reg_disable[b]; > > + irq = readl(intc.enable[b]); readl_relaxed(), please. irq is not quite the right type either, please use a u32. > + if (irq) { > + writel(irq, intc.disable[b]); writel_relaxed(). > + pr_err(FW_BUG "Bootloader left irq enabled: " > + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &irq); > + } > + > for (i = 0; i < bank_irqs[b]; i++) { > irq = irq_create_mapping(intc.domain, MAKE_HWIRQ(b, i)); > BUG_ON(irq <= 0); Don't you need to do something about the FIQ side as well? M. -- Jazz is not dead. It just smells funny...