2020-02-07 15:49:44

by Lukas Wunner

Subject: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Customers of our "Revolution Pi" open source PLCs (which are based on
the Raspberry Pi) have reported random lockups as well as jittery eMMC,
UART and SPI latency. We were able to reproduce the lockups in our lab
and hooked up a JTAG debugger:

It turns out that the USB controller's interrupt is already enabled when
the kernel boots. All interrupts are disabled when the chip comes out
of power-on reset, according to the spec. So apparently the bootloader
enables the interrupt but neglects to disable it before handing over
control to the kernel.

The bootloader is a closed source blob provided by the Raspberry Pi
Foundation. Development of an alternative open source bootloader was
begun by Kristina Brooks but it's not fully functional yet. Usage of
the blob is thus without alternative for the time being.

The Raspberry Pi Foundation's downstream kernel has a performance-
optimized USB driver (which we use on our Revolution Pi products).
The driver takes advantage of the FIQ fast interrupt. Because the
regular USB interrupt was left enabled by the bootloader, both the
FIQ and the normal interrupt are enabled once the USB driver probes.

The spec has the following to say on simultaneously enabling the FIQ
and the normal interrupt of a peripheral:

"One interrupt source can be selected to be connected to the ARM FIQ
input. An interrupt which is selected as FIQ should have its normal
interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
will be fired at the same time. Not a good idea!"
                                ^^^^^^^^^^^^^^^
https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
page 110

On a multicore Raspberry Pi, the Foundation's kernel routes all normal
interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
normal interrupt are enabled, a USB interrupt causes CPU 0 to spin in
bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
Interrupts with a lower priority than USB are starved for as long.

That explains the jittery eMMC, UART and SPI latency: On one occasion
I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
everything not USB takes a performance hit: Whereas eMMC throughput
on a Compute Module 3 remains relatively constant at 23.5 MB/s with
this commit, it irregularly dips to 23.0 MB/s without this commit.

The lockups occur when CPU 0 receives a USB interrupt while holding a
lock which CPU 1 is trying to acquire while the FIQ is temporarily
disabled on CPU 1.

I've tested old releases of the Foundation's bootloader as far back as
1.20160202-1 and they all leave the USB interrupt enabled. Still older
releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
which are the only Raspberry Pi variants I have at my disposal for
testing.

Fix by disabling IRQs left enabled by the bootloader. Although the
impact is most pronounced on the Foundation's downstream kernel,
it seems prudent to apply the fix to the upstream kernel to guard
against such mistakes in any present and future bootloader.
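
The quiesce step can be modeled in userspace as a minimal sketch. It
assumes, as the patch does, that reading a bank's enable register returns
the currently enabled sources and that writing a mask to the disable
register clears exactly those bits. The names bank_model, bank_quiesce
and quiesce_all are hypothetical and exist only for this illustration;
the driver itself uses MMIO accessors on intc.enable[] and intc.disable[].

```c
#include <assert.h>
#include <stdint.h>

#define NR_BANKS 3

/* Userspace stand-in for one bank of the interrupt controller (assumption). */
struct bank_model {
	uint32_t enabled;	/* what a read of the enable register returns */
};

/* Writing 1 bits to the enable register switches those sources on. */
void bank_write_enable(struct bank_model *bank, uint32_t mask)
{
	bank->enabled |= mask;
}

/* Writing 1 bits to the disable register switches those sources off. */
void bank_write_disable(struct bank_model *bank, uint32_t mask)
{
	bank->enabled &= ~mask;
}

/*
 * Same shape as the patch: read back what is enabled, disable exactly
 * those sources, and return the mask so the caller can report it.
 */
uint32_t bank_quiesce(struct bank_model *bank)
{
	uint32_t left_on = bank->enabled;

	if (left_on)
		bank_write_disable(bank, left_on);
	return left_on;
}

/* Quiesce every bank; returns how many banks had something left enabled. */
int quiesce_all(struct bank_model banks[NR_BANKS], uint32_t left_on[NR_BANKS])
{
	int b, dirty = 0;

	assert(banks && left_on);
	for (b = 0; b < NR_BANKS; b++) {
		left_on[b] = bank_quiesce(&banks[b]);
		if (left_on[b])
			dirty++;
	}
	return dirty;
}
```

Returning the mask instead of printing it keeps the model easy to check;
the driver logs the mask with pr_err() instead.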

An alternative, though more convoluted approach would be to clear the
IRQD_IRQ_MASKED flag on all interrupts left enabled on boot. Then the
first invocation of handle_level_irq() would mask and thereby quiesce
those interrupts.
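
A toy model of why clearing the flag would work. It assumes that the
core's mask helper skips the hardware write whenever its bookkeeping flag
says the line is already masked, which is how the paragraph above reads.
The struct and function names are hypothetical; this is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* One interrupt line: hardware state plus the core's bookkeeping. */
struct line_model {
	bool hw_enabled;	/* enable bit in the interrupt controller */
	bool sw_masked;		/* bookkeeping flag, in the role of IRQD_IRQ_MASKED */
};

/* State at boot as described: hardware left on, bookkeeping says masked. */
struct line_model model_boot_state(void)
{
	struct line_model line = { .hw_enabled = true, .sw_masked = true };

	return line;
}

/* Mask the line, but only if the bookkeeping says it is unmasked. */
void model_mask(struct line_model *line)
{
	assert(line);
	if (line->sw_masked)
		return;		/* believed masked already: hardware untouched */
	line->hw_enabled = false;
	line->sw_masked = true;
}
```

With the boot state, model_mask() leaves hw_enabled set because the stale
flag short-circuits it. Clearing sw_masked first lets the same call
disable the line, which is the effect the alternative approach relies on.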

Signed-off-by: Lukas Wunner <[email protected]>
Cc: Serge Schneider <[email protected]>
Cc: Kristina Brooks <[email protected]>
Cc: [email protected]
---
drivers/irqchip/irq-bcm2835.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 418245d31921..0d9a5a7ebe2c 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -150,6 +150,13 @@ static int __init armctrl_of_init(struct device_node *node,
 		intc.enable[b] = base + reg_enable[b];
 		intc.disable[b] = base + reg_disable[b];

+		irq = readl(intc.enable[b]);
+		if (irq) {
+			writel(irq, intc.disable[b]);
+			pr_err(FW_BUG "Bootloader left irq enabled: "
+			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &irq);
+		}
+
 		for (i = 0; i < bank_irqs[b]; i++) {
 			irq = irq_create_mapping(intc.domain, MAKE_HWIRQ(b, i));
 			BUG_ON(irq <= 0);
--
2.24.0


2020-02-07 16:13:11

by Marc Zyngier

Subject: Re: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Hi Lukas,

On 2020-02-07 15:46, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt are enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"

Or to spell it out more clearly: Braindead hardware. Really.

>                                 ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt are enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved for as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.
>
> An alternative, though more convoluted approach would be to clear the
> IRQD_IRQ_MASKED flag on all interrupts left enabled on boot. Then the
> first invocation of handle_level_irq() would mask and thereby quiesce
> those interrupts.

Nah, that's terrible. The right thing to do is indeed to mop up the mess
that the bootloader is bound to leave and start with a clean slate.

>
> Signed-off-by: Lukas Wunner <[email protected]>
> Cc: Serge Schneider <[email protected]>
> Cc: Kristina Brooks <[email protected]>
> Cc: [email protected]
> ---
> drivers/irqchip/irq-bcm2835.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
> index 418245d31921..0d9a5a7ebe2c 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -150,6 +150,13 @@ static int __init armctrl_of_init(struct device_node *node,
> intc.enable[b] = base + reg_enable[b];
> intc.disable[b] = base + reg_disable[b];
>
> + irq = readl(intc.enable[b]);

readl_relaxed(), please. irq is not quite the right type either, please
use a u32.
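
For context on what the %*pbl specifier in the patch prints: it renders
the set bits of a bitmap as a comma-separated range list. The userspace
sketch below imitates that output for a single 32-bit mask. The helper
format_bit_list() is hypothetical and only approximates the kernel's
formatting; it is meant to show why a fixed-width unsigned value is the
natural type for the register contents.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the set bits of mask into buf as a range list such as "0-3,9".
 * At most size bytes are written, including the terminating NUL.
 * Entries that do not fit are dropped. Returns buf.
 */
char *format_bit_list(uint32_t mask, char *buf, size_t size)
{
	int bit = 0;

	assert(buf && size > 0);
	buf[0] = '\0';
	while (bit < 32) {
		size_t len = strlen(buf);
		int start, n;

		if (!(mask & (UINT32_C(1) << bit))) {
			bit++;
			continue;
		}
		/* Found the start of a run; advance to its end. */
		start = bit;
		while (bit < 32 && (mask & (UINT32_C(1) << bit)))
			bit++;

		if (bit - 1 == start)
			n = snprintf(buf + len, size - len, "%s%d",
				     len ? "," : "", start);
		else
			n = snprintf(buf + len, size - len, "%s%d-%d",
				     len ? "," : "", start, bit - 1);
		if (n < 0 || (size_t)n >= size - len) {
			buf[len] = '\0';	/* drop the partial entry */
			break;
		}
	}
	return buf;
}
```

A mask with only bit 9 set formats as "9", and 0x20f (bits 0 to 3 plus
bit 9) formats as "0-3,9".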

> + if (irq) {
> + writel(irq, intc.disable[b]);

writel_relaxed().

> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &irq);
> + }
> +
> for (i = 0; i < bank_irqs[b]; i++) {
> irq = irq_create_mapping(intc.domain, MAKE_HWIRQ(b, i));
> BUG_ON(irq <= 0);

Don't you need to do something about the FIQ side as well?

M.
--
Jazz is not dead. It just smells funny...

2020-02-10 09:59:55

by Lukas Wunner

Subject: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Customers of our "Revolution Pi" open source PLCs (which are based on
the Raspberry Pi) have reported random lockups as well as jittery eMMC,
UART and SPI latency. We were able to reproduce the lockups in our lab
and hooked up a JTAG debugger:

It turns out that the USB controller's interrupt is already enabled when
the kernel boots. All interrupts are disabled when the chip comes out
of power-on reset, according to the spec. So apparently the bootloader
enables the interrupt but neglects to disable it before handing over
control to the kernel.

The bootloader is a closed source blob provided by the Raspberry Pi
Foundation. Development of an alternative open source bootloader was
begun by Kristina Brooks but it's not fully functional yet. Usage of
the blob is thus without alternative for the time being.

The Raspberry Pi Foundation's downstream kernel has a performance-
optimized USB driver (which we use on our Revolution Pi products).
The driver takes advantage of the FIQ fast interrupt. Because the
regular USB interrupt was left enabled by the bootloader, both the
FIQ and the normal interrupt are enabled once the USB driver probes.

The spec has the following to say on simultaneously enabling the FIQ
and the normal interrupt of a peripheral:

"One interrupt source can be selected to be connected to the ARM FIQ
input. An interrupt which is selected as FIQ should have its normal
interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
will be fired at the same time. Not a good idea!"
                                ^^^^^^^^^^^^^^^
https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
page 110

On a multicore Raspberry Pi, the Foundation's kernel routes all normal
interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
normal interrupt are enabled, a USB interrupt causes CPU 0 to spin in
bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
Interrupts with a lower priority than USB are starved for as long.

That explains the jittery eMMC, UART and SPI latency: On one occasion
I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
everything not USB takes a performance hit: Whereas eMMC throughput
on a Compute Module 3 remains relatively constant at 23.5 MB/s with
this commit, it irregularly dips to 23.0 MB/s without this commit.

The lockups occur when CPU 0 receives a USB interrupt while holding a
lock which CPU 1 is trying to acquire while the FIQ is temporarily
disabled on CPU 1.

I've tested old releases of the Foundation's bootloader as far back as
1.20160202-1 and they all leave the USB interrupt enabled. Still older
releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
which are the only Raspberry Pi variants I have at my disposal for
testing.

Fix by disabling IRQs left enabled by the bootloader. Although the
impact is most pronounced on the Foundation's downstream kernel,
it seems prudent to apply the fix to the upstream kernel to guard
against such mistakes in any present and future bootloader.

Signed-off-by: Lukas Wunner <[email protected]>
Cc: Serge Schneider <[email protected]>
Cc: Kristina Brooks <[email protected]>
Cc: [email protected]
---
Changes since v1:
* Use "relaxed" MMIO accessors to avoid memory barriers (Marc)
* Use u32 instead of int for register access (Marc)
* Quiesce FIQ as well (Marc)
* Quiesce IRQs after mapping them for better readability
* Drop alternative approach from commit message (Marc)

Link to v1:
https://lore.kernel.org/lkml/988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de/

drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 418245d31921..63539c88ac3a 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -61,6 +61,7 @@
 					| SHORTCUT1_MASK | SHORTCUT2_MASK)

 #define REG_FIQ_CONTROL		0x0c
+#define REG_FIQ_ENABLE		0x80

 #define NR_BANKS		3
 #define IRQS_PER_BANK		32
@@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct device_node *node,
 {
 	void __iomem *base;
 	int irq, b, i;
+	u32 reg;

 	base = of_iomap(node, 0);
 	if (!base)
@@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct device_node *node,
 				handle_level_irq);
 			irq_set_probe(irq);
 		}
+
+		reg = readl_relaxed(intc.enable[b]);
+		if (reg) {
+			writel_relaxed(reg, intc.disable[b]);
+			pr_err(FW_BUG "Bootloader left irq enabled: "
+			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
+		}
+	}
+
+	reg = readl_relaxed(base + REG_FIQ_CONTROL);
+	if (reg & REG_FIQ_ENABLE) {
+		writel_relaxed(0, base + REG_FIQ_CONTROL);
+		pr_err(FW_BUG "Bootloader left fiq enabled\n");
 	}

 	if (is_2836) {
--
2.24.0
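
The FIQ handling added in v2 can be sketched in userspace as follows.
The enable flag at 0x80 comes from the patch. The 7-bit source field in
bits 6:0 is taken from my reading of the BCM2835 peripherals document and
should be treated as an assumption, as should the names FIQ_ENABLE_BIT,
FIQ_SOURCE_MASK, fiq_decode and fiq_quiesce, which are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIQ_ENABLE_BIT	UINT32_C(0x80)	/* REG_FIQ_ENABLE in the patch */
#define FIQ_SOURCE_MASK	UINT32_C(0x7f)	/* assumed: bits 6:0 select the source */

struct fiq_state {
	bool enabled;
	unsigned int source;
};

/* Split a FIQ control register value into its two fields. */
struct fiq_state fiq_decode(uint32_t control)
{
	struct fiq_state state = {
		.enabled = (control & FIQ_ENABLE_BIT) != 0,
		.source = (unsigned int)(control & FIQ_SOURCE_MASK),
	};

	return state;
}

/*
 * Same shape as the patch: if the enable flag is set, zero the whole
 * register and report that cleanup was needed. A register with the flag
 * clear is left untouched.
 */
bool fiq_quiesce(uint32_t *control)
{
	assert(control);
	if (!(*control & FIQ_ENABLE_BIT))
		return false;
	*control = 0;
	return true;
}
```

Testing a single flag inside the control register, rather than comparing
the whole value against zero, means a stale source selection with the
flag clear is not reported as a firmware bug.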

2020-02-12 04:47:39

by Florian Fainelli

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader



On 2/10/2020 1:52 AM, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt are enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"
>                                 ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt are enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved for as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.
>
> Signed-off-by: Lukas Wunner <[email protected]>
> Cc: Serge Schneider <[email protected]>
> Cc: Kristina Brooks <[email protected]>
> Cc: [email protected]

It would be nice to provide a Fixes: tag so it gets backported to the
relevant -stable trees; this may date back to the first time the
driver was brought in tree. The commit message is a bit long and starts
going into details that I am not sure add anything, but FWIW:

Reviewed-by: Florian Fainelli <[email protected]>
--
Florian

2020-02-12 08:13:59

by Marc Zyngier

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Hi Lukas,

Thanks for the update on this.

On 2020-02-10 09:52, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt are enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"
>                                 ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt are enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved for as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.

While the story is interesting, it doesn't really belong in a commit
message. Please trim it down to something along the lines of:

- The RPi bootloader is a bit crap, as it leaves IRQs and FIQs enabled
  and leaves the OS to deal with the consequences

- The kernel driver is not great either, as it doesn't properly
  initialize the interrupt state, resulting in both IRQ and FIQ
  misfiring and in bizarre behaviours

- Properly initializing the irqchip fixes the issue. Add a couple of
  warnings for good measure, so that people realize their favourite toy
  comes with sub-par SW.

> Signed-off-by: Lukas Wunner <[email protected]>
> Cc: Serge Schneider <[email protected]>
> Cc: Kristina Brooks <[email protected]>
> Cc: [email protected]
> ---
> Changes since v1:
> * Use "relaxed" MMIO accessors to avoid memory barriers (Marc)
> * Use u32 instead of int for register access (Marc)
> * Quiesce FIQ as well (Marc)
> * Quiesce IRQs after mapping them for better readability
> * Drop alternative approach from commit message (Marc)
>
> Link to v1:
> https://lore.kernel.org/lkml/988737dbbc4e499c2faaaa4e567ba3ed8deb9a89.1581089797.git.lukas@wunner.de/
>
> drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
> index 418245d31921..63539c88ac3a 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -61,6 +61,7 @@
> | SHORTCUT1_MASK | SHORTCUT2_MASK)
>
> #define REG_FIQ_CONTROL 0x0c
> +#define REG_FIQ_ENABLE 0x80
>
> #define NR_BANKS 3
> #define IRQS_PER_BANK 32
> @@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct device_node *node,
> {
> void __iomem *base;
> int irq, b, i;
> + u32 reg;
>
> base = of_iomap(node, 0);
> if (!base)
> @@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct device_node *node,
> handle_level_irq);
> irq_set_probe(irq);
> }
> +
> + reg = readl_relaxed(intc.enable[b]);
> + if (reg) {
> + writel_relaxed(reg, intc.disable[b]);
> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
> + }
> + }
> +
> + reg = readl_relaxed(base + REG_FIQ_CONTROL);
> + if (reg & REG_FIQ_ENABLE) {
> + writel_relaxed(0, base + REG_FIQ_CONTROL);
> + pr_err(FW_BUG "Bootloader left fiq enabled\n");
> }
>
> if (is_2836) {

It otherwise looks good. You can either resend it with a fixed commit
message, or provide me with a commit message that I can stick there
while applying it.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2020-02-12 12:37:21

by Lukas Wunner

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

On Tue, Feb 11, 2020 at 08:47:05PM -0800, Florian Fainelli wrote:
> The commit message is a bit long and starts
> going into details that I am not sure add anything

I adhere to the school of thought which holds that commit messages
shall provide complete context, including numbers to back up claims,
user-visible impact, affected versions, genesis of the fix and so on.
By that logic there's no such thing as a commit message that is too long.

Nevertheless please find a shortened version below, complete with
the Fixes tag you requested as well as your R-b.


On Wed, Feb 12, 2020 at 08:13:29AM +0000, Marc Zyngier wrote:
> It otherwise looks good. You can either resend it with a fixed commit
> message, or provide me with a commit message that I can stick there
> while applying it.

The below also contains the patch itself, so can be applied directly
with git am --scissors. Feel free to tweak as you see fit.
Shout if I've missed anything. Thanks.

-- >8 --
From: Lukas Wunner <[email protected]>
Subject: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Per the spec, the BCM2835's IRQs are all disabled when coming out of
power-on reset. Its IRQ driver assumes that's still the case when the
kernel boots and does not perform any initialization of the registers.
However the Raspberry Pi Foundation's bootloader leaves the USB
interrupt enabled when handing over control to the kernel.

Quiesce IRQs and the FIQ if they were left enabled and log a message to
let users know that they should update the bootloader once a fixed
version is released.

If the USB interrupt is not quiesced and the USB driver later on claims
the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel),
interrupt latency for all other peripherals increases and occasional
lockups occur. That's because both the FIQ and the normal USB interrupt
fire simultaneously.

On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0
and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB
interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the
FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved
for as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput
on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit
but remains relatively constant at 23.5 MB/s with this commit.

The lockups occur when CPU 0 receives a USB interrupt while holding a
lock which CPU 1 is trying to acquire while the FIQ is temporarily
disabled on CPU 1. At best users get RCU CPU stall warnings, but most
of the time the system just freezes.

Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver")
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Cc: [email protected] # v3.7+
Cc: Serge Schneider <[email protected]>
Cc: Kristina Brooks <[email protected]>
---
drivers/irqchip/irq-bcm2835.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 418245d..eca9ac7 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -135,6 +135,7 @@ static int __init armctrl_of_init(struct device_node *node,
 {
 	void __iomem *base;
 	int irq, b, i;
+	u32 reg;

 	base = of_iomap(node, 0);
 	if (!base)
@@ -157,6 +158,19 @@ static int __init armctrl_of_init(struct device_node *node,
 				handle_level_irq);
 			irq_set_probe(irq);
 		}
+
+		reg = readl_relaxed(intc.enable[b]);
+		if (reg) {
+			writel_relaxed(reg, intc.disable[b]);
+			pr_err(FW_BUG "Bootloader left irq enabled: "
+			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
+		}
+	}
+
+	reg = readl_relaxed(base + REG_FIQ_CONTROL);
+	if (reg & REG_FIQ_ENABLE) {
+		writel_relaxed(0, base + REG_FIQ_CONTROL);
+		pr_err(FW_BUG "Bootloader left fiq enabled\n");
 	}

 	if (is_2836) {
--
2.24.0

2020-02-12 12:55:58

by Nicolas Saenz Julienne

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

On Wed, 2020-02-12 at 13:36 +0100, Lukas Wunner wrote:
> On Tue, Feb 11, 2020 at 08:47:05PM -0800, Florian Fainelli wrote:
> > The commit message is a bit long and starts
> > going into details that I am not sure add anything
>
> I adhere to the school of thought which holds that commit messages
> shall provide complete context, including numbers to back up claims,
> user-visible impact, affected versions, genesis of the fix and so on.
> By that logic there's no such thing as a commit message that is too long.
>
> Nevertheless please find a shortened version below, complete with
> the Fixes tag you requested as well as your R-b.
>
>
> On Wed, Feb 12, 2020 at 08:13:29AM +0000, Marc Zyngier wrote:
> > It otherwise looks good. You can either resend it with a fixed commit
> > message, or provide me with a commit message that I can stick there
> > while applying it.
>
> The below also contains the patch itself, so can be applied directly
> with git am --scissors. Feel free to tweak as you see fit.
> Shout if I've missed anything. Thanks.
>
> -- >8 --
> From: Lukas Wunner <[email protected]>
> Subject: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader
>
> Per the spec, the BCM2835's IRQs are all disabled when coming out of
> power-on reset. Its IRQ driver assumes that's still the case when the
> kernel boots and does not perform any initialization of the registers.
> However the Raspberry Pi Foundation's bootloader leaves the USB
> interrupt enabled when handing over control to the kernel.
>
> Quiesce IRQs and the FIQ if they were left enabled and log a message to
> let users know that they should update the bootloader once a fixed
> version is released.
>
> If the USB interrupt is not quiesced and the USB driver later on claims
> the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel),
> interrupt latency for all other peripherals increases and occasional
> lockups occur. That's because both the FIQ and the normal USB interrupt
> fire simultaneously.
>
> On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0
> and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB
> interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the
> FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved
> for as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput
> on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit
> but remains relatively constant at 23.5 MB/s with this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1. At best users get RCU CPU stall warnings, but most
> of the time the system just freezes.
>
> Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver")
> Signed-off-by: Lukas Wunner <[email protected]>
> Reviewed-by: Florian Fainelli <[email protected]>
> Cc: [email protected] # v3.7+
> Cc: Serge Schneider <[email protected]>
> Cc: Kristina Brooks <[email protected]>

Reviewed-by: Nicolas Saenz Julienne <[email protected]>

Thanks!

> ---
> drivers/irqchip/irq-bcm2835.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
> index 418245d..eca9ac7 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -135,6 +135,7 @@ static int __init armctrl_of_init(struct device_node *node,
> {
> void __iomem *base;
> int irq, b, i;
> + u32 reg;
>
> base = of_iomap(node, 0);
> if (!base)
> @@ -157,6 +158,19 @@ static int __init armctrl_of_init(struct device_node *node,
> handle_level_irq);
> irq_set_probe(irq);
> }
> +
> + reg = readl_relaxed(intc.enable[b]);
> + if (reg) {
> + writel_relaxed(reg, intc.disable[b]);
> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
> + }
> + }
> +
> + reg = readl_relaxed(base + REG_FIQ_CONTROL);
> + if (reg & REG_FIQ_ENABLE) {
> + writel_relaxed(0, base + REG_FIQ_CONTROL);
> + pr_err(FW_BUG "Bootloader left fiq enabled\n");
> }
>
> if (is_2836) {



2020-02-23 18:01:14

by Stefan Wahren

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Hi Lukas,

On 12.02.20 at 13:36, Lukas Wunner wrote:
> On Tue, Feb 11, 2020 at 08:47:05PM -0800, Florian Fainelli wrote:
>> The commit message is a bit long and starts
>> going into details that I am not sure add anything
> I adhere to the school of thought which holds that commit messages
> shall provide complete context, including numbers to back up claims,
> user-visible impact, affected versions, genesis of the fix and so on.
> By that logic there's no such thing as a commit message that is too long.
>
> Nevertheless please find a shortened version below, complete with
> the Fixes tag you requested as well as your R-b.
>
>
> On Wed, Feb 12, 2020 at 08:13:29AM +0000, Marc Zyngier wrote:
>> It otherwise looks good. You can either resend it with a fixed commit
>> message, or provide me with a commit message that I can stick there
>> while applying it.
> The below also contains the patch itself, so can be applied directly
> with git am --scissors. Feel free to tweak as you see fit.
> Shout if I've missed anything. Thanks.

thanks for all the investigation. Unfortunately the patch below doesn't
compile, since it lacks the definition of REG_FIQ_ENABLE.

Btw the name is a little bit unlucky because it defines a single flag
within REG_FIQ_CONTROL instead of a separate register.

Regards
Stefan

>
> -- >8 --
> From: Lukas Wunner <[email protected]>
> Subject: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader
>
> Per the spec, the BCM2835's IRQs are all disabled when coming out of
> power-on reset. Its IRQ driver assumes that's still the case when the
> kernel boots and does not perform any initialization of the registers.
> However the Raspberry Pi Foundation's bootloader leaves the USB
> interrupt enabled when handing over control to the kernel.
>
> Quiesce IRQs and the FIQ if they were left enabled and log a message to
> let users know that they should update the bootloader once a fixed
> version is released.
>
> If the USB interrupt is not quiesced and the USB driver later on claims
> the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel),
> interrupt latency for all other peripherals increases and occasional
> lockups occur. That's because both the FIQ and the normal USB interrupt
> fire simultaneously.
>
> On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0
> and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB
> interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the
> FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved
> as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput
> on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit
> but remains relatively constant at 23.5 MB/s with this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1. At best users get RCU CPU stall warnings, but most
> of the time the system just freezes.
>
> Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver")
> Signed-off-by: Lukas Wunner <[email protected]>
> Reviewed-by: Florian Fainelli <[email protected]>
> Cc: [email protected] # v3.7+
> Cc: Serge Schneider <[email protected]>
> Cc: Kristina Brooks <[email protected]>
> ---
> drivers/irqchip/irq-bcm2835.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
> index 418245d..eca9ac7 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -135,6 +135,7 @@ static int __init armctrl_of_init(struct device_node *node,
> {
> void __iomem *base;
> int irq, b, i;
> + u32 reg;
>
> base = of_iomap(node, 0);
> if (!base)
> @@ -157,6 +158,19 @@ static int __init armctrl_of_init(struct device_node *node,
> handle_level_irq);
> irq_set_probe(irq);
> }
> +
> + reg = readl_relaxed(intc.enable[b]);
> + if (reg) {
> + writel_relaxed(reg, intc.disable[b]);
> + pr_err(FW_BUG "Bootloader left irq enabled: "
> + "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
> + }
> + }
> +
> + reg = readl_relaxed(base + REG_FIQ_CONTROL);
> + if (reg & REG_FIQ_ENABLE) {
> + writel_relaxed(0, base + REG_FIQ_CONTROL);
> + pr_err(FW_BUG "Bootloader left fiq enabled\n");
> }
>
> if (is_2836) {

2020-02-23 18:25:10

by Lukas Wunner

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

On Sun, Feb 23, 2020 at 06:59:56PM +0100, Stefan Wahren wrote:
> thanks for all the investigation. Unfortunately the patch below doesn't
> compile, since it lacks the definition of REG_FIQ_ENABLE.

Ugh, I recall fixing that when compile-testing. I must have forgotten
to invoke "git commit --amend" before "git format-patch".

> Btw the name is a little bit unlucky because it defines a single flag
> within REG_FIQ_CONTROL instead of a separate register.

The Foundation's repo uses that name so I stuck by it to reduce the
number of merge conflicts Phil will have to resolve. Happy to change
though, suggestions welcome.

Thanks!

Lukas


2020-02-24 09:22:38

by Stefan Wahren

Subject: Re: [PATCH v2] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Hi Lukas,

On 23.02.20 19:24, Lukas Wunner wrote:
> On Sun, Feb 23, 2020 at 06:59:56PM +0100, Stefan Wahren wrote:
>> thanks for all the investigation. Unfortunately the patch below doesn't
>> compile, since it lacks the definition of REG_FIQ_ENABLE.
> Ugh, I recall fixing that when compile-testing. I must have forgotten
> to invoke "git commit --amend" before "git format-patch".
>
>> Btw the name is a little bit unlucky because it defines a single flag
>> within REG_FIQ_CONTROL instead of a separate register.
> The Foundation's repo uses that name so I stuck by it to reduce the
> number of merge conflicts Phil will have to resolve. Happy to change
> though, suggestions welcome.

readability has a higher prio. How about:

#define FIQ_CONTROL_ENABLE BIT(7)

Regards
Stefan
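
For readers following the naming discussion: the point is that the macro names one flag inside the FIQ control register, not a register offset. Below is a minimal user-space sketch of that distinction. BIT() is redefined locally because the kernel header is not available outside the kernel, and fiq_is_enabled() plus the sample values in the note are hypothetical illustrations, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro (assumption for user space). */
#define BIT(n)			(UINT32_C(1) << (n))

#define REG_FIQ_CONTROL		0x0c	/* register offset from the controller base */
#define FIQ_CONTROL_ENABLE	BIT(7)	/* single flag inside that register */

/* The flag is a bit mask, so it must never be used as an offset. */
static_assert(FIQ_CONTROL_ENABLE == 0x80u, "enable flag is bit 7");

/* Hypothetical helper: true if the enable flag is set in a register value. */
static bool fiq_is_enabled(uint32_t fiq_control)
{
	return (fiq_control & FIQ_CONTROL_ENABLE) != 0;
}
```

With these definitions, fiq_is_enabled(0x89) is true (bit 7 set, low bits arbitrary) while fiq_is_enabled(0x09) is false, whatever the low bits contain.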

2020-02-25 09:53:44

by Lukas Wunner

Subject: [PATCH v4] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Per the spec, the BCM2835's IRQs are all disabled when coming out of
power-on reset. Its IRQ driver assumes that's still the case when the
kernel boots and does not perform any initialization of the registers.
However the Raspberry Pi Foundation's bootloader leaves the USB
interrupt enabled when handing over control to the kernel.

Quiesce IRQs and the FIQ if they were left enabled and log a message to
let users know that they should update the bootloader once a fixed
version is released.

If the USB interrupt is not quiesced and the USB driver later on claims
the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel),
interrupt latency for all other peripherals increases and occasional
lockups occur. That's because both the FIQ and the normal USB interrupt
fire simultaneously:

On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0
and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB
interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the
FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved
as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput
on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit
but remains relatively constant at 23.5 MB/s with this commit.

The lockups occur when CPU 0 receives a USB interrupt while holding a
lock which CPU 1 is trying to acquire while the FIQ is temporarily
disabled on CPU 1. At best users get RCU CPU stall warnings, but most
of the time the system just freezes.

Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver")
Signed-off-by: Lukas Wunner <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Nicolas Saenz Julienne <[email protected]>
Cc: [email protected] # v3.7+
Cc: Serge Schneider <[email protected]>
Cc: Kristina Brooks <[email protected]>
Cc: Stefan Wahren <[email protected]>
---
v4:
* Add missing REG_FIQ_ENABLE macro, rename to FIQ_CONTROL_ENABLE (Stefan)

v3: (submitted as inline patch)
* Shorten commit message (Florian, Marc)

v2:
* Use "relaxed" MMIO accessors to avoid memory barriers (Marc)
* Use u32 instead of int for register access (Marc)
* Quiesce FIQ as well (Marc)
* Quiesce IRQs after mapping them for better readability
* Drop alternative approach from commit message (Marc)

drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 418245d31921..a1e004af23e7 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -61,6 +61,7 @@
| SHORTCUT1_MASK | SHORTCUT2_MASK)

#define REG_FIQ_CONTROL 0x0c
+#define FIQ_CONTROL_ENABLE BIT(7)

#define NR_BANKS 3
#define IRQS_PER_BANK 32
@@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct device_node *node,
 {
 	void __iomem *base;
 	int irq, b, i;
+	u32 reg;

 	base = of_iomap(node, 0);
 	if (!base)
@@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct device_node *node,
 				handle_level_irq);
 			irq_set_probe(irq);
 		}
+
+		reg = readl_relaxed(intc.enable[b]);
+		if (reg) {
+			writel_relaxed(reg, intc.disable[b]);
+			pr_err(FW_BUG "Bootloader left irq enabled: "
+			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
+		}
+	}
+
+	reg = readl_relaxed(base + REG_FIQ_CONTROL);
+	if (reg & FIQ_CONTROL_ENABLE) {
+		writel_relaxed(0, base + REG_FIQ_CONTROL);
+		pr_err(FW_BUG "Bootloader left fiq enabled\n");
 	}

 	if (is_2836) {
--
2.24.0
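
The quiesce loop in the patch reads each bank's enable register and writes the same value to the matching disable register. The sketch below models that in user space, assuming the enable and disable registers act on set bits only (a written 1 sets or clears the corresponding enable bit, a written 0 leaves it alone). struct fake_intc and the fake_write_*() helpers are hypothetical stand-ins for the memory-mapped registers and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define NR_BANKS	3

/* Simulated controller state; a stand-in for the real MMIO registers. */
struct fake_intc {
	uint32_t enabled[NR_BANKS];	/* what a read of the enable register returns */
};

/* Models a write to an enable register: set bits become enabled. */
static void fake_write_enable(struct fake_intc *ic, int bank, uint32_t val)
{
	assert(bank >= 0 && bank < NR_BANKS);
	ic->enabled[bank] |= val;
}

/* Models a write to a disable register: set bits become disabled. */
static void fake_write_disable(struct fake_intc *ic, int bank, uint32_t val)
{
	assert(bank >= 0 && bank < NR_BANKS);
	ic->enabled[bank] &= ~val;
}

/*
 * Mirrors the loop added by the patch: whatever is found enabled is
 * written back to the disable register. Returns the number of banks
 * that had to be quiesced.
 */
static int quiesce_all(struct fake_intc *ic)
{
	int b, fixed = 0;

	for (b = 0; b < NR_BANKS; b++) {
		uint32_t reg = ic->enabled[b];

		if (reg) {
			fake_write_disable(ic, b, reg);
			fixed++;
		}
	}
	return fixed;
}
```

If one arbitrary bit is enabled in one bank beforehand, the first quiesce_all() call returns 1 and leaves every bank at zero. A second call returns 0, so the operation is idempotent on a clean controller.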

Subject: [tip: irq/core] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

The following commit has been merged into the irq/core branch of tip:

Commit-ID: bd59b343a9c902c522f006e6d71080f4893bbf42
Gitweb: https://git.kernel.org/tip/bd59b343a9c902c522f006e6d71080f4893bbf42
Author: Lukas Wunner <[email protected]>
AuthorDate: Tue, 25 Feb 2020 10:50:41 +01:00
Committer: Marc Zyngier <[email protected]>
CommitterDate: Mon, 16 Mar 2020 15:48:54

irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Per the spec, the BCM2835's IRQs are all disabled when coming out of
power-on reset. Its IRQ driver assumes that's still the case when the
kernel boots and does not perform any initialization of the registers.
However the Raspberry Pi Foundation's bootloader leaves the USB
interrupt enabled when handing over control to the kernel.

Quiesce IRQs and the FIQ if they were left enabled and log a message to
let users know that they should update the bootloader once a fixed
version is released.

If the USB interrupt is not quiesced and the USB driver later on claims
the FIQ (as it does on the Raspberry Pi Foundation's downstream kernel),
interrupt latency for all other peripherals increases and occasional
lockups occur. That's because both the FIQ and the normal USB interrupt
fire simultaneously:

On a multicore Raspberry Pi, if normal interrupts are routed to CPU 0
and the FIQ to CPU 1 (hardcoded in the Foundation's kernel), then a USB
interrupt causes CPU 0 to spin in bcm2836_chained_handle_irq() until the
FIQ on CPU 1 has cleared it. Other peripherals' interrupts are starved
as long. I've seen CPU 0 blocked for up to 2.9 msec. eMMC throughput
on a Compute Module 3 irregularly dips to 23.0 MB/s without this commit
but remains relatively constant at 23.5 MB/s with this commit.

The lockups occur when CPU 0 receives a USB interrupt while holding a
lock which CPU 1 is trying to acquire while the FIQ is temporarily
disabled on CPU 1. At best users get RCU CPU stall warnings, but most
of the time the system just freezes.

Fixes: 89214f009c1d ("ARM: bcm2835: add interrupt controller driver")
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Nicolas Saenz Julienne <[email protected]>
Link: https://lore.kernel.org/r/f97868ba4e9b86ddad71f44ec9d8b3b7d8daa1ea.1582618537.git.lukas@wunner.de
---
drivers/irqchip/irq-bcm2835.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
index 418245d..a1e004a 100644
--- a/drivers/irqchip/irq-bcm2835.c
+++ b/drivers/irqchip/irq-bcm2835.c
@@ -61,6 +61,7 @@
| SHORTCUT1_MASK | SHORTCUT2_MASK)

#define REG_FIQ_CONTROL 0x0c
+#define FIQ_CONTROL_ENABLE BIT(7)

#define NR_BANKS 3
#define IRQS_PER_BANK 32
@@ -135,6 +136,7 @@ static int __init armctrl_of_init(struct device_node *node,
 {
 	void __iomem *base;
 	int irq, b, i;
+	u32 reg;

 	base = of_iomap(node, 0);
 	if (!base)
@@ -157,6 +159,19 @@ static int __init armctrl_of_init(struct device_node *node,
 				handle_level_irq);
 			irq_set_probe(irq);
 		}
+
+		reg = readl_relaxed(intc.enable[b]);
+		if (reg) {
+			writel_relaxed(reg, intc.disable[b]);
+			pr_err(FW_BUG "Bootloader left irq enabled: "
+			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &reg);
+		}
+	}
+
+	reg = readl_relaxed(base + REG_FIQ_CONTROL);
+	if (reg & FIQ_CONTROL_ENABLE) {
+		writel_relaxed(0, base + REG_FIQ_CONTROL);
+		pr_err(FW_BUG "Bootloader left fiq enabled\n");
 	}

 	if (is_2836) {
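
The warning in the patch uses the printk format %*pbl, which prints a bitmap as a list of bit ranges, with the field width giving the number of bits. To show roughly what such a list looks like for a 32-bit bank, here is a user-space approximation. format_bit_list() is a hypothetical helper written for illustration only; it is not the kernel's implementation and may differ from it in details.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes the set bits of 'bits' into buf as a
 * comma-separated list of ranges such as "0-3,9" and returns buf.
 * Output that does not fit is cut off at the last complete write.
 */
static char *format_bit_list(uint32_t bits, char *buf, size_t len)
{
	size_t pos = 0;
	int bit = 0;

	assert(buf && len > 0);
	buf[0] = '\0';

	while (bit < 32) {
		int start, end, n;

		if (!(bits & (UINT32_C(1) << bit))) {
			bit++;
			continue;
		}

		/* Found a set bit; extend the run as far as it goes. */
		start = bit;
		while (bit < 32 && (bits & (UINT32_C(1) << bit)))
			bit++;
		end = bit - 1;

		if (start == end)
			n = snprintf(buf + pos, len - pos, "%s%d",
				     pos ? "," : "", start);
		else
			n = snprintf(buf + pos, len - pos, "%s%d-%d",
				     pos ? "," : "", start, end);
		if (n < 0 || (size_t)n >= len - pos)
			break;	/* buffer too small; stop here */
		pos += (size_t)n;
	}
	return buf;
}
```

A bank value with only bit 9 set yields "9", and a value of 0x20f (bits 0 to 3 plus bit 9) yields "0-3,9". This is the kind of list a user would see after "irq" in the log line.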