2019-02-20 22:16:21

by Florian Fainelli

[permalink] [raw]
Subject: [PATCH] irqchip: brcmstb-l2: use _irqsave variants in non-interrupt code

From: Doug Berger <[email protected]>

Using the irq_gc_lock/irq_gc_unlock functions in the suspend and
resume functions creates the opportunity for a deadlock during
suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/
irq_gc_unlock_irqrestore variants prevents this possible deadlock.

Signed-off-by: Doug Berger <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
---
drivers/irqchip/irq-brcmstb-l2.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
index 0e65f609352e..83364fedbf0a 100644
--- a/drivers/irqchip/irq-brcmstb-l2.c
+++ b/drivers/irqchip/irq-brcmstb-l2.c
@@ -129,8 +129,9 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
struct brcmstb_l2_intc_data *b = gc->private;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
/* Save the current mask */
b->saved_mask = irq_reg_readl(gc, ct->regs.mask);

@@ -139,7 +140,7 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
irq_reg_writel(gc, ~gc->wake_active, ct->regs.disable);
irq_reg_writel(gc, gc->wake_active, ct->regs.enable);
}
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}

static void brcmstb_l2_intc_resume(struct irq_data *d)
@@ -147,8 +148,9 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
struct brcmstb_l2_intc_data *b = gc->private;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
if (ct->chip.irq_ack) {
/* Clear unmasked non-wakeup interrupts */
irq_reg_writel(gc, ~b->saved_mask & ~gc->wake_active,
@@ -158,7 +160,7 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
/* Restore the saved mask */
irq_reg_writel(gc, b->saved_mask, ct->regs.disable);
irq_reg_writel(gc, ~b->saved_mask, ct->regs.enable);
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}

static int __init brcmstb_l2_intc_of_init(struct device_node *np,
--
2.17.1



2019-02-21 10:16:45

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] irqchip: brcmstb-l2: use _irqsave variants in non-interrupt code

On Wed, 20 Feb 2019 14:15:28 -0800
Florian Fainelli <[email protected]> wrote:

> From: Doug Berger <[email protected]>
>
> Using the irq_gc_lock/irq_gc_unlock functions in the suspend and
> resume functions creates the opportunity for a deadlock during
> suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/
> irq_gc_unlock_irqrestore variants prevents this possible deadlock.
>
> Signed-off-by: Doug Berger <[email protected]>
> Signed-off-by: Florian Fainelli <[email protected]>

Applied to irqchip-next with:

Cc: [email protected]
Fixes: 7f646e92766e2 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box
Level-2 interrupt controller")

Now, I'm worried this is not the only issue, see below.

> ---
> drivers/irqchip/irq-brcmstb-l2.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
> index 0e65f609352e..83364fedbf0a 100644
> --- a/drivers/irqchip/irq-brcmstb-l2.c
> +++ b/drivers/irqchip/irq-brcmstb-l2.c
> @@ -129,8 +129,9 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
> struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> struct irq_chip_type *ct = irq_data_get_chip_type(d);
> struct brcmstb_l2_intc_data *b = gc->private;
> + unsigned long flags;
>
> - irq_gc_lock(gc);
> + irq_gc_lock_irqsave(gc, flags);
> /* Save the current mask */
> b->saved_mask = irq_reg_readl(gc, ct->regs.mask);
>
> @@ -139,7 +140,7 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
> irq_reg_writel(gc, ~gc->wake_active, ct->regs.disable);
> irq_reg_writel(gc, gc->wake_active, ct->regs.enable);
> }
> - irq_gc_unlock(gc);
> + irq_gc_unlock_irqrestore(gc, flags);
> }
>
> static void brcmstb_l2_intc_resume(struct irq_data *d)
> @@ -147,8 +148,9 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
> struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> struct irq_chip_type *ct = irq_data_get_chip_type(d);
> struct brcmstb_l2_intc_data *b = gc->private;
> + unsigned long flags;
>
> - irq_gc_lock(gc);
> + irq_gc_lock_irqsave(gc, flags);
> if (ct->chip.irq_ack) {
> /* Clear unmasked non-wakeup interrupts */
> irq_reg_writel(gc, ~b->saved_mask & ~gc->wake_active,
> @@ -158,7 +160,7 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
> /* Restore the saved mask */
> irq_reg_writel(gc, b->saved_mask, ct->regs.disable);
> irq_reg_writel(gc, ~b->saved_mask, ct->regs.enable);
> - irq_gc_unlock(gc);
> + irq_gc_unlock_irqrestore(gc, flags);
> }
>
> static int __init brcmstb_l2_intc_of_init(struct device_node *np,


I've had a quick look at the generic irqchip code, and the mask/unmask
code seems to suffer from something similar. Both implementations use
the non irq-safe version, and seem vulnerable to the following scenario:

From a non-interrupt context:

irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY)
disable_irq(irq)
irq_disable(irqdesc, true)
irq_gc_mask_disable_reg(&irqdesc->irq_data)
irq_gc_lock()

interrupt fires here:

brcmstb_l2_intc_irq_handle()
generic_handle_irq()
handle_edge_irq()
mask_ack_irq()
brcmstb_l2_mask_and_ack()
irq_gc_lock() ----> deadlock

I haven't actually observed this, but it feels like it could happen.
This should just be a matter of turning the mask/unmask/set_wake
callbacks into the irq-safe version (see patch below).

Thomas, what do you think?

Thanks,

M.

From ef003fbeb54c7f12920b842636ee3c7867cc7d69 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <[email protected]>
Date: Thu, 21 Feb 2019 10:05:24 +0000
Subject: [PATCH] genirq: Fix generic irq chip locking in non-interrupt code

A number of the irqchip methods can be called both in and out
of interrupt context (mask, unmask). If these methods are
using any form of locking, it is crucial to make this locking
interrupt safe, or risk a deadlock in an interrupt occurs during
the critical section.

The generic irqchip implementation doesn't obey the above rule.
Let's address it by changing all the callbacks that can be used
in non-interrupt context to using irqsafe locking.

Fixes: 7d8280624797 ("genirq: Implement a generic interrupt chip")
Signed-off-by: Marc Zyngier <[email protected]>
---
kernel/irq/generic-chip.c | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/irq/generic-chip.c b/kernel/irq/generic-chip.c
index e2999a070a99..a21c42c6db51 100644
--- a/kernel/irq/generic-chip.c
+++ b/kernel/irq/generic-chip.c
@@ -38,11 +38,12 @@ void irq_gc_mask_disable_reg(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
u32 mask = d->mask;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
irq_reg_writel(gc, mask, ct->regs.disable);
*ct->mask_cache &= ~mask;
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}

/**
@@ -57,11 +58,12 @@ void irq_gc_mask_set_bit(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
u32 mask = d->mask;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
*ct->mask_cache |= mask;
irq_reg_writel(gc, *ct->mask_cache, ct->regs.mask);
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}
EXPORT_SYMBOL_GPL(irq_gc_mask_set_bit);

@@ -77,11 +79,12 @@ void irq_gc_mask_clr_bit(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
u32 mask = d->mask;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
*ct->mask_cache &= ~mask;
irq_reg_writel(gc, *ct->mask_cache, ct->regs.mask);
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}
EXPORT_SYMBOL_GPL(irq_gc_mask_clr_bit);

@@ -97,11 +100,12 @@ void irq_gc_unmask_enable_reg(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct irq_chip_type *ct = irq_data_get_chip_type(d);
u32 mask = d->mask;
+ unsigned long flags;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
irq_reg_writel(gc, mask, ct->regs.enable);
*ct->mask_cache |= mask;
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
}

/**
@@ -188,16 +192,17 @@ int irq_gc_set_wake(struct irq_data *d, unsigned int on)
{
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
u32 mask = d->mask;
+ unsigned long flags;

if (!(mask & gc->wake_enabled))
return -EINVAL;

- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
if (on)
gc->wake_active |= mask;
else
gc->wake_active &= ~mask;
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
return 0;
}

--
2.20.1


--
Without deviation from the norm, progress is not possible.

2019-02-21 21:04:13

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH] irqchip: brcmstb-l2: use _irqsave variants in non-interrupt code

On 2/21/19 2:15 AM, Marc Zyngier wrote:
> On Wed, 20 Feb 2019 14:15:28 -0800
> Florian Fainelli <[email protected]> wrote:
>
>> From: Doug Berger <[email protected]>
>>
>> Using the irq_gc_lock/irq_gc_unlock functions in the suspend and
>> resume functions creates the opportunity for a deadlock during
>> suspend, resume, and shutdown. Using the irq_gc_lock_irqsave/
>> irq_gc_unlock_irqrestore variants prevents this possible deadlock.
>>
>> Signed-off-by: Doug Berger <[email protected]>
>> Signed-off-by: Florian Fainelli <[email protected]>
>
> Applied to irqchip-next with:
>
> Cc: [email protected]
> Fixes: 7f646e92766e2 ("irqchip: brcmstb-l2: Add Broadcom Set Top Box
> Level-2 interrupt controller")
>
> Now, I'm worried this is not the only issue, see below.

Thanks for adding the Fixes tag, I should have done that, but did not
consider it to be a particularly serious issue (we have not see it in
real life, just only when LOCKDEP is enabled).
--
Florian