2009-04-14 05:47:24

by Ben Nizette

[permalink] [raw]
Subject: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

[cc'ing *correct* lkml address]

After pulling in Linus' latest this morning, a tap on the touchscreen of
my AVR32-based board caused the system to hang hard. It's an ads7843
touchscreen controller driven by drivers/input/touchscreen/ads7846.c -
the tap triggers a pen-down IRQ. Note that I made a small mod to that
driver; due to limitations of the AVR32's gpio interrupt controller I
have to do

@@ -1129,7 +1129,7 @@ static int __devinit ads7846_probe(struct spi_device
*spi)

ts->last_msg = m;

- if (request_irq(spi->irq, ads7846_irq, IRQF_TRIGGER_FALLING,
+ if (request_irq(spi->irq, ads7846_irq, IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING,
spi->dev.driver->name, ts)) {
dev_dbg(&spi->dev, "irq %d busy?\n", spi->irq);
err = -EBUSY;

This doesn't effect the driver operation as the irq handler checks the
logic level of the line anyway.

I've bisected the problem to

commit 3aa551c9b4c40018f0e261a178e3d25478dc04a9
Author: Thomas Gleixner <[email protected]>
Date: Mon Mar 23 18:28:15 2009 +0100

genirq: add threaded interrupt handler support

Add support for threaded interrupt handlers:


Because this doesn't revert cleanly I did the bisection again just to be
sure and it came out the same.

Just before the hang,
~ # cat /proc/interrupts
CPU0
0: 42 intc avr32_comparator
1: 0 intc atmel_lcdfb
2: 1 intc dw_dmac
4: 4 intc atmel_spi.1
9: 555 intc ttyS0
21: 0 intc rtc
22: 34037 intc tc_clkevt
24: 0 intc atmel_pwm
25: 2439 intc eth0
28: 156 intc atmel_mci.0
31: 0 intc atmel_usba_udc
131: 0 gpio ads7846

Note it's that very last line which, when triggered, causes the hang.
As you can see, it's the only one connected through the gpio interrupt
chip.

No sysrq action is possible because of the limited kinds of terminals I
can attach to this board.

Given the limited scope for debug (no logs or sysrq output available) I
can't think exactly what more I can tell you to help you get to the
bottom of it. I'll do whatever I can though, it's perfectly repeatable.

Thanks,
--Ben.



2009-04-14 08:37:41

by Ben Nizette

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

On Tue, 2009-04-14 at 14:59 +1000, Ben Nizette wrote:

> Note it's that very last line which, when triggered, causes the hang.
> As you can see, it's the only one connected through the gpio interrupt
> chip.

Okhayy.. This is probably a red herring. I just connected a button to
a gpio irq on that board and it sat there happily interrupting all night
long.

If anyone's reading with an ADS7846 on some other arch can they try to
reproduce?

If no one beats me to it i'll try and scatter printks and narrow down
exactly *where* it all explodes tomorrow.

Thx,
--Ben.

2009-04-15 01:31:29

by Ben Nizette

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

On Tue, 2009-04-14 at 18:37 +1000, Ben Nizette wrote:

> If no one beats me to it i'll try and scatter printks and narrow down
> exactly *where* it all explodes tomorrow.

static irqreturn_t ads7846_irq(int irq, void *handle)
{
struct ads7846 *ts = handle;
unsigned long flags;

spin_lock_irqsave(&ts->lock, flags);
if (likely(get_pendown_state(ts))) {
if (!ts->irq_disabled) {
/* The ARM do_simple_IRQ() dispatcher doesn't act
* like the other dispatchers: it will report IRQs
* even after they've been disabled. We work around
* that here. (The "generic irq" framework may help...)
*/
ts->irq_disabled = 1;
disable_irq(ts->spi->irq);
ts->pending = 1;
hrtimer_start(&ts->timer, ktime_set(0, TS_POLL_DELAY),
HRTIMER_MODE_REL);
}
}
spin_unlock_irqrestore(&ts->lock, flags);

return IRQ_HANDLED;
}

Right, a scattering of printks in there shows execution gets at least as
far as inside the inner-most code block. A printk directly before the
disable_irq() doesn't make it to my console so I'm guessing that the
hang is in there somewhere.

Hope this helps,
--Ben.


2009-04-15 07:57:25

by Haavard Skinnemoen

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

Ben Nizette wrote:
> static irqreturn_t ads7846_irq(int irq, void *handle)
> {
> struct ads7846 *ts = handle;
> unsigned long flags;
>
> spin_lock_irqsave(&ts->lock, flags);
> if (likely(get_pendown_state(ts))) {
> if (!ts->irq_disabled) {
> /* The ARM do_simple_IRQ() dispatcher doesn't act
> * like the other dispatchers: it will report IRQs
> * even after they've been disabled. We work around
> * that here. (The "generic irq" framework may help...)
> */
> ts->irq_disabled = 1;
> disable_irq(ts->spi->irq);

Shouldn't that be disable_irq_nosync()?

Haavard

2009-04-15 08:35:12

by Haavard Skinnemoen

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

Haavard Skinnemoen wrote:
> Ben Nizette wrote:
> > static irqreturn_t ads7846_irq(int irq, void *handle)
> > {
> > struct ads7846 *ts = handle;
> > unsigned long flags;
> >
> > spin_lock_irqsave(&ts->lock, flags);
> > if (likely(get_pendown_state(ts))) {
> > if (!ts->irq_disabled) {
> > /* The ARM do_simple_IRQ() dispatcher doesn't act
> > * like the other dispatchers: it will report IRQs
> > * even after they've been disabled. We work around
> > * that here. (The "generic irq" framework may help...)
> > */
> > ts->irq_disabled = 1;
> > disable_irq(ts->spi->irq);
>
> Shouldn't that be disable_irq_nosync()?

Ok, simply stating that without providing an explanation was probably
not so helpful...

I think the problem is that disable_irq() calls synchronize_irq(),
which waits until the corresponding interrupt handler is no longer
running on any CPU. And since we're calling it from the interrupt
handler, we'll have a deadlock because the interrupt handler won't
return until synchronize_irq() returns, and that won't happen until the
interrupt handler returns and so on.

So I think changing disable_irq() into disable_irq_nosync() (which
doesn't call synchronize_irq()) will fix it, though I'm not sure what
difference the threaded interrupt handling code makes -- AFAICT, this
code has always been dangerous.

Haavard

2009-04-15 16:01:25

by Bill Gatliff

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

Haavard Skinnemoen wrote:
> AFAICT, this code has always been dangerous.
>

Can you propose a better way to handle the problem? It's a case that
crops up with a lot of touch screen controllers.


b.g.

--
Bill Gatliff
[email protected]

2009-04-16 00:42:36

by Ben Nizette

[permalink] [raw]
Subject: [PATCH] ads7846: fix unsafe disable_irq (was [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32)

On Wed, 2009-04-15 at 09:57 +0200, Haavard Skinnemoen wrote:
> Shouldn't that be disable_irq_nosync()?

Indeed, good catch. That fixes it.

--------------8<--------------------

The use of disable_irq inside the handler for the interrupt being
disabled has always been dangerous. disable_irq should wait for that
handler to complete before returning -> deadlock.

For some reason this wasn't actually the case until 3aa551c9b was merged
but since this time, the ads7846 driver has deadlocked the system on
first interrupt.

Convert the driver to use the handler-safe _nosync variant.

Signed-off-by: Ben Nizette <[email protected]>

---

diff --git a/drivers/input/touchscreen/ads7846.c b/drivers/input/touchscreen/ads7846.c
index 61b4a02..f2513e6 100644
--- a/drivers/input/touchscreen/ads7846.c
+++ b/drivers/input/touchscreen/ads7846.c
@@ -746,7 +746,7 @@ static irqreturn_t ads7846_irq(int irq, void *handle)
* that here. (The "generic irq" framework may help...)
*/
ts->irq_disabled = 1;
- disable_irq(ts->spi->irq);
+ disable_irq_nosync(ts->spi->irq);
ts->pending = 1;
hrtimer_start(&ts->timer, ktime_set(0, TS_POLL_DELAY),
HRTIMER_MODE_REL);

2009-04-16 01:58:46

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] ads7846: fix unsafe disable_irq (was [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32)

On Thu, Apr 16, 2009 at 10:41:57AM +1000, Ben Nizette wrote:
> On Wed, 2009-04-15 at 09:57 +0200, Haavard Skinnemoen wrote:
> > Shouldn't that be disable_irq_nosync()?
>
> Indeed, good catch. That fixes it.
>
> --------------8<--------------------
>
> The use of disable_irq inside the handler for the interrupt being
> disabled has always been dangerous. disable_irq should wait for that
> handler to complete before returning -> deadlock.
>
> For some reason this wasn't actually the case until 3aa551c9b was merged
> but since this time, the ads7846 driver has deadlocked the system on
> first interrupt.
>
> Convert the driver to use the handler-safe _nosync variant.
>

Applied, thank you very much Ben.

--
Dmitry

2009-04-16 08:41:10

by Haavard Skinnemoen

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

Bill Gatliff wrote:
> Haavard Skinnemoen wrote:
> > AFAICT, this code has always been dangerous.
> >
>
> Can you propose a better way to handle the problem? It's a case that
> crops up with a lot of touch screen controllers.

I thought I already did: Use disable_irq_nosync() instead.

Haavard

2009-04-16 13:25:01

by Bill Gatliff

[permalink] [raw]
Subject: Re: [REGRESSION] threaded interrupt handler support breaks (some) irq handling on AVR32

Haavard Skinnemoen wrote:
> Bill Gatliff wrote:
>
>> Haavard Skinnemoen wrote:
>>
>>> AFAICT, this code has always been dangerous.
>>>
>>>
>> Can you propose a better way to handle the problem? It's a case that
>> crops up with a lot of touch screen controllers.
>>
>
> I thought I already did: Use disable_irq_nosync() instead.
>

Aah. I misinterpreted what you wrote. Thanks!


b.g.

--
Bill Gatliff
[email protected]