2009-03-01 09:44:25

by Thomas Gleixner

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Sat, 28 Feb 2009, David Brownell wrote:
> That seems to presume a hardirq-to-taskirq handoff. But the
> problem case is taskirq-to-taskirq chaining, through e.g.
> what set_irq_chip_and_handler() provided. (Details not very
> amenable to brief emails, just UTSL.)
>
> Thing is, I'm not sure a per-IRQ thread can work easily with
> that chaining. The chained IRQs can need to be handled before
> the top-level IRQ gets re-enabled. That's why the twl4030-irq
> code uses just one taskirq thread for all incoming events.

This can be solved by a completion as well.

> (Which of course is rarely more than one at a time, so there's
> little reason not to share that task between the demuxing code
> and the events being demuxed. Interrupts that need processing
> via I2C/SPI/etc are more or less by definition not frequent or
> performance-critical.)

Then all we need to provide in the generic code is a function which
does not go through the handle_IRQ_event() logic and calls the action
handler directly. Not rocket science to do that and better than using
a facility which is designed to run in hardirq context and expect that
it works in thread context without complaints.

Thanks,

tglx


2009-03-01 22:54:41

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Sunday 01 March 2009, Thomas Gleixner wrote:
> On Sat, 28 Feb 2009, David Brownell wrote:
> > That seems to presume a hardirq-to-taskirq handoff. But the
> > problem case is taskirq-to-taskirq chaining, through e.g.
> > what set_irq_chip_and_handler() provided. (Details not very
> > amenable to brief emails, just UTSL.)
> >
> > Thing is, I'm not sure a per-IRQ thread can work easily with
> > that chaining. The chained IRQs can need to be handled before
> > the top-level IRQ gets re-enabled. That's why the twl4030-irq
> > code uses just one taskirq thread for all incoming events.
>
> This can be solved by a completion as well.

One of many potential solutions; it's probably a better
use case for a counting semaphore though, especially if
they were all going in parallel. And of course there's
the issue of where that synch code lives...

Though I still don't see any real issue with keeping it
simple and serializing them without creating new threads.
In terms of resource consumption, that simple solution is
clearly superior.


> > (Which of course is rarely more than one at a time, so there's
> > little reason not to share that task between the demuxing code
> > and the events being demuxed. Interrupts that need processing
> > via I2C/SPI/etc are more or less by definition not frequent or
> > performance-critical.)
>
> Then all we need to provide in the generic code is a function which
> does not go through the handle_IRQ_event() logic and calls the action
> handler directly.

That is, something to replace handle_simple_irq() and
handle_edge_irq() flow handlers? (irq_desc.handle_irq)


> Not rocket science to do that and better than using
> a facility which is designed to run in hardirq context and expect that
> it works in thread context without complaints.

The main "complaint" is the pre-existing lockdep breakage. :)

The need to call irq_desc.handle_irq() with IRQs disabled is a
bit strange, but not really a problem; and it ensures consistent
locking for the irq_desc statistics and flag updates.

- Dave

2009-03-02 13:16:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Sun, 2009-03-01 at 14:54 -0800, David Brownell wrote:
> The main "complaint" is the pre-existing lockdep breakage. :)

Right, I just send a patch to fix that, by simply fixing !lockdep to do
as lockdep does :-)

IRQF_DISABLED is bonkers, we should simply always disable interrupts for
interrupt handlers.

The only problem you have is your of your own tinkering.


2009-03-02 21:05:40

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Monday 02 March 2009, Peter Zijlstra wrote:
> IRQF_DISABLED is bonkers,

Hmm, after all the work that's been done to get Linux
to the point where *most* drivers run without IRQs
enabled ... that sentiment surprises me.

And I suspect it would surprise most driver developers.


> we should simply always disable interrupts for
> interrupt handlers.

That would be why you have refused to fix the bug
in lockdep, whereby it forcibly enables that flag?

I've been wondering for some months now why you've
left that bug unfixed.

2009-03-02 21:17:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 2009-03-02 at 13:04 -0800, David Brownell wrote:
> On Monday 02 March 2009, Peter Zijlstra wrote:
> > IRQF_DISABLED is bonkers,
>
> Hmm, after all the work that's been done to get Linux
> to the point where *most* drivers run without IRQs
> enabled ... that sentiment surprises me.
>
> And I suspect it would surprise most driver developers.

How so?, its the natural extension of that work.

> > we should simply always disable interrupts for
> > interrupt handlers.
>
> That would be why you have refused to fix the bug
> in lockdep, whereby it forcibly enables that flag?
>
> I've been wondering for some months now why you've
> left that bug unfixed.

Because running irq handlers with irqs enabled it plain silly.

Except it turns out there is some really broken ass hardware out there.
But supposedly IDE PIO could be done from a threaded handler.


2009-03-02 21:31:05

by Andrew Morton

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 02 Mar 2009 22:16:57 +0100
Peter Zijlstra <[email protected]> wrote:

> > > we should simply always disable interrupts for
> > > interrupt handlers.
> >
> > That would be why you have refused to fix the bug
> > in lockdep, whereby it forcibly enables that flag?
> >
> > I've been wondering for some months now why you've
> > left that bug unfixed.
>
> Because running irq handlers with irqs enabled it plain silly.

Do we get to revert irqstacks?

2009-03-02 21:37:37

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Monday 02 March 2009, Peter Zijlstra wrote:
> On Mon, 2009-03-02 at 13:04 -0800, David Brownell wrote:
> > On Monday 02 March 2009, Peter Zijlstra wrote:
> > > IRQF_DISABLED is bonkers,
> >
> > Hmm, after all the work that's been done to get Linux
> > to the point where *most* drivers run without IRQs

Typo: "drivers run *with* IRQs enabled". Not "without".
Should be evident from context.


> > enabled ... that sentiment surprises me.
> >
> > And I suspect it would surprise most driver developers.
>
> How so?, its the natural extension of that work.

Not the work to shrink the amount of time IRQ latencies
by shrinking the amount of time IRQs are disabled by
IRQ handlers.


> > > we should simply always disable interrupts for
> > > interrupt handlers.
> >
> > That would be why you have refused to fix the bug
> > in lockdep, whereby it forcibly enables that flag?
> >
> > I've been wondering for some months now why you've
> > left that bug unfixed.
>
> Because running irq handlers with irqs enabled it plain silly.

Not if you have hardware-prioritized IRQs, which are
fairly common in some environments ... handling an IRQ
for high priority device A needn't interfere with the
handler for lower priority device B, and the system
overall can work better.

Not if you need to shrink IRQ latencies by minimizing
irqs-off critical sections everywhere ... IRQ handlers
being common offenders for keeping IRQs off too long.

Not when IRQs can be disabled selectively around the
real critical sections ... so drivers can leave IRQs
enabled except in those brief sections.


2009-03-02 21:42:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 2009-03-02 at 13:37 -0800, David Brownell wrote:

> > How so?, its the natural extension of that work.
>
> Not the work to shrink the amount of time IRQ latencies
> by shrinking the amount of time IRQs are disabled by
> IRQ handlers.

Ugh, that's done by pushing work out of the hardirq context, not by
doing silly things like enabling irqs from hardirq context.

> > > > we should simply always disable interrupts for
> > > > interrupt handlers.
> > >
> > > That would be why you have refused to fix the bug
> > > in lockdep, whereby it forcibly enables that flag?
> > >
> > > I've been wondering for some months now why you've
> > > left that bug unfixed.
> >
> > Because running irq handlers with irqs enabled it plain silly.
>
> Not if you have hardware-prioritized IRQs, which are
> fairly common in some environments ... handling an IRQ
> for high priority device A needn't interfere with the
> handler for lower priority device B, and the system
> overall can work better.
>
> Not if you need to shrink IRQ latencies by minimizing
> irqs-off critical sections everywhere ... IRQ handlers
> being common offenders for keeping IRQs off too long.
>
> Not when IRQs can be disabled selectively around the
> real critical sections ... so drivers can leave IRQs
> enabled except in those brief sections.

Yeah, and who gets to debug the stack overflow?

Hardware with irq prio levels can do a prio mask and use a stack per
prio level.

2009-03-02 22:09:34

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Monday 02 March 2009, Peter Zijlstra wrote:
> > > How so?, its the natural extension of that work.
> >
> > Not the work to shrink the amount of time IRQ latencies
> > by shrinking the amount of time IRQs are disabled by
> > IRQ handlers.
>
> Ugh, that's done by pushing work out of the hardirq context,

That's one of many techniques currently used.

Tradeoffs don't always favor larger driver updates
and re-validation though. Sometimes it's simpler
to just leverage the reality that "hardirq context"
does not require using IRQF_DISABLED.


> not by doing silly things like enabling irqs from hardirq context.

Somehow I'm certain you have NOT analysed every one of the
thousands of IRQ handlers in various Linux drivers to know
with certainty that's not the reason IRQ_DISABLED is cleared.

There are also *other* reasons to leave IRQ_DISABLED clear.

- Dave

2009-03-02 22:20:18

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 2009-03-02 at 14:09 -0800, David Brownell wrote:
> On Monday 02 March 2009, Peter Zijlstra wrote:
> > > > How so?, its the natural extension of that work.
> > >
> > > Not the work to shrink the amount of time IRQ latencies
> > > by shrinking the amount of time IRQs are disabled by
> > > IRQ handlers.
> >
> > Ugh, that's done by pushing work out of the hardirq context,
>
> That's one of many techniques currently used.
>
> Tradeoffs don't always favor larger driver updates
> and re-validation though. Sometimes it's simpler
> to just leverage the reality that "hardirq context"
> does not require using IRQF_DISABLED.

Simpler doesn't mean its a good idea, it opens the door to stack
overruns for one.

Its very unfortunate that people think its a good idea.

> > not by doing silly things like enabling irqs from hardirq context.
>
> Somehow I'm certain you have NOT analysed every one of the
> thousands of IRQ handlers in various Linux drivers to know
> with certainty that's not the reason IRQ_DISABLED is cleared.
>
> There are also *other* reasons to leave IRQ_DISABLED clear.

There might be reasons to do so, that doesn't mean its a good idea and
the desired thing to do.

I state that every !IRQF_DISABLED usage is a bug, either due to broken
hardware or broken drivers.

2009-03-02 22:41:18

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Monday 02 March 2009, Peter Zijlstra wrote:
> I state that every !IRQF_DISABLED usage is a bug, either
> due to broken hardware or broken drivers.

That's a novel position. You do realize that removing that
capability breaks drivers?

But if that's what is keeping you from fixing the lockdep bug,
why haven't you submitted patches to remove IRQF_DISABLED from
the kernel, and update all the drivers relying on IRQs being
enabled when their handlers run?

At least that would get the appropriate level of discussion.

- Dave

2009-03-02 22:47:19

by David Miller

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs

From: Peter Zijlstra <[email protected]>
Date: Mon, 02 Mar 2009 23:19:31 +0100

> I state that every !IRQF_DISABLED usage is a bug, either due to broken
> hardware or broken drivers.

We'll send you the bill to have everyone's hardware
replaced :-)

2009-03-02 22:51:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 2009-03-02 at 14:40 -0800, David Brownell wrote:
> On Monday 02 March 2009, Peter Zijlstra wrote:
> > I state that every !IRQF_DISABLED usage is a bug, either
> > due to broken hardware or broken drivers.
>
> That's a novel position. You do realize that removing that
> capability breaks drivers?

Then we fix them.

> But if that's what is keeping you from fixing the lockdep bug,
> why haven't you submitted patches to remove IRQF_DISABLED from
> the kernel, and update all the drivers relying on IRQs being
> enabled when their handlers run?

I did so today. Just didn't realize things actually relied on it since
lockdep turned them off and my system has been working fine.

Your driver needs threaded interrupts, Thomas is working on that now,
and I saw a conversion of your driver to use that.

IDE PIO can hopefully also be converted to threaded interrupts.

After that I'll post patches to remove IRQF_DISABLED and provide a
another flag to quick-'fix' other iffy drivers.

Once such drivers are found we can work on proper fixes for them.

2009-03-02 23:02:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs

On Mon, 2009-03-02 at 14:46 -0800, David Miller wrote:
> From: Peter Zijlstra <[email protected]>
> Date: Mon, 02 Mar 2009 23:19:31 +0100
>
> > I state that every !IRQF_DISABLED usage is a bug, either due to broken
> > hardware or broken drivers.
>
> We'll send you the bill to have everyone's hardware
> replaced :-)

I'm not saying to remove support for such stuff, as long as we clearly
annotate that its due to broken ass hardware we can leave a IRQF_ENABLED
thingy in there.

Preferably such drivers would be converted to threaded interrupts, but I
thought Alan mentioned an IDE chipset that was so broken even that would
be impossible (could not mask the IRQ for it would corrupt stuff).

The thing I am strongly opposing though, is keeping interrupts enabled
for regular drivers on sane hardware.

2009-03-02 23:07:30

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs

On Mon, 2009-03-02 at 14:57 -0800, Andrew Morton wrote:
> On Mon, 02 Mar 2009 14:46:47 -0800 (PST)
> David Miller <[email protected]> wrote:
>
> > From: Peter Zijlstra <[email protected]>
> > Date: Mon, 02 Mar 2009 23:19:31 +0100
> >
> > > I state that every !IRQF_DISABLED usage is a bug, either due to broken
> > > hardware or broken drivers.
> >
> > We'll send you the bill to have everyone's hardware
> > replaced :-)
>
> yes, but with what?
>
> No matter how fast all our interrupt handlers are, running them with
> local interrupts disabled has to worsen the worst-case interrupt
> latency.
>
> I don't see how removing !IRQF_DISABLED improves the kernel - in fact
> there's a latency argument for making !IRQF_DISABLED the default.

On preempt-rt all we do in the hardirq path is mask the interrupt line
and wake up a thread. That's the extreme end of low latency interrupts.

Arguably there is a middle way that works for !-rt.

However, striving to enable interrupts in all interrupt handlers is
asking for stack overruns. Interrupt nesting just isn't really helpful.

2009-03-02 23:29:22

by David Brownell

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Monday 02 March 2009, Peter Zijlstra wrote:
> > But if that's what is keeping you from fixing the lockdep bug,
> > why haven't you submitted patches to remove IRQF_DISABLED from
> > the kernel, and update all the drivers relying on IRQs being
> > enabled when their handlers run?
>
> I did so today. Just didn't realize things actually relied on it since
> lockdep turned them off and my system has been working fine.

That patch did no such thing. It added a BUG_ON(),
which has nothing to do with removing IRQF_DISABLED.


> Your driver needs threaded interrupts, Thomas is working on that now,
> and I saw a conversion of your driver to use that.

Thomas hasn't yet touched the issue of how to chain such IRQs
though ... I consider his v2 patches a decent start, with some
limitations that could be attributed to an x86 focus.


> IDE PIO can hopefully also be converted to threaded interrupts.

I have worked with ARMs with IDE support. That's become
rare in new chips though, even for CF cards; it needs too
many signal wires.

- Dave

2009-03-02 23:35:29

by Alan

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

> Because running irq handlers with irqs enabled it plain silly.

It makes enormous sense to run with the ability to mask *just* the IRQ
that occurred and not others so that long lasting interrupts.

There is another reason not to do this of course - the real time work
makes it essentially irrelevant by doing the job right and reducing the
entire problem space to software priority management.

> Except it turns out there is some really broken ass hardware out there.
> But supposedly IDE PIO could be done from a threaded handler.

Only if you support reliable masking of a single IRQ *after* the IRQ
handler returns. The real time patches can do that for some chipsets but
on the PC at least you are into the realms of deepest darkest APIC
weirdness and that requires Ingo's pointy hat and wand..

Alan

2009-03-02 23:59:41

by Andrew Morton

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs

On Mon, 02 Mar 2009 14:46:47 -0800 (PST)
David Miller <[email protected]> wrote:

> From: Peter Zijlstra <[email protected]>
> Date: Mon, 02 Mar 2009 23:19:31 +0100
>
> > I state that every !IRQF_DISABLED usage is a bug, either due to broken
> > hardware or broken drivers.
>
> We'll send you the bill to have everyone's hardware
> replaced :-)

yes, but with what?

No matter how fast all our interrupt handlers are, running them with
local interrupts disabled has to worsen the worst-case interrupt
latency.

I don't see how removing !IRQF_DISABLED improves the kernel - in fact
there's a latency argument for making !IRQF_DISABLED the default.

2009-03-03 07:46:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep and threaded IRQs (was: ...)

On Mon, 2009-03-02 at 15:29 -0800, David Brownell wrote:

> That patch did no such thing. It added a BUG_ON(),
> which has nothing to do with removing IRQF_DISABLED.

http://lkml.org/lkml/2009/3/2/33