2008-10-27 21:39:46

by David Brownell

[permalink] [raw]
Subject: [patch 2.6.28-rc2] at91_mci: workaround lockdep

From: David Brownell <[email protected]>

Lockdep reported a problem in the at91_mci driver ... in this case, the
issue is with lockdep, not with the driver. A trimmed stack dump, from
trying to boot with root on MMC, shows:

WARNING: at kernel/lockdep.c:2195 trace_hardirqs_on_caller+0xf4/0x170()
Modules linked in:
[<c005bc98>] (trace_hardirqs_on+0x0/0x18) from [<c0213bf4>] (_spin_unlock_irq+0x2c/0x3c)
[<c0213bc8>] (_spin_unlock_irq+0x0/0x3c) from [<c0029a88>] (flush_dcache_page+0x114/0x144)
[<c0029974>] (flush_dcache_page+0x0/0x144) from [<c019b034>] (at91_mci_irq+0x150/0x414)
[<c019aee4>] (at91_mci_irq+0x0/0x414) from [<c0066c5c>] (handle_IRQ_event+0x2c/0x6c)
[<c0066c30>] (handle_IRQ_event+0x0/0x6c) from [<c0068a60>] (handle_level_irq+0x108/0x124)
[<c0068958>] (handle_level_irq+0x0/0x124) from [<c0022064>] (__exception_text_start+0x64/0x90)

When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
above -- it re-enables IRQs ... since that evidently may only be called with
IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
to be disabled. Except ... that lockdep went and disabled them, then went on
to complains about the breakage *it* caused!

Workaround: depend on LOCKDEP=n ... and for paranoia, disable IRQF_SHARED
for this interrupt. (At the hardware level, this is dedicated to MCI, so
there's never a need for multiple handlers.)

Signed-off-by: David Brownell <[email protected]>
---
An alternate fix might be to have ARM dcache mmap locking save and restore
the IRQ flags ... but there are several such "can't run with IRQs disabled"
restrictions scattered through that code, with no rationales commented. Or
stop lockdep from mucking with IRQ flags; probably not any time soon!

So for now, this seems to be the best "solution" available...

drivers/mmc/host/Kconfig | 1 +
drivers/mmc/host/at91_mci.c | 3 +--
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -99,6 +99,7 @@ config MMC_AU1X
config MMC_AT91
tristate "AT91 SD/MMC Card Interface support"
depends on ARCH_AT91
+ depends on LOCKDEP=n
help
This selects the AT91 MCI controller.

--- a/drivers/mmc/host/at91_mci.c
+++ b/drivers/mmc/host/at91_mci.c
@@ -1081,8 +1081,7 @@ static int __init at91_mci_probe(struct
* Allocate the MCI interrupt
*/
host->irq = platform_get_irq(pdev, 0);
- ret = request_irq(host->irq, at91_mci_irq, IRQF_SHARED,
- mmc_hostname(mmc), host);
+ ret = request_irq(host->irq, at91_mci_irq, 0, mmc_hostname(mmc), host);
if (ret) {
dev_dbg(&pdev->dev, "request MCI interrupt failed\n");
goto fail0;


2008-10-28 17:04:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Mon, 2008-10-27 at 14:26 -0700, David Brownell wrote:
> From: David Brownell <[email protected]>
>
> Lockdep reported a problem in the at91_mci driver ... in this case, the
> issue is with lockdep, not with the driver. A trimmed stack dump, from
> trying to boot with root on MMC, shows:
>
> WARNING: at kernel/lockdep.c:2195 trace_hardirqs_on_caller+0xf4/0x170()
> Modules linked in:
> [<c005bc98>] (trace_hardirqs_on+0x0/0x18) from [<c0213bf4>] (_spin_unlock_irq+0x2c/0x3c)
> [<c0213bc8>] (_spin_unlock_irq+0x0/0x3c) from [<c0029a88>] (flush_dcache_page+0x114/0x144)
> [<c0029974>] (flush_dcache_page+0x0/0x144) from [<c019b034>] (at91_mci_irq+0x150/0x414)
> [<c019aee4>] (at91_mci_irq+0x0/0x414) from [<c0066c5c>] (handle_IRQ_event+0x2c/0x6c)
> [<c0066c30>] (handle_IRQ_event+0x0/0x6c) from [<c0068a60>] (handle_level_irq+0x108/0x124)
> [<c0068958>] (handle_level_irq+0x0/0x124) from [<c0022064>] (__exception_text_start+0x64/0x90)
>
> When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
> above -- it re-enables IRQs ... since that evidently may only be called with
> IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
> to be disabled. Except ... that lockdep went and disabled them, then went on
> to complains about the breakage *it* caused!
>
> Workaround: depend on LOCKDEP=n ... and for paranoia, disable IRQF_SHARED
> for this interrupt. (At the hardware level, this is dedicated to MCI, so
> there's never a need for multiple handlers.)

In all previous such cases it was deemed the IRQ handler should deal
with whatever it gets.

2008-10-28 17:31:29

by David Brownell

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Tuesday 28 October 2008, Peter Zijlstra wrote:
> On Mon, 2008-10-27 at 14:26 -0700, David Brownell wrote:
> > From: David Brownell <[email protected]>
> >
> > Lockdep reported a problem in the at91_mci driver ... in this case, the
> > issue is with lockdep, not with the driver. ...
> >
> > When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
> > above -- it re-enables IRQs ... since that evidently may only be called with
> > IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
> > to be disabled. Except ... that lockdep went and disabled them, then went on
> > to complains about the breakage *it* caused!
> >
> > Workaround: depend on LOCKDEP=n ...
>
> In all previous such cases it was deemed the IRQ handler should deal
> with whatever it gets.

In which case I'll wait until someone changes that IRQ handler (or that
ARM MM utility, or lockdep), and give up using AT91 platforms for sanity
testing kernel changes; lockdep is important, when it doesn't lie.

I do think that lockdep should warn when that it's ignoring such driver
requests, however. I seem to have been tripping over it a lot lately,
and knowing that IRQ handlers were using strange modes would have saved
a bunch of time from being wasted.

Threaded IRQ handlers are going to need to rely even more on running
with IRQs enabled ... not to mention needing to sleep. So it's clear
to me that there *are* lockdep issues yet to be adressed here.

- Dave

2008-10-28 17:37:30

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Tue, 2008-10-28 at 10:22 -0700, David Brownell wrote:
> On Tuesday 28 October 2008, Peter Zijlstra wrote:
> > On Mon, 2008-10-27 at 14:26 -0700, David Brownell wrote:
> > > From: David Brownell <[email protected]>
> > >
> > > Lockdep reported a problem in the at91_mci driver ... in this case, the
> > > issue is with lockdep, not with the driver. ...
> > >
> > > When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
> > > above -- it re-enables IRQs ... since that evidently may only be called with
> > > IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
> > > to be disabled. Except ... that lockdep went and disabled them, then went on
> > > to complains about the breakage *it* caused!
> > >
> > > Workaround: depend on LOCKDEP=n ...
> >
> > In all previous such cases it was deemed the IRQ handler should deal
> > with whatever it gets.
>
> In which case I'll wait until someone changes that IRQ handler (or that
> ARM MM utility, or lockdep), and give up using AT91 platforms for sanity
> testing kernel changes; lockdep is important, when it doesn't lie.
>
> I do think that lockdep should warn when that it's ignoring such driver
> requests, however. I seem to have been tripping over it a lot lately,
> and knowing that IRQ handlers were using strange modes would have saved
> a bunch of time from being wasted.
>
> Threaded IRQ handlers are going to need to rely even more on running
> with IRQs enabled ... not to mention needing to sleep. So it's clear
> to me that there *are* lockdep issues yet to be adressed here.

Sure, care so send a patch fixing those? :-)

2008-10-28 20:13:04

by David Brownell

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Tuesday 28 October 2008, Peter Zijlstra wrote:
> > > > Workaround: depend on LOCKDEP=n ...
> > >
> > > In all previous such cases it was deemed the IRQ handler should deal
> > > with whatever it gets.
> >
> > In which case I'll wait until someone changes that IRQ handler (or that
> > ARM MM utility, or lockdep), and give up using AT91 platforms for sanity
> > testing kernel changes; lockdep is important, when it doesn't lie.
> >
> > I do think that lockdep should warn when that it's ignoring such driver
> > requests, however. I seem to have been tripping over it a lot lately,
> > and knowing that IRQ handlers were using strange modes would have saved
> > a bunch of time from being wasted.
> >
> > Threaded IRQ handlers are going to need to rely even more on running
> > with IRQs enabled ... not to mention needing to sleep. So it's clear
> > to me that there *are* lockdep issues yet to be addressed here.
>
> Sure, care so send a patch fixing those? :-)

Here's one for the warning; that's the only one straightforward enough
to justify detouring from Real Work. Plus, the IRQ threading patches
aren't that near a merge queue yet. ;)

- Dave


========================== CUT HERE
From: David Brownell <[email protected]>

When lockdep turns on IRQF_DISABLED, emit a warning to make it
easier to track down problems this introduces in drivers that
expect handlers to run with IRQs enabled.

Signed-off-by: David Brownell <[email protected]>
---
kernel/irq/manage.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -681,7 +681,11 @@ int request_irq(unsigned int irq, irq_ha
/*
* Lockdep wants atomic interrupt handlers:
*/
- irqflags |= IRQF_DISABLED;
+ if (!(irqflags & IRQF_DISABLED)) {
+ pr_warning("IRQ %d/%s: lockdep sets IRQF_DISABLED\n",
+ irq, devname);
+ irqflags |= IRQF_DISABLED;
+ }
#endif
/*
* Sanity-check: shared interrupts must pass in a real dev-ID,

2008-10-29 07:20:24

by David Brownell

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Tuesday 28 October 2008, David Brownell wrote:
>
> > > Workaround: depend on LOCKDEP=n ...

Note that LOCKDEP causes the BUG() on line 1159 of drivers/mmc/host/omap.c
to fire too -- exactly the same root cause: IRQ handler getting called
in a goofy context, because of lockdep.

That's the older OMAP1/OMAP2 driver ... the new highspeed MMC controller
in OMAP3 (handling 8-bit parallel I/O etc, nyet in mainline) doesn't seem
to have the same issue(s).

- Dave


2008-11-03 13:48:23

by Nicolas Ferre

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

David Brownell :
> On Tuesday 28 October 2008, Peter Zijlstra wrote:
>> On Mon, 2008-10-27 at 14:26 -0700, David Brownell wrote:
>>> From: David Brownell <[email protected]>
>>>
>>> Lockdep reported a problem in the at91_mci driver ... in this case, the
>>> issue is with lockdep, not with the driver. ...
>>>
>>> When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
>>> above -- it re-enables IRQs ... since that evidently may only be called with
>>> IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
>>> to be disabled. Except ... that lockdep went and disabled them, then went on
>>> to complains about the breakage *it* caused!
>>>
>>> Workaround: depend on LOCKDEP=n ...
>> In all previous such cases it was deemed the IRQ handler should deal
>> with whatever it gets.
>
> In which case I'll wait until someone changes that IRQ handler (or that
> ARM MM utility, or lockdep), and give up using AT91 platforms for sanity
> testing kernel changes; lockdep is important, when it doesn't lie.

Changing IRQ handler in this driver... seem to be a big work.

Well, Dave, I tend to acknowledge your patch above as the IRQ for MCI is
indeed a dedicated line (no need for IRQF_SHARED).

Are you ok with this ?

Regards,
--
Nicolas Ferre

2008-11-17 09:29:31

by Nicolas Ferre

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

David Brownell :
> From: David Brownell <[email protected]>
>
> Lockdep reported a problem in the at91_mci driver ... in this case, the
> issue is with lockdep, not with the driver. A trimmed stack dump, from
> trying to boot with root on MMC, shows:
>
> WARNING: at kernel/lockdep.c:2195 trace_hardirqs_on_caller+0xf4/0x170()
> Modules linked in:
> [<c005bc98>] (trace_hardirqs_on+0x0/0x18) from [<c0213bf4>] (_spin_unlock_irq+0x2c/0x3c)
> [<c0213bc8>] (_spin_unlock_irq+0x0/0x3c) from [<c0029a88>] (flush_dcache_page+0x114/0x144)
> [<c0029974>] (flush_dcache_page+0x0/0x144) from [<c019b034>] (at91_mci_irq+0x150/0x414)
> [<c019aee4>] (at91_mci_irq+0x0/0x414) from [<c0066c5c>] (handle_IRQ_event+0x2c/0x6c)
> [<c0066c30>] (handle_IRQ_event+0x0/0x6c) from [<c0068a60>] (handle_level_irq+0x108/0x124)
> [<c0068958>] (handle_level_irq+0x0/0x124) from [<c0022064>] (__exception_text_start+0x64/0x90)
>
> When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
> above -- it re-enables IRQs ... since that evidently may only be called with
> IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
> to be disabled. Except ... that lockdep went and disabled them, then went on
> to complains about the breakage *it* caused!
>
> Workaround: depend on LOCKDEP=n ... and for paranoia, disable IRQF_SHARED
> for this interrupt. (At the hardware level, this is dedicated to MCI, so
> there's never a need for multiple handlers.)
>
> Signed-off-by: David Brownell <[email protected]>

Acked-by: Nicolas Ferre <[email protected]>

> ---
> An alternate fix might be to have ARM dcache mmap locking save and restore
> the IRQ flags ... but there are several such "can't run with IRQs disabled"
> restrictions scattered through that code, with no rationales commented. Or
> stop lockdep from mucking with IRQ flags; probably not any time soon!
>
> So for now, this seems to be the best "solution" available...
>
> drivers/mmc/host/Kconfig | 1 +
> drivers/mmc/host/at91_mci.c | 3 +--
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -99,6 +99,7 @@ config MMC_AU1X
> config MMC_AT91
> tristate "AT91 SD/MMC Card Interface support"
> depends on ARCH_AT91
> + depends on LOCKDEP=n
> help
> This selects the AT91 MCI controller.
>
> --- a/drivers/mmc/host/at91_mci.c
> +++ b/drivers/mmc/host/at91_mci.c
> @@ -1081,8 +1081,7 @@ static int __init at91_mci_probe(struct
> * Allocate the MCI interrupt
> */
> host->irq = platform_get_irq(pdev, 0);
> - ret = request_irq(host->irq, at91_mci_irq, IRQF_SHARED,
> - mmc_hostname(mmc), host);
> + ret = request_irq(host->irq, at91_mci_irq, 0, mmc_hostname(mmc), host);
> if (ret) {
> dev_dbg(&pdev->dev, "request MCI interrupt failed\n");
> goto fail0;
>

Thanks, regards,
--
Nicolas Ferre

2008-11-19 18:45:29

by Pierre Ossman

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Mon, 17 Nov 2008 10:28:46 +0100
Nicolas Ferre <[email protected]> wrote:

>
> Acked-by: Nicolas Ferre <[email protected]>
>

Any side-effects besides from dmesg noise? IOW should I push for .28?

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

2008-11-20 15:24:52

by Nicolas Ferre

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

Pierre Ossman :
> On Mon, 17 Nov 2008 10:28:46 +0100
> Nicolas Ferre <[email protected]> wrote:
>
>> Acked-by: Nicolas Ferre <[email protected]>
>>
>
> Any side-effects besides from dmesg noise? IOW should I push for .28?

No side-effect. I advice you to push it for .28-final.

Kind regards,
--
Nicolas Ferre

2008-11-20 15:43:18

by Pierre Ossman

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Thu, 20 Nov 2008 16:02:43 +0100
Nicolas Ferre <[email protected]> wrote:

> Pierre Ossman :
> > On Mon, 17 Nov 2008 10:28:46 +0100
> > Nicolas Ferre <[email protected]> wrote:
> >
> >> Acked-by: Nicolas Ferre <[email protected]>
> >>
> >
> > Any side-effects besides from dmesg noise? IOW should I push for .28?
>
> No side-effect. I advice you to push it for .28-final.
>

Well, considering Linus' recent clarification I was thinking more along
the lines that if the only thing it "solves" is some dmesg noise, then
it can wait for .29. :)

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

2008-11-23 14:27:11

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [patch 2.6.28-rc2] at91_mci: workaround lockdep

On Mon, 2008-11-17 at 10:28 +0100, Nicolas Ferre wrote:
> David Brownell :
> > From: David Brownell <[email protected]>
> >
> > Lockdep reported a problem in the at91_mci driver ... in this case, the
> > issue is with lockdep, not with the driver. A trimmed stack dump, from
> > trying to boot with root on MMC, shows:
> >
> > WARNING: at kernel/lockdep.c:2195 trace_hardirqs_on_caller+0xf4/0x170()
> > Modules linked in:
> > [<c005bc98>] (trace_hardirqs_on+0x0/0x18) from [<c0213bf4>] (_spin_unlock_irq+0x2c/0x3c)
> > [<c0213bc8>] (_spin_unlock_irq+0x0/0x3c) from [<c0029a88>] (flush_dcache_page+0x114/0x144)
> > [<c0029974>] (flush_dcache_page+0x0/0x144) from [<c019b034>] (at91_mci_irq+0x150/0x414)
> > [<c019aee4>] (at91_mci_irq+0x0/0x414) from [<c0066c5c>] (handle_IRQ_event+0x2c/0x6c)
> > [<c0066c30>] (handle_IRQ_event+0x0/0x6c) from [<c0068a60>] (handle_level_irq+0x108/0x124)
> > [<c0068958>] (handle_level_irq+0x0/0x124) from [<c0022064>] (__exception_text_start+0x64/0x90)
> >
> > When __flush_dcache_aliases() returns -- inlined into flush_dcache_page(),
> > above -- it re-enables IRQs ... since that evidently may only be called with
> > IRQs enabled. That's OK since the (unshared) IRQ handler doesn't ask for IRQs
> > to be disabled. Except ... that lockdep went and disabled them, then went on
> > to complains about the breakage *it* caused!
> >
> > Workaround: depend on LOCKDEP=n ... and for paranoia, disable IRQF_SHARED
> > for this interrupt. (At the hardware level, this is dedicated to MCI, so
> > there's never a need for multiple handlers.)
> >
> > Signed-off-by: David Brownell <[email protected]>
>
> Acked-by: Nicolas Ferre <[email protected]>

Nacked-by: Peter Zijlstra <[email protected]>

>
> > ---
> > An alternate fix might be to have ARM dcache mmap locking save and restore
> > the IRQ flags ... but there are several such "can't run with IRQs disabled"
> > restrictions scattered through that code, with no rationales commented. Or
> > stop lockdep from mucking with IRQ flags; probably not any time soon!
> >
> > So for now, this seems to be the best "solution" available...
> >
> > drivers/mmc/host/Kconfig | 1 +
> > drivers/mmc/host/at91_mci.c | 3 +--
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > --- a/drivers/mmc/host/Kconfig
> > +++ b/drivers/mmc/host/Kconfig
> > @@ -99,6 +99,7 @@ config MMC_AU1X
> > config MMC_AT91
> > tristate "AT91 SD/MMC Card Interface support"
> > depends on ARCH_AT91
> > + depends on LOCKDEP=n
> > help
> > This selects the AT91 MCI controller.
> >
> > --- a/drivers/mmc/host/at91_mci.c
> > +++ b/drivers/mmc/host/at91_mci.c
> > @@ -1081,8 +1081,7 @@ static int __init at91_mci_probe(struct
> > * Allocate the MCI interrupt
> > */
> > host->irq = platform_get_irq(pdev, 0);
> > - ret = request_irq(host->irq, at91_mci_irq, IRQF_SHARED,
> > - mmc_hostname(mmc), host);
> > + ret = request_irq(host->irq, at91_mci_irq, 0, mmc_hostname(mmc), host);
> > if (ret) {
> > dev_dbg(&pdev->dev, "request MCI interrupt failed\n");
> > goto fail0;
> >
>
> Thanks, regards,