2007-09-02 13:11:49

by Christian Kujau

Subject: 2.6.23-rc5: possible irq lock inversion dependency detected

Hi,

after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep
was quite noisy when I tried to shape my external (wireless) interface:

[ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
[ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>] netif_receive_skb+0x2d5/0x3c0
[ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the past:
[ 6400.535145] (police_lock){-.--}

This happened when I executed: http://nerdbynature.de/bits/2.6.23-rc5/qos.sh.txt
(using iproute2-ss070313). The machine is still running, I just noticed a
short hiccup, probably when it was busy writing the warning to the disk.

More details and .config: http://nerdbynature.de/bits/2.6.23-rc5/

I'm not really sure what the application mentioned in the message above
has to do with this: the application[1] has been running since bootup as
a non-privileged user and did so for earlier kernel versions too.

Christian.

[0] http://lkml.org/lkml/2007/9/2/6
[1] http://folding.stanford.edu/linux.html
--
BOFH excuse #294:

PCMCIA slave driver


2007-09-10 12:05:11

by Peter Zijlstra

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Sun, 2007-09-02 at 15:11 +0200, Christian Kujau wrote:
> Hi,
>
> after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep
> was quite noisy when I tried to shape my external (wireless) interface:
>
> [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
> [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>] netif_receive_skb+0x2d5/0x3c0
> [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the past:
> [ 6400.535145] (police_lock){-.--}
>
> This happened when I executed: http://nerdbynature.de/bits/2.6.23-rc5/qos.sh.txt
> (using iproute2-ss070313). The machine is still running, I just noticed a
> short hiccup, probably when it was busy writing the warning to the disk.
>
> More details and .config: http://nerdbynature.de/bits/2.6.23-rc5/

That URL seems to be unavailable at this time; please post the whole
lockdep report if possible.

2007-09-10 13:00:55

by Herbert Xu

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote:
>
> after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep
> was quite noisy when I tried to shape my external (wireless) interface:
>
> [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
> [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>] netif_receive_skb+0x2d5/0x3c0
> [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the past:
> [ 6400.535145] (police_lock){-.--}

This is a genuine deadlock. The police lock can be taken
for reading with softirqs enabled. If a second CPU then tries
to take the police lock for writing while holding the ingress
lock, it spins until the reader is done. A softirq that
interrupts the reader on the first CPU then deadlocks when it
tries to get the ingress lock.

The minimal fix would be to make sure that we disable BH on
the first CPU. Jamal, could you take a look at this please?

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-09-11 00:04:57

by jamal

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Mon, 2007-09-10 at 21:00 +0800, Herbert Xu wrote:


> The minimal fix would be to make sure that we disable BH on
> the first CPU.

Disabling BH would make it more symmetric with the way we handle
egress. I couldn't reproduce the issue, but this should hopefully
resolve it.
Christian, can you test with this patch?

cheers,
jamal





Attachments:
ing1 (549.00 B)

2007-09-11 02:18:35

by Herbert Xu

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Mon, Sep 10, 2007 at 08:04:41PM -0400, jamal wrote:
>
> Disabling BH would make it more symmetric with the way we handle
> egress. I couldn't reproduce the issue, but this should hopefully
> resolve it.
> Christian, can you test with this patch?

Jamal, it's the police_lock that we need to make _bh. The
ingress_lock is already _bh because of the spin_lock_bh that
directly precedes it.

Oh and I think the same thing applies for the other actions
too.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-09-11 12:02:06

by jamal

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Tue, 2007-09-11 at 10:18 +0800, Herbert Xu wrote:

> Jamal, it's the police_lock that we need to make _bh. The
> ingress_lock is already _bh because of the spin_lock_bh that
> directly precedes it.
>
> Oh and I think the same thing applies for the other actions
> too.

ga-Dang. Ok, here it is. If you see(?) any more farts let me know.
I am around for another 30 minutes and off for about 18 hours.

Christian, I took your config and qos setup but I can't reproduce the
issue. I think I may need some of that wireless setup to recreate it. So
if you can test this and validate that it works, we can push it forward.

cheers,
jamal


Attachments:
act_bhl (2.15 kB)

2007-09-11 12:44:00

by Herbert Xu

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

On Tue, Sep 11, 2007 at 08:01:46AM -0400, jamal wrote:
>
> [NET_SCHED] protect action config/dump from irqs

Looks good! Thanks Jamal.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-09-12 14:34:17

by David Miller

Subject: Re: 2.6.23-rc5: possible irq lock inversion dependency detected

From: Herbert Xu <[email protected]>
Date: Tue, 11 Sep 2007 20:43:27 +0800

> On Tue, Sep 11, 2007 at 08:01:46AM -0400, jamal wrote:
> >
> > [NET_SCHED] protect action config/dump from irqs
>
> Looks good! Thanks Jamal.

Applied. I'll try to push this in soon.