2008-02-24 12:11:15

by Jan Engelhardt

[permalink] [raw]
Subject: lockdep warnings in ipv6

Hi,


when doing IPv6 (ping6, ssh otherhost, etc.), lockdep spews a warning in
2.6.25-rc2 on the target. CONFIG_..._FRAME_POINTER is off,

CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
# CONFIG_FRAME_POINTER is not set

so I am not sure if the stack trace is worth something, is it?

[ 449.168320] =======================================================
[ 449.168637] [ INFO: possible circular locking dependency detected ]
[ 449.168851] 2.6.25-rc2 #67
[ 449.168951] -------------------------------------------------------
[ 449.169078] swapper/0 is trying to acquire lock:
[ 449.169315] (&tbl->lock){-+-+}, at: [<c023d5a3>]
neigh_lookup+0x43/0xb0
[ 449.170147]
[ 449.170150] but task is already holding lock:
[ 449.170309] (&n->lock){-+-+}, at: [<c023f975>]
neigh_timer_handler+0x15/0x2a0
[ 449.170512]
[ 449.170514] which lock already depends on the new lock.
[ 449.170516]
[ 449.170736]
[ 449.170738] the existing dependency chain (in reverse order) is:
[ 449.170926]
[ 449.170927] -> #1 (&n->lock){-+-+}:
[ 449.171188] [<c0132266>] add_lock_to_list+0x46/0xc0
[ 449.171509] [<c0134c48>] __lock_acquire+0xb48/0xf90
[ 449.171678] [<c023dcda>] neigh_periodic_timer+0x7a/0x170
[ 449.171849] [<c01350ef>] lock_acquire+0x5f/0x80
[ 449.172007] [<c023dcda>] neigh_periodic_timer+0x7a/0x170
[ 449.172173] [<c029b239>] _write_lock+0x29/0x40
[ 449.172342] [<c023dcda>] neigh_periodic_timer+0x7a/0x170
[ 449.172508] [<c023dcda>] neigh_periodic_timer+0x7a/0x170
[ 449.172670] [<c011dd5f>] run_timer_softirq+0x15f/0x1c0
[ 449.172984] [<c023dc60>] neigh_periodic_timer+0x0/0x170
[ 449.173180] [<c023dc60>] neigh_periodic_timer+0x0/0x170
[ 449.173349] [<c011a182>] __do_softirq+0x52/0xb0
[ 449.173516] [<c011a225>] do_softirq+0x45/0x50
[ 449.173675] [<c011a665>] irq_exit+0x55/0x70
[ 449.173831] [<c0104e83>] do_IRQ+0x43/0x90
[ 449.173987] [<c0103214>] common_interrupt+0x24/0x40
[ 449.174147] [<c010321e>] common_interrupt+0x2e/0x40
[ 449.174308] [<ffffffff>] 0xffffffff
[ 449.174767]
[ 449.174769] -> #0 (&tbl->lock){-+-+}:
[ 449.174939] [<c01324f0>] print_circular_bug_entry+0x40/0x50
[ 449.175108] [<c0134a71>] __lock_acquire+0x971/0xf90
[ 449.175268] [<c013425e>] __lock_acquire+0x15e/0xf90
[ 449.175468] [<c0133c16>] trace_hardirqs_on+0x66/0x110
[ 449.175633] [<c01350ef>] lock_acquire+0x5f/0x80
[ 449.175793] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.175973] [<c029b2fe>] _read_lock_bh+0x2e/0x40
[ 449.176139] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.176303] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.177821] [<c8a74a8c>] ndisc_dst_alloc+0x13c/0x1a0 [ipv6]
[ 449.177821] [<c8a74950>] ndisc_dst_alloc+0x0/0x1a0 [ipv6]
[ 449.177821] [<c8a78575>] __ndisc_send+0x95/0x4e0 [ipv6]
[ 449.177821] [<c8a67410>] ip6_output+0x0/0xb00 [ipv6]
[ 449.177821] [<c8a79467>] ndisc_send_ns+0x67/0xa0 [ipv6]
[ 449.177821] [<c8a6eeb0>] ipv6_chk_addr+0xa0/0xb0 [ipv6]
[ 449.177821] [<c8a7a4ef>] ndisc_solicit+0x9f/0x1b0 [ipv6]
[ 449.177821] [<c0133af8>] mark_held_locks+0x38/0x70
[ 449.177821] [<c029b665>] _spin_unlock_irqrestore+0x45/0x60
[ 449.177821] [<c0133c16>] trace_hardirqs_on+0x66/0x110
[ 449.177821] [<c023fab5>] neigh_timer_handler+0x155/0x2a0
[ 449.177821] [<c011dd5f>] run_timer_softirq+0x15f/0x1c0
[ 449.177821] [<c023f960>] neigh_timer_handler+0x0/0x2a0
[ 449.177821] [<c023f960>] neigh_timer_handler+0x0/0x2a0
[ 449.177821] [<c011a182>] __do_softirq+0x52/0xb0
[ 449.177821] [<c011a225>] do_softirq+0x45/0x50
[ 449.177821] [<c011a665>] irq_exit+0x55/0x70
[ 449.177821] [<c0104e83>] do_IRQ+0x43/0x90
[ 449.177821] [<c0103214>] common_interrupt+0x24/0x40
[ 449.177821] [<c010321e>] common_interrupt+0x2e/0x40
[ 449.177821] [<c0101ace>] default_idle+0x5e/0x90
[ 449.177821] [<c0101a70>] default_idle+0x0/0x90
[ 449.177821] [<c0101910>] cpu_idle+0x30/0x80
[ 449.177821] [<ffffffff>] 0xffffffff
[ 449.177821]
[ 449.177821] other info that might help us debug this:
[ 449.177821]
[ 449.177821] 1 lock held by swapper/0:
[ 449.177821] #0: (&n->lock){-+-+}, at: [<c023f975>]
neigh_timer_handler+0x15/0x2a0
[ 449.177821]
[ 449.177821] stack backtrace:
[ 449.177821] Pid: 0, comm: swapper Not tainted 2.6.25-rc2 #67
[ 449.177821] [<c0132d72>] print_circular_bug_tail+0x72/0x80
[ 449.177821] [<c0134a71>] __lock_acquire+0x971/0xf90
[ 449.177821] [<c013425e>] __lock_acquire+0x15e/0xf90
[ 449.177821] [<c0133c16>] trace_hardirqs_on+0x66/0x110
[ 449.177821] [<c01350ef>] lock_acquire+0x5f/0x80
[ 449.177821] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.177821] [<c029b2fe>] _read_lock_bh+0x2e/0x40
[ 449.177821] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.177821] [<c023d5a3>] neigh_lookup+0x43/0xb0
[ 449.177821] [<c8a74a8c>] ndisc_dst_alloc+0x13c/0x1a0 [ipv6]
[ 449.177821] [<c8a74950>] ndisc_dst_alloc+0x0/0x1a0 [ipv6]
[ 449.177821] [<c8a78575>] __ndisc_send+0x95/0x4e0 [ipv6]
[ 449.177821] [<c8a67410>] ip6_output+0x0/0xb00 [ipv6]
[ 449.177821] [<c8a79467>] ndisc_send_ns+0x67/0xa0 [ipv6]
[ 449.177821] [<c8a6eeb0>] ipv6_chk_addr+0xa0/0xb0 [ipv6]
[ 449.177821] [<c8a7a4ef>] ndisc_solicit+0x9f/0x1b0 [ipv6]
[ 449.177821] [<c0133af8>] mark_held_locks+0x38/0x70
[ 449.177821] [<c029b665>] _spin_unlock_irqrestore+0x45/0x60
[ 449.177821] [<c0133c16>] trace_hardirqs_on+0x66/0x110
[ 449.177821] [<c023fab5>] neigh_timer_handler+0x155/0x2a0
[ 449.177821] [<c011dd5f>] run_timer_softirq+0x15f/0x1c0
[ 449.177821] [<c023f960>] neigh_timer_handler+0x0/0x2a0
[ 449.177821] [<c023f960>] neigh_timer_handler+0x0/0x2a0
[ 449.177821] [<c011a182>] __do_softirq+0x52/0xb0
[ 449.177821] [<c011a225>] do_softirq+0x45/0x50
[ 449.177821] [<c011a665>] irq_exit+0x55/0x70
[ 449.177821] [<c0104e83>] do_IRQ+0x43/0x90
[ 449.177821] [<c0103214>] common_interrupt+0x24/0x40
[ 449.177821] [<c010321e>] common_interrupt+0x2e/0x40
[ 449.177821] [<c0101ace>] default_idle+0x5e/0x90
[ 449.177821] [<c0101a70>] default_idle+0x0/0x90
[ 449.177821] [<c0101910>] cpu_idle+0x30/0x80
[ 449.177821] =======================


2008-02-24 13:19:33

by Kristof Provost

[permalink] [raw]
Subject: Re: lockdep warnings in ipv6

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2008-02-24 13:10:58 (+0100), Jan Engelhardt <[email protected]> wrote:
> Hi,
>
>
> when doing IPv6 (ping6, ssh otherhost, etc.), lockdep spews a warning in
> 2.6.25-rc2 on the target. CONFIG_..._FRAME_POINTER is off,
I think that's the same bug I ran into. It was introduced in
69cc64d8d9 and reversed in 9ff56607468. I think Linus' tree has the
reversal by now.

Kristof

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHwW1CUEZ9DhGwDugRAkVJAKCFxUjwfQNTtAX0Z5OquJRyNimmCgCbBgio
0voGZL/B6iaz5kSLgTbV2EY=
=CPho
-----END PGP SIGNATURE-----

2008-02-25 01:16:00

by David Miller

[permalink] [raw]
Subject: Re: lockdep warnings in ipv6

From: Jan Engelhardt <[email protected]>
Date: Sun, 24 Feb 2008 13:10:58 +0100 (CET)

> when doing IPv6 (ping6, ssh otherhost, etc.), lockdep spews a warning in
> 2.6.25-rc2 on the target. CONFIG_..._FRAME_POINTER is off,

We reverted the change which causes this, and it results
in real bonafide lockups too :-)

commit 9ff566074689e3aed1488780b97714ec43ba361d
Author: David S. Miller <[email protected]>
Date: Sun Feb 17 18:39:54 2008 -0800

Revert "[NDISC]: Fix race in generic address resolution"

This reverts commit 69cc64d8d92bf852f933e90c888dfff083bd4fc9.

It causes recursive locking in IPV6 because unlike other
neighbour layer clients, it even needs neighbour cache
entries to send neighbour soliciation messages :-(

We'll have to find another way to fix this race.

Signed-off-by: David S. Miller <[email protected]>

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 7bb6a9a..a16cf1e 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -834,12 +834,18 @@ static void neigh_timer_handler(unsigned long arg)
}
if (neigh->nud_state & (NUD_INCOMPLETE | NUD_PROBE)) {
struct sk_buff *skb = skb_peek(&neigh->arp_queue);
-
+ /* keep skb alive even if arp_queue overflows */
+ if (skb)
+ skb_get(skb);
+ write_unlock(&neigh->lock);
neigh->ops->solicit(neigh, skb);
atomic_inc(&neigh->probes);
- }
+ if (skb)
+ kfree_skb(skb);
+ } else {
out:
- write_unlock(&neigh->lock);
+ write_unlock(&neigh->lock);
+ }

if (notify)
neigh_update_notify(neigh);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index c663fa5..8e17f65 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -368,6 +368,7 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)
if (!(neigh->nud_state&NUD_VALID))
printk(KERN_DEBUG "trying to ucast probe in NUD_INVALID\n");
dst_ha = neigh->ha;
+ read_lock_bh(&neigh->lock);
} else if ((probes -= neigh->parms->app_probes) < 0) {
#ifdef CONFIG_ARPD
neigh_app_ns(neigh);
@@ -377,6 +378,8 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)

arp_send(ARPOP_REQUEST, ETH_P_ARP, target, dev, saddr,
dst_ha, dev->dev_addr, NULL);
+ if (dst_ha)
+ read_unlock_bh(&neigh->lock);
}

static int arp_ignore(struct in_device *in_dev, __be32 sip, __be32 tip)