2010-08-12 23:10:34

by David Miller

[permalink] [raw]
Subject: [GIT] Networking


1) qlcnic driver using way too much stack space during probe, fix
from Dan Carpenter.

2) sfq was oops'able during config because it only partially provided
the classful qdisc interfaces. Fix this, and also add validation
code during qdisc registry so that we can catch this ahead of time
in the future. From Jarek Poplawski.

3) Properly undo pci_enable_device() when PCI driver probe fails in
ISDN. From Kulikov Vasiliy.

4) Buffer overflow fix in CAN protocol from Oliver Kartkopp.

5) kernel-doc and Kconfig dependency fixups from Randy Dunlap.

6) s390 driver use macros named READ/WRITE, fix to avoid namespace
conflicts. From Heiko Carstens and Ursula Braun.

7) Minor fixes to CAIF protocol stack from Sjur Braendeland.

8) The usual assortment of wireless fixes via John Linville and co.

Please pull, thanks a lot.

The following changes since commit ad41a1e0cab07c5125456e8d38e5b1ab148d04aa:

Merge branch 'io_remap_pfn_range' of git://http://www.jni.nu/cris (2010-08-12 10:17:19 -0700)

are available in the git repository at:

master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Ben Greear (1):
net: Use NET_XMIT_SUCCESS where possible.

Dan Carpenter (2):
qlcnic: clean up qlcnic_init_pci_info()
qlcnic: using too much stack

David S. Miller (3):
farsync: Fix compile warnings.
usbnet: rx_submit() should return an error code.
Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

Heiko Carstens (1):
claw: rename READ/WRITE defines to avoid redefinitions

Jarek Poplawski (4):
pkt_sched: sch_sfq: Add dummy unbind_tcf and put handles. Was: [PATCH] sfq: add dummy bind/unbind handles
pkt_sched: Add some basic qdisc class ops verification. Was: [PATCH] sfq: add dummy bind/unbind handles
pkt_sched: Fix sch_sfq vs tc_modify_qdisc oops
pkt_sched: Check .walk and .leaf class handlers

Johannes Berg (2):
cfg80211: fix locking in action frame TX
iwlagn: fix rts cts protection

John W. Linville (4):
Revert "p54pci: Add PCI ID for SMC2802W"
libertas: fix build break by including linux/sched.h
Merge branch 'master' of git://git.kernel.org/.../holtmann/bluetooth-2.6
net: make netpoll_rx return bool for !CONFIG_NETPOLL

Kulikov Vasiliy (3):
isdn: avm: call pci_disable_device() if pci_probe() failed
isdn: avm: call pci_disable_device() if pci_probe() failed
isdn: mISDN: call pci_disable_device() if pci_probe() failed

Mat Martineau (4):
Bluetooth: Fix endianness issue with L2CAP MPS configuration
Bluetooth: Change default L2CAP ERTM retransmit timeout
Bluetooth: Fix incorrect setting of remote_tx_win for L2CAP ERTM
Bluetooth: Use 3-DH5 payload size for default ERTM max PDU size

Oliver Hartkopp (1):
can: add limit for nframes and clean up signed/unsigned variables

Randy Dunlap (4):
etherdevice.h: fix kernel-doc typo
net/sock.h: add missing kernel-doc notation
phy.c: fix kernel-doc warnings
phylib: available for any speed ethernet

Sjur Braendeland (2):
caif: Bugfix - Increase default headroom size for control channel.
caif-spi: Bugfix SPI_DATA_POS settings were inverted.

Ursula Braun (1):
ctcm: rename READ/WRITE defines to avoid redefinitions

drivers/isdn/hardware/avm/c4.c | 1 +
drivers/isdn/hardware/avm/t1pci.c | 1 +
drivers/isdn/hardware/mISDN/mISDNinfineon.c | 5 +-
drivers/net/caif/caif_spi_slave.c | 4 +-
drivers/net/phy/Kconfig | 2 +-
drivers/net/phy/phy.c | 2 +-
drivers/net/qlcnic/qlcnic_main.c | 72 ++++++++++-------
drivers/net/usb/usbnet.c | 22 ++++--
drivers/net/wan/farsync.c | 15 ++--
drivers/net/wireless/iwlwifi/iwl-1000.c | 2 +-
drivers/net/wireless/iwlwifi/iwl-3945.c | 18 +----
drivers/net/wireless/iwlwifi/iwl-4965.c | 2 +-
drivers/net/wireless/iwlwifi/iwl-5000.c | 10 +-
drivers/net/wireless/iwlwifi/iwl-6000.c | 16 ++--
drivers/net/wireless/iwlwifi/iwl-agn-hcmd.c | 19 ++++-
drivers/net/wireless/iwlwifi/iwl-agn-tx.c | 20 +-----
drivers/net/wireless/iwlwifi/iwl-agn.c | 85 +++++---------------
drivers/net/wireless/iwlwifi/iwl-core.c | 29 ++++++-
drivers/net/wireless/iwlwifi/iwl-core.h | 14 ++--
drivers/net/wireless/iwlwifi/iwl3945-base.c | 5 +-
drivers/net/wireless/libertas/cfg.c | 1 +
drivers/net/wireless/p54/p54pci.c | 2 -
drivers/s390/net/claw.c | 118 +++++++++++++-------------
drivers/s390/net/claw.h | 4 +-
drivers/s390/net/ctcm_fsms.c | 60 +++++++-------
drivers/s390/net/ctcm_main.c | 80 +++++++++---------
drivers/s390/net/ctcm_main.h | 4 +-
drivers/s390/net/ctcm_mpc.c | 64 ++++++++-------
drivers/s390/net/ctcm_sysfs.c | 20 +++---
include/linux/etherdevice.h | 2 +-
include/linux/netpoll.h | 2 +-
include/net/bluetooth/l2cap.h | 4 +-
include/net/sock.h | 4 +-
net/bluetooth/l2cap.c | 11 +--
net/caif/cfpkt_skbuff.c | 2 +-
net/can/bcm.c | 41 ++++++---
net/dsa/Kconfig | 2 +-
net/sched/sch_api.c | 22 ++++-
net/sched/sch_atm.c | 4 +-
net/sched/sch_sfq.c | 14 +++-
net/sched/sch_tbf.c | 4 +-
net/sched/sch_teql.c | 2 +-
net/wireless/mlme.c | 8 ++-
43 files changed, 428 insertions(+), 391 deletions(-)


2010-08-14 18:06:24

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT] Networking

David,
I completely screwed up locking in my VM guard page patch, so when I
verified my fix for that I tried to enable all the lock debug crud I
possibly could.

And as a result, I got a locking error report, but it had nothing to
do with the VM guard page any more (so hopefully I finally fixed my
mindless code-drivel correctly. I'm a bit ashamed of myself).

Anyway, the lock warning I do get seems to be networking-related, and
is appended. Does this ring any bells? It could easily be something
old: I turn on lock debugging only when I look for bugs (or when
people point out bugs that I've created :^/ )

The only thing that seems to be related that google can find is pretty
recent too: a report from Valdis Kletnieks about this apparently
happening on e1000e too (Subject "mmotm 2010-08-11 - lockdep whinges
at e1000e driver ifconfig up"). So it does seem to be pretty recent.

Hmm? Everything obviously still works, but judging by the lockdep
report this might be a deadlock situation (lock taken in softirq _and_
outside softirq without disabling bhs)

Linus

---
r8169 0000:01:00.0: eth0: link up
r8169 0000:01:00.0: eth0: link up

=================================
[ INFO: inconsistent lock state ]
2.6.35-07956-g92fa5bd9-dirty #7
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
dbus-daemon/2432 [HC0[0]:SC1[2]:HE1:SE0] takes:
(&(&lock->lock)->rlock){+.?...}, at: [<ffffffff814e9fd8>]
ip6t_do_table+0x7f/0x3e7
{SOFTIRQ-ON-W} state was registered at:
[<ffffffff8105eb80>] __lock_acquire+0x756/0x1712
[<ffffffff8105fbbf>] lock_acquire+0x83/0x9d
[<ffffffff81505c77>] _raw_spin_lock+0x31/0x64
[<ffffffff814e943d>] get_counters+0xa4/0x139
[<ffffffff814e9509>] alloc_counters+0x37/0x42
[<ffffffff814eab47>] do_ip6t_get_ctl+0x107/0x35e
[<ffffffff81463241>] nf_sockopt+0x55/0x81
[<ffffffff81463280>] nf_getsockopt+0x13/0x15
[<ffffffff814d0cce>] ipv6_getsockopt+0x7f/0xb5
[<ffffffff814d7e29>] rawv6_getsockopt+0x3d/0x46
[<ffffffff81434fae>] sock_common_getsockopt+0xf/0x11
[<ffffffff814323ff>] sys_getsockopt+0x75/0x96
[<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
irq event stamp: 21072
hardirqs last enabled at (21072): [<ffffffff8103b5cf>]
local_bh_enable+0xbd/0xc2
hardirqs last disabled at (21071): [<ffffffff8103b564>]
local_bh_enable+0x52/0xc2
softirqs last enabled at (20000): [<ffffffff814bd355>]
unix_create1+0x164/0x17c
softirqs last disabled at (21037): [<ffffffff81002dcc>] call_softirq+0x1c/0x28

other info that might help us debug this:
3 locks held by dbus-daemon/2432:
#0: (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff810404f6>]
run_timer_softirq+0x14e/0x28c
#1: (rcu_read_lock){.+.+..}, at: [<ffffffff814dc9b2>] mld_sendpack+0x0/0x3bd
#2: (rcu_read_lock){.+.+..}, at: [<ffffffff81461ed5>] nf_hook_slow+0x0/0x10a

stack backtrace:
Pid: 2432, comm: dbus-daemon Not tainted 2.6.35-07956-g92fa5bd9-dirty #7
Call Trace:
<IRQ> [<ffffffff8105bf16>] print_usage_bug+0x1a4/0x1b5
[<ffffffff8100c7d3>] ? save_stack_trace+0x2a/0x47
[<ffffffff8105cb6c>] ? check_usage_forwards+0x0/0xc6
[<ffffffff8105c211>] mark_lock+0x2ea/0x552
[<ffffffff8105eb07>] __lock_acquire+0x6dd/0x1712
[<ffffffff8105246e>] ? local_clock+0x2b/0x3c
[<ffffffff8105b2e8>] ? lock_release_holdtime+0x1c/0x123
[<ffffffff8105fbbf>] lock_acquire+0x83/0x9d
[<ffffffff814e9fd8>] ? ip6t_do_table+0x7f/0x3e7
[<ffffffff81505c77>] _raw_spin_lock+0x31/0x64
[<ffffffff814e9fd8>] ? ip6t_do_table+0x7f/0x3e7
[<ffffffff814e9fd8>] ip6t_do_table+0x7f/0x3e7
[<ffffffff814ebd87>] ip6table_filter_hook+0x17/0x1c
[<ffffffff81461e92>] nf_iterate+0x41/0x84
[<ffffffff814db3ba>] ? dst_output+0x0/0x58
[<ffffffff81461f63>] nf_hook_slow+0x8e/0x10a
[<ffffffff814db3ba>] ? dst_output+0x0/0x58
[<ffffffff814dcc00>] mld_sendpack+0x24e/0x3bd
[<ffffffff8105c4cb>] ? mark_held_locks+0x52/0x70
[<ffffffff814dd478>] mld_ifc_timer_expire+0x24f/0x288
[<ffffffff814dd229>] ? mld_ifc_timer_expire+0x0/0x288
[<ffffffff81040584>] run_timer_softirq+0x1dc/0x28c
[<ffffffff810404f6>] ? run_timer_softirq+0x14e/0x28c
[<ffffffff8103b6dc>] ? __do_softirq+0x69/0x13d
[<ffffffff8103b715>] __do_softirq+0xa2/0x13d
[<ffffffff810590eb>] ? tick_program_event+0x25/0x27
[<ffffffff81002dcc>] call_softirq+0x1c/0x28
[<ffffffff81004834>] do_softirq+0x38/0x80
[<ffffffff8103b2e0>] irq_exit+0x45/0x87
[<ffffffff8101a278>] smp_apic_timer_interrupt+0x88/0x96
[<ffffffff81002893>] apic_timer_interrupt+0x13/0x20
<EOI>
r8169: WARNING! Changing of MTU on this NIC may lead to frame
reception errors!

2010-08-15 02:27:46

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: Linus Torvalds <[email protected]>
Date: Sat, 14 Aug 2010 11:05:54 -0700

> Anyway, the lock warning I do get seems to be networking-related, and
> is appended. Does this ring any bells? It could easily be something
> old: I turn on lock debugging only when I look for bugs (or when
> people point out bugs that I've created :^/ )
>
> The only thing that seems to be related that google can find is pretty
> recent too: a report from Valdis Kletnieks about this apparently
> happening on e1000e too (Subject "mmotm 2010-08-11 - lockdep whinges
> at e1000e driver ifconfig up"). So it does seem to be pretty recent.
>
> Hmm? Everything obviously still works, but judging by the lockdep
> report this might be a deadlock situation (lock taken in softirq _and_
> outside softirq without disabling bhs)

I'll take a look.

2010-08-15 05:09:29

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: Linus Torvalds <[email protected]>
Date: Sat, 14 Aug 2010 11:05:54 -0700

> Anyway, the lock warning I do get seems to be networking-related, and
> is appended. Does this ring any bells? It could easily be something
> old: I turn on lock debugging only when I look for bugs (or when
> people point out bugs that I've created :^/ )

This is a false positive but I have no idea how we can annotate
this to not trigger in lockdep.

These are per-cpu locks for counter management.

The get_counters() code knows that the locks for other cpu's counters
can only be taken in software interrupt context of that other cpu. So
it is legal to turn software interrupts back on when grabbing their
locks in base context.

CC:'ing Eric Dumazet since he put the code the way it is now :-)
Via commit 24b36f0193467fa727b85b4c004016a8dae999b9
("netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessary")

2010-08-15 05:10:19

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: David Miller <[email protected]>
Date: Sat, 14 Aug 2010 22:09:45 -0700 (PDT)

Umm, didn't actually CC: Eric, doing that now :-)

> From: Linus Torvalds <[email protected]>
> Date: Sat, 14 Aug 2010 11:05:54 -0700
>
>> Anyway, the lock warning I do get seems to be networking-related, and
>> is appended. Does this ring any bells? It could easily be something
>> old: I turn on lock debugging only when I look for bugs (or when
>> people point out bugs that I've created :^/ )
>
> This is a false positive but I have no idea how we can annotate
> this to not trigger in lockdep.
>
> These are per-cpu locks for counter management.
>
> The get_counters() code knows that the locks for other cpu's counters
> can only be taken in software interrupt context of that other cpu. So
> it is legal to turn software interrupts back on when grabbing their
> locks in base context.
>
> CC:'ing Eric Dumazet since he put the code the way it is now :-)
> Via commit 24b36f0193467fa727b85b4c004016a8dae999b9
> ("netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessary")
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2010-08-15 15:40:56

by Eric Dumazet

[permalink] [raw]
Subject: Re: [GIT] Networking

Le dimanche 15 août 2010 à 12:55 +0200, Eric Dumazet a écrit :

> We have one lock per cpu, and only one cpu can possibly lock its
> associated lock under softirq. So the usual lockdep check, warning a
> lock is taken with BH enabled, while same lock was taken inside softirq
> handler is triggering a false positive here.
>
> I believe no existing lockdep annotation can instruct lockdep this use
> is OK, I guess we have following choice :
>
> 1) Mask BH again, using xt_info_wrlock_lockdep(cpu) instead of
> xt_info_wrlock(cpu).
>
> xt_info_wrlock_lockdep() being a variant, that disables BH in case
> CONFIG_PROVE_LOCKING=y
>
> 2) temporally switch off lockdep in get_counters(), using a
> lockdep_off()/lockdep_on() pair, and a comment why this is necessary.
>

In any case, here is patch implementing the later

CC Patrick, our netfilter maintainer...

Maybe lockdep rules could be improved to take care of this later ?

Thanks

[PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive

After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

In this use case, the lockdep splat is a false positive, because
the BH disabling only matters for one cpu for a given lock
(we use one lock per cpu).

Use lockdep_off()/lockdep_on() around the problematic section to
avoid the splat.

Reported-by: Linus Torvalds <[email protected]>
Diagnosed-by: David S. Miller <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
CC: Patrick McHardy <[email protected]>
---
net/ipv4/netfilter/arp_tables.c | 3 +++
net/ipv4/netfilter/ip_tables.c | 3 +++
net/ipv6/netfilter/ip6_tables.c | 3 +++
3 files changed, 9 insertions(+)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 6bccba3..b4f7ebf 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -729,8 +729,10 @@ static void get_counters(const struct xt_table_info *t,
local_bh_enable();
/* Processing counters from other cpus, we can let bottom half enabled,
* (preemption is disabled)
+ * We must turn off lockdep to avoid a false positive.
*/

+ lockdep_off();
for_each_possible_cpu(cpu) {
if (cpu == curcpu)
continue;
@@ -743,6 +745,7 @@ static void get_counters(const struct xt_table_info *t,
}
xt_info_wrunlock(cpu);
}
+ lockdep_on();
put_cpu();
}

diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index c439721..dc5b2fd 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -903,8 +903,10 @@ get_counters(const struct xt_table_info *t,
local_bh_enable();
/* Processing counters from other cpus, we can let bottom half enabled,
* (preemption is disabled)
+ * We must turn off lockdep to avoid a false positive.
*/

+ lockdep_off();
for_each_possible_cpu(cpu) {
if (cpu == curcpu)
continue;
@@ -917,6 +919,7 @@ get_counters(const struct xt_table_info *t,
}
xt_info_wrunlock(cpu);
}
+ lockdep_on();
put_cpu();
}

diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 5359ef4..fb55443 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -916,8 +916,10 @@ get_counters(const struct xt_table_info *t,
local_bh_enable();
/* Processing counters from other cpus, we can let bottom half enabled,
* (preemption is disabled)
+ * We must turn off lockdep to avoid a false positive.
*/

+ lockdep_off();
for_each_possible_cpu(cpu) {
if (cpu == curcpu)
continue;
@@ -930,6 +932,7 @@ get_counters(const struct xt_table_info *t,
}
xt_info_wrunlock(cpu);
}
+ lockdep_on();
put_cpu();
}


2010-08-15 16:41:04

by Eric Dumazet

[permalink] [raw]
Subject: Re: [GIT] Networking

Le samedi 14 août 2010 à 22:09 -0700, David Miller a écrit :
> From: Linus Torvalds <[email protected]>
> Date: Sat, 14 Aug 2010 11:05:54 -0700
>
> > Anyway, the lock warning I do get seems to be networking-related, and
> > is appended. Does this ring any bells? It could easily be something
> > old: I turn on lock debugging only when I look for bugs (or when
> > people point out bugs that I've created :^/ )
>
> This is a false positive but I have no idea how we can annotate
> this to not trigger in lockdep.
>
> These are per-cpu locks for counter management.
>
> The get_counters() code knows that the locks for other cpu's counters
> can only be taken in software interrupt context of that other cpu. So
> it is legal to turn software interrupts back on when grabbing their
> locks in base context.
>
> CC:'ing Eric Dumazet since he put the code the way it is now :-)
> Via commit 24b36f0193467fa727b85b4c004016a8dae999b9
> ("netfilter: {ip,ip6,arp}_tables: dont block bottom half more than necessary")

Hmm... thats right.

We have one lock per cpu, and only one cpu can possibly lock its
associated lock under softirq. So the usual lockdep check, warning a
lock is taken with BH enabled, while same lock was taken inside softirq
handler is triggering a false positive here.

I believe no existing lockdep annotation can instruct lockdep this use
is OK, I guess we have following choice :

1) Mask BH again, using xt_info_wrlock_lockdep(cpu) instead of
xt_info_wrlock(cpu).

xt_info_wrlock_lockdep() being a variant, that disables BH in case
CONFIG_PROVE_LOCKING=y

2) temporally switch off lockdep in get_counters(), using a
lockdep_off()/lockdep_on() pair, and a comment why this is necessary.

I'll provide a patch with either way, just tell me which one you
prefer !

Thanks

2010-08-16 09:53:31

by Jarek Poplawski

[permalink] [raw]
Subject: Re: [GIT] Networking

Eric Dumazet wrote:
> Le dimanche 15 aou^t 2010 a` 12:55 +0200, Eric Dumazet a ?crit :
...
> [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
>
> After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> bottom half more than necessary), lockdep can raise a warning
> because we attempt to lock a spinlock with BH enabled, while
> the same lock is usually locked by another cpu in a softirq context.

Btw, could you remind us how get_counters() are serialized (I guess
you can't have them on 2 cpus at the same time)?

Thanks,
Jarek P.

2010-08-16 10:42:49

by Eric Dumazet

[permalink] [raw]
Subject: Re: [GIT] Networking

Le lundi 16 août 2010 à 09:53 +0000, Jarek Poplawski a écrit :
> Eric Dumazet wrote:
> > Le dimanche 15 aou^t 2010 a` 12:55 +0200, Eric Dumazet a écrit :
> ...
> > [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> >
> > After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> > bottom half more than necessary), lockdep can raise a warning
> > because we attempt to lock a spinlock with BH enabled, while
> > the same lock is usually locked by another cpu in a softirq context.
>
> Btw, could you remind us how get_counters() are serialized (I guess
> you can't have them on 2 cpus at the same time)?
>

get_counters() is serialized by the xt_find_table_lock() done from
get_entries(). This use a mutex to guard against changes.

You are right that if we ever allow two concurrent "iptables -nvL"
operations in the future (using a read lock on a rwlock instead of a
mutex), then we must disable BH even for summing data from the other
cpus.

Thanks

2010-08-16 10:49:05

by Jarek Poplawski

[permalink] [raw]
Subject: Re: [GIT] Networking

On Mon, Aug 16, 2010 at 12:42:41PM +0200, Eric Dumazet wrote:
> Le lundi 16 ao??t 2010 ?? 09:53 +0000, Jarek Poplawski a ?crit :
> > Eric Dumazet wrote:
> > > Le dimanche 15 aou^t 2010 a` 12:55 +0200, Eric Dumazet a ?crit :
> > ...
> > > [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
> > >
> > > After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> > > bottom half more than necessary), lockdep can raise a warning
> > > because we attempt to lock a spinlock with BH enabled, while
> > > the same lock is usually locked by another cpu in a softirq context.
> >
> > Btw, could you remind us how get_counters() are serialized (I guess
> > you can't have them on 2 cpus at the same time)?
> >
>
> get_counters() is serialized by the xt_find_table_lock() done from
> get_entries(). This use a mutex to guard against changes.
>
> You are right that if we ever allow two concurrent "iptables -nvL"
> operations in the future (using a read lock on a rwlock instead of a
> mutex), then we must disable BH even for summing data from the other
> cpus.

OK, thanks for the explanation,

Jarek P.

2010-08-16 19:35:51

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: Eric Dumazet <[email protected]>
Date: Sun, 15 Aug 2010 16:47:17 +0200

> Le dimanche 15 ao?t 2010 ? 12:55 +0200, Eric Dumazet a ?crit :
>
>> We have one lock per cpu, and only one cpu can possibly lock its
>> associated lock under softirq. So the usual lockdep check, warning a
>> lock is taken with BH enabled, while same lock was taken inside softirq
>> handler is triggering a false positive here.
>>
>> I believe no existing lockdep annotation can instruct lockdep this use
>> is OK, I guess we have following choice :
>>
>> 1) Mask BH again, using xt_info_wrlock_lockdep(cpu) instead of
>> xt_info_wrlock(cpu).
>>
>> xt_info_wrlock_lockdep() being a variant, that disables BH in case
>> CONFIG_PROVE_LOCKING=y
>>
>> 2) temporally switch off lockdep in get_counters(), using a
>> lockdep_off()/lockdep_on() pair, and a comment why this is necessary.
>>
>
> In any case, here is patch implementing the later

I'm hesistent to say that we should put this kind of patch in.

It will shut up lockdep for this specific case, but it also means
that if we do any other kinds of locking in this sequence we will
not validate it.

The valuable of this is open for debate I guess.

But locking is hard so I would say that disabling lockdep to kill a
warning it generates should be an absolute last resort.

I also don't think making the locking mechanics conditional upon
LOCKDEP is sane either, exactly because it means lockdep is testing
something other than what actually gets used in practice. :-)

2010-08-16 20:22:28

by Eric Dumazet

[permalink] [raw]
Subject: Re: [GIT] Networking

Le lundi 16 août 2010 à 12:36 -0700, David Miller a écrit :

> I'm hesistent to say that we should put this kind of patch in.
>
> It will shut up lockdep for this specific case, but it also means
> that if we do any other kinds of locking in this sequence we will
> not validate it.
>
> The valuable of this is open for debate I guess.
>
> But locking is hard so I would say that disabling lockdep to kill a
> warning it generates should be an absolute last resort.
>
> I also don't think making the locking mechanics conditional upon
> LOCKDEP is sane either, exactly because it means lockdep is testing
> something other than what actually gets used in practice. :-)

Hmm, maybe just disable BH, not for whole duration, but for each cpu.

Its a bit late here and I prefer to close this problem before whole
earth shout on me.

Thanks

[PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive

After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

Disable again BH to avoid these lockdep warnings.

Reported-by: Linus Torvalds <[email protected]>
Diagnosed-by: David S. Miller <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
CC: Patrick McHardy <[email protected]>
CC: Peter Zijlstra <[email protected]>
---
net/ipv4/netfilter/arp_tables.c | 2 ++
net/ipv4/netfilter/ip_tables.c | 2 ++
net/ipv6/netfilter/ip6_tables.c | 2 ++
3 files changed, 6 insertions(+)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 6bccba3..51d6c31 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -735,6 +735,7 @@ static void get_counters(const struct xt_table_info *t,
if (cpu == curcpu)
continue;
i = 0;
+ local_bh_disable();
xt_info_wrlock(cpu);
xt_entry_foreach(iter, t->entries[cpu], t->size) {
ADD_COUNTER(counters[i], iter->counters.bcnt,
@@ -742,6 +743,7 @@ static void get_counters(const struct xt_table_info *t,
++i;
}
xt_info_wrunlock(cpu);
+ local_bh_enable();
}
put_cpu();
}
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index c439721..97b64b2 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -909,6 +909,7 @@ get_counters(const struct xt_table_info *t,
if (cpu == curcpu)
continue;
i = 0;
+ local_bh_disable();
xt_info_wrlock(cpu);
xt_entry_foreach(iter, t->entries[cpu], t->size) {
ADD_COUNTER(counters[i], iter->counters.bcnt,
@@ -916,6 +917,7 @@ get_counters(const struct xt_table_info *t,
++i; /* macro does multi eval of i */
}
xt_info_wrunlock(cpu);
+ local_bh_enable();
}
put_cpu();
}
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 5359ef4..29a7bca 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -922,6 +922,7 @@ get_counters(const struct xt_table_info *t,
if (cpu == curcpu)
continue;
i = 0;
+ local_bh_disable();
xt_info_wrlock(cpu);
xt_entry_foreach(iter, t->entries[cpu], t->size) {
ADD_COUNTER(counters[i], iter->counters.bcnt,
@@ -929,6 +930,7 @@ get_counters(const struct xt_table_info *t,
++i;
}
xt_info_wrunlock(cpu);
+ local_bh_enable();
}
put_cpu();
}

2010-08-16 20:56:50

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: Eric Dumazet <[email protected]>
Date: Mon, 16 Aug 2010 22:22:10 +0200

> [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
>
> After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> bottom half more than necessary), lockdep can raise a warning
> because we attempt to lock a spinlock with BH enabled, while
> the same lock is usually locked by another cpu in a softirq context.
>
> Disable again BH to avoid these lockdep warnings.
>
> Reported-by: Linus Torvalds <[email protected]>
> Diagnosed-by: David S. Miller <[email protected]>
> Signed-off-by: Eric Dumazet <[email protected]>
> CC: Patrick McHardy <[email protected]>
> CC: Peter Zijlstra <[email protected]>

Acked-by: David S. Miller <[email protected]>

2010-08-17 22:14:19

by David Miller

[permalink] [raw]
Subject: Re: [GIT] Networking

From: Eric Dumazet <[email protected]>
Date: Mon, 16 Aug 2010 22:22:10 +0200

> [PATCH] netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
>
> After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
> bottom half more than necessary), lockdep can raise a warning
> because we attempt to lock a spinlock with BH enabled, while
> the same lock is usually locked by another cpu in a softirq context.
>
> Disable again BH to avoid these lockdep warnings.
>
> Reported-by: Linus Torvalds <[email protected]>
> Diagnosed-by: David S. Miller <[email protected]>
> Signed-off-by: Eric Dumazet <[email protected]>

Since Patrick seems to be busy, I'll merge this directly into
net-2.6, thanks everyone!