2007-12-04 09:49:33

by Joonwoo Park

[permalink] [raw]
Subject: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

Hi,
dev_set_rx_mode calls __dev_set_rx_mode with softirq disabled (by netif_tx_lock_bh)
therefore __dev_set_promiscuity can be called with softirq disabled.
It will cause in_interrupt() to return true and ASSERT_RTNL warning.
Is there a good solution to fix it besides blowing ASSERT_RTNL up?

Thanks.
Joonwoo

---
diff --git a/net/core/dev.c b/net/core/dev.c
index 86d6261..6f8e72f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2747,8 +2747,6 @@ static void __dev_set_promiscuity(struct net_device *dev, int inc)
{
unsigned short old_flags = dev->flags;

- ASSERT_RTNL();
-
if ((dev->promiscuity += inc) == 0)
dev->flags &= ~IFF_PROMISC;
else
---

message from e1000 (netdev->set_rx_mode is NULL) + vlan device mac address chage.

WARNING: at kernel/mutex.c:324 __mutex_trylock_slowpath()
Pid: 4523, comm: ifconfig Not tainted 2.6.24-rc3 #179
[<c400544a>] show_trace_log_lvl+0x1a/0x30
[<c4005fa2>] show_trace+0x12/0x20
[<c400679e>] dump_stack+0x6e/0x80
[<c43b084d>] mutex_trylock+0xcd/0x130
[<c42dccfd>] rtnl_trylock+0xd/0x10
[<c42d1f49>] __dev_set_promiscuity+0x19/0x130
[<c42d20b4>] __dev_set_rx_mode+0x54/0x90
[<c42d2151>] dev_unicast_add+0x61/0xb0
[<c438c764>] vlan_set_mac_address+0x104/0x150
[<c42d2ee1>] dev_set_mac_address+0x51/0x70
[<c42d3dca>] dev_ifsioc+0x11a/0x270
[<c42d4131>] dev_ioctl+0x211/0x510
[<c42c699a>] sock_ioctl+0xea/0x220
[<c408a468>] do_ioctl+0x28/0x80
[<c408a517>] vfs_ioctl+0x57/0x280
[<c408a779>] sys_ioctl+0x39/0x60
[<c40043ee>] sysenter_past_esp+0x5f/0x85
=======================


2007-12-04 22:23:44

by Jarek Poplawski

[permalink] [raw]
Subject: Re: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

Joonwoo Park wrote, On 12/04/2007 10:48 AM:

> Hi,
> dev_set_rx_mode calls __dev_set_rx_mode with softirq disabled (by netif_tx_lock_bh)
> therefore __dev_set_promiscuity can be called with softirq disabled.
> It will cause in_interrupt() to return true and ASSERT_RTNL warning.
> Is there a good solution to fix it besides blowing ASSERT_RTNL up?


Fixing ASSERT_RTNL up?!

But, IMHO, blowing ASSERT_RTNL up in a few places shouldn't be much
worse. After all, how long such a debugging code should be kept. It
seems, at least sometimes we should be a bit more confident of how
it's called.

Regards,
Jarek P.

2007-12-04 22:38:45

by Herbert Xu

[permalink] [raw]
Subject: Re: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

Joonwoo Park <[email protected]> wrote:
> Hi,
> dev_set_rx_mode calls __dev_set_rx_mode with softirq disabled (by netif_tx_lock_bh)
> therefore __dev_set_promiscuity can be called with softirq disabled.
> It will cause in_interrupt() to return true and ASSERT_RTNL warning.
> Is there a good solution to fix it besides blowing ASSERT_RTNL up?

We've talked this one before on netdev. It's on my todo list to fix.
The correct solution is to untangle this so that __dev_set_promiscuity
does not get called in the first place on BH paths.

Unfortunately I've been busy so I haven't completed the patches yet.
But as this problem is not urgent let's not just put on a bandaid.

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-12-05 05:31:10

by Joonwoo Park

[permalink] [raw]
Subject: RE: [PATCH] NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

2007/12/5, Herbert Xu <[email protected]>:
> Joonwoo Park <[email protected]> wrote:
> > Hi,
> > dev_set_rx_mode calls __dev_set_rx_mode with softirq disabled (by netif_tx_lock_bh)
> > therefore __dev_set_promiscuity can be called with softirq disabled.
> > It will cause in_interrupt() to return true and ASSERT_RTNL warning.
> > Is there a good solution to fix it besides blowing ASSERT_RTNL up?
>
> We've talked this one before on netdev. It's on my todo list to fix.
> The correct solution is to untangle this so that __dev_set_promiscuity
> does not get called in the first place on BH paths.
>
> Unfortunately I've been busy so I haven't completed the patches yet.
> But as this problem is not urgent let's not just put on a bandaid.
>

Thanks Herbert,
According to your opinion I tried a patch against davem/net-2.6.git

Thanks.
Joonwoo


[NET]: Fix to __dev_set_promiscuity does not get called with softirq is disabled

Signed-off-by: Joonwoo Park <[email protected]>
---
include/linux/netdevice.h | 3 +-
net/core/dev.c | 57 +++++++++++++++++++++++++++++---------------
net/core/dev_mcast.c | 21 +++++++++++++---
3 files changed, 56 insertions(+), 25 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1e6af4f..6532405 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1403,7 +1403,8 @@ extern int register_netdev(struct net_device *dev);
extern void unregister_netdev(struct net_device *dev);
/* Functions used for secondary unicast and multicast support */
extern void dev_set_rx_mode(struct net_device *dev);
-extern void __dev_set_rx_mode(struct net_device *dev);
+extern int __dev_set_rx_mode(struct net_device *dev);
+extern void __dev_set_rx_mode_fini(struct net_device *dev);
extern int dev_unicast_delete(struct net_device *dev, void *addr, int alen);
extern int dev_unicast_add(struct net_device *dev, void *addr, int alen);
extern int dev_mc_delete(struct net_device *dev, void *addr, int alen, int all);
diff --git a/net/core/dev.c b/net/core/dev.c
index 86d6261..eb1a11f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2822,39 +2822,50 @@ void dev_set_allmulti(struct net_device *dev, int inc)
* filtering it is put in promiscous mode while unicast addresses
* are present.
*/
-void __dev_set_rx_mode(struct net_device *dev)
+int __dev_set_rx_mode(struct net_device *dev)
{
/* dev_open will call this function so the list will stay sane. */
if (!(dev->flags&IFF_UP))
- return;
+ return 0;

if (!netif_device_present(dev))
- return;
+ return 0;

- if (dev->set_rx_mode)
+ if (dev->set_rx_mode) {
dev->set_rx_mode(dev);
- else {
- /* Unicast addresses changes may only happen under the rtnl,
- * therefore calling __dev_set_promiscuity here is safe.
- */
- if (dev->uc_count > 0 && !dev->uc_promisc) {
- __dev_set_promiscuity(dev, 1);
- dev->uc_promisc = 1;
- } else if (dev->uc_count == 0 && dev->uc_promisc) {
- __dev_set_promiscuity(dev, -1);
- dev->uc_promisc = 0;
- }
+ return 0;
+ }
+ return 1;
+}

- if (dev->set_multicast_list)
- dev->set_multicast_list(dev);
+void __dev_set_rx_mode_fini(struct net_device *dev)
+{
+ BUG_TRAP(dev->flags&IFF_UP);
+ BUG_TRAP(netif_device_present(dev));
+
+ /* Unicast addresses changes may only happen under the rtnl,
+ * therefore calling __dev_set_promiscuity here is safe.
+ */
+ if (dev->uc_count > 0 && !dev->uc_promisc) {
+ __dev_set_promiscuity(dev, 1);
+ dev->uc_promisc = 1;
+ } else if (dev->uc_count == 0 && dev->uc_promisc) {
+ __dev_set_promiscuity(dev, -1);
+ dev->uc_promisc = 0;
}
+
+ if (dev->set_multicast_list)
+ dev->set_multicast_list(dev);
}

void dev_set_rx_mode(struct net_device *dev)
{
+ int pending;
netif_tx_lock_bh(dev);
- __dev_set_rx_mode(dev);
+ pending = __dev_set_rx_mode(dev);
netif_tx_unlock_bh(dev);
+ if (pending)
+ __dev_set_rx_mode_fini(dev);
}

int __dev_addr_delete(struct dev_addr_list **list, int *count,
@@ -2929,14 +2940,17 @@ int __dev_addr_add(struct dev_addr_list **list, int *count,
int dev_unicast_delete(struct net_device *dev, void *addr, int alen)
{
int err;
+ int pending = 0;

ASSERT_RTNL();

netif_tx_lock_bh(dev);
err = __dev_addr_delete(&dev->uc_list, &dev->uc_count, addr, alen, 0);
if (!err)
- __dev_set_rx_mode(dev);
+ pending = __dev_set_rx_mode(dev);
netif_tx_unlock_bh(dev);
+ if (pending)
+ __dev_set_rx_mode_fini(dev);
return err;
}
EXPORT_SYMBOL(dev_unicast_delete);
@@ -2955,14 +2969,17 @@ EXPORT_SYMBOL(dev_unicast_delete);
int dev_unicast_add(struct net_device *dev, void *addr, int alen)
{
int err;
+ int pending = 0;

ASSERT_RTNL();

netif_tx_lock_bh(dev);
err = __dev_addr_add(&dev->uc_list, &dev->uc_count, addr, alen, 0);
if (!err)
- __dev_set_rx_mode(dev);
+ pending = __dev_set_rx_mode(dev);
netif_tx_unlock_bh(dev);
+ if (pending)
+ __dev_set_rx_mode_fini(dev);
return err;
}
EXPORT_SYMBOL(dev_unicast_add);
diff --git a/net/core/dev_mcast.c b/net/core/dev_mcast.c
index 69fff16..0170359 100644
--- a/net/core/dev_mcast.c
+++ b/net/core/dev_mcast.c
@@ -71,6 +71,7 @@
int dev_mc_delete(struct net_device *dev, void *addr, int alen, int glbl)
{
int err;
+ int pending = 0;

netif_tx_lock_bh(dev);
err = __dev_addr_delete(&dev->mc_list, &dev->mc_count,
@@ -81,9 +82,11 @@ int dev_mc_delete(struct net_device *dev, void *addr, int alen, int glbl)
* loaded filter is now wrong. Fix it
*/

- __dev_set_rx_mode(dev);
+ pending = __dev_set_rx_mode(dev);
}
netif_tx_unlock_bh(dev);
+ if (pending)
+ __dev_set_rx_mode_fini(dev);
return err;
}

@@ -94,12 +97,15 @@ int dev_mc_delete(struct net_device *dev, void *addr, int alen, int glbl)
int dev_mc_add(struct net_device *dev, void *addr, int alen, int glbl)
{
int err;
+ int pending = 0;

netif_tx_lock_bh(dev);
err = __dev_addr_add(&dev->mc_list, &dev->mc_count, addr, alen, glbl);
if (!err)
- __dev_set_rx_mode(dev);
+ pending = __dev_set_rx_mode(dev);
netif_tx_unlock_bh(dev);
+ if (pending)
+ __dev_set_rx_mode_fini(dev);
return err;
}

@@ -119,6 +125,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
{
struct dev_addr_list *da, *next;
int err = 0;
+ int pending = 0;

netif_tx_lock_bh(to);
da = from->mc_list;
@@ -140,9 +147,11 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
da = next;
}
if (!err)
- __dev_set_rx_mode(to);
+ pending = __dev_set_rx_mode(to);
netif_tx_unlock_bh(to);

+ if (pending)
+ __dev_set_rx_mode_fini(to);
return err;
}
EXPORT_SYMBOL(dev_mc_sync);
@@ -161,6 +170,7 @@ EXPORT_SYMBOL(dev_mc_sync);
void dev_mc_unsync(struct net_device *to, struct net_device *from)
{
struct dev_addr_list *da, *next;
+ int pending;

netif_tx_lock_bh(from);
netif_tx_lock_bh(to);
@@ -177,10 +187,13 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from)
}
da = next;
}
- __dev_set_rx_mode(to);
+ pending = __dev_set_rx_mode(to);

netif_tx_unlock_bh(to);
netif_tx_unlock_bh(from);
+
+ if (pending)
+ __dev_set_rx_mode_fini(to);
}
EXPORT_SYMBOL(dev_mc_unsync);

---

2007-12-05 05:36:23

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH] NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

On Wed, Dec 05, 2007 at 02:30:10PM +0900, Joonwoo Park wrote:
>
> @@ -140,9 +147,11 @@ int dev_mc_sync(struct net_device *to, struct net_device *from)
> da = next;
> }
> if (!err)
> - __dev_set_rx_mode(to);
> + pending = __dev_set_rx_mode(to);
> netif_tx_unlock_bh(to);
>
> + if (pending)
> + __dev_set_rx_mode_fini(to);

The idea is to not touch the unicast stuff at all on the multicast path.

Anyway, this was discussed on netdev so please check the archives because
there is more to this than just changing the multicast handling. We also
talked about consolidating the driver interface so that all these calls
have the same environment rather than the mix-and-match that we have now.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-12-05 05:56:20

by Joonwoo Park

[permalink] [raw]
Subject: Re: [PATCH] NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

2007/12/5, Herbert Xu <[email protected]>:
> The idea is to not touch the unicast stuff at all on the multicast path.
>
> Anyway, this was discussed on netdev so please check the archives because
> there is more to this than just changing the multicast handling. We also
> talked about consolidating the driver interface so that all these calls
> have the same environment rather than the mix-and-match that we have now.
>

Thanks again Herbert,
I'll take a look at old discussions.

Thanks.
Joonwoo

2007-12-05 06:21:04

by Jarek Poplawski

[permalink] [raw]
Subject: Re: NET: ASSERT_RTNL in __dev_set_promiscuity makes debug warning

On 04-12-2007 23:26, Jarek Poplawski wrote:
...
> But, IMHO, blowing ASSERT_RTNL up in a few places shouldn't be much
> worse. After all, how long such a debugging code should be kept. It
> seems, at least sometimes we should be a bit more confident of how
> it's called.

I see this won't be done this way, but, if it were, then there is no
reason to remove the second: documenting feature of ASSERT_RTNL, so
some comment about locking should be added.

Jarek P.