bond_init() is not releasing rtnl_sem after register_netdevice() and
before calling unregister_netdevice() (from bond_free_all()) in the
exception path. As the device registration is not completed
(dev->reg_state == NETREG_REGISTERING), the call to
unregister_netdevice() triggers BUG_ON(dev->reg_state != NETREG_REGISTERED).
Signed-off-by: Florin Malita <[email protected]>
----
diff --git a/drivers/net/bonding/bond_main.c
b/drivers/net/bonding/bond_main.c
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5039,6 +5039,10 @@ static int __init bonding_init(void)
return 0;
out_err:
+ /* give register_netdevice() a chance to complete */
+ rtnl_unlock();
+ rtnl_lock();
+
/* free and unregister all bonds that were successfully added */
bond_free_all();
Florin Malita <[email protected]> wrote:
>
> User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720)
Your MUA is converting tabs to spaces.
> bond_init() is not releasing rtnl_sem after register_netdevice() and
> before calling unregister_netdevice() (from bond_free_all()) in the
> exception path. As the device registration is not completed
> (dev->reg_state == NETREG_REGISTERING), the call to
> unregister_netdevice() triggers BUG_ON(dev->reg_state != NETREG_REGISTERED).
>
> Signed-off-by: Florin Malita <[email protected]>
> ----
> diff --git a/drivers/net/bonding/bond_main.c
> b/drivers/net/bonding/bond_main.c
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -5039,6 +5039,10 @@ static int __init bonding_init(void)
> return 0;
>
> out_err:
> + /* give register_netdevice() a chance to complete */
> + rtnl_unlock();
> + rtnl_lock();
> +
> /* free and unregister all bonds that were successfully added */
> bond_free_all();
OK, so the bonding devices which were registered are now in state
NETREG_REGISTERING and we need to run netdev_run_todo() to flip them into
state NETREG_REGISTERED. If we don't do that,
bond_free_all()->unregister_netdevice() will go BUG.
The fix is solid enough, although a better comment is needed. Let's let
Dave decide whether that was a sane thing to go BUG over..
From: Florin Malita <[email protected]>
bond_init() is not releasing rtnl_sem after register_netdevice() and before
calling unregister_netdevice() (from bond_free_all()) in the exception
path. As the device registration is not completed (dev->reg_state ==
NETREG_REGISTERING), the call to unregister_netdevice() triggers
BUG_ON(dev->reg_state != NETREG_REGISTERED).
Signed-off-by: Florin Malita <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---
drivers/net/bonding/bond_main.c | 8 ++++++++
1 files changed, 8 insertions(+)
diff -puN drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception drivers/net/bonding/bond_main.c
--- devel/drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception 2005-09-17 23:18:38.000000000 -0700
+++ devel-akpm/drivers/net/bonding/bond_main.c 2005-09-17 23:31:02.000000000 -0700
@@ -5039,6 +5039,14 @@ static int __init bonding_init(void)
return 0;
out_err:
+ /*
+ * rtnl_unlock() will run netdev_run_todo(), putting the
+ * thus-far-registered bonding devices into a state which
+ * unregigister_netdevice() will accept
+ */
+ rtnl_unlock();
+ rtnl_lock();
+
/* free and unregister all bonds that were successfully added */
bond_free_all();
_
From: Andrew Morton <[email protected]>
Date: Sat, 17 Sep 2005 23:32:24 -0700
> The fix is solid enough, although a better comment is needed. Let's
> let Dave decide whether that was a sane thing to go BUG over..
The fix is fine and so is the BUG() check.
Usually, you're supposed to make sure that the very last thing
you do is register_netdev(), and that all possible errors are
possible only before that call.
And that's what BOND basically does, except that it must register
multiple devices in a loop and therefore cannot follow that rule
precisely.
So I've added the patch to my tree, it's fine to do this for a
special case usage like this one.
Andrew Morton wrote:
> diff -puN drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception drivers/net/bonding/bond_main.c
> --- devel/drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception 2005-09-17 23:18:38.000000000 -0700
> +++ devel-akpm/drivers/net/bonding/bond_main.c 2005-09-17 23:31:02.000000000 -0700
> @@ -5039,6 +5039,14 @@ static int __init bonding_init(void)
> return 0;
>
> out_err:
> + /*
> + * rtnl_unlock() will run netdev_run_todo(), putting the
> + * thus-far-registered bonding devices into a state which
> + * unregigister_netdevice() will accept
> + */
> + rtnl_unlock();
> + rtnl_lock();
> +
Don't we want a schedule() or schedule_timeout(1) in between?
Jeff
On Wed, Sep 21, 2005 at 10:38:26PM -0400, Jeff Garzik wrote:
> >+ rtnl_unlock();
> >+ rtnl_lock();
> >+
>
>
> Don't we want a schedule() or schedule_timeout(1) in between?
No. rtnl_unlock() does the needed calls directly.
Jeff Garzik <[email protected]> wrote:
>
> Andrew Morton wrote:
> > diff -puN drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception drivers/net/bonding/bond_main.c
> > --- devel/drivers/net/bonding/bond_main.c~bond_mainc-fix-device-deregistration-in-init-exception 2005-09-17 23:18:38.000000000 -0700
> > +++ devel-akpm/drivers/net/bonding/bond_main.c 2005-09-17 23:31:02.000000000 -0700
> > @@ -5039,6 +5039,14 @@ static int __init bonding_init(void)
> > return 0;
> >
> > out_err:
> > + /*
> > + * rtnl_unlock() will run netdev_run_todo(), putting the
> > + * thus-far-registered bonding devices into a state which
> > + * unregigister_netdevice() will accept
> > + */
> > + rtnl_unlock();
> > + rtnl_lock();
> > +
>
>
> Don't we want a schedule() or schedule_timeout(1) in between?
>
No, it's all synchronous. See the nice comment ;)