2010-12-14 16:10:05

by Tejun Heo

[permalink] [raw]
Subject: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

Michael pointed out that bnx2_close() already cancels bp->reset_task
and thus it is guaranteed to be idle when bnx2_remove_one() is called.
Remove the unnecessary cancel_work_sync() in remove_one.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Michael Chan <[email protected]>
---
drivers/net/bnx2.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 5c811f3..85fc2c8 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -8393,8 +8393,6 @@ bnx2_remove_one(struct pci_dev *pdev)
struct net_device *dev = pci_get_drvdata(pdev);
struct bnx2 *bp = netdev_priv(dev);

- cancel_work_sync(&bp->reset_task);
-
unregister_netdev(dev);

if (bp->mips_firmware)


2010-12-14 17:51:42

by Michael Chan

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one


On Tue, 2010-12-14 at 08:09 -0800, Tejun Heo wrote:
> Michael pointed out that bnx2_close() already cancels bp->reset_task
> and thus it is guaranteed to be idle when bnx2_remove_one() is called.
> Remove the unnecessary cancel_work_sync() in remove_one.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Michael Chan <[email protected]>

Acked-by: Michael Chan <[email protected]>

> ---
> drivers/net/bnx2.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> index 5c811f3..85fc2c8 100644
> --- a/drivers/net/bnx2.c
> +++ b/drivers/net/bnx2.c
> @@ -8393,8 +8393,6 @@ bnx2_remove_one(struct pci_dev *pdev)
> struct net_device *dev = pci_get_drvdata(pdev);
> struct bnx2 *bp = netdev_priv(dev);
>
> - cancel_work_sync(&bp->reset_task);
> -
> unregister_netdev(dev);
>
> if (bp->mips_firmware)
>

2010-12-15 13:52:35

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

On 12/14/2010 06:48 PM, Michael Chan wrote:
>
> On Tue, 2010-12-14 at 08:09 -0800, Tejun Heo wrote:
>> Michael pointed out that bnx2_close() already cancels bp->reset_task
>> and thus it is guaranteed to be idle when bnx2_remove_one() is called.
>> Remove the unnecessary cancel_work_sync() in remove_one.
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> Cc: Michael Chan <[email protected]>
>
> Acked-by: Michael Chan <[email protected]>

After looking through the code, I don't think this is necessarily
correct. ->ndo_close() doesn't guarantee that the watchdog timer has
finished running (the timer is deleted with del_timer() not
del_timer_sync()). ie. the watchdog timer could still be running
after ->ndo_close() and may schedule reset_task. If remove_one
doesn't flush the task, it may still be running when remove_one() is
called.

David, am I missing something? Wouldn't it cleaner to guarantee that
->ndo_close() is called with the guarantee that the watchdog timer is
not running anymore?

--
tejun

2010-12-20 21:11:18

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

From: Tejun Heo <[email protected]>
Date: Wed, 15 Dec 2010 14:52:29 +0100

> After looking through the code, I don't think this is necessarily
> correct. ->ndo_close() doesn't guarantee that the watchdog timer has
> finished running (the timer is deleted with del_timer() not
> del_timer_sync()). ie. the watchdog timer could still be running
> after ->ndo_close() and may schedule reset_task. If remove_one
> doesn't flush the task, it may still be running when remove_one() is
> called.
>
> David, am I missing something? Wouldn't it cleaner to guarantee that
> ->ndo_close() is called with the guarantee that the watchdog timer is
> not running anymore?

It would but we can't just make the change over to del_timer_sync()
otherwise we'd deadlock on netif_tx_lock().

But I think things might be OK as-is.

The timer is deleted by dev_deactivate_many() which resets the qdisc
to the no-op qdisc. Then it deletes the timer.

Any running timer will complete or see the no-op qdisc attached and
return immediately.

synchronize_rcu() is then executed which guarentees completion.

Since both the watchdog timer itself and the del_timer() call run
with netif_tx_lock() held, this makes sure the timer, once deleted,
will only see the no-op qdisc and return immediately if it is
amidst running, else it has already returned when the timer delete
completes.

So we might be OK here.

2010-12-21 10:51:12

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

Hello, David.

On Mon, Dec 20, 2010 at 01:11:46PM -0800, David Miller wrote:
> It would but we can't just make the change over to del_timer_sync()
> otherwise we'd deadlock on netif_tx_lock().
>
> But I think things might be OK as-is.
>
> The timer is deleted by dev_deactivate_many() which resets the qdisc
> to the no-op qdisc. Then it deletes the timer.
>
> Any running timer will complete or see the no-op qdisc attached and
> return immediately.
>
> synchronize_rcu() is then executed which guarentees completion.
>
> Since both the watchdog timer itself and the del_timer() call run
> with netif_tx_lock() held, this makes sure the timer, once deleted,
> will only see the no-op qdisc and return immediately if it is
> amidst running, else it has already returned when the timer delete
> completes.
>
> So we might be OK here.

Yeah, I agree the synchronize_rcu() there would guarantee the actual
timer completion but as it currently stands it looks a bit too subtle.
Maybe it's a good idea to add a big fat comment explaining that the
the timer is guaranteed to stop after close() and how it's guaranteed
through synchronize_rcu() at the moment? Also, it might be better to
use synchronize_sched() there as timer synchronization through
synchronize_rcu() is more of a happy accident.

Thanks.

--
tejun

2010-12-21 20:19:31

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

From: Tejun Heo <[email protected]>
Date: Tue, 21 Dec 2010 11:51:04 +0100

> Yeah, I agree the synchronize_rcu() there would guarantee the actual
> timer completion but as it currently stands it looks a bit too subtle.
> Maybe it's a good idea to add a big fat comment explaining that the
> the timer is guaranteed to stop after close() and how it's guaranteed
> through synchronize_rcu() at the moment? Also, it might be better to
> use synchronize_sched() there as timer synchronization through
> synchronize_rcu() is more of a happy accident.

I'm not sure the synchronize_*() is even necessary to guarentee
watchdog timer completion.

Like I said, I think the netif_tx_lock() held around both the timer
function itself, and the del_timer() call, are sufficient.

So, this ensures that the watchdog timer either runs to completion or
sees the no-op scheduler attached and returns immediately without
rescheduling the timer.

In any event, I'm going to apply your bnx2 patch to net-next-2.6

Thanks.

2010-12-22 08:48:25

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH net-next-2.6] bnx2: remove cancel_work_sync() from remove_one

Hello,

On Tue, Dec 21, 2010 at 12:20:00PM -0800, David Miller wrote:
> From: Tejun Heo <[email protected]>
> Date: Tue, 21 Dec 2010 11:51:04 +0100
>
> > Yeah, I agree the synchronize_rcu() there would guarantee the actual
> > timer completion but as it currently stands it looks a bit too subtle.
> > Maybe it's a good idea to add a big fat comment explaining that the
> > the timer is guaranteed to stop after close() and how it's guaranteed
> > through synchronize_rcu() at the moment? Also, it might be better to
> > use synchronize_sched() there as timer synchronization through
> > synchronize_rcu() is more of a happy accident.
>
> I'm not sure the synchronize_*() is even necessary to guarentee
> watchdog timer completion.
>
> Like I said, I think the netif_tx_lock() held around both the timer
> function itself, and the del_timer() call, are sufficient.
>
> So, this ensures that the watchdog timer either runs to completion or
> sees the no-op scheduler attached and returns immediately without
> rescheduling the timer.
>
> In any event, I'm going to apply your bnx2 patch to net-next-2.6

Oh, yeah, all is good if the timer is guaranteed to stop after close
one way or the other. Thanks and happy new year!

--
tejun