From: "Li, Zhen-Hua" <[email protected]>
In benet driver, netif_device_detach and netif_device_attach should be
called between rtnl_lock and rtnl_unlock.
Signed-off-by: Li, Zhen-Hua <[email protected]>
---
 drivers/net/ethernet/emulex/benet/be_main.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 3e6df47..9c44b3f 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4555,8 +4555,11 @@ static void be_func_recovery_task(struct work_struct *work)
 		rtnl_unlock();

 		status = lancer_recover_func(adapter);
-		if (!status)
+		if (!status) {
+			rtnl_lock();
 			netif_device_attach(adapter->netdev);
+			rtnl_unlock();
+		}
 	}

 	/* In Lancer, for all errors other than provisioning error (-EAGAIN),
@@ -4784,12 +4787,12 @@ static int be_suspend(struct pci_dev *pdev, pm_message_t state)
 	be_intr_set(adapter, false);
 	cancel_delayed_work_sync(&adapter->func_recovery_work);

+	rtnl_lock();
 	netif_device_detach(netdev);
 	if (netif_running(netdev)) {
-		rtnl_lock();
 		be_close(netdev);
-		rtnl_unlock();
 	}
+	rtnl_unlock();
 	be_clear(adapter);

 	pci_save_state(pdev);
@@ -4804,7 +4807,9 @@ static int be_resume(struct pci_dev *pdev)
 	struct be_adapter *adapter = pci_get_drvdata(pdev);
 	struct net_device *netdev = adapter->netdev;

+	rtnl_lock();
 	netif_device_detach(netdev);
+	rtnl_unlock();

 	status = pci_enable_device(pdev);
 	if (status)
@@ -4832,7 +4837,9 @@ static int be_resume(struct pci_dev *pdev)

 	schedule_delayed_work(&adapter->func_recovery_work,
 			      msecs_to_jiffies(1000));
+	rtnl_lock();
 	netif_device_attach(netdev);
+	rtnl_unlock();

 	if (adapter->wol_en)
 		be_setup_wol(adapter, false);
@@ -4853,7 +4860,9 @@ static void be_shutdown(struct pci_dev *pdev)
 	cancel_delayed_work_sync(&adapter->work);
 	cancel_delayed_work_sync(&adapter->func_recovery_work);

+	rtnl_lock();
 	netif_device_detach(adapter->netdev);
+	rtnl_unlock();

 	be_cmd_reset_function(adapter);

@@ -4957,7 +4966,9 @@ static void be_eeh_resume(struct pci_dev *pdev)

 	schedule_delayed_work(&adapter->func_recovery_work,
 			      msecs_to_jiffies(1000));
+	rtnl_lock();
 	netif_device_attach(netdev);
+	rtnl_unlock();
 	return;
 err:
 	dev_err(&adapter->pdev->dev, "EEH resume failed\n");
--
1.7.10.4
> -----Original Message-----
> From: Li, Zhen-Hua [mailto:[email protected]]
>
> In benet driver, netif_device_detach and netif_device_attach should be
> called between rtnl_lock and rtnl_unlock.
Zhen, it's not clear to me why rtnl_lock is needed around netif_device_attach().
Can you pls explain what exact data-structure you are protecting with the lock?
Are you seeing a warning stack trace? I don't see ASSERT_RTNL() called anywhere from netif_device_attach/detach().
thanks!
It is because netif_running() is called in netif_device_detach() and
netif_device_attach(). To keep the device state from changing before
netif_device_detach/attach has finished, I think the calls should be
made between rtnl_lock() and rtnl_unlock().
Thanks
Zhenhua
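To make the concern above concrete, here is a minimal user-space sketch in C. It is not kernel code. It assumes that netif_device_detach() and netif_device_attach() each flip the "present" bit and then consult netif_running() as a separate step, and that the running state is only changed with RTNL held. Every model_* name is hypothetical, and the RTNL mutex is reduced to a single-threaded flag, so the sketch illustrates the locking discipline rather than reproducing a race.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only: every model_* name is hypothetical. */
struct model_netdev {
	bool present;    /* stands in for __LINK_STATE_PRESENT */
	bool running;    /* stands in for what netif_running() reports */
	bool tx_stopped; /* stands in for the TX queues being stopped */
};

static bool model_rtnl_held; /* single-threaded stand-in for the RTNL mutex */

static void model_rtnl_lock(void)
{
	assert(!model_rtnl_held); /* the model does not support recursion */
	model_rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
	assert(model_rtnl_held);
	model_rtnl_held = false;
}

/* Same shape as netif_device_detach(): clear "present", then act on the
 * running state. The two steps are not one atomic operation. */
static void model_detach(struct model_netdev *dev)
{
	bool was_present = dev->present;

	dev->present = false;
	if (was_present && dev->running)
		dev->tx_stopped = true;
}

/* Same shape as netif_device_attach(). */
static void model_attach(struct model_netdev *dev)
{
	bool was_present = dev->present;

	dev->present = true;
	if (!was_present && dev->running)
		dev->tx_stopped = false;
}

/* Stand-in for a close path: changes the running state under the lock. */
static void model_close(struct model_netdev *dev)
{
	assert(model_rtnl_held);
	dev->running = false;
}

/* The pattern the patch proposes: hold the lock across detach/attach so a
 * close path that also takes the lock cannot interleave with them. */
static void model_detach_locked(struct model_netdev *dev)
{
	model_rtnl_lock();
	model_detach(dev);
	model_rtnl_unlock();
}

static void model_attach_locked(struct model_netdev *dev)
{
	model_rtnl_lock();
	model_attach(dev);
	model_rtnl_unlock();
}
```

Calling model_detach_locked() and then model_attach_locked() on a device that is present and running stops and then restarts the modelled TX queues, with the lock released again after each call.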
On 04/15/2014 04:07 PM, Sathya Perla wrote:
>> -----Original Message-----
>> From: Li, Zhen-Hua [mailto:[email protected]]
>>
>> In benet driver, netif_device_detach and netif_device_attach should be
>> called between rtnl_lock and rtnl_unlock.
>
> Zhen, it's not clear to me why rtnl_lock is needed around netif_device_attach().
> Can you pls explain what exact data-structure you are protecting with the lock?
>
> Are you seeing a warning stack trace? I don't see ASSERT_RTNL() called anywhere from netif_device_attach/detach().
>
> thanks!
>
> -----Original Message-----
> From: Li, ZhenHua [mailto:[email protected]]
>
> It is because netif_running() is called in netif_device_detach() and
> netif_device_attach(). To keep the device state from changing before
> netif_device_detach/attach has finished, I think the calls should be
> made between rtnl_lock() and rtnl_unlock().
Ok. I'd like to then factor the code slightly differently by using
routines like this:
be_close_sync() {
	rtnl_lock();

	netif_device_detach(netdev);
	if (netif_running(netdev))
		be_close(netdev);

	rtnl_unlock();
}
and similarly for be_open_sync()
And, I'd need some time to test these flows too.
Would you be OK with this?
thanks,
-Sathya
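For illustration only, the proposed helper and a possible counterpart could look like the following user-space sketch. The be_open_sync() body is an assumption: the thread only says it would be "similar", so the order used here (reopen if running, then attach) is a guess modelled on the existing resume path. All kernel functions are replaced by stubs that record the call order, and the explicit netdev parameter is added for the sake of a compilable example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* User-space stubs; only the call order matters here. */
struct net_device { bool running; };

static char trace[128]; /* records the order of calls */

static void note(const char *step)
{
	strcat(trace, step);
	strcat(trace, ";");
}

static void rtnl_lock(void)   { note("lock"); }
static void rtnl_unlock(void) { note("unlock"); }
static bool netif_running(const struct net_device *dev) { return dev->running; }
static void netif_device_detach(struct net_device *dev) { (void)dev; note("detach"); }
static void netif_device_attach(struct net_device *dev) { (void)dev; note("attach"); }
static int be_open(struct net_device *dev)  { (void)dev; note("open"); return 0; }
static int be_close(struct net_device *dev) { (void)dev; note("close"); return 0; }

/* As proposed in the message above, with an explicit parameter added. */
static void be_close_sync(struct net_device *netdev)
{
	rtnl_lock();

	netif_device_detach(netdev);
	if (netif_running(netdev))
		be_close(netdev);

	rtnl_unlock();
}

/* Assumed mirror image: reopen first, then mark the device present. */
static int be_open_sync(struct net_device *netdev)
{
	int status = 0;

	rtnl_lock();

	if (netif_running(netdev))
		status = be_open(netdev);
	if (!status)
		netif_device_attach(netdev);

	rtnl_unlock();
	return status;
}
```

With helpers of this shape, suspend and shutdown paths would call be_close_sync() and resume paths would call be_open_sync(), so that each detach or attach always happens inside one locked region.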
Yes, that's ok for me.
On Apr 15, 2014, at 7:57 PM, "Sathya Perla" <[email protected]> wrote:
>> -----Original Message-----
>> From: Li, ZhenHua [mailto:[email protected]]
>>
>> It is because netif_running() is called in netif_device_detach() and
>> netif_device_attach(). To keep the device state from changing before
>> netif_device_detach/attach has finished, I think the calls should be
>> made between rtnl_lock() and rtnl_unlock().
>
> Ok. I'd like to then factor the code slightly differently by using
> routines like this:
>
> be_close_sync() {
> rtnl_lock();
>
> netif_device_detach(netdev);
> if (netif_running(netdev))
> be_close(netdev);
>
> rtnl_unlock();
> }
>
> and similarly for be_open_sync()
>
> And, I'd need some time to test these flows too.
> Would you be OK with this?
>
> thanks,
> -Sathya
From: "Li, Zhen-Hua" <[email protected]>
Date: Tue, 15 Apr 2014 14:45:52 +0800
> From: "Li, Zhen-Hua" <[email protected]>
>
> In benet driver, netif_device_detach and netif_device_attach should be
> called between rtnl_lock and rtnl_unlock.
>
> Signed-off-by: Li, Zhen-Hua <[email protected]>
This absolutely does not look like a driver specific issue, therefore
I do not want you to make such locking context adjustments only in
your driver.
Do it somewhere generic so that every driver gets the fix, not just
your driver.
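One way to picture a generic approach is a check placed once in shared code, so that every driver calling it is covered. The user-space sketch below is only an illustration of that idea and is not the follow-up patch from this thread. All model_* names are hypothetical; model_assert_rtnl() stands in for an ASSERT_RTNL()-style check that complains when the lock is not held and then carries on.

```c
#include <assert.h>
#include <stdbool.h>

static bool model_rtnl_held; /* stand-in for "is RTNL locked?" */
static int model_warnings;   /* counts what would be warning splats */

static void model_rtnl_lock(void)   { model_rtnl_held = true; }
static void model_rtnl_unlock(void) { model_rtnl_held = false; }

/* Stand-in for an ASSERT_RTNL()-style check: warn, but keep going. */
static void model_assert_rtnl(void)
{
	if (!model_rtnl_held)
		model_warnings++;
}

struct model_netdev { bool present; };

/* Hypothetical generic detach that checks its locking context once,
 * in shared code, instead of relying on each driver to get it right. */
static void model_core_detach(struct model_netdev *dev)
{
	model_assert_rtnl();
	dev->present = false;
}

/* A driver path that forgets the lock ... */
static void model_driver_shutdown_unlocked(struct model_netdev *dev)
{
	model_core_detach(dev);
}

/* ... and one that takes it. */
static void model_driver_shutdown_locked(struct model_netdev *dev)
{
	model_rtnl_lock();
	model_core_detach(dev);
	model_rtnl_unlock();
}
```

In this model the unlocked driver path bumps model_warnings while the locked one does not, which is the kind of signal a check in generic code would give for every driver at once.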
Hi David,
I think you are right. I checked other NIC drivers and found that some of
them call rtnl_lock() and rtnl_unlock() around netif_device_detach() and
netif_device_attach(), while others do not.
I will create a new patch that fixes this in a generic way.
Regards
ZhenHua
On 04/16/2014 03:09 AM, David Miller wrote:
> From: "Li, Zhen-Hua" <[email protected]>
> Date: Tue, 15 Apr 2014 14:45:52 +0800
>
>> From: "Li, Zhen-Hua" <[email protected]>
>>
>> In benet driver, netif_device_detach and netif_device_attach should be
>> called between rtnl_lock and rtnl_unlock.
>>
>> Signed-off-by: Li, Zhen-Hua <[email protected]>
>
> This absolutely does not look like a driver specific issue, therefore
> I do not want you to make such locking context adjustments only in
> your driver.
>
> Do it somewhere generic so that every driver gets the fix, not just
> your driver.
>
Hi David,
I have sent out another patch for the fix.
Thanks
ZhenHua
On 04/16/2014 02:30 PM, Li, ZhenHua wrote:
> Hi David,
>
> I think you are right. I checked other NIC drivers and found that some of
> them call rtnl_lock() and rtnl_unlock() around netif_device_detach() and
> netif_device_attach(), while others do not.
>
> I will create a new patch that fixes this in a generic way.
>
> Regards
> ZhenHua
> On 04/16/2014 03:09 AM, David Miller wrote:
>> From: "Li, Zhen-Hua" <[email protected]>
>> Date: Tue, 15 Apr 2014 14:45:52 +0800
>>
>>> From: "Li, Zhen-Hua" <[email protected]>
>>>
>>> In benet driver, netif_device_detach and netif_device_attach should be
>>> called between rtnl_lock and rtnl_unlock.
>>>
>>> Signed-off-by: Li, Zhen-Hua <[email protected]>
>>
>> This absolutely does not look like a driver specific issue, therefore
>> I do not want you to make such locking context adjustments only in
>> your driver.
>>
>> Do it somewhere generic so that every driver gets the fix, not just
>> your driver.
>>
>