In xgmac_probe, the priv->tx_timeout_work is bound with
xgmac_tx_timeout_work. In xgmac_remove, if there is an
unfinished work, there might be a race condition that
priv->base was written byte after iounmap it.
Fix it by finishing the work before cleanup.
Signed-off-by: Zheng Wang <[email protected]>
---
drivers/net/ethernet/calxeda/xgmac.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index f4f87dfa9687..94c3804001e3 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -1831,6 +1831,7 @@ static int xgmac_remove(struct platform_device *pdev)
/* Free the IRQ lines */
free_irq(ndev->irq, ndev);
free_irq(priv->pmt_irq, ndev);
+ cancel_work_sync(&priv->tx_timeout_work);
unregister_netdev(ndev);
netif_napi_del(&priv->napi);
--
2.25.1
On 2023/3/9 11:56, Zheng Wang wrote:
> In xgmac_probe, the priv->tx_timeout_work is bound with
> xgmac_tx_timeout_work. In xgmac_remove, if there is an
> unfinished work, there might be a race condition that
> priv->base was written byte after iounmap it.
>
> Fix it by finishing the work before cleanup.
This should go to net branch, so title should be:
[PATCH net] net: calxeda: fix race condition in xgmac_remove due to unfinshed work
From history commit, it seems more common to use "net: calxedaxgmac" instead of
"net: calxeda", I am not sure which one is better.
Also there should be a Fixes tag for net branch, maybe:
Fixes: 8746f671ef04 ("net: calxedaxgmac: fix race between xgmac_tx_complete and xgmac_tx_err")
>
> Signed-off-by: Zheng Wang <[email protected]>
> ---
> drivers/net/ethernet/calxeda/xgmac.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
> index f4f87dfa9687..94c3804001e3 100644
> --- a/drivers/net/ethernet/calxeda/xgmac.c
> +++ b/drivers/net/ethernet/calxeda/xgmac.c
> @@ -1831,6 +1831,7 @@ static int xgmac_remove(struct platform_device *pdev)
> /* Free the IRQ lines */
> free_irq(ndev->irq, ndev);
> free_irq(priv->pmt_irq, ndev);
> + cancel_work_sync(&priv->tx_timeout_work);
It seems the blow function need to stop the dev_watchdog() from
calling dev->netdev_ops->ndo_tx_timeout before calling
cancel_work_sync(&priv->tx_timeout_work), otherwise the
dev_watchdog() may trigger the priv->tx_timeout_work to run again.
netif_carrier_off(ndev);
netif_tx_disable(ndev);
>
> unregister_netdev(ndev);
> netif_napi_del(&priv->napi);
>
On 2023/3/9 14:23, Yunsheng Lin wrote:
> On 2023/3/9 11:56, Zheng Wang wrote:
>> In xgmac_probe, the priv->tx_timeout_work is bound with
>> xgmac_tx_timeout_work. In xgmac_remove, if there is an
>> unfinished work, there might be a race condition that
>> priv->base was written byte after iounmap it.
>>
>> Fix it by finishing the work before cleanup.
>
> This should go to net branch, so title should be:
>
> [PATCH net] net: calxeda: fix race condition in xgmac_remove due to unfinshed work
typo error:
unfinshed -> unfinished
>
>>From history commit, it seems more common to use "net: calxedaxgmac" instead of
> "net: calxeda", I am not sure which one is better.
>
> Also there should be a Fixes tag for net branch, maybe:
>
> Fixes: 8746f671ef04 ("net: calxedaxgmac: fix race between xgmac_tx_complete and xgmac_tx_err")
>
>
>>
>> Signed-off-by: Zheng Wang <[email protected]>
>> ---
>> drivers/net/ethernet/calxeda/xgmac.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
>> index f4f87dfa9687..94c3804001e3 100644
>> --- a/drivers/net/ethernet/calxeda/xgmac.c
>> +++ b/drivers/net/ethernet/calxeda/xgmac.c
>> @@ -1831,6 +1831,7 @@ static int xgmac_remove(struct platform_device *pdev)
>> /* Free the IRQ lines */
>> free_irq(ndev->irq, ndev);
>> free_irq(priv->pmt_irq, ndev);
>> + cancel_work_sync(&priv->tx_timeout_work);
>
> It seems the blow function need to stop the dev_watchdog() from
> calling dev->netdev_ops->ndo_tx_timeout before calling
> cancel_work_sync(&priv->tx_timeout_work), otherwise the
> dev_watchdog() may trigger the priv->tx_timeout_work to run again.
>
> netif_carrier_off(ndev);
> netif_tx_disable(ndev);
>
>>
>> unregister_netdev(ndev);
>> netif_napi_del(&priv->napi);
>>
> .
>
Yunsheng Lin <[email protected]> 于2023年3月9日周四 14:24写道:
>
> On 2023/3/9 11:56, Zheng Wang wrote:
> > In xgmac_probe, the priv->tx_timeout_work is bound with
> > xgmac_tx_timeout_work. In xgmac_remove, if there is an
> > unfinished work, there might be a race condition that
> > priv->base was written byte after iounmap it.
> >
> > Fix it by finishing the work before cleanup.
>
> This should go to net branch, so title should be:
>
> [PATCH net] net: calxeda: fix race condition in xgmac_remove due to unfinshed work
>
Sorry for the confusion.
> From history commit, it seems more common to use "net: calxedaxgmac" instead of
> "net: calxeda", I am not sure which one is better.
>
> Also there should be a Fixes tag for net branch, maybe:
>
> Fixes: 8746f671ef04 ("net: calxedaxgmac: fix race between xgmac_tx_complete and xgmac_tx_err")
>
>
Yes, I was eager to report the fix and ignored that. Thanks for
pointing that out.
> >
> > Signed-off-by: Zheng Wang <[email protected]>
> > ---
> > drivers/net/ethernet/calxeda/xgmac.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
> > index f4f87dfa9687..94c3804001e3 100644
> > --- a/drivers/net/ethernet/calxeda/xgmac.c
> > +++ b/drivers/net/ethernet/calxeda/xgmac.c
> > @@ -1831,6 +1831,7 @@ static int xgmac_remove(struct platform_device *pdev)
> > /* Free the IRQ lines */
> > free_irq(ndev->irq, ndev);
> > free_irq(priv->pmt_irq, ndev);
> > + cancel_work_sync(&priv->tx_timeout_work);
>
> It seems the blow function need to stop the dev_watchdog() from
> calling dev->netdev_ops->ndo_tx_timeout before calling
> cancel_work_sync(&priv->tx_timeout_work), otherwise the
> dev_watchdog() may trigger the priv->tx_timeout_work to run again.
>
> netif_carrier_off(ndev);
> netif_tx_disable(ndev);
>
> >
> > unregister_netdev(ndev);
> > netif_napi_del(&priv->napi);
> >
Yes, I agree with that. Thanks for your advice. I learned a lot from it.
Best regards,
Zheng
Yunsheng Lin <[email protected]> 于2023年3月9日周四 14:27写道:
>
> On 2023/3/9 14:23, Yunsheng Lin wrote:
> > On 2023/3/9 11:56, Zheng Wang wrote:
> >> In xgmac_probe, the priv->tx_timeout_work is bound with
> >> xgmac_tx_timeout_work. In xgmac_remove, if there is an
> >> unfinished work, there might be a race condition that
> >> priv->base was written byte after iounmap it.
> >>
> >> Fix it by finishing the work before cleanup.
> >
> > This should go to net branch, so title should be:
> >
> > [PATCH net] net: calxeda: fix race condition in xgmac_remove due to unfinshed work
>
> typo error:
> unfinshed -> unfinished
>
Got it. Will correct it in the next version of patch.
Thanks,
Zheng