2023-12-20 09:22:12

by Mingyen Hsieh

[permalink] [raw]
Subject: [PATCH 2/4] wifi: mt76: mt7921: fix potential command timeout issues when suspend

From: Ming Yen Hsieh <[email protected]>

An ongoing command may be interrupted if suspended, leading to a
command timeout. Add a lock to the suspend function in order to
protect the ongoing command from being interrupted.

Signed-off-by: Leon Yen <[email protected]>
Signed-off-by: Ming Yen Hsieh <[email protected]>
---
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 57903c6e4f11..e1c53f18abdc 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -409,7 +409,9 @@ static int mt7921_pci_suspend(struct device *device)
if (err < 0)
goto restore_suspend;

+ mt792x_mutex_acquire(dev);
err = mt76_connac_mcu_set_hif_suspend(mdev, true);
+ mt792x_mutex_release(dev);
if (err)
goto restore_suspend;

--
2.18.0



2023-12-20 22:11:18

by Sean Wang

[permalink] [raw]
Subject: Re: [PATCH 2/4] wifi: mt76: mt7921: fix potential command timeout issues when suspend

Hi Mingyen,

On Wed, Dec 20, 2023 at 3:22 AM Mingyen Hsieh
<[email protected]> wrote:
>
> From: Ming Yen Hsieh <[email protected]>
>
> An ongoing command may be interrupted if suspended, leading to a
> command timeout. Add a lock to the suspend function in order to
> protect the ongoing command from being interrupted.
>
> Signed-off-by: Leon Yen <[email protected]>
> Signed-off-by: Ming Yen Hsieh <[email protected]>
> ---
> drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> index 57903c6e4f11..e1c53f18abdc 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> @@ -409,7 +409,9 @@ static int mt7921_pci_suspend(struct device *device)
> if (err < 0)
> goto restore_suspend;
>
> + mt792x_mutex_acquire(dev);
> err = mt76_connac_mcu_set_hif_suspend(mdev, true);
> + mt792x_mutex_release(dev);

I had assumed that all commands would finish up before the suspend
handler is called, given that processes are frozen before the
suspension can progress. Could you provide more details on the
specific conflict between these two commands and how it results in the
command timeout?

Moreover, while I understand you've identified some potential issues,
I suggest we work towards a comprehensive solution to address timeouts
caused by potential command conflicts, not just for the specific
command. For instance, we should take into account other MCU commands
during suspension and extend this consideration to similar scenarios
in SDIO suspension.

> if (err)
> goto restore_suspend;
>
> --
> 2.18.0
>
>

2023-12-21 02:27:18

by Mingyen Hsieh

[permalink] [raw]
Subject: Re: [PATCH 2/4] wifi: mt76: mt7921: fix potential command timeout issues when suspend

On Wed, 2023-12-20 at 16:11 -0600, Sean Wang wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> Hi Mingyen,
>
> On Wed, Dec 20, 2023 at 3:22 AM Mingyen Hsieh
> <[email protected]> wrote:
> >
> > From: Ming Yen Hsieh <[email protected]>
> >
> > An ongoing command may be interrupted if suspended, leading to a
> > command timeout. Add a lock to the suspend function in order to
> > protect the ongoing command from being interrupted.
> >
> > Signed-off-by: Leon Yen <[email protected]>
> > Signed-off-by: Ming Yen Hsieh <[email protected]>
> > ---
> > drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > index 57903c6e4f11..e1c53f18abdc 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > @@ -409,7 +409,9 @@ static int mt7921_pci_suspend(struct device
> *device)
> > if (err < 0)
> > goto restore_suspend;
> >
> > + mt792x_mutex_acquire(dev);
> > err = mt76_connac_mcu_set_hif_suspend(mdev, true);
> > + mt792x_mutex_release(dev);
>
> I had assumed that all commands would finish up before the suspend
> handler is called, given that processes are frozen before the
> suspension can progress. Could you provide more details on the
> specific conflict between these two commands and how it results in
> the
> command timeout?
>
> Moreover, while I understand you've identified some potential issues,
> I suggest we work towards a comprehensive solution to address
> timeouts
> caused by potential command conflicts, not just for the specific
> command. For instance, we should take into account other MCU commands
> during suspension and extend this consideration to similar scenarios
> in SDIO suspension.
>
> > if (err)
> > goto restore_suspend;
> >
> > --
> > 2.18.0
> >
> >

Hi Sean,

This case occurs when entering suspend mode during a connection. As the
system transitions from a connected state to a disconnected state upon
entering suspend, it calls the .regd_notifier() function. Consequently,
the mt7921 executes the CLC cmd. Meanwhile, because the suspend cmd is
not protected by a lock, the system proceeds to suspend, resulting in
the CLC cmd not completing and causing a command timeout.