2020-06-05 07:48:41

by Tony Chuang

Subject: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

From: Yan-Hsuan Chuang <[email protected]>

Some platforms cannot read the DBI register successfully when applying
the ASPM settings. After the read fails, the bus can become unstable
and the device becomes unavailable [1]. On those platforms ASPM should
be disabled. However, ASPM lets the driver reduce power consumption in
power save mode, so it is still needed elsewhere. Add a module
parameter that lets the affected platforms disable ASPM so the device
keeps working, while other platforms still benefit from the lower
power consumption that ASPM brings.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=206411
[2] Note that my lenovo T430 is the same.

Fixes: 3dff7c6e3749 ("rtw88: allows to enable/disable HCI link PS mechanism")
Signed-off-by: Yan-Hsuan Chuang <[email protected]>
---
drivers/net/wireless/realtek/rtw88/pci.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index 8228db9a5fc8..3413973bc475 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -14,8 +14,11 @@
 #include "debug.h"
 
 static bool rtw_disable_msi;
+static bool rtw_pci_disable_aspm;
 module_param_named(disable_msi, rtw_disable_msi, bool, 0644);
+module_param_named(disable_aspm, rtw_pci_disable_aspm, bool, 0644);
 MODULE_PARM_DESC(disable_msi, "Set Y to disable MSI interrupt support");
+MODULE_PARM_DESC(disable_aspm, "Set Y to disable PCI ASPM support");
 
 static u32 rtw_pci_tx_queue_idx_addr[] = {
 	[RTW_TX_QUEUE_BK] = RTK_PCI_TXBD_IDX_BKQ,
@@ -1200,6 +1203,9 @@ static void rtw_pci_clkreq_set(struct rtw_dev *rtwdev, bool enable)
 	u8 value;
 	int ret;
 
+	if (rtw_pci_disable_aspm)
+		return;
+
 	ret = rtw_dbi_read8(rtwdev, RTK_PCIE_LINK_CFG, &value);
 	if (ret) {
 		rtw_err(rtwdev, "failed to read CLKREQ_L1, ret=%d", ret);
@@ -1219,6 +1225,9 @@ static void rtw_pci_aspm_set(struct rtw_dev *rtwdev, bool enable)
 	u8 value;
 	int ret;
 
+	if (rtw_pci_disable_aspm)
+		return;
+
 	ret = rtw_dbi_read8(rtwdev, RTK_PCIE_LINK_CFG, &value);
 	if (ret) {
 		rtw_err(rtwdev, "failed to read ASPM, ret=%d", ret);
--
2.17.1


Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> From: Yan-Hsuan Chuang <[email protected]>
>
> Some platforms cannot read the DBI register successfully for the
> ASPM settings. After the read failed, the bus could be unstable,
> and the device just became unavailable [1]. For those platforms,
> the ASPM should be disabled. But as the ASPM can help the driver
> to save the power consumption in power save mode, the ASPM is still
> needed. So, add a module parameter for them to disable it, then
> the device can still work, while others can benefit from the less
> power consumption that brings by ASPM enabled.

Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
is safe to use?

If someone notices the warning they still have to search for the warning
in order to make the link towards loading the module with the
disable_aspm=1 parameter.
Is it known what causes the failure?

> [1] https://bugzilla.kernel.org/show_bug.cgi?id=206411
> [2] Note that my lenovo T430 is the same.

Sebastian
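
The suggestion above (set disable_aspm once rtw_dbi_read8() fails, and tell
the user what to do) can be sketched in plain userspace C. This is only an
illustrative model, not rtw88 code: fake_dbi_read8(), dbi_broken,
dbi_accesses and aspm_set() are hypothetical stand-ins, and the hint text is
invented for the example. Whether skipping later accesses is enough to keep
the bus stable is not established in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the disable_aspm module parameter. */
static bool disable_aspm;

/* Counts how often the simulated DBI register is touched. */
static int dbi_accesses;

/* Simulated platform behaviour: when true, every DBI read fails. */
static bool dbi_broken;

/* Stand-in for rtw_dbi_read8(): 0 on success, negative errno on failure. */
static int fake_dbi_read8(unsigned char *value)
{
	dbi_accesses++;
	if (dbi_broken)
		return -EIO;
	*value = 0;
	return 0;
}

/* Returns true if the setting was applied, false if it was skipped. */
static bool aspm_set(bool enable)
{
	unsigned char value;
	int ret;

	/* Same early return as in the patch: never touch the DBI when disabled. */
	if (disable_aspm)
		return false;

	ret = fake_dbi_read8(&value);
	if (ret) {
		/* Assumption: latch the flag and print a hint after the first failure. */
		fprintf(stderr,
			"failed to read ASPM, ret=%d; skipping further ASPM changes, "
			"consider loading the module with disable_aspm=1\n", ret);
		disable_aspm = true;
		return false;
	}

	/* A real driver would modify and write back the register here. */
	(void)enable;
	return true;
}
```

With dbi_broken set, the first aspm_set() call touches the register once and
latches the flag; later calls return before any further DBI access.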

2020-06-16 11:10:18

by Tony Chuang

Subject: RE: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

> On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> > From: Yan-Hsuan Chuang <[email protected]>
> >
> > Some platforms cannot read the DBI register successfully for the
> > ASPM settings. After the read failed, the bus could be unstable,
> > and the device just became unavailable [1]. For those platforms,
> > the ASPM should be disabled. But as the ASPM can help the driver
> > to save the power consumption in power save mode, the ASPM is still
> > needed. So, add a module parameter for them to disable it, then
> > the device can still work, while others can benefit from the less
> > power consumption that brings by ASPM enabled.
>
> Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
> is safe to use?
>
> If someone notices the warning they still have to search for the warning
> in order to make the link towards loading the module with the
> disable_aspm=1 parameter.
> Is it known what causes the failure?
>

I think once rtw_dbi_read8() fails, the subsequent register operations
will also fail, and reading or writing the register will still return an
error. This is some sort of PCI issue that I am not really familiar with,
including its root cause and how it fails.

If we can default disable it, then we can help those platforms, but
then other platform will suffer from higher power consumption.

Yen-Hsuan

Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

On 2020-06-16 11:06:28 [+0000], Tony Chuang wrote:
> > On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> > > From: Yan-Hsuan Chuang <[email protected]>
> > >
> > > Some platforms cannot read the DBI register successfully for the
> > > ASPM settings. After the read failed, the bus could be unstable,
> > > and the device just became unavailable [1]. For those platforms,
> > > the ASPM should be disabled. But as the ASPM can help the driver
> > > to save the power consumption in power save mode, the ASPM is still
> > > needed. So, add a module parameter for them to disable it, then
> > > the device can still work, while others can benefit from the less
> > > power consumption that brings by ASPM enabled.
> >
> > Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
> > is safe to use?
> >
> > If someone notices the warning they still have to search for the warning
> > in order to make the link towards loading the module with the
> > disable_aspm=1 parameter.
> > Is it known what causes the failure?
> >
>
> I think as long as the rtw_dbi_read() fails, the consequent register
> operation will also fail, and still get an error read/write the register.
> And this is some sort of PCI issue, and I am not really familiar with it.
> Such as the root cause or how it fails.

Then it does not sound safe to enable it by default.

> If we can default disable it, then we can help those platforms, but
> then other platform will suffer from higher power consumption.

So for those platforms, where the error occurs, you expect that the user
manages to read the error message (a backtrace from rtw_dbi_read8()) and
connects this to the need to set a certain module option.

> Yen-Hsuan

Sebastian

2020-06-17 05:31:52

by Tony Chuang

Subject: RE: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

> On 2020-06-16 11:06:28 [+0000], Tony Chuang wrote:
> > > On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> > > > From: Yan-Hsuan Chuang <[email protected]>
> > > >
> > > > Some platforms cannot read the DBI register successfully for the
> > > > ASPM settings. After the read failed, the bus could be unstable,
> > > > and the device just became unavailable [1]. For those platforms,
> > > > the ASPM should be disabled. But as the ASPM can help the driver
> > > > to save the power consumption in power save mode, the ASPM is still
> > > > needed. So, add a module parameter for them to disable it, then
> > > > the device can still work, while others can benefit from the less
> > > > power consumption that brings by ASPM enabled.
> > >
> > > Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
> > > is safe to use?
> > >
> > > If someone notices the warning they still have to search for the warning
> > > in order to make the link towards loading the module with the
> > > disable_aspm=1 parameter.
> > > Is it known what causes the failure?
> > >
> >
> > I think as long as the rtw_dbi_read() fails, the consequent register
> > operation will also fail, and still get an error read/write the register.
> > And this is some sort of PCI issue, and I am not really familiar with it.
> > Such as the root cause or how it fails.
>
> Then it does not sound safe to enable it by default.

We have had a discussion about this, but I cannot find the thread now.
People suggested that the module parameter should not be used.
They think that if ASPM helps with power consumption, then it should
be enabled by default. But I think that should hold only if the other
platforms do not simply fail to bring up the device. However, the
affected platforms are fewer than the others, so I am not sure whether
default enable or disable is better.

>
> > If we can default disable it, then we can help those platforms, but
> > then other platform will suffer from higher power consumption.
>
> So for those platforms, where the error occurs, you expect that the user
> manages to read the error message (a backtrace from rtw_dbi_read8()) and
> connects this to the need to set a certain module option.

Yes, we can discuss if it should be default enabled or not. Otherwise the
people with those platforms can only do that to prevent this. Really bad.

> Sebastian
>

Yen-Hsuan

Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

On 2020-06-17 05:30:22 [+0000], Tony Chuang wrote:
> > On 2020-06-16 11:06:28 [+0000], Tony Chuang wrote:
> > > > On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> > > > > From: Yan-Hsuan Chuang <[email protected]>
> > > > >
> > > > > Some platforms cannot read the DBI register successfully for the
> > > > > ASPM settings. After the read failed, the bus could be unstable,
> > > > > and the device just became unavailable [1]. For those platforms,
> > > > > the ASPM should be disabled. But as the ASPM can help the driver
> > > > > to save the power consumption in power save mode, the ASPM is still
> > > > > needed. So, add a module parameter for them to disable it, then
> > > > > the device can still work, while others can benefit from the less
> > > > > power consumption that brings by ASPM enabled.
> > > >
> > > > Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
> > > > is safe to use?
> > > >
> > > > If someone notices the warning they still have to search for the warning
> > > > in order to make the link towards loading the module with the
> > > > disable_aspm=1 parameter.
> > > > Is it known what causes the failure?
> > > >
> > >
> > > I think as long as the rtw_dbi_read() fails, the consequent register
> > > operation will also fail, and still get an error read/write the register.
> > > And this is some sort of PCI issue, and I am not really familiar with it.
> > > Such as the root cause or how it fails.
> >
> > Then it does not sound safe to enable it by default.
>
> We have had a discussion about this, but I cannot find the thread now.
> People suggested that the module parameter should not be used.
> And they think that if the ASPM can help for power consumption, then
> it should be default enabled. But I think it should be based on that the
> other platforms will not just fail to bring up the device. However, the
> platforms are less than the others, not sure if default enable or disable
> is better.

What I fail to understand is if this error affects other PCI devices as
well or just this one. And if it is possible to reset the wifi device
and everything gets back to normal. Or is it just the register reading,
that spams the log and would affect the system otherwise if you would
just avoid it after the first fail.

> > > If we can default disable it, then we can help those platforms, but
> > > then other platform will suffer from higher power consumption.
> >
> > So for those platforms, where the error occurs, you expect that the user
> > manages to read the error message (a backtrace from rtw_dbi_read8()) and
> > connects this to the need to set a certain module option.
>
> Yes, we can discuss if it should be default enabled or not. Otherwise the
> people with those platforms can only do that to prevent this. Really bad.

It would be good to know the root cause of this. So then default enable
would depend on it.
You could have an allow/forbid list based on DMI once you identified
good/bad systems but this includes additional maintenance.

I think that at the very least, if the read fails you should give the
user additional information how to stop this from happening again. And
either stop issuing the commands again or skip driver loading (depending
what it means for device stability).

> Yen-Hsuan

Sebastian
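
The DMI-based forbid list mentioned above could look roughly like the
following userspace sketch. It is an illustration under assumptions, not
driver code: struct aspm_quirk, aspm_quirks[] and platform_needs_aspm_off()
are hypothetical names, and the vendor/product strings are guesses at how
the machines named in this thread might identify themselves. In the kernel,
the usual tool for this kind of table is struct dmi_system_id checked with
dmi_check_system().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry: one platform on which ASPM should stay off. */
struct aspm_quirk {
	const char *vendor;
	const char *product;
};

/* Illustrative entries only; the exact DMI strings are assumptions. */
static const struct aspm_quirk aspm_quirks[] = {
	{ "LENOVO", "ThinkPad T430" },
	{ "LENOVO", "ThinkPad E490" },
};

/*
 * Returns true when the platform matches a quirk entry. The vendor must
 * match exactly; the product only needs to contain the listed name, so
 * that model suffixes do not defeat the match.
 */
static bool platform_needs_aspm_off(const char *vendor, const char *product)
{
	size_t i;

	for (i = 0; i < sizeof(aspm_quirks) / sizeof(aspm_quirks[0]); i++) {
		if (strcmp(vendor, aspm_quirks[i].vendor) == 0 &&
		    strstr(product, aspm_quirks[i].product) != NULL)
			return true;
	}
	return false;
}
```

A driver could consult such a lookup once at probe time and set the same
flag that the module parameter controls, at the cost of the list
maintenance noted above.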

2020-06-20 19:55:04

by Larry Finger

Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

On 6/5/20 2:47 AM, [email protected] wrote:
> From: Yan-Hsuan Chuang <[email protected]>
>
> Some platforms cannot read the DBI register successfully for the
> ASPM settings. After the read failed, the bus could be unstable,
> and the device just became unavailable [1]. For those platforms,
> the ASPM should be disabled. But as the ASPM can help the driver
> to save the power consumption in power save mode, the ASPM is still
> needed. So, add a module parameter for them to disable it, then
> the device can still work, while others can benefit from the less
> power consumption that brings by ASPM enabled.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=206411
> [2] Note that my lenovo T430 is the same.

As someone who maintains these drivers in a GitHub repo so that users of older
kernels can have access to them, I am in favor of this module option. Only a
very few cases would need to disable ASPM, and I see no reason for everyone else
to use additional power as would be needed with automatic disabling. Adding a
new machine to some quirk list would be more difficult than merely telling a
novice user how to turn ASPM off for their system.

In case someone is collecting machines that would need a quirk, I found another
one as shown in https://github.com/lwfinger/rtlwifi_new/issues/622. That one is
a Lenovo Thinkpad E490.

Larry




2020-07-15 09:25:14

by Kalle Valo

Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

Sebastian Andrzej Siewior <[email protected]> writes:

> On 2020-06-17 05:30:22 [+0000], Tony Chuang wrote:
>> > On 2020-06-16 11:06:28 [+0000], Tony Chuang wrote:
>> > > > On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
>> > > > > From: Yan-Hsuan Chuang <[email protected]>
>> > > > >
>> > > > > Some platforms cannot read the DBI register successfully for the
>> > > > > ASPM settings. After the read failed, the bus could be unstable,
>> > > > > and the device just became unavailable [1]. For those platforms,
>> > > > > the ASPM should be disabled. But as the ASPM can help the driver
>> > > > > to save the power consumption in power save mode, the ASPM is still
>> > > > > needed. So, add a module parameter for them to disable it, then
>> > > > > the device can still work, while others can benefit from the less
>> > > > > power consumption that brings by ASPM enabled.
>> > > >
>> > > > Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
>> > > > is safe to use?
>> > > >
>> > > > If someone notices the warning they still have to search for the warning
>> > > > in order to make the link towards loading the module with the
>> > > > disable_aspm=1 parameter.
>> > > > Is it known what causes the failure?
>> > > >
>> > >
>> > > I think as long as the rtw_dbi_read() fails, the consequent register
>> > > operation will also fail, and still get an error read/write the register.
>> > > And this is some sort of PCI issue, and I am not really familiar with it.
>> > > Such as the root cause or how it fails.
>> >
>> > Then it does not sound safe to enable it by default.
>>
>> We have had a discussion about this, but I cannot find the thread now.
>> People suggested that the module parameter should not be used.
>> And they think that if the ASPM can help for power consumption, then
>> it should be default enabled. But I think it should be based on that the
>> other platforms will not just fail to bring up the device. However, the
>> platforms are less than the others, not sure if default enable or disable
>> is better.
>
> What I fail to understand is if this error affects other PCI devices as
> well or just this one. And if it is possible to reset the wifi device
> and everything gets back to normal. Or is it just the register reading,
> that spams the log and would affect the system otherwise if you would
> just avoid it after the first fail.
>
>> > > If we can default disable it, then we can help those platforms, but
>> > > then other platform will suffer from higher power consumption.
>> >
>> > So for those platforms, where the error occurs, you expect that the user
>> > manages to read the error message (a backtrace from rtw_dbi_read8()) and
>> > connects this to the need to set a certain module option.
>>
>> Yes, we can discuss if it should be default enabled or not. Otherwise the
>> people with those platforms can only do that to prevent this. Really bad.
>
> It would be good to know the root cause of this. So then default enable
> would depend on it.
> You could have an allow/forbid list based on DMI once you identified
> good/bad systems but this includes additional maintenance.

I think there should be this kind of quirk list in rtw88 which would
disable ASPM automatically on problematic platforms. We should not
require the user to figure out the problem on their own and disable ASPM
manually using the module parameter.

> I think that at the very least, if the read fails you should give the
> user additional information how to stop this from happening again. And
> either stop issuing the commands again or skip driver loading (depending
> what it means for device stability).

Yes, if we can guess that this is an ASPM problem giving additional
information is very helpful for the user, please do that as well.

--
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2020-07-15 09:32:45

by Kalle Valo

Subject: Re: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

<[email protected]> wrote:

> From: Yan-Hsuan Chuang <[email protected]>
>
> Some platforms cannot read the DBI register successfully for the
> ASPM settings. After the read failed, the bus could be unstable,
> and the device just became unavailable [1]. For those platforms,
> the ASPM should be disabled. But as the ASPM can help the driver
> to save the power consumption in power save mode, the ASPM is still
> needed. So, add a module parameter for them to disable it, then
> the device can still work, while others can benefit from the less
> power consumption that brings by ASPM enabled.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=206411
> [2] Note that my lenovo T430 is the same.
>
> Fixes: 3dff7c6e3749 ("rtw88: allows to enable/disable HCI link PS mechanism")
> Signed-off-by: Yan-Hsuan Chuang <[email protected]>

Patch applied to wireless-drivers-next.git, thanks.

68aa716b7dd3 rtw88: pci: disable aspm for platform inter-op with module parameter

--
https://patchwork.kernel.org/patch/11589181/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2020-07-15 09:38:13

by Tony Chuang

Subject: RE: [PATCH v1] rtw88: pci: disable aspm for platform inter-op with module parameter

> Sebastian Andrzej Siewior <[email protected]> writes:
>
> > On 2020-06-17 05:30:22 [+0000], Tony Chuang wrote:
> >> > On 2020-06-16 11:06:28 [+0000], Tony Chuang wrote:
> >> > > > On 2020-06-05 15:47:03 [+0800], [email protected] wrote:
> >> > > > > From: Yan-Hsuan Chuang <[email protected]>
> >> > > > >
> >> > > > > Some platforms cannot read the DBI register successfully for the
> >> > > > > ASPM settings. After the read failed, the bus could be unstable,
> >> > > > > and the device just became unavailable [1]. For those platforms,
> >> > > > > the ASPM should be disabled. But as the ASPM can help the driver
> >> > > > > to save the power consumption in power save mode, the ASPM is still
> >> > > > > needed. So, add a module parameter for them to disable it, then
> >> > > > > the device can still work, while others can benefit from the less
> >> > > > > power consumption that brings by ASPM enabled.
> >> > > >
> >> > > > Can you set disable_aspm if rtw_dbi_read8() fails? Or make a test if it
> >> > > > is safe to use?
> >> > > >
> >> > > > If someone notices the warning they still have to search for the warning
> >> > > > in order to make the link towards loading the module with the
> >> > > > disable_aspm=1 parameter.
> >> > > > Is it known what causes the failure?
> >> > > >
> >> > >
> >> > > I think as long as the rtw_dbi_read() fails, the consequent register
> >> > > operation will also fail, and still get an error read/write the register.
> >> > > And this is some sort of PCI issue, and I am not really familiar with it.
> >> > > Such as the root cause or how it fails.
> >> >
> >> > Then it does not sound safe to enable it by default.
> >>
> >> We have had a discussion about this, but I cannot find the thread now.
> >> People suggested that the module parameter should not be used.
> >> And they think that if the ASPM can help for power consumption, then
> >> it should be default enabled. But I think it should be based on that the
> >> other platforms will not just fail to bring up the device. However, the
> >> platforms are less than the others, not sure if default enable or disable
> >> is better.
> >
> > What I fail to understand is if this error affects other PCI devices as
> > well or just this one. And if it is possible to reset the wifi device
> > and everything gets back to normal. Or is it just the register reading,
> > that spams the log and would affect the system otherwise if you would
> > just avoid it after the first fail.
> >
> >> > > If we can default disable it, then we can help those platforms, but
> >> > > then other platform will suffer from higher power consumption.
> >> >
> >> > So for those platforms, where the error occurs, you expect that the user
> >> > manages to read the error message (a backtrace from rtw_dbi_read8()) and
> >> > connects this to the need to set a certain module option.
> >>
> >> Yes, we can discuss if it should be default enabled or not. Otherwise the
> >> people with those platforms can only do that to prevent this. Really bad.
> >
> > It would be good to know the root cause of this. So then default enable
> > would depend on it.
> > You could have an allow/forbid list based on DMI once you identified
> > good/bad systems but this includes additional maintenance.
>
> I think there should be this kind of quirk list in rtw88 which would
> disable ASPM automatically on problematic platforms. We should not
> require the user to figure out the problem on their own and disable ASPM
> manually using the module parameter.

OK, I'll add a quirk list for the platforms.

>
> > I think that at the very least, if the read fails you should give the
> > user additional information how to stop this from happening again. And
> > either stop issuing the commands again or skip driver loading (depending
> > what it means for device stability).
>
> Yes, if we can guess that this is an ASPM problem giving additional
> information is very helpful for the user, please do that as well.
>

Yes, should add additional information for the users, so they can
help to report the platforms.

Yen-Hsuan