2013-03-04 08:41:16

by Emmanuel Grumbach

[permalink] [raw]
Subject: is L1 really disabled in iwlwifi

Hi,

In 1a7123cdd9f49cf1c908fb2c16d26f279c88d8c9, John Linville disabled
ASPM for iwlwifi:

/* W/A - seems to solve weird behavior. We need to remove this if we
* don't want to stay in L1 all the time. This wastes a lot of power */
pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
PCIE_LINK_STATE_CLKPM);

Now, I took a laptop with 3.7.9 (that supports L1) and measured power.
I would expect to see that as long as the driver isn't loaded, the
power consumption is low (L1 enabled), and when I load the driver, the
power consumption rises because L1 gets disabled.
But this is not what I see. I see more the less the same numbers
before and after driver load.
I don't remember exactly the numbers, but I do remember they were low
(around 1mA or so) - but I really need to check.
I also removed the code above, and it didn't change anything.
Can it be that this code doesn't have any effect?

And as you can see, I am not an expert at this...


Emmanuel Grumbach
[email protected]


2013-03-30 21:26:49

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Sat, Mar 30, 2013 at 12:38 PM, Emmanuel Grumbach <[email protected]> wrote:
>>>
>>> Anyway - I think I will just remove this pci_disable_link_state call
>>> and hopefully, it will stay long enough in linux-next so that people
>>> will report issues before it get into linux.
>>> I don't like leaving that code if it doesn't do anything.
>>
>> I think we broke at least some cases of pci_disable_link_state() a
>> while back. I'd like to fix it rather than just removing calls to it.
>>
>> Can someone collect a complete dmesg log and "lspci -vv" output?
>>
>
> I will. But I am on vacation right now and my linux box is not
> accessible right now...
> In any case, what you'll see is that ASPM is enabled (L1 enabled with
> or without driver).
> We are also seeing some really weird stuff (which I can't reproduce of
> course) like HW becoming not accessible, I guess it would be worth
> trying to *really* disable L1. I can tweak that in the code of the
> driver and tell the NIC that L1 is disabled.
>
> /* Disable L1-Active */
> iwl_set_bits_prph(trans, APMG_PCIDEV_STT_REG,
> APMG_PCIDEV_STT_VAL_L1_ACT_DIS);
>
> Do you have an idea about *when* did pci_disable_link_state() break?

No, not yet. After we figure out exactly what the problem is, we
should be able to determine when it broke.

If I understand correctly, you have CONFIG_PCIEASPM=y, the iwlwifi
driver uses pci_disable_link_state() to disable L1, but the link state
stays in L0 all the time. So I think we should look at the dmesg log,
"lspci -vv" output before loading iwlwifi, and "lspci -vv" output
after loading iwlwifi.

Bjorn

2013-03-04 13:49:40

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 04, 2013 at 10:41:14AM +0200, Emmanuel Grumbach wrote:
> In 1a7123cdd9f49cf1c908fb2c16d26f279c88d8c9, John Linville disabled
> ASPM for iwlwifi:
>
> /* W/A - seems to solve weird behavior. We need to remove this if we
> * don't want to stay in L1 all the time. This wastes a lot of power */
> pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
> PCIE_LINK_STATE_CLKPM);
>
> Now, I took a laptop with 3.7.9 (that supports L1) and measured power.
> I would expect to see that as long as the driver isn't loaded, the
> power consumption is low (L1 enabled), and when I load the driver, the
> power consumption rises because L1 gets disabled.
> But this is not what I see. I see more the less the same numbers
> before and after driver load.
> I don't remember exactly the numbers, but I do remember they were low
> (around 1mA or so) - but I really need to check.
> I also removed the code above, and it didn't change anything.
> Can it be that this code doesn't have any effect?

Depend on BIOS settings ASPM can be enabled or disabled on the whole
system. I think kernel can change settings globally as well.
Reloading driver can not be enough to check that, try to reboot the
system with iwlwifi module blacklisted and check lspci -vvv to see
LnkCtl ASPM settings.

Additionally despite above code, iwlwifi try to enable L0S in
iwl_pcie_apm_config(), what is basically wrong.

Stanislaw

2013-03-29 18:24:53

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 4, 2013 at 10:58 AM, Emmanuel Grumbach <[email protected]> wrote:
> On Mon, Mar 4, 2013 at 5:48 PM, Stanislaw Gruszka <[email protected]> wrote:
>> On Mon, Mar 04, 2013 at 04:57:46PM +0200, Emmanuel Grumbach wrote:
>>
>>> This is what I did. On 3.7, lspci tells me that ASPM is enabled before
>>> I load the module. And same after I load the module.
>>
>> CONFIG_PCIEASPM disabled ?
>>
>
> nope - enabled on both kernel: 3.1 and 3.7. Makes sense though.
> CONFIG_PCIEASPM enabled, I have L1. Just wondering why doesn't that
> pci_disable_link_state do anything....
> Also, I remember so issues Mathew Garret debugged and he ended up
> disabling ASPM on older kernel because some BIOS issue (don't remember
> really). Apparently, ASPM is enabled again in newer kernels.
>
> Anyway - I think I will just remove this pci_disable_link_state call
> and hopefully, it will stay long enough in linux-next so that people
> will report issues before it get into linux.
> I don't like leaving that code if it doesn't do anything.

I think we broke at least some cases of pci_disable_link_state() a
while back. I'd like to fix it rather than just removing calls to it.

Can someone collect a complete dmesg log and "lspci -vv" output?

Bjorn

2013-03-04 15:48:17

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 04, 2013 at 04:57:46PM +0200, Emmanuel Grumbach wrote:

> This is what I did. On 3.7, lspci tells me that ASPM is enabled before
> I load the module. And same after I load the module.

CONFIG_PCIEASPM disabled ?

Stanislaw

2013-03-04 14:57:48

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>>
>> /* W/A - seems to solve weird behavior. We need to remove this if we
>> * don't want to stay in L1 all the time. This wastes a lot of power */
>> pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
>> PCIE_LINK_STATE_CLKPM);
>>
>> Now, I took a laptop with 3.7.9 (that supports L1) and measured power.
>> I would expect to see that as long as the driver isn't loaded, the
>> power consumption is low (L1 enabled), and when I load the driver, the
>> power consumption rises because L1 gets disabled.
>> But this is not what I see. I see more the less the same numbers
>> before and after driver load.
>> I don't remember exactly the numbers, but I do remember they were low
>> (around 1mA or so) - but I really need to check.
>> I also removed the code above, and it didn't change anything.
>> Can it be that this code doesn't have any effect?
>
> Depend on BIOS settings ASPM can be enabled or disabled on the whole
> system. I think kernel can change settings globally as well.

Obviously kernel can change things here. I have 3.1 (!) and 3.7 on the
same machines, on the first, ASPM is disabled, on the latter it is
enabled.

> Reloading driver can not be enough to check that, try to reboot the
> system with iwlwifi module blacklisted and check lspci -vvv to see
> LnkCtl ASPM settings.
>

This is what I did. On 3.7, lspci tells me that ASPM is enabled before
I load the module. And same after I load the module.

> Additionally despite above code, iwlwifi try to enable L0S in
> iwl_pcie_apm_config(), what is basically wrong.
>

Right, good point. The thing is that the device doesn't know how to do
both L0S and L1. This is the origin of this code. But obviously, we
first need to make sure that L0S is supported in the LnkCtl. Will
check.
Thanks!

So - what about just removing this code?

2013-03-04 17:58:59

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 4, 2013 at 5:48 PM, Stanislaw Gruszka <[email protected]> wrote:
> On Mon, Mar 04, 2013 at 04:57:46PM +0200, Emmanuel Grumbach wrote:
>
>> This is what I did. On 3.7, lspci tells me that ASPM is enabled before
>> I load the module. And same after I load the module.
>
> CONFIG_PCIEASPM disabled ?
>

nope - enabled on both kernel: 3.1 and 3.7. Makes sense though.
CONFIG_PCIEASPM enabled, I have L1. Just wondering why doesn't that
pci_disable_link_state do anything....
Also, I remember so issues Mathew Garret debugged and he ended up
disabling ASPM on older kernel because some BIOS issue (don't remember
really). Apparently, ASPM is enabled again in newer kernels.

Anyway - I think I will just remove this pci_disable_link_state call
and hopefully, it will stay long enough in linux-next so that people
will report issues before it get into linux.
I don't like leaving that code if it doesn't do anything.

2013-03-30 18:38:34

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>>
>> Anyway - I think I will just remove this pci_disable_link_state call
>> and hopefully, it will stay long enough in linux-next so that people
>> will report issues before it get into linux.
>> I don't like leaving that code if it doesn't do anything.
>
> I think we broke at least some cases of pci_disable_link_state() a
> while back. I'd like to fix it rather than just removing calls to it.
>
> Can someone collect a complete dmesg log and "lspci -vv" output?
>

I will. But I am on vacation right now and my linux box is not
accessible right now...
In any case, what you'll see is that ASPM is enabled (L1 enabled with
or without driver).
We are also seeing some really weird stuff (which I can't reproduce of
course) like HW becoming not accessible, I guess it would be worth
trying to *really* disable L1. I can tweak that in the code of the
driver and tell the NIC that L1 is disabled.

/* Disable L1-Active */
iwl_set_bits_prph(trans, APMG_PCIDEV_STT_REG,
APMG_PCIDEV_STT_VAL_L1_ACT_DIS);

Do you have an idea about *when* did pci_disable_link_state() break?

2013-03-04 13:45:25

by John W. Linville

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 04, 2013 at 10:41:14AM +0200, Emmanuel Grumbach wrote:
> Hi,
>
> In 1a7123cdd9f49cf1c908fb2c16d26f279c88d8c9, John Linville disabled
> ASPM for iwlwifi:
>
> /* W/A - seems to solve weird behavior. We need to remove this if we
> * don't want to stay in L1 all the time. This wastes a lot of power */
> pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
> PCIE_LINK_STATE_CLKPM);
>
> Now, I took a laptop with 3.7.9 (that supports L1) and measured power.
> I would expect to see that as long as the driver isn't loaded, the
> power consumption is low (L1 enabled), and when I load the driver, the
> power consumption rises because L1 gets disabled.
> But this is not what I see. I see more the less the same numbers
> before and after driver load.
> I don't remember exactly the numbers, but I do remember they were low
> (around 1mA or so) - but I really need to check.
> I also removed the code above, and it didn't change anything.
> Can it be that this code doesn't have any effect?
>
> And as you can see, I am not an expert at this...

Neither am I...that commit was a couple of years ago, and the code
seems to have moved around a bit since then. FWIW, the "W/A -
seems to solve..." comment isn't even mine. I don't even know what
"W/A" means...?

Anyway, ISTRC that change being suggested by Wey-Yi in response to
some device disconnects (i.e. from the PCI-E bus). I have no idea
how much power it should cost or if it is really even necessary.

John
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2013-03-04 15:15:25

by John W. Linville

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Mon, Mar 04, 2013 at 04:57:46PM +0200, Emmanuel Grumbach wrote:

> Right, good point. The thing is that the device doesn't know how to do
> both L0S and L1. This is the origin of this code. But obviously, we
> first need to make sure that L0S is supported in the LnkCtl. Will
> check.
> Thanks!
>
> So - what about just removing this code?

As long as the device doesn't randoly disconnect itself from the bus...

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2013-04-09 05:29:27

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

Hi Bjorn,

> Sorry, I haven't had a chance to look at this yet. Can you please
> attach the *complete* dmesg and lspci output, not just the parts that
> mention iwlwifi? Some of this depends on other PCI core stuff, like
> what happened with the _OSC evaluation when we discovered the PCI host
> bridge.
>

Here is the info you wanted. Same config file as before.
Note that I now have two WiFi NICs connected.


Attachments:
dmesg (59.68 kB)
lspci_after (30.82 kB)
lspci_before (30.82 kB)
Download all attachments

2013-04-02 11:12:53

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>
> I attached all the data you needed.
> The link state seems to go to L1 based on the low power the NIC consumes.
> So it seems that even if we use pci_disable_link_state() to disabled
> L1, L1 is still disabled.
>

Ugh... so doesn't make any sense.... so again:

So it seems that even if we use pci_disable_link_state() to disable
L1, the link state still enters L1.

2013-04-08 16:28:33

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Sun, Apr 7, 2013 at 6:23 AM, Emmanuel Grumbach <[email protected]> wrote:
>>>
>>> I attached all the data you needed.
>>> The link state seems to go to L1 based on the low power the NIC consumes.
>>> So it seems that even if we use pci_disable_link_state() to disabled
>>> L1, L1 is still disabled.
>>>
>>
>> Ugh... so doesn't make any sense.... so again:
>>
>> So it seems that even if we use pci_disable_link_state() to disable
>> L1, the link state still enters L1.
>
> ping? :-)

Sorry, I haven't had a chance to look at this yet. Can you please
attach the *complete* dmesg and lspci output, not just the parts that
mention iwlwifi? Some of this depends on other PCI core stuff, like
what happened with the _OSC evaluation when we discovered the PCI host
bridge.

Bjorn

2013-04-07 12:23:23

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>>
>> I attached all the data you needed.
>> The link state seems to go to L1 based on the low power the NIC consumes.
>> So it seems that even if we use pci_disable_link_state() to disabled
>> L1, L1 is still disabled.
>>
>
> Ugh... so doesn't make any sense.... so again:
>
> So it seems that even if we use pci_disable_link_state() to disable
> L1, the link state still enters L1.

ping? :-)

2013-04-02 11:10:58

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>>>> Anyway - I think I will just remove this pci_disable_link_state call
>>>> and hopefully, it will stay long enough in linux-next so that people
>>>> will report issues before it get into linux.
>>>> I don't like leaving that code if it doesn't do anything.
>>>
>>> I think we broke at least some cases of pci_disable_link_state() a
>>> while back. I'd like to fix it rather than just removing calls to it.
>>>
>>> Can someone collect a complete dmesg log and "lspci -vv" output?
>>>
>>
>> I will. But I am on vacation right now and my linux box is not
>> accessible right now...
>> In any case, what you'll see is that ASPM is enabled (L1 enabled with
>> or without driver).
>> We are also seeing some really weird stuff (which I can't reproduce of
>> course) like HW becoming not accessible, I guess it would be worth
>> trying to *really* disable L1. I can tweak that in the code of the
>> driver and tell the NIC that L1 is disabled.
>>
>> /* Disable L1-Active */
>> iwl_set_bits_prph(trans, APMG_PCIDEV_STT_REG,
>> APMG_PCIDEV_STT_VAL_L1_ACT_DIS);
>>
>> Do you have an idea about *when* did pci_disable_link_state() break?
>
> No, not yet. After we figure out exactly what the problem is, we
> should be able to determine when it broke.
>
> If I understand correctly, you have CONFIG_PCIEASPM=y, the iwlwifi
> driver uses pci_disable_link_state() to disable L1, but the link state
> stays in L0 all the time. So I think we should look at the dmesg log,
> "lspci -vv" output before loading iwlwifi, and "lspci -vv" output
> after loading iwlwifi.
>


I attached all the data you needed.
The link state seems to go to L1 based on the low power the NIC consumes.
So it seems that even if we use pci_disable_link_state() to disabled
L1, L1 is still disabled.

I can't really tweak the driver to clear the bit in the LnkCtl of the
ConfigSpace, but I can tell the HW not to accept to go to L1 (since it
is handshake based with the root complex IIRC).
I can also use setpci, but then....


Attachments:
config-3.7.9+ (107.54 kB)
dmesg (2.82 kB)
lspci_iwlwifi (2.75 kB)
lspci_no_iwlwifi (2.72 kB)
Download all attachments

2013-04-30 10:57:14

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>
>> Sorry, I haven't had a chance to look at this yet. Can you please
>> attach the *complete* dmesg and lspci output, not just the parts that
>> mention iwlwifi? Some of this depends on other PCI core stuff, like
>> what happened with the _OSC evaluation when we discovered the PCI host
>> bridge.
>>
>
> Here is the info you wanted. Same config file as before.
> Note that I now have two WiFi NICs connected.

Any chance you'll find time for me? :-)

2013-05-01 17:13:36

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Wed, May 1, 2013 at 2:31 AM, Emmanuel Grumbach <[email protected]> wrote:
> [from Bjorn's mail]
>> In Emmanuel's case, we don't get _OSC control, so
>> pci_disable_link_state() does nothing.
>
> Right, but this is true with the specific log I sent to you. Is it
> possible that another platform / BIOS, we *will* get _OSC control and
> that pci_disable_link_state() will actually do something? In that case
> I would prefer not to remove the call to pcie_disable_link_state().

Yes, absolutely, on many platforms we will get _OSC control, and
pci_disable_link_state() will work as expected. The problem is that
the driver doesn't have a good way to know whether pci_disable_link()
did anything or not.

Today I think we have:

1) If the BIOS grants the OS permission to control PCIe services via
_OSC, pci_disable_link_state() works and L1 will be disabled.

2) If the BIOS does not grant permission, pci_disable_link_state()
does nothing and L1 may be enabled or not depending on what
configuration the BIOS did.

If the device really doesn't work reliably when L1 is enabled, we're
currently at the mercy of the BIOS -- if the BIOS enables L1 but
doesn't grant us permission via _OSC, L1 will remain enabled (as it is
on your system).

> Also - in the log I sent you, we don't get _OSC control, but I can see
> 'disabling PCIe ASPM" ... and it *is* enabled (since L1 works). So I
> am kinda scratching my head here...

The code is a bit convoluted, but I *think* the "disabling PCIe ASPM"
message basically means "Linux is not going to touch ASPM
configuration," so we just use whatever the BIOS set up.

>> The general issue here is that Windows will (as far as we've been able
>> to determine) never touch ASPM registers unless given PCIe control via
>> _OSC. Drivers are presumably able to override this by hitting
>> configuration registers themselves. The pci_disable_link_state()
>> functions are currently broadly equivalent to the helper functions
>> provided in the Windows .inf files, and if drivers really want to
>> disable the control themselves then they can do so.

The Windows mechanism (described in "PCI Express in Depth for Windows
Vista and Beyond" [1]) is to put a "Needs=PciASPMOptOut" statement in
the .inf file.

My impression is that if the BIOS enables ASPM but doesn't grant
control to the OS, Vista will generally leave the ASPM config alone.
I guess the question is what happens when a driver .inf specifies
PciASPMOptOut in this case.

I don't think it makes sense to provide this mechanism to driver
writers, but make it only work if the BIOS grants control. The doc
doesn't mention anything about drivers needing to do anything extra to
cover the case when the BIOS didn't grant control. So my guess is
that when PciASPMOptOut is specified, Vista will turn off ASPM for
that device in either case. But maybe you've tested this and
determined otherwise?

>> The only time this should be relevant is if (a) the BIOS has enabled L1
>> on iwlwifi, (b) the BIOS has disabled ASPM control, and (c) the hardware
>> doesn't work with L1 enabled. Are there really cases where that's true?
>
> After having read (a bit) about _OSC I can say the following:
> L1 is a must for the Windows driver - so that it doesn't sounds
> reasonable that the .inf of the Windows driver for Intel Wireless NICs
> disables ASPM through _OSC or through any other mechanism.
> (a) - this is really platform / BIOS dependent, and I can't test all
> the platform out there
> (b) - ditto
> (c) - the HW *should* work with L1 enabled but we may very well have a
> bug in the driver. I have tried to check the way we initialize the HW
> and it seems fine, but one never knows.

Do you know if the Windows driver for these iwlwifi devices specifies
PciASPMOptOut? If it doesn't, that's a pretty good clue that the
Linux driver shouldn't need to use pci_disable_link_state().

Bjorn

[1] http://download.microsoft.com/download/5/b/9/5b97017b-e28a-4bae-ba48-174cf47d23cd/cpa070_wh06.ppt

2013-05-10 22:53:01

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

[+cc Rafael, other pci_disable_link_state() users]

On Wed, May 01, 2013 at 11:13:15AM -0600, Bjorn Helgaas wrote:
> On Wed, May 1, 2013 at 2:31 AM, Emmanuel Grumbach <[email protected]> wrote:
> > [from Bjorn's mail]
> >> In Emmanuel's case, we don't get _OSC control, so
> >> pci_disable_link_state() does nothing.
> >
> > Right, but this is true with the specific log I sent to you. Is it
> > possible that another platform / BIOS, we *will* get _OSC control and
> > that pci_disable_link_state() will actually do something? In that case
> > I would prefer not to remove the call to pcie_disable_link_state().
>
> Yes, absolutely, on many platforms we will get _OSC control, and
> pci_disable_link_state() will work as expected. The problem is that
> the driver doesn't have a good way to know whether pci_disable_link()
> did anything or not.
>
> Today I think we have:
>
> 1) If the BIOS grants the OS permission to control PCIe services via
> _OSC, pci_disable_link_state() works and L1 will be disabled.
>
> 2) If the BIOS does not grant permission, pci_disable_link_state()
> does nothing and L1 may be enabled or not depending on what
> configuration the BIOS did.
>
> If the device really doesn't work reliably when L1 is enabled, we're
> currently at the mercy of the BIOS -- if the BIOS enables L1 but
> doesn't grant us permission via _OSC, L1 will remain enabled (as it is
> on your system).

I propose the following patch. Any comments?


commit cd11e3f87c4d2777cf8921c0454500c9baa54b46
Author: Bjorn Helgaas <[email protected]>
Date: Fri May 10 15:54:35 2013 -0600

PCI/ASPM: Allow drivers to disable ASPM unconditionally

Some devices have hardware problems related to using ASPM. Drivers for
these devices use pci_disable_link_state() to prevent their device from
entering L0s or L1. But on platforms where the OS doesn't have permission
to manage ASPM, pci_disable_link_state() does nothing, and the driver has
no way to know this.

Therefore, if the BIOS enables ASPM but declines (either via the FADT
ACPI_FADT_NO_ASPM bit or the _OSC method) to allow the OS to manage it,
the device can still use ASPM and trip over the hardware issue.

This patch makes pci_disable_link_state() disable ASPM unconditionally,
regardless of whether the OS has permission to manage ASPM in general.

Reported-by: Emmanuel Grumbach <[email protected]>
Reference: https://lkml.kernel.org/r/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=57331
Signed-off-by: Bjorn Helgaas <[email protected]>

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index d320df6..9ef4ab8 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -718,15 +718,11 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
* pci_disable_link_state - disable pci device's link state, so the link will
* never enter specific states
*/
-static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,
- bool force)
+static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
{
struct pci_dev *parent = pdev->bus->self;
struct pcie_link_state *link;

- if (aspm_disabled && !force)
- return;
-
if (!pci_is_pcie(pdev))
return;

@@ -757,13 +753,13 @@ static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,

void pci_disable_link_state_locked(struct pci_dev *pdev, int state)
{
- __pci_disable_link_state(pdev, state, false, false);
+ __pci_disable_link_state(pdev, state, false);
}
EXPORT_SYMBOL(pci_disable_link_state_locked);

void pci_disable_link_state(struct pci_dev *pdev, int state)
{
- __pci_disable_link_state(pdev, state, true, false);
+ __pci_disable_link_state(pdev, state, true);
}
EXPORT_SYMBOL(pci_disable_link_state);

@@ -781,7 +777,7 @@ void pcie_clear_aspm(struct pci_bus *bus)
__pci_disable_link_state(child, PCIE_LINK_STATE_L0S |
PCIE_LINK_STATE_L1 |
PCIE_LINK_STATE_CLKPM,
- false, true);
+ false);
}
}


2013-05-17 05:49:04

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

>
> I couldn't imagine that silently ignoring the request to disable ASPM
> would be the right thing, but I spent a long time experimenting with
> Windows on qemu, and I think you're right. Windows 7 also seems to
> ignore the "PciASPMOptOut" directive when we don't have permission
> to manage ASPM. All the gory details are at
> https://bugzilla.kernel.org/show_bug.cgi?id=57331
>
> The current behavior is definitely confusing. I hate to rename or change
> pci_disable_link_state() because it's exported and we'd have to maintain
> the old interface for a while anyway. And I don't really want to return
> failure to drivers, because I think that would encourage people to fiddle
> with the Link Control register directly in the driver, which doesn't seem
> like a good idea.
>
> And you're also right that (as far as I know) there's not an actual
> problem with the current behavior other than the confusion it causes.
>
> So, how about something like the following patch, which just prints a
> warning when we can't do what the driver requested? I suppose this may
> also be a nuisance, because users will be worried, but they can't actually
> *do* anything about it. Maybe it should be dev_info() instead.
>

Good for me - now I would be notified that something wrong happened.

2013-05-01 08:31:26

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

[from Bjorn's mail]
> In Emmanuel's case, we don't get _OSC control, so
> pci_disable_link_state() does nothing.

Right, but this is true with the specific log I sent to you. Is it
possible that another platform / BIOS, we *will* get _OSC control and
that pci_disable_link_state() will actually do something? In that case
I would prefer not to remove the call to pcie_disable_link_state().
Also - in the log I sent you, we don't get _OSC control, but I can see
'disabling PCIe ASPM" ... and it *is* enabled (since L1 works). So I
am kinda scratching my head here...

>
>> I'm not convinced iwlwifi really needs to disable the link power
>> management states in the first place -- the 1a7123cdd9 changelog isn't
>> very convincing, and I'm not aware of any reported issues even though
>> we're actually leaving them enabled in many cases.
>>
>> More generally, I don't think the pci_disable_link_state() interface
>> is very useful because it often doesn't do what it claims, and the
>> caller has no indication.
>
> The general issue here is that Windows will (as far as we've been able
> to determine) never touch ASPM registers unless given PCIe control via
> _OSC. Drivers are presumably able to override this by hitting
> configuration registers themselves. The pci_disable_link_state()
> functions are currently broadly equivalent to the helper functions
> provided in the Windows .inf files, and if drivers really want to
> disable the control themselves then they can do so. Renaming the
> functions to make that clearer, and possibly adding some additional
> functions to force the case, may be worthwhile.
>
> The only time this should be relevant is if (a) the BIOS has enabled L1
> on iwlwifi, (b) the BIOS has disabled ASPM control, and (c) the hardware
> doesn't work with L1 enabled. Are there really cases where that's true?
>

After having read (a bit) about _OSC I can say the following:
L1 is a must for the Windows driver - so that it doesn't sounds
reasonable that the .inf of the Windows driver for Intel Wireless NICs
disables ASPM through _OSC or through any other mechanism.
(a) - this is really platform / BIOS dependent, and I can't test all
the platform out there
(b) - ditto
(c) - the HW *should* work with L1 enabled but we may very well have a
bug in the driver. I have tried to check the way we initialize the HW
and it seems fine, but one never knows.


thanks for your time!

2013-05-11 20:18:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Friday, May 10, 2013 04:52:57 PM Bjorn Helgaas wrote:
> [+cc Rafael, other pci_disable_link_state() users]
>
> On Wed, May 01, 2013 at 11:13:15AM -0600, Bjorn Helgaas wrote:
> > On Wed, May 1, 2013 at 2:31 AM, Emmanuel Grumbach <[email protected]> wrote:
> > > [from Bjorn's mail]
> > >> In Emmanuel's case, we don't get _OSC control, so
> > >> pci_disable_link_state() does nothing.
> > >
> > > Right, but this is true with the specific log I sent to you. Is it
> > > possible that another platform / BIOS, we *will* get _OSC control and
> > > that pci_disable_link_state() will actually do something? In that case
> > > I would prefer not to remove the call to pcie_disable_link_state().
> >
> > Yes, absolutely, on many platforms we will get _OSC control, and
> > pci_disable_link_state() will work as expected. The problem is that
> > the driver doesn't have a good way to know whether pci_disable_link()
> > did anything or not.
> >
> > Today I think we have:
> >
> > 1) If the BIOS grants the OS permission to control PCIe services via
> > _OSC, pci_disable_link_state() works and L1 will be disabled.
> >
> > 2) If the BIOS does not grant permission, pci_disable_link_state()
> > does nothing and L1 may be enabled or not depending on what
> > configuration the BIOS did.
> >
> > If the device really doesn't work reliably when L1 is enabled, we're
> > currently at the mercy of the BIOS -- if the BIOS enables L1 but
> > doesn't grant us permission via _OSC, L1 will remain enabled (as it is
> > on your system).
>
> I propose the following patch. Any comments?

In my opinion this is dangerous, because it opens us to bugs that right now
are prevented from happening due to the way the code works.

Thanks,
Rafael


> commit cd11e3f87c4d2777cf8921c0454500c9baa54b46
> Author: Bjorn Helgaas <[email protected]>
> Date: Fri May 10 15:54:35 2013 -0600
>
> PCI/ASPM: Allow drivers to disable ASPM unconditionally
>
> Some devices have hardware problems related to using ASPM. Drivers for
> these devices use pci_disable_link_state() to prevent their device from
> entering L0s or L1. But on platforms where the OS doesn't have permission
> to manage ASPM, pci_disable_link_state() does nothing, and the driver has
> no way to know this.
>
> Therefore, if the BIOS enables ASPM but declines (either via the FADT
> ACPI_FADT_NO_ASPM bit or the _OSC method) to allow the OS to manage it,
> the device can still use ASPM and trip over the hardware issue.
>
> This patch makes pci_disable_link_state() disable ASPM unconditionally,
> regardless of whether the OS has permission to manage ASPM in general.
>
> Reported-by: Emmanuel Grumbach <[email protected]>
> Reference: https://lkml.kernel.org/r/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=57331
> Signed-off-by: Bjorn Helgaas <[email protected]>
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index d320df6..9ef4ab8 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -718,15 +718,11 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
> * pci_disable_link_state - disable pci device's link state, so the link will
> * never enter specific states
> */
> -static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,
> - bool force)
> +static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
> {
> struct pci_dev *parent = pdev->bus->self;
> struct pcie_link_state *link;
>
> - if (aspm_disabled && !force)
> - return;
> -
> if (!pci_is_pcie(pdev))
> return;
>
> @@ -757,13 +753,13 @@ static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,
>
> void pci_disable_link_state_locked(struct pci_dev *pdev, int state)
> {
> - __pci_disable_link_state(pdev, state, false, false);
> + __pci_disable_link_state(pdev, state, false);
> }
> EXPORT_SYMBOL(pci_disable_link_state_locked);
>
> void pci_disable_link_state(struct pci_dev *pdev, int state)
> {
> - __pci_disable_link_state(pdev, state, true, false);
> + __pci_disable_link_state(pdev, state, true);
> }
> EXPORT_SYMBOL(pci_disable_link_state);
>
> @@ -781,7 +777,7 @@ void pcie_clear_aspm(struct pci_bus *bus)
> __pci_disable_link_state(child, PCIE_LINK_STATE_L0S |
> PCIE_LINK_STATE_L1 |
> PCIE_LINK_STATE_CLKPM,
> - false, true);
> + false);
> }
> }
>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-05-11 20:22:20

by Matthew Garrett

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

T24gU2F0LCAyMDEzLTA1LTExIGF0IDIyOjI2ICswMjAwLCBSYWZhZWwgSi4gV3lzb2NraSB3cm90
ZToNCj4gT24gRnJpZGF5LCBNYXkgMTAsIDIwMTMgMDQ6NTI6NTcgUE0gQmpvcm4gSGVsZ2FhcyB3
cm90ZToNCj4gPiBJIHByb3Bvc2UgdGhlIGZvbGxvd2luZyBwYXRjaC4gIEFueSBjb21tZW50cz8N
Cj4gDQo+IEluIG15IG9waW5pb24gdGhpcyBpcyBkYW5nZXJvdXMsIGJlY2F1c2UgaXQgb3BlbnMg
dXMgdG8gYnVncyB0aGF0IHJpZ2h0IG5vdw0KPiBhcmUgcHJldmVudGVkIGZyb20gaGFwcGVuaW5n
IGR1ZSB0byB0aGUgd2F5IHRoZSBjb2RlIHdvcmtzLg0KDQpSaWdodCwgSSdtIGFsc28gbm90IGVu
dGlyZWx5IGNvbWZvcnRhYmxlIHdpdGggdGhpcy4gVGhlIGN1cnJlbnQNCmJlaGF2aW91ciBtYXkg
YmUgY29uZnVzaW5nLCBidXQgd2UgY291bGQgcmVkdWNlIHRoYXQgYnkgcmVuYW1pbmcgdGhlDQpm
dW5jdGlvbnMuIEknbSBzdGlsbCBub3QgY2xlYXIgb24gd2hldGhlciBhbnlvbmUncyBhY3R1YWxs
eSBzZWVpbmcNCnByb2JsZW1zIGNhdXNlZCBieSB0aGUgZXhpc3RpbmcgYmVoYXZpb3VyLg0KDQot
LSANCk1hdHRoZXcgR2FycmV0dCB8IG1qZzU5QHNyY2YudWNhbS5vcmcNCg==


2013-05-16 22:55:40

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

On Sat, May 11, 2013 at 08:22:11PM +0000, Matthew Garrett wrote:
> On Sat, 2013-05-11 at 22:26 +0200, Rafael J. Wysocki wrote:
> > On Friday, May 10, 2013 04:52:57 PM Bjorn Helgaas wrote:
> > > I propose the following patch. Any comments?
> >
> > In my opinion this is dangerous, because it opens us to bugs that right now
> > are prevented from happening due to the way the code works.
>
> Right, I'm also not entirely comfortable with this. The current
> behaviour may be confusing, but we could reduce that by renaming the
> functions. I'm still not clear on whether anyone's actually seeing
> problems caused by the existing behaviour.

I couldn't imagine that silently ignoring the request to disable ASPM
would be the right thing, but I spent a long time experimenting with
Windows on qemu, and I think you're right. Windows 7 also seems to
ignore the "PciASPMOptOut" directive when we don't have permission
to manage ASPM. All the gory details are at
https://bugzilla.kernel.org/show_bug.cgi?id=57331

The current behavior is definitely confusing. I hate to rename or change
pci_disable_link_state() because it's exported and we'd have to maintain
the old interface for a while anyway. And I don't really want to return
failure to drivers, because I think that would encourage people to fiddle
with the Link Control register directly in the driver, which doesn't seem
like a good idea.

And you're also right that (as far as I know) there's not an actual
problem with the current behavior other than the confusion it causes.

So, how about something like the following patch, which just prints a
warning when we can't do what the driver requested? I suppose this may
also be a nuisance, because users will be worried, but they can't actually
*do* anything about it. Maybe it should be dev_info() instead.

commit f1956960fa0759c53b28e3a2810bd7e1b6e8925f
Author: Bjorn Helgaas <[email protected]>
Date: Wed May 15 17:02:37 2013 -0600

PCI/ASPM: Warn when driver asks to disable ASPM, but we can't do it

Some devices have hardware problems related to using ASPM. Drivers for
these devices use pci_disable_link_state() to prevent their device from
entering L0s or L1. But on platforms where the OS doesn't have permission
to manage ASPM, pci_disable_link_state() doesn't actually disable ASPM.

Windows has a similar mechanism ("PciASPMOptOut"), and when the OS doesn't
have control of ASPM, it doesn't actually disable ASPM either.

This patch just adds a warning in dmesg about the fact that
pci_disable_link_state() is doing nothing.

Reported-by: Emmanuel Grumbach <[email protected]>
Reference: https://lkml.kernel.org/r/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=57331
Signed-off-by: Bjorn Helgaas <[email protected]>

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index d320df6..faa83b6 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -724,9 +724,6 @@ static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,
struct pci_dev *parent = pdev->bus->self;
struct pcie_link_state *link;

- if (aspm_disabled && !force)
- return;
-
if (!pci_is_pcie(pdev))
return;

@@ -736,6 +733,19 @@ static void __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem,
if (!parent || !parent->link_state)
return;

+ /*
+ * A driver requested that ASPM be disabled on this device, but
+ * if we don't have permission to manage ASPM (e.g., on ACPI
+ * systems we have to observe the FADT ACPI_FADT_NO_ASPM bit and
+ * the _OSC method), we can't honor that request. Windows has
+ * a similar mechanism using "PciASPMOptOut", which is also
+ * ignored in this situation.
+ */
+ if (aspm_disabled && !force) {
+ dev_warn(&pdev->dev, "can't disable ASPM; OS doesn't have ASPM control\n");
+ return;
+ }
+
if (sem)
down_read(&pci_bus_sem);
mutex_lock(&aspm_lock);

2013-04-30 22:46:17

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

[+cc Wei-Yi, Matthew]

On Tue, Apr 30, 2013 at 4:57 AM, Emmanuel Grumbach <[email protected]> wrote:
>>
>>> Sorry, I haven't had a chance to look at this yet. Can you please
>>> attach the *complete* dmesg and lspci output, not just the parts that
>>> mention iwlwifi? Some of this depends on other PCI core stuff, like
>>> what happened with the _OSC evaluation when we discovered the PCI host
>>> bridge.
>>>
>>
>> Here is the info you wanted. Same config file as before.
>> Note that I now have two WiFi NICs connected.
>
> Any chance you'll find time for me? :-)

Thanks for reminding me again; I had forgotten all about this. I
opened https://bugzilla.kernel.org/show_bug.cgi?id=57331 and attached
your logs there.

Synopsis for Wei-Yi and Matthew: after 1a7123cdd9, iwlwifi calls
pci_disable_link_state() to prevent the link from entering L1, which
avoided "inexplicable PCIe disconnects." pci_disable_link_state() has
always been a no-op when aspm_disabled is set, and we set it whenever
the BIOS declines to give us _OSC control.

In Emmanuel's case, we don't get _OSC control, so
pci_disable_link_state() does nothing.

I'm not convinced iwlwifi really needs to disable the link power
management states in the first place -- the 1a7123cdd9 changelog isn't
very convincing, and I'm not aware of any reported issues even though
we're actually leaving them enabled in many cases.

More generally, I don't think the pci_disable_link_state() interface
is very useful because it often doesn't do what it claims, and the
caller has no indication.

Wei-Yi, can you tell us anything about whether iwlwifi really needs to
disable link power management?

Bjorn

2013-04-30 22:55:46

by Matthew Garrett

[permalink] [raw]
Subject: Re: is L1 really disabled in iwlwifi

T24gVHVlLCAyMDEzLTA0LTMwIGF0IDE2OjQ1IC0wNjAwLCBCam9ybiBIZWxnYWFzIHdyb3RlOg0K
DQo+IEknbSBub3QgY29udmluY2VkIGl3bHdpZmkgcmVhbGx5IG5lZWRzIHRvIGRpc2FibGUgdGhl
IGxpbmsgcG93ZXINCj4gbWFuYWdlbWVudCBzdGF0ZXMgaW4gdGhlIGZpcnN0IHBsYWNlIC0tIHRo
ZSAxYTcxMjNjZGQ5IGNoYW5nZWxvZyBpc24ndA0KPiB2ZXJ5IGNvbnZpbmNpbmcsIGFuZCBJJ20g
bm90IGF3YXJlIG9mIGFueSByZXBvcnRlZCBpc3N1ZXMgZXZlbiB0aG91Z2gNCj4gd2UncmUgYWN0
dWFsbHkgbGVhdmluZyB0aGVtIGVuYWJsZWQgaW4gbWFueSBjYXNlcy4NCj4gDQo+IE1vcmUgZ2Vu
ZXJhbGx5LCBJIGRvbid0IHRoaW5rIHRoZSBwY2lfZGlzYWJsZV9saW5rX3N0YXRlKCkgaW50ZXJm
YWNlDQo+IGlzIHZlcnkgdXNlZnVsIGJlY2F1c2UgaXQgb2Z0ZW4gZG9lc24ndCBkbyB3aGF0IGl0
IGNsYWltcywgYW5kIHRoZQ0KPiBjYWxsZXIgaGFzIG5vIGluZGljYXRpb24uDQoNClRoZSBnZW5l
cmFsIGlzc3VlIGhlcmUgaXMgdGhhdCBXaW5kb3dzIHdpbGwgKGFzIGZhciBhcyB3ZSd2ZSBiZWVu
IGFibGUNCnRvIGRldGVybWluZSkgbmV2ZXIgdG91Y2ggQVNQTSByZWdpc3RlcnMgdW5sZXNzIGdp
dmVuIFBDSWUgY29udHJvbCB2aWENCl9PU0MuIERyaXZlcnMgYXJlIHByZXN1bWFibHkgYWJsZSB0
byBvdmVycmlkZSB0aGlzIGJ5IGhpdHRpbmcNCmNvbmZpZ3VyYXRpb24gcmVnaXN0ZXJzIHRoZW1z
ZWx2ZXMuIFRoZSBwY2lfZGlzYWJsZV9saW5rX3N0YXRlKCkNCmZ1bmN0aW9ucyBhcmUgY3VycmVu
dGx5IGJyb2FkbHkgZXF1aXZhbGVudCB0byB0aGUgaGVscGVyIGZ1bmN0aW9ucw0KcHJvdmlkZWQg
aW4gdGhlIFdpbmRvd3MgLmluZiBmaWxlcywgYW5kIGlmIGRyaXZlcnMgcmVhbGx5IHdhbnQgdG8N
CmRpc2FibGUgdGhlIGNvbnRyb2wgdGhlbXNlbHZlcyB0aGVuIHRoZXkgY2FuIGRvIHNvLiBSZW5h
bWluZyB0aGUNCmZ1bmN0aW9ucyB0byBtYWtlIHRoYXQgY2xlYXJlciwgYW5kIHBvc3NpYmx5IGFk
ZGluZyBzb21lIGFkZGl0aW9uYWwNCmZ1bmN0aW9ucyB0byBmb3JjZSB0aGUgY2FzZSwgbWF5IGJl
IHdvcnRod2hpbGUuDQoNClRoZSBvbmx5IHRpbWUgdGhpcyBzaG91bGQgYmUgcmVsZXZhbnQgaXMg
aWYgKGEpIHRoZSBCSU9TIGhhcyBlbmFibGVkIEwxDQpvbiBpd2x3aWZpLCAoYikgdGhlIEJJT1Mg
aGFzIGRpc2FibGVkIEFTUE0gY29udHJvbCwgYW5kIChjKSB0aGUgaGFyZHdhcmUNCmRvZXNuJ3Qg
d29yayB3aXRoIEwxIGVuYWJsZWQuIEFyZSB0aGVyZSByZWFsbHkgY2FzZXMgd2hlcmUgdGhhdCdz
IHRydWU/DQoNCi0tIA0KTWF0dGhldyBHYXJyZXR0IHwgbWpnNTlAc3JjZi51Y2FtLm9yZw0K