2021-11-04 13:57:57

by Jonas Dreßler

Subject: Re: [PATCH] mwifiex: Add quirk resetting the PCI bridge on MS Surface devices

On 10/26/21 01:56, Bjorn Helgaas wrote:
> On Mon, Oct 25, 2021 at 06:45:29PM +0200, Jonas Dreßler wrote:
>> On 10/18/21 17:35, Bjorn Helgaas wrote:
>>> On Thu, Oct 14, 2021 at 12:08:31AM +0200, Jonas Dreßler wrote:
>>>> On 10/12/21 17:39, Bjorn Helgaas wrote:
>>>>> [+cc Vidya, Victor, ASPM L1.2 config issue; beginning of thread:
>>>>> https://lore.kernel.org/all/[email protected]/]
>>>
>>>>> I wonder if this reset quirk works because pci_reset_function() saves
>>>>> and restores much of config space, but it currently does *not* restore
>>>>> the L1 PM Substates capability, so those T_POWER_ON,
>>>>> Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD values probably get
>>>>> cleared out by the reset. We did briefly save/restore it [1], but we
>>>>> had to revert that because of a regression that AFAIK was never
>>>>> resolved [2]. I expect we will eventually save/restore this, so if
>>>>> the quirk depends on it *not* being restored, that would be a problem.
>>>>>
>>>>> You should be able to test whether this is the critical thing by
>>>>> clearing those registers with setpci instead of doing the reset. Per
>>>>> spec, they can only be modified when L1.2 is disabled, so you would
>>>>> have to disable it via sysfs (for the endpoint, I think)
>>>>> /sys/.../l1_2_aspm and /sys/.../l1_2_pcipm, do the setpci on the root
>>>>> port, then re-enable L1.2.
>>>>>
>>>>> [1] https://git.kernel.org/linus/4257f7e008ea
>>>>> [2] https://lore.kernel.org/all/[email protected]/
>>>>
>>>> Hmm, interesting, thanks for those links.
>>>>
>>>> Are you sure the config values will get lost on the reset? If we
>>>> only reset the port by going into D3hot and back into D0, the
>>>> device will remain powered and won't lose the config space, will
>>>> it?
>>>
>>> I think you're doing a PM reset (transition to D3hot and back to
>>> D0). Linux only does this when PCI_PM_CTRL_NO_SOFT_RESET == 0.
>>> The spec doesn't actually *require* the device to be reset; it
>>> only says the internal state of the device is undefined after
>>> these transitions.
>>
>> Not requiring the device to be reset sounds sensible to me given
>> that D3hot is what devices are transitioned into during suspend.
>>
>> But anyway, that doesn't really get us any further except it
>> somewhat gives an explanation why the LTR is suddenly 0 after the
>> reset. Or are you making the point that we shouldn't rely on
>> "undefined state" for this hack because not all PCI bridges/ports
>> will necessarily behave the same?
>
> I guess I'm just making the point that I don't understand why the
> bridge reset fixes something, and I'm not confident that the fix will
> work on every system and continue working even if/when the PCI core
> starts saving and restoring the L1 PM Substates capability.
>

FWIW, I've tested it with the restoring of L1 PM Substates enabled now
and the bridge reset worked just as before.

But yeah I, too, have no clue why exactly the bridge reset does what it
does...

Anyway, I've also confirmed that it actually impacts power usage by
measuring the energy consumed during idle over a few minutes: either
applying the bridge reset quirk or ignoring the LTR via pmc_core results
in about 7% less energy usage. Since I kept the overall energy usage
close to zero to make the measurement easier, those 7% are not a lot in
absolute terms, but they nonetheless confirm that the quirk works.
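
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch of how the relative saving could be computed from battery readings
taken before and after an idle period. It is not the script behind the
numbers above. It assumes the battery exposes an energy_now attribute (in
µWh) in its power_supply sysfs directory, which not every battery driver
provides, and the helper names are made up for illustration.

```python
from pathlib import Path


def read_energy_uwh(battery_dir):
    """Read the remaining battery energy in µWh.

    battery_dir is assumed to be a power_supply sysfs directory that
    contains an energy_now attribute (hypothetical path on your system).
    """
    return int((Path(battery_dir) / "energy_now").read_text().strip())


def energy_used_uwh(start_uwh, end_uwh):
    """Energy drained between two readings taken while on battery."""
    return start_uwh - end_uwh


def relative_saving(baseline_used_uwh, quirk_used_uwh):
    """Fraction of energy saved by the quirk run relative to the baseline run."""
    if baseline_used_uwh <= 0:
        raise ValueError("baseline run must have consumed some energy")
    return (baseline_used_uwh - quirk_used_uwh) / baseline_used_uwh
```

The idea would be to take one reading at the start and one at the end of
each idle period (same duration, once without and once with the quirk),
and pass the two differences to relative_saving().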