2020-11-10 11:04:13

by Bastien Nocera

[permalink] [raw]
Subject: How to enable auto-suspend by default

Hey,

systemd has been shipping this script to enable auto-suspend on a
number of USB and PCI devices:
https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspend_rules.py

The problem here is twofold. First, the list of devices is updated from
ChromeOS, and the original list obviously won't be updated by ChromeOS
developers unless a device listed exists in a ChromeBook computer,
which means a number of devices that do support autosuspend aren't
listed.

The other problem is that this list needs to exist at all, and that it
doesn't seem possible for device driver developers (at various levels
of the stack) to opt-in to auto-suspend when all the variants of the
device (or at least detectable ones) support auto-suspend.

So the question is: how can we make it easier for device drivers to
implicitly allow autosuspend *unless they opt-out*, especially for
frameworks where the device's transport layer isn't directly available
(eg. HID devices)?

If that can't be done in the kernel drivers directly, would it be
possible for the kernel to ship with a somewhat canonical list that
systemd (or its replacement on other "Linuxes") could use to generate
those user-space quirks?

Ideally, for example, all new "iwlwifi" or all tested "iwlwifi" devices
should have autosuspend enabled by the developers adding support for
them, as in the script above, rather than downstreams (systemd upstream
included) having to chase new PCI IDs.

Cheers


2020-11-10 11:38:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> Hey,
>
> systemd has been shipping this script to enable auto-suspend on a
> number of USB and PCI devices:
> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspend_rules.py
>
> The problem here is twofold. First, the list of devices is updated from
> ChromeOS, and the original list obviously won't be updated by ChromeOS
> developers unless a device listed exists in a ChromeBook computer,
> which means a number of devices that do support autosuspend aren't
> listed.
>
> The other problem is that this list needs to exist at all, and that it
> doesn't seem possible for device driver developers (at various levels
> of the stack) to opt-in to auto-suspend when all the variants of the
> device (or at least detectable ones) support auto-suspend.

A driver can say they support autosuspend today, but I think you are
concerned about the devices that are controlled by class-compliant
drivers, right? And for those, no, we can't do this in the kernel as
there are just too many broken devices out there.

As proof of this, look at other operating systems. They had to
implement the same type of "allowed devices" list that we do. In fact,
we did this for Linux because they did this, which means that when
hardware manufacturers test their device, they only test with other
operating systems and not Linux and so, we need to match what those
other OSes do as well.

Sorry,

greg k-h

2020-11-10 16:05:02

by Mario Limonciello

[permalink] [raw]
Subject: RE: How to enable auto-suspend by default

>
> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> > Hey,
> >
> > systemd has been shipping this script to enable auto-suspend on a
> > number of USB and PCI devices:
> >
> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
> d_rules.py
> >
> > The problem here is twofold. First, the list of devices is updated from
> > ChromeOS, and the original list obviously won't be updated by ChromeOS
> > developers unless a device listed exists in a ChromeBook computer,
> > which means a number of devices that do support autosuspend aren't
> > listed.
> >
> > The other problem is that this list needs to exist at all, and that it
> > doesn't seem possible for device driver developers (at various levels
> > of the stack) to opt-in to auto-suspend when all the variants of the
> > device (or at least detectable ones) support auto-suspend.
>
> A driver can say they support autosuspend today, but I think you are
> concerned about the devices that are controlled by class-compliant
> drivers, right? And for those, no, we can't do this in the kernel as
> there are just too many broken devices out there.
>

I guess what Bastien is getting at is for newer devices supported by class
drivers rather than having to store an allowlist in udev rules, can we set
the allowlist in the kernel instead. Then distributions that either don't
use systemd or don't regularly update udev rules from systemd can take
advantage of better defaults on modern hardware.

The one item that stood out to me in that rules file was 8086:a0ed.
It's listed as "Volteer XHCI", but that same device ID is actually present
in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.

Given we're effectively ending up with the combination of runtime PM turned
on by udev rules, do we need something like this for that ID:

https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a

@Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?

> As proof of this, look at other operating systems. They had to
> implement the same type of "allowed devices" list that we do. In fact,
> we did this for Linux because they did this, which means that when
> hardware manufacturers test their device, they only test with other
> operating systems and not Linux and so, we need to match what those
> other OSes do as well.
>

(insert "some" ????)

2020-11-10 17:19:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
> >
> > On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> > > Hey,
> > >
> > > systemd has been shipping this script to enable auto-suspend on a
> > > number of USB and PCI devices:
> > >
> > https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
> > d_rules.py
> > >
> > > The problem here is twofold. First, the list of devices is updated from
> > > ChromeOS, and the original list obviously won't be updated by ChromeOS
> > > developers unless a device listed exists in a ChromeBook computer,
> > > which means a number of devices that do support autosuspend aren't
> > > listed.
> > >
> > > The other problem is that this list needs to exist at all, and that it
> > > doesn't seem possible for device driver developers (at various levels
> > > of the stack) to opt-in to auto-suspend when all the variants of the
> > > device (or at least detectable ones) support auto-suspend.
> >
> > A driver can say they support autosuspend today, but I think you are
> > concerned about the devices that are controlled by class-compliant
> > drivers, right? And for those, no, we can't do this in the kernel as
> > there are just too many broken devices out there.
> >
>
> I guess what Bastien is getting at is for newer devices supported by class
> drivers rather than having to store an allowlist in udev rules, can we set
> the allowlist in the kernel instead. Then distributions that either don't
> use systemd or don't regularly update udev rules from systemd can take
> advantage of better defaults on modern hardware.

That's what the "hardware ids" database is supposed to be handling.
It's easier to manage this in userspace than in the kernel.

I just love systems where people feel it is safer to update the kernel
than it is to update a hardware database file :)

> The one item that stood out to me in that rules file was 8086:a0ed.
> It's listed as "Volteer XHCI", but that same device ID is actually present
> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.

That's an Intel PCI device id. If someone else is abusing that number,
I'm sure Intel would want to know about it and would be glad to go after
them.

But note, PCI devices can be behind lots of different types of busses,
so maybe the "can this device autosuspend" differs for them because of
different implementations of where the device is, right?

> Given we're effectively ending up with the combination of runtime PM turned
> on by udev rules, do we need something like this for that ID:
>
> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>
> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?

Submit a patch!

thanks,

greg k-h

2020-11-10 17:27:22

by Mika Westerberg

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
> >
> > On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> > > Hey,
> > >
> > > systemd has been shipping this script to enable auto-suspend on a
> > > number of USB and PCI devices:
> > >
> > https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
> > d_rules.py
> > >
> > > The problem here is twofold. First, the list of devices is updated from
> > > ChromeOS, and the original list obviously won't be updated by ChromeOS
> > > developers unless a device listed exists in a ChromeBook computer,
> > > which means a number of devices that do support autosuspend aren't
> > > listed.
> > >
> > > The other problem is that this list needs to exist at all, and that it
> > > doesn't seem possible for device driver developers (at various levels
> > > of the stack) to opt-in to auto-suspend when all the variants of the
> > > device (or at least detectable ones) support auto-suspend.
> >
> > A driver can say they support autosuspend today, but I think you are
> > concerned about the devices that are controlled by class-compliant
> > drivers, right? And for those, no, we can't do this in the kernel as
> > there are just too many broken devices out there.
> >
>
> I guess what Bastien is getting at is for newer devices supported by class
> drivers rather than having to store an allowlist in udev rules, can we set
> the allowlist in the kernel instead. Then distributions that either don't
> use systemd or don't regularly update udev rules from systemd can take
> advantage of better defaults on modern hardware.
>
> The one item that stood out to me in that rules file was 8086:a0ed.
> It's listed as "Volteer XHCI", but that same device ID is actually present
> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
>
> Given we're effectively ending up with the combination of runtime PM turned
> on by udev rules, do we need something like this for that ID:
>
> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>
> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?

I think this one is the TGL PCH xHCI. The quirk currently for xHCI
controllers that are part of the TCSS (Type-C SubSystem) where it is
important to put all devices into low power mode whenever possible,
otherwise it keeps the whole block on.

Typically we haven't done that for PCH side xHCI controllers though, but
I don't see why not if it works that is. Adding Mathias to comment more
on that since he is the xHCI maintainer.

2020-11-10 17:47:47

by Mario Limonciello

[permalink] [raw]
Subject: RE: How to enable auto-suspend by default

> > I guess what Bastien is getting at is for newer devices supported by class
> > drivers rather than having to store an allowlist in udev rules, can we set
> > the allowlist in the kernel instead. Then distributions that either don't
> > use systemd or don't regularly update udev rules from systemd can take
> > advantage of better defaults on modern hardware.
>
> That's what the "hardware ids" database is supposed to be handling.
> It's easier to manage this in userspace than in the kernel.
>
> I just love systems where people feel it is safer to update the kernel
> than it is to update a hardware database file :)
>
> > The one item that stood out to me in that rules file was 8086:a0ed.
> > It's listed as "Volteer XHCI", but that same device ID is actually present
> > in an XPS 9310 in front of me as well and used by the xhci-pci kernel
> module.
>
> That's an Intel PCI device id. If someone else is abusing that number,
> I'm sure Intel would want to know about it and would be glad to go after
> them.

Sorry I wasn't intending to insinuate an abuse of the number, but rather that
the PCI device in the "Volteer" product and that in XPS 9310 appear are the
same so they are possibly using the same hardware for this device.

>
> But note, PCI devices can be behind lots of different types of busses,
> so maybe the "can this device autosuspend" differs for them because of
> different implementations of where the device is, right?
>

Well the reason that I raise it is that without that device auto-suspended the
SOC on the XPS 9310 consumes too much power.

> > Given we're effectively ending up with the combination of runtime PM turned
> > on by udev rules, do we need something like this for that ID:
> >
> >
> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400
> d8a
> >
> > @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
>
> Submit a patch!
>
> thanks,
>
> greg k-h

If that's the appropriate conclusion, will do.

2020-11-10 18:04:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, Nov 10, 2020 at 05:45:43PM +0000, Limonciello, Mario wrote:
> > > I guess what Bastien is getting at is for newer devices supported by class
> > > drivers rather than having to store an allowlist in udev rules, can we set
> > > the allowlist in the kernel instead. Then distributions that either don't
> > > use systemd or don't regularly update udev rules from systemd can take
> > > advantage of better defaults on modern hardware.
> >
> > That's what the "hardware ids" database is supposed to be handling.
> > It's easier to manage this in userspace than in the kernel.
> >
> > I just love systems where people feel it is safer to update the kernel
> > than it is to update a hardware database file :)
> >
> > > The one item that stood out to me in that rules file was 8086:a0ed.
> > > It's listed as "Volteer XHCI", but that same device ID is actually present
> > > in an XPS 9310 in front of me as well and used by the xhci-pci kernel
> > module.
> >
> > That's an Intel PCI device id. If someone else is abusing that number,
> > I'm sure Intel would want to know about it and would be glad to go after
> > them.
>
> Sorry I wasn't intending to insinuate an abuse of the number, but rather that
> the PCI device in the "Volteer" product and that in XPS 9310 appear are the
> same so they are possibly using the same hardware for this device.

Ok, but again, given that the device might be on different transports,
the ability for the device to properly autosuspend might be different,
right?

thanks,

greg k-h

2020-11-10 18:11:29

by Mario Limonciello

[permalink] [raw]
Subject: RE: How to enable auto-suspend by default

> On Tue, Nov 10, 2020 at 05:45:43PM +0000, Limonciello, Mario wrote:
> > > > I guess what Bastien is getting at is for newer devices supported by
> class
> > > > drivers rather than having to store an allowlist in udev rules, can we
> set
> > > > the allowlist in the kernel instead. Then distributions that either
> don't
> > > > use systemd or don't regularly update udev rules from systemd can take
> > > > advantage of better defaults on modern hardware.
> > >
> > > That's what the "hardware ids" database is supposed to be handling.
> > > It's easier to manage this in userspace than in the kernel.
> > >
> > > I just love systems where people feel it is safer to update the kernel
> > > than it is to update a hardware database file :)
> > >
> > > > The one item that stood out to me in that rules file was 8086:a0ed.
> > > > It's listed as "Volteer XHCI", but that same device ID is actually
> present
> > > > in an XPS 9310 in front of me as well and used by the xhci-pci kernel
> > > module.
> > >
> > > That's an Intel PCI device id. If someone else is abusing that number,
> > > I'm sure Intel would want to know about it and would be glad to go after
> > > them.
> >
> > Sorry I wasn't intending to insinuate an abuse of the number, but rather
> that
> > the PCI device in the "Volteer" product and that in XPS 9310 appear are the
> > same so they are possibly using the same hardware for this device.
>
> Ok, but again, given that the device might be on different transports,
> the ability for the device to properly autosuspend might be different,
> right?
>

I would think since it's provided by the PCH unlikely to be different transports.
I'm not sure though, Matthias might need to confirm.

On my side if I don't put that device into autosuspend the power consumption
goes up.

2020-11-10 20:00:25

by Theodore Ts'o

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

One note... I'll double check, but on my XPS 13 9380, as I recall, I
have to manually disable autosuspend on all of the XHCI controllers
and internal hubs after running "powertop --auto-tune", or else any
external mouse attached to said USB device will be dead to the world
for 2-3 seconds if the autosuspend timeout has kicked in, which was
***super*** annoying.

- Ted

2020-11-10 20:47:30

by Mario Limonciello

[permalink] [raw]
Subject: RE: How to enable auto-suspend by default


> One note... I'll double check, but on my XPS 13 9380, as I recall, I
> have to manually disable autosuspend on all of the XHCI controllers
> and internal hubs after running "powertop --auto-tune", or else any
> external mouse attached to said USB device will be dead to the world
> for 2-3 seconds if the autosuspend timeout has kicked in, which was
> ***super*** annoying.
>
> - Ted

Perhaps this was not XHCI controller in the system, but maybe external
one provided by ASMedia? I've not heard anything similar for host system
yet.

2020-11-11 11:29:20

by Hans de Goede

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

Hi,

On 11/10/20 6:25 PM, Mika Westerberg wrote:
> On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
>>>
>>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
>>>> Hey,
>>>>
>>>> systemd has been shipping this script to enable auto-suspend on a
>>>> number of USB and PCI devices:
>>>>
>>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
>>> d_rules.py
>>>>
>>>> The problem here is twofold. First, the list of devices is updated from
>>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
>>>> developers unless a device listed exists in a ChromeBook computer,
>>>> which means a number of devices that do support autosuspend aren't
>>>> listed.
>>>>
>>>> The other problem is that this list needs to exist at all, and that it
>>>> doesn't seem possible for device driver developers (at various levels
>>>> of the stack) to opt-in to auto-suspend when all the variants of the
>>>> device (or at least detectable ones) support auto-suspend.
>>>
>>> A driver can say they support autosuspend today, but I think you are
>>> concerned about the devices that are controlled by class-compliant
>>> drivers, right? And for those, no, we can't do this in the kernel as
>>> there are just too many broken devices out there.
>>>
>>
>> I guess what Bastien is getting at is for newer devices supported by class
>> drivers rather than having to store an allowlist in udev rules, can we set
>> the allowlist in the kernel instead. Then distributions that either don't
>> use systemd or don't regularly update udev rules from systemd can take
>> advantage of better defaults on modern hardware.
>>
>> The one item that stood out to me in that rules file was 8086:a0ed.
>> It's listed as "Volteer XHCI", but that same device ID is actually present
>> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
>>
>> Given we're effectively ending up with the combination of runtime PM turned
>> on by udev rules, do we need something like this for that ID:
>>
>> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>>
>> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
>
> I think this one is the TGL PCH xHCI. The quirk currently for xHCI
> controllers that are part of the TCSS (Type-C SubSystem) where it is
> important to put all devices into low power mode whenever possible,
> otherwise it keeps the whole block on.

Note that there are currently some IDs missing from the xHCIs which
are part of the TCSS too. At least the id for the xHCI in the thunderbolt
controller on the Lenovo T14 gen 1 is missing. I started a discussion
about extending the kernel quirk list for this vs switching to hwdb
a while a go:

https://lore.kernel.org/linux-usb/[email protected]/

The conclusion back then was to switch to hwdb, but I never got around to this.

> Typically we haven't done that for PCH side xHCI controllers though, but
> I don't see why not if it works that is. Adding Mathias to comment more
> on that since he is the xHCI maintainer.

If we are also going to enable this for the non TCSS Intel XHCI controllers,
maybe just uncondtionally enable it for all Intel XHCI controllers, or
if necessary do a deny-list for some older models and enable it for anything
not on the deny-list (so all newer models). That should avoid the game of
whack-a-mole which we will have with this otherwise.

Note the deny-list + enable anything not on it approach could be done
either in the kernel or in a udev-rule + hwdb combo.

Regards,

Hans

2020-11-11 14:35:00

by Mika Westerberg

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
> Hi,
>
> On 11/10/20 6:25 PM, Mika Westerberg wrote:
> > On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
> >>>
> >>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> >>>> Hey,
> >>>>
> >>>> systemd has been shipping this script to enable auto-suspend on a
> >>>> number of USB and PCI devices:
> >>>>
> >>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
> >>> d_rules.py
> >>>>
> >>>> The problem here is twofold. First, the list of devices is updated from
> >>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
> >>>> developers unless a device listed exists in a ChromeBook computer,
> >>>> which means a number of devices that do support autosuspend aren't
> >>>> listed.
> >>>>
> >>>> The other problem is that this list needs to exist at all, and that it
> >>>> doesn't seem possible for device driver developers (at various levels
> >>>> of the stack) to opt-in to auto-suspend when all the variants of the
> >>>> device (or at least detectable ones) support auto-suspend.
> >>>
> >>> A driver can say they support autosuspend today, but I think you are
> >>> concerned about the devices that are controlled by class-compliant
> >>> drivers, right? And for those, no, we can't do this in the kernel as
> >>> there are just too many broken devices out there.
> >>>
> >>
> >> I guess what Bastien is getting at is for newer devices supported by class
> >> drivers rather than having to store an allowlist in udev rules, can we set
> >> the allowlist in the kernel instead. Then distributions that either don't
> >> use systemd or don't regularly update udev rules from systemd can take
> >> advantage of better defaults on modern hardware.
> >>
> >> The one item that stood out to me in that rules file was 8086:a0ed.
> >> It's listed as "Volteer XHCI", but that same device ID is actually present
> >> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
> >>
> >> Given we're effectively ending up with the combination of runtime PM turned
> >> on by udev rules, do we need something like this for that ID:
> >>
> >> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
> >>
> >> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
> >
> > I think this one is the TGL PCH xHCI. The quirk currently for xHCI
> > controllers that are part of the TCSS (Type-C SubSystem) where it is
> > important to put all devices into low power mode whenever possible,
> > otherwise it keeps the whole block on.
>
> Note that there are currently some IDs missing from the xHCIs which
> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
> controller on the Lenovo T14 gen 1 is missing. I started a discussion
> about extending the kernel quirk list for this vs switching to hwdb
> a while a go:
>
> https://lore.kernel.org/linux-usb/[email protected]/
>
> The conclusion back then was to switch to hwdb, but I never got around to this.

The reason I've added these to the xHCI driver is that it works even if
you are running some really small userspace (like busybox). Also for the
xHCI in TCSS we know for sure that it fully supports D3cold.

(The one you refer above is actually mistake from my side as I never
tested Alpine Ridge LP controller which I think this is).

> > Typically we haven't done that for PCH side xHCI controllers though, but
> > I don't see why not if it works that is. Adding Mathias to comment more
> > on that since he is the xHCI maintainer.
>
> If we are also going to enable this for the non TCSS Intel XHCI controllers,
> maybe just uncondtionally enable it for all Intel XHCI controllers, or
> if necessary do a deny-list for some older models and enable it for anything
> not on the deny-list (so all newer models). That should avoid the game of
> whack-a-mole which we will have with this otherwise.

This is really up to Mathias to decide. I'm fine either way :)

2020-11-11 16:06:01

by Mario Limonciello

[permalink] [raw]
Subject: RE: How to enable auto-suspend by default

> >> Given we're effectively ending up with the combination of runtime PM turned
> >> on by udev rules, do we need something like this for that ID:
> >>
> >>
> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400
> d8a
> >>
> >> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
> >
> > I think this one is the TGL PCH xHCI. The quirk currently for xHCI
> > controllers that are part of the TCSS (Type-C SubSystem) where it is
> > important to put all devices into low power mode whenever possible,
> > otherwise it keeps the whole block on.
>
> Note that there are currently some IDs missing from the xHCIs which
> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
> controller on the Lenovo T14 gen 1 is missing. I started a discussion
> about extending the kernel quirk list for this vs switching to hwdb
> a while a go:
>
> https://lore.kernel.org/linux-usb/b8b21ba3-0a8a-ff54-5e12-
> [email protected]/
>
> The conclusion back then was to switch to hwdb, but I never got around to
> this.

I guess the problem I see with switching to a hwdb for this type of thing is
that if there is a "bug" in your kernel driver around autosuspend you will
then be potentially causing it to occur more regularly on a kernel that didn't
necessarily pick up the fix but does have the newer hwdb.

I don't know how common that will really be though.

Since Mika mentioned the really light userspace scenario, what about shipping
the hwdb "with" the kernel in tree? This could allow evicting all these quirk
scenarios from the kernel at the same time as switching to a hwdb and also cover
the problem I suggested might happen with a bug in older kernel and newer userspace.

>
> > Typically we haven't done that for PCH side xHCI controllers though, but
> > I don't see why not if it works that is. Adding Mathias to comment more
> > on that since he is the xHCI maintainer.
>
> If we are also going to enable this for the non TCSS Intel XHCI controllers,
> maybe just uncondtionally enable it for all Intel XHCI controllers, or
> if necessary do a deny-list for some older models and enable it for anything
> not on the deny-list (so all newer models). That should avoid the game of
> whack-a-mole which we will have with this otherwise.
>
> Note the deny-list + enable anything not on it approach could be done
> either in the kernel or in a udev-rule + hwdb combo.
>
> Regards,
>
> Hans

2020-11-11 16:33:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Wed, Nov 11, 2020 at 04:03:30PM +0000, Limonciello, Mario wrote:
> > >> Given we're effectively ending up with the combination of runtime PM turned
> > >> on by udev rules, do we need something like this for that ID:
> > >>
> > >>
> > https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400
> > d8a
> > >>
> > >> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
> > >
> > > I think this one is the TGL PCH xHCI. The quirk currently for xHCI
> > > controllers that are part of the TCSS (Type-C SubSystem) where it is
> > > important to put all devices into low power mode whenever possible,
> > > otherwise it keeps the whole block on.
> >
> > Note that there are currently some IDs missing from the xHCIs which
> > are part of the TCSS too. At least the id for the xHCI in the thunderbolt
> > controller on the Lenovo T14 gen 1 is missing. I started a discussion
> > about extending the kernel quirk list for this vs switching to hwdb
> > a while a go:
> >
> > https://lore.kernel.org/linux-usb/b8b21ba3-0a8a-ff54-5e12-
> > [email protected]/
> >
> > The conclusion back then was to switch to hwdb, but I never got around to
> > this.
>
> I guess the problem I see with switching to a hwdb for this type of thing is
> that if there is a "bug" in your kernel driver around autosuspend you will
> then be potentially causing it to occur more regularly on a kernel that didn't
> necessarily pick up the fix but does have the newer hwdb.
>
> I don't know how common that will really be though.
>
> Since Mika mentioned the really light userspace scenario, what about shipping
> the hwdb "with" the kernel in tree? This could allow evicting all these quirk
> scenarios from the kernel at the same time as switching to a hwdb and also cover
> the problem I suggested might happen with a bug in older kernel and newer userspace.

We took things out of the kernel to put it in hwdb years ago as it was
easier for people to update a "text file" than it was their kernel
image. I don't think you want to go backwards here :)

thanks,

greg k-h

2020-11-23 13:57:15

by Hans de Goede

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

Hi,

On 11/11/20 3:31 PM, Mika Westerberg wrote:
> On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> On 11/10/20 6:25 PM, Mika Westerberg wrote:
>>> On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
>>>>>
>>>>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
>>>>>> Hey,
>>>>>>
>>>>>> systemd has been shipping this script to enable auto-suspend on a
>>>>>> number of USB and PCI devices:
>>>>>>
>>>>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
>>>>> d_rules.py
>>>>>>
>>>>>> The problem here is twofold. First, the list of devices is updated from
>>>>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
>>>>>> developers unless a device listed exists in a ChromeBook computer,
>>>>>> which means a number of devices that do support autosuspend aren't
>>>>>> listed.
>>>>>>
>>>>>> The other problem is that this list needs to exist at all, and that it
>>>>>> doesn't seem possible for device driver developers (at various levels
>>>>>> of the stack) to opt-in to auto-suspend when all the variants of the
>>>>>> device (or at least detectable ones) support auto-suspend.
>>>>>
>>>>> A driver can say they support autosuspend today, but I think you are
>>>>> concerned about the devices that are controlled by class-compliant
>>>>> drivers, right? And for those, no, we can't do this in the kernel as
>>>>> there are just too many broken devices out there.
>>>>>
>>>>
>>>> I guess what Bastien is getting at is for newer devices supported by class
>>>> drivers rather than having to store an allowlist in udev rules, can we set
>>>> the allowlist in the kernel instead. Then distributions that either don't
>>>> use systemd or don't regularly update udev rules from systemd can take
>>>> advantage of better defaults on modern hardware.
>>>>
>>>> The one item that stood out to me in that rules file was 8086:a0ed.
>>>> It's listed as "Volteer XHCI", but that same device ID is actually present
>>>> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
>>>>
>>>> Given we're effectively ending up with the combination of runtime PM turned
>>>> on by udev rules, do we need something like this for that ID:
>>>>
>>>> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>>>>
>>>> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
>>>
>>> I think this one is the TGL PCH xHCI. The quirk currently for xHCI
>>> controllers that are part of the TCSS (Type-C SubSystem) where it is
>>> important to put all devices into low power mode whenever possible,
>>> otherwise it keeps the whole block on.
>>
>> Note that there are currently some IDs missing from the xHCIs which
>> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
>> controller on the Lenovo T14 gen 1 is missing. I started a discussion
>> about extending the kernel quirk list for this vs switching to hwdb
>> a while a go:
>>
>> https://lore.kernel.org/linux-usb/[email protected]/
>>
>> The conclusion back then was to switch to hwdb, but I never got around to this.
>
> The reason I've added these to the xHCI driver is that it works even if
> you are running some really small userspace (like busybox). Also for the
> xHCI in TCSS we know for sure that it fully supports D3cold.
>
> (The one you refer above is actually mistake from my side as I never
> tested Alpine Ridge LP controller which I think this is).

Ok, so I'll submit a patch adding the 15c1 product-id for the
INTEL_ALPINE_RIDGE_LP_2C_XHCI controller to the list of ids for which we
set the XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk. To fix the much too high
idle-power consumption problem on devices with this Alpine Ridge variant.

>>> Typically we haven't done that for PCH side xHCI controllers though, but
>>> I don't see why not if it works that is. Adding Mathias to comment more
>>> on that since he is the xHCI maintainer.
>>
>> If we are also going to enable this for the non TCSS Intel XHCI controllers,
>> maybe just uncondtionally enable it for all Intel XHCI controllers, or
>> if necessary do a deny-list for some older models and enable it for anything
>> not on the deny-list (so all newer models). That should avoid the game of
>> whack-a-mole which we will have with this otherwise.
>
> This is really up to Mathias to decide. I'm fine either way :)

Ok, Matthias what do you think about this?

Regards,

Hans

2020-11-23 14:05:49

by Mika Westerberg

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Mon, Nov 23, 2020 at 02:54:19PM +0100, Hans de Goede wrote:
> Hi,
>
> On 11/11/20 3:31 PM, Mika Westerberg wrote:
> > On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 11/10/20 6:25 PM, Mika Westerberg wrote:
> >>> On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
> >>>>>
> >>>>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
> >>>>>> Hey,
> >>>>>>
> >>>>>> systemd has been shipping this script to enable auto-suspend on a
> >>>>>> number of USB and PCI devices:
> >>>>>>
> >>>>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
> >>>>> d_rules.py
> >>>>>>
> >>>>>> The problem here is twofold. First, the list of devices is updated from
> >>>>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
> >>>>>> developers unless a device listed exists in a ChromeBook computer,
> >>>>>> which means a number of devices that do support autosuspend aren't
> >>>>>> listed.
> >>>>>>
> >>>>>> The other problem is that this list needs to exist at all, and that it
> >>>>>> doesn't seem possible for device driver developers (at various levels
> >>>>>> of the stack) to opt-in to auto-suspend when all the variants of the
> >>>>>> device (or at least detectable ones) support auto-suspend.
> >>>>>
> >>>>> A driver can say they support autosuspend today, but I think you are
> >>>>> concerned about the devices that are controlled by class-compliant
> >>>>> drivers, right? And for those, no, we can't do this in the kernel as
> >>>>> there are just too many broken devices out there.
> >>>>>
> >>>>
> >>>> I guess what Bastien is getting at is for newer devices supported by class
> >>>> drivers rather than having to store an allowlist in udev rules, can we set
> >>>> the allowlist in the kernel instead. Then distributions that either don't
> >>>> use systemd or don't regularly update udev rules from systemd can take
> >>>> advantage of better defaults on modern hardware.
> >>>>
> >>>> The one item that stood out to me in that rules file was 8086:a0ed.
> >>>> It's listed as "Volteer XHCI", but that same device ID is actually present
> >>>> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
> >>>>
> >>>> Given we're effectively ending up with the combination of runtime PM turned
> >>>> on by udev rules, do we need something like this for that ID:
> >>>>
> >>>> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
> >>>>
> >>>> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
> >>>
> >>> I think this one is the TGL PCH xHCI. The quirk currently for xHCI
> >>> controllers that are part of the TCSS (Type-C SubSystem) where it is
> >>> important to put all devices into low power mode whenever possible,
> >>> otherwise it keeps the whole block on.
> >>
> >> Note that there are currently some IDs missing from the xHCIs which
> >> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
> >> controller on the Lenovo T14 gen 1 is missing. I started a discussion
> >> about extending the kernel quirk list for this vs switching to hwdb
> >> a while a go:
> >>
> >> https://lore.kernel.org/linux-usb/[email protected]/
> >>
> >> The conclusion back then was to switch to hwdb, but I never got around to this.
> >
> > The reason I've added these to the xHCI driver is that it works even if
> > you are running some really small userspace (like busybox). Also for the
> > xHCI in TCSS we know for sure that it fully supports D3cold.
> >
> > (The one you refer above is actually mistake from my side as I never
> > tested Alpine Ridge LP controller which I think this is).
>
> Ok, so I'll submit a patch adding the 15c1 product-id for the
> INTEL_ALPINE_RIDGE_LP_2C_XHCI controller to the list of ids for which we
> set the XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk. To fix the much too high
> idle-power consumption problem on devices with this Alpine Ridge variant.

Thanks!

2020-11-24 12:44:03

by Hans de Goede

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

Hi,

On 11/24/20 1:37 PM, Mathias Nyman wrote:
> On 23.11.2020 15.54, Hans de Goede wrote:
>> Hi,
>>
>> On 11/11/20 3:31 PM, Mika Westerberg wrote:
>>> On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 11/10/20 6:25 PM, Mika Westerberg wrote:
>>>>> On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
>>>>>>>
>>>>>>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> systemd has been shipping this script to enable auto-suspend on a
>>>>>>>> number of USB and PCI devices:
>>>>>>>>
>>>>>>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
>>>>>>> d_rules.py
>>>>>>>>
>>>>>>>> The problem here is twofold. First, the list of devices is updated from
>>>>>>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
>>>>>>>> developers unless a device listed exists in a ChromeBook computer,
>>>>>>>> which means a number of devices that do support autosuspend aren't
>>>>>>>> listed.
>>>>>>>>
>>>>>>>> The other problem is that this list needs to exist at all, and that it
>>>>>>>> doesn't seem possible for device driver developers (at various levels
>>>>>>>> of the stack) to opt-in to auto-suspend when all the variants of the
>>>>>>>> device (or at least detectable ones) support auto-suspend.
>>>>>>>
>>>>>>> A driver can say they support autosuspend today, but I think you are
>>>>>>> concerned about the devices that are controlled by class-compliant
>>>>>>> drivers, right? And for those, no, we can't do this in the kernel as
>>>>>>> there are just too many broken devices out there.
>>>>>>>
>>>>>>
>>>>>> I guess what Bastien is getting at is for newer devices supported by class
>>>>>> drivers rather than having to store an allowlist in udev rules, can we set
>>>>>> the allowlist in the kernel instead. Then distributions that either don't
>>>>>> use systemd or don't regularly update udev rules from systemd can take
>>>>>> advantage of better defaults on modern hardware.
>>>>>>
>>>>>> The one item that stood out to me in that rules file was 8086:a0ed.
>>>>>> It's listed as "Volteer XHCI", but that same device ID is actually present
>>>>>> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
>>>>>>
>>>>>> Given we're effectively ending up with the combination of runtime PM turned
>>>>>> on by udev rules, do we need something like this for that ID:
>>>>>>
>>>>>> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>>>>>>
>>>>>> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
>>>>>
>>>>> I think this one is the TGL PCH xHCI. The quirk currently for xHCI
>>>>> controllers that are part of the TCSS (Type-C SubSystem) where it is
>>>>> important to put all devices into low power mode whenever possible,
>>>>> otherwise it keeps the whole block on.
>>>>
>>>> Note that there are currently some IDs missing from the xHCIs which
>>>> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
>>>> controller on the Lenovo T14 gen 1 is missing. I started a discussion
>>>> about extending the kernel quirk list for this vs switching to hwdb
>>>> a while a go:
>>>>
>>>> https://lore.kernel.org/linux-usb/[email protected]/
>>>>
>>>> The conclusion back then was to switch to hwdb, but I never got around to this.
>>>
>>> The reason I've added these to the xHCI driver is that it works even if
>>> you are running some really small userspace (like busybox). Also for the
>>> xHCI in TCSS we know for sure that it fully supports D3cold.
>>>
>>> (The one you refer above is actually mistake from my side as I never
>>> tested Alpine Ridge LP controller which I think this is).
>>
>> Ok, so I'll submit a patch adding the 15c1 product-id for the
>> INTEL_ALPINE_RIDGE_LP_2C_XHCI controller to the list of ids for which we
>> set the XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk. To fix the much too high
>> idle-power consumption problem on devices with this Alpine Ridge variant.
>
> Thanks
>
>>
>>>>> Typically we haven't done that for PCH side xHCI controllers though, but
>>>>> I don't see why not if it works that is. Adding Mathias to comment more
>>>>> on that since he is the xHCI maintainer.
>>>>
>>>> If we are also going to enable this for the non TCSS Intel XHCI controllers,
>>>> maybe just uncondtionally enable it for all Intel XHCI controllers, or
>>>> if necessary do a deny-list for some older models and enable it for anything
>>>> not on the deny-list (so all newer models). That should avoid the game of
>>>> whack-a-mole which we will have with this otherwise.
>>>
>>> This is really up to Mathias to decide. I'm fine either way :)
>>
>> Ok, Matthias what do you think about this?
>
> I don't think we are ready to enable runtime pm as default for all Intel xHCI controllers.
> The risk of xHCI not waking up when user plugs a mouse/keyboard, making the system unusable
> just seems too high compared to the powersaving benefit.
>
> The powersaving benefit from autosuspending the TCSS xHCI is a lot better, and we, (Mika mostly)
> has been able to verify they work.
>
> So I propose we for now continue adding TCSS xHCI controllers to the allowlist in kernel.
> For others I think a userspace allow/denylist makes sense.
>
> Long term goal would be default allow for all, with short denylist in kernel.

Ok, thank you for your input on this.

Regards,

Hans

2020-11-24 15:56:16

by Bastien Nocera

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, 2020-11-24 at 14:37 +0200, Mathias Nyman wrote:
> <snip>
> I don't think we are ready to enable runtime pm as default for all
> Intel xHCI controllers.
> The risk of xHCI not waking up when user plugs a mouse/keyboard,
> making the system unusable
> just seems too high compared to the powersaving benefit.
>
> The powersaving benefit from autosuspending the TCSS xHCI is a lot
> better, and we, (Mika mostly)
> has been able to verify they work.
>
> So I propose we for now continue adding TCSS xHCI controllers to the
> allowlist in kernel.
> For others I think a userspace allow/denylist makes sense.
>
> Long term goal would be default allow for all, with short denylist in
> kernel.

Is there any way to preemptively enable autosuspend for all the _TCSS_
xHCI controllers?

This was the problem the original post tried to tease out, whether it
would be easier/better to enable autosuspend by default, and not enable
it on systems where it breaks something, rather than default to sucking
battery until somebody notices that a device ID got missed.

2020-11-24 21:43:45

by Mathias Nyman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On 23.11.2020 15.54, Hans de Goede wrote:
> Hi,
>
> On 11/11/20 3:31 PM, Mika Westerberg wrote:
>> On Wed, Nov 11, 2020 at 12:27:32PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 11/10/20 6:25 PM, Mika Westerberg wrote:
>>>> On Tue, Nov 10, 2020 at 04:02:33PM +0000, Limonciello, Mario wrote:
>>>>>>
>>>>>> On Tue, Nov 10, 2020 at 11:57:07AM +0100, Bastien Nocera wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> systemd has been shipping this script to enable auto-suspend on a
>>>>>>> number of USB and PCI devices:
>>>>>>>
>>>>>> https://github.com/systemd/systemd/blob/master/tools/chromiumos/gen_autosuspen
>>>>>> d_rules.py
>>>>>>>
>>>>>>> The problem here is twofold. First, the list of devices is updated from
>>>>>>> ChromeOS, and the original list obviously won't be updated by ChromeOS
>>>>>>> developers unless a device listed exists in a ChromeBook computer,
>>>>>>> which means a number of devices that do support autosuspend aren't
>>>>>>> listed.
>>>>>>>
>>>>>>> The other problem is that this list needs to exist at all, and that it
>>>>>>> doesn't seem possible for device driver developers (at various levels
>>>>>>> of the stack) to opt-in to auto-suspend when all the variants of the
>>>>>>> device (or at least detectable ones) support auto-suspend.
>>>>>>
>>>>>> A driver can say they support autosuspend today, but I think you are
>>>>>> concerned about the devices that are controlled by class-compliant
>>>>>> drivers, right? And for those, no, we can't do this in the kernel as
>>>>>> there are just too many broken devices out there.
>>>>>>
>>>>>
>>>>> I guess what Bastien is getting at is for newer devices supported by class
>>>>> drivers rather than having to store an allowlist in udev rules, can we set
>>>>> the allowlist in the kernel instead. Then distributions that either don't
>>>>> use systemd or don't regularly update udev rules from systemd can take
>>>>> advantage of better defaults on modern hardware.
>>>>>
>>>>> The one item that stood out to me in that rules file was 8086:a0ed.
>>>>> It's listed as "Volteer XHCI", but that same device ID is actually present
>>>>> in an XPS 9310 in front of me as well and used by the xhci-pci kernel module.
>>>>>
>>>>> Given we're effectively ending up with the combination of runtime PM turned
>>>>> on by udev rules, do we need something like this for that ID:
>>>>>
>>>>> https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400d8a
>>>>>
>>>>> @Mika Westerberg should 8086:a0ed be quirked like the TCSS xHCI too?
>>>>
>>>> I think this one is the TGL PCH xHCI. The quirk currently for xHCI
>>>> controllers that are part of the TCSS (Type-C SubSystem) where it is
>>>> important to put all devices into low power mode whenever possible,
>>>> otherwise it keeps the whole block on.
>>>
>>> Note that there are currently some IDs missing from the xHCIs which
>>> are part of the TCSS too. At least the id for the xHCI in the thunderbolt
>>> controller on the Lenovo T14 gen 1 is missing. I started a discussion
>>> about extending the kernel quirk list for this vs switching to hwdb
>>> a while a go:
>>>
>>> https://lore.kernel.org/linux-usb/[email protected]/
>>>
>>> The conclusion back then was to switch to hwdb, but I never got around to this.
>>
>> The reason I've added these to the xHCI driver is that it works even if
>> you are running some really small userspace (like busybox). Also for the
>> xHCI in TCSS we know for sure that it fully supports D3cold.
>>
>> (The one you refer above is actually mistake from my side as I never
>> tested Alpine Ridge LP controller which I think this is).
>
> Ok, so I'll submit a patch adding the 15c1 product-id for the
> INTEL_ALPINE_RIDGE_LP_2C_XHCI controller to the list of ids for which we
> set the XHCI_DEFAULT_PM_RUNTIME_ALLOW quirk. To fix the much too high
> idle-power consumption problem on devices with this Alpine Ridge variant.

Thanks

>
>>>> Typically we haven't done that for PCH side xHCI controllers though, but
>>>> I don't see why not if it works that is. Adding Mathias to comment more
>>>> on that since he is the xHCI maintainer.
>>>
>>> If we are also going to enable this for the non TCSS Intel XHCI controllers,
>>> maybe just uncondtionally enable it for all Intel XHCI controllers, or
>>> if necessary do a deny-list for some older models and enable it for anything
>>> not on the deny-list (so all newer models). That should avoid the game of
>>> whack-a-mole which we will have with this otherwise.
>>
>> This is really up to Mathias to decide. I'm fine either way :)
>
> Ok, Matthias what do you think about this?

I don't think we are ready to enable runtime pm as default for all Intel xHCI controllers.
The risk of xHCI not waking up when user plugs a mouse/keyboard, making the system unusable
just seems too high compared to the powersaving benefit.

The powersaving benefit from autosuspending the TCSS xHCI is a lot better, and we, (Mika mostly)
has been able to verify they work.

So I propose we for now continue adding TCSS xHCI controllers to the allowlist in kernel.
For others I think a userspace allow/denylist makes sense.

Long term goal would be default allow for all, with short denylist in kernel.

Thanks
Mathias

2020-11-25 02:09:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Tue, Nov 24, 2020 at 05:02:18PM +0100, Bastien Nocera wrote:
> On Wed, 2020-11-11 at 17:32 +0100, Greg KH wrote:
> > On Wed, Nov 11, 2020 at 04:03:30PM +0000, Limonciello, Mario wrote:
> > > > > > Given we're effectively ending up with the combination of
> > > > > > runtime PM turned
> > > > > > on by udev rules, do we need something like this for that ID:
> > > > > >
> > > > > >
> > > > https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400
> > > > d8a
> > > > > >
> > > > > > @Mika Westerberg should 8086:a0ed be quirked like the TCSS
> > > > > > xHCI too?
> > > > >
> > > > > I think this one is the TGL PCH xHCI. The quirk currently for
> > > > > xHCI
> > > > > controllers that are part of the TCSS (Type-C SubSystem) where
> > > > > it is
> > > > > important to put all devices into low power mode whenever
> > > > > possible,
> > > > > otherwise it keeps the whole block on.
> > > >
> > > > Note that there are currently some IDs missing from the xHCIs
> > > > which
> > > > are part of the TCSS too. At least the id for the xHCI in the
> > > > thunderbolt
> > > > controller on the Lenovo T14 gen 1 is missing. I started a
> > > > discussion
> > > > about extending the kernel quirk list for this vs switching to
> > > > hwdb
> > > > a while a go:
> > > >
> > > > https://lore.kernel.org/linux-usb/b8b21ba3-0a8a-ff54-5e12-
> > > > [email protected]/
> > > >
> > > > The conclusion back then was to switch to hwdb, but I never got
> > > > around to
> > > > this.
> > >
> > > I guess the problem I see with switching to a hwdb for this type of
> > > thing is
> > > that if there is a "bug" in your kernel driver around autosuspend
> > > you will
> > > then be potentially causing it to occur more regularly on a kernel
> > > that didn't
> > > necessarily pick up the fix but does have the newer hwdb.
> > >
> > > I don't know how common that will really be though.
> > >
> > > Since Mika mentioned the really light userspace scenario, what
> > > about shipping
> > > the hwdb "with" the kernel in tree?? This could allow evicting all
> > > these quirk
> > > scenarios from the kernel at the same time as switching to a hwdb
> > > and also cover
> > > the problem I suggested might happen with a bug in older kernel and
> > > newer userspace.
> >
> > We took things out of the kernel to put it in hwdb years ago as it
> > was
> > easier for people to update a "text file" than it was their kernel
> > image.? I don't think you want to go backwards here :)
>
> There are (unfortunately) a couple of Linux based OSes that don't use
> systemd, which is one of the problems we see.

You don't have to use systemd to use hwdb. If you want to handle quirks
for hardware issues that are done in userspace, the overall solution for
this in Linux is hwdb. To try to reverse that decision we all made a
long time ago is just going to duplicate work for almost no gain that I
can see.

What distros need this that can not pick this up from hwdb today?

> I think it might be a good idea to have a repository or directory
> that's accessible to same contributions as the drivers, where this sort
> of data is kept, as close to the drivers as possible.

And who is going to maintain that? The data that ends up in hwdb
already comes from multiple places today, why add yet-another-one? Are
you going to somehow unify all of those existing data sources into a
single entity? Who is going to run that service and what would the end
output look like (hint, you would have to provide hwdb support, so why
not just use that?)

> You could always split off your quirks into separate "works with any
> kernel" and "works from this version of the kernel" files,

We don't have those today, that's not a thing.

> the goal here would be to make sure that there is a canonical list of
> devices that can be autosuspended, without user-space always playing
> catch-up (especially as is the case now where systemd is being fed by
> ChromeOS which is fed in some other way).

There is no way we can ever create such a "canonical list". Hint,
another operating system tried it, they failed, and they actually had
partnerships with most hardware vendors, and paid developers to do this
work. What are you going to do differently than they did to solve this
problem?

> The Venn diagram of folks that contribute to hwdb quirks databases in
> systemd and that contribute to kernel drivers has a pretty small
> overlap. Moving much of those quirks to a kernel-controlled repository
> (whatever format it ends up being in) would make sense so that the
> "quirk enablement" and the "driver writing" sections overlap.

But they don't overlap today, why make us do more work for no gain?

thanks,

greg k-h

2020-11-25 02:10:20

by Bastien Nocera

[permalink] [raw]
Subject: Re: How to enable auto-suspend by default

On Wed, 2020-11-11 at 17:32 +0100, Greg KH wrote:
> On Wed, Nov 11, 2020 at 04:03:30PM +0000, Limonciello, Mario wrote:
> > > > > Given we're effectively ending up with the combination of
> > > > > runtime PM turned
> > > > > on by udev rules, do we need something like this for that ID:
> > > > >
> > > > >
> > > https://github.com/torvalds/linux/commit/6a7c533d4a1854f54901a065d8c672e890400
> > > d8a
> > > > >
> > > > > @Mika Westerberg should 8086:a0ed be quirked like the TCSS
> > > > > xHCI too?
> > > >
> > > > I think this one is the TGL PCH xHCI. The quirk currently for
> > > > xHCI
> > > > controllers that are part of the TCSS (Type-C SubSystem) where
> > > > it is
> > > > important to put all devices into low power mode whenever
> > > > possible,
> > > > otherwise it keeps the whole block on.
> > >
> > > Note that there are currently some IDs missing from the xHCIs
> > > which
> > > are part of the TCSS too. At least the id for the xHCI in the
> > > thunderbolt
> > > controller on the Lenovo T14 gen 1 is missing. I started a
> > > discussion
> > > about extending the kernel quirk list for this vs switching to
> > > hwdb
> > > a while a go:
> > >
> > > https://lore.kernel.org/linux-usb/b8b21ba3-0a8a-ff54-5e12-
> > > [email protected]/
> > >
> > > The conclusion back then was to switch to hwdb, but I never got
> > > around to
> > > this.
> >
> > I guess the problem I see with switching to a hwdb for this type of
> > thing is
> > that if there is a "bug" in your kernel driver around autosuspend
> > you will
> > then be potentially causing it to occur more regularly on a kernel
> > that didn't
> > necessarily pick up the fix but does have the newer hwdb.
> >
> > I don't know how common that will really be though.
> >
> > Since Mika mentioned the really light userspace scenario, what
> > about shipping
> > the hwdb "with" the kernel in tree?  This could allow evicting all
> > these quirk
> > scenarios from the kernel at the same time as switching to a hwdb
> > and also cover
> > the problem I suggested might happen with a bug in older kernel and
> > newer userspace.
>
> We took things out of the kernel to put it in hwdb years ago as it
> was
> easier for people to update a "text file" than it was their kernel
> image.  I don't think you want to go backwards here :)

There are (unfortunately) a couple of Linux based OSes that don't use
systemd, which is one of the problems we see.

I think it might be a good idea to have a repository or directory
that's accessible to same contributions as the drivers, where this sort
of data is kept, as close to the drivers as possible.

You could always split off your quirks into separate "works with any
kernel" and "works from this version of the kernel" files, the goal
here would be to make sure that there is a canonical list of devices
that can be autosuspended, without user-space always playing catch-up
(especially as is the case now where systemd is being fed by ChromeOS
which is fed in some other way).

The Venn diagram of folks that contribute to hwdb quirks databases in
systemd and that contribute to kernel drivers has a pretty small
overlap. Moving much of those quirks to a kernel-controlled repository
(whatever format it ends up being in) would make sense so that the
"quirk enablement" and the "driver writing" sections overlap.