Subject: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Greetings -

With mainline kernel 6.6.2+ (and 6.1.63, etc), bluetooth is inoperative
(reports "opcode 0x0c03 failed") on my motherboard's bluetooth adapter
(Intel chipset). Details below.

I reported this in a comment tacked onto bugzilla #218142, but got no
response, so posting here as a possibly new issue.

Details, original email:
----------------------------------------------------------------------
I have a regression going from mainline kernel 6.1.62 to 6.1.63, and
also from kernel 6.6.1 to 6.6.2; I can bisect if patch authors can't
locate the relevant commit. In the most recent kernels mentioned,
bluetooth won't function.

Hardware: ASRock "X470 Taichi" motherboard - on board chipset.
lsusb: ID 8087:0aa7 Intel Corp. Wireless-AC 3168 Bluetooth.
dmesg: Bluetooth: hci0: Legacy ROM 2.x revision 5.0 build 25 week 20 2015
       Bluetooth: hci0: Intel Bluetooth firmware file:
         intel/ibt-hw-37.8.10-fw-22.50.19.14.f.bseq
       Bluetooth: hci0: Intel BT fw patch 0x43 completed & activated
bluez: Version 5.70, bluez firmware version 1.2
Linux kernel firmware: 20231117_7124ce3

On a working kernel (such as 6.6.1), in addition to the dmesg output
above, we have this:
dmesg: Bluetooth: MGMT ver 1.22
       Bluetooth: hci0: Bad flag given (0x1) vs supported (0x0)

On a failed kernel (such as 6.6.2), instead of the good output above, we
have:
dmesg: Bluetooth: hci0: Opcode 0x0c03 failed: -110
       Bluetooth: hci0: Opcode 0x0c03 failed: -110
       ...
repeats several times as bluez attempts to communicate with hci0.
----------------------------------------------------------------------

Since that email was sent, kernel firmware has been updated to
20231128_aae6052, and kernels 6.1.64 and 6.6.3 have been tried with no
change observed.
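
For readers unfamiliar with the log line, here is a minimal sketch of how
the opcode in "Opcode 0x0c03 failed: -110" can be read. It assumes the
standard HCI encoding, where the upper 6 bits of the 16-bit opcode are the
group field (OGF) and the lower 10 bits are the command field (OCF). The
helper names hci_ogf() and hci_ocf() are hypothetical and are not kernel
functions. Under that assumption 0x0c03 decodes to OGF 0x03, OCF 0x003,
which should be the HCI Reset command, and -110 is -ETIMEDOUT on Linux,
i.e. the controller never answered the reset.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel API): split a 16-bit HCI opcode into
 * its Opcode Group Field (upper 6 bits) and Opcode Command Field
 * (lower 10 bits).
 */
static inline uint8_t hci_ogf(uint16_t opcode)
{
	return (uint8_t)(opcode >> 10);
}

static inline uint16_t hci_ocf(uint16_t opcode)
{
	return (uint16_t)(opcode & 0x03ff);
}
```

Calling hci_ogf(0x0c03) and hci_ocf(0x0c03) both yield 3, which is what
points at a plain controller reset timing out rather than at a specific
bluez feature.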

Kris


2023-12-01 06:33:24

by Thorsten Leemhuis

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

CCing a few lists and people. Greg is among them, who might know if this
is a known issue that 6.6.4-rc1 et. al. might already fix.

If that is not the case, I guess we might need a bisection between 6.6.1
and 6.6.2. Knowing whether mainline is affected might be good, too.

Ciao, Thorsten

On 01.12.23 02:54, Kris Karas (Bug Reporting) wrote:
>
> With mainline kernel 6.6.2+ (and 6.1.63, etc), bluetooth is inoperative
> (reports "opcode 0x0c03 failed") on my motherboard's bluetooth adapter
> (Intel chipset).  Details below.
>
> I reported this in a comment tacked onto bugzilla #218142, but got no
> response, so posting here as a possibly new issue.
>
> Details, original email:
> ----------------------------------------------------------------------
> I have a regression going from mainline kernel 6.1.62 to 6.1.63, and
> also from kernel 6.6.1 to 6.6.2; I can bisect if patch authors can't
> locate the relevant commit.  In the most recent kernels mentioned,
> bluetooth won't function.
>
> Hardware: ASRock "X470 Taichi" motherboard - on board chipset.
> lsusb: ID 8087:0aa7 Intel Corp. Wireless-AC 3168 Bluetooth.
> dmesg: Bluetooth: hci0: Legacy ROM 2.x revision 5.0 build 25 week 20 2015
>        Bluetooth: hci0: Intel Bluetooth firmware file:
>          intel/ibt-hw-37.8.10-fw-22.50.19.14.f.bseq
>        Bluetooth: hci0: Intel BT fw patch 0x43 completed & activated
> bluez: Version 5.70, bluez firmware version 1.2
> Linux kernel firmware: 20231117_7124ce3
>
> On a working kernel (such as 6.6.1), in addition to the dmesg output
> above, we have this:
> dmesg: Bluetooth: MGMT ver 1.22
>        Bluetooth: hci0: Bad flag given (0x1) vs supported (0x0)
>
> On a failed kernel (such as 6.6.2), instead of the good output above, we
> have:
> dmesg: Bluetooth: hci0: Opcode 0x0c03 failed: -110
>        Bluetooth: hci0: Opcode 0x0c03 failed: -110
>        ...
> repeats several times as bluez attempts to communicate with hci0.
> ----------------------------------------------------------------------
>
> Since that email was sent, kernel firmware has been updated to
> 20231128_aae6052, and kernels 6.1.64 and 6.6.3 have been tried with no
> change observed.
>
> Kris

2023-12-01 08:15:49

by Greg Kroah-Hartman

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

On Fri, Dec 01, 2023 at 07:33:03AM +0100, Thorsten Leemhuis wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> CCing a few lists and people. Greg is among them, who might know if this
> is a known issue that 6.6.4-rc1 et. al. might already fix.

Not known to me, bisection is needed so we can track down the problem
please.

thanks,

greg k-h

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Bagas Sanjaya wrote:
> Kris Karas (Bug Reporting) wrote:
>> I have a regression going from mainline kernel 6.1.62 to 6.1.63, and also
>> from kernel 6.6.1 to 6.6.2; I can bisect if patch authors can't locate the
>> relevant commit. In the most recent kernels mentioned, bluetooth won't
>> function.
>
> Then please do bisection; without it, nobody will look into this properly.

As only a few people are reporting this, it must be pretty
hardware-specific (or perhaps Kconfig/firmware specific). I'll do a
bisect. A bit too late here in Boston (03:00), and kiddo's birthday
"later today", so will probably get to this on the weekend.

> You may also want to check current mainline (v6.7-rc3) to see if this
> regression has already been fixed.

Just tried 6.7.0-rc3, and it is also affected.

I hadn't git-pulled my linux-stable since May, so that gave me a good
chance to test the very latest. :-) And conveniently I'm now set for
the bisect.

Kris

2023-12-01 08:28:34

by Paul Menzel

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Dear Kris,


Am 01.12.23 um 09:19 schrieb Kris Karas (Bug Reporting):
> Bagas Sanjaya wrote:
>> Kris Karas (Bug Reporting) wrote:
>>> I have a regression going from mainline kernel 6.1.62 to 6.1.63, and
>>> also
>>> from kernel 6.6.1 to 6.6.2; I can bisect if patch authors can't
>>> locate the
>>> relevant commit.  In the most recent kernels mentioned, bluetooth won't
>>> function.
>>
>> Then please do bisection; without it, nobody will look into this
>> properly.
>
> As only a few people are reporting this, it must be pretty
> hardware-specific (or perhaps Kconfig/firmware specific).  I'll do a
> bisect.  A bit too late here in Boston (03:00), and kiddo's birthday
> "later today", so will probably get to this on the weekend.
>
>> You may also want to check current mainline (v6.7-rc3) to see if this
>> regression has already been fixed.
>
> Just tried 6.7.0-rc3, and it is also affected.
>
> I hadn't git-pulled my linux-stable since May, so that gave me a good
> chance to test the very latest.  :-)  And conveniently I'm now set for
> the bisect.

Nice, that is often the fastest way to fix something.

To avoid the time rebooting the system, you could try to expose the
drive to a virtual machine [1].


Kind regards,

Paul


[1]:
https://lore.kernel.org/all/[email protected]/
(The failure in the VM was due to another regression in the Linux
kernel, so the how-to actually worked for me.)

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Greg KH wrote:
> On Fri, Dec 01, 2023 at 07:33:03AM +0100, Thorsten Leemhuis wrote:
>> CCing a few lists and people. Greg is among them, who might know if this
>> is a known issue that 6.6.4-rc1 et. al. might already fix.
>
> Not known to me, bisection is needed so we can track down the problem
> please.

And the winner is...

> commit 14a51fa544225deb9ac2f1f9f3c10dedb29f5d2f
> Author: Basavaraj Natikar <[email protected]>
> Date:   Thu Oct 19 13:29:19 2023 +0300
>
>     xhci: Loosen RPM as default policy to cover for AMD xHC 1.1
>
>     [ Upstream commit 4baf1218150985ee3ab0a27220456a1f027ea0ac ]
>
>     The AMD USB host controller (1022:43f7) isn't going into PCI D3 by default
>     without anything connected. This is because the policy that was introduced
>     by commit a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all
>     xHC 1.2 or later devices") only covered 1.2 or later.
>
> [ snip ]
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index b9ae5c2a2527..bde43cef8846 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -535,6 +535,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>         /* xHC spec requires PCI devices to support D3hot and D3cold */
>         if (xhci->hci_version >= 0x120)
>                 xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> +       else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
> +               xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
>
>         if (xhci->quirks & XHCI_RESET_ON_RESUME)
>                 xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,


Huh, OK, I was expecting this to be a patch made to the bluetooth code,
as it caused bluetoothd to bomb with "opcode 0x0c03 failed". But I just
verified I did the bisect correctly by backing this two-liner out of
vanilla 6.6.3, and bluetooth returned to normal operation. Huzzah!

Just a brief recap:

This bug appears to be rather hardware-specific, as only a few folks
have reported it. In my case, the hardware is an ASrock "X470 Taichi"
motherboard, and its on-board bluetooth hardware, reporting itself as:
lspci: 0f:00.3 USB controller: Advanced Micro Devices, Inc. [AMD]
Zeppelin USB 3.0 xHCI Compliant Host Controller
lsusb: ID 8087:0aa7 Intel Corp. Wireless-AC 3168 Bluetooth

When Basavaraj's patch is applied (in mainline 6.6.2+), bluetooth stops
functioning on my motherboard.

Originally from bugzilla #218142

--
Kris

2023-12-02 07:25:04

by Paul Menzel

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

[Cc: +Mario, Mathias, linux-usb]

Am 02.12.23 um 07:43 schrieb Kris Karas (Bug Reporting):
> Greg KH wrote:
>> On Fri, Dec 01, 2023 at 07:33:03AM +0100, Thorsten Leemhuis wrote:
>>> CCing a few lists and people. Greg is among them, who might know if this
>>> is a known issue that 6.6.4-rc1 et. al. might already fix.
>>
>> Not known to me, bisection is needed so we can track down the problem
>> please.
>
> And the winner is...
>
>> commit 14a51fa544225deb9ac2f1f9f3c10dedb29f5d2f
>> Author: Basavaraj Natikar <[email protected]>
>> Date:   Thu Oct 19 13:29:19 2023 +0300
>>
>>     xhci: Loosen RPM as default policy to cover for AMD xHC 1.1
>>
>>     [ Upstream commit 4baf1218150985ee3ab0a27220456a1f027ea0ac ]
>>
>>     The AMD USB host controller (1022:43f7) isn't going into PCI D3 by default
>>     without anything connected. This is because the policy that was introduced
>>     by commit a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all
>>     xHC 1.2 or later devices") only covered 1.2 or later.
>> [ snip ]
>> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
>> index b9ae5c2a2527..bde43cef8846 100644
>> --- a/drivers/usb/host/xhci-pci.c
>> +++ b/drivers/usb/host/xhci-pci.c
>> @@ -535,6 +535,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>>         /* xHC spec requires PCI devices to support D3hot and D3cold */
>>         if (xhci->hci_version >= 0x120)
>>                 xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
>> +       else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
>> +               xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
>>
>>         if (xhci->quirks & XHCI_RESET_ON_RESUME)
>>                 xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
>
>
> Huh, OK, I was expecting this to be a patch made to the bluetooth code,
> as it caused bluetoothd to bomb with "opcode 0x0c03 failed".  But I just
> verified I did the bisect correctly by backing this two-liner out of
> vanilla 6.6.3, and bluetooth returned to normal operation.  Huzzah!
>
> Just a brief recap:
>
> This bug appears to be rather hardware-specific, as only a few folks
> have reported it.  In my case, the hardware is an ASrock "X470 Taichi"
> motherboard, and its on-board bluetooth hardware, reporting itself as:
> lspci: 0f:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 xHCI Compliant Host Controller
> lsusb: ID 8087:0aa7 Intel Corp. Wireless-AC 3168 Bluetooth
>
> When Basavaraj's patch is applied (in mainline 6.6.2+), bluetooth stops
> functioning on my motherboard.
>
> Originally from bugzilla #218142 [1]
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218142

2023-12-02 07:50:54

by Greg Kroah-Hartman

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

On Sat, Dec 02, 2023 at 08:23:55AM +0100, Paul Menzel wrote:
> [Cc: +Mario, Mathias, linux-usb]
>
> Am 02.12.23 um 07:43 schrieb Kris Karas (Bug Reporting):
> > Greg KH wrote:
> > > On Fri, Dec 01, 2023 at 07:33:03AM +0100, Thorsten Leemhuis wrote:
> > > > CCing a few lists and people. Greg is among them, who might know if this
> > > > is a known issue that 6.6.4-rc1 et. al. might already fix.
> > >
> > > Not known to me, bisection is needed so we can track down the problem
> > > please.
> >
> > And the winner is...
> >
> > > commit 14a51fa544225deb9ac2f1f9f3c10dedb29f5d2f
> > > Author: Basavaraj Natikar <[email protected]>
> > > > Date:   Thu Oct 19 13:29:19 2023 +0300
> > > >
> > > >     xhci: Loosen RPM as default policy to cover for AMD xHC 1.1
> > > >
> > > >     [ Upstream commit 4baf1218150985ee3ab0a27220456a1f027ea0ac ]
> > > >
> > > >     The AMD USB host controller (1022:43f7) isn't going into PCI D3 by default
> > > >     without anything connected. This is because the policy that was introduced
> > > >     by commit a611bf473d1f ("xhci-pci: Set runtime PM as default policy on all
> > > >     xHC 1.2 or later devices") only covered 1.2 or later.
> > > > [ snip ]
> > > > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> > > > index b9ae5c2a2527..bde43cef8846 100644
> > > > --- a/drivers/usb/host/xhci-pci.c
> > > > +++ b/drivers/usb/host/xhci-pci.c
> > > > @@ -535,6 +535,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
> > > >         /* xHC spec requires PCI devices to support D3hot and D3cold */
> > > >         if (xhci->hci_version >= 0x120)
> > > >                 xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> > > > +       else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
> > > > +               xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> > > >
> > > >         if (xhci->quirks & XHCI_RESET_ON_RESUME)
> > > >                 xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
> >
> >
> > Huh, OK, I was expecting this to be a patch made to the bluetooth code,
> > > as it caused bluetoothd to bomb with "opcode 0x0c03 failed".  But I just
> > > verified I did the bisect correctly by backing this two-liner out of
> > > vanilla 6.6.3, and bluetooth returned to normal operation.  Huzzah!
> > >
> > > Just a brief recap:
> > >
> > > This bug appears to be rather hardware-specific, as only a few folks
> > > have reported it.  In my case, the hardware is an ASrock "X470 Taichi"
> > motherboard, and its on-board bluetooth hardware, reporting itself as:
> > lspci: 0f:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 xHCI Compliant Host Controller
> > lsusb: ID 8087:0aa7 Intel Corp. Wireless-AC 3168 Bluetooth
> >
> > When Basavaraj's patch is applied (in mainline 6.6.2+), bluetooth stops
> > functioning on my motherboard.
> >
> > Originally from bugzilla #218142 [1]
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218142

Should already be fixed in the 6.6.3 release, can you please verify that
this is broken there?

thanks,

greg k-h

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Greg KH wrote:
>> Am 02.12.23 um 07:43 schrieb Kris Karas (Bug Reporting):
>>> When Basavaraj's patch is applied (in mainline 6.6.2+), bluetooth stops
>>> functioning on my motherboard.
>>>
>>> Originally from bugzilla #218142 [1]
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218142
>
> Should already be fixed in the 6.6.3 release, can you please verify that
> this is broken there?

Double-checked and confirmed. 6.6.3 shows the bug (hci0: Opcode 0x0c03
failed: -110) and my currently-running system (6.6.3 with
14a51fa544225deb9ac2f1f9f3c10dedb29f5d2f backed out) is running fine
(with its MX Master 3S bluetooth mouse).

Cheers,
Kris

2023-12-02 08:15:22

by Greg Kroah-Hartman

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

On Sat, Dec 02, 2023 at 02:58:32AM -0500, Kris Karas (Bug Reporting) wrote:
> Greg KH wrote:
> > > Am 02.12.23 um 07:43 schrieb Kris Karas (Bug Reporting):
> > > > When Basavaraj's patch is applied (in mainline 6.6.2+), bluetooth stops
> > > > functioning on my motherboard.
> > > >
> > > > Originally from bugzilla #218142 [1]
> > > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218142
> >
> > Should already be fixed in the 6.6.3 release, can you please verify that
> > this is broken there?
>
> Double-checked and confirmed. 6.6.3 shows the bug (hci0: Opcode 0x0c03
> failed: -110) and my currently-running system (6.6.3 with
> 14a51fa544225deb9ac2f1f9f3c10dedb29f5d2f backed out) is running fine (with
> its MX Master 3S bluetooth mouse).

Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
hours for me to release 6.6.4 if you don't want to mess with a -rc
release.

Also, is this showing up in 6.7-rc3? If so, that would be a big help in
tracking this down.

thanks,

greg k-h

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Greg KH wrote:
> Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
> hours for me to release 6.6.4 if you don't want to mess with a -rc
> release.

As I mentioned to Greg off-list (to save wasting other peoples'
bandwidth), I couldn't find 6.6.4-rc1. Looking in wrong git tree? But
6.6.4 is now out, which I have tested and am running at the moment,
albeit with the problem commit from 6.6.2 backed out.

There is no change with respect to this bug. The problematic patch
introduced in 6.6.2 was neither reverted nor amended. The "opcode
0x0c03 failed" lines to the kernel log continue to be present.

> Also, is this showing up in 6.7-rc3? If so, that would be a big help in
> tracking this down.

The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+
and in 6.1.63+. The problematic patch bisected earlier appears
identically (and seems to have been introduced simultaneously) in these
recent releases.

Kris

2023-12-03 08:38:53

by Greg Kroah-Hartman

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

On Sun, Dec 03, 2023 at 03:32:52AM -0500, Kris Karas (Bug Reporting) wrote:
> Greg KH wrote:
> > Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
> > hours for me to release 6.6.4 if you don't want to mess with a -rc
> > release.
>
> As I mentioned to Greg off-list (to save wasting other peoples' bandwidth),
> I couldn't find 6.6.4-rc1. Looking in wrong git tree? But 6.6.4 is now
> out, which I have tested and am running at the moment, albeit with the
> problem commit from 6.6.2 backed out.
>
> There is no change with respect to this bug. The problematic patch
> introduced in 6.6.2 was neither reverted nor amended. The "opcode 0x0c03
> failed" lines to the kernel log continue to be present.
>
> > Also, is this showing up in 6.7-rc3? If so, that would be a big help in
> > tracking this down.
>
> The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+ and
> in 6.1.63+. The problematic patch bisected earlier appears identically (and
> seems to have been introduced simultaneously) in these recent releases.

Ok, in a way, this is good as that means I haven't missed a fix, but bad
in that this does affect everyone more.

So let's start over, you found the offending commit, and nothing has
fixed it, so what do we do? xhci/amd developers, any ideas?

thanks,

greg k-h

2023-12-03 16:16:45

by Basavaraj Natikar

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+


On 12/3/2023 2:08 PM, Greg KH wrote:
> On Sun, Dec 03, 2023 at 03:32:52AM -0500, Kris Karas (Bug Reporting) wrote:
>> Greg KH wrote:
>>> Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
>>> hours for me to release 6.6.4 if you don't want to mess with a -rc
>>> release.
>> As I mentioned to Greg off-list (to save wasting other peoples' bandwidth),
>> I couldn't find 6.6.4-rc1. Looking in wrong git tree? But 6.6.4 is now
>> out, which I have tested and am running at the moment, albeit with the
>> problem commit from 6.6.2 backed out.
>>
>> There is no change with respect to this bug. The problematic patch
>> introduced in 6.6.2 was neither reverted nor amended. The "opcode 0x0c03
>> failed" lines to the kernel log continue to be present.
>>
>>> Also, is this showing up in 6.7-rc3? If so, that would be a big help in
>>> tracking this down.
>> The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+ and
>> in 6.1.63+. The problematic patch bisected earlier appears identically (and
>> seems to have been introduced simultaneously) in these recent releases.
> Ok, in a way, this is good as that means I haven't missed a fix, but bad
> in that this does affect everyone more.
>
> So let's start over, you found the offending commit, and nothing has
> fixed it, so what do we do? xhci/amd developers, any ideas?

Can we enable RPM only on specific AMD xHC 1.1 controllers
instead of covering all AMD xHC 1.1 controllers?

Please find the proposed change below and let me know if it is OK.

Author: Basavaraj Natikar <[email protected]>
Date: Sun Dec 3 18:28:27 2023 +0530

xhci: Remove RPM as default policy to cover AMD xHC 1.1

xHC 1.1 runtime PM as default policy causes issues on a few AMD controllers.
Hence remove RPM as default policy for all AMD xHC 1.1 hosts and enable it
only for the AMD USB host controller (1022:43f7), which has RPM support.

Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
Link: https://lore.kernel.org/all/2023120329-length-strum-9ee1@gregkh
Signed-off-by: Basavaraj Natikar <[email protected]>

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 95ed9404f6f8..7ffd6b8227cc 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -535,7 +535,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
         /* xHC spec requires PCI devices to support D3hot and D3cold */
         if (xhci->hci_version >= 0x120)
                 xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
-        else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
+        else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->vendor == 0x43f7)
                 xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;

         if (xhci->quirks & XHCI_RESET_ON_RESUME)

Thanks,
--
Basavaraj

>
> thanks,
>
> greg k-h


2023-12-03 16:24:55

by Basavaraj Natikar

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+


On 12/3/2023 9:46 PM, Basavaraj Natikar wrote:
> On 12/3/2023 2:08 PM, Greg KH wrote:
>> On Sun, Dec 03, 2023 at 03:32:52AM -0500, Kris Karas (Bug Reporting) wrote:
>>> Greg KH wrote:
>>>> Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
>>>> hours for me to release 6.6.4 if you don't want to mess with a -rc
>>>> release.
>>> As I mentioned to Greg off-list (to save wasting other peoples' bandwidth),
>>> I couldn't find 6.6.4-rc1. Looking in wrong git tree? But 6.6.4 is now
>>> out, which I have tested and am running at the moment, albeit with the
>>> problem commit from 6.6.2 backed out.
>>>
>>> There is no change with respect to this bug. The problematic patch
>>> introduced in 6.6.2 was neither reverted nor amended. The "opcode 0x0c03
>>> failed" lines to the kernel log continue to be present.
>>>
>>>> Also, is this showing up in 6.7-rc3? If so, that would be a big help in
>>>> tracking this down.
>>> The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+ and
>>> in 6.1.63+. The problematic patch bisected earlier appears identically (and
>>> seems to have been introduced simultaneously) in these recent releases.
>> Ok, in a way, this is good as that means I haven't missed a fix, but bad
>> in that this does affect everyone more.
>>
>> So let's start over, you found the offending commit, and nothing has
>> fixed it, so what do we do? xhci/amd developers, any ideas?
> Can we enable RPM on specific controllers for AMD xHC 1.1
> instead to cover all AMD xHC 1.1?
>
> Please find below the proposed changes and let me know if it is OK?
>
> Author: Basavaraj Natikar <[email protected]>
> Date: Sun Dec 3 18:28:27 2023 +0530
>
> xhci: Remove RPM as default policy to cover AMD xHC 1.1
>
> xHC 1.1 runtime PM as default policy causes issues on few AMD controllers.
> Hence remove RPM as default policy to cover AMD xHC 1.1 and add only
> AMD USB host controller (1022:43f7) which has RPM support.
>
> Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
> Link: https://lore.kernel.org/all/2023120329-length-strum-9ee1@gregkh
> Signed-off-by: Basavaraj Natikar <[email protected]>
>
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 95ed9404f6f8..7ffd6b8227cc 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -535,7 +535,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
> /* xHC spec requires PCI devices to support D3hot and D3cold */
> if (xhci->hci_version >= 0x120)
> xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> - else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
> + else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->vendor == 0x43f7)

Sorry, the second comparison should be
pdev->device == 0x43f7

Incorrect line: else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->vendor == 0x43f7)
Correct line:   else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x43f7)
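
To make the difference between the three conditions concrete, here is a
small standalone sketch. It is not kernel code: struct ctrl is a
hypothetical stand-in for the few fields the check reads, VENDOR_AMD is
the 0x1022 vendor ID taken from "1022:43f7", and the function names are
made up for illustration. It only models the if/else-if decision in
xhci_pci_quirks() as quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_AMD 0x1022u /* vendor half of "1022:43f7" */

/* Hypothetical stand-in, not the kernel's struct pci_dev / xhci_hcd. */
struct ctrl {
	uint16_t vendor;      /* PCI vendor ID */
	uint16_t device;      /* PCI device ID */
	uint16_t hci_version; /* e.g. 0x110 for xHC 1.1, 0x120 for 1.2 */
};

/* Policy of commit 4baf12181509: xHC 1.2+, or any AMD xHC 1.1+. */
static inline bool rpm_default_loosened(const struct ctrl *c)
{
	if (c->hci_version >= 0x120)
		return true;
	return c->vendor == VENDOR_AMD && c->hci_version >= 0x110;
}

/* First posting: vendor is compared twice, so the AMD branch can never
 * be true (0x1022 != 0x43f7). */
static inline bool rpm_default_typo(const struct ctrl *c)
{
	if (c->hci_version >= 0x120)
		return true;
	return c->vendor == VENDOR_AMD && c->vendor == 0x43f7;
}

/* Corrected line: only the AMD controller with device ID 0x43f7. */
static inline bool rpm_default_corrected(const struct ctrl *c)
{
	if (c->hci_version >= 0x120)
		return true;
	return c->vendor == VENDOR_AMD && c->device == 0x43f7;
}
```

With the typo, no xHC 1.1 controller gets the default at all, including
1022:43f7; with the corrected line, 1022:43f7 keeps it and every other
AMD xHC 1.1 controller goes back to the old behaviour.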

> xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
>
> if (xhci->quirks & XHCI_RESET_ON_RESUME)
>
> Thanks,
> --
> Basavaraj
>
>> thanks,
>>
>> greg k-h


2023-12-03 19:52:57

by Oleksandr Natalenko

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

Hello.

On neděle 3. prosince 2023 17:24:28 CET Basavaraj Natikar wrote:
>
> On 12/3/2023 9:46 PM, Basavaraj Natikar wrote:
> > On 12/3/2023 2:08 PM, Greg KH wrote:
> >> On Sun, Dec 03, 2023 at 03:32:52AM -0500, Kris Karas (Bug Reporting) wrote:
> >>> Greg KH wrote:
> >>>> Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
> >>>> hours for me to release 6.6.4 if you don't want to mess with a -rc
> >>>> release.
> >>> As I mentioned to Greg off-list (to save wasting other peoples' bandwidth),
> >>> I couldn't find 6.6.4-rc1. Looking in wrong git tree? But 6.6.4 is now
> >>> out, which I have tested and am running at the moment, albeit with the
> >>> problem commit from 6.6.2 backed out.
> >>>
> >>> There is no change with respect to this bug. The problematic patch
> >>> introduced in 6.6.2 was neither reverted nor amended. The "opcode 0x0c03
> >>> failed" lines to the kernel log continue to be present.
> >>>
> >>>> Also, is this showing up in 6.7-rc3? If so, that would be a big help in
> >>>> tracking this down.
> >>> The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+ and
> >>> in 6.1.63+. The problematic patch bisected earlier appears identically (and
> >>> seems to have been introduced simultaneously) in these recent releases.
> >> Ok, in a way, this is good as that means I haven't missed a fix, but bad
> >> in that this does affect everyone more.
> >>
> >> So let's start over, you found the offending commit, and nothing has
> >> fixed it, so what do we do? xhci/amd developers, any ideas?
> > Can we enable RPM on specific controllers for AMD xHC 1.1
> > instead to cover all AMD xHC 1.1?
> >
> > Please find below the proposed changes and let me know if it is OK?
> >
> > Author: Basavaraj Natikar <[email protected]>
> > Date: Sun Dec 3 18:28:27 2023 +0530
> >
> > xhci: Remove RPM as default policy to cover AMD xHC 1.1
> >
> > xHC 1.1 runtime PM as default policy causes issues on few AMD controllers.
> > Hence remove RPM as default policy to cover AMD xHC 1.1 and add only
> > AMD USB host controller (1022:43f7) which has RPM support.
> >
> > Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
> > Link: https://lore.kernel.org/all/2023120329-length-strum-9ee1@gregkh
> > Signed-off-by: Basavaraj Natikar <[email protected]>
> >
> > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> > index 95ed9404f6f8..7ffd6b8227cc 100644
> > --- a/drivers/usb/host/xhci-pci.c
> > +++ b/drivers/usb/host/xhci-pci.c
> > @@ -535,7 +535,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
> > /* xHC spec requires PCI devices to support D3hot and D3cold */
> > if (xhci->hci_version >= 0x120)
> > xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> > - else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
> > + else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->vendor == 0x43f7)
>
> sorry its
> pdev->device == 0x43f7
>
> Incorrect ---> else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->vendor == 0x43f7)
> correct line --> else if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x43f7)
>
> > xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> >
> > if (xhci->quirks & XHCI_RESET_ON_RESUME)

Given the following hardware:

[~]> lspci -nn | grep -i usb
06:00.4 USB controller [0c03]: Realtek Semiconductor Co., Ltd. RTL811x EHCI host controller [10ec:816d] (rev 1a)
07:00.1 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
07:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]
0f:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller [1022:149c]

and v6.6.4 kernel, without this patch:

[~]> LC_TIME=C jctl -kb -1 --grep 'hci version'
Dec 03 13:22:03 archlinux kernel: xhci_hcd 0000:07:00.1: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000200000410
Dec 03 13:22:03 archlinux kernel: xhci_hcd 0000:07:00.3: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000200000410
Dec 03 13:22:03 archlinux kernel: xhci_hcd 0000:0f:00.3: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000200000410

With the patch applied:

[~]> LC_TIME=C jctl -kb --grep 'hci version'
Dec 03 20:46:59 archlinux kernel: xhci_hcd 0000:07:00.1: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000000000410
Dec 03 20:46:59 archlinux kernel: xhci_hcd 0000:07:00.3: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000000000410
Dec 03 20:46:59 archlinux kernel: xhci_hcd 0000:0f:00.3: hcc params 0x0278ffe5 hci version 0x110 quirks 0x0000000000000410

(note the difference in `quirks` as expected)
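
For anyone who wants to check which flag changed, a minimal sketch that
XORs the two `quirks` values from the logs above. The helper name
single_changed_bit() is made up for this example. The result is bit 33,
which I believe corresponds to XHCI_DEFAULT_PM_RUNTIME_ALLOW in
drivers/usb/host/xhci.h; treat that mapping as an assumption and verify it
against the header of the tree you are running.

```c
#include <stdint.h>

/*
 * Return the index of the single bit that differs between two quirk
 * masks, or -1 if they differ in zero bits or in more than one bit.
 */
static inline int single_changed_bit(uint64_t before, uint64_t after)
{
	uint64_t diff = before ^ after;
	int bit = 0;

	/* diff & (diff - 1) is non-zero when more than one bit is set. */
	if (diff == 0 || (diff & (diff - 1)) != 0)
		return -1;
	while ((diff >> bit) != 1)
		bit++;
	return bit;
}
```

Feeding it 0x0000000200000410 (without the patch) and 0x0000000000000410
(with the patch) gives 33, so exactly one quirk bit was dropped and the
rest of the mask is untouched.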

Hence, feel free to add:

Tested-by: Oleksandr Natalenko <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/

Thank you.

> >
> > Thanks,
> > --
> > Basavaraj
> >
> >> thanks,
> >>
> >> greg k-h
>
>
>


--
Oleksandr Natalenko (post-factum)



2023-12-04 09:11:13

by Mathias Nyman

Subject: Re: Regression: Inoperative bluetooth, Intel chipset, mainline kernel 6.6.2+

On 3.12.2023 10.38, Greg KH wrote:
> On Sun, Dec 03, 2023 at 03:32:52AM -0500, Kris Karas (Bug Reporting) wrote:
>> Greg KH wrote:
>>> Thanks for testing, any chance you can try 6.6.4-rc1? Or wait a few
>>> hours for me to release 6.6.4 if you don't want to mess with a -rc
>>> release.
>>
>> As I mentioned to Greg off-list (to save wasting other peoples' bandwidth),
>> I couldn't find 6.6.4-rc1. Looking in wrong git tree? But 6.6.4 is now
>> out, which I have tested and am running at the moment, albeit with the
>> problem commit from 6.6.2 backed out.
>>
>> There is no change with respect to this bug. The problematic patch
>> introduced in 6.6.2 was neither reverted nor amended. The "opcode 0x0c03
>> failed" lines to the kernel log continue to be present.
>>
>>> Also, is this showing up in 6.7-rc3? If so, that would be a big help in
>>> tracking this down.
>>
>> The bug shows up in 6.7-rc3 as well, exactly as it does here in 6.6.2+ and
>> in 6.1.63+. The problematic patch bisected earlier appears identically (and
>> seems to have been introduced simultaneously) in these recent releases.
>
> Ok, in a way, this is good as that means I haven't missed a fix, but bad
> in that this does affect everyone more.
>
> So let's start over, you found the offending commit, and nothing has
> fixed it, so what do we do? xhci/amd developers, any ideas?
> thanks,
>
> greg k-h
>

I suggest reverting these two patches from everywhere (all stable):
a5d6264b638e xhci: Enable RPM on controllers that support low-power states
4baf12181509 xhci: Loosen RPM as default policy to cover for AMD xHC 1.1

Then write a new, well-tested patch that enables runtime PM by default on those
AMD hosts that support it, and add that only to usb-next.

-Mathias





2023-12-04 10:08:12

by Mathias Nyman

Subject: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.

This patch was an attempt to solve issues seen when enabling runtime PM
as default for all AMD 1.1 xHC hosts. See commit 4baf12181509
("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1").

This was not enough; regressions are still seen, so start from a clean
slate and revert both of them.

This patch went to stable and should be reverted from there as well.

Fixes: a5d6264b638e ("xhci: Enable RPM on controllers that support low-power states")
Cc: [email protected]
Cc: Mario Limonciello <[email protected]>
Cc: Basavaraj Natikar <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
---
drivers/usb/host/xhci-pci.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 95ed9404f6f8..bde43cef8846 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -695,9 +695,7 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */
 	pm_runtime_put_noidle(&dev->dev);

-	if (pci_choose_state(dev, PMSG_SUSPEND) == PCI_D0)
-		pm_runtime_forbid(&dev->dev);
-	else if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW)
+	if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW)
 		pm_runtime_allow(&dev->dev);

 	dma_set_max_seg_size(&dev->dev, UINT_MAX);
--
2.25.1
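
For readers following the logic rather than the diff, the decision touched by this revert can be modelled as two small pure functions. This is only a sketch: the enum, the function names, and the bit position used for the quirk are illustrative assumptions. The real driver calls pci_choose_state(), pm_runtime_forbid() and pm_runtime_allow() on the PCI device instead of returning a value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the driver acts on the device directly. */
enum rpm_action { RPM_UNCHANGED, RPM_FORBID, RPM_ALLOW };

/* Assumed bit position for XHCI_DEFAULT_PM_RUNTIME_ALLOW (illustration only). */
#define MODEL_DEFAULT_PM_RUNTIME_ALLOW (1ULL << 33)

/*
 * Probe-time behaviour with a5d6264b638e applied: runtime PM is forbidden
 * when the chosen suspend state is D0, otherwise allowed if the quirk is set.
 */
enum rpm_action probe_rpm_with_commit(bool suspend_state_is_d0, uint64_t quirks)
{
	if (suspend_state_is_d0)
		return RPM_FORBID;
	if (quirks & MODEL_DEFAULT_PM_RUNTIME_ALLOW)
		return RPM_ALLOW;
	return RPM_UNCHANGED;
}

/* Probe-time behaviour with the commit reverted: only the quirk matters. */
enum rpm_action probe_rpm_reverted(uint64_t quirks)
{
	if (quirks & MODEL_DEFAULT_PM_RUNTIME_ALLOW)
		return RPM_ALLOW;
	return RPM_UNCHANGED;
}
```

In this model the two versions disagree only when the suspend state is D0: with the commit, runtime PM is forbidden whether or not the quirk is set; with it reverted, a controller with the quirk set is allowed runtime PM and one without it is left unchanged.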


2023-12-04 10:08:33

by Mathias Nyman

Subject: [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

This reverts commit 4baf1218150985ee3ab0a27220456a1f027ea0ac.

Enabling runtime PM as default for all AMD xHC 1.1 controllers caused
regressions. An initial attempt to fix those was done in commit a5d6264b638e
("xhci: Enable RPM on controllers that support low-power states"), but new
issues are still seen.

Revert them both and start from a clean slate.

This patch went to stable and needs to be reverted from there as well.

Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
Link: https://lore.kernel.org/linux-usb/[email protected]
Cc: Mario Limonciello <[email protected]>
Cc: Basavaraj Natikar <[email protected]>
Cc: [email protected]
Signed-off-by: Mathias Nyman <[email protected]>
---
drivers/usb/host/xhci-pci.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index bde43cef8846..b9ae5c2a2527 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -535,8 +535,6 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	/* xHC spec requires PCI devices to support D3hot and D3cold */
 	if (xhci->hci_version >= 0x120)
 		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
-	else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
-		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;

 	if (xhci->quirks & XHCI_RESET_ON_RESUME)
 		xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
--
2.25.1


2023-12-04 10:14:37

by bluez.test.bot

Subject: RE: [1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

This is an automated email; please do not reply to it.

Dear Submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.

----- Output -----

error: patch failed: drivers/usb/host/xhci-pci.c:695
error: drivers/usb/host/xhci-pci.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch

Please resolve the issue and submit the patches again.


---
Regards,
Linux Bluetooth

2023-12-04 10:49:58

by Basavaraj Natikar

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"


On 12/4/2023 3:38 PM, Mathias Nyman wrote:
> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>
> This patch was an attempt to solve issues seen when enabling runtime PM
> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")

AFAIK, only commit 4baf12181509 causes a regression on AMD xHC 1.1. The patch below
is not a regression patch and is unrelated to AMD xHC 1.1.

Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
in this series is needed; on its own it solves the regression issues.

>
> This was not enough, regressions are still seen, so start from a clean
> slate and revert both of them.
>
> This patch went to stable and should be reverted from there as well
>
> Fixes: a5d6264b638e ("xhci: Enable RPM on controllers that support low-power states")
> Cc: [email protected]
> Cc: Mario Limonciello <[email protected]>
> Cc: Basavaraj Natikar <[email protected]>
> Signed-off-by: Mathias Nyman <[email protected]>
> ---
> drivers/usb/host/xhci-pci.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 95ed9404f6f8..bde43cef8846 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -695,9 +695,7 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	/* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */
>  	pm_runtime_put_noidle(&dev->dev);
>
> -	if (pci_choose_state(dev, PMSG_SUSPEND) == PCI_D0)
> -		pm_runtime_forbid(&dev->dev);
> -	else if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW)
> +	if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW)
>  		pm_runtime_allow(&dev->dev);
>
>  	dma_set_max_seg_size(&dev->dev, UINT_MAX);


2023-12-04 14:21:31

by Mathias Nyman

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

On 4.12.2023 12.49, Basavaraj Natikar wrote:
>
> On 12/4/2023 3:38 PM, Mathias Nyman wrote:
>> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>>
>> This patch was an attempt to solve issues seen when enabling runtime PM
>> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
>> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
>
> AFAK, only 4baf12181509 commit has regression on AMD xHc 1.1 below is not regression
> patch and its unrelated to AMD xHC 1.1.
>
> Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
> alone in this series solves regression issues.
>

Patch a5d6264b638e ("xhci: Enable RPM on controllers that support low-power states")
was originally not supposed to go to stable. It was added later as it solved some
cases triggered by 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
see:
https://lore.kernel.org/linux-usb/[email protected]/

Turns out it wasn't enough.

If we now revert 4baf12181509 "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
I still think it makes sense to also revert a5d6264b638e.
Especially from the stable kernels.

This way we roll back this whole issue to a known working state.

Thanks
Mathias

2023-12-04 14:49:40

by Basavaraj Natikar

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"


On 12/4/2023 7:52 PM, Mathias Nyman wrote:
> On 4.12.2023 12.49, Basavaraj Natikar wrote:
>>
>> On 12/4/2023 3:38 PM, Mathias Nyman wrote:
>>> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>>>
>>> This patch was an attempt to solve issues seen when enabling runtime PM
>>> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
>>> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
>>
>> AFAK, only 4baf12181509 commit has regression on AMD xHc 1.1 below is
>> not regression
>> patch and its unrelated to AMD xHC 1.1.
>>
>> Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover
>> for AMD xHC 1.1"
>> alone in this series solves regression issues.
>>
>
> Patch a5d6264b638e ("xhci: Enable RPM on controllers that support
> low-power states")
> was originally not supposed to go to stable. It was added later as it
> solved some
> cases triggered by 4baf12181509 ("xhci: Loosen RPM as default policy
> to cover for AMD xHC 1.1")
> see:
> https://lore.kernel.org/linux-usb/[email protected]/
>
> Turns out it wasn't enough.
>
> If we now revert 4baf12181509 "xhci: Loosen RPM as default policy to
> cover for AMD xHC 1.1"
> I still think it makes sense to also revert a5d6264b638e.
> Especially from the stable kernels.

Yes, a5d6264b638e still solves other issues when the underlying hardware doesn't support RPM.
If we revert a5d6264b638e on stable releases, then issues unrelated to this regression, on
controllers other than AMD xHC 1.1 (including xHC 1.2), will still exist on stable releases.
If it is reverted, we can backport it to stable releases later if required.

Sure, I will send a follow-up patch to fix 4baf12181509 alone on mainline if it is reverted on all releases.

>
> This way we roll back this whole issue to a known working state.

Sure. If a5d6264b638e, at least, is not reverted on mainline, then I will not resend the same patch.

Thanks,
--
Basavaraj

>
> Thanks
> Mathias


2023-12-04 15:05:59

by Mathias Nyman

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

On 4.12.2023 16.49, Basavaraj Natikar wrote:
>
> On 12/4/2023 7:52 PM, Mathias Nyman wrote:
>> On 4.12.2023 12.49, Basavaraj Natikar wrote:
>>>
>>> On 12/4/2023 3:38 PM, Mathias Nyman wrote:
>>>> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>>>>
>>>> This patch was an attempt to solve issues seen when enabling runtime PM
>>>> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
>>>> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
>>>
>>> AFAK, only 4baf12181509 commit has regression on AMD xHc 1.1 below is
>>> not regression
>>> patch and its unrelated to AMD xHC 1.1.
>>>
>>> Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover
>>> for AMD xHC 1.1"
>>> alone in this series solves regression issues.
>>>
>>
>> Patch a5d6264b638e ("xhci: Enable RPM on controllers that support
>> low-power states")
>> was originally not supposed to go to stable. It was added later as it
>> solved some
>> cases triggered by 4baf12181509 ("xhci: Loosen RPM as default policy
>> to cover for AMD xHC 1.1")
>> see:
>> https://lore.kernel.org/linux-usb/[email protected]/
>>
>> Turns out it wasn't enough.
>>
>> If we now revert 4baf12181509 "xhci: Loosen RPM as default policy to
>> cover for AMD xHC 1.1"
>> I still think it makes sense to also revert a5d6264b638e.
>> Especially from the stable kernels.
>
> Yes , a5d6264b638e still solves other issues if underlying hardware doesn't support RPM
> if we revert a5d6264b638e on stable releases then new issues (not related to regression)
> other than AMD xHC 1.1 controllers including xHC 1.2 will still exist on stable releases.

OK, got it: a5d6264b638e also solves issues other than those exposed by 4baf12181509,
and that one (a5d6264b638e) should originally have been marked for stable.

So only revert 4baf12181509, which is PATCH 2/2 in this series.

Thanks
Mathias

2023-12-04 15:30:01

by Basavaraj Natikar

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"


On 12/4/2023 8:36 PM, Mathias Nyman wrote:
> On 4.12.2023 16.49, Basavaraj Natikar wrote:
>>
>> On 12/4/2023 7:52 PM, Mathias Nyman wrote:
>>> On 4.12.2023 12.49, Basavaraj Natikar wrote:
>>>>
>>>> On 12/4/2023 3:38 PM, Mathias Nyman wrote:
>>>>> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>>>>>
>>>>> This patch was an attempt to solve issues seen when enabling
>>>>> runtime PM
>>>>> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
>>>>> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
>>>>
>>>> AFAK, only 4baf12181509 commit has regression on AMD xHc 1.1 below is
>>>> not regression
>>>> patch and its unrelated to AMD xHC 1.1.
>>>>
>>>> Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover
>>>> for AMD xHC 1.1"
>>>> alone in this series solves regression issues.
>>>>
>>>
>>> Patch a5d6264b638e ("xhci: Enable RPM on controllers that support
>>> low-power states")
>>> was originally not supposed to go to stable. It was added later as it
>>> solved some
>>> cases triggered by 4baf12181509 ("xhci: Loosen RPM as default policy
>>> to cover for AMD xHC 1.1")
>>> see:
>>> https://lore.kernel.org/linux-usb/[email protected]/
>>>
>>> Turns out it wasn't enough.
>>>
>>> If we now revert 4baf12181509 "xhci: Loosen RPM as default policy to
>>> cover for AMD xHC 1.1"
>>> I still think it makes sense to also revert a5d6264b638e.
>>> Especially from the stable kernels.
>>
>> Yes , a5d6264b638e still solves other issues if underlying hardware
>> doesn't support RPM
>> if we revert a5d6264b638e on stable releases then new issues (not
>> related to regression)
>> other than AMD xHC 1.1 controllers including xHC 1.2 will still exist
>> on stable releases.
>
> Ok, got it, so a5d6264b638e also solves other issues than those
> exposed by 4baf12181509.
> And that one (a5d6264b638) should originally have been marked for stable.
>
> So only revert 4baf12181509, PATCH 2/2 in this series

Thank you, that is correct.

Thanks,
--
Basavaraj

>
> Thanks
> Mathias


2023-12-04 23:55:55

by Greg Kroah-Hartman

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

On Mon, Dec 04, 2023 at 08:59:35PM +0530, Basavaraj Natikar wrote:
>
> On 12/4/2023 8:36 PM, Mathias Nyman wrote:
> > On 4.12.2023 16.49, Basavaraj Natikar wrote:
> >>
> >> On 12/4/2023 7:52 PM, Mathias Nyman wrote:
> >>> On 4.12.2023 12.49, Basavaraj Natikar wrote:
> >>>>
> >>>> On 12/4/2023 3:38 PM, Mathias Nyman wrote:
> >>>>> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
> >>>>>
> >>>>> This patch was an attempt to solve issues seen when enabling
> >>>>> runtime PM
> >>>>> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
> >>>>> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
> >>>>
> >>>> AFAK, only 4baf12181509 commit has regression on AMD xHc 1.1 below is
> >>>> not regression
> >>>> patch and its unrelated to AMD xHC 1.1.
> >>>>
> >>>> Only [PATCH 2/2] Revert "xhci: Loosen RPM as default policy to cover
> >>>> for AMD xHC 1.1"
> >>>> alone in this series solves regression issues.
> >>>>
> >>>
> >>> Patch a5d6264b638e ("xhci: Enable RPM on controllers that support
> >>> low-power states")
> >>> was originally not supposed to go to stable. It was added later as it
> >>> solved some
> >>> cases triggered by 4baf12181509 ("xhci: Loosen RPM as default policy
> >>> to cover for AMD xHC 1.1")
> >>> see:
> >>> https://lore.kernel.org/linux-usb/[email protected]/
> >>>
> >>> Turns out it wasn't enough.
> >>>
> >>> If we now revert 4baf12181509 "xhci: Loosen RPM as default policy to
> >>> cover for AMD xHC 1.1"
> >>> I still think it makes sense to also revert a5d6264b638e.
> >>> Especially from the stable kernels.
> >>
> >> Yes , a5d6264b638e still solves other issues if underlying hardware
> >> doesn't support RPM
> >> if we revert a5d6264b638e on stable releases then new issues (not
> >> related to regression)
> >> other than AMD xHC 1.1 controllers including xHC 1.2 will still exist
> >> on stable releases.
> >
> > Ok, got it, so a5d6264b638e also solves other issues than those
> > exposed by 4baf12181509.
> > And that one (a5d6264b638) should originally have been marked for stable.
> >
> > So only revert 4baf12181509, PATCH 2/2 in this series
>
> Thank you, that is correct.

So just take patch 2/2 here, or will someone be sending me a new patch?

thanks,

greg k-h

2023-12-05 09:04:55

by Mathias Nyman

Subject: [PATCH v2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

This reverts commit 4baf1218150985ee3ab0a27220456a1f027ea0ac.

Enabling runtime PM as default for all AMD xHC 1.1 controllers caused
regressions. An initial attempt to fix those was done in commit a5d6264b638e
("xhci: Enable RPM on controllers that support low-power states"), but new
issues are still seen.

Revert this to get those AMD xHC 1.1 systems working.

This patch went to stable and needs to be reverted from there as well.

Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
Link: https://lore.kernel.org/linux-usb/[email protected]
Cc: Mario Limonciello <[email protected]>
Cc: Basavaraj Natikar <[email protected]>
Cc: [email protected]
Signed-off-by: Mathias Nyman <[email protected]>
---
v1 -> v2
Revert only one patch, keep commit a5d6264b638e
Minor commit message changes

drivers/usb/host/xhci-pci.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 95ed9404f6f8..d6fc08e5db8f 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -535,8 +535,6 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	/* xHC spec requires PCI devices to support D3hot and D3cold */
 	if (xhci->hci_version >= 0x120)
 		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
-	else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
-		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;

 	if (xhci->quirks & XHCI_RESET_ON_RESUME)
 		xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
--
2.25.1
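
As a rough illustration of what this revert changes, the quirk selection can be modelled as a pair of predicates on the PCI vendor ID and the reported hci version. This is a sketch under stated assumptions: the function names and the MODEL_ constant are invented for the example, and the real code sets a bit in xhci->quirks rather than returning a bool. The value 0x1022 is the AMD vendor ID as shown in the lspci output earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* AMD PCI vendor ID, as seen in the "[1022:149c]" lspci entries. */
#define MODEL_PCI_VENDOR_ID_AMD 0x1022

/* With 4baf12181509 applied: xHC 1.2+ hosts, plus AMD hosts at xHC 1.1+. */
bool default_rpm_before_revert(uint16_t vendor, uint16_t hci_version)
{
	if (hci_version >= 0x120)
		return true;
	if (vendor == MODEL_PCI_VENDOR_ID_AMD && hci_version >= 0x110)
		return true;
	return false;
}

/* With 4baf12181509 reverted: only xHC 1.2+ hosts, regardless of vendor. */
bool default_rpm_after_revert(uint16_t vendor, uint16_t hci_version)
{
	(void)vendor; /* vendor no longer influences the decision */
	return hci_version >= 0x120;
}
```

For the controllers in the test report earlier in the thread (vendor 0x1022, hci version 0x110), default_rpm_before_revert() returns true and default_rpm_after_revert() returns false, which matches the quirk bit disappearing from the log once the patch is applied.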


2023-12-05 09:13:31

by bluez.test.bot

Subject: RE: [v2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

This is an automated email; please do not reply to it.

Dear Submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.

----- Output -----

error: patch failed: drivers/usb/host/xhci-pci.c:535
error: drivers/usb/host/xhci-pci.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch

Please resolve the issue and submit the patches again.


---
Regards,
Linux Bluetooth

2023-12-05 18:36:53

by Mario Limonciello

Subject: Re: [PATCH v2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

On 12/5/2023 03:05, Mathias Nyman wrote:
> This reverts commit 4baf1218150985ee3ab0a27220456a1f027ea0ac.
>
> Enabling runtime pm as default for all AMD xHC 1.1 controllers caused
> regression. An initial attempt to fix those was done in commit a5d6264b638e
> ("xhci: Enable RPM on controllers that support low-power states") but new
> issues are still seen.
>
> Revert this to get those AMD xHC 1.1 systems working
>
> This patch went to stable an needs to be reverted from there as well.
>
> Fixes: 4baf12181509 ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
> Link: https://lore.kernel.org/linux-usb/[email protected]
> Cc: Mario Limonciello <[email protected]>
> Cc: Basavaraj Natikar <[email protected]>
> Cc: [email protected]
> Signed-off-by: Mathias Nyman <[email protected]>

Reviewed-by: Mario Limonciello <[email protected]>

This presumes that Basavaraj is going to send another patch covering the
specific ID for which it definitely helps, works, and is needed.

> ---
> v1 -> v2
> Revert only one patch, keep commit a5d6264b638
> Minor commit message changes
>
> drivers/usb/host/xhci-pci.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 95ed9404f6f8..d6fc08e5db8f 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -535,8 +535,6 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>  	/* xHC spec requires PCI devices to support D3hot and D3cold */
>  	if (xhci->hci_version >= 0x120)
>  		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
> -	else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
> -		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
>
>  	if (xhci->quirks & XHCI_RESET_ON_RESUME)
>  		xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,


2023-12-15 16:53:34

by patchwork-bot+bluetooth

Subject: Re: [PATCH v2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

Hello:

This patch was applied to bluetooth/bluetooth-next.git (master)
by Greg Kroah-Hartman <[email protected]>:

On Tue, 5 Dec 2023 11:05:48 +0200 you wrote:
> This reverts commit 4baf1218150985ee3ab0a27220456a1f027ea0ac.
>
> Enabling runtime pm as default for all AMD xHC 1.1 controllers caused
> regression. An initial attempt to fix those was done in commit a5d6264b638e
> ("xhci: Enable RPM on controllers that support low-power states") but new
> issues are still seen.
>
> [...]

Here is the summary with links:
- [v2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
https://git.kernel.org/bluetooth/bluetooth-next/c/24be0b3c4059

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2023-12-15 16:53:34

by patchwork-bot+bluetooth

Subject: Re: [PATCH 1/2] Revert "xhci: Enable RPM on controllers that support low-power states"

Hello:

This series was applied to bluetooth/bluetooth-next.git (master)
by Greg Kroah-Hartman <[email protected]>:

On Mon, 4 Dec 2023 12:08:58 +0200 you wrote:
> This reverts commit a5d6264b638efeca35eff72177fd28d149e0764b.
>
> This patch was an attempt to solve issues seen when enabling runtime PM
> as default for all AMD 1.1 xHC hosts. see commit 4baf12181509
> ("xhci: Loosen RPM as default policy to cover for AMD xHC 1.1")
>
> This was not enough, regressions are still seen, so start from a clean
> slate and revert both of them.
>
> [...]

Here is the summary with links:
- [1/2] Revert "xhci: Enable RPM on controllers that support low-power states"
(no matching commit)
- [2/2] Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
https://git.kernel.org/bluetooth/bluetooth-next/c/24be0b3c4059

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html