2023-01-06 11:57:46

by Rasmus Villemoes

[permalink] [raw]
Subject: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks imx8mp

We have an imx8mp board with a lan7801 usb ethernet chip hardwired on
the PCB, which is used as the host port for a Microchip KSZ9567 switch.

While trying to update the kernel to 6.1.y, I found something quite
weird: When the switch was being probed for the second time (the first
ends with a standard -EPROBE_DEFER), the board would spontaneously reset.

Now when I disable the switch driver in .config just to see how far I
could otherwise get, the lan7801 device didn't appear until about 47
seconds after boot. Bisecting unambiguously points at 3497b9a5, and
digging in, it's pretty obvious why that is bogus at least for imx8mp.

The .dtsi file lists IMX8MP_CLK_USB_ROOT as the "suspend" clk, and
clk_get_rate() of that returns 500000000 ; divided by 16000 that's
31250, which certainly doesn't fit in the 13-bit field GCTL_PWRDNSCALE.
But I assume the .dtsi file is wrong, because imx8mq.dtsi has
74bd5951dd3 (arm64: dts: imx8mq: correct usb controller clocks), and it
seems likely from the commit log of 3497b9a5 that it was at least tested
on imx8mq.

Now I have no idea if the right clock for imx8mp is also some 32kHz clk,
but it would certainly make sense; unlike what the reference manual
claims, it seems that the reset value of the GCTL register is
0x00112004, amounting to a pwrdwnscale value of 0x00100000>>19 == 2 ==
32kHz/16kHz, and that could explain why things worked just fine without
3497b9a5.

Li Jun, please either revert 3497b9a5 or figure out if imx8mp.dtsi is
broken and needs a fix similar to 74bd5951dd3.

Rasmus


Subject: Re: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks imx8mp

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; all text you find below is based on a few templates
paragraphs you might have encountered already already in similar form.
See link in footer if these mails annoy you.]

On 06.01.23 12:54, Rasmus Villemoes wrote:
> We have an imx8mp board with a lan7801 usb ethernet chip hardwired on
> the PCB, which is used as the host port for a Microchip KSZ9567 switch.
>
> While trying to update the kernel to 6.1.y, I found something quite
> weird: When the switch was being probed for the second time (the first
> ends with a standard -EPROBE_DEFER), the board would spontaneously reset.
>
> Now when I disable the switch driver in .config just to see how far I
> could otherwise get, the lan7801 device didn't appear until about 47
> seconds after boot. Bisecting unambiguously points at 3497b9a5, and
> digging in, it's pretty obvious why that is bogus at least for imx8mp.
>
> The .dtsi file lists IMX8MP_CLK_USB_ROOT as the "suspend" clk, and
> clk_get_rate() of that returns 500000000 ; divided by 16000 that's
> 31250, which certainly doesn't fit in the 13-bit field GCTL_PWRDNSCALE.
> But I assume the .dtsi file is wrong, because imx8mq.dtsi has
> 74bd5951dd3 (arm64: dts: imx8mq: correct usb controller clocks), and it
> seems likely from the commit log of 3497b9a5 that it was at least tested
> on imx8mq.
>
> Now I have no idea if the right clock for imx8mp is also some 32kHz clk,
> but it would certainly make sense; unlike what the reference manual
> claims, it seems that the reset value of the GCTL register is
> 0x00112004, amounting to a pwrdwnscale value of 0x00100000>>19 == 2 ==
> 32kHz/16kHz, and that could explain why things worked just fine without
> 3497b9a5.
>
> Li Jun, please either revert 3497b9a5 or figure out if imx8mp.dtsi is
> broken and needs a fix similar to 74bd5951dd3.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 3497b9a5c8c
#regzbot title usb: dwc3: imx8mp broken
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (see page linked in footer for details).

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

2023-01-06 13:56:45

by Jun Li

[permalink] [raw]
Subject: RE: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks imx8mp



> -----Original Message-----
> From: Rasmus Villemoes <[email protected]>
> Sent: Friday, January 6, 2023 7:55 PM
> To: Greg Kroah-Hartman <[email protected]>; Jun Li
> <[email protected]>
> Cc: LKML <[email protected]>
> Subject: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks
> imx8mp
>
> We have an imx8mp board with a lan7801 usb ethernet chip hardwired on the
> PCB, which is used as the host port for a Microchip KSZ9567 switch.
>
> While trying to update the kernel to 6.1.y, I found something quite
> weird: When the switch was being probed for the second time (the first ends
> with a standard -EPROBE_DEFER), the board would spontaneously reset.
>
> Now when I disable the switch driver in .config just to see how far I could
> otherwise get, the lan7801 device didn't appear until about 47 seconds after
> boot. Bisecting unambiguously points at 3497b9a5, and digging in, it's pretty
> obvious why that is bogus at least for imx8mp.
>
> The .dtsi file lists IMX8MP_CLK_USB_ROOT as the "suspend" clk, and
> clk_get_rate() of that returns 500000000 ; divided by 16000 that's 31250,
> which certainly doesn't fit in the 13-bit field GCTL_PWRDNSCALE.
> But I assume the .dtsi file is wrong, because imx8mq.dtsi has
> 74bd5951dd3 (arm64: dts: imx8mq: correct usb controller clocks), and it seems
> likely from the commit log of 3497b9a5 that it was at least tested on imx8mq.
>
> Now I have no idea if the right clock for imx8mp is also some 32kHz clk,
> but it would certainly make sense; unlike what the reference manual claims,
> it seems that the reset value of the GCTL register is 0x00112004, amounting
> to a pwrdwnscale value of 0x00100000>>19 == 2 == 32kHz/16kHz, and that could
> explain why things worked just fine without 3497b9a5.
>
> Li Jun, please either revert 3497b9a5 or figure out if imx8mp.dtsi is broken
> and needs a fix similar to 74bd5951dd3.

iMX8MP suspend clock(with name IMX8MP_CLK_USB_ROOT) was 32K when 3497b9a5
was merged, a later iMX8MP clock driver patch change the clock to be root
clock gate(actually this is a shared clock gate for both suspend and root
clock), so break the iMX8MP USB as you are seeing, a fix patch set already
addressed this issue, please check and apply below 3 patches:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/dt-bindings/clock/imx8mp-clock.h?id=5c1f7f1090947d494c30042123e0ec846f696336
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/clk/imx/clk-imx8mp.c?id=ed1f4ccfe947a3e1018a3bd7325134574c7ff9b3
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/arm64/boot/dts/freescale/imx8mp.dtsi?id=8a1ed98fe0f2e7669f0409de0f46f317b275f8be

Li Jun
>
> Rasmus

2023-01-09 09:23:14

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks imx8mp

On 06/01/2023 14.35, Jun Li wrote:
>
>
>> -----Original Message-----
>> From: Rasmus Villemoes <[email protected]>
>> Sent: Friday, January 6, 2023 7:55 PM
>> To: Greg Kroah-Hartman <[email protected]>; Jun Li
>> <[email protected]>

>> Li Jun, please either revert 3497b9a5 or figure out if imx8mp.dtsi is broken
>> and needs a fix similar to 74bd5951dd3.
>
> iMX8MP suspend clock(with name IMX8MP_CLK_USB_ROOT) was 32K when 3497b9a5
> was merged, a later iMX8MP clock driver patch change the clock to be root
> clock gate(actually this is a shared clock gate for both suspend and root
> clock), so break the iMX8MP USB as you are seeing, a fix patch set already
> addressed this issue, please check and apply below 3 patches:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/dt-bindings/clock/imx8mp-clock.h?id=5c1f7f1090947d494c30042123e0ec846f696336
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/clk/imx/clk-imx8mp.c?id=ed1f4ccfe947a3e1018a3bd7325134574c7ff9b3
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/arm64/boot/dts/freescale/imx8mp.dtsi?id=8a1ed98fe0f2e7669f0409de0f46f317b275f8be

Thanks, I see that the first two are in 6.2-rc1 and have already been
picked up in 6.0.y and 6.1.y. The last is not yet in mainline and thus
not eligible for -stable, but does seem to be included in Shawn's PR

https://lore.kernel.org/all/20230102132016.GA10699@T480/

so I assume that it will land shortly.

Rasmus

Subject: Re: commit 3497b9a5 (usb: dwc3: add power down scale setting) breaks imx8mp

[TLDR: This mail in primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 06.01.23 13:05, Linux kernel regression tracking (#adding) wrote:

> On 06.01.23 12:54, Rasmus Villemoes wrote:
>> We have an imx8mp board with a lan7801 usb ethernet chip hardwired on
>> the PCB, which is used as the host port for a Microchip KSZ9567 switch.
>>
>> While trying to update the kernel to 6.1.y, I found something quite
>> weird: When the switch was being probed for the second time (the first
>> ends with a standard -EPROBE_DEFER), the board would spontaneously reset.
> #regzbot ^introduced 3497b9a5c8c
> #regzbot title usb: dwc3: imx8mp broken
> #regzbot ignore-activity

#regzbot fix: 8a1ed98fe0f2e76
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.