2023-02-25 14:29:08

by Linus Walleij

[permalink] [raw]
Subject: Regression in probing some AMBA devices possibly devlink related

Hi Saravana,

I have a boot regression for Ux500 on mainline, but bisecting mainline
isn't quite working for misc reasons :/

I'm not sure about this regression, but it smells like devlink-related.

Ux500 have a bunch of normal and some AMBA devices. After
boot this happens and we hang waiting for rootfs (which is on eMMC):

[ 31.849586] amba 80126000.mmc: deferred probe pending
[ 31.854801] amba 80118000.mmc: deferred probe pending
[ 31.859895] amba 80005000.mmc: deferred probe pending
[ 31.870297] amba 80120000.uart: deferred probe pending
[ 31.875472] amba 80121000.uart: deferred probe pending
[ 31.880689] amba 80004000.i2c: deferred probe pending
[ 31.885799] amba 80128000.i2c: deferred probe pending
[ 31.890932] amba 80110000.i2c: deferred probe pending
[ 51.688361] vmem_3v3: disabling

The last regulator (vmem_3v3) is something the eMMC that didn't
probe was supposed to use.

All the failing drivers are AMBA PrimeCell devices:
drivers/mmc/host/mmci.c
drivers/tty/serial/amba-pl011.c
drivers/i2c/busses/i2c-nomadik.c

This makes me suspect something was done for ordinary (platform)
devices that didn't happen for AMBA devices?

This is the main portion of the device tree containing these
devices and their resources:
arch/arm/boot/dts/ste-dbx5x0.dtsi

Any hints?

Yours,
Linus Walleij


2023-02-25 15:07:03

by Linus Walleij

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sat, Feb 25, 2023 at 3:28 PM Linus Walleij <[email protected]> wrote:

> I have a boot regression for Ux500 on mainline, but bisecting mainline
> isn't quite working for misc reasons :/

I however managed to revert Gregs entire branch for driver core
with:

git revert -m 1 a93e884edf61

And this makes the regression go away, so I'm pretty sure it has
something to do with the device link patches.

Yours,
Linus Walleij

2023-02-25 15:14:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sat, Feb 25, 2023 at 04:06:48PM +0100, Linus Walleij wrote:
> On Sat, Feb 25, 2023 at 3:28 PM Linus Walleij <[email protected]> wrote:
>
> > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > isn't quite working for misc reasons :/
>
> I however managed to revert Gregs entire branch for driver core
> with:
>
> git revert -m 1 a93e884edf61
>
> And this makes the regression go away, so I'm pretty sure it has
> something to do with the device link patches.

Does the fix at:
https://lore.kernel.org/r/0a92979f6e790737544638e8a4c19b0564e660a2.1676983596.git.geert+renesas@glider.be

resolve this?

thanks,

greg k-h

2023-02-25 18:43:27

by Linus Walleij

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sat, Feb 25, 2023 at 4:14 PM Greg KH <[email protected]> wrote:
> On Sat, Feb 25, 2023 at 04:06:48PM +0100, Linus Walleij wrote:
> > On Sat, Feb 25, 2023 at 3:28 PM Linus Walleij <[email protected]> wrote:
> >
> > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > isn't quite working for misc reasons :/
> >
> > I however managed to revert Gregs entire branch for driver core
> > with:
> >
> > git revert -m 1 a93e884edf61
> >
> > And this makes the regression go away, so I'm pretty sure it has
> > something to do with the device link patches.
>
> Does the fix at:
> https://lore.kernel.org/r/0a92979f6e790737544638e8a4c19b0564e660a2.1676983596.git.geert+renesas@glider.be
>
> resolve this?

Sadly no.

It seems pretty different, my problem is eternal probe deferral.

That patch looks good though.

Yours,
Linus Walleij

2023-02-25 23:56:23

by Saravana Kannan

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <[email protected]> wrote:
>
> Hi Saravana,
>
> I have a boot regression for Ux500 on mainline, but bisecting mainline
> isn't quite working for misc reasons :/
>
> I'm not sure about this regression, but it smells like devlink-related.
>
> Ux500 have a bunch of normal and some AMBA devices. After
> boot this happens and we hang waiting for rootfs (which is on eMMC):

Hmmm... my recent changes were definitely tested on systems with AMBA
devices and it worked.

> [ 31.849586] amba 80126000.mmc: deferred probe pending
> [ 31.854801] amba 80118000.mmc: deferred probe pending
> [ 31.859895] amba 80005000.mmc: deferred probe pending
> [ 31.870297] amba 80120000.uart: deferred probe pending
> [ 31.875472] amba 80121000.uart: deferred probe pending
> [ 31.880689] amba 80004000.i2c: deferred probe pending
> [ 31.885799] amba 80128000.i2c: deferred probe pending
> [ 31.890932] amba 80110000.i2c: deferred probe pending
> [ 51.688361] vmem_3v3: disabling

What does /debug/devices_deferred say about these? That should tell us
exactly what's blocking it.

Also if you convert all the pr_debug and dev_dbg in
drivers/base/core.c that should give us enough of an idea of what's
happening. Can you do that too and send it as an attachment (I logs
are hard to read when they get word wrapped)?

> The last regulator (vmem_3v3) is something the eMMC that didn't
> probe was supposed to use.
>
> All the failing drivers are AMBA PrimeCell devices:
> drivers/mmc/host/mmci.c
> drivers/tty/serial/amba-pl011.c
> drivers/i2c/busses/i2c-nomadik.c
>
> This makes me suspect something was done for ordinary (platform)
> devices that didn't happen for AMBA devices?
>
> This is the main portion of the device tree containing these
> devices and their resources:
> arch/arm/boot/dts/ste-dbx5x0.dtsi

I'll take a closer look on Monday. Btw, I always prefer the dts file
because there's always some property that adds a dependency and that
might be the clue to whatever is broken. But I'll take a look at this.

It's unlikely the patch attached might fix it, but can you give it a shot?

-Saravana


Attachments:
0001-add-class-support-to-device-links.patch (1.53 kB)

2023-02-26 21:57:41

by Linus Walleij

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <[email protected]> wrote:
> On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <[email protected]> wrote:
> >
> > Hi Saravana,
> >
> > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > isn't quite working for misc reasons :/
> >
> > I'm not sure about this regression, but it smells like devlink-related.
> >
> > Ux500 have a bunch of normal and some AMBA devices. After
> > boot this happens and we hang waiting for rootfs (which is on eMMC):
>
> Hmmm... my recent changes were definitely tested on systems with AMBA
> devices and it worked.

Some of them work, such as the TWD, interrupt controller GIC
and the CoreSight stuff.

> > [ 31.849586] amba 80126000.mmc: deferred probe pending
> > [ 31.854801] amba 80118000.mmc: deferred probe pending
> > [ 31.859895] amba 80005000.mmc: deferred probe pending
> > [ 31.870297] amba 80120000.uart: deferred probe pending
> > [ 31.875472] amba 80121000.uart: deferred probe pending
> > [ 31.880689] amba 80004000.i2c: deferred probe pending
> > [ 31.885799] amba 80128000.i2c: deferred probe pending
> > [ 31.890932] amba 80110000.i2c: deferred probe pending
> > [ 51.688361] vmem_3v3: disabling
>
> What does /debug/devices_deferred say about these? That should tell us
> exactly what's blocking it.

Can't get to check that sadly, because the root fs is on eMMC and
that one is deferring indefinitely.

I tried to compile in an initramfs but that became too big for this
target. I'll check if I can use that on another target which I
can boot directly from RAM.

> Also if you convert all the pr_debug and dev_dbg in
> drivers/base/core.c that should give us enough of an idea of what's
> happening. Can you do that too and send it as an attachment (I logs
> are hard to read when they get word wrapped)?

This I could do!

See attachment.

It seems these devices are waiting for the PRCC reset-controller
which is in
drivers/clk/ux500/reset-prcc.c

Very well! I went into
arch/arm/boot/dts/ste-dbx5x0.dtsi
and commented out all "resets = < &prcc....>; statements,
and indeed then the system comes up fine! (The resets are
not vital to boot the system.)

But this is weird because the driver for the resets is spawn from
the same device node that is spawning the clocks...
All resources (clocks and resets) are spawn from here:
drivers/clk/ux500/u8500_of_clk.c

From these nodes in the DTS
arch/arm/boot/dts/ste-dbx5x0.dtsi:

clocks {
compatible = "stericsson,u8500-clks";
/*
* Registers for the CLKRST block on peripheral
* groups 1, 2, 3, 5, 6,
*/
reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
<0x8000f000 0x1000>, <0xa03ff000 0x1000>,
<0xa03cf000 0x1000>;

prcmu_clk: prcmu-clock {
#clock-cells = <1>;
};

prcc_pclk: prcc-periph-clock {
#clock-cells = <2>;
};

prcc_kclk: prcc-kernel-clock {
#clock-cells = <2>;
};

prcc_reset: prcc-reset-controller {
#reset-cells = <2>;
};

rtc_clk: rtc32k-clock {
#clock-cells = <0>;
};

smp_twd_clk: smp-twd-clock {
#clock-cells = <0>;
};

clkout_clk: clkout-clock {
/* Cell 1 id, cell 2 source, cell 3 div */
#clock-cells = <3>;
};
};

So it seems like the device core things that the
resets are never coming online. But they are! I put some
debug prints into the PRCC reset driver to make sure.

diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
index f7e48941fbc7..b1ed9a48cdfb 100644
--- a/drivers/clk/ux500/reset-prcc.c
+++ b/drivers/clk/ux500/reset-prcc.c
@@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
struct u8500_prcc_reset *ur)
ret = reset_controller_register(rcdev);
if (ret)
pr_err("PRCC failed to register reset controller\n");
+
+ pr_info("PRCC reset controller registered\n");
}

[ 0.000000] PRCC reset controller registered

It is registered as early as the clocks.

So it seems something is fishy with the reset resource resolution?

> > The last regulator (vmem_3v3) is something the eMMC that didn't
> > probe was supposed to use.
> >
> > All the failing drivers are AMBA PrimeCell devices:
> > drivers/mmc/host/mmci.c
> > drivers/tty/serial/amba-pl011.c
> > drivers/i2c/busses/i2c-nomadik.c
> >
> > This makes me suspect something was done for ordinary (platform)
> > devices that didn't happen for AMBA devices?
> >
> > This is the main portion of the device tree containing these
> > devices and their resources:
> > arch/arm/boot/dts/ste-dbx5x0.dtsi
>
> I'll take a closer look on Monday. Btw, I always prefer the dts file
> because there's always some property that adds a dependency and that
> might be the clue to whatever is broken. But I'll take a look at this.

The top level DTS for this system I'm testing on (source for the
log) is
arch/arm/boot/dts/ste-ux500-samsung-golden.dts

> It's unlikely the patch attached might fix it, but can you give it a shot?

Sadly does not help!

Yours,
Linus Walleij


Attachments:
bar.txt (140.34 kB)

2023-02-27 21:08:02

by Linus Walleij

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Sun, Feb 26, 2023 at 10:55 PM Linus Walleij <[email protected]> wrote:
> On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <[email protected]> wrote:
> > On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <[email protected]> wrote:
> > >
> > > Hi Saravana,
> > >
> > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > isn't quite working for misc reasons :/
> > >
> > > I'm not sure about this regression, but it smells like devlink-related.
> > >
> > > Ux500 have a bunch of normal and some AMBA devices. After
> > > boot this happens and we hang waiting for rootfs (which is on eMMC):
> >
> > Hmmm... my recent changes were definitely tested on systems with AMBA
> > devices and it worked.
>
> Some of them work, such as the TWD, interrupt controller GIC
> and the CoreSight stuff.
>
> > > [ 31.849586] amba 80126000.mmc: deferred probe pending
> > > [ 31.854801] amba 80118000.mmc: deferred probe pending
> > > [ 31.859895] amba 80005000.mmc: deferred probe pending
> > > [ 31.870297] amba 80120000.uart: deferred probe pending
> > > [ 31.875472] amba 80121000.uart: deferred probe pending
> > > [ 31.880689] amba 80004000.i2c: deferred probe pending
> > > [ 31.885799] amba 80128000.i2c: deferred probe pending
> > > [ 31.890932] amba 80110000.i2c: deferred probe pending
> > > [ 51.688361] vmem_3v3: disabling
> >
> > What does /debug/devices_deferred say about these? That should tell us
> > exactly what's blocking it.
>
> Can't get to check that sadly, because the root fs is on eMMC and
> that one is deferring indefinitely.
>
> I tried to compile in an initramfs but that became too big for this
> target. I'll check if I can use that on another target which I
> can boot directly from RAM.
>
> > Also if you convert all the pr_debug and dev_dbg in
> > drivers/base/core.c that should give us enough of an idea of what's
> > happening. Can you do that too and send it as an attachment (I logs
> > are hard to read when they get word wrapped)?
>
> This I could do!
>
> See attachment.
>
> It seems these devices are waiting for the PRCC reset-controller
> which is in
> drivers/clk/ux500/reset-prcc.c
>
> Very well! I went into
> arch/arm/boot/dts/ste-dbx5x0.dtsi
> and commented out all "resets = < &prcc....>; statements,
> and indeed then the system comes up fine! (The resets are
> not vital to boot the system.)
>
> But this is weird because the driver for the resets is spawn from
> the same device node that is spawning the clocks...
> All resources (clocks and resets) are spawn from here:
> drivers/clk/ux500/u8500_of_clk.c
>
> From these nodes in the DTS
> arch/arm/boot/dts/ste-dbx5x0.dtsi:
>
> clocks {
> compatible = "stericsson,u8500-clks";
> /*
> * Registers for the CLKRST block on peripheral
> * groups 1, 2, 3, 5, 6,
> */
> reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
> <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
> <0xa03cf000 0x1000>;
>
> prcmu_clk: prcmu-clock {
> #clock-cells = <1>;
> };
>
> prcc_pclk: prcc-periph-clock {
> #clock-cells = <2>;
> };
>
> prcc_kclk: prcc-kernel-clock {
> #clock-cells = <2>;
> };
>
> prcc_reset: prcc-reset-controller {
> #reset-cells = <2>;
> };
>
> rtc_clk: rtc32k-clock {
> #clock-cells = <0>;
> };
>
> smp_twd_clk: smp-twd-clock {
> #clock-cells = <0>;
> };
>
> clkout_clk: clkout-clock {
> /* Cell 1 id, cell 2 source, cell 3 div */
> #clock-cells = <3>;
> };
> };
>
> So it seems like the device core things that the
> resets are never coming online. But they are! I put some
> debug prints into the PRCC reset driver to make sure.
>
> diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
> index f7e48941fbc7..b1ed9a48cdfb 100644
> --- a/drivers/clk/ux500/reset-prcc.c
> +++ b/drivers/clk/ux500/reset-prcc.c
> @@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
> struct u8500_prcc_reset *ur)
> ret = reset_controller_register(rcdev);
> if (ret)
> pr_err("PRCC failed to register reset controller\n");
> +
> + pr_info("PRCC reset controller registered\n");
> }
>
> [ 0.000000] PRCC reset controller registered
>
> It is registered as early as the clocks.
>
> So it seems something is fishy with the reset resource resolution?
>
> > > The last regulator (vmem_3v3) is something the eMMC that didn't
> > > probe was supposed to use.
> > >
> > > All the failing drivers are AMBA PrimeCell devices:
> > > drivers/mmc/host/mmci.c
> > > drivers/tty/serial/amba-pl011.c
> > > drivers/i2c/busses/i2c-nomadik.c
> > >
> > > This makes me suspect something was done for ordinary (platform)
> > > devices that didn't happen for AMBA devices?
> > >
> > > This is the main portion of the device tree containing these
> > > devices and their resources:
> > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> >
> > I'll take a closer look on Monday. Btw, I always prefer the dts file
> > because there's always some property that adds a dependency and that
> > might be the clue to whatever is broken. But I'll take a look at this.
>
> The top level DTS for this system I'm testing on (source for the
> log) is
> arch/arm/boot/dts/ste-ux500-samsung-golden.dts
>
> > It's unlikely the patch attached might fix it, but can you give it a shot?
>
> Sadly does not help!

Did this mail go through with the attachment?

I have pastebin:ed it here as well:
https://pastebin.com/Fwc7E1Sm

Yours,
Linus Walleij

2023-02-27 22:01:33

by Saravana Kannan

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Mon, Feb 27, 2023 at 1:07 PM Linus Walleij <[email protected]> wrote:
>
> On Sun, Feb 26, 2023 at 10:55 PM Linus Walleij <[email protected]> wrote:
> > On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <[email protected]> wrote:
> > > On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <[email protected]> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > > isn't quite working for misc reasons :/
> > > >
> > > > I'm not sure about this regression, but it smells like devlink-related.
> > > >
> > > > Ux500 have a bunch of normal and some AMBA devices. After
> > > > boot this happens and we hang waiting for rootfs (which is on eMMC):
> > >
> > > Hmmm... my recent changes were definitely tested on systems with AMBA
> > > devices and it worked.
> >
> > Some of them work, such as the TWD, interrupt controller GIC
> > and the CoreSight stuff.
> >
> > > > [ 31.849586] amba 80126000.mmc: deferred probe pending
> > > > [ 31.854801] amba 80118000.mmc: deferred probe pending
> > > > [ 31.859895] amba 80005000.mmc: deferred probe pending
> > > > [ 31.870297] amba 80120000.uart: deferred probe pending
> > > > [ 31.875472] amba 80121000.uart: deferred probe pending
> > > > [ 31.880689] amba 80004000.i2c: deferred probe pending
> > > > [ 31.885799] amba 80128000.i2c: deferred probe pending
> > > > [ 31.890932] amba 80110000.i2c: deferred probe pending
> > > > [ 51.688361] vmem_3v3: disabling
> > >
> > > What does /debug/devices_deferred say about these? That should tell us
> > > exactly what's blocking it.
> >
> > Can't get to check that sadly, because the root fs is on eMMC and
> > that one is deferring indefinitely.
> >
> > I tried to compile in an initramfs but that became too big for this
> > target. I'll check if I can use that on another target which I
> > can boot directly from RAM.
> >
> > > Also if you convert all the pr_debug and dev_dbg in
> > > drivers/base/core.c that should give us enough of an idea of what's
> > > happening. Can you do that too and send it as an attachment (I logs
> > > are hard to read when they get word wrapped)?
> >
> > This I could do!
> >
> > See attachment.
> >
> > It seems these devices are waiting for the PRCC reset-controller
> > which is in
> > drivers/clk/ux500/reset-prcc.c
> >
> > Very well! I went into
> > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > and commented out all "resets = < &prcc....>; statements,
> > and indeed then the system comes up fine! (The resets are
> > not vital to boot the system.)
> >
> > But this is weird because the driver for the resets is spawn from
> > the same device node that is spawning the clocks...
> > All resources (clocks and resets) are spawn from here:
> > drivers/clk/ux500/u8500_of_clk.c
> >
> > From these nodes in the DTS
> > arch/arm/boot/dts/ste-dbx5x0.dtsi:
> >
> > clocks {
> > compatible = "stericsson,u8500-clks";
> > /*
> > * Registers for the CLKRST block on peripheral
> > * groups 1, 2, 3, 5, 6,
> > */
> > reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
> > <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
> > <0xa03cf000 0x1000>;
> >
> > prcmu_clk: prcmu-clock {
> > #clock-cells = <1>;
> > };
> >
> > prcc_pclk: prcc-periph-clock {
> > #clock-cells = <2>;
> > };
> >
> > prcc_kclk: prcc-kernel-clock {
> > #clock-cells = <2>;
> > };
> >
> > prcc_reset: prcc-reset-controller {
> > #reset-cells = <2>;
> > };
> >
> > rtc_clk: rtc32k-clock {
> > #clock-cells = <0>;
> > };
> >
> > smp_twd_clk: smp-twd-clock {
> > #clock-cells = <0>;
> > };
> >
> > clkout_clk: clkout-clock {
> > /* Cell 1 id, cell 2 source, cell 3 div */
> > #clock-cells = <3>;
> > };
> > };
> >
> > So it seems like the device core things that the
> > resets are never coming online. But they are! I put some
> > debug prints into the PRCC reset driver to make sure.
> >
> > diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
> > index f7e48941fbc7..b1ed9a48cdfb 100644
> > --- a/drivers/clk/ux500/reset-prcc.c
> > +++ b/drivers/clk/ux500/reset-prcc.c
> > @@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
> > struct u8500_prcc_reset *ur)
> > ret = reset_controller_register(rcdev);
> > if (ret)
> > pr_err("PRCC failed to register reset controller\n");
> > +
> > + pr_info("PRCC reset controller registered\n");
> > }
> >
> > [ 0.000000] PRCC reset controller registered
> >
> > It is registered as early as the clocks.
> >
> > So it seems something is fishy with the reset resource resolution?
> >
> > > > The last regulator (vmem_3v3) is something the eMMC that didn't
> > > > probe was supposed to use.
> > > >
> > > > All the failing drivers are AMBA PrimeCell devices:
> > > > drivers/mmc/host/mmci.c
> > > > drivers/tty/serial/amba-pl011.c
> > > > drivers/i2c/busses/i2c-nomadik.c
> > > >
> > > > This makes me suspect something was done for ordinary (platform)
> > > > devices that didn't happen for AMBA devices?
> > > >
> > > > This is the main portion of the device tree containing these
> > > > devices and their resources:
> > > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > >
> > > I'll take a closer look on Monday. Btw, I always prefer the dts file
> > > because there's always some property that adds a dependency and that
> > > might be the clue to whatever is broken. But I'll take a look at this.
> >
> > The top level DTS for this system I'm testing on (source for the
> > log) is
> > arch/arm/boot/dts/ste-ux500-samsung-golden.dts
> >
> > > It's unlikely the patch attached might fix it, but can you give it a shot?
> >
> > Sadly does not help!
>
> Did this mail go through with the attachment?

Attachment came through for me. I should be able to give you a test
patch this week. My guess on what the issue is: I have to walk up to
the parents and check for FWNODE_FLAG_DEV_INITIALIZED or set it
recursively for all the child nodes (termination condition being when
the fwnode has a device).

-Saravana

>
> I have pastebin:ed it here as well:
> https://pastebin.com/Fwc7E1Sm
>
> Yours,
> Linus Walleij

2023-03-01 01:28:33

by Saravana Kannan

[permalink] [raw]
Subject: Re: Regression in probing some AMBA devices possibly devlink related

On Mon, Feb 27, 2023 at 2:00 PM Saravana Kannan <[email protected]> wrote:
>
> On Mon, Feb 27, 2023 at 1:07 PM Linus Walleij <[email protected]> wrote:
> >
> > On Sun, Feb 26, 2023 at 10:55 PM Linus Walleij <[email protected]> wrote:
> > > On Sun, Feb 26, 2023 at 12:56 AM Saravana Kannan <[email protected]> wrote:
> > > > On Sat, Feb 25, 2023 at 6:29 AM Linus Walleij <[email protected]> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > I have a boot regression for Ux500 on mainline, but bisecting mainline
> > > > > isn't quite working for misc reasons :/
> > > > >
> > > > > I'm not sure about this regression, but it smells like devlink-related.
> > > > >
> > > > > Ux500 have a bunch of normal and some AMBA devices. After
> > > > > boot this happens and we hang waiting for rootfs (which is on eMMC):
> > > >
> > > > Hmmm... my recent changes were definitely tested on systems with AMBA
> > > > devices and it worked.
> > >
> > > Some of them work, such as the TWD, interrupt controller GIC
> > > and the CoreSight stuff.
> > >
> > > > > [ 31.849586] amba 80126000.mmc: deferred probe pending
> > > > > [ 31.854801] amba 80118000.mmc: deferred probe pending
> > > > > [ 31.859895] amba 80005000.mmc: deferred probe pending
> > > > > [ 31.870297] amba 80120000.uart: deferred probe pending
> > > > > [ 31.875472] amba 80121000.uart: deferred probe pending
> > > > > [ 31.880689] amba 80004000.i2c: deferred probe pending
> > > > > [ 31.885799] amba 80128000.i2c: deferred probe pending
> > > > > [ 31.890932] amba 80110000.i2c: deferred probe pending
> > > > > [ 51.688361] vmem_3v3: disabling
> > > >
> > > > What does /debug/devices_deferred say about these? That should tell us
> > > > exactly what's blocking it.
> > >
> > > Can't get to check that sadly, because the root fs is on eMMC and
> > > that one is deferring indefinitely.
> > >
> > > I tried to compile in an initramfs but that became too big for this
> > > target. I'll check if I can use that on another target which I
> > > can boot directly from RAM.
> > >
> > > > Also if you convert all the pr_debug and dev_dbg in
> > > > drivers/base/core.c that should give us enough of an idea of what's
> > > > happening. Can you do that too and send it as an attachment (I logs
> > > > are hard to read when they get word wrapped)?
> > >
> > > This I could do!
> > >
> > > See attachment.
> > >
> > > It seems these devices are waiting for the PRCC reset-controller
> > > which is in
> > > drivers/clk/ux500/reset-prcc.c
> > >
> > > Very well! I went into
> > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > > and commented out all "resets = < &prcc....>; statements,
> > > and indeed then the system comes up fine! (The resets are
> > > not vital to boot the system.)
> > >
> > > But this is weird because the driver for the resets is spawn from
> > > the same device node that is spawning the clocks...
> > > All resources (clocks and resets) are spawn from here:
> > > drivers/clk/ux500/u8500_of_clk.c
> > >
> > > From these nodes in the DTS
> > > arch/arm/boot/dts/ste-dbx5x0.dtsi:
> > >
> > > clocks {
> > > compatible = "stericsson,u8500-clks";
> > > /*
> > > * Registers for the CLKRST block on peripheral
> > > * groups 1, 2, 3, 5, 6,
> > > */
> > > reg = <0x8012f000 0x1000>, <0x8011f000 0x1000>,
> > > <0x8000f000 0x1000>, <0xa03ff000 0x1000>,
> > > <0xa03cf000 0x1000>;
> > >
> > > prcmu_clk: prcmu-clock {
> > > #clock-cells = <1>;
> > > };
> > >
> > > prcc_pclk: prcc-periph-clock {
> > > #clock-cells = <2>;
> > > };
> > >
> > > prcc_kclk: prcc-kernel-clock {
> > > #clock-cells = <2>;
> > > };
> > >
> > > prcc_reset: prcc-reset-controller {
> > > #reset-cells = <2>;
> > > };
> > >
> > > rtc_clk: rtc32k-clock {
> > > #clock-cells = <0>;
> > > };
> > >
> > > smp_twd_clk: smp-twd-clock {
> > > #clock-cells = <0>;
> > > };
> > >
> > > clkout_clk: clkout-clock {
> > > /* Cell 1 id, cell 2 source, cell 3 div */
> > > #clock-cells = <3>;
> > > };
> > > };
> > >
> > > So it seems like the device core things that the
> > > resets are never coming online. But they are! I put some
> > > debug prints into the PRCC reset driver to make sure.
> > >
> > > diff --git a/drivers/clk/ux500/reset-prcc.c b/drivers/clk/ux500/reset-prcc.c
> > > index f7e48941fbc7..b1ed9a48cdfb 100644
> > > --- a/drivers/clk/ux500/reset-prcc.c
> > > +++ b/drivers/clk/ux500/reset-prcc.c
> > > @@ -178,4 +178,6 @@ void u8500_prcc_reset_init(struct device_node *np,
> > > struct u8500_prcc_reset *ur)
> > > ret = reset_controller_register(rcdev);
> > > if (ret)
> > > pr_err("PRCC failed to register reset controller\n");
> > > +
> > > + pr_info("PRCC reset controller registered\n");
> > > }
> > >
> > > [ 0.000000] PRCC reset controller registered
> > >
> > > It is registered as early as the clocks.
> > >
> > > So it seems something is fishy with the reset resource resolution?
> > >
> > > > > The last regulator (vmem_3v3) is something the eMMC that didn't
> > > > > probe was supposed to use.
> > > > >
> > > > > All the failing drivers are AMBA PrimeCell devices:
> > > > > drivers/mmc/host/mmci.c
> > > > > drivers/tty/serial/amba-pl011.c
> > > > > drivers/i2c/busses/i2c-nomadik.c
> > > > >
> > > > > This makes me suspect something was done for ordinary (platform)
> > > > > devices that didn't happen for AMBA devices?
> > > > >
> > > > > This is the main portion of the device tree containing these
> > > > > devices and their resources:
> > > > > arch/arm/boot/dts/ste-dbx5x0.dtsi
> > > >
> > > > I'll take a closer look on Monday. Btw, I always prefer the dts file
> > > > because there's always some property that adds a dependency and that
> > > > might be the clue to whatever is broken. But I'll take a look at this.
> > >
> > > The top level DTS for this system I'm testing on (source for the
> > > log) is
> > > arch/arm/boot/dts/ste-ux500-samsung-golden.dts
> > >
> > > > It's unlikely the patch attached might fix it, but can you give it a shot?
> > >
> > > Sadly does not help!
> >
> > Did this mail go through with the attachment?
>
> Attachment came through for me. I should be able to give you a test
> patch this week. My guess on what the issue is: I have to walk up to
> the parents and check for FWNODE_FLAG_DEV_INITIALIZED or set it
> recursively for all the child nodes (termination condition being when
> the fwnode has a device).

Hi Linus,

Based on your debug about resets and looking at the DT, I'm pretty
sure this should fix your issue. Can you give it a shot please?
https://lore.kernel.org/lkml/[email protected]/

-Saravana