2024-03-23 17:02:55

by Pratham Patel

Subject: Fixing the devicetree of Rock 5 Model B (and possibly others)

Since the introduction of the `of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing` patch, an issue with the devicetree of the Rock 5 Model B has surfaced. All the stable kernels (6.7.y and 6.8.y) work on the Orange Pi 5, which has the Rockchip RK3588S SoC (essentially the RK3588 with less I/O). I own only these two RK3588* SBCs, so going by that comparison, it appears that the Rock 5 Model B's DT is incorrect.

I looked at the patch and tried several things, none of which resulted in anything that would point me to the core issue. Then I tried this:

```
$ grep -C 3 remote-endpoint arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts

        port {
            es8316_p0_0: endpoint {
                remote-endpoint = <&i2s0_8ch_p0_0>;
            };
        };
    };
--
        i2s0_8ch_p0_0: endpoint {
            dai-format = "i2s";
            mclk-fs = <256>;
            remote-endpoint = <&es8316_p0_0>;
        };
    };
};
```
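
The grep output shows the two endpoints naming each other. As an editorial aside, here is a minimal sketch of checking that reciprocity mechanically. It assumes each labelled endpoint node starts on its own line, as in the excerpt. `endpoint_links` and `one_way_links` are hypothetical helper names, and a regex scan only approximates real devicetree parsing:

```python
import re

# Matches a labelled endpoint node, e.g. "es8316_p0_0: endpoint {".
LABEL_RE = re.compile(r"^\s*(\w+):\s*endpoint\b")
# Matches the phandle reference inside a remote-endpoint property.
REMOTE_RE = re.compile(r"remote-endpoint\s*=\s*<&(\w+)>")


def endpoint_links(dts_text):
    """Map each labelled endpoint to the label its remote-endpoint names."""
    links = {}
    current = None
    for line in dts_text.splitlines():
        label = LABEL_RE.match(line)
        if label:
            current = label.group(1)
            continue
        remote = REMOTE_RE.search(line)
        if remote and current is not None:
            links[current] = remote.group(1)
            current = None
    return links


def one_way_links(links):
    """Return labels whose remote endpoint does not point back at them."""
    return sorted(src for src, dst in links.items() if links.get(dst) != src)


# The excerpt from the grep output above, without indentation.
SAMPLE = """
port {
es8316_p0_0: endpoint {
remote-endpoint = <&i2s0_8ch_p0_0>;
};
};
--
i2s0_8ch_p0_0: endpoint {
dai-format = "i2s";
mclk-fs = <256>;
remote-endpoint = <&es8316_p0_0>;
};
"""

links = endpoint_links(SAMPLE)
print(links)
print(one_way_links(links))
```

An empty list from `one_way_links` means every endpoint in the scanned text is referenced back by its partner, which is the case for this excerpt.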

So, from a cursory look, the issue seems to be related either to the DT node for the audio codec or to the es8316's binding itself. Though I doubt that the latter is the issue because if it were, _someone_ with a Pine64 Pinebook Pro would've raised alarms. So far, this seems to be related to `rk3588-rock-5b.dts` and possibly to `rk3588s-rock-5a.dts` too.

I would **love** to help, but I'm afraid device trees are not something that I am at all familiar with. That said, I am open to methods of debugging this issue so that I can provide a fix myself.

I would have replied to the patch's thread but unfortunately, I haven't yet set up neomutt and my email provider's web UI doesn't have a [straightforward] way to reply using the 'In-Reply-To' header, hence a new thread. Apologies for the inconvenience caused.

-- Pratham Patel


Attachments:
publickey - [email protected] - 0xF2DDE54D.asc (669.00 B)
signature.asc (258.00 B)
OpenPGP digital signature

2024-03-23 17:09:27

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

Ugh, just now noticing that I forgot to send the boot log captured over UART and forgot to disable sending the pubkey as an attachment.

The word wrap is broken because of course the web UI isn't mindful of that.

Sorry!

-- Pratham Patel




Attachments:
rock5b-boot.log (29.59 kB)
signature.asc (258.00 B)
OpenPGP digital signature

2024-03-23 17:10:21

by Dragan Simic

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

Hello Pratham,

On 2024-03-23 18:02, Pratham Patel wrote:
> Since the introduction of the `of: property: fw_devlink: Fix stupid
> bug in remote-endpoint parsing` patch, an issue with the device-tree
> of the Rock 5 Model B has been detected. All the stable kernels (6.7.y
> and 6.8.y) work on the Orange Pi 5, which has the Rockchip RK3588S SoC
> (same as the RK3588, but less I/O basically). So, being an owner of
> only two SBCs which use the RK3588* SoC, it appears that the Rock 5
> Model B's DT is incorrect.
>
> I looked at the patch and tried several things, none of which resulted
> in anything that would point me to the core issue. Then I tried this:

Could you, please, clarify a bit what's the actual issue you're
experiencing on your Rock 5B?


by Linux regression tracking (Thorsten Leemhuis)

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On 23.03.24 18:02, Pratham Patel wrote:
> Since the introduction of the `of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing` patch,

There is an earlier bug report asking for a revert of that patch:

https://lore.kernel.org/all/ZfvN5jDrftG-YRG4@titan/

> an issue

Is your problem maybe similar to the one above?

Ciao, Thorsten


2024-03-23 17:24:12

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Saturday, March 23rd, 2024 at 22:47, Linux regression tracking (Thorsten Leemhuis) <[email protected]> wrote:

> On 23.03.24 18:02, Pratham Patel wrote:
> > Since the introduction of the `of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing` patch,
>
> There is an earlier bug report asking for a revert of that patch:
>
> https://lore.kernel.org/all/ZfvN5jDrftG-YRG4@titan/
>
> > an issue
>
> Is your problem maybe similar to the one above?

I don't get that exact message in the boot log but yes.

> Ciao, Thorsten

-- Pratham Patel


Attachments:
publickey - [email protected] - 0xF2DDE54D.asc (669.00 B)
signature.asc (258.00 B)
OpenPGP digital signature

2024-04-01 23:25:45

by Saravana Kannan

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Sat, Mar 23, 2024 at 10:10 AM Dragan Simic <[email protected]> wrote:
>
> Hello Pratham,
>
> On 2024-03-23 18:02, Pratham Patel wrote:
> > Since the introduction of the `of: property: fw_devlink: Fix stupid
> > bug in remote-endpoint parsing` patch, an issue with the device-tree
> > of the Rock 5 Model B has been detected. All the stable kernels (6.7.y
> > and 6.8.y) work on the Orange Pi 5, which has the Rockchip RK3588S SoC
> > (same as the RK3588, but less I/O basically). So, being an owner of
> > only two SBCs which use the RK3588* SoC, it appears that the Rock 5
> > Model B's DT is incorrect.
> >
> > I looked at the patch and tried several things, none of which resulted
> > in anything that would point me to the core issue. Then I tried this:
>
> Could you, please, clarify a bit what's the actual issue you're
> experiencing on your Rock 5B?

Pratham, can you reply to this please? I don't understand your issue
well enough to be able to help.

Also, can you give the output of <debugfs>/devices_deferred for the
good vs bad case?

Thanks,
Saravana


2024-04-02 23:32:47

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Tue Apr 2, 2024 at 4:54 AM IST, Saravana Kannan wrote:
> On Sat, Mar 23, 2024 at 10:10 AM Dragan Simic <[email protected]> wrote:
> >
> > Hello Pratham,
> >
> > On 2024-03-23 18:02, Pratham Patel wrote:
> > > I looked at the patch and tried several things, none of which resulted
> > > in anything that would point me to the core issue. Then I tried this:
> >
> > Could you, please, clarify a bit what's the actual issue you're
> > experiencing on your Rock 5B?
>
> Pratham, can you reply to this please? I don't really understand what
> your issue is for me to be able to help.

Hi,

I apologize for not replying. Somehow, I did not notice the reply from
Dragan. :(

Since this patch was applied, an issue in the Rock 5B's DT has been
unearthed which now results in the kernel being unable to boot properly.

Following is the relevant call trace from the UART capture:

[ 21.595068] Call trace:
[ 21.595288] smp_call_function_many_cond+0x174/0x5f8
[ 21.595728] on_each_cpu_cond_mask+0x2c/0x40
[ 21.596109] cpuidle_register_driver+0x294/0x318
[ 21.596524] cpuidle_register+0x24/0x100
[ 21.596875] psci_cpuidle_probe+0x2e4/0x490
[ 21.597247] platform_probe+0x70/0xd0
[ 21.597575] really_probe+0x18c/0x3d8
[ 21.597905] __driver_probe_device+0x84/0x180
[ 21.598294] driver_probe_device+0x44/0x120
[ 21.598669] __device_attach_driver+0xc4/0x168
[ 21.599063] bus_for_each_drv+0x8c/0xf0
[ 21.599408] __device_attach+0xa4/0x1c0
[ 21.599748] device_initial_probe+0x1c/0x30
[ 21.600118] bus_probe_device+0xb4/0xc0
[ 21.600462] device_add+0x68c/0x888
[ 21.600775] platform_device_add+0x19c/0x270
[ 21.601154] platform_device_register_full+0xdc/0x178
[ 21.601602] psci_idle_init+0xa0/0xc8
[ 21.601934] do_one_initcall+0x60/0x290
[ 21.602275] kernel_init_freeable+0x20c/0x3e0
[ 21.602664] kernel_init+0x2c/0x1f8
[ 21.602979] ret_from_fork+0x10/0x20

> Also, can you give the output of <debugfs>/devices_deferred for the
> good vs bad case?

I can't provide you with the requested output from the bad case, since
the kernel never gets past this point to an initramfs rescue shell, but
the following is the output from v6.8.1 (**with the aforementioned patch
reverted**).

# cat /sys/kernel/debug/devices_deferred
fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
1-0022 typec_fusb302: cannot register tcpm port
fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
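
As an editorial aside, output like the above is easier to compare between a good and a bad boot once it is split into fields. The following is a minimal sketch, assuming each line is a device name, whitespace, and then a reason that may carry a `source: ` prefix, as the three lines above do. `parse_deferred`, `awaited_suppliers`, and `newly_deferred` are hypothetical helper names:

```python
def parse_deferred(text):
    """Split each devices_deferred line into (device, source, reason)."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # device name, then the rest of the line
        if not parts:
            continue  # skip blank lines
        device = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        source, sep, reason = rest.partition(": ")
        if not sep:  # no "source: " prefix, keep the whole text as the reason
            source, reason = "", rest
        entries.append((device, source, reason))
    return entries


def awaited_suppliers(entries):
    """Map each device to the supplier it reports waiting for, if any."""
    prefix = "wait for supplier "
    return {dev: reason[len(prefix):]
            for dev, _, reason in entries if reason.startswith(prefix)}


def newly_deferred(good_entries, bad_entries):
    """Devices deferred in the bad capture but not in the good one."""
    good = {dev for dev, _, _ in good_entries}
    return [dev for dev, _, _ in bad_entries if dev not in good]


# The capture quoted above (v6.8.1 with the patch reverted).
SAMPLE = """\
fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
1-0022 typec_fusb302: cannot register tcpm port
fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
"""

entries = parse_deferred(SAMPLE)
print(awaited_suppliers(entries))
```

With a second capture from a failing boot, `newly_deferred(parse_deferred(good), parse_deferred(bad))` would list the devices that are deferred only in the bad case.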

It seems that v6.8.2 works _without needing to revert the patch_. It
looks like
a8037ceb8964 (arm64: dts: rockchip: drop rockchip,trcm-sync-tx-only from rk3588 i2s)
is the one that fixed the root issue, but I will have to test that
sometime later this week.

-- Pratham Patel


2024-04-03 00:46:52

by Saravana Kannan

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Tue, Apr 2, 2024 at 4:32 PM Pratham Patel
<[email protected]> wrote:
>
> On Tue Apr 2, 2024 at 4:54 AM IST, Saravana Kannan wrote:
> > On Sat, Mar 23, 2024 at 10:10 AM Dragan Simic <[email protected]> wrote:
> > >
> > > Hello Pratham,
> > >
> > > On 2024-03-23 18:02, Pratham Patel wrote:
> > > > I looked at the patch and tried several things, none of which resulted
> > > > in anything that would point me to the core issue. Then I tried this:
> > >
> > > Could you, please, clarify a bit what's the actual issue you're
> > > experiencing on your Rock 5B?
> >
> > Pratham, can you reply to this please? I don't really understand what
> > your issue is for me to be able to help.
>
> Hi,
>
> I apologize for not replying. Somehow, I did not notice the reply from
> Dragan. :(
>
> Since this patch was applied, an issue in the Rock 5B's DT has been
> unearthed which now results in the kernel being unable to boot properly.
>
> Following is the relevant call trace from the UART capture:
>
> [ 21.595068] Call trace:
> [ 21.595288] smp_call_function_many_cond+0x174/0x5f8
> [ 21.595728] on_each_cpu_cond_mask+0x2c/0x40
> [ 21.596109] cpuidle_register_driver+0x294/0x318
> [ 21.596524] cpuidle_register+0x24/0x100
> [ 21.596875] psci_cpuidle_probe+0x2e4/0x490
> [ 21.597247] platform_probe+0x70/0xd0
> [ 21.597575] really_probe+0x18c/0x3d8
> [ 21.597905] __driver_probe_device+0x84/0x180
> [ 21.598294] driver_probe_device+0x44/0x120
> [ 21.598669] __device_attach_driver+0xc4/0x168
> [ 21.599063] bus_for_each_drv+0x8c/0xf0
> [ 21.599408] __device_attach+0xa4/0x1c0
> [ 21.599748] device_initial_probe+0x1c/0x30
> [ 21.600118] bus_probe_device+0xb4/0xc0
> [ 21.600462] device_add+0x68c/0x888
> [ 21.600775] platform_device_add+0x19c/0x270
> [ 21.601154] platform_device_register_full+0xdc/0x178
> [ 21.601602] psci_idle_init+0xa0/0xc8
> [ 21.601934] do_one_initcall+0x60/0x290
> [ 21.602275] kernel_init_freeable+0x20c/0x3e0
> [ 21.602664] kernel_init+0x2c/0x1f8
> [ 21.602979] ret_from_fork+0x10/0x20

This doesn't make a lot of sense. "remote-endpoint" shouldn't be
related to anything to do with psci cpuidle. I'm guessing something
else is failing much earlier in boot that's indirectly causing this
somehow? Can you please take a look at what's failing earlier and let
us know? Or see what driver probe is failing up to this point but used
to work in the good case.
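
As an editorial aside, one way to approach this is to strip the timestamps from a good and a bad boot log and list the messages that appear in only one of them. The following is a minimal sketch, assuming the usual `[ seconds.micros]` prefix seen in the call trace above. `strip_timestamp` and `lines_only_in` are hypothetical helper names, and the `example:` lines are placeholders rather than real kernel output:

```python
import re

# Leading printk timestamp such as "[ 21.595068] ".
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Drop the leading timestamp so that two boots can be compared."""
    return TIMESTAMP_RE.sub("", line.strip())


def lines_only_in(log_a, log_b):
    """Messages present in log_a but absent from log_b, in log_a order."""
    seen_b = {strip_timestamp(line) for line in log_b.splitlines()}
    unique = []
    for line in log_a.splitlines():
        message = strip_timestamp(line)
        if message and message not in seen_b:
            unique.append(message)
    return unique


# Placeholder logs: the "example:" messages are made up for illustration.
GOOD_LOG = """\
[    1.000000] example: step one
[    2.000000] example: step two
"""
BAD_LOG = """\
[    1.100000] example: step one
[   21.595068] Call trace:
"""

print(lines_only_in(GOOD_LOG, BAD_LOG))  # reached only in the good boot
print(lines_only_in(BAD_LOG, GOOD_LOG))  # seen only in the bad boot
```

Messages that show up only in the good log are candidates for the probe or initialization step that no longer completes in the bad case.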

Also, where is the dts file that corresponds to this board in upstream? Is it
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts?

>
> > Also, can you give the output of <debugfs>/devices_deferred for the
> > good vs bad case?
>
> I can't provide you with requested output from the bad case, since the
> kernel never moves past this to an initramfs rescue shell, but following
> is the output from v6.8.1 (**with aforementioned patch reverted**).
>
> # cat /sys/kernel/debug/devices_deferred
> fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
> 1-0022 typec_fusb302: cannot register tcpm port
> fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
>
> It seems that v6.8.2 works _without needing to revert the patch_. I will
> have to look into this sometime this week but it seems like
> a8037ceb8964 (arm64: dts: rockchip: drop rockchip,trcm-sync-tx-only from rk3588 i2s)
> seems to be the one that fixed the root issue. I will have to test it
> sometime later this week.

Ok, once you find the patch that fixes things, let me know too.

-Saravana

2024-04-03 01:03:28

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Wed Apr 3, 2024 at 6:16 AM IST, Saravana Kannan wrote:
> On Tue, Apr 2, 2024 at 4:32 PM Pratham Patel
> <[email protected]> wrote:
> >
> > On Tue Apr 2, 2024 at 4:54 AM IST, Saravana Kannan wrote:
> > > On Sat, Mar 23, 2024 at 10:10 AM Dragan Simic <[email protected]> wrote:
> > > >
> > > > Hello Pratham,
> > > >
> > > > On 2024-03-23 18:02, Pratham Patel wrote:
> > > > > I looked at the patch and tried several things, none of which resulted
> > > > > in anything that would point me to the core issue. Then I tried this:
> > > >
> > > > Could you, please, clarify a bit what's the actual issue you're
> > > > experiencing on your Rock 5B?
> > >
> > > Pratham, can you reply to this please? I don't really understand what
> > > your issue is for me to be able to help.
> >
> > Hi,
> >
> > I apologize for not replying. Somehow, I did not notice the reply from
> > Dragan. :(
> >
> > Since this patch was applied, an issue in the Rock 5B's DT has been
> > unearthed which now results in the kernel being unable to boot properly.
> >
> > Following is the relevant call trace from the UART capture:
> >
> > [ 21.595068] Call trace:
> > [ 21.595288] smp_call_function_many_cond+0x174/0x5f8
> > [ 21.595728] on_each_cpu_cond_mask+0x2c/0x40
> > [ 21.596109] cpuidle_register_driver+0x294/0x318
> > [ 21.596524] cpuidle_register+0x24/0x100
> > [ 21.596875] psci_cpuidle_probe+0x2e4/0x490
> > [ 21.597247] platform_probe+0x70/0xd0
> > [ 21.597575] really_probe+0x18c/0x3d8
> > [ 21.597905] __driver_probe_device+0x84/0x180
> > [ 21.598294] driver_probe_device+0x44/0x120
> > [ 21.598669] __device_attach_driver+0xc4/0x168
> > [ 21.599063] bus_for_each_drv+0x8c/0xf0
> > [ 21.599408] __device_attach+0xa4/0x1c0
> > [ 21.599748] device_initial_probe+0x1c/0x30
> > [ 21.600118] bus_probe_device+0xb4/0xc0
> > [ 21.600462] device_add+0x68c/0x888
> > [ 21.600775] platform_device_add+0x19c/0x270
> > [ 21.601154] platform_device_register_full+0xdc/0x178
> > [ 21.601602] psci_idle_init+0xa0/0xc8
> > [ 21.601934] do_one_initcall+0x60/0x290
> > [ 21.602275] kernel_init_freeable+0x20c/0x3e0
> > [ 21.602664] kernel_init+0x2c/0x1f8
> > [ 21.602979] ret_from_fork+0x10/0x20
>
> This doesn't make a lot of sense. "remote-endpoint" shouldn't be
> related to anything to do with psci cpuidle. I'm guessing something
> else is failing much earlier in boot that's indirectly causing this
> somehow? Can you please take a look at what's failing earlier and let
> us know? Or see what driver probe is failing up to this point but used
> to work in the good case.

I'm pretty new to this, "just starting". I'm not sure how to do that,
since the kernel doesn't really "move forward". I will verify if
a8037ceb8964 fixes it or not and get back by the end of this week.

> Also, where is the dts file that corresponds to this board in upstream? Is it
> arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts

Yes.

> >
> > > Also, can you give the output of <debugfs>/devices_deferred for the
> > > good vs bad case?
> >
> > I can't provide you with requested output from the bad case, since the
> > kernel never moves past this to an initramfs rescue shell, but following
> > is the output from v6.8.1 (**with aforementioned patch reverted**).
> >
> > # cat /sys/kernel/debug/devices_deferred
> > fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
> > 1-0022 typec_fusb302: cannot register tcpm port
> > fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
> >
> > It seems that v6.8.2 works _without needing to revert the patch_. I will
> > have to look into this sometime this week but it seems like
> > a8037ceb8964 (arm64: dts: rockchip: drop rockchip,trcm-sync-tx-only from rk3588 i2s)
> > seems to be the one that fixed the root issue. I will have to test it
> > sometime later this week.
>
> Ok, once you find the patch that fixes things, let me know too.

Will do!

-- Pratham Patel


2024-04-03 13:51:48

by Dragan Simic

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

Hello Pratham,

On 2024-04-03 01:32, Pratham Patel wrote:
> On Tue Apr 2, 2024 at 4:54 AM IST, Saravana Kannan wrote:
>> On Sat, Mar 23, 2024 at 10:10 AM Dragan Simic <[email protected]>
>> wrote:
>> > On 2024-03-23 18:02, Pratham Patel wrote:
>> > > I looked at the patch and tried several things, none of which resulted
>> > > in anything that would point me to the core issue. Then I tried this:
>> >
>> > Could you, please, clarify a bit what's the actual issue you're
>> > experiencing on your Rock 5B?
>>
>> Pratham, can you reply to this please? I don't really understand what
>> your issue is for me to be able to help.
>
> I apologize for not replying. Somehow, I did not notice the reply from
> Dragan. :(

No worries, I saw the serial console log file in one of your messages,
which actually provided the answer to my question. :)


2024-04-03 14:09:32

by Sebastian Reichel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

Hi,

On Wed, Apr 03, 2024 at 01:03:07AM +0000, Pratham Patel wrote:
> > > > Also, can you give the output of <debugfs>/devices_deferred for the
> > > > good vs bad case?
> > >
> > > I can't provide you with requested output from the bad case, since the
> > > kernel never moves past this to an initramfs rescue shell, but following
> > > is the output from v6.8.1 (**with aforementioned patch reverted**).
> > >
> > > # cat /sys/kernel/debug/devices_deferred
> > > fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
> > > 1-0022 typec_fusb302: cannot register tcpm port
> > > fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
> > >
> > > It seems that v6.8.2 works _without needing to revert the patch_. I will
> > > have to look into this sometime this week but it seems like
> > > a8037ceb8964 (arm64: dts: rockchip: drop rockchip,trcm-sync-tx-only from rk3588 i2s)
> > > seems to be the one that fixed the root issue. I will have to test it
> > > sometime later this week.
> >
> > Ok, once you find the patch that fixes things, let me know too.
>
> Will do!

FWIW the v6.8.1 kernel referenced above is definitely patched, since
upstream's Rock 5B DT describes neither the fusb302 nor the USB
port it is connected to.

We have a few Rock 5B in Kernel CI and upstream boots perfectly
fine:

https://lava.collabora.dev/scheduler/device_type/rk3588-rock-5b

So it could be one of your downstream patches that is introducing
this problem.

-- Sebastian


Attachments:
(No filename) (1.49 kB)
signature.asc (849.00 B)

2024-04-03 15:27:21

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Wed Apr 3, 2024 at 7:22 PM IST, Sebastian Reichel wrote:
> Hi,
>
> On Wed, Apr 03, 2024 at 01:03:07AM +0000, Pratham Patel wrote:
> > > > > Also, can you give the output of <debugfs>/devices_deferred for the
> > > > > good vs bad case?
> > > >
> > > > I can't provide you with requested output from the bad case, since the
> > > > kernel never moves past this to an initramfs rescue shell, but following
> > > > is the output from v6.8.1 (**with aforementioned patch reverted**).
> > > >
> > > > # cat /sys/kernel/debug/devices_deferred
> > > > fc400000.usb platform: wait for supplier /phy@fed90000/usb3-port
> > > > 1-0022 typec_fusb302: cannot register tcpm port
> > > > fc000000.usb platform: wait for supplier /phy@fed80000/usb3-port
> > > >
> > > > It seems that v6.8.2 works _without needing to revert the patch_. I will
> > > > have to look into this sometime this week but it seems like
> > > > a8037ceb8964 (arm64: dts: rockchip: drop rockchip,trcm-sync-tx-only from rk3588 i2s)
> > > > seems to be the one that fixed the root issue. I will have to test it
> > > > sometime later this week.
> > >
> > > Ok, once you find the patch that fixes things, let me know too.
> >
> > Will do!
>
> FWIW the v6.8.1 kernel referenced above is definitely patched, since
> upstream's Rock 5B DT does neither describe fusb302, nor the USB
> port it is connected to.
>
> We have a few Rock 5B in Kernel CI and upstream boots perfectly
> fine:
>
> https://lava.collabora.dev/scheduler/device_type/rk3588-rock-5b

Hmm, weird then. I can confirm that v6.8.1 doesn't _always_ boot. It
boots sometimes but still fails a majority of the time. There is roughly
a 2 out of 10 chance that v6.8.1 will boot. If you keep rebooting
enough times, you might get it to boot, but the next boot is
likely to be borked. :(

That said, v6.8.2 might still have the same issue, but the probability of
a failed boot might be _lower_ than with v6.8.1 (from what I saw). I will
verify that behaviour tomorrow or the day after.
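
As an editorial aside, an intermittent failure like this means a single successful boot says little about whether a kernel is fixed. The following is a minimal sketch of the arithmetic, assuming boots are independent and taking 0.2 as the per-boot success probability of the broken kernel (one reading of the figures above, since it "fails a majority" of the time). The function names are hypothetical:

```python
def p_all_boot(p_success, boots):
    """Probability that every one of `boots` attempts succeeds."""
    return p_success ** boots


def p_at_least_one_boot(p_success, boots):
    """Probability that at least one of `boots` attempts succeeds."""
    return 1.0 - (1.0 - p_success) ** boots


def consecutive_boots_needed(p_success_if_broken, alpha):
    """Smallest run of clean boots that a still-broken kernel would
    produce with probability at most `alpha`."""
    if not 0.0 <= p_success_if_broken < 1.0:
        raise ValueError("p_success_if_broken must be in [0, 1)")
    boots, probability = 0, 1.0
    while probability > alpha:
        probability *= p_success_if_broken
        boots += 1
    return boots


print(p_at_least_one_boot(0.2, 10))         # chance that retrying 10 times gets a boot
print(p_all_boot(0.2, 5))                   # chance of 5 clean boots despite the bug
print(consecutive_boots_needed(0.2, 0.01))  # clean boots needed before trusting a fix
```

Under these assumptions, a short run of consecutive clean boots on v6.8.2 would already be hard to explain by luck, whereas one good boot would not be.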

>
> So it could be one of your downstream patches, which is introducing
> this problem.

I thought so too. So I built a vanilla kernel from the release tarball
of v6.8.1, using GCC + arm64 defconfig. I also tried using LLVM just in
case but noticed the same result.

-- Pratham Patel


2024-04-05 08:32:33

by Pratham Patel

Subject: Re: Fixing the devicetree of Rock 5 Model B (and possibly others)

On Wednesday, April 3rd, 2024 at 06:33, Pratham Patel <[email protected]> wrote:

>
>
> On Wed Apr 3, 2024 at 6:16 AM IST, Saravana Kannan wrote:
>
> > Ok, once you find the patch that fixes things, let me know too.

I confirm that a8037ceb8964 fixed this issue for me. Now, v6.8.2+ boots on my Rock 5B,
with my distro's config and the arm64 defconfig.

-- Pratham Patel