2022-07-16 21:00:28

by Rafał Miłecki

[permalink] [raw]
Subject: fw_devlink=on breaks probing devices when of_platform_populate() is used

Hi,

I added of_platform_populate() calls in mtd subsystem in the commit
bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94

I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
We started receiving reports that probing Ethernet devices stopped
working in kernel 5.15. I bisected it down to the kernel 5.13 change:

commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
Author: Saravana Kannan <[email protected]>
Date: Tue Mar 2 13:11:32 2021 -0800

Revert "Revert "driver core: Set fw_devlink=on by default""

This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.

Since all reported issues due to fw_devlink=on should be addressed by
this series, revert the revert. fw_devlink=on Take II.

Signed-off-by: Saravana Kannan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

For me with above commit kernel just never calls bcm4908_enet_probe().
Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
same issue happens with other drivers.

Critical detail is that in DT Ethernet block node references NVMEM cell
of MTD partition (see below).

Could you help me dealing with this issue, please? Can you see something
obvious breaking fw_devlink=on + of_platform_populate() case? Can I
provide some extra information to help fixing it?


Relevant DT part:

partitions {
compatible = "fixed-partitions";
#address-cells = <1>;
#size-cells = <1>;

partition@0 {
compatible = "nvmem-cells";
reg = <0x0 0x100000>;
label = "bootloader";

#address-cells = <1>;
#size-cells = <1>;
ranges = <0 0x0 0x100000>;

base_mac_addr: mac@106a0 {
reg = <0x106a0 0x6>;
};
};

partition@100000 {
reg = <0x100000 0x5700000>;
label = "firmware";
};
};

ethernet@2000 {
compatible = "brcm,bcm4908-enet";
reg = <0x2000 0x1000>;

interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
interrupt-names = "rx", "tx";

nvmem-cells = <&base_mac_addr>;
nvmem-cell-names = "mac-address";
};

OpenWrt bug report:
https://github.com/openwrt/openwrt/issues/10232


2022-07-30 08:14:23

by Rafał Miłecki

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

Hi guys,

On 16.07.2022 22:50, Rafał Miłecki wrote:
> I added of_platform_populate() calls in mtd subsystem in the commit
> bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94
>
> I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
> We started receiving reports that probing Ethernet devices stopped
> working in kernel 5.15. I bisected it down to the kernel 5.13 change:
>
> commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
> Author: Saravana Kannan <[email protected]>
> Date:   Tue Mar 2 13:11:32 2021 -0800
>
>     Revert "Revert "driver core: Set fw_devlink=on by default""
>
>     This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
>
>     Since all reported issues due to fw_devlink=on should be addressed by
>     this series, revert the revert. fw_devlink=on Take II.
>
>     Signed-off-by: Saravana Kannan <[email protected]>
>     Link: https://lore.kernel.org/r/[email protected]
>     Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> For me with above commit kernel just never calls bcm4908_enet_probe().
> Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
> same issue happens with other drivers.
>
> Critical detail is that in DT Ethernet block node references NVMEM cell
> of MTD partition (see below).
>
> Could you help me dealing with this issue, please? Can you see something
> obvious breaking fw_devlink=on + of_platform_populate() case? Can I
> provide some extra information to help fixing it?

Any ideas about this problem / solution?


> Relevant DT part:
>
> partitions {
>     compatible = "fixed-partitions";
>     #address-cells = <1>;
>     #size-cells = <1>;
>
>     partition@0 {
>         compatible = "nvmem-cells";
>         reg = <0x0 0x100000>;
>         label = "bootloader";
>
>         #address-cells = <1>;
>         #size-cells = <1>;
>         ranges = <0 0x0 0x100000>;
>
>         base_mac_addr: mac@106a0 {
>             reg = <0x106a0 0x6>;
>         };
>     };
>
>     partition@100000 {
>         reg = <0x100000 0x5700000>;
>         label = "firmware";
>     };
> };
>
> ethernet@2000 {
>     compatible = "brcm,bcm4908-enet";
>     reg = <0x2000 0x1000>;
>
>     interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
>             <GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
>     interrupt-names = "rx", "tx";
>
>     nvmem-cells = <&base_mac_addr>;
>     nvmem-cell-names = "mac-address";
> };
>
> OpenWrt bug report:
> https://github.com/openwrt/openwrt/issues/10232


2022-08-28 14:43:22

by Rafał Miłecki

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

On 30.07.2022 09:36, Rafał Miłecki wrote:
> On 16.07.2022 22:50, Rafał Miłecki wrote:
>> I added of_platform_populate() calls in mtd subsystem in the commit
>> bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94
>>
>> I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
>> We started receiving reports that probing Ethernet devices stopped
>> working in kernel 5.15. I bisected it down to the kernel 5.13 change:
>>
>> commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
>> Author: Saravana Kannan <[email protected]>
>> Date:   Tue Mar 2 13:11:32 2021 -0800
>>
>>      Revert "Revert "driver core: Set fw_devlink=on by default""
>>
>>      This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
>>
>>      Since all reported issues due to fw_devlink=on should be addressed by
>>      this series, revert the revert. fw_devlink=on Take II.
>>
>>      Signed-off-by: Saravana Kannan <[email protected]>
>>      Link: https://lore.kernel.org/r/[email protected]
>>      Signed-off-by: Greg Kroah-Hartman <[email protected]>
>>
>> For me with above commit kernel just never calls bcm4908_enet_probe().
>> Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
>> same issue happens with other drivers.
>>
>> Critical detail is that in DT Ethernet block node references NVMEM cell
>> of MTD partition (see below).
>>
>> Could you help me dealing with this issue, please? Can you see something
>> obvious breaking fw_devlink=on + of_platform_populate() case? Can I
>> provide some extra information to help fixing it?
>
> Any ideas about this problem / solution?

I didn't get any reponse for this bug for 6 weeks now. Is that OK if I
send a revert patch then?


>> Relevant DT part:
>>
>> partitions {
>>      compatible = "fixed-partitions";
>>      #address-cells = <1>;
>>      #size-cells = <1>;
>>
>>      partition@0 {
>>          compatible = "nvmem-cells";
>>          reg = <0x0 0x100000>;
>>          label = "bootloader";
>>
>>          #address-cells = <1>;
>>          #size-cells = <1>;
>>          ranges = <0 0x0 0x100000>;
>>
>>          base_mac_addr: mac@106a0 {
>>              reg = <0x106a0 0x6>;
>>          };
>>      };
>>
>>      partition@100000 {
>>          reg = <0x100000 0x5700000>;
>>          label = "firmware";
>>      };
>> };
>>
>> ethernet@2000 {
>>      compatible = "brcm,bcm4908-enet";
>>      reg = <0x2000 0x1000>;
>>
>>      interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
>>              <GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
>>      interrupt-names = "rx", "tx";
>>
>>      nvmem-cells = <&base_mac_addr>;
>>      nvmem-cell-names = "mac-address";
>> };
>>
>> OpenWrt bug report:
>> https://github.com/openwrt/openwrt/issues/10232

2022-08-30 08:04:32

by Saravana Kannan

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

On Sun, Aug 28, 2022 at 7:39 AM Rafał Miłecki <[email protected]> wrote:
>
> On 30.07.2022 09:36, Rafał Miłecki wrote:
> > On 16.07.2022 22:50, Rafał Miłecki wrote:
> >> I added of_platform_populate() calls in mtd subsystem in the commit
> >> bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94
> >>
> >> I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
> >> We started receiving reports that probing Ethernet devices stopped
> >> working in kernel 5.15. I bisected it down to the kernel 5.13 change:
> >>
> >> commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
> >> Author: Saravana Kannan <[email protected]>
> >> Date: Tue Mar 2 13:11:32 2021 -0800
> >>
> >> Revert "Revert "driver core: Set fw_devlink=on by default""
> >>
> >> This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
> >>
> >> Since all reported issues due to fw_devlink=on should be addressed by
> >> this series, revert the revert. fw_devlink=on Take II.
> >>
> >> Signed-off-by: Saravana Kannan <[email protected]>
> >> Link: https://lore.kernel.org/r/[email protected]
> >> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> >>
> >> For me with above commit kernel just never calls bcm4908_enet_probe().
> >> Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
> >> same issue happens with other drivers.
> >>
> >> Critical detail is that in DT Ethernet block node references NVMEM cell
> >> of MTD partition (see below).
> >>
> >> Could you help me dealing with this issue, please? Can you see something
> >> obvious breaking fw_devlink=on + of_platform_populate() case? Can I
> >> provide some extra information to help fixing it?
> >
> > Any ideas about this problem / solution?
>
> I didn't get any reponse for this bug for 6 weeks now. Is that OK if I
> send a revert patch then?

fw_devlink=on and of_platform_populate() have been working without any
problems for a while now. If your MTD change is causing an issue, why
not start debugging from there instead of suggesting reverting a
change that has been there before your change?

Most of us are busy getting stuff ready for LPC. So, things are going
to be a bit slow. I'm also working on other fw_devlink fixes. I can't
get to this quickly. So if you can debug this further and point out
where exactly it's getting caught up that'd help.

Enabling debug logs in drivers/base/core.c and drivers/base/dd.c might
make it easier for you to debug. Also, you can check the
<debugfs>/devices_deferred to tell why a device isn't getting probed
yet.

-Saravana

>
>
> >> Relevant DT part:
> >>
> >> partitions {
> >> compatible = "fixed-partitions";
> >> #address-cells = <1>;
> >> #size-cells = <1>;
> >>
> >> partition@0 {
> >> compatible = "nvmem-cells";
> >> reg = <0x0 0x100000>;
> >> label = "bootloader";
> >>
> >> #address-cells = <1>;
> >> #size-cells = <1>;
> >> ranges = <0 0x0 0x100000>;
> >>
> >> base_mac_addr: mac@106a0 {
> >> reg = <0x106a0 0x6>;
> >> };
> >> };
> >>
> >> partition@100000 {
> >> reg = <0x100000 0x5700000>;
> >> label = "firmware";
> >> };
> >> };
> >>
> >> ethernet@2000 {
> >> compatible = "brcm,bcm4908-enet";
> >> reg = <0x2000 0x1000>;
> >>
> >> interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
> >> <GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
> >> interrupt-names = "rx", "tx";
> >>
> >> nvmem-cells = <&base_mac_addr>;
> >> nvmem-cell-names = "mac-address";
> >> };
> >>
> >> OpenWrt bug report:
> >> https://github.com/openwrt/openwrt/issues/10232
>

2022-09-20 00:02:31

by Olof Johansson

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

On Sun, Aug 28, 2022 at 04:39:52PM +0200, Rafa?? Mi??ecki wrote:
> On 30.07.2022 09:36, Rafa?? Mi??ecki wrote:
> > On 16.07.2022 22:50, Rafa?? Mi??ecki wrote:
> > > I added of_platform_populate() calls in mtd subsystem in the commit
> > > bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94
> > >
> > > I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
> > > We started receiving reports that probing Ethernet devices stopped
> > > working in kernel 5.15. I bisected it down to the kernel 5.13 change:
> > >
> > > commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
> > > Author: Saravana Kannan <[email protected]>
> > > Date:???? Tue Mar 2 13:11:32 2021 -0800
> > >
> > > ???????? Revert "Revert "driver core: Set fw_devlink=on by default""
> > >
> > > ???????? This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
> > >
> > > ???????? Since all reported issues due to fw_devlink=on should be addressed by
> > > ???????? this series, revert the revert. fw_devlink=on Take II.
> > >
> > > ???????? Signed-off-by: Saravana Kannan <[email protected]>
> > > ???????? Link: https://lore.kernel.org/r/[email protected]
> > > ???????? Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > >
> > > For me with above commit kernel just never calls bcm4908_enet_probe().
> > > Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
> > > same issue happens with other drivers.
> > >
> > > Critical detail is that in DT Ethernet block node references NVMEM cell
> > > of MTD partition (see below).
> > >
> > > Could you help me dealing with this issue, please? Can you see something
> > > obvious breaking fw_devlink=on + of_platform_populate() case? Can I
> > > provide some extra information to help fixing it?
> >
> > Any ideas about this problem / solution?
>
> I didn't get any reponse for this bug for 6 weeks now. Is that OK if I
> send a revert patch then?

I'm pretty sure this is the same root cause as I had for PCIe with a reference
to iommu with fw_devlink.strict=1:

https://lore.kernel.org/lkml/CAOesGMgpJQjMvo6m7on+27F8REiHaVSRL6HBjiRPVDM9Jscnrg@mail.gmail.com/

There's the dependency on the nvmem-cells propery, so the driver core doesn't
call probe until it's fulfilled. Meanwhile, the platform_driver() code
unregisters the driver if it (thinks) it as called probe and there are no
devices linked to it -- since it's not a needed driver. Thus, probe will never
be called. That code is in drivers/base/platform.c:__platform_driver_probe().

I don't know what the proper fix is here, this seems like a design issue with
the fw_devlink code.


-Olof

2022-12-11 09:31:45

by Maksim Kiselev

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used


Hi, I have the same problem.
https://lore.kernel.org/all/CALHCpMgEZjnR39upkR6iozSk-b5A_GHRo9rcDSPXzzQi6x_qCw@mail.gmail.com/

I think the root of the problem was the choice of 'compatible'
device tree property to marking mtd partition node as a nvmem provider.

This property used only inside 'mtd_nvmem_add' function to setup
'no_of_node' flag.

> config.no_of_node = !of_device_is_compatible(node, "nvmem-cells");

This is how this flag processed by 'nvmem_register' function.

> if (config->of_node)
> nvmem->dev.of_node = config->of_node;
> else if (!config->no_of_node)
> nvmem->dev.of_node = config->dev->of_node;

Thats all, there is no such driver which compatible with 'nvmem-cells'.


So, maybe we should change the 'compatible' property to something else?

2022-12-14 22:21:29

by Saravana Kannan

[permalink] [raw]
Subject: Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

On Sun, Dec 11, 2022 at 12:46 AM Maksim Kiselev <[email protected]> wrote:
>
>
> Hi, I have the same problem.
> https://lore.kernel.org/all/CALHCpMgEZjnR39upkR6iozSk-b5A_GHRo9rcDSPXzzQi6x_qCw@mail.gmail.com/
>
> I think the root of the problem was the choice of 'compatible'
> device tree property to marking mtd partition node as a nvmem provider.
>
> This property used only inside 'mtd_nvmem_add' function to setup
> 'no_of_node' flag.
>
> > config.no_of_node = !of_device_is_compatible(node, "nvmem-cells");
>
> This is how this flag processed by 'nvmem_register' function.
>
> > if (config->of_node)
> > nvmem->dev.of_node = config->of_node;
> > else if (!config->no_of_node)
> > nvmem->dev.of_node = config->dev->of_node;
>
> Thats all, there is no such driver which compatible with 'nvmem-cells'.
>
>
> So, maybe we should change the 'compatible' property to something else?

Sorry about the accidental HTML in my previous reply. Resending as plain text.

I have a patch series [1](v1 sent out a while back) that stops
depending on the existence of "compatible" property for fw_devlink to
work. I had a few issues that I have fixes for that were tested in the
thread. I've been meaning to send out a v2 with all those fixes rolled
in. I'll try to get that out this week. Hopefully that'll address the
issues assuming Maksim's analysis about "compatible" is correct. If
not, I can take a closer look after that.

[1] - https://lore.kernel.org/lkml/[email protected]/

-Saravana