2020-07-06 18:14:13

by Maxime Ripard

[permalink] [raw]
Subject: PHY reset handling during DT parsing

Hi,

I came across an issue today on an Allwinner board, but I believe it's a
core issue.

That board is using the stmac driver together with a phy that happens to
have a reset GPIO, except that that GPIO will never be claimed, and the
PHY will thus never work.

You can find an example of such a board here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195

It looks like when of_mdiobus_register() will parse the DT, it will then
call of_mdiobus_register_phy() for each PHY it encounters [1].
of_mdiobus_register_phy() will then if the phy doesn't have an
ethernet-phy-id* compatible call get_phy_device() [2], and will later on
call phy_register_device [3].

get_phy_device() will then call get_phy_id() [4], that will try to
access the PHY through the MDIO bus [5].

The code that deals with the PHY reset line / GPIO is however only done
in mdiobus_device_register, called through phy_device_register. Since
this is happening way after the call to get_phy_device, our PHY might
still very well be in reset if the bootloader hasn't put it out of reset
and left it there.

I'm not entirely sure how to fix that though. I tried to fix it by
splitting away the gpio / reset code away from mdiobus_device_register
into a new function, and calling it before the first call to get_phy_id
so that we can put our phy out of reset, but it looks like the device
registration makes it more complicated than that. Any ideas?

Thanks!
Maxime

1: https://elixir.bootlin.com/linux/latest/source/drivers/of/of_mdio.c#L274
2: https://elixir.bootlin.com/linux/latest/source/drivers/of/of_mdio.c#L82
3: https://elixir.bootlin.com/linux/latest/source/drivers/of/of_mdio.c#L119
4: https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L830
5: https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L791


Attachments:
(No filename) (1.90 kB)
signature.asc (235.00 B)
Download all attachments

2020-07-07 14:22:22

by Andrew Lunn

[permalink] [raw]
Subject: Re: PHY reset handling during DT parsing

On Mon, Jul 06, 2020 at 08:13:31PM +0200, Maxime Ripard wrote:
> Hi,
>
> I came across an issue today on an Allwinner board, but I believe it's a
> core issue.
>
> That board is using the stmac driver together with a phy that happens to
> have a reset GPIO, except that that GPIO will never be claimed, and the
> PHY will thus never work.
>
> You can find an example of such a board here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195
>
> It looks like when of_mdiobus_register() will parse the DT, it will then
> call of_mdiobus_register_phy() for each PHY it encounters [1].
> of_mdiobus_register_phy() will then if the phy doesn't have an
> ethernet-phy-id* compatible call get_phy_device() [2], and will later on
> call phy_register_device [3].
>
> get_phy_device() will then call get_phy_id() [4], that will try to
> access the PHY through the MDIO bus [5].
>
> The code that deals with the PHY reset line / GPIO is however only done
> in mdiobus_device_register, called through phy_device_register. Since
> this is happening way after the call to get_phy_device, our PHY might
> still very well be in reset if the bootloader hasn't put it out of reset
> and left it there.

Hi Maxime

If you look at the history of this code,

commit bafbdd527d569c8200521f2f7579f65a044271be
Author: Sergei Shtylyov <[email protected]>
Date: Mon Dec 4 13:35:05 2017 +0100

phylib: Add device reset GPIO support

you will see there is an assumption the PHY can be detected while in
reset. The reset was originally handled inside the at803x PHY driver
probe function, before it got moved into the core.

What you are asking for it reasonable, but you have some history to
deal with, changing some assumptions as to what the reset is all
about.

Andrew

2020-07-07 14:57:12

by Maxime Ripard

[permalink] [raw]
Subject: Re: PHY reset handling during DT parsing

Hi Andrew,

On Tue, Jul 07, 2020 at 04:19:18PM +0200, Andrew Lunn wrote:
> On Mon, Jul 06, 2020 at 08:13:31PM +0200, Maxime Ripard wrote:
> > I came across an issue today on an Allwinner board, but I believe it's a
> > core issue.
> >
> > That board is using the stmac driver together with a phy that happens to
> > have a reset GPIO, except that that GPIO will never be claimed, and the
> > PHY will thus never work.
> >
> > You can find an example of such a board here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195
> >
> > It looks like when of_mdiobus_register() will parse the DT, it will then
> > call of_mdiobus_register_phy() for each PHY it encounters [1].
> > of_mdiobus_register_phy() will then if the phy doesn't have an
> > ethernet-phy-id* compatible call get_phy_device() [2], and will later on
> > call phy_register_device [3].
> >
> > get_phy_device() will then call get_phy_id() [4], that will try to
> > access the PHY through the MDIO bus [5].
> >
> > The code that deals with the PHY reset line / GPIO is however only done
> > in mdiobus_device_register, called through phy_device_register. Since
> > this is happening way after the call to get_phy_device, our PHY might
> > still very well be in reset if the bootloader hasn't put it out of reset
> > and left it there.
>
> Hi Maxime
>
> If you look at the history of this code,
>
> commit bafbdd527d569c8200521f2f7579f65a044271be
> Author: Sergei Shtylyov <[email protected]>
> Date: Mon Dec 4 13:35:05 2017 +0100
>
> phylib: Add device reset GPIO support
>
> you will see there is an assumption the PHY can be detected while in
> reset. The reset was originally handled inside the at803x PHY driver
> probe function, before it got moved into the core.
>
> What you are asking for it reasonable, but you have some history to
> deal with, changing some assumptions as to what the reset is all
> about.

Thanks for the pointer.

It looks to me from the commit log that the assumption was that a
bootloader could leave the PHY into reset though?

It starts with:

> The PHY devices sometimes do have their reset signal (maybe even power
> supply?) tied to some GPIO and sometimes it also does happen that a
> boot loader does not leave it deasserted.

This is exactly the case I was discussing. The bootloader hasn't used
the PHY, and thus the PHY reset signal is still asserted?

Maxime

2020-07-07 16:41:36

by Florian Fainelli

[permalink] [raw]
Subject: Re: PHY reset handling during DT parsing



On 7/7/2020 7:54 AM, Maxime Ripard wrote:
> Hi Andrew,
>
> On Tue, Jul 07, 2020 at 04:19:18PM +0200, Andrew Lunn wrote:
>> On Mon, Jul 06, 2020 at 08:13:31PM +0200, Maxime Ripard wrote:
>>> I came across an issue today on an Allwinner board, but I believe it's a
>>> core issue.
>>>
>>> That board is using the stmac driver together with a phy that happens to
>>> have a reset GPIO, except that that GPIO will never be claimed, and the
>>> PHY will thus never work.
>>>
>>> You can find an example of such a board here:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195
>>>
>>> It looks like when of_mdiobus_register() will parse the DT, it will then
>>> call of_mdiobus_register_phy() for each PHY it encounters [1].
>>> of_mdiobus_register_phy() will then if the phy doesn't have an
>>> ethernet-phy-id* compatible call get_phy_device() [2], and will later on
>>> call phy_register_device [3].
>>>
>>> get_phy_device() will then call get_phy_id() [4], that will try to
>>> access the PHY through the MDIO bus [5].
>>>
>>> The code that deals with the PHY reset line / GPIO is however only done
>>> in mdiobus_device_register, called through phy_device_register. Since
>>> this is happening way after the call to get_phy_device, our PHY might
>>> still very well be in reset if the bootloader hasn't put it out of reset
>>> and left it there.
>>
>> Hi Maxime
>>
>> If you look at the history of this code,
>>
>> commit bafbdd527d569c8200521f2f7579f65a044271be
>> Author: Sergei Shtylyov <[email protected]>
>> Date: Mon Dec 4 13:35:05 2017 +0100
>>
>> phylib: Add device reset GPIO support
>>
>> you will see there is an assumption the PHY can be detected while in
>> reset. The reset was originally handled inside the at803x PHY driver
>> probe function, before it got moved into the core.
>>
>> What you are asking for it reasonable, but you have some history to
>> deal with, changing some assumptions as to what the reset is all
>> about.
>
> Thanks for the pointer.
>
> It looks to me from the commit log that the assumption was that a
> bootloader could leave the PHY into reset though?
>
> It starts with:
>
>> The PHY devices sometimes do have their reset signal (maybe even power
>> supply?) tied to some GPIO and sometimes it also does happen that a
>> boot loader does not leave it deasserted.
>
> This is exactly the case I was discussing. The bootloader hasn't used
> the PHY, and thus the PHY reset signal is still asserted?

The current solution to this problem would be to have a reset property
specified for the MDIO bus controller node such that the reset would be
de-asserted during mdiobus_register() and prior to scanning the MDIO bus.

This has the nice property that for a 1:1 mapping between MDIO bus
controller and device, it works (albeit maybe not being accurately
describing hardware), but if you have multiple MDIO/PHY devices sitting
on the bus, they might each have their own reset control and while we
would attempt to manage that reset line from that call path:

mdiobus_scan()
-> phy_device_register()
-> mdiobus_register_device()

it would be too late because there is a preceding get_phy_device() which
attempts to read the PHY device's OUI and it would fail if the PHY is
still held in reset.

We have had a similar discussion before with regulators:

http://archive.lwn.net:8080/devicetree/[email protected]/

and so far we do not really have a good solution to this problem either.

For your specific use case with resets you could do a couple of things:

- if there is only one PHY on the MDIO bus you can describe the reset
line to be within the MDIO bus controller node (explained above), this
is not great but it is not necessarily too far from reality either

- if you know the PHY OUI, you can put it as a compatible string in the
form: ethernet-phy-id%4x.%4x and that will avoid the MDIO bus layer to
try to read from the PHY but instead match it with a driver and create a
phy_device instance right away

I think we are open to a "pre probe" model where it is possible to have
a phy_device instance created that would allow reset controller or
regulator (or clocks, too) to be usable with a struct device reference,
yet make no HW access until all of these resources have been enabled.
--
Florian

2020-07-07 17:21:16

by Rob Herring

[permalink] [raw]
Subject: Re: PHY reset handling during DT parsing

On Tue, Jul 7, 2020 at 10:39 AM Florian Fainelli <[email protected]> wrote:
>
>
>
> On 7/7/2020 7:54 AM, Maxime Ripard wrote:
> > Hi Andrew,
> >
> > On Tue, Jul 07, 2020 at 04:19:18PM +0200, Andrew Lunn wrote:
> >> On Mon, Jul 06, 2020 at 08:13:31PM +0200, Maxime Ripard wrote:
> >>> I came across an issue today on an Allwinner board, but I believe it's a
> >>> core issue.
> >>>
> >>> That board is using the stmac driver together with a phy that happens to
> >>> have a reset GPIO, except that that GPIO will never be claimed, and the
> >>> PHY will thus never work.
> >>>
> >>> You can find an example of such a board here:
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195
> >>>
> >>> It looks like when of_mdiobus_register() will parse the DT, it will then
> >>> call of_mdiobus_register_phy() for each PHY it encounters [1].
> >>> of_mdiobus_register_phy() will then if the phy doesn't have an
> >>> ethernet-phy-id* compatible call get_phy_device() [2], and will later on
> >>> call phy_register_device [3].
> >>>
> >>> get_phy_device() will then call get_phy_id() [4], that will try to
> >>> access the PHY through the MDIO bus [5].
> >>>
> >>> The code that deals with the PHY reset line / GPIO is however only done
> >>> in mdiobus_device_register, called through phy_device_register. Since
> >>> this is happening way after the call to get_phy_device, our PHY might
> >>> still very well be in reset if the bootloader hasn't put it out of reset
> >>> and left it there.
> >>
> >> Hi Maxime
> >>
> >> If you look at the history of this code,
> >>
> >> commit bafbdd527d569c8200521f2f7579f65a044271be
> >> Author: Sergei Shtylyov <[email protected]>
> >> Date: Mon Dec 4 13:35:05 2017 +0100
> >>
> >> phylib: Add device reset GPIO support
> >>
> >> you will see there is an assumption the PHY can be detected while in
> >> reset. The reset was originally handled inside the at803x PHY driver
> >> probe function, before it got moved into the core.
> >>
> >> What you are asking for it reasonable, but you have some history to
> >> deal with, changing some assumptions as to what the reset is all
> >> about.
> >
> > Thanks for the pointer.
> >
> > It looks to me from the commit log that the assumption was that a
> > bootloader could leave the PHY into reset though?
> >
> > It starts with:
> >
> >> The PHY devices sometimes do have their reset signal (maybe even power
> >> supply?) tied to some GPIO and sometimes it also does happen that a
> >> boot loader does not leave it deasserted.
> >
> > This is exactly the case I was discussing. The bootloader hasn't used
> > the PHY, and thus the PHY reset signal is still asserted?
>
> The current solution to this problem would be to have a reset property
> specified for the MDIO bus controller node such that the reset would be
> de-asserted during mdiobus_register() and prior to scanning the MDIO bus.

Unless the reset controls all the devices on the mdio bus, the
controller node is not the right place. The core could look into the
child nodes, but this is just one possible property.

> This has the nice property that for a 1:1 mapping between MDIO bus
> controller and device, it works (albeit maybe not being accurately
> describing hardware), but if you have multiple MDIO/PHY devices sitting
> on the bus, they might each have their own reset control and while we
> would attempt to manage that reset line from that call path:
>
> mdiobus_scan()
> -> phy_device_register()
> -> mdiobus_register_device()
>
> it would be too late because there is a preceding get_phy_device() which
> attempts to read the PHY device's OUI and it would fail if the PHY is
> still held in reset.
>
> We have had a similar discussion before with regulators:
>
> http://archive.lwn.net:8080/devicetree/[email protected]/
>
> and so far we do not really have a good solution to this problem either.
>
> For your specific use case with resets you could do a couple of things:
>
> - if there is only one PHY on the MDIO bus you can describe the reset
> line to be within the MDIO bus controller node (explained above), this
> is not great but it is not necessarily too far from reality either
>
> - if you know the PHY OUI, you can put it as a compatible string in the
> form: ethernet-phy-id%4x.%4x and that will avoid the MDIO bus layer to
> try to read from the PHY but instead match it with a driver and create a
> phy_device instance right away

The h/w is simply not discoverable, so it needs a compatible string.

> I think we are open to a "pre probe" model where it is possible to have
> a phy_device instance created that would allow reset controller or
> regulator (or clocks, too) to be usable with a struct device reference,
> yet make no HW access until all of these resources have been enabled.

I think this is needed for the kernel's driver model in general. It's
not an MDIO specific problem.

Rob

2020-07-07 21:17:20

by Florian Fainelli

[permalink] [raw]
Subject: Re: PHY reset handling during DT parsing



On 7/7/2020 10:18 AM, Rob Herring wrote:
> On Tue, Jul 7, 2020 at 10:39 AM Florian Fainelli <[email protected]> wrote:
>>
>>
>>
>> On 7/7/2020 7:54 AM, Maxime Ripard wrote:
>>> Hi Andrew,
>>>
>>> On Tue, Jul 07, 2020 at 04:19:18PM +0200, Andrew Lunn wrote:
>>>> On Mon, Jul 06, 2020 at 08:13:31PM +0200, Maxime Ripard wrote:
>>>>> I came across an issue today on an Allwinner board, but I believe it's a
>>>>> core issue.
>>>>>
>>>>> That board is using the stmac driver together with a phy that happens to
>>>>> have a reset GPIO, except that that GPIO will never be claimed, and the
>>>>> PHY will thus never work.
>>>>>
>>>>> You can find an example of such a board here:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/sun6i-a31-hummingbird.dts#n195
>>>>>
>>>>> It looks like when of_mdiobus_register() will parse the DT, it will then
>>>>> call of_mdiobus_register_phy() for each PHY it encounters [1].
>>>>> of_mdiobus_register_phy() will then if the phy doesn't have an
>>>>> ethernet-phy-id* compatible call get_phy_device() [2], and will later on
>>>>> call phy_register_device [3].
>>>>>
>>>>> get_phy_device() will then call get_phy_id() [4], that will try to
>>>>> access the PHY through the MDIO bus [5].
>>>>>
>>>>> The code that deals with the PHY reset line / GPIO is however only done
>>>>> in mdiobus_device_register, called through phy_device_register. Since
>>>>> this is happening way after the call to get_phy_device, our PHY might
>>>>> still very well be in reset if the bootloader hasn't put it out of reset
>>>>> and left it there.
>>>>
>>>> Hi Maxime
>>>>
>>>> If you look at the history of this code,
>>>>
>>>> commit bafbdd527d569c8200521f2f7579f65a044271be
>>>> Author: Sergei Shtylyov <[email protected]>
>>>> Date: Mon Dec 4 13:35:05 2017 +0100
>>>>
>>>> phylib: Add device reset GPIO support
>>>>
>>>> you will see there is an assumption the PHY can be detected while in
>>>> reset. The reset was originally handled inside the at803x PHY driver
>>>> probe function, before it got moved into the core.
>>>>
>>>> What you are asking for it reasonable, but you have some history to
>>>> deal with, changing some assumptions as to what the reset is all
>>>> about.
>>>
>>> Thanks for the pointer.
>>>
>>> It looks to me from the commit log that the assumption was that a
>>> bootloader could leave the PHY into reset though?
>>>
>>> It starts with:
>>>
>>>> The PHY devices sometimes do have their reset signal (maybe even power
>>>> supply?) tied to some GPIO and sometimes it also does happen that a
>>>> boot loader does not leave it deasserted.
>>>
>>> This is exactly the case I was discussing. The bootloader hasn't used
>>> the PHY, and thus the PHY reset signal is still asserted?
>>
>> The current solution to this problem would be to have a reset property
>> specified for the MDIO bus controller node such that the reset would be
>> de-asserted during mdiobus_register() and prior to scanning the MDIO bus.
>
> Unless the reset controls all the devices on the mdio bus, the
> controller node is not the right place. The core could look into the
> child nodes, but this is just one possible property.

Both are defined and supported (with the caveat mentioned below).

>
>> This has the nice property that for a 1:1 mapping between MDIO bus
>> controller and device, it works (albeit maybe not being accurately
>> describing hardware), but if you have multiple MDIO/PHY devices sitting
>> on the bus, they might each have their own reset control and while we
>> would attempt to manage that reset line from that call path:
>>
>> mdiobus_scan()
>> -> phy_device_register()
>> -> mdiobus_register_device()
>>
>> it would be too late because there is a preceding get_phy_device() which
>> attempts to read the PHY device's OUI and it would fail if the PHY is
>> still held in reset.
>>
>> We have had a similar discussion before with regulators:
>>
>> http://archive.lwn.net:8080/devicetree/[email protected]/
>>
>> and so far we do not really have a good solution to this problem either.
>>
>> For your specific use case with resets you could do a couple of things:
>>
>> - if there is only one PHY on the MDIO bus you can describe the reset
>> line to be within the MDIO bus controller node (explained above), this
>> is not great but it is not necessarily too far from reality either
>>
>> - if you know the PHY OUI, you can put it as a compatible string in the
>> form: ethernet-phy-id%4x.%4x and that will avoid the MDIO bus layer to
>> try to read from the PHY but instead match it with a driver and create a
>> phy_device instance right away
>
> The h/w is simply not discoverable, so it needs a compatible string.

It is discoverable if you have the various resources (regulator(s),
clock(s) and reset(s) managed ahead of issuing the discovery which is
really the crux of the problem here.

>
>> I think we are open to a "pre probe" model where it is possible to have
>> a phy_device instance created that would allow reset controller or
>> regulator (or clocks, too) to be usable with a struct device reference,
>> yet make no HW access until all of these resources have been enabled.
>
> I think this is needed for the kernel's driver model in general. It's
> not an MDIO specific problem.

Oh absolutely not, PCI and USB have the same issue, it is not clear to
me how to tackle this yet though.
--
Florian