2015-12-03 20:48:16

by Pavel Machek

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

On Thu 2015-10-15 13:25:59, Florian Fainelli wrote:
> On 15/10/15 12:59, Dinh Nguyen wrote:
> > On 10/15/2015 03:03 PM, Florian Fainelli wrote:
> >> On 15/10/15 12:09, Dinh Nguyen wrote:
> >>> Hi,
> >>>
> >>> commit "8b63ec1837fa phylib: Make PHYs children of their MDIO bus, not
> >>> the bus' parent." seems to have broken ethernet support for the SoCFPGA
> >>> platform which is using the stmmac ethernet driver.
> >>
> >> It is not clear to me how this relates to what you are seeing yet.
> >>
> >>>
> >>> It appears that during DHCP, it cannot get an IP address. This only
> >>> happens if ethernet was not used by the bootloader to tftp an kernel
> >>> image. If I use the bootloader to tftp an image then ethernet is working
> >>> fine. So I think the PHY is not getting enabled properly.
> >>>
> >>> If I revert this patch, then ethernet is back to working on the platform.
> >>
> >> Is the Device Tree source for this platform available somewhere to look at?
> >>
> >
> > Yes, I'm using the DTS that is in the mainline:
> >
> > arch/arm/boot/dts/socfpga.dtsi
> > arch/arm/boot/dts/socfpga_cyclone5.dtsi
> > arch/arm/boot/dts/socfpga_cyclone5_socdk.dts
>
> There are no PHY devices in any of these DTS files, instead there is the
> non-standard "phy-addr" property which is set to 0xffffffff supposedly
> to indicate that the MDIO bus should be scanned. This is likely part of
> your problem. The stmmac driver seems to be looking for "snps,phy-addr"
> and not "phy-addr", so I am not even clear how this is supposed to work,
> and the driver mentions this custom property is deprecated anyway.
>
> The core problem is in
> drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c::stmmac_mdio_register
> which manually detects the PHY, that is mostly fine, except that it does
> not really seem to work here for a reason that is still unclear to me.
>
> Your Ethernet PHYs need to be declared in Device Tree, see
> Documentation/devicetree/bindings/net/phy.txt

While updating DTS might be good idea, I don't think you can simply
blame this on DTS. If it worked before the change, it is supposed to
work after the change, otherwise we call that change a "regression"
and revert the change.

Plus, DTS is supposed to be ABI. Old DTS should still work on new
kernels in ideal world.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


2015-12-03 21:24:05

by David Daney

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

On 12/03/2015 12:48 PM, Pavel Machek wrote:
> On Thu 2015-10-15 13:25:59, Florian Fainelli wrote:
>> On 15/10/15 12:59, Dinh Nguyen wrote:
>>> On 10/15/2015 03:03 PM, Florian Fainelli wrote:
>>>> On 15/10/15 12:09, Dinh Nguyen wrote:
>>>>> Hi,
>>>>>
>>>>> commit "8b63ec1837fa phylib: Make PHYs children of their MDIO bus, not
>>>>> the bus' parent." seems to have broken ethernet support for the SoCFPGA
>>>>> platform which is using the stmmac ethernet driver.
>>>>
>>>> It is not clear to me how this relates to what you are seeing yet.
>>>>
>>>>>
>>>>> It appears that during DHCP, it cannot get an IP address. This only
>>>>> happens if ethernet was not used by the bootloader to tftp an kernel
>>>>> image. If I use the bootloader to tftp an image then ethernet is working
>>>>> fine. So I think the PHY is not getting enabled properly.
>>>>>
>>>>> If I revert this patch, then ethernet is back to working on the platform.
>>>>
>>>> Is the Device Tree source for this platform available somewhere to look at?
>>>>
>>>
>>> Yes, I'm using the DTS that is in the mainline:
>>>
>>> arch/arm/boot/dts/socfpga.dtsi
>>> arch/arm/boot/dts/socfpga_cyclone5.dtsi
>>> arch/arm/boot/dts/socfpga_cyclone5_socdk.dts
>>
>> There are no PHY devices in any of these DTS files, instead there is the
>> non-standard "phy-addr" property which is set to 0xffffffff supposedly
>> to indicate that the MDIO bus should be scanned. This is likely part of
>> your problem. The stmmac driver seems to be looking for "snps,phy-addr"
>> and not "phy-addr", so I am not even clear how this is supposed to work,
>> and the driver mentions this custom property is deprecated anyway.
>>
>> The core problem is in
>> drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c::stmmac_mdio_register
>> which manually detects the PHY, that is mostly fine, except that it does
>> not really seem to work here for a reason that is still unclear to me.
>>
>> Your Ethernet PHYs need to be declared in Device Tree, see
>> Documentation/devicetree/bindings/net/phy.txt
>
> While updating DTS might be good idea, I don't think you can simply
> blame this on DTS. If it worked before the change, it is supposed to
> work after the change, otherwise we call that change a "regression"
> and revert the change.

FWIW: My initial patch to address the failure worked with the original DTB.

Also: userspace wasn't broken. So, the commandment about not breaking
userspace wasn't broken. Although admittedly, breaking the kernel isn't
good either.



>
> Plus, DTS is supposed to be ABI. Old DTS should still work on new
> kernels in ideal world.

If you supply the device tree file in the kernel tree, it is not an ABI.

If the device tree is not part of the kernel, and instead comes from the
boot firmware of the board, then you could make the ABI claim.

David Daney

2015-12-04 00:58:15

by Dinh Nguyen

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

On 12/03/2015 03:23 PM, David Daney wrote:
> On 12/03/2015 12:48 PM, Pavel Machek wrote:
>> On Thu 2015-10-15 13:25:59, Florian Fainelli wrote:
>>> On 15/10/15 12:59, Dinh Nguyen wrote:
>>>> On 10/15/2015 03:03 PM, Florian Fainelli wrote:
>>>>> On 15/10/15 12:09, Dinh Nguyen wrote:
>>>>>> Hi,
>>>>>>
>>>>>> commit "8b63ec1837fa phylib: Make PHYs children of their MDIO bus,
>>>>>> not
>>>>>> the bus' parent." seems to have broken ethernet support for the
>>>>>> SoCFPGA
>>>>>> platform which is using the stmmac ethernet driver.
>>>>>
>>>>> It is not clear to me how this relates to what you are seeing yet.
>>>>>
>>>>>>
>>>>>> It appears that during DHCP, it cannot get an IP address. This only
>>>>>> happens if ethernet was not used by the bootloader to tftp an kernel
>>>>>> image. If I use the bootloader to tftp an image then ethernet is
>>>>>> working
>>>>>> fine. So I think the PHY is not getting enabled properly.
>>>>>>
>>>>>> If I revert this patch, then ethernet is back to working on the
>>>>>> platform.
>>>>>
>>>>> Is the Device Tree source for this platform available somewhere to
>>>>> look at?
>>>>>
>>>>
>>>> Yes, I'm using the DTS that is in the mainline:
>>>>
>>>> arch/arm/boot/dts/socfpga.dtsi
>>>> arch/arm/boot/dts/socfpga_cyclone5.dtsi
>>>> arch/arm/boot/dts/socfpga_cyclone5_socdk.dts
>>>
>>> There are no PHY devices in any of these DTS files, instead there is the
>>> non-standard "phy-addr" property which is set to 0xffffffff supposedly
>>> to indicate that the MDIO bus should be scanned. This is likely part of
>>> your problem. The stmmac driver seems to be looking for "snps,phy-addr"
>>> and not "phy-addr", so I am not even clear how this is supposed to work,
>>> and the driver mentions this custom property is deprecated anyway.
>>>
>>> The core problem is in
>>> drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c::stmmac_mdio_register
>>> which manually detects the PHY, that is mostly fine, except that it does
>>> not really seem to work here for a reason that is still unclear to me.
>>>
>>> Your Ethernet PHYs need to be declared in Device Tree, see
>>> Documentation/devicetree/bindings/net/phy.txt
>>
>> While updating DTS might be good idea, I don't think you can simply
>> blame this on DTS. If it worked before the change, it is supposed to
>> work after the change, otherwise we call that change a "regression"
>> and revert the change.
>
> FWIW: My initial patch to address the failure worked with the original DTB.
>

Can I ask what patch are you referring to? I was sidetracked for a while
on this issue, but I still see it failing as of v4.4-rc3. I'll try to
get back to debugging this.

Dinh

2015-12-04 01:11:00

by Andrew Lunn

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

> > FWIW: My initial patch to address the failure worked with the original DTB.
>
> Can I ask what patch are you referring to? I was sidetracked for a while
> on this issue, but I still see it failing as of v4.4-rc3. I'll try to
> get back to debugging this.

Hi Dinh

There are two different patches:

https://lkml.org/lkml/2015/10/16/669

and

https://www.mail-archive.com/[email protected]/msg83183.html

Although the first one works, it keeps searching up and up and up, so
you could in theory put the phy properties a lot higher than the MAC.

The second patch restricts where it looks for the phy properties to
only the MAC. But it does not work. We would like to understand why it
does not work.

Andrew

2015-12-04 01:51:06

by Andrew Lunn

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

On Fri, Dec 04, 2015 at 02:10:50AM +0100, Andrew Lunn wrote:
> > > FWIW: My initial patch to address the failure worked with the original DTB.
> >
> > Can I ask what patch are you referring to? I was sidetracked for a while
> > on this issue, but I still see it failing as of v4.4-rc3. I'll try to
> > get back to debugging this.
>
> Hi Dinh
>
> There are two different patches:
>
> https://lkml.org/lkml/2015/10/16/669
>
> and
>
> https://www.mail-archive.com/[email protected]/msg83183.html
>
> Although the first one works, it keeps searching up and up and up, so
> you could in theory put the phy properties a lot higher than the MAC.
>
> The second patch restricts where it looks for the phy properties to
> only the MAC. But it does not work. We would like to understand why it
> does not work.

Hi Dinh

Please could you run this patch and let us know what it outputs.

Thanks
Andrew

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index cf6312fafea5..d7ddc0bb0e7f 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -26,6 +26,7 @@
#include <linux/module.h>
#include <linux/phy.h>
#include <linux/micrel_phy.h>
+#include <linux/netdevice.h>
#include <linux/of.h>
#include <linux/clk.h>

@@ -339,9 +340,19 @@ static int ksz9021_config_init(struct phy_device *phydev)
{
const struct device *dev = &phydev->dev;
const struct device_node *of_node = dev->of_node;
+ const struct device *dev_walker;

- if (!of_node && dev->parent->of_node)
- of_node = dev->parent->of_node;
+ dev_info(dev, "dev->parent: %s\n", dev_name(dev->parent));
+ dev_info(dev, "phydev->attached_dev->dev: %s\n", dev_name(&phydev->attached_dev->dev));
+
+ dev_walker = &phydev->dev;
+ do {
+ of_node = dev_walker->of_node;
+ dev_info(dev, "walking: %s %p\n",
+ dev_name(dev_walker), of_node);
+ dev_walker = dev_walker->parent;
+
+ } while (!of_node && dev_walker);

if (of_node) {
ksz9021_load_values_from_of(phydev, of_node,

2015-12-04 09:38:44

by Pavel Machek

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken


> >While updating DTS might be good idea, I don't think you can simply
> >blame this on DTS. If it worked before the change, it is supposed to
> >work after the change, otherwise we call that change a "regression"
> >and revert the change.
>
> FWIW: My initial patch to address the failure worked with the original DTB.
>
> Also: userspace wasn't broken. So, the commandment about not breaking
> userspace wasn't broken. Although admittedly, breaking the kernel isn't
> good either.

You can't break neither kernel nor userspace.

> >Plus, DTS is supposed to be ABI. Old DTS should still work on new
> >kernels in ideal world.
>
> If you supply the device tree file in the kernel tree, it is not an ABI.
>
> If the device tree is not part of the kernel, and instead comes from the
> boot firmware of the board, then you could make the ABI claim.

It is an ABI if it was declared so, and it was. Yes, it _can_ come
from kernel tree. That does not mean it has to.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2015-12-04 11:35:06

by Dinh Nguyen

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

Hi Andrew,

On Fri, 4 Dec 2015, Andrew Lunn wrote:

> On Fri, Dec 04, 2015 at 02:10:50AM +0100, Andrew Lunn wrote:
> > > > FWIW: My initial patch to address the failure worked with the original DTB.
> > >
> > > Can I ask what patch are you referring to? I was sidetracked for a while
> > > on this issue, but I still see it failing as of v4.4-rc3. I'll try to
> > > get back to debugging this.
> >
> > Hi Dinh
> >
> > There are two different patches:
> >
> > https://lkml.org/lkml/2015/10/16/669
> >
> > and
> >
> > https://www.mail-archive.com/[email protected]/msg83183.html
> >
> > Although the first one works, it keeps searching up and up and up, so
> > you could in theory put the phy properties a lot higher than the MAC.
> >
> > The second patch restricts where it looks for the phy properties to
> > only the MAC. But it does not work. We would like to understand why it
> > does not work.
>
> Hi Dinh
>
> Please could you run this patch and let us know what it outputs.
>
> Thanks
> Andrew
>
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index cf6312fafea5..d7ddc0bb0e7f 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -26,6 +26,7 @@
> #include <linux/module.h>
> #include <linux/phy.h>
> #include <linux/micrel_phy.h>
> +#include <linux/netdevice.h>
> #include <linux/of.h>
> #include <linux/clk.h>
>
> @@ -339,9 +340,19 @@ static int ksz9021_config_init(struct phy_device *phydev)
> {
> const struct device *dev = &phydev->dev;
> const struct device_node *of_node = dev->of_node;
> + const struct device *dev_walker;
>
> - if (!of_node && dev->parent->of_node)
> - of_node = dev->parent->of_node;
> + dev_info(dev, "dev->parent: %s\n", dev_name(dev->parent));
> + dev_info(dev, "phydev->attached_dev->dev: %s\n", dev_name(&phydev->attached_dev->dev));
> +
> + dev_walker = &phydev->dev;
> + do {
> + of_node = dev_walker->of_node;
> + dev_info(dev, "walking: %s %p\n",
> + dev_name(dev_walker), of_node);
> + dev_walker = dev_walker->parent;
> +
> + } while (!of_node && dev_walker);
>
> if (of_node) {
> ksz9021_load_values_from_of(phydev, of_node,
>

Here is the output from the above patch:

[ 1.042049] mmc0: new high speed SDHC card at address aaaa
[ 1.048017] mmcblk0: mmc0:aaaa SU32G 29.7 GiB
[ 1.053506] mmcblk0: p1 p2 p3 p4
[ 1.057708] dwc2 ffb40000.usb: Configuration mismatch. Forcing host mode
[ 1.064418] dwc2 ffb40000.usb: no platform data or transceiver defined
[ 1.070966] Micrel KSZ9021 Gigabit PHY stmmac-0:04: dev->parent: stmmac-0
[ 1.077746] Micrel KSZ9021 Gigabit PHY stmmac-0:04: phydev->attached_dev->dev: eth0
[ 1.085389] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: stmmac-0:04 (null)
[ 1.092841] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: stmmac-0 (null)
[ 1.100042] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: ff702000.ethernet ef9f3538
[ 1.133638] Sending DHCP requests ..
[ 5.104138] socfpga-dwmac ff702000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

BR,
Dinh

2015-12-04 14:31:17

by Andrew Lunn

[permalink] [raw]
Subject: Re: SoCFPGA ethernet broken

> > @@ -339,9 +340,19 @@ static int ksz9021_config_init(struct phy_device *phydev)
> > {
> > const struct device *dev = &phydev->dev;
> > const struct device_node *of_node = dev->of_node;
> > + const struct device *dev_walker;
> >
> > - if (!of_node && dev->parent->of_node)
> > - of_node = dev->parent->of_node;
> > + dev_info(dev, "dev->parent: %s\n", dev_name(dev->parent));
> > + dev_info(dev, "phydev->attached_dev->dev: %s\n", dev_name(&phydev->attached_dev->dev));
> > +
> > + dev_walker = &phydev->dev;
> > + do {
> > + of_node = dev_walker->of_node;
> > + dev_info(dev, "walking: %s %p\n",
> > + dev_name(dev_walker), of_node);
> > + dev_walker = dev_walker->parent;
> > +
> > + } while (!of_node && dev_walker);
> >
> > if (of_node) {
> > ksz9021_load_values_from_of(phydev, of_node,
> >
>
> Here is the output from the above patch:
>
> [ 1.042049] mmc0: new high speed SDHC card at address aaaa
> [ 1.048017] mmcblk0: mmc0:aaaa SU32G 29.7 GiB
> [ 1.053506] mmcblk0: p1 p2 p3 p4
> [ 1.057708] dwc2 ffb40000.usb: Configuration mismatch. Forcing host mode
> [ 1.064418] dwc2 ffb40000.usb: no platform data or transceiver defined
> [ 1.070966] Micrel KSZ9021 Gigabit PHY stmmac-0:04: dev->parent: stmmac-0
> [ 1.077746] Micrel KSZ9021 Gigabit PHY stmmac-0:04: phydev->attached_dev->dev: eth0
> [ 1.085389] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: stmmac-0:04 (null)
> [ 1.092841] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: stmmac-0 (null)
> [ 1.100042] Micrel KSZ9021 Gigabit PHY stmmac-0:04: walking: ff702000.ethernet ef9f3538

Ah! So we have an intermediary device in the
chain. phydev->attached_dev->dev points to this intermediate device,
rather than the platform device, which is why my patch failed.

I don't know the network stack well enough. Is this consider broken?
Is this valid? Is there any generic code which might try looking in
netdev->dev.of_node? Do other MAC drivers have an intermediate
device?

We could modify the stmac driver so that phydev->attached_dev->dev
does point to the platform device, and then my patch works. But if
there are other MAC drivers which are structured like this, they are
also broken when used with the Micrel PHY. So then walking up the
device tree is a better solution.

Comments from more knowledgeable people requested!

Andrew