From: Bartosz Golaszewski <[email protected]>
Two MACs may share MDIO lines to the PHYs. Let's allow that in the
stmmac driver by providing a new device-tree property allowing one MAC
node to reference the MDIO bus defined on a second MAC node.
Bartosz Golaszewski (2):
dt-bindings: net: snps,dwmac: document the snps,shared-mdio property
net: stmmac: support shared MDIO
Documentation/devicetree/bindings/net/snps,dwmac.yaml | 6 ++++++
drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 8 ++++++++
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 6 ++++++
include/linux/stmmac.h | 1 +
4 files changed, 21 insertions(+)
--
2.39.2
From: Bartosz Golaszewski <[email protected]>
When two MACs share the MDIO lines to their respective PHYs, one is
considered the logical "owner" of the bus. The secondary controller must
wait until the MDIO bus is registered before trying to attach to the
PHY.
If the mdio node is not defined for given MAC, try to read the
"snps,shared-mdio" property on its node. If it exists, parse the phandle
and store the result as the MAC's mdio device-tree node.
When registering the MDIO bus: if we know that we share it with another
MAC, lookup the MDIO bus and if it's not up yet, defer probe until it
is.
Signed-off-by: Bartosz Golaszewski <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 8 ++++++++
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 6 ++++++
include/linux/stmmac.h | 1 +
3 files changed, 15 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index dd9e2fec5328..6a74b91595d0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -543,6 +543,14 @@ int stmmac_mdio_register(struct net_device *ndev)
if (!mdio_bus_data)
return 0;
+ if (priv->plat->flags & STMMAC_FLAG_SHARED_MDIO) {
+ new_bus = of_mdio_find_bus(mdio_node);
+ if (!new_bus)
+ return -EPROBE_DEFER;
+
+ goto bus_register_done;
+ }
+
new_bus = mdiobus_alloc();
if (!new_bus)
return -ENOMEM;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index be8e79c7aa34..11a24b1c7beb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -340,6 +340,12 @@ static int stmmac_dt_phy(struct plat_stmmacenet_data *plat,
}
}
+ if (!plat->mdio_node) {
+ plat->mdio_node = of_parse_phandle(np, "snps,shared-mdio", 0);
+ if (plat->mdio_node)
+ plat->flags |= STMMAC_FLAG_SHARED_MDIO;
+ }
+
if (plat->mdio_node) {
dev_dbg(dev, "Found MDIO subnode\n");
mdio = true;
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index 3d0702510224..892f61051002 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -218,6 +218,7 @@ struct dwmac4_addrs {
#define STMMAC_FLAG_INT_SNAPSHOT_EN BIT(9)
#define STMMAC_FLAG_RX_CLK_RUNS_IN_LPI BIT(10)
#define STMMAC_FLAG_EN_TX_LPI_CLOCKGATING BIT(11)
+#define STMMAC_FLAG_SHARED_MDIO BIT(12)
struct plat_stmmacenet_data {
int bus_id;
--
2.39.2
On Mon, Aug 07, 2023 at 09:31:00PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <[email protected]>
>
> Two MACs may share MDIO lines to the PHYs. Let's allow that in the
> stmmac driver by providing a new device-tree property allowing one MAC
> node to reference the MDIO bus defined on a second MAC node.
I don't understand why this is needed. phy-handle can point to a phy
on any MDIO bus. So it is no problem for one MAC to point to the other
MACs MDIO bus as is.
You do sometimes get into ordering problems, especially if MAC0 is
pointing to a PHY on MAC1 MDIO bus. But MAC0 should get a
-EPROBE_DEFER, MAC1 then probes, creating its MDIO bus and the two
PHYs on it, and then later MAC0 is probes again and is successful.
Andrew
From: Bartosz Golaszewski <[email protected]>
Two MACs may share one MDIO lines to their respective PHYs. In this case
one of the MACs is the logical "owner" of the bus, while the other can be
considered a secondary controller. Add a new property that allows one
MAC node to reference the MDIO node on a different MAC over a phandle.
Signed-off-by: Bartosz Golaszewski <[email protected]>
---
Documentation/devicetree/bindings/net/snps,dwmac.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
index ddf9522a5dc2..f9c2285674d1 100644
--- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
@@ -500,6 +500,12 @@ properties:
required:
- compatible
+ snps,shared-mdio:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ Phandle to the mdio node defined on a different MAC node which this
+ device shares.
+
stmmac-axi-config:
type: object
unevaluatedProperties: false
--
2.39.2
On Tue, Aug 8, 2023 at 3:26 PM Russell King (Oracle)
<[email protected]> wrote:
>
> On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > Ok so upon some further investigation, the actual culprit is in stmmac
> > platform code - it always tries to register an MDIO bus - independent
> > of whether there is an actual mdio child node - unless the MAC is
> > marked explicitly as having a fixed-link.
> >
> > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > created the MDIO bus.
> >
> > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> >
> > If the schematics look something like this:
> >
> > -------- -------
> > | MAC0 |--MDIO-----| PHY |
> > -------- | | -------
> > | |
> > -------- | | -------
> > | MAC1 |-- ----| PHY |
> > -------- -------
> >
> > Then it would make sense to model it on the device tree?
>
> So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> masters, and the hardware designer decided to tie both together to
> a single set of clock and data lines, which then go to two PHYs.
The schematics I have are not very clear on that, but now that you
mention this, it's most likely the case.
>
> In that case, I would strongly advise only registering one MDIO bus,
> and avoid registering the second one - thereby preventing any issues
> caused by both MDIO bus masters trying to talk at the same time.
>
I sent a patch for that earlier today.
> The PHYs should be populated in firmware on just one of the buses.
>
> You will also need to ensure that whatever registers the bus does
> make sure that the clocks necessary for communicating on the bus
> are under control of the MDIO bus code and not the ethernet MAC
> code. We've run into problems in the past where this has not been
> the case, and it means - taking your example above - that when MAC1
> wants to talk to its PHY, if MAC0 isn't alive it can't.
Good point, but it's worse than that: when MAC0 is unbound, it will
unregister the MDIO bus and destroy all PHY devices. These are not
refcounted so they will literally go from under MAC1. Not sure how
this can be dealt with?
>
> So just be aware of the clocking situation and make sure that your
> MDIO bus code is managing the clocks necessary for the MDIO bus
> master to work.
Doesn't seem like stmmac is ready for it as it is now so this is going
to be fun...
Bartosz
>
> In regard to sharing of the MDIO bus signals between two bus
> masters, I do not believe that is permissible - there's no
> collision detection in hardware like there is on I涎. So
> having two MDIO bus masters talking at the same time would
> end up corrupting the MDC (clock) and MDIO (data) signals if
> both were active at the same time.
>
> > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > platform code - it always tries to register an MDIO bus - independent
> > > of whether there is an actual mdio child node - unless the MAC is
> > > marked explicitly as having a fixed-link.
> > >
> > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > created the MDIO bus.
> > >
> > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > >
> > > If the schematics look something like this:
> > >
> > > -------- -------
> > > | MAC0 |--MDIO-----| PHY |
> > > -------- | | -------
> > > | |
> > > -------- | | -------
> > > | MAC1 |-- ----| PHY |
> > > -------- -------
> > >
> > > Then it would make sense to model it on the device tree?
> >
> > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > masters, and the hardware designer decided to tie both together to
> > a single set of clock and data lines, which then go to two PHYs.
>
> The schematics I have are not very clear on that, but now that you
> mention this, it's most likely the case.
I hope not. That would be very broken. As Russell pointed out, MDIO is
not multi-master. You need to check with the hardware designer if the
schematics are not clear.
> Good point, but it's worse than that: when MAC0 is unbound, it will
> unregister the MDIO bus and destroy all PHY devices. These are not
> refcounted so they will literally go from under MAC1. Not sure how
> this can be dealt with?
unbinding is not a normal operation. So i would just live with it, and
if root decides to shoot herself in the foot, that is her choice.
Andrew
On Tue, Aug 08, 2023 at 04:30:05PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <[email protected]> wrote:
> >
> > > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > > platform code - it always tries to register an MDIO bus - independent
> > > > > of whether there is an actual mdio child node - unless the MAC is
> > > > > marked explicitly as having a fixed-link.
> > > > >
> > > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > > created the MDIO bus.
> > > > >
> > > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > > >
> > > > > If the schematics look something like this:
> > > > >
> > > > > -------- -------
> > > > > | MAC0 |--MDIO-----| PHY |
> > > > > -------- | | -------
> > > > > | |
> > > > > -------- | | -------
> > > > > | MAC1 |-- ----| PHY |
> > > > > -------- -------
> > > > >
> > > > > Then it would make sense to model it on the device tree?
> > > >
> > > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > > masters, and the hardware designer decided to tie both together to
> > > > a single set of clock and data lines, which then go to two PHYs.
> > >
> > > The schematics I have are not very clear on that, but now that you
> > > mention this, it's most likely the case.
> >
> > I hope not. That would be very broken. As Russell pointed out, MDIO is
> > not multi-master. You need to check with the hardware designer if the
> > schematics are not clear.
>
> Sorry, it was not very clear. It's the case that two MDIO masters
> share the MDC and data lines.
I'll make the water muddier (hopefully clearer?). I have access to the
board schematic (not SIP/SOM stuff though), but that should help here.
MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
pinmuxed to gpio21/22.
On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
one is connected to MAC1.
MDIO1 is not connected to anything on the board. So there is only one
MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
MAC0/MAC1.
Does that make sense? I don't think from a hardware design standpoint
this is violating anything, it isn't a multimaster setup on MDIO.
>
> >
> > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > refcounted so they will literally go from under MAC1. Not sure how
> > > this can be dealt with?
> >
> > unbinding is not a normal operation. So i would just live with it, and
> > if root decides to shoot herself in the foot, that is her choice.
> >
>
> I disagree. Unbinding is very much a normal operation. Less so for
> platform devices but still, it is there for a reason and should be
> expected to work correctly. Or at the very least not crash and burn
> the system.
>
> On the other hand, I like your approach because I may get away without
> having to fix it. But if I were to fix it - I would reference the MDIO
> bus from the secondary mac by phandle and count its references before
> dropping it. :)
>
> Bartosz
>
On Tue, Aug 08, 2023 at 05:15:45PM +0200, Andrew Lunn wrote:
> > > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > > this can be dealt with?
> > > >
> > > > unbinding is not a normal operation. So i would just live with it, and
> > > > if root decides to shoot herself in the foot, that is her choice.
> > > >
> > >
> > > I disagree. Unbinding is very much a normal operation.
>
> What do you use it for?
>
> I don't think i've ever manually done it. Maybe as part of a script to
> unbind the FTDI driver from an FTDI device in order to use user space
> tools to program the EEPROM? But that is about it.
>
> I actually expect many unbind operations are broken because it is very
> rarely used.
rmmod! Particularly useful during driver development, I tend to use it
extensively - and it has the advantage of testing those unbind paths!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> Ok so upon some further investigation, the actual culprit is in stmmac
> platform code - it always tries to register an MDIO bus - independent
> of whether there is an actual mdio child node - unless the MAC is
> marked explicitly as having a fixed-link.
>
> When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> created the MDIO bus.
>
> Even so, isn't it useful to actually reference the shared MDIO bus in some way?
>
> If the schematics look something like this:
>
> -------- -------
> | MAC0 |--MDIO-----| PHY |
> -------- | | -------
> | |
> -------- | | -------
> | MAC1 |-- ----| PHY |
> -------- -------
>
> Then it would make sense to model it on the device tree?
So I think what you're saying is that MAC0 and MAC1's have MDIO bus
masters, and the hardware designer decided to tie both together to
a single set of clock and data lines, which then go to two PHYs.
In that case, I would strongly advise only registering one MDIO bus,
and avoid registering the second one - thereby preventing any issues
caused by both MDIO bus masters trying to talk at the same time.
The PHYs should be populated in firmware on just one of the buses.
You will also need to ensure that whatever registers the bus does
make sure that the clocks necessary for communicating on the bus
are under control of the MDIO bus code and not the ethernet MAC
code. We've run into problems in the past where this has not been
the case, and it means - taking your example above - that when MAC1
wants to talk to its PHY, if MAC0 isn't alive it can't.
So just be aware of the clocking situation and make sure that your
MDIO bus code is managing the clocks necessary for the MDIO bus
master to work.
In regard to sharing of the MDIO bus signals between two bus
masters, I do not believe that is permissible - there's no
collision detection in hardware like there is on I?C. So
having two MDIO bus masters talking at the same time would
end up corrupting the MDC (clock) and MDIO (data) signals if
both were active at the same time.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, Aug 08, 2023 at 09:44:16AM -0500, Andrew Halaney wrote:
> On Tue, Aug 08, 2023 at 04:30:05PM +0200, Bartosz Golaszewski wrote:
> > On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <[email protected]> wrote:
> > >
> > > > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > > > platform code - it always tries to register an MDIO bus - independent
> > > > > > of whether there is an actual mdio child node - unless the MAC is
> > > > > > marked explicitly as having a fixed-link.
> > > > > >
> > > > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > > > created the MDIO bus.
> > > > > >
> > > > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > > > >
> > > > > > If the schematics look something like this:
> > > > > >
> > > > > > -------- -------
> > > > > > | MAC0 |--MDIO-----| PHY |
> > > > > > -------- | | -------
> > > > > > | |
> > > > > > -------- | | -------
> > > > > > | MAC1 |-- ----| PHY |
> > > > > > -------- -------
> > > > > >
> > > > > > Then it would make sense to model it on the device tree?
> > > > >
> > > > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > > > masters, and the hardware designer decided to tie both together to
> > > > > a single set of clock and data lines, which then go to two PHYs.
> > > >
> > > > The schematics I have are not very clear on that, but now that you
> > > > mention this, it's most likely the case.
> > >
> > > I hope not. That would be very broken. As Russell pointed out, MDIO is
> > > not multi-master. You need to check with the hardware designer if the
> > > schematics are not clear.
> >
> > Sorry, it was not very clear. It's the case that two MDIO masters
> > share the MDC and data lines.
>
> I'll make the water muddier (hopefully clearer?). I have access to the
> board schematic (not SIP/SOM stuff though), but that should help here.
>
> MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> pinmuxed to gpio21/22.
>
> On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> one is connected to MAC1.
>
> MDIO1 is not connected to anything on the board. So there is only one
> MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> MAC0/MAC1.
>
> Does that make sense? I don't think from a hardware design standpoint
> this is violating anything, it isn't a multimaster setup on MDIO.
That all sounds sane, thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Mon, Aug 7, 2023 at 9:50 PM Andrew Lunn <[email protected]> wrote:
>
> On Mon, Aug 07, 2023 at 09:31:00PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <[email protected]>
> >
> > Two MACs may share MDIO lines to the PHYs. Let's allow that in the
> > stmmac driver by providing a new device-tree property allowing one MAC
> > node to reference the MDIO bus defined on a second MAC node.
>
> I don't understand why this is needed. phy-handle can point to a phy
> on any MDIO bus. So it is no problem for one MAC to point to the other
> MACs MDIO bus as is.
>
> You do sometimes get into ordering problems, especially if MAC0 is
> pointing to a PHY on MAC1 MDIO bus. But MAC0 should get a
> -EPROBE_DEFER, MAC1 then probes, creating its MDIO bus and the two
> PHYs on it, and then later MAC0 is probes again and is successful.
>
> Andrew
Ok so upon some further investigation, the actual culprit is in stmmac
platform code - it always tries to register an MDIO bus - independent
of whether there is an actual mdio child node - unless the MAC is
marked explicitly as having a fixed-link.
When I fixed that, MAC1's probe is correctly deferred until MAC0 has
created the MDIO bus.
Even so, isn't it useful to actually reference the shared MDIO bus in some way?
If the schematics look something like this:
-------- -------
| MAC0 |--MDIO-----| PHY |
-------- | | -------
| |
-------- | | -------
| MAC1 |-- ----| PHY |
-------- -------
Then it would make sense to model it on the device tree?
Anyway, this can be discussed later, I will drop this for now and send
a fix for stmmac mdio code instead to get this upstream.
Bart
On Tue, Aug 08, 2023 at 08:26:22PM +0200, Bartosz Golaszewski wrote:
> When I say "device unbind", I don't just mean manual unbinding using
> sysfs. I mean any code path (rmmod, unplugging the USB, etc.) that
> leads to the device being detached from its driver. This is a
> perfectly normal situation and should work correctly.
>
> I won't be fixing it for this series but may end up looking into
> establishing some kind of device links between MACs and their "remote"
> PHYs that would allow to safely unbind them.
I don't think you're the first to suggest that!
That gets difficult - because although the PHY may be a different
driver, the MDIO bus may be provided by the _same_ hardware as the
ethernet MAC itself. So you end up with a circular dependency - the
PHY device depends on the MDIO bus device (which is the ethernet MAC)
and then you make the ethernet MAC depend on the PHY device.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, Aug 08, 2023 at 04:09:11PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 8, 2023 at 3:26 PM Russell King (Oracle)
> <[email protected]> wrote:
> >
> > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > platform code - it always tries to register an MDIO bus - independent
> > > of whether there is an actual mdio child node - unless the MAC is
> > > marked explicitly as having a fixed-link.
> > >
> > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > created the MDIO bus.
> > >
> > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > >
> > > If the schematics look something like this:
> > >
> > > -------- -------
> > > | MAC0 |--MDIO-----| PHY |
> > > -------- | | -------
> > > | |
> > > -------- | | -------
> > > | MAC1 |-- ----| PHY |
> > > -------- -------
> > >
> > > Then it would make sense to model it on the device tree?
> >
> > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > masters, and the hardware designer decided to tie both together to
> > a single set of clock and data lines, which then go to two PHYs.
>
> The schematics I have are not very clear on that, but now that you
> mention this, it's most likely the case.
>
> >
> > In that case, I would strongly advise only registering one MDIO bus,
> > and avoid registering the second one - thereby preventing any issues
> > caused by both MDIO bus masters trying to talk at the same time.
> >
>
> I sent a patch for that earlier today.
>
> > The PHYs should be populated in firmware on just one of the buses.
> >
> > You will also need to ensure that whatever registers the bus does
> > make sure that the clocks necessary for communicating on the bus
> > are under control of the MDIO bus code and not the ethernet MAC
> > code. We've run into problems in the past where this has not been
> > the case, and it means - taking your example above - that when MAC1
> > wants to talk to its PHY, if MAC0 isn't alive it can't.
>
> Good point, but it's worse than that: when MAC0 is unbound, it will
> unregister the MDIO bus and destroy all PHY devices. These are not
> refcounted so they will literally go from under MAC1. Not sure how
> this can be dealt with?
That has been a problem in the past, where a MII bus has been
registered by a driver, and then because its probe defers, the MII
bus gets torn down.
The "simple" solution to this is... try to avoid registering the MII
bus until you're sure that the probing will not defer. It is far from
perfect, since there's still the opportunity to unbind the driver
causing the MII bus to vanish along with the PHYs.
I have mentioned trying to address the issue of PHY drivers being
unbound in the past, and there's been some improvements with that,
but if the phy_device vanishes while something is using it, it
certainly will not end well. phylib is not the only case of this,
there are numerous instances of it. One of the recent ones that
I happened to be reminded of today is the pcs-rzn1-miic thing...
If you have a look at miic_create() and consider what would happen
if:
if (!pdev || !platform_get_drvdata(pdev))
return ERR_PTR(-EPROBE_DEFER);
... another thread ended up executing miic_remove() for this
platform device at this very point ...
miic_port = kzalloc(sizeof(*miic_port), GFP_KERNEL);
if (!miic_port)
return ERR_PTR(-ENOMEM);
miic = platform_get_drvdata(pdev);
device_link_add(dev, miic->dev, DL_FLAG_AUTOREMOVE_CONSUMER);
The devm allocation for "miic" would be freed, so either miic
ends up a stale pointer if it happened after this point, or
if miic_remove() completes, then platform_get_drvdata() returns
NULL and we oops the kernel here.
It's an unlikely race, but it's still a race. Sadly, the kernel
is getting riddled with things like this. I used to point these
things out, but having been shouted down many times I've given
up raising it.
Another example is the direct rendering manager bridge code
(drm_bridge).
I suggest a similar approach to not caring too much about this
for your own sanity... providing it doesn't actually cause a
problem!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
> I'll make the water muddier (hopefully clearer?). I have access to the
> board schematic (not SIP/SOM stuff though), but that should help here.
>
> MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> pinmuxed to gpio21/22.
>
> On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> one is connected to MAC1.
>
> MDIO1 is not connected to anything on the board. So there is only one
> MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> MAC0/MAC1.
>
> Does that make sense? I don't think from a hardware design standpoint
> this is violating anything, it isn't a multimaster setup on MDIO.
Thanks for taking a detailed look at the schematics. This is how i
would expect it to be.
> > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > this can be dealt with?
> > >
> > > unbinding is not a normal operation. So i would just live with it, and
> > > if root decides to shoot herself in the foot, that is her choice.
> > >
> >
> > I disagree. Unbinding is very much a normal operation.
What do you use it for?
I don't think i've ever manually done it. Maybe as part of a script to
unbind the FTDI driver from an FTDI device in order to use user space
tools to program the EEPROM? But that is about it.
I actually expect many unbind operations are broken because it is very
rarely used.
Andrew
> Ok so upon some further investigation, the actual culprit is in stmmac
> platform code - it always tries to register an MDIO bus - independent
> of whether there is an actual mdio child node - unless the MAC is
> marked explicitly as having a fixed-link.
If the MDIO bus does exist registering it should not be a problem.
Ideally it should have an internal pull up on its MDIO line, so that
all reads return 0xffff. The phylib core will then determine there are
no devices on it. Not actually registering it because there is no MDIO
node in DT is just an optimisation.
> When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> created the MDIO bus.
Great. That is how it should work.
> Even so, isn't it useful to actually reference the shared MDIO bus in some way?
Why? Linux does not care where the PHY is. There are some SoCs with an
independent MDIO bus masters. They have there own node in DT, there
own driver etc. You can create an MDIO bus from two or three GPIOs and
bit banging, again as an node in DT. There are Ethernet switches
which can have 11 ports, and 2 MDIO busses, one purely internal and
one external, and the PHYs are scattered over these busses.
All linux needs is a phandle to the PHY, nothing more.
Andrew
On Tue, Aug 8, 2023 at 5:15 PM Andrew Lunn <[email protected]> wrote:
>
> > I'll make the water muddier (hopefully clearer?). I have access to the
> > board schematic (not SIP/SOM stuff though), but that should help here.
> >
> > MAC0 owns its own MDIO bus (we'll call it MDIO0). It is pinmuxed to
> > gpio8/gpio9 for mdc/mdio. MAC1 owns its own bus (MDIO1) which is
> > pinmuxed to gpio21/22.
> >
> > On MDIO0 there are two SGMII ethernet phys. One is connected to MAC0,
> > one is connected to MAC1.
> >
> > MDIO1 is not connected to anything on the board. So there is only one
> > MDIO master, MAC0 on MDIO0, and it manages the ethernet phy for both
> > MAC0/MAC1.
> >
> > Does that make sense? I don't think from a hardware design standpoint
> > this is violating anything, it isn't a multimaster setup on MDIO.
>
> Thanks for taking a detailed look at the schematics. This is how i
> would expect it to be.
>
> > > > > Good point, but it's worse than that: when MAC0 is unbound, it will
> > > > > unregister the MDIO bus and destroy all PHY devices. These are not
> > > > > refcounted so they will literally go from under MAC1. Not sure how
> > > > > this can be dealt with?
> > > >
> > > > unbinding is not a normal operation. So i would just live with it, and
> > > > if root decides to shoot herself in the foot, that is her choice.
> > > >
> > >
> > > I disagree. Unbinding is very much a normal operation.
>
> What do you use it for?
>
> I don't think i've ever manually done it. Maybe as part of a script to
> unbind the FTDI driver from an FTDI device in order to use user space
> tools to program the EEPROM? But that is about it.
>
> I actually expect many unbind operations are broken because it is very
> rarely used.
>
When I say "device unbind", I don't just mean manual unbinding using
sysfs. I mean any code path (rmmod, unplugging the USB, etc.) that
leads to the device being detached from its driver. This is a
perfectly normal situation and should work correctly.
I won't be fixing it for this series but may end up looking into
establishing some kind of device links between MACs and their "remote"
PHYs that would allow to safely unbind them.
Bart
On Tue, Aug 8, 2023 at 4:25 PM Andrew Lunn <[email protected]> wrote:
>
> > > On Tue, Aug 08, 2023 at 10:13:09AM +0200, Bartosz Golaszewski wrote:
> > > > Ok so upon some further investigation, the actual culprit is in stmmac
> > > > platform code - it always tries to register an MDIO bus - independent
> > > > of whether there is an actual mdio child node - unless the MAC is
> > > > marked explicitly as having a fixed-link.
> > > >
> > > > When I fixed that, MAC1's probe is correctly deferred until MAC0 has
> > > > created the MDIO bus.
> > > >
> > > > Even so, isn't it useful to actually reference the shared MDIO bus in some way?
> > > >
> > > > If the schematics look something like this:
> > > >
> > > > -------- -------
> > > > | MAC0 |--MDIO-----| PHY |
> > > > -------- | | -------
> > > > | |
> > > > -------- | | -------
> > > > | MAC1 |-- ----| PHY |
> > > > -------- -------
> > > >
> > > > Then it would make sense to model it on the device tree?
> > >
> > > So I think what you're saying is that MAC0 and MAC1's have MDIO bus
> > > masters, and the hardware designer decided to tie both together to
> > > a single set of clock and data lines, which then go to two PHYs.
> >
> > The schematics I have are not very clear on that, but now that you
> > mention this, it's most likely the case.
>
> I hope not. That would be very broken. As Russell pointed out, MDIO is
> not multi-master. You need to check with the hardware designer if the
> schematics are not clear.
Sorry, it was not very clear. It's the case that two MDIO masters
share the MDC and data lines.
>
> > Good point, but it's worse than that: when MAC0 is unbound, it will
> > unregister the MDIO bus and destroy all PHY devices. These are not
> > refcounted so they will literally go from under MAC1. Not sure how
> > this can be dealt with?
>
> unbinding is not a normal operation. So i would just live with it, and
> if root decides to shoot herself in the foot, that is her choice.
>
I disagree. Unbinding is very much a normal operation. Less so for
platform devices but still, it is there for a reason and should be
expected to work correctly. Or at the very least not crash and burn
the system.
On the other hand, I like your approach because I may get away without
having to fix it. But if I were to fix it - I would reference the MDIO
bus from the secondary mac by phandle and count its references before
dropping it. :)
Bartosz