Since the introduction of mac_managed_pm in the FEC driver there were some
discrepancies regarding power management of the PHY.
This failed on our board that has a permanently powered Microchip LAN8700R
attached to the FEC. Although the root cause of the failure can be traced
back to f166f890c8f0 ("net: ethernet: fec: Replace interrupt driven MDIO
with polled IO") and probably even before that, we only started noticing
the problem going from 5.10 to 6.1.
Since 557d5dc83f68 ("net: fec: use mac-managed PHY PM") is actually a fix
to most of the power management sequencing problems that came with power
managing the MDIO bus which for the FEC meant adding a race with FEC
resume (and phy_start() if netif was running) and PHY resume.
That it worked before for us was probably just luck...
Thanks to Wei's response to my report at [1] I was able to pick up his
patch and start honing in on the remaining missing details.
[1]: https://lore.kernel.org/netdev/[email protected]/
v3:
- Implement feedback from Wei Fang for patch 2
- Fixes tag in patch 2 dropped, should it be delayed for net-next now?
v2: https://lore.kernel.org/netdev/[email protected]/
- Completely different approach that should be much more correct
(Wei Fang, Jakub Kicinski)
- Re-target to net tree, because I have fixes tags now
v1: https://lore.kernel.org/netdev/[email protected]/
John Ernberg (1):
net: fec: Suspend the PHY on probe
Wei Fang (1):
net: fec: Set mac_managed_pm during probe
drivers/net/ethernet/freescale/fec_main.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
--
2.43.0
Since the power management is now performed by the FEC instead of generic
pm the PHY will not suspend until the link has been up.
Therefor suspend it on probe. It will be resumed by {of_,}phy_connect()
when the link is brought up.
Since {of_,}phy_connect() and phy_disconnect() will resume and suspend the
PHY when the link is brought up and down respectively, and phy_stop() and
phy_start() will resume and suspend the PHY in the suspend-resume paths
there is no need for any additional calls anywhere.
Signed-off-by: John Ernberg <[email protected]>
---
v3:
- Only call phy_suspend() in probe, it is taken care of by the phy framework (Wei Fang)
- Update commit message to reflect the above
- Drop fixes tag, this is technically not a fix anymore (Wei Fang)
Should I re-send this for the net-next tree later, or is it still ok for net?
v2:
- New patch
---
drivers/net/ethernet/freescale/fec_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 8decb1b072c5..fb092b7bfe85 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2539,8 +2539,10 @@ static int fec_enet_mii_init(struct platform_device *pdev)
/* find all the PHY devices on the bus and set mac_managed_pm to true */
for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
phydev = mdiobus_get_phy(fep->mii_bus, addr);
- if (phydev)
+ if (phydev) {
phydev->mac_managed_pm = true;
+ phy_suspend(phydev);
+ }
}
mii_cnt++;
--
2.43.0
Hello John,
I'm adding Andrew and Russell to the thread as PHY maintainers and
reviewers.
On Wed, 6 Mar 2024 13:37:45 +0000
John Ernberg <[email protected]> wrote:
> Since the power management is now performed by the FEC instead of generic
> pm the PHY will not suspend until the link has been up.
>
> Therefor suspend it on probe. It will be resumed by {of_,}phy_connect()
> when the link is brought up.
>
> Since {of_,}phy_connect() and phy_disconnect() will resume and suspend the
> PHY when the link is brought up and down respectively, and phy_stop() and
> phy_start() will resume and suspend the PHY in the suspend-resume paths
> there is no need for any additional calls anywhere.
>
> Signed-off-by: John Ernberg <[email protected]>
[...]
> @@ -2539,8 +2539,10 @@ static int fec_enet_mii_init(struct platform_device *pdev)
> /* find all the PHY devices on the bus and set mac_managed_pm to true */
> for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
> phydev = mdiobus_get_phy(fep->mii_bus, addr);
> - if (phydev)
> + if (phydev) {
> phydev->mac_managed_pm = true;
> + phy_suspend(phydev);
> + }
I don't think that's correct. here phy_suspend() is being called before
the PHY got attached, so the PHY wasn't initialized at all at that
point (which I guess is your issue as the PHY is still in the state it
was configured into by the bootloader)
Following the code paths, it looks like this works for you because the
PHY you're using has a .suspend callback populated, but for any PHY
that uses the genphy driver, this will do nothing at all (the PHY isn't
yet attached to the genphy ops, therefore genphy_suspend won't be
called).
Best regards,
Maxime
On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>> Hi Russel,
>
> Growl. Hi Peter.
>
>> What we really want is the PHY to be suspended on suspend to RAM
>> regardless of us having had an initial link up or not.
>
> So what you're asking is for the PHY to be suspended when the system
> is entering suspend, which is a long time after the system booted and
> thus phy_probe() was called, and could be some time before the system
> resumes.
>
> I'm not sure what the relevance is of phy_probe() that was brought up
> previously then.
>
>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
>> may cut
>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
>> fec: use
>> mac-managed PHY PM") which was added in Linux 5.12.
>
> Looking at the former commit, that looks to me like it is only
> affecting the resume paths, not the suspend paths, so wouldn't have
> any impact itself on what happens when suspend happens.
>
> The latter commit states that it is a work around for an issue with a
> particular PHY. What happens if you revert just this commit, does your
> problem then go away?
>
> Also, please clarify. It seems that you are reporting a regression -
> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
> it working for you?
>
>> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
>> are not
>> taken. The resume sequencing with generic PM has been broken with the
>> FEC since
>> generic PM of the mdio bus was added, as the FEC will do phy_start()
>> (via FEC
>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>> (previously:
>> less damaging phy_resume()) due to how the FEC IP block works.
>
> That suggests that even with 557d5dc83f68 reverted, it's broken.
> Digging into the history, what you're referring to dates from January
> 2016, so are you reporting a regression that occured 8 _years_ ago,
> at which point I'd question why it's taken 8 years.
>
> Given the time that has passed, I don't think reverting commits is
> a sane approach. Quite what the right solution is though, I'm not
> sure.
>
> From the description and the commits pointed to, I just don't see
> that there is anything that could've changed with respect to the first
> boot - if that has changed, then I think more research into what caused
> it is needed.
>
> If it's the subsequent state after a suspend-resume cycle, then yes,
> I would agree that its possible that these changes broke this for you.
> Would clearing ndev->phydev->mac_managed_pm just before
> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
> resume paths for the PHY get used when the network interface is down?
>
> Maybe, however, that's something that should happen in any case inside
> phylib on phy_disconnect() as a matter of course, since the PHY will
> at that point be no longer under the control of the network driver for
> PM purposes. Could you give this idea a try please?
>
On phy_disconnect() we will do a phy_detach() which calls phy_suspend().
Given that phy_disconnect() is called from fec_enet_close(), we still
have a MDIO bus registered and we are not trying to suspend the MDIO
bus, so we should have an effective phy_suspend() call here, what am I
missing?
--
Florian
On 3/21/24 09:02, John Ernberg wrote:
> Hi Russell,
>
> On 3/20/24 20:44, Russell King (Oracle) wrote:
>> On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>>>
>>>
>>> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
>>>> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>>>>> Hi Russel,
>>>>
>>>> Growl. Hi Peter.
>>>>
>>>>> What we really want is the PHY to be suspended on suspend to RAM
>>>>> regardless of us having had an initial link up or not.
>>>>
>>>> So what you're asking is for the PHY to be suspended when the system
>>>> is entering suspend, which is a long time after the system booted and
>>>> thus phy_probe() was called, and could be some time before the system
>>>> resumes.
>>>>
>>>> I'm not sure what the relevance is of phy_probe() that was brought up
>>>> previously then.
>>>>
>>>>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
>>>>> may cut
>>>>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
>>>>> fec: use
>>>>> mac-managed PHY PM") which was added in Linux 5.12.
>>>>
>>>> Looking at the former commit, that looks to me like it is only
>>>> affecting the resume paths, not the suspend paths, so wouldn't have
>>>> any impact itself on what happens when suspend happens.
>>>>
>>>> The latter commit states that it is a work around for an issue with a
>>>> particular PHY. What happens if you revert just this commit, does your
>>>> problem then go away?
>
> Our PHY does not begin working again without reverting both. phy_init_hw()
> will remain an issue if it occurs after phy_start().
>
> The commit message in 557d5dc83f68 is not explaining nearly enough, I
> spent a
> few days on it before I proved that commit to be nearly correct (See whole
> thread at [1]), it happened to just explode with that PHY. The issue is a
> sequencing issue that was made more prominent by 4c0d2e96ba05, but it
> existed
> since around 2008. Because FEC is both MDIO controller and MAC, meaning the
> resume of the link in a link up case runs phy_start() in the FEC resume
> function, which will trigger a mdio bus resume when it completes, in turn
> calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
> wasn't a
> problem but still wrong sequence wise).
>
>>>>
>>>> Also, please clarify. It seems that you are reporting a regression -
>>>> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
>>>> it working for you?
>>>>
>>>>> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
>>>>> are not
>>>>> taken. The resume sequencing with generic PM has been broken with the
>>>>> FEC since
>>>>> generic PM of the mdio bus was added, as the FEC will do phy_start()
>>>>> (via FEC
>>>>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>>>>> (previously:
>>>>> less damaging phy_resume()) due to how the FEC IP block works.
>>>>
>>>> That suggests that even with 557d5dc83f68 reverted, it's broken.
>>>> Digging into the history, what you're referring to dates from January
>>>> 2016, so are you reporting a regression that occured 8 _years_ ago,
>>>> at which point I'd question why it's taken 8 years.
>
> A revert of those is absolutely wrong. Those commits are fixing bigger
> issues.
>
>>>>
>>>> Given the time that has passed, I don't think reverting commits is
>>>> a sane approach. Quite what the right solution is though, I'm not
>>>> sure.
>>>>
>>>> From the description and the commits pointed to, I just don't see
>>>> that there is anything that could've changed with respect to the first
>>>> boot - if that has changed, then I think more research into what caused
>>>> it is needed.
>>>>
>>>> If it's the subsequent state after a suspend-resume cycle, then yes,
>>>> I would agree that its possible that these changes broke this for you.
>>>> Would clearing ndev->phydev->mac_managed_pm just before
>>>> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
>>>> resume paths for the PHY get used when the network interface is down?
>>>>
>>>> Maybe, however, that's something that should happen in any case inside
>>>> phylib on phy_disconnect() as a matter of course, since the PHY will
>>>> at that point be no longer under the control of the network driver for
>>>> PM purposes. Could you give this idea a try please?
>>>>
>>>
>>> On phy_disconnect() we will do a phy_detach() which calls phy_suspend().
>>> Given that phy_disconnect() is called from fec_enet_close(), we still have a
>>> MDIO bus registered and we are not trying to suspend the MDIO bus, so we
>>> should have an effective phy_suspend() call here, what am I missing?
>>
>> I didn't look there, but if that is the case, then what is John's
>> problem - I can't figure it out, something isn't adding up here.
>>
>
> I could instead add extra phy_suspend() in the suspend path if the link is
> down and the FEC is up and running. I rejected it originally thinking it was
> a much dirtier fix, but maybe that is the more correct thing to do?
This does not seem like the proper solution, the only time where an
explicit phy_suspend() should be done in the Ethernet MAC's ->suspend()
routine is if the network device was brought up at the time
(netif_runninng() returns true) *and* you set mac_managed_pm = true
because you must precisely control the order in which the MAC and the
PHY get suspended with respect to each other (typically because the PHY
supplies a RX clock back to the MAC, and some of the MAC logic depends
upon it to operate properly, e.g.: perform a proper FIFO flush etc.).
From there, I see two distinct cases:
- the network device driver probed, but the network device was never
brought up in the first place in that case, I do not see a path whereby
the PHY would have been suspended, unless the boot firmware already took
care of that (which arguably it should if you are trying to be as power
efficient as possible), although arguably there could be a path within
the kernel where this is also done. It could get really complicated however
- the network device driver probed, and the network device was brought
up at least once (regardless of whether a link state was detected or
not), such that the PHY has gone through a phy_start()/phy_stop() cycle,
and upon phy_stop() a phy_suspend() has been called
It is safe to assume you fall in the first case only, or do you also see
a problem in the second case as well?
If the first case, I am a bit torn as to how to best go about it. The
initial state of the PHY upon kernel boot can be a tad difficult to work
with:
- some people want to conserve power as much as possible (like you, like
me) and would not mind seeing a link break to achieve that state, nor do
they mind a partial or full reconfiguration of the PHY by its driver
- some people want to avoid a link break and just inherit the existing
operational mode of the PHY such that they can have a working link as
quickly as possible
--
Florian
Hi Russell,
On 3/20/24 20:44, Russell King (Oracle) wrote:
> On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>>
>>
>> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
>>> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>>>> Hi Russel,
>>>
>>> Growl. Hi Peter.
>>>
>>>> What we really want is the PHY to be suspended on suspend to RAM
>>>> regardless of us having had an initial link up or not.
>>>
>>> So what you're asking is for the PHY to be suspended when the system
>>> is entering suspend, which is a long time after the system booted and
>>> thus phy_probe() was called, and could be some time before the system
>>> resumes.
>>>
>>> I'm not sure what the relevance is of phy_probe() that was brought up
>>> previously then.
>>>
>>>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
>>>> may cut
>>>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
>>>> fec: use
>>>> mac-managed PHY PM") which was added in Linux 5.12.
>>>
>>> Looking at the former commit, that looks to me like it is only
>>> affecting the resume paths, not the suspend paths, so wouldn't have
>>> any impact itself on what happens when suspend happens.
>>>
>>> The latter commit states that it is a work around for an issue with a
>>> particular PHY. What happens if you revert just this commit, does your
>>> problem then go away?
Our PHY does not begin working again without reverting both. phy_init_hw()
will remain an issue if it occurs after phy_start().
The commit message in 557d5dc83f68 is not explaining nearly enough, I
spent a
few days on it before I proved that commit to be nearly correct (See whole
thread at [1]), it happened to just explode with that PHY. The issue is a
sequencing issue that was made more prominent by 4c0d2e96ba05, but it
existed
since around 2008. Because FEC is both MDIO controller and MAC, meaning the
resume of the link in a link up case runs phy_start() in the FEC resume
function, which will trigger a mdio bus resume when it completes, in turn
calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
wasn't a
problem but still wrong sequence wise).
>>>
>>> Also, please clarify. It seems that you are reporting a regression -
>>> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
>>> it working for you?
>>>
>>>> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
>>>> are not
>>>> taken. The resume sequencing with generic PM has been broken with the
>>>> FEC since
>>>> generic PM of the mdio bus was added, as the FEC will do phy_start()
>>>> (via FEC
>>>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>>>> (previously:
>>>> less damaging phy_resume()) due to how the FEC IP block works.
>>>
>>> That suggests that even with 557d5dc83f68 reverted, it's broken.
>>> Digging into the history, what you're referring to dates from January
>>> 2016, so are you reporting a regression that occured 8 _years_ ago,
>>> at which point I'd question why it's taken 8 years.
A revert of those is absolutely wrong. Those commits are fixing bigger
issues.
>>>
>>> Given the time that has passed, I don't think reverting commits is
>>> a sane approach. Quite what the right solution is though, I'm not
>>> sure.
>>>
>>> From the description and the commits pointed to, I just don't see
>>> that there is anything that could've changed with respect to the first
>>> boot - if that has changed, then I think more research into what caused
>>> it is needed.
>>>
>>> If it's the subsequent state after a suspend-resume cycle, then yes,
>>> I would agree that its possible that these changes broke this for you.
>>> Would clearing ndev->phydev->mac_managed_pm just before
>>> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
>>> resume paths for the PHY get used when the network interface is down?
>>>
>>> Maybe, however, that's something that should happen in any case inside
>>> phylib on phy_disconnect() as a matter of course, since the PHY will
>>> at that point be no longer under the control of the network driver for
>>> PM purposes. Could you give this idea a try please?
>>>
>>
>> On phy_disconnect() we will do a phy_detach() which calls phy_suspend().
>> Given that phy_disconnect() is called from fec_enet_close(), we still have a
>> MDIO bus registered and we are not trying to suspend the MDIO bus, so we
>> should have an effective phy_suspend() call here, what am I missing?
>
> I didn't look there, but if that is the case, then what is John's
> problem - I can't figure it out, something isn't adding up here.
>
I could instead add extra phy_suspend() in the suspend path if the link is
down and the FEC is up and running. I rejected it originally thinking it was
a much dirtier fix, but maybe that is the more correct thing to do?
Thanks! // John Ernberg
[1]:
https://lore.kernel.org/netdev/[email protected]/
On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>
>
> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
> > On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
> > > Hi Russel,
> >
> > Growl. Hi Peter.
> >
> > > What we really want is the PHY to be suspended on suspend to RAM
> > > regardless of us having had an initial link up or not.
> >
> > So what you're asking is for the PHY to be suspended when the system
> > is entering suspend, which is a long time after the system booted and
> > thus phy_probe() was called, and could be some time before the system
> > resumes.
> >
> > I'm not sure what the relevance is of phy_probe() that was brought up
> > previously then.
> >
> > > This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
> > > may cut
> > > off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
> > > fec: use
> > > mac-managed PHY PM") which was added in Linux 5.12.
> >
> > Looking at the former commit, that looks to me like it is only
> > affecting the resume paths, not the suspend paths, so wouldn't have
> > any impact itself on what happens when suspend happens.
> >
> > The latter commit states that it is a work around for an issue with a
> > particular PHY. What happens if you revert just this commit, does your
> > problem then go away?
> >
> > Also, please clarify. It seems that you are reporting a regression -
> > it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
> > it working for you?
> >
> > > Since FEC requires mac_managed_pm the generic PM suspend-resume paths
> > > are not
> > > taken. The resume sequencing with generic PM has been broken with the
> > > FEC since
> > > generic PM of the mdio bus was added, as the FEC will do phy_start()
> > > (via FEC
> > > resume) and then generic PM runs phy_init_hw() via mdio bus resume
> > > (previously:
> > > less damaging phy_resume()) due to how the FEC IP block works.
> >
> > That suggests that even with 557d5dc83f68 reverted, it's broken.
> > Digging into the history, what you're referring to dates from January
> > 2016, so are you reporting a regression that occured 8 _years_ ago,
> > at which point I'd question why it's taken 8 years.
> >
> > Given the time that has passed, I don't think reverting commits is
> > a sane approach. Quite what the right solution is though, I'm not
> > sure.
> >
> > From the description and the commits pointed to, I just don't see
> > that there is anything that could've changed with respect to the first
> > boot - if that has changed, then I think more research into what caused
> > it is needed.
> >
> > If it's the subsequent state after a suspend-resume cycle, then yes,
> > I would agree that its possible that these changes broke this for you.
> > Would clearing ndev->phydev->mac_managed_pm just before
> > phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
> > resume paths for the PHY get used when the network interface is down?
> >
> > Maybe, however, that's something that should happen in any case inside
> > phylib on phy_disconnect() as a matter of course, since the PHY will
> > at that point be no longer under the control of the network driver for
> > PM purposes. Could you give this idea a try please?
> >
>
> On phy_disconnect() we will do a phy_detach() which calls phy_suspend().
> Given that phy_disconnect() is called from fec_enet_close(), we still have a
> MDIO bus registered and we are not trying to suspend the MDIO bus, so we
> should have an effective phy_suspend() call here, what am I missing?
I didn't look there, but if that is the case, then what is John's
problem - I can't figure it out, something isn't adding up here.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
> Hi Russel,
Growl. Hi Peter.
> What we really want is the PHY to be suspended on suspend to RAM
> regardless of us having had an initial link up or not.
So what you're asking is for the PHY to be suspended when the system
is entering suspend, which is a long time after the system booted and
thus phy_probe() was called, and could be some time before the system
resumes.
I'm not sure what the relevance is of phy_probe() that was brought up
previously then.
> This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
> may cut
> off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
> fec: use
> mac-managed PHY PM") which was added in Linux 5.12.
Looking at the former commit, that looks to me like it is only
affecting the resume paths, not the suspend paths, so wouldn't have
any impact itself on what happens when suspend happens.
The latter commit states that it is a work around for an issue with a
particular PHY. What happens if you revert just this commit, does your
problem then go away?
Also, please clarify. It seems that you are reporting a regression -
it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
it working for you?
> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
> are not
> taken. The resume sequencing with generic PM has been broken with the
> FEC since
> generic PM of the mdio bus was added, as the FEC will do phy_start()
> (via FEC
> resume) and then generic PM runs phy_init_hw() via mdio bus resume
> (previously:
> less damaging phy_resume()) due to how the FEC IP block works.
That suggests that even with 557d5dc83f68 reverted, it's broken.
Digging into the history, what you're referring to dates from January
2016, so are you reporting a regression that occured 8 _years_ ago,
at which point I'd question why it's taken 8 years.
Given the time that has passed, I don't think reverting commits is
a sane approach. Quite what the right solution is though, I'm not
sure.
From the description and the commits pointed to, I just don't see
that there is anything that could've changed with respect to the first
boot - if that has changed, then I think more research into what caused
it is needed.
If it's the subsequent state after a suspend-resume cycle, then yes,
I would agree that its possible that these changes broke this for you.
Would clearing ndev->phydev->mac_managed_pm just before
phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
resume paths for the PHY get used when the network interface is down?
Maybe, however, that's something that should happen in any case inside
phylib on phy_disconnect() as a matter of course, since the PHY will
at that point be no longer under the control of the network driver for
PM purposes. Could you give this idea a try please?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Hi Russel,
On 3/19/24 09:51, Russell King (Oracle) wrote:
> On Tue, Mar 19, 2024 at 08:37:44AM +0000, John Ernberg wrote:
>> There is also a case where the phy driver module is not automatically
>> loaded, in cases where request_module() fails, either due to the
>> userspace helper feature being compiled out or other reasons, and the
>> module is loaded manually later. I suspect for reasons like these the
>> genphy probe happens so late. My solution here doesn't cover non-loaded
>> modules either, but this could maybe be covered by moving phy_suspend()
>> to phy_probe(). Unless there is an even more clever way to go about it
>> which I can't see from inexperience.
>
> Note that in the case where the PHY driver module is loaded late,
> phy_probe() won't be called for the PHY until that happens.
>
> I would say if one wants a platform to behave with minimal power
> consumption, that is something that has to be done across the
> software stack, and that includes the boot firmware. So, if one
> wants the PHY to be in a low power state at boot time, then
> firmware needs to ensure that happens.
>
> Trying to shoe-horn that into the kernel isn't going to work
> because we get to decide what to do with the PHY way too late
> (due to PHY drivers being modular and on the rootfs.)
>
What we really want is the PHY to be suspended on suspend to RAM
regardless of
us having had an initial link up or not.
This worked prior to 4c0d2e96ba05 ("net: phy: consider that suspend2ram
may cut
off PHY power") which was added in Linux 5.11, and 557d5dc83f68 ("net:
fec: use
mac-managed PHY PM") which was added in Linux 5.12.
Since FEC requires mac_managed_pm the generic PM suspend-resume paths
are not
taken. The resume sequencing with generic PM has been broken with the
FEC since
generic PM of the mdio bus was added, as the FEC will do phy_start()
(via FEC
resume) and then generic PM runs phy_init_hw() via mdio bus resume
(previously:
less damaging phy_resume()) due to how the FEC IP block works.
Some background context to our usecase which might have been lost is
that our
system bring the link up based on outside input and to conserve power we
suspend
regularly, and this is the only situation where we care about the power
consumption. Since we cannot decide if link shall be up ourselves we can go
through numerous suspend cycles before the first link up. We could in theory
work around it in userspace by doing "ip link set <eth> up && ip link
set <eth>
down", but it wasn't required before.
Thanks! // John Ernberg
Hi Maxime,
Apologies for the delay in my response.
On 3/6/24 19:05, Maxime Chevallier wrote:
> Hello John,
>
> I'm adding Andrew and Russell to the thread as PHY maintainers and
> reviewers.
>
> On Wed, 6 Mar 2024 13:37:45 +0000
> John Ernberg <[email protected]> wrote:
>
>> Since the power management is now performed by the FEC instead of generic
>> pm the PHY will not suspend until the link has been up.
>>
>> Therefor suspend it on probe. It will be resumed by {of_,}phy_connect()
>> when the link is brought up.
>>
>> Since {of_,}phy_connect() and phy_disconnect() will resume and suspend the
>> PHY when the link is brought up and down respectively, and phy_stop() and
>> phy_start() will resume and suspend the PHY in the suspend-resume paths
>> there is no need for any additional calls anywhere.
>>
>> Signed-off-by: John Ernberg <[email protected]>
>
> [...]
>
>> @@ -2539,8 +2539,10 @@ static int fec_enet_mii_init(struct platform_device *pdev)
>> /* find all the PHY devices on the bus and set mac_managed_pm to true */
>> for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
>> phydev = mdiobus_get_phy(fep->mii_bus, addr);
>> - if (phydev)
>> + if (phydev) {
>> phydev->mac_managed_pm = true;
>> + phy_suspend(phydev);
>> + }
>
> I don't think that's correct. here phy_suspend() is being called before
> the PHY got attached, so the PHY wasn't initialized at all at that
> point (which I guess is your issue as the PHY is still in the state it
> was configured into by the bootloader)
>
> Following the code paths, it looks like this works for you because the
> PHY you're using has a .suspend callback populated, but for any PHY
> that uses the genphy driver, this will do nothing at all (the PHY isn't
> yet attached to the genphy ops, therefore genphy_suspend won't be
> called).
Thanks for highlighting this.
Yes, it's a problem for genphy, although due to when genphy is probed,
it's always been a problem for genphy, even before this patch. Whereas
PHYs with specific drivers worked before due to MDIO bus PM.
There is also a case where the phy driver module is not automatically
loaded, in cases where request_module() fails, either due to the
userspace helper feature being compiled out or other reasons, and the
module is loaded manually later. I suspect for reasons like these the
genphy probe happens so late. My solution here doesn't cover non-loaded
modules either, but this could maybe be covered by moving phy_suspend()
to phy_probe(). Unless there is an even more clever way to go about it
which I can't see from inexperience.
If a PHY driver doesn't populate .suspend there's probably a good reason
for it and it makes sense to not suspend such a PHY, so I'm not
concerned about an unpopulated .suspend.
Best regards // John Ernberg
>
> Best regards,
>
> Maxime
On Tue, Mar 19, 2024 at 08:37:44AM +0000, John Ernberg wrote:
> There is also a case where the phy driver module is not automatically
> loaded, in cases where request_module() fails, either due to the
> userspace helper feature being compiled out or other reasons, and the
> module is loaded manually later. I suspect for reasons like these the
> genphy probe happens so late. My solution here doesn't cover non-loaded
> modules either, but this could maybe be covered by moving phy_suspend()
> to phy_probe(). Unless there is an even more clever way to go about it
> which I can't see from inexperience.
Note that in the case where the PHY driver module is loaded late,
phy_probe() won't be called for the PHY until that happens.
I would say if one wants a platform to behave with minimal power
consumption, that is something that has to be done across the
software stack, and that includes the boot firmware. So, if one
wants the PHY to be in a low power state at boot time, then
firmware needs to ensure that happens.
Trying to shoe-horn that into the kernel isn't going to work
because we get to decide what to do with the PHY way too late
(due to PHY drivers being modular and on the rootfs.)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Hi Florian,
On 3/21/24 17:13, Florian Fainelli wrote:
> On 3/21/24 09:02, John Ernberg wrote:
>> Hi Russell,
>>
>> On 3/20/24 20:44, Russell King (Oracle) wrote:
>>> On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>>>>
>>>>
>>>> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
>>>>> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>>>>>> Hi Russel,
>>>>>
>>>>> Growl. Hi Peter.
>>>>>
>>>>>> What we really want is the PHY to be suspended on suspend to RAM
>>>>>> regardless of us having had an initial link up or not.
>>>>>
>>>>> So what you're asking is for the PHY to be suspended when the system
>>>>> is entering suspend, which is a long time after the system booted and
>>>>> thus phy_probe() was called, and could be some time before the system
>>>>> resumes.
>>>>>
>>>>> I'm not sure what the relevance is of phy_probe() that was brought up
>>>>> previously then.
>>>>>
>>>>>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that
>>>>>> suspend2ram
>>>>>> may cut
>>>>>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68
>>>>>> ("net:
>>>>>> fec: use
>>>>>> mac-managed PHY PM") which was added in Linux 5.12.
>>>>>
>>>>> Looking at the former commit, that looks to me like it is only
>>>>> affecting the resume paths, not the suspend paths, so wouldn't have
>>>>> any impact itself on what happens when suspend happens.
>>>>>
>>>>> The latter commit states that it is a work around for an issue with a
>>>>> particular PHY. What happens if you revert just this commit, does your
>>>>> problem then go away?
>>
>> Our PHY does not begin working again without reverting both.
>> phy_init_hw()
>> will remain an issue if it occurs after phy_start().
>>
>> The commit message in 557d5dc83f68 is not explaining nearly enough, I
>> spent a
>> few days on it before I proved that commit to be nearly correct (See
>> whole
>> thread at [1]), it happened to just explode with that PHY. The issue is a
>> sequencing issue that was made more prominent by 4c0d2e96ba05, but it
>> existed
>> since around 2008. Because FEC is both MDIO controller and MAC,
>> meaning the
>> resume of the link in a link up case runs phy_start() in the FEC resume
>> function, which will trigger a mdio bus resume when it completes, in turn
>> calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
>> wasn't a
>> problem but still wrong sequence wise).
>>
>>>>>
>>>>> Also, please clarify. It seems that you are reporting a regression -
>>>>> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
>>>>> it working for you?
>>>>>
>>>>>> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
>>>>>> are not
>>>>>> taken. The resume sequencing with generic PM has been broken with the
>>>>>> FEC since
>>>>>> generic PM of the mdio bus was added, as the FEC will do phy_start()
>>>>>> (via FEC
>>>>>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>>>>>> (previously:
>>>>>> less damaging phy_resume()) due to how the FEC IP block works.
>>>>>
>>>>> That suggests that even with 557d5dc83f68 reverted, it's broken.
>>>>> Digging into the history, what you're referring to dates from January
>>>>> 2016, so are you reporting a regression that occured 8 _years_ ago,
>>>>> at which point I'd question why it's taken 8 years.
>>
>> A revert of those is absolutely wrong. Those commits are fixing bigger
>> issues.
>>
>>>>>
>>>>> Given the time that has passed, I don't think reverting commits is
>>>>> a sane approach. Quite what the right solution is though, I'm not
>>>>> sure.
>>>>>
>>>>> From the description and the commits pointed to, I just don't see
>>>>> that there is anything that could've changed with respect to the first
>>>>> boot - if that has changed, then I think more research into what
>>>>> caused
>>>>> it is needed.
>>>>>
>>>>> If it's the subsequent state after a suspend-resume cycle, then yes,
>>>>> I would agree that its possible that these changes broke this for you.
>>>>> Would clearing ndev->phydev->mac_managed_pm just before
>>>>> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
>>>>> resume paths for the PHY get used when the network interface is down?
>>>>>
>>>>> Maybe, however, that's something that should happen in any case inside
>>>>> phylib on phy_disconnect() as a matter of course, since the PHY will
>>>>> at that point be no longer under the control of the network driver for
>>>>> PM purposes. Could you give this idea a try please?
>>>>>
>>>>
>>>> On phy_disconnect() we will do a phy_detach() which calls
>>>> phy_suspend().
>>>> Given that phy_disconnect() is called from fec_enet_close(), we
>>>> still have a
>>>> MDIO bus registered and we are not trying to suspend the MDIO bus,
>>>> so we
>>>> should have an effective phy_suspend() call here, what am I missing?
>>>
>>> I didn't look there, but if that is the case, then what is John's
>>> problem - I can't figure it out, something isn't adding up here.
>>>
>>
>> I could instead add extra phy_suspend() in the suspend path if the
>> link is
>> down and the FEC is up and running. I rejected it originally thinking
>> it was
>> a much dirtier fix, but maybe that is the more correct thing to do?
>
> This does not seem like the proper solution, the only time where an
> explicit phy_suspend() should be done in the Ethernet MAC's ->suspend()
> routine is if the network device was brought up at the time
> (netif_runninng() returns true) *and* you set mac_managed_pm = true
> because you must precisely control the order in which the MAC and the
> PHY get suspended with respect to each other (typically because the PHY
> supplies a RX clock back to the MAC, and some of the MAC logic depends
> upon it to operate properly, e.g.: perform a proper FIFO flush etc.).
I'm having some trouble understanding your message in context of my most
recent reply to Russell, so please bear with me here as I will
potentially ask a really dumb question:
Do I understand this correctly as what used to work in 5.10 was never
meant to work and the behavior now is the correct one in the FEC case?
Meaning that if the link has never been up the PHY must never be handled
from a power management perspective?
The only PHY examples I have come across (though not many in total) the
PHY has done some initial configuration of itself after POR or release
of the reset line.
>
> From there, I see two distinct cases:
>
> - the network device driver probed, but the network device was never
> brought up in the first place in that case, I do not see a path whereby
> the PHY would have been suspended, unless the boot firmware already took
> care of that (which arguably it should if you are trying to be as power
> efficient as possible), although arguably there could be a path within
> the kernel where this is also done. It could get really complicated however
Generic PM via mdio_bus_phy_suspend() will suspend the PHY if it has a
.suspend callback and mac_managed_pm isn't set.
mdio_bus_phy_may_suspend() will see that netdev is NULL, which means it
returns the inverse of phy->suspended (which is false), meaning the
function returns true. Thus phy_suspend() is called.
>
> - the network device driver probed, and the network device was brought
> up at least once (regardless of whether a link state was detected or
> not), such that the PHY has gone through a phy_start()/phy_stop() cycle,
> and upon phy_stop() a phy_suspend() has been called
>
> It is safe to assume you fall in the first case only, or do you also see
> a problem in the second case as well?
There is only a problem in the first case. The second case is working as
expected.
>
> If the first case, I am a bit torn as to how to best go about it. The
> initial state of the PHY upon kernel boot can be a tad difficult to work
> with:
>
> - some people want to conserve power as much as possible (like you, like
> me) and would not mind seeing a link break to achieve that state, nor do
> they mind a partial or full reconfiguration of the PHY by its driver
For these devices where we see what we consider an issue we only really
care about power consumption in suspend to RAM.
>
> - some people want to avoid a link break and just inherit the existing
> operational mode of the PHY such that they can have a working link as
> quickly as possible
Thanks! // John Ernberg
On 3/25/24 05:20, John Ernberg wrote:
> Hi Florian,
>
> On 3/21/24 17:13, Florian Fainelli wrote:
>> On 3/21/24 09:02, John Ernberg wrote:
>>> Hi Russell,
>>>
>>> On 3/20/24 20:44, Russell King (Oracle) wrote:
>>>> On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>>>>>
>>>>>
>>>>> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
>>>>>> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>>>>>>> Hi Russel,
>>>>>>
>>>>>> Growl. Hi Peter.
>>>>>>
>>>>>>> What we really want is the PHY to be suspended on suspend to RAM
>>>>>>> regardless of us having had an initial link up or not.
>>>>>>
>>>>>> So what you're asking is for the PHY to be suspended when the system
>>>>>> is entering suspend, which is a long time after the system booted and
>>>>>> thus phy_probe() was called, and could be some time before the system
>>>>>> resumes.
>>>>>>
>>>>>> I'm not sure what the relevance is of phy_probe() that was brought up
>>>>>> previously then.
>>>>>>
>>>>>>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that
>>>>>>> suspend2ram
>>>>>>> may cut
>>>>>>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68
>>>>>>> ("net:
>>>>>>> fec: use
>>>>>>> mac-managed PHY PM") which was added in Linux 5.12.
>>>>>>
>>>>>> Looking at the former commit, that looks to me like it is only
>>>>>> affecting the resume paths, not the suspend paths, so wouldn't have
>>>>>> any impact itself on what happens when suspend happens.
>>>>>>
>>>>>> The latter commit states that it is a work around for an issue with a
>>>>>> particular PHY. What happens if you revert just this commit, does your
>>>>>> problem then go away?
>>>
>>> Our PHY does not begin working again without reverting both.
>>> phy_init_hw()
>>> will remain an issue if it occurs after phy_start().
>>>
>>> The commit message in 557d5dc83f68 is not explaining nearly enough, I
>>> spent a
>>> few days on it before I proved that commit to be nearly correct (See
>>> whole
>>> thread at [1]), it happened to just explode with that PHY. The issue is a
>>> sequencing issue that was made more prominent by 4c0d2e96ba05, but it
>>> existed
>>> since around 2008. Because FEC is both MDIO controller and MAC,
>>> meaning the
>>> resume of the link in a link up case runs phy_start() in the FEC resume
>>> function, which will trigger a mdio bus resume when it completes, in turn
>>> calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
>>> wasn't a
>>> problem but still wrong sequence wise).
>>>
>>>>>>
>>>>>> Also, please clarify. It seems that you are reporting a regression -
>>>>>> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
>>>>>> it working for you?
>>>>>>
>>>>>>> Since FEC requires mac_managed_pm the generic PM suspend-resume paths
>>>>>>> are not
>>>>>>> taken. The resume sequencing with generic PM has been broken with the
>>>>>>> FEC since
>>>>>>> generic PM of the mdio bus was added, as the FEC will do phy_start()
>>>>>>> (via FEC
>>>>>>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>>>>>>> (previously:
>>>>>>> less damaging phy_resume()) due to how the FEC IP block works.
>>>>>>
>>>>>> That suggests that even with 557d5dc83f68 reverted, it's broken.
>>>>>> Digging into the history, what you're referring to dates from January
>>>>>> 2016, so are you reporting a regression that occured 8 _years_ ago,
>>>>>> at which point I'd question why it's taken 8 years.
>>>
>>> A revert of those is absolutely wrong. Those commits are fixing bigger
>>> issues.
>>>
>>>>>>
>>>>>> Given the time that has passed, I don't think reverting commits is
>>>>>> a sane approach. Quite what the right solution is though, I'm not
>>>>>> sure.
>>>>>>
>>>>>> From the description and the commits pointed to, I just don't see
>>>>>> that there is anything that could've changed with respect to the first
>>>>>> boot - if that has changed, then I think more research into what
>>>>>> caused
>>>>>> it is needed.
>>>>>>
>>>>>> If it's the subsequent state after a suspend-resume cycle, then yes,
>>>>>> I would agree that its possible that these changes broke this for you.
>>>>>> Would clearing ndev->phydev->mac_managed_pm just before
>>>>>> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
>>>>>> resume paths for the PHY get used when the network interface is down?
>>>>>>
>>>>>> Maybe, however, that's something that should happen in any case inside
>>>>>> phylib on phy_disconnect() as a matter of course, since the PHY will
>>>>>> at that point be no longer under the control of the network driver for
>>>>>> PM purposes. Could you give this idea a try please?
>>>>>>
>>>>>
>>>>> On phy_disconnect() we will do a phy_detach() which calls
>>>>> phy_suspend().
>>>>> Given that phy_disconnect() is called from fec_enet_close(), we
>>>>> still have a
>>>>> MDIO bus registered and we are not trying to suspend the MDIO bus,
>>>>> so we
>>>>> should have an effective phy_suspend() call here, what am I missing?
>>>>
>>>> I didn't look there, but if that is the case, then what is John's
>>>> problem - I can't figure it out, something isn't adding up here.
>>>>
>>>
>>> I could instead add extra phy_suspend() in the suspend path if the
>>> link is
>>> down and the FEC is up and running. I rejected it originally thinking
>>> it was
>>> a much dirtier fix, but maybe that is the more correct thing to do?
>>
>> This does not seem like the proper solution, the only time where an
>> explicit phy_suspend() should be done in the Ethernet MAC's ->suspend()
>> routine is if the network device was brought up at the time
>> (netif_runninng() returns true) *and* you set mac_managed_pm = true
>> because you must precisely control the order in which the MAC and the
>> PHY get suspended with respect to each other (typically because the PHY
>> supplies a RX clock back to the MAC, and some of the MAC logic depends
>> upon it to operate properly, e.g.: perform a proper FIFO flush etc.).
>
> I'm having some trouble understanding your message in context of my most
> recent reply to Russell, so please bear with me here as I will
> potentially ask a really dumb question:
>
> Do I understand this correctly as what used to work in 5.10 was never
> meant to work and the behavior now is the correct one in the FEC case?
I am not sure about that, because I do not believe this had been
explicitly accounted for, and that is what I am also trying to figure out.
> Meaning that if the link has never been up the PHY must never be handled
> from a power management perspective?
Let's stray away from saying "link is UP" but instead say interface
administratively brought up (ip link set dev <foo> up) so we are on the
same page.
>
> The only PHY examples I have come across (though not many in total) the
> PHY has done some initial configuration of itself after POR or release
> of the reset line.
>
>>
>> From there, I see two distinct cases:
>>
>> - the network device driver probed, but the network device was never
>> brought up in the first place in that case, I do not see a path whereby
>> the PHY would have been suspended, unless the boot firmware already took
>> care of that (which arguably it should if you are trying to be as power
>> efficient as possible), although arguably there could be a path within
>> the kernel where this is also done. It could get really complicated however
>
> Generic PM via mdio_bus_phy_suspend() will suspend the PHY if it has a
> .suspend callback and mac_managed_pm isn't set.
Correct.
>
> mdio_bus_phy_may_suspend() will see that netdev is NULL, which means it
> returns the inverse of phy->suspended (which is false), meaning the
> function returns true. Thus phy_suspend() is called.
OK, so right there is where you were were basically depending upon
mdio_bus_phy_may_suspend() being called to suspend your PHY device! I
don't think this had ever been envisioned to be working that way, if it
did that was a byproduct rather than a contract. Introducing
mac_managed_pm certainly did break that contract because now we no
longer have that "double" suspend and resume that used to happen.
>
>>
>> - the network device driver probed, and the network device was brought
>> up at least once (regardless of whether a link state was detected or
>> not), such that the PHY has gone through a phy_start()/phy_stop() cycle,
>> and upon phy_stop() a phy_suspend() has been called
>>
>> It is safe to assume you fall in the first case only, or do you also see
>> a problem in the second case as well?
>
> There is only a problem in the first case. The second case is working as
> expected.
Thanks for the clarification.
--
Florian
Hi Florian,
On 3/25/24 10:27 PM, Florian Fainelli wrote:
> On 3/25/24 05:20, John Ernberg wrote:
>> Hi Florian,
>>
>> On 3/21/24 17:13, Florian Fainelli wrote:
>>> On 3/21/24 09:02, John Ernberg wrote:
>>>> Hi Russell,
>>>>
>>>> On 3/20/24 20:44, Russell King (Oracle) wrote:
>>>>> On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:
>>>>>>
>>>>>>
>>>>>> On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:
>>>>>>> On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:
>>>>>>>> Hi Russel,
>>>>>>>
>>>>>>> Growl. Hi Peter.
>>>>>>>
>>>>>>>> What we really want is the PHY to be suspended on suspend to RAM
>>>>>>>> regardless of us having had an initial link up or not.
>>>>>>>
>>>>>>> So what you're asking is for the PHY to be suspended when the system
>>>>>>> is entering suspend, which is a long time after the system booted
>>>>>>> and
>>>>>>> thus phy_probe() was called, and could be some time before the
>>>>>>> system
>>>>>>> resumes.
>>>>>>>
>>>>>>> I'm not sure what the relevance is of phy_probe() that was
>>>>>>> brought up
>>>>>>> previously then.
>>>>>>>
>>>>>>>> This worked prior to 4c0d2e96ba05 ("net: phy: consider that
>>>>>>>> suspend2ram
>>>>>>>> may cut
>>>>>>>> off PHY power") which was added in Linux 5.11, and 557d5dc83f68
>>>>>>>> ("net:
>>>>>>>> fec: use
>>>>>>>> mac-managed PHY PM") which was added in Linux 5.12.
>>>>>>>
>>>>>>> Looking at the former commit, that looks to me like it is only
>>>>>>> affecting the resume paths, not the suspend paths, so wouldn't have
>>>>>>> any impact itself on what happens when suspend happens.
>>>>>>>
>>>>>>> The latter commit states that it is a work around for an issue
>>>>>>> with a
>>>>>>> particular PHY. What happens if you revert just this commit, does
>>>>>>> your
>>>>>>> problem then go away?
>>>>
>>>> Our PHY does not begin working again without reverting both.
>>>> phy_init_hw()
>>>> will remain an issue if it occurs after phy_start().
>>>>
>>>> The commit message in 557d5dc83f68 is not explaining nearly enough, I
>>>> spent a
>>>> few days on it before I proved that commit to be nearly correct (See
>>>> whole
>>>> thread at [1]), it happened to just explode with that PHY. The issue
>>>> is a
>>>> sequencing issue that was made more prominent by 4c0d2e96ba05, but it
>>>> existed
>>>> since around 2008. Because FEC is both MDIO controller and MAC,
>>>> meaning the
>>>> resume of the link in a link up case runs phy_start() in the FEC resume
>>>> function, which will trigger a mdio bus resume when it completes, in
>>>> turn
>>>> calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
>>>> wasn't a
>>>> problem but still wrong sequence wise).
>>>>
>>>>>>>
>>>>>>> Also, please clarify. It seems that you are reporting a regression -
>>>>>>> it used to work for you prior to 557d5dc83f68, but 557d5dc83f68
>>>>>>> stops
>>>>>>> it working for you?
>>>>>>>
>>>>>>>> Since FEC requires mac_managed_pm the generic PM suspend-resume
>>>>>>>> paths
>>>>>>>> are not
>>>>>>>> taken. The resume sequencing with generic PM has been broken
>>>>>>>> with the
>>>>>>>> FEC since
>>>>>>>> generic PM of the mdio bus was added, as the FEC will do
>>>>>>>> phy_start()
>>>>>>>> (via FEC
>>>>>>>> resume) and then generic PM runs phy_init_hw() via mdio bus resume
>>>>>>>> (previously:
>>>>>>>> less damaging phy_resume()) due to how the FEC IP block works.
>>>>>>>
>>>>>>> That suggests that even with 557d5dc83f68 reverted, it's broken.
>>>>>>> Digging into the history, what you're referring to dates from
>>>>>>> January
>>>>>>> 2016, so are you reporting a regression that occured 8 _years_ ago,
>>>>>>> at which point I'd question why it's taken 8 years.
>>>>
>>>> A revert of those is absolutely wrong. Those commits are fixing bigger
>>>> issues.
>>>>
>>>>>>>
>>>>>>> Given the time that has passed, I don't think reverting commits is
>>>>>>> a sane approach. Quite what the right solution is though, I'm not
>>>>>>> sure.
>>>>>>>
>>>>>>> From the description and the commits pointed to, I just don't
>>>>>>> see
>>>>>>> that there is anything that could've changed with respect to the
>>>>>>> first
>>>>>>> boot - if that has changed, then I think more research into what
>>>>>>> caused
>>>>>>> it is needed.
>>>>>>>
>>>>>>> If it's the subsequent state after a suspend-resume cycle, then yes,
>>>>>>> I would agree that its possible that these changes broke this for
>>>>>>> you.
>>>>>>> Would clearing ndev->phydev->mac_managed_pm just before
>>>>>>> phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
>>>>>>> resume paths for the PHY get used when the network interface is
>>>>>>> down?
>>>>>>>
>>>>>>> Maybe, however, that's something that should happen in any case
>>>>>>> inside
>>>>>>> phylib on phy_disconnect() as a matter of course, since the PHY will
>>>>>>> at that point be no longer under the control of the network
>>>>>>> driver for
>>>>>>> PM purposes. Could you give this idea a try please?
>>>>>>>
>>>>>>
>>>>>> On phy_disconnect() we will do a phy_detach() which calls
>>>>>> phy_suspend().
>>>>>> Given that phy_disconnect() is called from fec_enet_close(), we
>>>>>> still have a
>>>>>> MDIO bus registered and we are not trying to suspend the MDIO bus,
>>>>>> so we
>>>>>> should have an effective phy_suspend() call here, what am I missing?
>>>>>
>>>>> I didn't look there, but if that is the case, then what is John's
>>>>> problem - I can't figure it out, something isn't adding up here.
>>>>>
>>>>
>>>> I could instead add extra phy_suspend() in the suspend path if the
>>>> link is
>>>> down and the FEC is up and running. I rejected it originally thinking
>>>> it was
>>>> a much dirtier fix, but maybe that is the more correct thing to do?
>>>
>>> This does not seem like the proper solution, the only time where an
>>> explicit phy_suspend() should be done in the Ethernet MAC's ->suspend()
>>> routine is if the network device was brought up at the time
>>> (netif_runninng() returns true) *and* you set mac_managed_pm = true
>>> because you must precisely control the order in which the MAC and the
>>> PHY get suspended with respect to each other (typically because the PHY
>>> supplies a RX clock back to the MAC, and some of the MAC logic depends
>>> upon it to operate properly, e.g.: perform a proper FIFO flush etc.).
>>
>> I'm having some trouble understanding your message in context of my most
>> recent reply to Russell, so please bear with me here as I will
>> potentially ask a really dumb question:
>>
>> Do I understand this correctly as what used to work in 5.10 was never
>> meant to work and the behavior now is the correct one in the FEC case?
>
> I am not sure about that, because I do not believe this had been
> explicitly accounted for, and that is what I am also trying to figure out.
>
>> Meaning that if the link has never been up the PHY must never be handled
>> from a power management perspective?
>
> Let's stray away from saying "link is UP" but instead say interface
> administratively brought up (ip link set dev <foo> up) so we are on the
> same page.
>
>>
>> The only PHY examples I have come across (though not many in total) the
>> PHY has done some initial configuration of itself after POR or release
>> of the reset line.
>>
>>>
>>> From there, I see two distinct cases:
>>>
>>> - the network device driver probed, but the network device was never
>>> brought up in the first place in that case, I do not see a path whereby
>>> the PHY would have been suspended, unless the boot firmware already took
>>> care of that (which arguably it should if you are trying to be as power
>>> efficient as possible), although arguably there could be a path within
>>> the kernel where this is also done. It could get really complicated
>>> however
>>
>> Generic PM via mdio_bus_phy_suspend() will suspend the PHY if it has a
>> .suspend callback and mac_managed_pm isn't set.
>
> Correct.
>
>>
>> mdio_bus_phy_may_suspend() will see that netdev is NULL, which means it
>> returns the inverse of phy->suspended (which is false), meaning the
>> function returns true. Thus phy_suspend() is called.
>
> OK, so right there is where you were were basically depending upon
> mdio_bus_phy_may_suspend() being called to suspend your PHY device! I
> don't think this had ever been envisioned to be working that way, if it
> did that was a byproduct rather than a contract. Introducing
> mac_managed_pm certainly did break that contract because now we no
> longer have that "double" suspend and resume that used to happen.
I think I understand you know, thanks for clearing that up. Based on
this, even if the missing suspend in cases where the interface was never
administrative up is technically a regression it sounds like a much more
robust fix to our problem would be to do what I considered a work-around
before (using "ip link set <dev> up && ip link set <dev> down" at boot)
would likely be the most correct way to go about it.
I will be dropping this patch from the set in the next iteration.
>
>>
>>>
>>> - the network device driver probed, and the network device was brought
>>> up at least once (regardless of whether a link state was detected or
>>> not), such that the PHY has gone through a phy_start()/phy_stop() cycle,
>>> and upon phy_stop() a phy_suspend() has been called
>>>
>>> It is safe to assume you fall in the first case only, or do you also see
>>> a problem in the second case as well?
>>
>> There is only a problem in the first case. The second case is working as
>> expected.
>
> Thanks for the clarification.
Thanks! // John Ernberg