2017-04-19 13:47:02

by Alexander Kochetkov

[permalink] [raw]
Subject: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt

The problem I fix related to SMSC LAN8710/LAN8720 PHY handled using
interrupts. During power-up cycle the PHY do auto-negotiation, generate
interrupt and set BMSR_ANEGCOMPLETE flag. Interrupt is handled by PHY
state machine but doesn't update link because PHY is in PHY_READY state.
After some time MAC bring up and connect with PHY. It start PHY using
phy_start(). During startup PHY change state to PHY_AN but doesn't
set BMCR_ANRESTART flag due to genphy_config_aneg() doesn't update MII_BMCR
because there no new to advertising. As a result, state machine wait for
interrupt from PHY and nether get "link is up". Because BMSR_ANEGCOMPLETE
already set the patch schedule check link without waiting interrupt.
In case genphy_config_aneg() update MII_BMCR and set BMCR_ANRESTART
flag, BMSR_ANEGCOMPLETE will be cleared and state machine will continue
on auto-negotiation interrupt.

Signed-off-by: Alexander Kochetkov <[email protected]>
---
drivers/net/phy/phy.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 7cc1b7d..da8f03d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1169,6 +1169,18 @@ void phy_state_machine(struct work_struct *work)
if (phydev->irq == PHY_POLL)
queue_delayed_work(system_power_efficient_wq, &phydev->state_queue,
PHY_STATE_TIME * HZ);
+
+ /* Re-schedule a PHY state machine to check PHY status because
+ * negotiation already done and aneg interrupt may not be generated.
+ */
+ if (needs_aneg && (phydev->irq > 0) && (phydev->state == PHY_AN)) {
+ err = phy_aneg_done(phydev);
+ if (err > 0)
+ queue_delayed_work(system_power_efficient_wq,
+ &phydev->state_queue, 0);
+ if (err < 0)
+ phy_error(phydev);
+ }
}

/**
--
1.7.9.5


2017-04-19 14:06:00

by Alexander Kochetkov

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt

Just found similar problem fixed in another PHY. See commit 99f81afc139c
("phy: micrel: Disable auto negotiation on startup»)

> 19 апр. 2017 г., в 16:46, Alexander Kochetkov <[email protected]> написал(а):
>
> The problem I fix related to SMSC LAN8710/LAN8720 PHY handled using
> interrupts. During power-up cycle the PHY do auto-negotiation, generate
> interrupt and set BMSR_ANEGCOMPLETE flag. Interrupt is handled by PHY
> state machine but doesn't update link because PHY is in PHY_READY state.
> After some time MAC bring up and connect with PHY. It start PHY using
> phy_start(). During startup PHY change state to PHY_AN but doesn't
> set BMCR_ANRESTART flag due to genphy_config_aneg() doesn't update MII_BMCR
> because there no new to advertising. As a result, state machine wait for
> interrupt from PHY and nether get "link is up". Because BMSR_ANEGCOMPLETE
> already set the patch schedule check link without waiting interrupt.
> In case genphy_config_aneg() update MII_BMCR and set BMCR_ANRESTART
> flag, BMSR_ANEGCOMPLETE will be cleared and state machine will continue
> on auto-negotiation interrupt.
>
> Signed-off-by: Alexander Kochetkov <[email protected]>
> ---
> drivers/net/phy/phy.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 7cc1b7d..da8f03d 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -1169,6 +1169,18 @@ void phy_state_machine(struct work_struct *work)
> if (phydev->irq == PHY_POLL)
> queue_delayed_work(system_power_efficient_wq, &phydev->state_queue,
> PHY_STATE_TIME * HZ);
> +
> + /* Re-schedule a PHY state machine to check PHY status because
> + * negotiation already done and aneg interrupt may not be generated.
> + */
> + if (needs_aneg && (phydev->irq > 0) && (phydev->state == PHY_AN)) {
> + err = phy_aneg_done(phydev);
> + if (err > 0)
> + queue_delayed_work(system_power_efficient_wq,
> + &phydev->state_queue, 0);
> + if (err < 0)
> + phy_error(phydev);
> + }
> }
>
> /**
> --
> 1.7.9.5
>


2017-04-19 16:32:58

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt

On 04/19/2017 07:05 AM, Alexander Kochetkov wrote:
> Just found similar problem fixed in another PHY. See commit 99f81afc139c
> ("phy: micrel: Disable auto negotiation on startup»)

This specific commit was really a "this works for me, but I have not
investigated what needs fixing in PHYLIB".

Roger has been submitting a couple of patches to fix the same issue and
yours seems to be better in that it also addresses his concerns that
after his fix there was still a 1s delay (the HZ parameter to the
queue_delayed_work call) being seen:

http://patchwork.ozlabs.org/patch/743773/

Roger can you also test Alexander's patch?

>
>> 19 апр. 2017 г., в 16:46, Alexander Kochetkov <[email protected]> написал(а):
>>
>> The problem I fix related to SMSC LAN8710/LAN8720 PHY handled using
>> interrupts. During power-up cycle the PHY do auto-negotiation, generate
>> interrupt and set BMSR_ANEGCOMPLETE flag. Interrupt is handled by PHY
>> state machine but doesn't update link because PHY is in PHY_READY state.
>> After some time MAC bring up and connect with PHY. It start PHY using
>> phy_start(). During startup PHY change state to PHY_AN but doesn't
>> set BMCR_ANRESTART flag due to genphy_config_aneg() doesn't update MII_BMCR
>> because there no new to advertising. As a result, state machine wait for
>> interrupt from PHY and nether get "link is up". Because BMSR_ANEGCOMPLETE
>> already set the patch schedule check link without waiting interrupt.
>> In case genphy_config_aneg() update MII_BMCR and set BMCR_ANRESTART
>> flag, BMSR_ANEGCOMPLETE will be cleared and state machine will continue
>> on auto-negotiation interrupt.
>>
>> Signed-off-by: Alexander Kochetkov <[email protected]>
>> ---
>> drivers/net/phy/phy.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>> index 7cc1b7d..da8f03d 100644
>> --- a/drivers/net/phy/phy.c
>> +++ b/drivers/net/phy/phy.c
>> @@ -1169,6 +1169,18 @@ void phy_state_machine(struct work_struct *work)
>> if (phydev->irq == PHY_POLL)
>> queue_delayed_work(system_power_efficient_wq, &phydev->state_queue,
>> PHY_STATE_TIME * HZ);
>> +
>> + /* Re-schedule a PHY state machine to check PHY status because
>> + * negotiation already done and aneg interrupt may not be generated.
>> + */
>> + if (needs_aneg && (phydev->irq > 0) && (phydev->state == PHY_AN)) {
>> + err = phy_aneg_done(phydev);
>> + if (err > 0)
>> + queue_delayed_work(system_power_efficient_wq,
>> + &phydev->state_queue, 0);
>> + if (err < 0)
>> + phy_error(phydev);
>> + }
>> }
>>
>> /**
>> --
>> 1.7.9.5
>>
>


--
Florian

2017-04-19 16:54:41

by Alexander Kochetkov

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt


> 19 апр. 2017 г., в 19:32, Florian Fainelli <[email protected]> написал(а):
>
> http://patchwork.ozlabs.org/patch/743773/
>
> Roger can you also test Alexander's patch?

If MAC use phy_start_aneg() instead of phy_start() my patch will not work as
expected. Roger, if patch don’t work for you please check what MAC bring up PHY using
phy_start():

http://patchwork.ozlabs.org/patch/752308/

Is it correct to start PHY inside MAC probe using phy_start_aneg()? Or phy_start() must be used?

And probably this tags should be added for my patch:
Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
Cc: stable <[email protected]> # v4.9+

Because I bisected to commit 529ed1275263 ("net: phy: phy drivers
should not set SUPPORTED_[Asym_]Pause») that looks pretty good.

Also, there is another issue I found. link_timeout doesn’t work for interrupt driven PHY.
It is possible to implement timer to handle this case.
Florian, what do you think? Should this be fixed?

Alexander.


2017-04-19 18:11:50

by Sergei Shtylyov

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt

Hello!

On 04/19/2017 04:46 PM, Alexander Kochetkov wrote:

> The problem I fix related to SMSC LAN8710/LAN8720 PHY handled using
> interrupts. During power-up cycle the PHY do auto-negotiation, generate
> interrupt and set BMSR_ANEGCOMPLETE flag. Interrupt is handled by PHY
> state machine but doesn't update link because PHY is in PHY_READY state.
> After some time MAC bring up and connect with PHY. It start PHY using
> phy_start(). During startup PHY change state to PHY_AN but doesn't
> set BMCR_ANRESTART flag due to genphy_config_aneg() doesn't update MII_BMCR
> because there no new to advertising. As a result, state machine wait for
^^^^^^^^^^^^ ^^^^^^^^^^^
There is no new what? Advertize, maybe?

> interrupt from PHY and nether get "link is up". Because BMSR_ANEGCOMPLETE
^^^^^^
Never, neither? :-)

> already set the patch schedule check link without waiting interrupt.
> In case genphy_config_aneg() update MII_BMCR and set BMCR_ANRESTART
> flag, BMSR_ANEGCOMPLETE will be cleared and state machine will continue
> on auto-negotiation interrupt.
>
> Signed-off-by: Alexander Kochetkov <[email protected]>
[...]

MBR, Sergei

2017-04-20 09:21:28

by Roger Quadros

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation stall due to unavailable interrupt

Hi,

On 19/04/17 19:53, Alexander Kochetkov wrote:
>
>> 19 апр. 2017 г., в 19:32, Florian Fainelli <[email protected]> написал(а):
>>
>> http://patchwork.ozlabs.org/patch/743773/
>>
>> Roger can you also test Alexander's patch?

This patch fixes my problem and doesn't have the 1 second delay that my patch had.

So,

Tested-by: Roger Quadros <[email protected]>

>
> If MAC use phy_start_aneg() instead of phy_start() my patch will not work as
> expected. Roger, if patch don’t work for you please check what MAC bring up PHY using
> phy_start():

Our MAC driver is using phy_start() and I didn't face any issues with your patch.

>
> http://patchwork.ozlabs.org/patch/752308/
>
> Is it correct to start PHY inside MAC probe using phy_start_aneg()? Or phy_start() must be used?
>
> And probably this tags should be added for my patch:
> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
> Cc: stable <[email protected]> # v4.9+
>
> Because I bisected to commit 529ed1275263 ("net: phy: phy drivers
> should not set SUPPORTED_[Asym_]Pause») that looks pretty good.
>
> Also, there is another issue I found. link_timeout doesn’t work for interrupt driven PHY.
> It is possible to implement timer to handle this case.
> Florian, what do you think? Should this be fixed?
>
> Alexander.
>

cheers,
-roger