2021-11-01 09:38:34

by Zhang Changzhong

[permalink] [raw]
Subject: [PATCH net] net: phy: fix duplex out of sync problem while changing settings

Before commit 2bd229df5e2e ("net: phy: remove state PHY_FORCING") if
phy_ethtool_ksettings_set() is called with autoneg off, phy_start_aneg()
will set phydev->state to PHY_FORCING, so adjust_link will always be
called.

But after that commit, if phy_ethtool_ksettings_set() changes the local
link from 10Mbps/Half to 10Mbps/Full when the link partner is
10Mbps/Full, phy_check_link_status() will not trigger the link change,
because phydev->link and phydev->state has not changed. This will causes
the duplex of the PHY and MAC to be out of sync.

Fix it by re-adding the PHY_FORCING state to force adjust_link to be
called in fixed mode.

Fixes: 2bd229df5e2e ("net: phy: remove state PHY_FORCING")
Signed-off-by: Zhang Changzhong <[email protected]>
---
drivers/net/phy/phy.c | 10 ++++++++--
include/linux/phy.h | 6 ++++++
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index a3bfb15..b114f15 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -49,6 +49,7 @@ static const char *phy_state_to_str(enum phy_state st)
PHY_STATE_STR(UP)
PHY_STATE_STR(RUNNING)
PHY_STATE_STR(NOLINK)
+ PHY_STATE_STR(FORCING)
PHY_STATE_STR(CABLETEST)
PHY_STATE_STR(HALTED)
}
@@ -724,8 +725,12 @@ static int _phy_start_aneg(struct phy_device *phydev)
if (err < 0)
return err;

- if (phy_is_started(phydev))
- err = phy_check_link_status(phydev);
+ if (phy_is_started(phydev)) {
+ if (phydev->autoneg == AUTONEG_ENABLE)
+ err = phy_check_link_status(phydev);
+ else
+ phydev->state = PHY_FORCING;
+ }

return err;
}
@@ -1120,6 +1125,7 @@ void phy_state_machine(struct work_struct *work)
needs_aneg = true;

break;
+ case PHY_FORCING:
case PHY_NOLINK:
case PHY_RUNNING:
err = phy_check_link_status(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 736e1d1..e639729 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -446,6 +446,11 @@ struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr);
* - irq or timer will set @PHY_NOLINK if link goes down
* - phy_stop moves to @PHY_HALTED
*
+ * @PHY_FORCING: PHY is being configured with forced settings.
+ * - if link is up, move to @PHY_RUNNING
+ * - if link is down, move to @PHY_NOLINK
+ * - phy_stop moves to @PHY_HALTED
+ *
* @PHY_CABLETEST: PHY is performing a cable test. Packet reception/sending
* is not expected to work, carrier will be indicated as down. PHY will be
* poll once per second, or on interrupt for it current state.
@@ -463,6 +468,7 @@ enum phy_state {
PHY_UP,
PHY_RUNNING,
PHY_NOLINK,
+ PHY_FORCING,
PHY_CABLETEST,
};

--
2.9.5


2021-11-01 21:39:17

by Heiner Kallweit

[permalink] [raw]
Subject: Re: [PATCH net] net: phy: fix duplex out of sync problem while changing settings

On 01.11.2021 10:44, Zhang Changzhong wrote:
> Before commit 2bd229df5e2e ("net: phy: remove state PHY_FORCING") if
> phy_ethtool_ksettings_set() is called with autoneg off, phy_start_aneg()
> will set phydev->state to PHY_FORCING, so adjust_link will always be
> called.
>
> But after that commit, if phy_ethtool_ksettings_set() changes the local
> link from 10Mbps/Half to 10Mbps/Full when the link partner is
> 10Mbps/Full, phy_check_link_status() will not trigger the link change,
> because phydev->link and phydev->state has not changed. This will causes
> the duplex of the PHY and MAC to be out of sync.
>
> Fix it by re-adding the PHY_FORCING state to force adjust_link to be
> called in fixed mode.
>
> Fixes: 2bd229df5e2e ("net: phy: remove state PHY_FORCING")
> Signed-off-by: Zhang Changzhong <[email protected]>
> ---
> drivers/net/phy/phy.c | 10 ++++++++--
> include/linux/phy.h | 6 ++++++
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index a3bfb15..b114f15 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -49,6 +49,7 @@ static const char *phy_state_to_str(enum phy_state st)
> PHY_STATE_STR(UP)
> PHY_STATE_STR(RUNNING)
> PHY_STATE_STR(NOLINK)
> + PHY_STATE_STR(FORCING)
> PHY_STATE_STR(CABLETEST)
> PHY_STATE_STR(HALTED)
> }
> @@ -724,8 +725,12 @@ static int _phy_start_aneg(struct phy_device *phydev)
> if (err < 0)
> return err;
>
> - if (phy_is_started(phydev))
> - err = phy_check_link_status(phydev);
> + if (phy_is_started(phydev)) {
> + if (phydev->autoneg == AUTONEG_ENABLE)
> + err = phy_check_link_status(phydev);
> + else
> + phydev->state = PHY_FORCING;
> + }
>
> return err;
> }
> @@ -1120,6 +1125,7 @@ void phy_state_machine(struct work_struct *work)
> needs_aneg = true;
>
> break;
> + case PHY_FORCING:
> case PHY_NOLINK:
> case PHY_RUNNING:
> err = phy_check_link_status(phydev);
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index 736e1d1..e639729 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -446,6 +446,11 @@ struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr);
> * - irq or timer will set @PHY_NOLINK if link goes down
> * - phy_stop moves to @PHY_HALTED
> *
> + * @PHY_FORCING: PHY is being configured with forced settings.
> + * - if link is up, move to @PHY_RUNNING
> + * - if link is down, move to @PHY_NOLINK
> + * - phy_stop moves to @PHY_HALTED
> + *
> * @PHY_CABLETEST: PHY is performing a cable test. Packet reception/sending
> * is not expected to work, carrier will be indicated as down. PHY will be
> * poll once per second, or on interrupt for it current state.
> @@ -463,6 +468,7 @@ enum phy_state {
> PHY_UP,
> PHY_RUNNING,
> PHY_NOLINK,
> + PHY_FORCING,
> PHY_CABLETEST,
> };
>
>

I see your point, but the proposed fix may not be fully correct. You rely
on the phylib state machine, but there are drivers that use
phy_ethtool_ksettings_set() and some parts of phylib but not the phylib
state machine. One example is AMD xgbe.

Maybe the following is better, could you please give it a try?
By the way: With which MAC driver did you encounter the issue?


diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index a3bfb156c..beb2b66da 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -815,7 +815,12 @@ int phy_ethtool_ksettings_set(struct phy_device *phydev,
phydev->mdix_ctrl = cmd->base.eth_tp_mdix_ctrl;

/* Restart the PHY */
- _phy_start_aneg(phydev);
+ if (phy_is_started(phydev)) {
+ phydev->state = PHY_UP;
+ phy_trigger_machine(phydev);
+ } else {
+ _phy_start_aneg(phydev);
+ }

mutex_unlock(&phydev->lock);
return 0;
--
2.33.1


2021-11-03 08:57:08

by Zhang Changzhong

[permalink] [raw]
Subject: Re: [PATCH net] net: phy: fix duplex out of sync problem while changing settings

On 2021/11/2 5:37, Heiner Kallweit wrote:
> On 01.11.2021 10:44, Zhang Changzhong wrote:
>> Before commit 2bd229df5e2e ("net: phy: remove state PHY_FORCING") if
>> phy_ethtool_ksettings_set() is called with autoneg off, phy_start_aneg()
>> will set phydev->state to PHY_FORCING, so adjust_link will always be
>> called.
>>
>> But after that commit, if phy_ethtool_ksettings_set() changes the local
>> link from 10Mbps/Half to 10Mbps/Full when the link partner is
>> 10Mbps/Full, phy_check_link_status() will not trigger the link change,
>> because phydev->link and phydev->state has not changed. This will causes
>> the duplex of the PHY and MAC to be out of sync.
>>
>> Fix it by re-adding the PHY_FORCING state to force adjust_link to be
>> called in fixed mode.
>>
>> Fixes: 2bd229df5e2e ("net: phy: remove state PHY_FORCING")
>> Signed-off-by: Zhang Changzhong <[email protected]>
>> ---
>> drivers/net/phy/phy.c | 10 ++++++++--
>> include/linux/phy.h | 6 ++++++
>> 2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
>> index a3bfb15..b114f15 100644
>> --- a/drivers/net/phy/phy.c
>> +++ b/drivers/net/phy/phy.c
>> @@ -49,6 +49,7 @@ static const char *phy_state_to_str(enum phy_state st)
>> PHY_STATE_STR(UP)
>> PHY_STATE_STR(RUNNING)
>> PHY_STATE_STR(NOLINK)
>> + PHY_STATE_STR(FORCING)
>> PHY_STATE_STR(CABLETEST)
>> PHY_STATE_STR(HALTED)
>> }
>> @@ -724,8 +725,12 @@ static int _phy_start_aneg(struct phy_device *phydev)
>> if (err < 0)
>> return err;
>>
>> - if (phy_is_started(phydev))
>> - err = phy_check_link_status(phydev);
>> + if (phy_is_started(phydev)) {
>> + if (phydev->autoneg == AUTONEG_ENABLE)
>> + err = phy_check_link_status(phydev);
>> + else
>> + phydev->state = PHY_FORCING;
>> + }
>>
>> return err;
>> }
>> @@ -1120,6 +1125,7 @@ void phy_state_machine(struct work_struct *work)
>> needs_aneg = true;
>>
>> break;
>> + case PHY_FORCING:
>> case PHY_NOLINK:
>> case PHY_RUNNING:
>> err = phy_check_link_status(phydev);
>> diff --git a/include/linux/phy.h b/include/linux/phy.h
>> index 736e1d1..e639729 100644
>> --- a/include/linux/phy.h
>> +++ b/include/linux/phy.h
>> @@ -446,6 +446,11 @@ struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr);
>> * - irq or timer will set @PHY_NOLINK if link goes down
>> * - phy_stop moves to @PHY_HALTED
>> *
>> + * @PHY_FORCING: PHY is being configured with forced settings.
>> + * - if link is up, move to @PHY_RUNNING
>> + * - if link is down, move to @PHY_NOLINK
>> + * - phy_stop moves to @PHY_HALTED
>> + *
>> * @PHY_CABLETEST: PHY is performing a cable test. Packet reception/sending
>> * is not expected to work, carrier will be indicated as down. PHY will be
>> * poll once per second, or on interrupt for it current state.
>> @@ -463,6 +468,7 @@ enum phy_state {
>> PHY_UP,
>> PHY_RUNNING,
>> PHY_NOLINK,
>> + PHY_FORCING,
>> PHY_CABLETEST,
>> };
>>
>>
>
> I see your point, but the proposed fix may not be fully correct. You rely
> on the phylib state machine, but there are drivers that use
> phy_ethtool_ksettings_set() and some parts of phylib but not the phylib
> state machine. One example is AMD xgbe.

Yes, I notice that, thanks for remind!

>
> Maybe the following is better, could you please give it a try?
> By the way: With which MAC driver did you encounter the issue?

We encounter the issue with the driver for our in-house MAC.

>
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index a3bfb156c..beb2b66da 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -815,7 +815,12 @@ int phy_ethtool_ksettings_set(struct phy_device *phydev,
> phydev->mdix_ctrl = cmd->base.eth_tp_mdix_ctrl;
>
> /* Restart the PHY */
> - _phy_start_aneg(phydev);
> + if (phy_is_started(phydev)) {
> + phydev->state = PHY_UP;
> + phy_trigger_machine(phydev);
> + } else {
> + _phy_start_aneg(phydev);
> + }
>
> mutex_unlock(&phydev->lock);
> return 0;
>

We tested it and it works fine, thanks!