2020-11-24 15:01:48

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation in case of 'down-shift'

On Tue, Nov 24, 2020 at 03:38:48PM +0100, Antonio Borneo wrote:
> If the auto-negotiation fails to establish a gigabit link, the phy
> can try to 'down-shift': it resets the bits in MII_CTRL1000 to
> stop advertising 1Gbps and retries the negotiation at 100Mbps.
>
> From commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode
> in genphy_read_status") the content of MII_CTRL1000 is not checked
> anymore at the end of the negotiation, preventing the detection of
> phy 'down-shift'.
> In case of 'down-shift' phydev->advertising gets out-of-sync wrt
> MII_CTRL1000 and still includes modes that the phy have already
> dropped. The link partner could still advertise higher speeds,
> while the link is established at one of the common lower speeds.
> The logic 'and' in phy_resolve_aneg_linkmode() between
> phydev->advertising and phydev->lp_advertising will report an
> incorrect mode.
>
> Issue detected with a local phy rtl8211f connected with a gigabit
> capable router through a two-pairs network cable.
>
> After auto-negotiation, read back MII_CTRL1000 and mask-out from
> phydev->advertising the modes that have been eventually discarded
> due to the 'down-shift'.

Sorry, but no. While your solution will appear to work, in
introduces unexpected changes to the user visible APIs.

> if (phydev->autoneg == AUTONEG_ENABLE && phydev->autoneg_complete) {
> + if (phydev->is_gigabit_capable) {
> + adv = phy_read(phydev, MII_CTRL1000);
> + if (adv < 0)
> + return adv;
> + /* update advertising in case of 'down-shift' */
> + mii_ctrl1000_mod_linkmode_adv_t(phydev->advertising,
> + adv);

If a down-shift occurs, this will cause the configured advertising
mask to lose the 1G speed, which will be visible to userspace.
Userspace doesn't expect the advertising mask to change beneath it.
Since updates from userspace are done using a read-modify-write of
the ksettings, this can have the undesired effect of removing 1G
from the configured advertising mask.

We've had other PHYs have this behaviour; the correct solution is for
the PHY driver to implement reading the resolution from the PHY rather
than relying on the generic implementation if it can down-shift.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


2020-11-24 15:24:31

by Antonio Borneo

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation in case of 'down-shift'

On Tue, 2020-11-24 at 14:56 +0000, Russell King - ARM Linux admin wrote:
> On Tue, Nov 24, 2020 at 03:38:48PM +0100, Antonio Borneo wrote:
> > If the auto-negotiation fails to establish a gigabit link, the phy
> > can try to 'down-shift': it resets the bits in MII_CTRL1000 to
> > stop advertising 1Gbps and retries the negotiation at 100Mbps.
> >
> > From commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode
> > in genphy_read_status") the content of MII_CTRL1000 is not checked
> > anymore at the end of the negotiation, preventing the detection of
> > phy 'down-shift'.
> > In case of 'down-shift' phydev->advertising gets out-of-sync wrt
> > MII_CTRL1000 and still includes modes that the phy have already
> > dropped. The link partner could still advertise higher speeds,
> > while the link is established at one of the common lower speeds.
> > The logic 'and' in phy_resolve_aneg_linkmode() between
> > phydev->advertising and phydev->lp_advertising will report an
> > incorrect mode.
> >
> > Issue detected with a local phy rtl8211f connected with a gigabit
> > capable router through a two-pairs network cable.
> >
> > After auto-negotiation, read back MII_CTRL1000 and mask-out from
> > phydev->advertising the modes that have been eventually discarded
> > due to the 'down-shift'.
>
> Sorry, but no. While your solution will appear to work, in
> introduces unexpected changes to the user visible APIs.
>
> >   if (phydev->autoneg == AUTONEG_ENABLE && phydev->autoneg_complete) {
> > + if (phydev->is_gigabit_capable) {
> > + adv = phy_read(phydev, MII_CTRL1000);
> > + if (adv < 0)
> > + return adv;
> > + /* update advertising in case of 'down-shift' */
> > + mii_ctrl1000_mod_linkmode_adv_t(phydev->advertising,
> > + adv);
>
> If a down-shift occurs, this will cause the configured advertising
> mask to lose the 1G speed, which will be visible to userspace.

You are right, it gets propagated to user that 1Gbps is not advertised

> Userspace doesn't expect the advertising mask to change beneath it.
> Since updates from userspace are done using a read-modify-write of
> the ksettings, this can have the undesired effect of removing 1G
> from the configured advertising mask.
>
> We've had other PHYs have this behaviour; the correct solution is for
> the PHY driver to implement reading the resolution from the PHY rather
> than relying on the generic implementation if it can down-shift

If it's already upstream, could you please point to one of the phy driver
that already implements this properly?

Thanks
Antonio

2020-11-24 15:30:24

by Heiner Kallweit

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation in case of 'down-shift'

Am 24.11.2020 um 16:17 schrieb Antonio Borneo:
> On Tue, 2020-11-24 at 14:56 +0000, Russell King - ARM Linux admin wrote:
>> On Tue, Nov 24, 2020 at 03:38:48PM +0100, Antonio Borneo wrote:
>>> If the auto-negotiation fails to establish a gigabit link, the phy
>>> can try to 'down-shift': it resets the bits in MII_CTRL1000 to
>>> stop advertising 1Gbps and retries the negotiation at 100Mbps.
>>>
>>> From commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode
>>> in genphy_read_status") the content of MII_CTRL1000 is not checked
>>> anymore at the end of the negotiation, preventing the detection of
>>> phy 'down-shift'.
>>> In case of 'down-shift' phydev->advertising gets out-of-sync wrt
>>> MII_CTRL1000 and still includes modes that the phy have already
>>> dropped. The link partner could still advertise higher speeds,
>>> while the link is established at one of the common lower speeds.
>>> The logic 'and' in phy_resolve_aneg_linkmode() between
>>> phydev->advertising and phydev->lp_advertising will report an
>>> incorrect mode.
>>>
>>> Issue detected with a local phy rtl8211f connected with a gigabit
>>> capable router through a two-pairs network cable.
>>>
>>> After auto-negotiation, read back MII_CTRL1000 and mask-out from
>>> phydev->advertising the modes that have been eventually discarded
>>> due to the 'down-shift'.
>>
>> Sorry, but no. While your solution will appear to work, in
>> introduces unexpected changes to the user visible APIs.
>>
>>>   if (phydev->autoneg == AUTONEG_ENABLE && phydev->autoneg_complete) {
>>> + if (phydev->is_gigabit_capable) {
>>> + adv = phy_read(phydev, MII_CTRL1000);
>>> + if (adv < 0)
>>> + return adv;
>>> + /* update advertising in case of 'down-shift' */
>>> + mii_ctrl1000_mod_linkmode_adv_t(phydev->advertising,
>>> + adv);
>>
>> If a down-shift occurs, this will cause the configured advertising
>> mask to lose the 1G speed, which will be visible to userspace.
>
> You are right, it gets propagated to user that 1Gbps is not advertised
>
>> Userspace doesn't expect the advertising mask to change beneath it.
>> Since updates from userspace are done using a read-modify-write of
>> the ksettings, this can have the undesired effect of removing 1G
>> from the configured advertising mask.
>>
>> We've had other PHYs have this behaviour; the correct solution is for
>> the PHY driver to implement reading the resolution from the PHY rather
>> than relying on the generic implementation if it can down-shift
>
> If it's already upstream, could you please point to one of the phy driver
> that already implements this properly?
>

See e.g. aqr107_read_rate(), used by aqr107_read_status().

> Thanks
> Antonio
>

2020-11-24 15:41:50

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation in case of 'down-shift'

On Tue, Nov 24, 2020 at 04:17:42PM +0100, Antonio Borneo wrote:
> On Tue, 2020-11-24 at 14:56 +0000, Russell King - ARM Linux admin wrote:
> > Userspace doesn't expect the advertising mask to change beneath it.
> > Since updates from userspace are done using a read-modify-write of
> > the ksettings, this can have the undesired effect of removing 1G
> > from the configured advertising mask.
> >
> > We've had other PHYs have this behaviour; the correct solution is for
> > the PHY driver to implement reading the resolution from the PHY rather
> > than relying on the generic implementation if it can down-shift
>
> If it's already upstream, could you please point to one of the phy driver
> that already implements this properly?

Reading the resolved information is PHY specific as it isn't
standardised.

Marvell PHYs have read the resolved information for a very long time.
I added support for it to at803x.c:

06d5f3441b2e net: phy: at803x: use operating parameters from PHY-specific status

after it broke for exactly the reason you're reporting for your PHY.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2020-11-24 23:45:57

by Antonio Borneo

[permalink] [raw]
Subject: Re: [PATCH] net: phy: fix auto-negotiation in case of 'down-shift'

On Tue, 2020-11-24 at 15:37 +0000, Russell King - ARM Linux admin wrote:
> On Tue, Nov 24, 2020 at 04:17:42PM +0100, Antonio Borneo wrote:
> > On Tue, 2020-11-24 at 14:56 +0000, Russell King - ARM Linux admin wrote:
> > > Userspace doesn't expect the advertising mask to change beneath it.
> > > Since updates from userspace are done using a read-modify-write of
> > > the ksettings, this can have the undesired effect of removing 1G
> > > from the configured advertising mask.
> > >
> > > We've had other PHYs have this behaviour; the correct solution is for
> > > the PHY driver to implement reading the resolution from the PHY rather
> > > than relying on the generic implementation if it can down-shift
> >
> > If it's already upstream, could you please point to one of the phy driver
> > that already implements this properly?
>
> Reading the resolved information is PHY specific as it isn't
> standardised.

Digging in the info you have provided, I realized that another Realtek PHY
has some specific code already upstream to deal with downshift.
The PHY specific code is added by Heiner in d445dff2df60 ("net: phy:
realtek: read actual speed to detect downshift").
This code reads the actual speed from page 0xa43 address 0x12, that is not
reported in the datasheet of rtl8211f.
But I checked the register content in rtl8211f and it works at the same way
too!

I have added Willy in copy; maybe he can confirm that we can use page 0xa43
address 0x12 on rtl8211f to read the actual speed after negotiation.

In such case the fix for rtl8211f requires just adding the same custom
read_status().

Antonio