2022-08-18 17:03:53

by Sean Anderson

[permalink] [raw]
Subject: [PATCH net-next v4 00/10] net: phy: Add support for rate adaptation

This adds support for phy rate adaptation: when a phy adapts between
differing phy interface and link speeds. It was originally submitted as
part of [1], which is considered "v1" of this series.

We need support for rate adaptation for two reasons. First, the phy
consumer needs to know if the phy will perform rate adaptation in order to
program the correct advertising. An unaware consumer will only program
support for link modes at the phy interface mode's native speed. This
will cause autonegotiation to fail if the link partner only advertises
support for lower speed link modes. Second, to reduce packet loss it may
be desirable to throttle packet throughput.

There have been several past discussions [2-4] around adding rate
adaptation support. One point is that we must be certain that rate
adaptation is possible before enabling it. It is the opinion of some
developers that it is the responsibility of the system integrator or end
user to set the link settings appropriately for rate adaptation. In
particular, it was argued that (due to differing firmware) it might not
be clear if a particular phy has rate adaptation enabled. Additionally,
upper-layer protocols must already be tolerant of packet loss caused by
differing rates. Packet loss may happen anyway, such as if a faster link
is used with a slower switch or repeater. So adjusting pause settings
for rate adaptation is not strictly necessary.

I believe that our current approach is limiting, especially when
considering that rate adaptation (in two forms) has made it into IEEE
standards. In general, when we have appropriate information we should
set sensible defaults. To consider use a contrasting example, we enable
pause frames by default for link partners which autonegotiate for them.
When it's the phy itself generating these frames, we don't even have to
autonegotiate to know that we should enable pause frames.

Our current approach also encourages workarounds, such as commit
73a21fa817f0 ("dpaa_eth: support all modes with rate adapting PHYs").
These workarounds are fine for phylib drivers, but phylink drivers cannot
use this approach (since there is no direct access to the phy).

Although in earlier versions of this series, userspace could disable
rate adaptation, now it is only possible to determine the current rate
adaptation type. Disabling or otherwise configuring rate adaptation has
been left for future work. However, because currently only
RATE_ADAPT_PAUSE is implemented, it is possible to disable rate
adaptation by modifying the advertisement appropriately.

[1] https://lore.kernel.org/netdev/[email protected]/T/#t
[2] https://lore.kernel.org/netdev/[email protected]/
[3] https://lore.kernel.org/netdev/[email protected]/
[4] https://lore.kernel.org/netdev/[email protected]/

Changes in v4:
- Wrap docstring to 80 columns
- Export phy_rate_adaptation_to_str
- Remove phylink_interface_max_speed, which was accidentally added
- Split off the LS1046ARDB 1G fix

Changes in v3:
- Document MAC_(A)SYM_PAUSE
- Add some helpers for working with mac caps
- Modify link settings directly in phylink_link_up, instead of doing
things more indirectly via link_*.
- Add phylink_cap_from_speed_duplex to look up the mac capability
corresponding to the interface's speed.
- Include RATE_ADAPT_CRS; it's a few lines and it doesn't hurt.
- Move unused defines to next commit (where they will be used)
- Remove "Support differing link/interface speed/duplex". It has been
rendered unnecessary due to simplification of the rate adaptation
patches. Thanks Russell!
- Rewrite cover letter to better reflect the opinions of the developers
involved

Changes in v2:
- Use int/defines instead of enum to allow for use in ioctls/netlink
- Add locking to phy_get_rate_adaptation
- Add (read-only) ethtool support for rate adaptation
- Move part of commit message to cover letter, as it gives a good
overview of the whole series, and allows this patch to focus more on
the specifics.
- Use the phy's rate adaptation setting to determine whether to use its
link speed/duplex or the MAC's speed/duplex with MLO_AN_INBAND.
- Always use the rate adaptation setting to determine the interface
speed/duplex (instead of sometimes using the interface mode).
- Determine the interface speed and max mac speed directly instead of
guessing based on the caps.
- Add comments clarifying the register defines
- Reorder variables in aqr107_read_rate

Sean Anderson (10):
net: phy: Add 1000BASE-KX interface mode
net: phylink: Document MAC_(A)SYM_PAUSE
net: phylink: Export phylink_caps_to_linkmodes
net: phylink: Generate caps and convert to linkmodes separately
net: phylink: Add some helpers for working with mac caps
net: phy: Add support for rate adaptation
net: phylink: Adjust link settings based on rate adaptation
net: phylink: Adjust advertisement based on rate adaptation
net: phy: aquantia: Add some additional phy interfaces
net: phy: aquantia: Add support for rate adaptation

Documentation/networking/ethtool-netlink.rst | 2 +
drivers/net/phy/aquantia_main.c | 68 ++++-
drivers/net/phy/phy-core.c | 16 ++
drivers/net/phy/phy.c | 28 ++
drivers/net/phy/phylink.c | 279 +++++++++++++++++--
include/linux/phy.h | 26 +-
include/linux/phylink.h | 29 +-
include/uapi/linux/ethtool.h | 18 +-
include/uapi/linux/ethtool_netlink.h | 1 +
net/ethtool/ioctl.c | 1 +
net/ethtool/linkmodes.c | 5 +
11 files changed, 439 insertions(+), 34 deletions(-)

--
2.35.1.1320.gc452695387.dirty


2022-08-18 17:04:15

by Sean Anderson

[permalink] [raw]
Subject: [PATCH net-next v4 01/10] net: phy: Add 1000BASE-KX interface mode

Add 1000BASE-KX interface mode. This 1G backplane ethernet as described in
clause 70. Clause 73 autonegotiation is mandatory, and only full duplex
operation is supported.

Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v1)

drivers/net/phy/phylink.c | 1 +
include/linux/phy.h | 4 ++++
2 files changed, 5 insertions(+)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 9bd69328dc4d..b08716fe22c1 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -344,6 +344,7 @@ void phylink_get_linkmodes(unsigned long *linkmodes, phy_interface_t interface,
case PHY_INTERFACE_MODE_1000BASEX:
caps |= MAC_1000HD;
fallthrough;
+ case PHY_INTERFACE_MODE_1000BASEKX:
case PHY_INTERFACE_MODE_TRGMII:
caps |= MAC_1000FD;
break;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 87638c55d844..81ce76c3e799 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -115,6 +115,7 @@ extern const int phy_10gbit_features_array[1];
* @PHY_INTERFACE_MODE_25GBASER: 25G BaseR
* @PHY_INTERFACE_MODE_USXGMII: Universal Serial 10GE MII
* @PHY_INTERFACE_MODE_10GKR: 10GBASE-KR - with Clause 73 AN
+ * @PHY_INTERFACE_MODE_1000BASEKX: 1000Base-KX - with Clause 73 AN
* @PHY_INTERFACE_MODE_MAX: Book keeping
*
* Describes the interface between the MAC and PHY.
@@ -152,6 +153,7 @@ typedef enum {
PHY_INTERFACE_MODE_USXGMII,
/* 10GBASE-KR - with Clause 73 AN */
PHY_INTERFACE_MODE_10GKR,
+ PHY_INTERFACE_MODE_1000BASEKX,
PHY_INTERFACE_MODE_MAX,
} phy_interface_t;

@@ -249,6 +251,8 @@ static inline const char *phy_modes(phy_interface_t interface)
return "trgmii";
case PHY_INTERFACE_MODE_1000BASEX:
return "1000base-x";
+ case PHY_INTERFACE_MODE_1000BASEKX:
+ return "1000base-kx";
case PHY_INTERFACE_MODE_2500BASEX:
return "2500base-x";
case PHY_INTERFACE_MODE_5GBASER:
--
2.35.1.1320.gc452695387.dirty

2022-08-18 17:17:14

by Sean Anderson

[permalink] [raw]
Subject: [PATCH net-next v4 02/10] net: phylink: Document MAC_(A)SYM_PAUSE

This documents the possible MLO_PAUSE_* settings which can result from
different combinations of MLO_(A)SYM_PAUSE. These are more-or-less a
direct consequence of Table 28B-2.

Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v3)

Changes in v3:
- New

include/linux/phylink.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 6d06896fc20d..9629bcd594b1 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -21,6 +21,22 @@ enum {
MLO_AN_FIXED, /* Fixed-link mode */
MLO_AN_INBAND, /* In-band protocol */

+ /* MAC_SYM_PAUSE and MAC_ASYM_PAUSE correspond to the PAUSE and
+ * ASM_DIR bits used in autonegotiation, respectively. See IEEE802.3
+ * Annex 28B for more information.
+ *
+ * The following table lists the values of MLO_PAUSE_* (aside from
+ * MLO_PAUSE_AN) which might be requested depending on the results of
+ * autonegotiation or user configuration:
+ *
+ * MAC_SYM_PAUSE MAC_ASYM_PAUSE Valid pause modes
+ * ============= ============== ==============================
+ * 0 0 MLO_PAUSE_NONE
+ * 0 1 MLO_PAUSE_NONE, MLO_PAUSE_TX
+ * 1 0 MLO_PAUSE_NONE, MLO_PAUSE_TXRX
+ * 1 1 MLO_PAUSE_NONE, MLO_PAUSE_TXRX,
+ * MLO_PAUSE_RX
+ */
MAC_SYM_PAUSE = BIT(0),
MAC_ASYM_PAUSE = BIT(1),
MAC_10HD = BIT(2),
--
2.35.1.1320.gc452695387.dirty

2022-08-18 17:18:43

by Sean Anderson

[permalink] [raw]
Subject: [PATCH net-next v4 04/10] net: phylink: Generate caps and convert to linkmodes separately

If we call phylink_caps_to_linkmodes directly from
phylink_get_linkmodes, it is difficult to re-use this functionality in
MAC drivers. This is because MAC drivers must then work with an ethtool
linkmode bitmap, instead of with mac capabilities. Instead, let the
caller of phylink_get_linkmodes do the conversion. To reflect this
change, rename the function to phylink_get_capabilities.

Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v1)

drivers/net/phy/phylink.c | 21 +++++++++++----------
include/linux/phylink.h | 4 ++--
2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 51e66320526f..68a58ab6a8ed 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -305,17 +305,15 @@ void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps)
EXPORT_SYMBOL_GPL(phylink_caps_to_linkmodes);

/**
- * phylink_get_linkmodes() - get acceptable link modes
- * @linkmodes: ethtool linkmode mask (must be already initialised)
+ * phylink_get_capabilities() - get capabilities for a given MAC
* @interface: phy interface mode defined by &typedef phy_interface_t
* @mac_capabilities: bitmask of MAC capabilities
*
- * Set all possible pause, speed and duplex linkmodes in @linkmodes that
- * are supported by the @interface mode and @mac_capabilities. @linkmodes
- * must have been initialised previously.
+ * Get the MAC capabilities that are supported by the @interface mode and
+ * @mac_capabilities.
*/
-void phylink_get_linkmodes(unsigned long *linkmodes, phy_interface_t interface,
- unsigned long mac_capabilities)
+unsigned long phylink_get_capabilities(phy_interface_t interface,
+ unsigned long mac_capabilities)
{
unsigned long caps = MAC_SYM_PAUSE | MAC_ASYM_PAUSE;

@@ -390,9 +388,9 @@ void phylink_get_linkmodes(unsigned long *linkmodes, phy_interface_t interface,
break;
}

- phylink_caps_to_linkmodes(linkmodes, caps & mac_capabilities);
+ return caps & mac_capabilities;
}
-EXPORT_SYMBOL_GPL(phylink_get_linkmodes);
+EXPORT_SYMBOL_GPL(phylink_get_capabilities);

/**
* phylink_generic_validate() - generic validate() callback implementation
@@ -408,11 +406,14 @@ void phylink_generic_validate(struct phylink_config *config,
unsigned long *supported,
struct phylink_link_state *state)
{
+ unsigned long caps;
__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };

phylink_set_port_modes(mask);
phylink_set(mask, Autoneg);
- phylink_get_linkmodes(mask, state->interface, config->mac_capabilities);
+ caps = phylink_get_capabilities(state->interface,
+ config->mac_capabilities);
+ phylink_caps_to_linkmodes(mask, caps);

linkmode_and(supported, supported, mask);
linkmode_and(state->advertising, state->advertising, mask);
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 2a6d2f7a7ebb..661d1d4fdbec 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -535,8 +535,8 @@ void pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
#endif

void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps);
-void phylink_get_linkmodes(unsigned long *linkmodes, phy_interface_t interface,
- unsigned long mac_capabilities);
+unsigned long phylink_get_capabilities(phy_interface_t interface,
+ unsigned long mac_capabilities);
void phylink_generic_validate(struct phylink_config *config,
unsigned long *supported,
struct phylink_link_state *state);
--
2.35.1.1320.gc452695387.dirty

2022-08-18 17:19:30

by Sean Anderson

[permalink] [raw]
Subject: [PATCH net-next v4 05/10] net: phylink: Add some helpers for working with mac caps

This adds a table for converting between speed/duplex and mac
capabilities. It also adds a helper for getting the max speed/duplex
from some caps. It is intended to be used by Russell King's DSA phylink
series. The table will be used directly later in this series.

Co-developed-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>
[ adapted to live in phylink.c ]
Signed-off-by: Sean Anderson <[email protected]>
---
This is adapted from [1].

[1] https://lore.kernel.org/netdev/[email protected]/

Changes in v4:
- Wrap docstring to 80 columns

Changes in v3:
- New

drivers/net/phy/phylink.c | 57 +++++++++++++++++++++++++++++++++++++++
include/linux/phylink.h | 2 ++
2 files changed, 59 insertions(+)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 68a58ab6a8ed..8a9da7449c73 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -304,6 +304,63 @@ void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps)
}
EXPORT_SYMBOL_GPL(phylink_caps_to_linkmodes);

+static struct {
+ unsigned long mask;
+ int speed;
+ unsigned int duplex;
+} phylink_caps_params[] = {
+ { MAC_400000FD, SPEED_400000, DUPLEX_FULL },
+ { MAC_200000FD, SPEED_200000, DUPLEX_FULL },
+ { MAC_100000FD, SPEED_100000, DUPLEX_FULL },
+ { MAC_56000FD, SPEED_56000, DUPLEX_FULL },
+ { MAC_50000FD, SPEED_50000, DUPLEX_FULL },
+ { MAC_40000FD, SPEED_40000, DUPLEX_FULL },
+ { MAC_25000FD, SPEED_25000, DUPLEX_FULL },
+ { MAC_20000FD, SPEED_20000, DUPLEX_FULL },
+ { MAC_10000FD, SPEED_10000, DUPLEX_FULL },
+ { MAC_5000FD, SPEED_5000, DUPLEX_FULL },
+ { MAC_2500FD, SPEED_2500, DUPLEX_FULL },
+ { MAC_1000FD, SPEED_1000, DUPLEX_FULL },
+ { MAC_1000HD, SPEED_1000, DUPLEX_HALF },
+ { MAC_100FD, SPEED_100, DUPLEX_FULL },
+ { MAC_100HD, SPEED_100, DUPLEX_HALF },
+ { MAC_10FD, SPEED_10, DUPLEX_FULL },
+ { MAC_10HD, SPEED_10, DUPLEX_HALF },
+};
+
+/**
+ * phylink_caps_find_max_speed() - Find the max speed/duplex of mac capabilities
+ * @caps: A mask of mac capabilities
+ * @speed: Variable to store the maximum speed
+ * @duplex: Variable to store the maximum duplex
+ *
+ * Find the maximum speed (and associated duplex) supported by a mask of mac
+ * capabilities. @speed and @duplex are always set, even if no matching mac
+ * capability was found.
+ *
+ * Return: 0 on success, or %-EINVAL if the maximum speed/duplex could not be
+ * determined.
+ */
+int phylink_caps_find_max_speed(unsigned long caps, int *speed,
+ unsigned int *duplex)
+{
+ int i;
+
+ *speed = SPEED_UNKNOWN;
+ *duplex = DUPLEX_UNKNOWN;
+
+ for (i = 0; i < ARRAY_SIZE(phylink_caps_params); i++) {
+ if (caps & phylink_caps_params[i].mask) {
+ *speed = phylink_caps_params[i].speed;
+ *duplex = phylink_caps_params[i].duplex;
+ return 0;
+ }
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(phylink_caps_find_max_speed);
+
/**
* phylink_get_capabilities() - get capabilities for a given MAC
* @interface: phy interface mode defined by &typedef phy_interface_t
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 661d1d4fdbec..a5a236cfacb6 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -535,6 +535,8 @@ void pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
#endif

void phylink_caps_to_linkmodes(unsigned long *linkmodes, unsigned long caps);
+int phylink_caps_find_max_speed(unsigned long caps, int *speed,
+ unsigned int *duplex);
unsigned long phylink_get_capabilities(phy_interface_t interface,
unsigned long mac_capabilities);
void phylink_generic_validate(struct phylink_config *config,
--
2.35.1.1320.gc452695387.dirty