Add support for hardware statistics counters (if they are enabled) in
the AXI Ethernet driver. Unfortunately, the implementation is
complicated a bit since the hardware might only support 32-bit counters.
In the future, we could add a simpler alternate implementation for
64-bit counters enabled with a device-tree property, but for now I've
gone with a unified implementation.
Sean Anderson (3):
net: xilinx: axienet: Use NL_SET_ERR_MSG instead of netdev_err
net: xilinx: axienet: Report RxRject as rx_dropped
net: xilinx: axienet: Add statistics support
drivers/net/ethernet/xilinx/xilinx_axienet.h | 81 ++++++
.../net/ethernet/xilinx/xilinx_axienet_main.c | 275 +++++++++++++++++-
2 files changed, 349 insertions(+), 7 deletions(-)
--
2.35.1.1320.gc452695387.dirty
The Receive Frame Rejected interrupt is asserted whenever there was a
receive error (bad FCS, bad length, etc.) or whenever the frame was
dropped due to a mismatched address. So this is really a combination of
rx_otherhost_dropped, rx_length_errors, rx_frame_errors, and
rx_crc_errors. Mismatched addresses are common and aren't really errors
at all (much like how fragments are normal on half-duplex links). To
avoid confusion, report these events as rx_dropped. This better
reflects what's going on: the packet was received by the MAC but dropped
before being processed.
Signed-off-by: Sean Anderson <[email protected]>
---
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 5f98daa5b341..cf8908794409 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1296,7 +1296,7 @@ static irqreturn_t axienet_eth_irq(int irq, void *_ndev)
ndev->stats.rx_missed_errors++;
if (pending & XAE_INT_RXRJECT_MASK)
- ndev->stats.rx_frame_errors++;
+ ndev->stats.rx_dropped++;
axienet_iow(lp, XAE_IS_OFFSET, pending);
return IRQ_HANDLED;
--
2.35.1.1320.gc452695387.dirty
This error message can be triggered by userspace. Use NL_SET_ERR_MSG so
the message is returned to the user and to avoid polluting the kernel
logs.
Signed-off-by: Sean Anderson <[email protected]>
---
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index c29809cd9201..5f98daa5b341 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1945,9 +1945,9 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
struct axienet_local *lp = netdev_priv(ndev);
if (netif_running(ndev)) {
- netdev_err(ndev,
- "Please stop netif before applying configuration\n");
- return -EFAULT;
+ NL_SET_ERR_MSG(extack,
+ "Please stop netif before applying configuration");
+ return -EBUSY;
}
if (ecoalesce->rx_max_coalesced_frames)
--
2.35.1.1320.gc452695387.dirty
Add support for reading the statistics counters, if they are enabled.
The counters may be 64-bit, but we can't detect this as there's no
ability bit for it and the counters are read-only. Therefore, we assume
the counters are 32-bits. To ensure we don't miss an overflow, we need
to read all counters at regular intervals, configurable with
stats-block-usecs. This should be often enough to ensure the bytes
counters don't wrap at 2.5 Gbit/s.
Another complication is that the counters may be reset when the device
is reset (depending on configuration). To ensure the counters persist
across link up/down (including suspend/resume), we maintain our own
64-bit versions along with the last counter value we saw. Because we
might wait up to 100 ms for the reset to complete, we use a mutex to
protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
u64_stats_sync to protect readers.
We can't use the byte counters for either get_stats64 or
get_eth_mac_stats. This is because the byte counters include everything
in the frame (destination address to FCS, inclusive). But
rtnl_link_stats64 wants bytes excluding the FCS, and
ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
length/type).
Signed-off-by: Sean Anderson <[email protected]>
---
drivers/net/ethernet/xilinx/xilinx_axienet.h | 81 ++++++
.../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
2 files changed, 345 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
index fa5500decc96..c121e9b1c41d 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
@@ -156,6 +156,7 @@
#define XAE_TPID0_OFFSET 0x00000028 /* VLAN TPID0 register */
#define XAE_TPID1_OFFSET 0x0000002C /* VLAN TPID1 register */
#define XAE_PPST_OFFSET 0x00000030 /* PCS PMA Soft Temac Status Reg */
+#define XAE_STATS_OFFSET 0x00000200 /* Statistics counters */
#define XAE_RCW0_OFFSET 0x00000400 /* Rx Configuration Word 0 */
#define XAE_RCW1_OFFSET 0x00000404 /* Rx Configuration Word 1 */
#define XAE_TC_OFFSET 0x00000408 /* Tx Configuration */
@@ -163,6 +164,7 @@
#define XAE_EMMC_OFFSET 0x00000410 /* EMAC mode configuration */
#define XAE_PHYC_OFFSET 0x00000414 /* RGMII/SGMII configuration */
#define XAE_ID_OFFSET 0x000004F8 /* Identification register */
+#define XAE_ABILITY_OFFSET 0x000004FC /* Ability Register offset */
#define XAE_MDIO_MC_OFFSET 0x00000500 /* MII Management Config */
#define XAE_MDIO_MCR_OFFSET 0x00000504 /* MII Management Control */
#define XAE_MDIO_MWD_OFFSET 0x00000508 /* MII Management Write Data */
@@ -283,6 +285,16 @@
#define XAE_PHYC_SGLINKSPD_100 0x40000000 /* SGMII link 100 Mbit */
#define XAE_PHYC_SGLINKSPD_1000 0x80000000 /* SGMII link 1000 Mbit */
+/* Bit masks for Axi Ethernet ability register */
+#define XAE_ABILITY_PFC BIT(16)
+#define XAE_ABILITY_FRAME_FILTER BIT(10)
+#define XAE_ABILITY_HALF_DUPLEX BIT(9)
+#define XAE_ABILITY_STATS BIT(8)
+#define XAE_ABILITY_2_5G BIT(3)
+#define XAE_ABILITY_1G BIT(2)
+#define XAE_ABILITY_100M BIT(1)
+#define XAE_ABILITY_10M BIT(0)
+
/* Bit masks for Axi Ethernet MDIO interface MC register */
#define XAE_MDIO_MC_MDIOEN_MASK 0x00000040 /* MII management enable */
#define XAE_MDIO_MC_CLOCK_DIVIDE_MAX 0x3F /* Maximum MDIO divisor */
@@ -331,6 +343,7 @@
#define XAE_FEATURE_FULL_RX_CSUM (1 << 2)
#define XAE_FEATURE_FULL_TX_CSUM (1 << 3)
#define XAE_FEATURE_DMA_64BIT (1 << 4)
+#define XAE_FEATURE_STATS (1 << 5)
#define XAE_NO_CSUM_OFFLOAD 0
@@ -344,6 +357,61 @@
#define XLNX_MII_STD_SELECT_REG 0x11
#define XLNX_MII_STD_SELECT_SGMII BIT(0)
+/* enum temac_stat - TEMAC statistics counters
+ *
+ * Index of statistics counters within the TEMAC. This must match the
+ * order/offset of hardware registers exactly.
+ */
+enum temac_stat {
+ STAT_RX_BYTES = 0,
+ STAT_TX_BYTES,
+ STAT_UNDERSIZE_FRAMES,
+ STAT_FRAGMENT_FRAMES,
+ STAT_RX_64_BYTE_FRAMES,
+ STAT_RX_65_127_BYTE_FRAMES,
+ STAT_RX_128_255_BYTE_FRAMES,
+ STAT_RX_256_511_BYTE_FRAMES,
+ STAT_RX_512_1023_BYTE_FRAMES,
+ STAT_RX_1024_MAX_BYTE_FRAMES,
+ STAT_RX_OVERSIZE_FRAMES,
+ STAT_TX_64_BYTE_FRAMES,
+ STAT_TX_65_127_BYTE_FRAMES,
+ STAT_TX_128_255_BYTE_FRAMES,
+ STAT_TX_256_511_BYTE_FRAMES,
+ STAT_TX_512_1023_BYTE_FRAMES,
+ STAT_TX_1024_MAX_BYTE_FRAMES,
+ STAT_TX_OVERSIZE_FRAMES,
+ STAT_RX_GOOD_FRAMES,
+ STAT_RX_FCS_ERRORS,
+ STAT_RX_BROADCAST_FRAMES,
+ STAT_RX_MULTICAST_FRAMES,
+ STAT_RX_CONTROL_FRAMES,
+ STAT_RX_LENGTH_ERRORS,
+ STAT_RX_VLAN_FRAMES,
+ STAT_RX_PAUSE_FRAMES,
+ STAT_RX_CONTROL_OPCODE_ERRORS,
+ STAT_TX_GOOD_FRAMES,
+ STAT_TX_BROADCAST_FRAMES,
+ STAT_TX_MULTICAST_FRAMES,
+ STAT_TX_UNDERRUN_ERRORS,
+ STAT_TX_CONTROL_FRAMES,
+ STAT_TX_VLAN_FRAMES,
+ STAT_TX_PAUSE_FRAMES,
+ STAT_TX_SINGLE_COLLISION_FRAMES,
+ STAT_TX_MULTIPLE_COLLISION_FRAMES,
+ STAT_TX_DEFERRED_FRAMES,
+ STAT_TX_LATE_COLLISIONS,
+ STAT_TX_EXCESS_COLLISIONS,
+ STAT_TX_EXCESS_DEFERRAL,
+ STAT_RX_ALIGNMENT_ERRORS,
+ STAT_TX_PFC_FRAMES,
+ STAT_RX_PFC_FRAMES,
+ STAT_USER_DEFINED0,
+ STAT_USER_DEFINED1,
+ STAT_USER_DEFINED2,
+ STAT_COUNT,
+};
+
/**
* struct axidma_bd - Axi Dma buffer descriptor layout
* @next: MM2S/S2MM Next Descriptor Pointer
@@ -434,6 +502,11 @@ struct skbuf_dma_descriptor {
* @tx_packets: TX packet count for statistics
* @tx_bytes: TX byte count for statistics
* @tx_stat_sync: Synchronization object for TX stats
+ * @hw_last_counter: Last-seen value of each statistic
+ * @hw_stats: Interface statistics periodically updated from hardware counters
+ * @hw_stats_sync: Synchronization object for @hw_stats
+ * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
+ * @stats_work: Work for reading the hardware statistics counters
* @dma_err_task: Work structure to process Axi DMA errors
* @tx_irq: Axidma TX IRQ number
* @rx_irq: Axidma RX IRQ number
@@ -452,6 +525,7 @@ struct skbuf_dma_descriptor {
* @coalesce_usec_rx: IRQ coalesce delay for RX
* @coalesce_count_tx: Store the irq coalesce on TX side.
* @coalesce_usec_tx: IRQ coalesce delay for TX
+ * @coalesce_usec_stats: Delay between hardware statistics refreshes
* @use_dmaengine: flag to check dmaengine framework usage.
* @tx_chan: TX DMA channel.
* @rx_chan: RX DMA channel.
@@ -505,6 +579,12 @@ struct axienet_local {
u64_stats_t tx_bytes;
struct u64_stats_sync tx_stat_sync;
+ u32 hw_last_counter[STAT_COUNT];
+ u64_stats_t hw_stats[STAT_COUNT];
+ struct u64_stats_sync hw_stat_sync;
+ struct mutex stats_lock;
+ struct delayed_work stats_work;
+
struct work_struct dma_err_task;
int tx_irq;
@@ -525,6 +605,7 @@ struct axienet_local {
u32 coalesce_usec_rx;
u32 coalesce_count_tx;
u32 coalesce_usec_tx;
+ u32 coalesce_usec_stats;
u8 use_dmaengine;
struct dma_chan *tx_chan;
struct dma_chan *rx_chan;
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index cf8908794409..0ec1dc6dde3f 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -518,11 +518,53 @@ static void axienet_setoptions(struct net_device *ndev, u32 options)
lp->options |= options;
}
+static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
+{
+ return u64_stats_read(&lp->hw_stats[stat]);
+}
+
+static void axienet_stats_update(struct axienet_local *lp)
+{
+ enum temac_stat stat;
+
+ lockdep_assert_held(&lp->stats_lock);
+
+ u64_stats_update_begin(&lp->hw_stat_sync);
+ for (stat = 0; stat < STAT_COUNT; stat++) {
+ u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
+
+ u64_stats_add(&lp->hw_stats[stat],
+ counter - lp->hw_last_counter[stat]);
+ lp->hw_last_counter[stat] = counter;
+ }
+ u64_stats_update_end(&lp->hw_stat_sync);
+}
+
+static void axienet_refresh_stats(struct work_struct *work)
+{
+ struct axienet_local *lp = container_of(work, struct axienet_local,
+ stats_work.work);
+
+ mutex_lock(&lp->stats_lock);
+ axienet_stats_update(lp);
+ mutex_unlock(&lp->stats_lock);
+
+ if (lp->coalesce_usec_stats)
+ schedule_delayed_work(&lp->stats_work,
+ usecs_to_jiffies(lp->coalesce_usec_stats));
+}
+
static int __axienet_device_reset(struct axienet_local *lp)
{
u32 value;
int ret;
+ /* Save statistics counters in case they will be reset */
+ if (lp->features & XAE_FEATURE_STATS) {
+ mutex_lock(&lp->stats_lock);
+ axienet_stats_update(lp);
+ }
+
/* Reset Axi DMA. This would reset Axi Ethernet core as well. The reset
* process of Axi DMA takes a while to complete as all pending
* commands/transfers will be flushed or completed during this
@@ -537,7 +579,7 @@ static int __axienet_device_reset(struct axienet_local *lp)
XAXIDMA_TX_CR_OFFSET);
if (ret) {
dev_err(lp->dev, "%s: DMA reset timeout!\n", __func__);
- return ret;
+ goto err;
}
/* Wait for PhyRstCmplt bit to be set, indicating the PHY reset has finished */
@@ -547,10 +589,25 @@ static int __axienet_device_reset(struct axienet_local *lp)
XAE_IS_OFFSET);
if (ret) {
dev_err(lp->dev, "%s: timeout waiting for PhyRstCmplt\n", __func__);
- return ret;
+ goto err;
+ }
+
+ /* Update statistics counters with new values */
+ if (lp->features & XAE_FEATURE_STATS) {
+ enum temac_stat stat;
+
+ for (stat = 0; stat < STAT_COUNT; stat++)
+ lp->hw_last_counter[stat] =
+ axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
+ mutex_unlock(&lp->stats_lock);
}
return 0;
+
+err:
+ if (lp->features & XAE_FEATURE_STATS)
+ mutex_unlock(&lp->stats_lock);
+ return ret;
}
/**
@@ -1532,6 +1589,11 @@ static int axienet_open(struct net_device *ndev)
phylink_start(lp->phylink);
+ /* Start the statistics refresh work */
+ if (lp->coalesce_usec_stats)
+ schedule_delayed_work(&lp->stats_work,
+ usecs_to_jiffies(lp->coalesce_usec_stats));
+
if (lp->use_dmaengine) {
/* Enable interrupts for Axi Ethernet core (if defined) */
if (lp->eth_irq > 0) {
@@ -1556,6 +1618,7 @@ static int axienet_open(struct net_device *ndev)
if (lp->eth_irq > 0)
free_irq(lp->eth_irq, ndev);
err_phy:
+ cancel_delayed_work_sync(&lp->stats_work);
phylink_stop(lp->phylink);
phylink_disconnect_phy(lp->phylink);
return ret;
@@ -1583,6 +1646,8 @@ static int axienet_stop(struct net_device *ndev)
napi_disable(&lp->napi_rx);
}
+ cancel_delayed_work_sync(&lp->stats_work);
+
phylink_stop(lp->phylink);
phylink_disconnect_phy(lp->phylink);
@@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
stats->tx_packets = u64_stats_read(&lp->tx_packets);
stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
} while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
+
+ if (!(lp->features & XAE_FEATURE_STATS))
+ return;
+
+ do {
+ start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+ stats->rx_length_errors =
+ axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
+ stats->rx_crc_errors = axienet_stat(lp, STAT_RX_FCS_ERRORS);
+ stats->rx_frame_errors =
+ axienet_stat(lp, STAT_RX_ALIGNMENT_ERRORS);
+ stats->rx_errors = axienet_stat(lp, STAT_UNDERSIZE_FRAMES) +
+ axienet_stat(lp, STAT_FRAGMENT_FRAMES) +
+ stats->rx_length_errors +
+ stats->rx_crc_errors +
+ stats->rx_frame_errors;
+ stats->multicast = axienet_stat(lp, STAT_RX_MULTICAST_FRAMES);
+
+ stats->tx_aborted_errors =
+ axienet_stat(lp, STAT_TX_EXCESS_COLLISIONS);
+ stats->tx_fifo_errors =
+ axienet_stat(lp, STAT_TX_UNDERRUN_ERRORS);
+ stats->tx_window_errors =
+ axienet_stat(lp, STAT_TX_LATE_COLLISIONS);
+ stats->tx_errors = axienet_stat(lp, STAT_TX_EXCESS_DEFERRAL) +
+ stats->tx_aborted_errors +
+ stats->tx_fifo_errors +
+ stats->tx_window_errors;
+ } while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
}
static const struct net_device_ops axienet_netdev_ops = {
@@ -1920,6 +2014,7 @@ axienet_ethtools_get_coalesce(struct net_device *ndev,
ecoalesce->rx_coalesce_usecs = lp->coalesce_usec_rx;
ecoalesce->tx_max_coalesced_frames = lp->coalesce_count_tx;
ecoalesce->tx_coalesce_usecs = lp->coalesce_usec_tx;
+ ecoalesce->stats_block_coalesce_usecs = lp->coalesce_usec_stats;
return 0;
}
@@ -1958,6 +2053,9 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
lp->coalesce_count_tx = ecoalesce->tx_max_coalesced_frames;
if (ecoalesce->tx_coalesce_usecs)
lp->coalesce_usec_tx = ecoalesce->tx_coalesce_usecs;
+ /* Just less than 2^32 bytes at 2.5 GBit/s */
+ lp->coalesce_usec_stats = min(ecoalesce->stats_block_coalesce_usecs,
+ 13 * USEC_PER_SEC);
return 0;
}
@@ -1987,9 +2085,160 @@ static int axienet_ethtools_nway_reset(struct net_device *dev)
return phylink_ethtool_nway_reset(lp->phylink);
}
+static void
+axienet_ethtools_get_pause_stats(struct net_device *dev,
+ struct ethtool_pause_stats *pause_stats)
+{
+ struct axienet_local *lp = netdev_priv(dev);
+ unsigned int start;
+
+ if (!(lp->features & XAE_FEATURE_STATS))
+ return;
+
+ do {
+ start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+ pause_stats->tx_pause_frames =
+ axienet_stat(lp, STAT_TX_PAUSE_FRAMES);
+ pause_stats->rx_pause_frames =
+ axienet_stat(lp, STAT_RX_PAUSE_FRAMES);
+ } while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static void
+axienet_ethtool_get_eth_mac_stats(struct net_device *dev,
+ struct ethtool_eth_mac_stats *mac_stats)
+{
+ struct axienet_local *lp = netdev_priv(dev);
+ unsigned int start;
+
+ if (!(lp->features & XAE_FEATURE_STATS))
+ return;
+
+ do {
+ start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+ mac_stats->FramesTransmittedOK =
+ axienet_stat(lp, STAT_TX_GOOD_FRAMES);
+ mac_stats->SingleCollisionFrames =
+ axienet_stat(lp, STAT_TX_SINGLE_COLLISION_FRAMES);
+ mac_stats->MultipleCollisionFrames =
+ axienet_stat(lp, STAT_TX_MULTIPLE_COLLISION_FRAMES);
+ mac_stats->FramesReceivedOK =
+ axienet_stat(lp, STAT_RX_GOOD_FRAMES);
+ mac_stats->FrameCheckSequenceErrors =
+ axienet_stat(lp, STAT_RX_FCS_ERRORS);
+ mac_stats->AlignmentErrors =
+ axienet_stat(lp, STAT_RX_ALIGNMENT_ERRORS);
+ mac_stats->FramesWithDeferredXmissions =
+ axienet_stat(lp, STAT_TX_DEFERRED_FRAMES);
+ mac_stats->LateCollisions =
+ axienet_stat(lp, STAT_TX_LATE_COLLISIONS);
+ mac_stats->FramesAbortedDueToXSColls =
+ axienet_stat(lp, STAT_TX_EXCESS_COLLISIONS);
+ mac_stats->MulticastFramesXmittedOK =
+ axienet_stat(lp, STAT_TX_MULTICAST_FRAMES);
+ mac_stats->BroadcastFramesXmittedOK =
+ axienet_stat(lp, STAT_TX_BROADCAST_FRAMES);
+ mac_stats->FramesWithExcessiveDeferral =
+ axienet_stat(lp, STAT_TX_EXCESS_DEFERRAL);
+ mac_stats->MulticastFramesReceivedOK =
+ axienet_stat(lp, STAT_RX_MULTICAST_FRAMES);
+ mac_stats->BroadcastFramesReceivedOK =
+ axienet_stat(lp, STAT_RX_BROADCAST_FRAMES);
+ mac_stats->InRangeLengthErrors =
+ axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
+ } while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static void
+axienet_ethtool_get_eth_ctrl_stats(struct net_device *dev,
+ struct ethtool_eth_ctrl_stats *ctrl_stats)
+{
+ struct axienet_local *lp = netdev_priv(dev);
+ unsigned int start;
+
+ if (!(lp->features & XAE_FEATURE_STATS))
+ return;
+
+ do {
+ start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+ ctrl_stats->MACControlFramesTransmitted =
+ axienet_stat(lp, STAT_TX_CONTROL_FRAMES);
+ ctrl_stats->MACControlFramesReceived =
+ axienet_stat(lp, STAT_RX_CONTROL_FRAMES);
+ ctrl_stats->UnsupportedOpcodesReceived =
+ axienet_stat(lp, STAT_RX_CONTROL_OPCODE_ERRORS);
+ } while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+}
+
+static const struct ethtool_rmon_hist_range axienet_rmon_ranges[] = {
+ { 64, 64 },
+ { 65, 127 },
+ { 128, 255 },
+ { 256, 511 },
+ { 512, 1023 },
+ { 1024, 1518 },
+ { 1519, 16384 },
+ { },
+};
+
+static void
+axienet_ethtool_get_rmon_stats(struct net_device *dev,
+ struct ethtool_rmon_stats *rmon_stats,
+ const struct ethtool_rmon_hist_range **ranges)
+{
+ struct axienet_local *lp = netdev_priv(dev);
+ unsigned int start;
+
+ if (!(lp->features & XAE_FEATURE_STATS))
+ return;
+
+ do {
+ start = u64_stats_fetch_begin(&lp->hw_stat_sync);
+ rmon_stats->undersize_pkts =
+ axienet_stat(lp, STAT_UNDERSIZE_FRAMES);
+ rmon_stats->oversize_pkts =
+ axienet_stat(lp, STAT_RX_OVERSIZE_FRAMES);
+ rmon_stats->fragments =
+ axienet_stat(lp, STAT_FRAGMENT_FRAMES);
+
+ rmon_stats->hist[0] =
+ axienet_stat(lp, STAT_RX_64_BYTE_FRAMES);
+ rmon_stats->hist[1] =
+ axienet_stat(lp, STAT_RX_65_127_BYTE_FRAMES);
+ rmon_stats->hist[2] =
+ axienet_stat(lp, STAT_RX_128_255_BYTE_FRAMES);
+ rmon_stats->hist[3] =
+ axienet_stat(lp, STAT_RX_256_511_BYTE_FRAMES);
+ rmon_stats->hist[4] =
+ axienet_stat(lp, STAT_RX_512_1023_BYTE_FRAMES);
+ rmon_stats->hist[5] =
+ axienet_stat(lp, STAT_RX_1024_MAX_BYTE_FRAMES);
+ rmon_stats->hist[6] =
+ rmon_stats->oversize_pkts;
+
+ rmon_stats->hist_tx[0] =
+ axienet_stat(lp, STAT_TX_64_BYTE_FRAMES);
+ rmon_stats->hist_tx[1] =
+ axienet_stat(lp, STAT_TX_65_127_BYTE_FRAMES);
+ rmon_stats->hist_tx[2] =
+ axienet_stat(lp, STAT_TX_128_255_BYTE_FRAMES);
+ rmon_stats->hist_tx[3] =
+ axienet_stat(lp, STAT_TX_256_511_BYTE_FRAMES);
+ rmon_stats->hist_tx[4] =
+ axienet_stat(lp, STAT_TX_512_1023_BYTE_FRAMES);
+ rmon_stats->hist_tx[5] =
+ axienet_stat(lp, STAT_TX_1024_MAX_BYTE_FRAMES);
+ rmon_stats->hist_tx[6] =
+ axienet_stat(lp, STAT_TX_OVERSIZE_FRAMES);
+ } while (u64_stats_fetch_retry(&lp->hw_stat_sync, start));
+
+ *ranges = axienet_rmon_ranges;
+}
+
static const struct ethtool_ops axienet_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
- ETHTOOL_COALESCE_USECS,
+ ETHTOOL_COALESCE_USECS |
+ ETHTOOL_COALESCE_STATS_BLOCK_USECS,
.get_drvinfo = axienet_ethtools_get_drvinfo,
.get_regs_len = axienet_ethtools_get_regs_len,
.get_regs = axienet_ethtools_get_regs,
@@ -2003,6 +2252,10 @@ static const struct ethtool_ops axienet_ethtool_ops = {
.get_link_ksettings = axienet_ethtools_get_link_ksettings,
.set_link_ksettings = axienet_ethtools_set_link_ksettings,
.nway_reset = axienet_ethtools_nway_reset,
+ .get_pause_stats = axienet_ethtools_get_pause_stats,
+ .get_eth_mac_stats = axienet_ethtool_get_eth_mac_stats,
+ .get_eth_ctrl_stats = axienet_ethtool_get_eth_ctrl_stats,
+ .get_rmon_stats = axienet_ethtool_get_rmon_stats,
};
static struct axienet_local *pcs_to_axienet_local(struct phylink_pcs *pcs)
@@ -2271,6 +2524,10 @@ static int axienet_probe(struct platform_device *pdev)
u64_stats_init(&lp->rx_stat_sync);
u64_stats_init(&lp->tx_stat_sync);
+ u64_stats_init(&lp->hw_stat_sync);
+
+ mutex_init(&lp->stats_lock);
+ INIT_DEFERRABLE_WORK(&lp->stats_work, axienet_refresh_stats);
lp->axi_clk = devm_clk_get_optional(&pdev->dev, "s_axi_lite_clk");
if (!lp->axi_clk) {
@@ -2312,6 +2569,9 @@ static int axienet_probe(struct platform_device *pdev)
/* Setup checksum offload, but default to off if not specified */
lp->features = 0;
+ if (axienet_ior(lp, XAE_ABILITY_OFFSET) & XAE_ABILITY_STATS)
+ lp->features |= XAE_FEATURE_STATS;
+
ret = of_property_read_u32(pdev->dev.of_node, "xlnx,txcsum", &value);
if (!ret) {
switch (value) {
@@ -2527,6 +2787,7 @@ static int axienet_probe(struct platform_device *pdev)
lp->coalesce_count_tx = XAXIDMA_DFT_TX_THRESHOLD;
lp->coalesce_usec_rx = XAXIDMA_DFT_RX_USEC;
lp->coalesce_usec_tx = XAXIDMA_DFT_TX_USEC;
+ lp->coalesce_usec_stats = USEC_PER_SEC;
ret = axienet_mdio_setup(lp);
if (ret)
--
2.35.1.1320.gc452695387.dirty
On Mon, Jun 10, 2024 at 07:10:20PM -0400, Sean Anderson wrote:
> This error message can be triggered by userspace. Use NL_SET_ERR_MSG so
> the message is returned to the user and to avoid polluting the kernel
> logs.
This has nothing to do with statistics. So it would be better to post
it as a standalone patch. It is the sort of trivial patch that should
get merged quickly.
I would also comment about the change from EFAULT to EBUSY in the
commit message.
Andrew
---
pw-bot: cr
On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> Add support for reading the statistics counters, if they are enabled.
> The counters may be 64-bit, but we can't detect this as there's no
> ability bit for it and the counters are read-only. Therefore, we assume
> the counters are 32-bits.
> +static void axienet_stats_update(struct axienet_local *lp)
> +{
> + enum temac_stat stat;
> +
> + lockdep_assert_held(&lp->stats_lock);
> +
> + u64_stats_update_begin(&lp->hw_stat_sync);
> + for (stat = 0; stat < STAT_COUNT; stat++) {
> + u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
The * 8 here suggests the counters are spaced so that they could be 64
bit wide, even when only 32 bits are used. Does the documentation say
anything about the upper 32 bits when the counters are only 32 bits?
Are they guaranteed to read as zero? I'm just wondering if the code
should be forward looking and read all 64 bits?
> static int __axienet_device_reset(struct axienet_local *lp)
> {
> u32 value;
> int ret;
>
> + /* Save statistics counters in case they will be reset */
> + if (lp->features & XAE_FEATURE_STATS) {
> + mutex_lock(&lp->stats_lock);
> + axienet_stats_update(lp);
> + }
It is a pretty unusual pattern to split a mutex lock/unlock like this
on an if statement. Maybe just unconditionally hold the mutex? This
does not appear to be anyway hot path, so the overhead should not
matter.
Andrew
> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
> +{
> + return u64_stats_read(&lp->hw_stats[stat]);
> +}
> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> stats->tx_packets = u64_stats_read(&lp->tx_packets);
> stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
> } while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
> +
> + if (!(lp->features & XAE_FEATURE_STATS))
> + return;
> +
> + do {
> + start = u64_stats_fetch_begin(&lp->hw_stat_sync);
> + stats->rx_length_errors =
> + axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
I'm i reading this correctly. You are returning the counters from the
last refresh period. What is that? 2.5Gbps would wrapper around a 32
byte counter in 13 seconds. I hope these statistics are not 13 seconds
out of date?
Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
see why you cannot read the hardware counter value and update to the
latest value.
Andrew
On Tue, Jun 11, 2024 at 02:13:40AM +0200, Andrew Lunn wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> > Add support for reading the statistics counters, if they are enabled.
> > The counters may be 64-bit, but we can't detect this as there's no
> > ability bit for it and the counters are read-only. Therefore, we assume
> > the counters are 32-bits.
>
> > +static void axienet_stats_update(struct axienet_local *lp)
> > +{
> > + enum temac_stat stat;
> > +
> > + lockdep_assert_held(&lp->stats_lock);
> > +
> > + u64_stats_update_begin(&lp->hw_stat_sync);
> > + for (stat = 0; stat < STAT_COUNT; stat++) {
> > + u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
>
> The * 8 here suggests the counters are spaced so that they could be 64
> bit wide, even when only 32 bits are used. Does the documentation say
> anything about the upper 32 bits when the counters are only 32 bits?
> Are they guaranteed to read as zero? I'm just wondering if the code
> should be forward looking and read all 64 bits?
Actually, if you read the upper 32 bits and they are not 0, you know
you have 64 bit counters. You can then kill off your period task, it
is not needed because your software counters will wrap around the same
time as the hardware counters.
Andrew
On 6/10/24 19:49, Andrew Lunn wrote:
> On Mon, Jun 10, 2024 at 07:10:20PM -0400, Sean Anderson wrote:
>> This error message can be triggered by userspace. Use NL_SET_ERR_MSG so
>> the message is returned to the user and to avoid polluting the kernel
>> logs.
>
> This has nothing to do with statistics. So it would be better to post
> it as a standalone patch. It is the sort of trivial patch that should
> get merged quickly.
I included it in the series since patch 3 touches lines near it. But
upon reviewing that patch, it seems that these lines are not in that
patch's context. So I will resumbit this separately.
> I would also comment about the change from EFAULT to EBUSY in the
> commit message.
OK.
--Sean
On 6/10/24 20:13, Andrew Lunn wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> Add support for reading the statistics counters, if they are enabled.
>> The counters may be 64-bit, but we can't detect this as there's no
>> ability bit for it and the counters are read-only. Therefore, we assume
>> the counters are 32-bits.
>
>> +static void axienet_stats_update(struct axienet_local *lp)
>> +{
>> + enum temac_stat stat;
>> +
>> + lockdep_assert_held(&lp->stats_lock);
>> +
>> + u64_stats_update_begin(&lp->hw_stat_sync);
>> + for (stat = 0; stat < STAT_COUNT; stat++) {
>> + u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
>
> The * 8 here suggests the counters are spaced so that they could be 64
> bit wide, even when only 32 bits are used.
Correct.
> Does the documentation say anything about the upper 32 bits when the
> counters are only 32 bits? Are they guaranteed to read as zero? I'm
> just wondering if the code should be forward looking and read all 64
> bits?
The registers are documented as being 32-bit, with the upper 32-bits
being registered upon reading the lower 32 bits. The documentation
doesn't say what the upper registers are when the counters are 32 bits.
>> static int __axienet_device_reset(struct axienet_local *lp)
>> {
>> u32 value;
>> int ret;
>>
>> + /* Save statistics counters in case they will be reset */
>> + if (lp->features & XAE_FEATURE_STATS) {
>> + mutex_lock(&lp->stats_lock);
>> + axienet_stats_update(lp);
>> + }
>
> It is a pretty unusual pattern to split a mutex lock/unlock like this
> on an if statement. Maybe just unconditionally hold the mutex? This
> does not appear to be anyway hot path, so the overhead should not
> matter.
OK
--Sean
On 6/10/24 20:26, Andrew Lunn wrote:
>> +static u64 axienet_stat(struct axienet_local *lp, enum temac_stat stat)
>> +{
>> + return u64_stats_read(&lp->hw_stats[stat]);
>> +}
>> @@ -1695,6 +1760,35 @@ axienet_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>> stats->tx_packets = u64_stats_read(&lp->tx_packets);
>> stats->tx_bytes = u64_stats_read(&lp->tx_bytes);
>> } while (u64_stats_fetch_retry(&lp->tx_stat_sync, start));
>> +
>> + if (!(lp->features & XAE_FEATURE_STATS))
>> + return;
>> +
>> + do {
>> + start = u64_stats_fetch_begin(&lp->hw_stat_sync);
>> + stats->rx_length_errors =
>> + axienet_stat(lp, STAT_RX_LENGTH_ERRORS);
>
> I'm i reading this correctly. You are returning the counters from the
> last refresh period. What is that? 2.5Gbps would wrapper around a 32
> byte counter in 13 seconds. I hope these statistics are not 13 seconds
> out of date?
By default we use a 1 Hz refresh period. You can of course configure this
up to 13 seconds, but we refuse to raise it further since we risk missing
a wrap-around. It's configurable by userspace so they can determine how
out-of-date they like their stats (vs how often they want to wake up the
CPU).
> Since axienet_stats_update() also uses the lp->hw_stat_sync, i don't
> see why you cannot read the hardware counter value and update to the
> latest value.
We would need to synchronize against updates to hw_last_counter. Imagine
a scenario like
CPU 1 CPU 2
__axienet_device_reset()
axienet_stats_update()
axienet_stat()
u64_stats_read()
axienet_ior()
/* device reset */
hw_last_counter = 0
stats->foo = ... - hw_last_counter[...]
and now we have a glitch in the counter values, since we effectively are
double-counting the current counter value. Alternatively, we could read
the counter after reset but before hw_last_counter was updated and get a
glitch due to underflow.
--Sean
On 6/10/24 20:29, Andrew Lunn wrote:
> On Tue, Jun 11, 2024 at 02:13:40AM +0200, Andrew Lunn wrote:
>> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> > Add support for reading the statistics counters, if they are enabled.
>> > The counters may be 64-bit, but we can't detect this as there's no
>> > ability bit for it and the counters are read-only. Therefore, we assume
>> > the counters are 32-bits.
>>
>> > +static void axienet_stats_update(struct axienet_local *lp)
>> > +{
>> > + enum temac_stat stat;
>> > +
>> > + lockdep_assert_held(&lp->stats_lock);
>> > +
>> > + u64_stats_update_begin(&lp->hw_stat_sync);
>> > + for (stat = 0; stat < STAT_COUNT; stat++) {
>> > + u32 counter = axienet_ior(lp, XAE_STATS_OFFSET + stat * 8);
>>
>> The * 8 here suggests the counters are spaced so that they could be 64
>> bit wide, even when only 32 bits are used. Does the documentation say
>> anything about the upper 32 bits when the counters are only 32 bits?
>> Are they guaranteed to read as zero? I'm just wondering if the code
>> should be forward looking and read all 64 bits?
>
> Actually, if you read the upper 32 bits and they are not 0, you know
> you have 64 bit counters. You can then kill off your period task, it
> is not needed because your software counters will wrap around the same
> time as the hardware counters.
Yes, but then our stats remain stale forever, because we don't refresh
stats before reading them as detailed in my other response.
--Sean
On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
> Add support for reading the statistics counters, if they are enabled.
> The counters may be 64-bit, but we can't detect this as there's no
> ability bit for it and the counters are read-only. Therefore, we assume
> the counters are 32-bits. To ensure we don't miss an overflow, we need
> to read all counters at regular intervals, configurable with
> stats-block-usecs. This should be often enough to ensure the bytes
> counters don't wrap at 2.5 Gbit/s.
>
> Another complication is that the counters may be reset when the device
> is reset (depending on configuration). To ensure the counters persist
> across link up/down (including suspend/resume), we maintain our own
> 64-bit versions along with the last counter value we saw. Because we
> might wait up to 100 ms for the reset to complete, we use a mutex to
> protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
> u64_stats_sync to protect readers.
>
> We can't use the byte counters for either get_stats64 or
> get_eth_mac_stats. This is because the byte counters include everything
> in the frame (destination address to FCS, inclusive). But
> rtnl_link_stats64 wants bytes excluding the FCS, and
> ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
> length/type).
>
> Signed-off-by: Sean Anderson <[email protected]>
> ---
>
> drivers/net/ethernet/xilinx/xilinx_axienet.h | 81 ++++++
> .../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
> 2 files changed, 345 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
...
> @@ -434,6 +502,11 @@ struct skbuf_dma_descriptor {
> * @tx_packets: TX packet count for statistics
> * @tx_bytes: TX byte count for statistics
> * @tx_stat_sync: Synchronization object for TX stats
> + * @hw_last_counter: Last-seen value of each statistic
> + * @hw_stats: Interface statistics periodically updated from hardware counters
> + * @hw_stats_sync: Synchronization object for @hw_stats
nit: s/hw_stats_sync/hw_stat_sync/
Flagged by kernel-doc -none
> + * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
> + * @stats_work: Work for reading the hardware statistics counters
> * @dma_err_task: Work structure to process Axi DMA errors
> * @tx_irq: Axidma TX IRQ number
> * @rx_irq: Axidma RX IRQ number
> @@ -452,6 +525,7 @@ struct skbuf_dma_descriptor {
> * @coalesce_usec_rx: IRQ coalesce delay for RX
> * @coalesce_count_tx: Store the irq coalesce on TX side.
> * @coalesce_usec_tx: IRQ coalesce delay for TX
> + * @coalesce_usec_stats: Delay between hardware statistics refreshes
> * @use_dmaengine: flag to check dmaengine framework usage.
> * @tx_chan: TX DMA channel.
> * @rx_chan: RX DMA channel.
> @@ -505,6 +579,12 @@ struct axienet_local {
> u64_stats_t tx_bytes;
> struct u64_stats_sync tx_stat_sync;
>
> + u32 hw_last_counter[STAT_COUNT];
> + u64_stats_t hw_stats[STAT_COUNT];
> + struct u64_stats_sync hw_stat_sync;
> + struct mutex stats_lock;
> + struct delayed_work stats_work;
> +
> struct work_struct dma_err_task;
>
> int tx_irq;
...
On 6/14/24 12:30, Simon Horman wrote:
> On Mon, Jun 10, 2024 at 07:10:22PM -0400, Sean Anderson wrote:
>> Add support for reading the statistics counters, if they are enabled.
>> The counters may be 64-bit, but we can't detect this as there's no
>> ability bit for it and the counters are read-only. Therefore, we assume
>> the counters are 32-bits. To ensure we don't miss an overflow, we need
>> to read all counters at regular intervals, configurable with
>> stats-block-usecs. This should be often enough to ensure the bytes
>> counters don't wrap at 2.5 Gbit/s.
>>
>> Another complication is that the counters may be reset when the device
>> is reset (depending on configuration). To ensure the counters persist
>> across link up/down (including suspend/resume), we maintain our own
>> 64-bit versions along with the last counter value we saw. Because we
>> might wait up to 100 ms for the reset to complete, we use a mutex to
>> protect writing hw_stats. We can't sleep in ndo_get_stats64, so we use a
>> u64_stats_sync to protect readers.
>>
>> We can't use the byte counters for either get_stats64 or
>> get_eth_mac_stats. This is because the byte counters include everything
>> in the frame (destination address to FCS, inclusive). But
>> rtnl_link_stats64 wants bytes excluding the FCS, and
>> ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
>> length/type).
>>
>> Signed-off-by: Sean Anderson <[email protected]>
>> ---
>>
>> drivers/net/ethernet/xilinx/xilinx_axienet.h | 81 ++++++
>> .../net/ethernet/xilinx/xilinx_axienet_main.c | 267 +++++++++++++++++-
>> 2 files changed, 345 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
>
> ...
>
>> @@ -434,6 +502,11 @@ struct skbuf_dma_descriptor {
>> * @tx_packets: TX packet count for statistics
>> * @tx_bytes: TX byte count for statistics
>> * @tx_stat_sync: Synchronization object for TX stats
>> + * @hw_last_counter: Last-seen value of each statistic
>> + * @hw_stats: Interface statistics periodically updated from hardware counters
>> + * @hw_stats_sync: Synchronization object for @hw_stats
>
> nit: s/hw_stats_sync/hw_stat_sync/
>
> Flagged by kernel-doc -none
I think I will rename the member instead.
--Sean
>> + * @stats_lock: Lock for writing @hw_stats and @hw_last_counter
>> + * @stats_work: Work for reading the hardware statistics counters
>> * @dma_err_task: Work structure to process Axi DMA errors
>> * @tx_irq: Axidma TX IRQ number
>> * @rx_irq: Axidma RX IRQ number
>> @@ -452,6 +525,7 @@ struct skbuf_dma_descriptor {
>> * @coalesce_usec_rx: IRQ coalesce delay for RX
>> * @coalesce_count_tx: Store the irq coalesce on TX side.
>> * @coalesce_usec_tx: IRQ coalesce delay for TX
>> + * @coalesce_usec_stats: Delay between hardware statistics refreshes
>> * @use_dmaengine: flag to check dmaengine framework usage.
>> * @tx_chan: TX DMA channel.
>> * @rx_chan: RX DMA channel.
>> @@ -505,6 +579,12 @@ struct axienet_local {
>> u64_stats_t tx_bytes;
>> struct u64_stats_sync tx_stat_sync;
>>
>> + u32 hw_last_counter[STAT_COUNT];
>> + u64_stats_t hw_stats[STAT_COUNT];
>> + struct u64_stats_sync hw_stat_sync;
>> + struct mutex stats_lock;
>> + struct delayed_work stats_work;
>> +
>> struct work_struct dma_err_task;
>>
>> int tx_irq;
>
> ...