2017-12-08 12:48:24

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 0/6] staging: fsl-dpaa2/eth: Frame buffer optimizations

This patchset continues the work related to frame buffer layout.

The first two patches fix an issue and simplify some rather convoluted
formulas introduced by a previous commit.

The third adds a new counter for TX skb reallocations due to
insufficient headroom. This helped us notice that most of the frames
originated on the core had skbs with too little headroom, leading to
additional alloc/free operations and a performance penalty for
egress termination traffic.

A closer look at hardware requirements vs recommandations showed
that we don't in fact need to be so strict with our TX buffer layout.
The last three patches in this set remove all restrictions that are
not mandatory from a functional standpoint, lowering the required
TX headroom from 192B to 64B.

This helps avoid realloc's for most TCP frames. On a LS2088A board
with one core @2GHz, running a netperf TCP_SENDFILE test (as client),
we see an increase in throughput from 3.79Gbps to 5.11Gbps. For UDP
traffic no improvement is observed, since the stack usually creates
skbs with minimal headroom (2B) for UDP frames.

Ioana Radulescu (6):
staging: fsl-dpaa2/eth: Fix access to FAS field
staging: fsl-dpaa2/eth: Don't set netdev->needed_headroom
staging: fsl-dpaa2/eth: Add counter for skb reallocs
staging: fsl-dpaa2/eth: Don't enable FAS on Tx
staging: fsl-dpaa2/eth: Compute needed headroom per frame
staging: fsl-dpaa2/eth: Make Tx buffer alignment optional

drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 111 +++++----------------
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 31 +++---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c | 1 +
3 files changed, 46 insertions(+), 97 deletions(-)

--
2.7.4


2017-12-08 12:48:03

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 3/6] staging: fsl-dpaa2/eth: Add counter for skb reallocs

Add a counter for the number of egress frames that need to be
realloc'ed due to insufficient headroom space.

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 1 +
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 1 +
drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c | 1 +
3 files changed, 3 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 04db65c..6de6cdd 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -578,6 +578,7 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
percpu_stats->tx_dropped++;
goto err_alloc_headroom;
}
+ percpu_extras->tx_reallocs++;
dev_kfree_skb(skb);
skb = ns;
}
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 63b09d1..6940a98 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -231,6 +231,7 @@ struct dpaa2_eth_drv_stats {
__u64 tx_conf_bytes;
__u64 tx_sg_frames;
__u64 tx_sg_bytes;
+ __u64 tx_reallocs;
__u64 rx_sg_frames;
__u64 rx_sg_bytes;
/* Enqueues retried due to portal busy */
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c
index ebe8fd6..070a3f2 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-ethtool.c
@@ -62,6 +62,7 @@ static char dpaa2_ethtool_extras[][ETH_GSTRING_LEN] = {
"[drv] tx conf bytes",
"[drv] tx sg frames",
"[drv] tx sg bytes",
+ "[drv] tx realloc frames",
"[drv] rx sg frames",
"[drv] rx sg bytes",
"[drv] enqueue portal busy",
--
2.7.4

2017-12-08 12:48:09

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 1/6] staging: fsl-dpaa2/eth: Fix access to FAS field

Commit 4b2d9fe87950 ("staging: fsl-dpaa2/eth: Extra headroom in RX
buffers") removes the software annotation (SWA) area from the RX
buffer layout, as it's not used by anyone, but fails to update the
macros for accessing hardware annotation (HWA) fields, which is
right after the SWA in the buffer headroom.

This may lead to some frame annotation status fields (e.g. indication
if L3/L4 checksum is valid) to be read incorrectly.

Turn the accessor macros into inline functions and add a bool param
to specify if SWA is present or not.

Fixes: 4b2d9fe87950 ("staging: fsl-dpaa2/eth: Extra headroom in RX buffers")

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 8 ++++----
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 13 +++++++++----
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 0d8ed00..c8a8e3a 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -249,7 +249,7 @@ static void dpaa2_eth_rx(struct dpaa2_eth_priv *priv,
vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE, DMA_FROM_DEVICE);

- fas = dpaa2_get_fas(vaddr);
+ fas = dpaa2_get_fas(vaddr, false);
prefetch(fas);
buf_data = vaddr + dpaa2_fd_get_offset(fd);
prefetch(buf_data);
@@ -385,7 +385,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
* on TX confirmation. We are clearing FAS (Frame Annotation Status)
* field from the hardware annotation area
*/
- fas = dpaa2_get_fas(sgt_buf);
+ fas = dpaa2_get_fas(sgt_buf, true);
memset(fas, 0, DPAA2_FAS_SIZE);

sgt = (struct dpaa2_sg_entry *)(sgt_buf + priv->tx_data_offset);
@@ -458,7 +458,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
* on TX confirmation. We are clearing FAS (Frame Annotation Status)
* field from the hardware annotation area
*/
- fas = dpaa2_get_fas(buffer_start);
+ fas = dpaa2_get_fas(buffer_start, true);
memset(fas, 0, DPAA2_FAS_SIZE);

/* Store a backpointer to the skb at the beginning of the buffer
@@ -510,7 +510,7 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,

fd_addr = dpaa2_fd_get_addr(fd);
skbh = dpaa2_iova_to_virt(priv->iommu_domain, fd_addr);
- fas = dpaa2_get_fas(skbh);
+ fas = dpaa2_get_fas(skbh, true);

if (fd_format == dpaa2_fd_single) {
skb = *skbh;
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 5b3ab9f..3a4e939 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -153,10 +153,15 @@ struct dpaa2_fas {
#define DPAA2_FAS_SIZE (sizeof(struct dpaa2_fas))

/* Accessors for the hardware annotation fields that we use */
-#define dpaa2_get_hwa(buf_addr) \
- ((void *)(buf_addr) + DPAA2_ETH_SWA_SIZE)
-#define dpaa2_get_fas(buf_addr) \
- (struct dpaa2_fas *)(dpaa2_get_hwa(buf_addr) + DPAA2_FAS_OFFSET)
+static inline void *dpaa2_get_hwa(void *buf_addr, bool swa)
+{
+ return buf_addr + (swa ? DPAA2_ETH_SWA_SIZE : 0);
+}
+
+static inline struct dpaa2_fas *dpaa2_get_fas(void *buf_addr, bool swa)
+{
+ return dpaa2_get_hwa(buf_addr, swa) + DPAA2_FAS_OFFSET;
+}

/* Error and status bits in the frame annotation status word */
/* Debug frame, otherwise supposed to be discarded */
--
2.7.4

2017-12-08 12:48:29

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 6/6] staging: fsl-dpaa2/eth: Make Tx buffer alignment optional

Aligning the Tx buffers at 64B is a performance optimization
recommendation, not a hard requirement.

Make optional the alignment of Tx FD buffers, without enforcing
a reallocation in case there is not enough headroom for it.

On Rx, we keep allocating buffers with enough headroom to allow
Tx alignment of forwarded frames.

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 14 ++++++++++----
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 2 +-
2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 8edb7c1..7f3e4fa 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -433,13 +433,19 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
struct dpaa2_fd *fd)
{
struct device *dev = priv->net_dev->dev.parent;
- u8 *buffer_start;
+ u8 *buffer_start, *aligned_start;
struct sk_buff **skbh;
dma_addr_t addr;

- buffer_start = PTR_ALIGN(skb->data -
- dpaa2_eth_needed_headroom(priv, skb),
- DPAA2_ETH_TX_BUF_ALIGN);
+ buffer_start = skb->data - dpaa2_eth_needed_headroom(priv, skb);
+
+ /* If there's enough room to align the FD address, do it.
+ * It will help hardware optimize accesses.
+ */
+ aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN,
+ DPAA2_ETH_TX_BUF_ALIGN);
+ if (aligned_start >= skb->head)
+ buffer_start = aligned_start;

/* Store a backpointer to the skb at the beginning of the buffer
* (in the private data area) such that we can release it
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 4ea41d8..d68ac38 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -370,7 +370,7 @@ unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv,
if (skb_is_nonlinear(skb))
return 0;

- return DPAA2_ETH_SWA_SIZE + DPAA2_ETH_TX_BUF_ALIGN;
+ return DPAA2_ETH_SWA_SIZE;
}

/* Extra headroom space requested to hardware, in order to make sure there's
--
2.7.4

2017-12-08 12:48:53

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 5/6] staging: fsl-dpaa2/eth: Compute needed headroom per frame

For non-linear skbs we build scatter-gather frames and allocate
a new buffer for the S/G table in which we reserve the required
headroom, so the actual skb headroom size doesn't matter.

Rather than use a one-size-fits-all approach, decide when to
enforce headroom requirements on a frame by frame basis.

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 9 ++++++---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 11 ++++++++---
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 2c12859..8edb7c1 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -437,7 +437,8 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
struct sk_buff **skbh;
dma_addr_t addr;

- buffer_start = PTR_ALIGN(skb->data - dpaa2_eth_needed_headroom(priv),
+ buffer_start = PTR_ALIGN(skb->data -
+ dpaa2_eth_needed_headroom(priv, skb),
DPAA2_ETH_TX_BUF_ALIGN);

/* Store a backpointer to the skb at the beginning of the buffer
@@ -532,15 +533,17 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
struct dpaa2_eth_drv_stats *percpu_extras;
struct dpaa2_eth_fq *fq;
u16 queue_mapping;
+ unsigned int needed_headroom;
int err, i;

percpu_stats = this_cpu_ptr(priv->percpu_stats);
percpu_extras = this_cpu_ptr(priv->percpu_extras);

- if (skb_headroom(skb) < dpaa2_eth_needed_headroom(priv)) {
+ needed_headroom = dpaa2_eth_needed_headroom(priv, skb);
+ if (skb_headroom(skb) < needed_headroom) {
struct sk_buff *ns;

- ns = skb_realloc_headroom(skb, dpaa2_eth_needed_headroom(priv));
+ ns = skb_realloc_headroom(skb, needed_headroom);
if (unlikely(!ns)) {
percpu_stats->tx_dropped++;
goto err_alloc_headroom;
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 546a862..4ea41d8 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -364,9 +364,13 @@ static inline unsigned int dpaa2_eth_buf_raw_size(struct dpaa2_eth_priv *priv)
}

static inline
-unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv)
+unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv,
+ struct sk_buff *skb)
{
- return priv->tx_data_offset + DPAA2_ETH_TX_BUF_ALIGN;
+ if (skb_is_nonlinear(skb))
+ return 0;
+
+ return DPAA2_ETH_SWA_SIZE + DPAA2_ETH_TX_BUF_ALIGN;
}

/* Extra headroom space requested to hardware, in order to make sure there's
@@ -374,7 +378,8 @@ unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv)
*/
static inline unsigned int dpaa2_eth_rx_head_room(struct dpaa2_eth_priv *priv)
{
- return dpaa2_eth_needed_headroom(priv) - DPAA2_ETH_RX_HWA_SIZE;
+ return priv->tx_data_offset + DPAA2_ETH_TX_BUF_ALIGN -
+ DPAA2_ETH_RX_HWA_SIZE;
}

static int dpaa2_eth_queue_count(struct dpaa2_eth_priv *priv)
--
2.7.4

2017-12-08 12:49:07

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 4/6] staging: fsl-dpaa2/eth: Don't enable FAS on Tx

For Tx confirmed frames that have an error indication in the frame
descriptor, we look at the Frame Annotation Status field (in the
buffer headroom) for details on the root cause and then print
a debug message with that information.

While useful in initial development stages, it doesn't bring
enough added value to justify reserving 64B of headroom for all
Tx frames (FAS is only 8B long, but we must reserve chunks of 64B
from the hardware annotation area).

If we remove the need for FAS field from egress frames, we can
renounce hardware annotation completely, since FAS is the only
HWA field we currently use.

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 73 +++++---------------------
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 6 ---
2 files changed, 13 insertions(+), 66 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index 6de6cdd..2c12859 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -348,7 +348,6 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
int num_sg;
int num_dma_bufs;
struct dpaa2_eth_swa *swa;
- struct dpaa2_fas *fas;

/* Create and map scatterlist.
* We don't advertise NETIF_F_FRAGLIST, so skb_to_sgvec() will not have
@@ -379,15 +378,6 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
goto sgt_buf_alloc_failed;
}
sgt_buf = PTR_ALIGN(sgt_buf, DPAA2_ETH_TX_BUF_ALIGN);
-
- /* PTA from egress side is passed as is to the confirmation side so
- * we need to clear some fields here in order to find consistent values
- * on TX confirmation. We are clearing FAS (Frame Annotation Status)
- * field from the hardware annotation area
- */
- fas = dpaa2_get_fas(sgt_buf, true);
- memset(fas, 0, DPAA2_FAS_SIZE);
-
sgt = (struct dpaa2_sg_entry *)(sgt_buf + priv->tx_data_offset);

/* Fill in the HW SGT structure.
@@ -424,8 +414,7 @@ static int build_sg_fd(struct dpaa2_eth_priv *priv,
dpaa2_fd_set_format(fd, dpaa2_fd_sg);
dpaa2_fd_set_addr(fd, addr);
dpaa2_fd_set_len(fd, skb->len);
- dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | DPAA2_FD_CTRL_PTA |
- DPAA2_FD_CTRL_PTV1);
+ dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_PTA | DPAA2_FD_CTRL_PTV1);

return 0;

@@ -445,21 +434,12 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
{
struct device *dev = priv->net_dev->dev.parent;
u8 *buffer_start;
- struct dpaa2_fas *fas;
struct sk_buff **skbh;
dma_addr_t addr;

buffer_start = PTR_ALIGN(skb->data - dpaa2_eth_needed_headroom(priv),
DPAA2_ETH_TX_BUF_ALIGN);

- /* PTA from egress side is passed as is to the confirmation side so
- * we need to clear some fields here in order to find consistent values
- * on TX confirmation. We are clearing FAS (Frame Annotation Status)
- * field from the hardware annotation area
- */
- fas = dpaa2_get_fas(buffer_start, true);
- memset(fas, 0, DPAA2_FAS_SIZE);
-
/* Store a backpointer to the skb at the beginning of the buffer
* (in the private data area) such that we can release it
* on Tx confirm
@@ -477,8 +457,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
dpaa2_fd_set_offset(fd, (u16)(skb->data - buffer_start));
dpaa2_fd_set_len(fd, skb->len);
dpaa2_fd_set_format(fd, dpaa2_fd_single);
- dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_ASAL | DPAA2_FD_CTRL_PTA |
- DPAA2_FD_CTRL_PTV1);
+ dpaa2_fd_set_ctrl(fd, DPAA2_FD_CTRL_PTA | DPAA2_FD_CTRL_PTV1);

return 0;
}
@@ -493,8 +472,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
* to be checked if we're on the confirmation path.
*/
static void free_tx_fd(const struct dpaa2_eth_priv *priv,
- const struct dpaa2_fd *fd,
- u32 *status)
+ const struct dpaa2_fd *fd)
{
struct device *dev = priv->net_dev->dev.parent;
dma_addr_t fd_addr;
@@ -505,11 +483,9 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,
int num_sg, num_dma_bufs;
struct dpaa2_eth_swa *swa;
u8 fd_format = dpaa2_fd_get_format(fd);
- struct dpaa2_fas *fas;

fd_addr = dpaa2_fd_get_addr(fd);
skbh = dpaa2_iova_to_virt(priv->iommu_domain, fd_addr);
- fas = dpaa2_get_fas(skbh, true);

if (fd_format == dpaa2_fd_single) {
skb = *skbh;
@@ -536,19 +512,10 @@ static void free_tx_fd(const struct dpaa2_eth_priv *priv,
sizeof(struct dpaa2_sg_entry) * (1 + num_dma_bufs);
dma_unmap_single(dev, fd_addr, unmap_size, DMA_BIDIRECTIONAL);
} else {
- /* Unsupported format, mark it as errored and give up */
- if (status)
- *status = ~0;
+ netdev_dbg(priv->net_dev, "Invalid FD format\n");
return;
}

- /* Read the status from the Frame Annotation after we unmap the first
- * buffer but before we free it. The caller function is responsible
- * for checking the status value.
- */
- if (status)
- *status = le32_to_cpu(fas->status);
-
/* Free SGT buffer kmalloc'ed on tx */
if (fd_format != dpaa2_fd_single)
kfree(skbh);
@@ -627,7 +594,7 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
if (unlikely(err < 0)) {
percpu_stats->tx_errors++;
/* Clean up everything, including freeing the skb */
- free_tx_fd(priv, &fd, NULL);
+ free_tx_fd(priv, &fd);
} else {
percpu_stats->tx_packets++;
percpu_stats->tx_bytes += dpaa2_fd_get_len(&fd);
@@ -650,9 +617,7 @@ static void dpaa2_eth_tx_conf(struct dpaa2_eth_priv *priv,
{
struct rtnl_link_stats64 *percpu_stats;
struct dpaa2_eth_drv_stats *percpu_extras;
- u32 status = 0;
u32 fd_errors;
- bool has_fas_errors = false;

/* Tracing point */
trace_dpaa2_tx_conf_fd(priv->net_dev, fd);
@@ -663,29 +628,18 @@ static void dpaa2_eth_tx_conf(struct dpaa2_eth_priv *priv,

/* Check frame errors in the FD field */
fd_errors = dpaa2_fd_get_ctrl(fd) & DPAA2_FD_TX_ERR_MASK;
- if (unlikely(fd_errors)) {
- /* We only check error bits in the FAS field if corresponding
- * FAERR bit is set in FD and the FAS field is marked as valid
- */
- has_fas_errors = (fd_errors & DPAA2_FD_CTRL_FAERR) &&
- !!(dpaa2_fd_get_frc(fd) & DPAA2_FD_FRC_FASV);
- if (net_ratelimit())
- netdev_dbg(priv->net_dev, "TX frame FD error: 0x%08x\n",
- fd_errors);
- }
-
- free_tx_fd(priv, fd, has_fas_errors ? &status : NULL);
+ free_tx_fd(priv, fd);

if (likely(!fd_errors))
return;

+ if (net_ratelimit())
+ netdev_dbg(priv->net_dev, "TX frame FD error: 0x%08x\n",
+ fd_errors);
+
percpu_stats = this_cpu_ptr(priv->percpu_stats);
/* Tx-conf logically pertains to the egress path. */
percpu_stats->tx_errors++;
-
- if (has_fas_errors && net_ratelimit())
- netdev_dbg(priv->net_dev, "TX frame FAS error: 0x%08x\n",
- status & DPAA2_FAS_TX_ERR_MASK);
}

static int set_rx_csum(struct dpaa2_eth_priv *priv, bool enable)
@@ -1791,10 +1745,8 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
priv->rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN;

/* tx buffer */
- buf_layout.pass_frame_status = true;
buf_layout.private_data_size = DPAA2_ETH_SWA_SIZE;
- buf_layout.options = DPNI_BUF_LAYOUT_OPT_FRAME_STATUS |
- DPNI_BUF_LAYOUT_OPT_PRIVATE_DATA_SIZE;
+ buf_layout.options = DPNI_BUF_LAYOUT_OPT_PRIVATE_DATA_SIZE;
err = dpni_set_buffer_layout(priv->mc_io, 0, priv->mc_token,
DPNI_QUEUE_TX, &buf_layout);
if (err) {
@@ -1803,7 +1755,7 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
}

/* tx-confirm buffer */
- buf_layout.options = DPNI_BUF_LAYOUT_OPT_FRAME_STATUS;
+ buf_layout.options = 0;
err = dpni_set_buffer_layout(priv->mc_io, 0, priv->mc_token,
DPNI_QUEUE_TX_CONFIRM, &buf_layout);
if (err) {
@@ -1826,6 +1778,7 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
priv->tx_data_offset);

/* rx buffer */
+ buf_layout.pass_frame_status = true;
buf_layout.pass_parser_result = true;
buf_layout.data_align = priv->rx_buf_align;
buf_layout.data_head_room = dpaa2_eth_rx_head_room(priv);
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 6940a98..546a862 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -134,7 +134,6 @@ struct dpaa2_eth_swa {
DPAA2_FD_CTRL_FAERR)

/* Annotation bits in FD CTRL */
-#define DPAA2_FD_CTRL_ASAL 0x00010000 /* ASAL = 64 */
#define DPAA2_FD_CTRL_PTA 0x00800000
#define DPAA2_FD_CTRL_PTV1 0x00400000

@@ -208,11 +207,6 @@ static inline struct dpaa2_fas *dpaa2_get_fas(void *buf_addr, bool swa)
DPAA2_FAS_BLE | \
DPAA2_FAS_L3CE | \
DPAA2_FAS_L4CE)
-/* Tx errors */
-#define DPAA2_FAS_TX_ERR_MASK (DPAA2_FAS_KSE | \
- DPAA2_FAS_EOFHE | \
- DPAA2_FAS_MNLE | \
- DPAA2_FAS_TIDE)

/* Time in milliseconds between link state updates */
#define DPAA2_ETH_LINK_STATE_REFRESH 1000
--
2.7.4

2017-12-08 12:49:12

by Ioana Ciocoi Radulescu

[permalink] [raw]
Subject: [PATCH 2/6] staging: fsl-dpaa2/eth: Don't set netdev->needed_headroom

Commit 4b2d9fe87950 ("staging: fsl-dpaa2/eth: Extra headroom in RX
buffers") tried to avoid the performance penalty of doing skb
reallocations in the network stack for IP forwarded frames between
two DPAA2 Ethernet interfaces. This led to a (too) complicated
formula that relies on the stack's internal implementation.

Instead, it's safer and easier to just not request any guarantee
from the stack. We already double check in the driver the required
headroom size of egress frames and realloc the skb if needed, so
we don't need to add any extra code.

On forwarding between two of our own interfaces, there is no
functional change; for traffic forwarded from a different device or
generated on the core, skb realloc operations are moved from the stack
to our driver, with no visible impact on performance.

Signed-off-by: Ioana Radulescu <[email protected]>
---
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 24 ++----------------------
drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h | 2 +-
2 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
index c8a8e3a..04db65c 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c
@@ -449,8 +449,7 @@ static int build_single_fd(struct dpaa2_eth_priv *priv,
struct sk_buff **skbh;
dma_addr_t addr;

- buffer_start = PTR_ALIGN(skb->data - priv->tx_data_offset -
- DPAA2_ETH_TX_BUF_ALIGN,
+ buffer_start = PTR_ALIGN(skb->data - dpaa2_eth_needed_headroom(priv),
DPAA2_ETH_TX_BUF_ALIGN);

/* PTA from egress side is passed as is to the confirmation side so
@@ -571,7 +570,7 @@ static netdev_tx_t dpaa2_eth_tx(struct sk_buff *skb, struct net_device *net_dev)
percpu_stats = this_cpu_ptr(priv->percpu_stats);
percpu_extras = this_cpu_ptr(priv->percpu_extras);

- if (unlikely(skb_headroom(skb) < dpaa2_eth_needed_headroom(priv))) {
+ if (skb_headroom(skb) < dpaa2_eth_needed_headroom(priv)) {
struct sk_buff *ns;

ns = skb_realloc_headroom(skb, dpaa2_eth_needed_headroom(priv));
@@ -2273,7 +2272,6 @@ static int netdev_init(struct net_device *net_dev)
{
struct device *dev = net_dev->dev.parent;
struct dpaa2_eth_priv *priv = netdev_priv(net_dev);
- u16 rx_headroom, req_headroom;
u8 bcast_addr[ETH_ALEN];
u8 num_queues;
int err;
@@ -2292,24 +2290,6 @@ static int netdev_init(struct net_device *net_dev)
return err;
}

- /* Reserve enough space to align buffer as per hardware requirement;
- * NOTE: priv->tx_data_offset MUST be initialized at this point.
- */
- net_dev->needed_headroom = dpaa2_eth_needed_headroom(priv);
-
- /* If headroom guaranteed by hardware in the Rx frame buffer is
- * smaller than the Tx headroom required by the stack, issue a
- * one time warning. This will most likely mean skbs forwarded to
- * another DPAA2 network interface will get reallocated, with a
- * significant performance impact.
- */
- req_headroom = LL_RESERVED_SPACE(net_dev) - ETH_HLEN;
- rx_headroom = ALIGN(DPAA2_ETH_RX_HWA_SIZE +
- dpaa2_eth_rx_head_room(priv), priv->rx_buf_align);
- if (req_headroom > rx_headroom)
- dev_info_once(dev, "Required headroom (%d) greater than available (%d)\n",
- req_headroom, rx_headroom);
-
/* Set MTU limits */
net_dev->min_mtu = 68;
net_dev->max_mtu = DPAA2_ETH_MAX_MTU;
diff --git a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
index 3a4e939..63b09d1 100644
--- a/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
+++ b/drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h
@@ -371,7 +371,7 @@ static inline unsigned int dpaa2_eth_buf_raw_size(struct dpaa2_eth_priv *priv)
static inline
unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv)
{
- return priv->tx_data_offset + DPAA2_ETH_TX_BUF_ALIGN - HH_DATA_MOD;
+ return priv->tx_data_offset + DPAA2_ETH_TX_BUF_ALIGN;
}

/* Extra headroom space requested to hardware, in order to make sure there's
--
2.7.4