If there were not enough hardware resources to allocate one TX queue per
CPU for XDP, the XDP_TX and XDP_REDIRECT actions were unavailable, and using
them resulted each time in the packet being dropped and this message in
the logs: XDP TX failed (-22)
These patches implement two fallback solutions for two different situations
that might happen:
1. There are not enough free resources for all the TX queues, but there
are some free resources available
2. There are not enough free resources at all for TX queues.
Both solutions are based on sharing TX queues, using __netif_tx_lock for
synchronization. In the second case, as there are no XDP TX queues to
share, network stack queues are used instead, but since we're taking
__netif_tx_lock, concurrent access to the queues is correctly protected.
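The synchronization pattern, in simplified form, looks like this (just a
sketch to illustrate the idea; the real code is in the patches below):

	/* Sketch: take the TX lock only when the queue may be shared
	 * between CPUs, so the one-queue-per-CPU case keeps its
	 * lock-free fast path.
	 */
	bool shared = efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED;

	if (shared)
		HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu);

	/* ... write descriptors and push buffers to the NIC ... */

	if (shared)
		HARD_TX_UNLOCK(efx->net_dev, tx_queue->core_txq);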
The solution for this second case might affect the performance of both XDP
traffic and normal traffic due to lock contention if both are used
intensively. That's why I call it a "last resort" fallback: it's not a
desirable situation, but at least we have XDP TX working.
Some tests have shown good results and indicate that the non-fallback
case is not harmed by these changes. They are also promising for
the fallback cases. This is the test:
1. From another machine, send a high volume of packets with pktgen, script
samples/pktgen/pktgen_sample04_many_flows.sh
2. On the tested machine, run samples/bpf/xdp_rxq_info with arguments
"-a XDP_TX --swapmac" and see the results
3. On the tested machine, also run pktgen_sample04 to create heavy normal
TX traffic, and see how the xdp_rxq_info results vary
Note that this test doesn't check the worst situations for the fallback
solutions, because XDP_TX is only executed from the CPUs that process
sfc's RX queues, and not from every CPU in the system, so the
performance drop due to worst-case lock contention doesn't happen.
I'd like to test that as well, but I don't have access right now to a
proper environment.
Test results:
Without doing TX:
Before changes: ~2,900,000 pps
After changes, 1 queue/core: ~2,900,000 pps
After changes, 2 queues/core: ~2,900,000 pps
After changes, 8 queues/core: ~2,900,000 pps
After changes, borrowing from network stack: ~2,900,000 pps
With multiflow TX at the same time:
Before changes: ~1,700,000 - 2,900,000 pps
After changes, 1 queue/core: ~1,700,000 - 2,900,000 pps
After changes, 2 queues/core: ~1,700,000 pps
After changes, 8 queues/core: ~1,700,000 pps
After changes, borrowing from network stack: 1,150,000 pps
Sporadic "XDP TX failed (-5)" warnings are shown when running xdp program
and pktgen simultaneously. This was expected because XDP doesn't have any
buffering system if the NIC is under very high pressure. Thousands of
these warnings are shown in the case of borrowing net stack queues. As I
said before, this was also expected.
Íñigo Huguet (2):
sfc: fallback for lack of xdp tx queues
sfc: last resort fallback for lack of xdp tx queues
drivers/net/ethernet/sfc/efx_channels.c | 98 ++++++++++++++++++-------
drivers/net/ethernet/sfc/net_driver.h | 8 ++
drivers/net/ethernet/sfc/tx.c | 29 ++++++--
3 files changed, 99 insertions(+), 36 deletions(-)
--
2.31.1
If there were not enough resources to allocate one TX queue per core for
XDP TX, it was completely disabled.
This patch implements a fallback solution for sharing the available
queues, using __netif_tx_lock for synchronization. In the normal case,
where there is one TX queue per CPU, no locking is done, as before.
With this fallback solution, XDP TX will work in many more cases that
previously failed, especially on machines with many CPUs. It's hard for
XDP users to know which features are supported across different NICs and
configurations, so they will benefit from having wider support.
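To illustrate the resulting CPU-to-queue mapping, this small standalone
program mirrors the padding loop added to efx_set_channels() (illustrative
only, not driver code; the CPU and queue counts are made-up values):

	#include <stdio.h>

	#define NUM_CPUS		8	/* assumed number of CPUs */
	#define NUM_REAL_QUEUES		3	/* assumed XDP TX queues allocated */

	int main(void)
	{
		/* one slot per CPU; the first NUM_REAL_QUEUES entries are
		 * distinct queues, the rest are filled by reusing already
		 * assigned entries, as in the driver's while loop
		 */
		int queue_of_cpu[NUM_CPUS];
		int next = 0, i;

		for (i = 0; i < NUM_REAL_QUEUES; i++)
			queue_of_cpu[i] = i;
		for (i = NUM_REAL_QUEUES; i < NUM_CPUS; i++)
			queue_of_cpu[i] = queue_of_cpu[next++];

		for (i = 0; i < NUM_CPUS; i++)
			printf("cpu %d -> XDP TXQ %d\n", i, queue_of_cpu[i]);
		return 0;
	}

The output shows the queues being reused round-robin: CPUs 0-2 get queues
0-2, CPUs 3-5 reuse them in the same order, and so on.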
Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/efx_channels.c | 54 ++++++++++++++++++++-----
drivers/net/ethernet/sfc/net_driver.h | 8 ++++
drivers/net/ethernet/sfc/tx.c | 21 ++++++----
3 files changed, 64 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
index e5b0d795c301..e7849f1260a1 100644
--- a/drivers/net/ethernet/sfc/efx_channels.c
+++ b/drivers/net/ethernet/sfc/efx_channels.c
@@ -166,32 +166,46 @@ static int efx_allocate_msix_channels(struct efx_nic *efx,
* We need a channel per event queue, plus a VI per tx queue.
* This may be more pessimistic than it needs to be.
*/
- if (n_channels + n_xdp_ev > max_channels) {
+ if (n_channels >= max_channels) {
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DISABLED;
netif_err(efx, drv, efx->net_dev,
"Insufficient resources for %d XDP event queues (%d other channels, max %d)\n",
n_xdp_ev, n_channels, max_channels);
netif_err(efx, drv, efx->net_dev,
- "XDP_TX and XDP_REDIRECT will not work on this interface");
- efx->n_xdp_channels = 0;
- efx->xdp_tx_per_channel = 0;
- efx->xdp_tx_queue_count = 0;
+ "XDP_TX and XDP_REDIRECT will not work on this interface\n");
} else if (n_channels + n_xdp_tx > efx->max_vis) {
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DISABLED;
netif_err(efx, drv, efx->net_dev,
"Insufficient resources for %d XDP TX queues (%d other channels, max VIs %d)\n",
n_xdp_tx, n_channels, efx->max_vis);
netif_err(efx, drv, efx->net_dev,
- "XDP_TX and XDP_REDIRECT will not work on this interface");
- efx->n_xdp_channels = 0;
- efx->xdp_tx_per_channel = 0;
- efx->xdp_tx_queue_count = 0;
+ "XDP_TX and XDP_REDIRECT will not work on this interface\n");
+ } else if (n_channels + n_xdp_ev > max_channels) {
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_SHARED;
+ netif_warn(efx, drv, efx->net_dev,
+ "Insufficient resources for %d XDP event queues (%d other channels, max %d)\n",
+ n_xdp_ev, n_channels, max_channels);
+
+ n_xdp_ev = max_channels - n_channels;
+ netif_warn(efx, drv, efx->net_dev,
+ "XDP_TX and XDP_REDIRECT will work with reduced performance (%d cpus/tx_queue)\n",
+ DIV_ROUND_UP(n_xdp_tx, tx_per_ev * n_xdp_ev));
} else {
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DEDICATED;
+ }
+
+ if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DISABLED) {
efx->n_xdp_channels = n_xdp_ev;
efx->xdp_tx_per_channel = tx_per_ev;
efx->xdp_tx_queue_count = n_xdp_tx;
n_channels += n_xdp_ev;
netif_dbg(efx, drv, efx->net_dev,
"Allocating %d TX and %d event queues for XDP\n",
- n_xdp_tx, n_xdp_ev);
+ n_xdp_ev * tx_per_ev, n_xdp_ev);
+ } else {
+ efx->n_xdp_channels = 0;
+ efx->xdp_tx_per_channel = 0;
+ efx->xdp_tx_queue_count = 0;
}
if (vec_count < n_channels) {
@@ -921,7 +935,25 @@ int efx_set_channels(struct efx_nic *efx)
}
}
}
- WARN_ON(xdp_queue_number != efx->xdp_tx_queue_count);
+ WARN_ON(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DEDICATED &&
+ xdp_queue_number != efx->xdp_tx_queue_count);
+ WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED &&
+ xdp_queue_number > efx->xdp_tx_queue_count);
+
+ /* If we have less XDP TX queues than CPUs, assign the already existing
+ * queues to the exceeding CPUs (this means that we will have to use
+ * locking when transmitting with XDP)
+ */
+ next_queue = 0;
+ while (xdp_queue_number < efx->xdp_tx_queue_count) {
+ tx_queue = efx->xdp_tx_queues[next_queue++];
+ channel = tx_queue->channel;
+ netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
+ channel->channel, tx_queue->label,
+ xdp_queue_number, tx_queue->queue);
+
+ efx->xdp_tx_queues[xdp_queue_number++] = tx_queue;
+ }
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
if (rc)
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 9b4b25704271..e731c766f709 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -782,6 +782,12 @@ struct efx_async_filter_insertion {
#define EFX_RPS_MAX_IN_FLIGHT 8
#endif /* CONFIG_RFS_ACCEL */
+enum efx_xdp_tx_queues_mode {
+ EFX_XDP_TX_QUEUES_DEDICATED, /* one queue per core, locking not needed */
+ EFX_XDP_TX_QUEUES_SHARED, /* each queue used by more than 1 core */
+ EFX_XDP_TX_QUEUES_DISABLED /* xdp tx not available */
+};
+
/**
* struct efx_nic - an Efx NIC
* @name: Device name (net device name or bus id before net device registered)
@@ -820,6 +826,7 @@ struct efx_async_filter_insertion {
* should be allocated for this NIC
* @xdp_tx_queue_count: Number of entries in %xdp_tx_queues.
* @xdp_tx_queues: Array of pointers to tx queues used for XDP transmit.
+ * @xdp_txq_queues_mode: XDP TX queues sharing strategy.
* @rxq_entries: Size of receive queues requested by user.
* @txq_entries: Size of transmit queues requested by user.
* @txq_stop_thresh: TX queue fill level at or above which we stop it.
@@ -979,6 +986,7 @@ struct efx_nic {
unsigned int xdp_tx_queue_count;
struct efx_tx_queue **xdp_tx_queues;
+ enum efx_xdp_tx_queues_mode xdp_txq_queues_mode;
unsigned rxq_entries;
unsigned txq_entries;
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 0c6650d2e239..c7afd6cda902 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -430,21 +430,23 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n, struct xdp_frame **xdpfs,
int cpu;
int i;
- cpu = raw_smp_processor_id();
+ if (unlikely(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DISABLED))
+ return -EINVAL;
+ if (unlikely(n && !xdpfs))
+ return -EINVAL;
+ if (unlikely(!n))
+ return 0;
- if (!efx->xdp_tx_queue_count ||
- unlikely(cpu >= efx->xdp_tx_queue_count))
+ cpu = raw_smp_processor_id();
+ if (unlikely(cpu >= efx->xdp_tx_queue_count))
return -EINVAL;
tx_queue = efx->xdp_tx_queues[cpu];
if (unlikely(!tx_queue))
return -EINVAL;
- if (unlikely(n && !xdpfs))
- return -EINVAL;
-
- if (!n)
- return 0;
+ if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
+ HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu);
/* Check for available space. We should never need multiple
* descriptors per frame.
@@ -484,6 +486,9 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n, struct xdp_frame **xdpfs,
if (flush && i > 0)
efx_nic_push_buffers(tx_queue);
+ if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
+ HARD_TX_UNLOCK(efx->net_dev, tx_queue->core_txq);
+
return i == 0 ? -EIO : i;
}
--
2.31.1
The previous patch addressed the situation of having some free resources
for XDP TX, but not enough for one TX queue per CPU. This patch addresses
the worst case of having no resources at all for XDP TX.
Instead of using queues dedicated to XDP, the normal queues used by the
network stack are shared for both kinds of traffic, using __netif_tx_lock
for synchronization. Queue stop/restart must also be considered in the
XDP path to avoid freezing the queue.
This is not the ideal situation we might want to be in, and a performance
penalty is expected for both normal and XDP traffic, but at least XDP
will work in all possible situations (with a warning in the logs),
easing the pain of not knowing in which situations it can be used and in
which it cannot.
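In simplified form, the check added to efx_xdp_tx_buffers() (see the diff
below) is:

	/* Simplified from the diff below: when borrowing a net stack
	 * queue, respect and drive its stop/restart state
	 */
	if (efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_BORROWED) {
		if (netif_tx_queue_stopped(tx_queue->core_txq))
			goto unlock;	/* the stack stopped it: don't send */
		efx_tx_maybe_stop_queue(tx_queue); /* stop it if nearly full */
	}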
Signed-off-by: Íñigo Huguet <[email protected]>
---
drivers/net/ethernet/sfc/efx_channels.c | 82 ++++++++++++++-----------
drivers/net/ethernet/sfc/net_driver.h | 2 +-
drivers/net/ethernet/sfc/tx.c | 14 ++++-
3 files changed, 58 insertions(+), 40 deletions(-)
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
index e7849f1260a1..3dbea028b325 100644
--- a/drivers/net/ethernet/sfc/efx_channels.c
+++ b/drivers/net/ethernet/sfc/efx_channels.c
@@ -167,19 +167,19 @@ static int efx_allocate_msix_channels(struct efx_nic *efx,
* This may be more pessimistic than it needs to be.
*/
if (n_channels >= max_channels) {
- efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DISABLED;
- netif_err(efx, drv, efx->net_dev,
- "Insufficient resources for %d XDP event queues (%d other channels, max %d)\n",
- n_xdp_ev, n_channels, max_channels);
- netif_err(efx, drv, efx->net_dev,
- "XDP_TX and XDP_REDIRECT will not work on this interface\n");
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_BORROWED;
+ netif_warn(efx, drv, efx->net_dev,
+ "Insufficient resources for %d XDP event queues (%d other channels, max %d)\n",
+ n_xdp_ev, n_channels, max_channels);
+ netif_warn(efx, drv, efx->net_dev,
+ "XDP_TX and XDP_REDIRECT might decrease device's performance\n");
} else if (n_channels + n_xdp_tx > efx->max_vis) {
- efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DISABLED;
- netif_err(efx, drv, efx->net_dev,
- "Insufficient resources for %d XDP TX queues (%d other channels, max VIs %d)\n",
- n_xdp_tx, n_channels, efx->max_vis);
- netif_err(efx, drv, efx->net_dev,
- "XDP_TX and XDP_REDIRECT will not work on this interface\n");
+ efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_BORROWED;
+ netif_warn(efx, drv, efx->net_dev,
+ "Insufficient resources for %d XDP TX queues (%d other channels, max VIs %d)\n",
+ n_xdp_tx, n_channels, efx->max_vis);
+ netif_warn(efx, drv, efx->net_dev,
+ "XDP_TX and XDP_REDIRECT might decrease device's performance\n");
} else if (n_channels + n_xdp_ev > max_channels) {
efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_SHARED;
netif_warn(efx, drv, efx->net_dev,
@@ -194,7 +194,7 @@ static int efx_allocate_msix_channels(struct efx_nic *efx,
efx->xdp_txq_queues_mode = EFX_XDP_TX_QUEUES_DEDICATED;
}
- if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DISABLED) {
+ if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_BORROWED) {
efx->n_xdp_channels = n_xdp_ev;
efx->xdp_tx_per_channel = tx_per_ev;
efx->xdp_tx_queue_count = n_xdp_tx;
@@ -205,7 +205,7 @@ static int efx_allocate_msix_channels(struct efx_nic *efx,
} else {
efx->n_xdp_channels = 0;
efx->xdp_tx_per_channel = 0;
- efx->xdp_tx_queue_count = 0;
+ efx->xdp_tx_queue_count = n_xdp_tx;
}
if (vec_count < n_channels) {
@@ -872,6 +872,20 @@ int efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 txq_entries)
goto out;
}
+static inline int
+efx_set_xdp_tx_queue(struct efx_nic *efx, int xdp_queue_number,
+ struct efx_tx_queue *tx_queue)
+{
+ if (xdp_queue_number >= efx->xdp_tx_queue_count)
+ return -EINVAL;
+
+ netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
+ tx_queue->channel->channel, tx_queue->label,
+ xdp_queue_number, tx_queue->queue);
+ efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
+ return 0;
+}
+
int efx_set_channels(struct efx_nic *efx)
{
struct efx_tx_queue *tx_queue;
@@ -910,20 +924,9 @@ int efx_set_channels(struct efx_nic *efx)
if (efx_channel_is_xdp_tx(channel)) {
efx_for_each_channel_tx_queue(tx_queue, channel) {
tx_queue->queue = next_queue++;
-
- /* We may have a few left-over XDP TX
- * queues owing to xdp_tx_queue_count
- * not dividing evenly by EFX_MAX_TXQ_PER_CHANNEL.
- * We still allocate and probe those
- * TXQs, but never use them.
- */
- if (xdp_queue_number < efx->xdp_tx_queue_count) {
- netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
- channel->channel, tx_queue->label,
- xdp_queue_number, tx_queue->queue);
- efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
+ if (rc == 0)
xdp_queue_number++;
- }
}
} else {
efx_for_each_channel_tx_queue(tx_queue, channel) {
@@ -932,6 +935,17 @@ int efx_set_channels(struct efx_nic *efx)
channel->channel, tx_queue->label,
tx_queue->queue);
}
+
+ /* If XDP is borrowing queues from net stack, it must use the queue
+ * with no csum offload, which is the first one of the channel
+ * (note: channel->tx_queue_by_type is not initialized yet)
+ */
+ if (efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_BORROWED) {
+ tx_queue = &channel->tx_queue[0];
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
+ if (rc == 0)
+ xdp_queue_number++;
+ }
}
}
}
@@ -940,19 +954,15 @@ int efx_set_channels(struct efx_nic *efx)
WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED &&
xdp_queue_number > efx->xdp_tx_queue_count);
- /* If we have less XDP TX queues than CPUs, assign the already existing
- * queues to the exceeding CPUs (this means that we will have to use
- * locking when transmitting with XDP)
+ /* If we have more CPUs than assigned XDP TX queues, assign the already
+ * existing queues to the exceeding CPUs
*/
next_queue = 0;
while (xdp_queue_number < efx->xdp_tx_queue_count) {
tx_queue = efx->xdp_tx_queues[next_queue++];
- channel = tx_queue->channel;
- netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
- channel->channel, tx_queue->label,
- xdp_queue_number, tx_queue->queue);
-
- efx->xdp_tx_queues[xdp_queue_number++] = tx_queue;
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
+ if (rc == 0)
+ xdp_queue_number++;
}
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index e731c766f709..f6981810039d 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -785,7 +785,7 @@ struct efx_async_filter_insertion {
enum efx_xdp_tx_queues_mode {
EFX_XDP_TX_QUEUES_DEDICATED, /* one queue per core, locking not needed */
EFX_XDP_TX_QUEUES_SHARED, /* each queue used by more than 1 core */
- EFX_XDP_TX_QUEUES_DISABLED /* xdp tx not available */
+ EFX_XDP_TX_QUEUES_BORROWED /* queues borrowed from net stack */
};
/**
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index c7afd6cda902..d16e031e95f4 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -428,10 +428,8 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n, struct xdp_frame **xdpfs,
unsigned int len;
int space;
int cpu;
- int i;
+ int i = 0;
- if (unlikely(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DISABLED))
- return -EINVAL;
if (unlikely(n && !xdpfs))
return -EINVAL;
if (unlikely(!n))
@@ -448,6 +446,15 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n, struct xdp_frame **xdpfs,
if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu);
+ /* If we're borrowing net stack queues we have to handle stop-restart
+ * or we might block the queue and it will be considered as frozen
+ */
+ if (efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_BORROWED) {
+ if (netif_tx_queue_stopped(tx_queue->core_txq))
+ goto unlock;
+ efx_tx_maybe_stop_queue(tx_queue);
+ }
+
/* Check for available space. We should never need multiple
* descriptors per frame.
*/
@@ -486,6 +493,7 @@ int efx_xdp_tx_buffers(struct efx_nic *efx, int n, struct xdp_frame **xdpfs,
if (flush && i > 0)
efx_nic_push_buffers(tx_queue);
+unlock:
if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
HARD_TX_UNLOCK(efx->net_dev, tx_queue->core_txq);
--
2.31.1
Hello:
This series was applied to netdev/net.git (refs/heads/master):
On Thu, 9 Sep 2021 11:28:44 +0200 you wrote:
> If there were not enough hardware resources to allocate one TX queue per
> CPU for XDP, the XDP_TX and XDP_REDIRECT actions were unavailable, and using
> them resulted each time in the packet being dropped and this message in
> the logs: XDP TX failed (-22)
>
> These patches implement two fallback solutions for two different situations
> that might happen:
> 1. There are not enough free resources for all the TX queues, but there
> are some free resources available
> 2. There are not enough free resources at all for TX queues.
>
> [...]
Here is the summary with links:
- [net,1/2] sfc: fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/415446185b93
- [net,2/2] sfc: last resort fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/6215b608a8c4
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Great work, Huguet; patches LGTM. I would ACK, but they have already been
applied:
Here is the summary with links:
- [net,1/2] sfc: fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/415446185b93
- [net,2/2] sfc: last resort fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/6215b608a8c4
Cloudflare (cc'ed): heads-up for these improvements.
And heads-up to Toke and Frey on patch 2/2, as it creates push-back via
TX queue stop/restart logic (see kernel API netif_tx_queue_stopped).
XDP currently doesn't handle this well, but I hope to see XDP queueing
work from your side to improve the situation ;-)
On 09/09/2021 11.28, Íñigo Huguet wrote:
> If there were not enough hardware resources to allocate one TX queue per
> CPU for XDP, the XDP_TX and XDP_REDIRECT actions were unavailable, and using
> them resulted each time in the packet being dropped and this message in
> the logs: XDP TX failed (-22)
>
> These patches implement two fallback solutions for two different situations
> that might happen:
> 1. There are not enough free resources for all the TX queues, but there
> are some free resources available
> 2. There are not enough free resources at all for TX queues.
>
> [...]
On 09/09/2021 10:28, Íñigo Huguet wrote:
> Íñigo Huguet (2):
> sfc: fallback for lack of xdp tx queues
> sfc: last resort fallback for lack of xdp tx queues
Patches LGTM, thanks for working on this.
However, I would like to grumble a bit about process: firstly I would
have thought this was more net-next than net material.
Secondly and more importantly, I would really have liked to have had
more than 72 minutes to review this before it was applied. Dave, I
realise you never sleep, but the rest of us occasionally have to :P
-ed
On Thu, Sep 9, 2021 at 4:39 PM Edward Cree <[email protected]> wrote:
> Patches LGTM, thanks for working on this.
>
> However, I would like to grumble a bit about process: firstly I would
> have thought this was more net-next than net material.
I intended to send it for net-next, but by mistake I tagged it as net.
Sorry, my fault.
> Secondly and more importantly, I would really have liked to have had
> more than 72 minutes to review this before it was applied. Dave, I
> realise you never sleep, but the rest of us occasionally have to :P
I'll go to sleep now too, or at least rest a bit :-P
--
Íñigo Huguet