To make the most of the per-CPU NAPI skbuff_head caches and save CPU
cycles, switch from dev_kfree_skb_any(), which passes the skb back to
the mm layer, to napi_consume_skb(), which feeds those caches when the
budget is non-zero (and falls back to the former when it is 0).
Do the replacement in e1000_unmap_and_free_tx_resource(). There are
3 call sites of this function throughout the driver:
 * e1000_clean_tx_ring(). Slowpath, process context, cleans the
   whole Tx ring on ifdown. Use a budget of 0 here;
 * e1000_tx_map(). Hotpath, net Tx softirq, unmaps the buffers in
   case of an error. Use 0 as well;
 * e1000_clean_tx_irq(). Hotpath, NAPI Tx completion polling cycle.
   As the driver doesn't count completed Tx entries towards the NAPI
   budget, just use the poll budget of 64 to utilize the caches.

Apart from being a preparation for switching to napi_build_skb(),
this is useful on its own as well, as napi_consume_skb() flushes
the skb caches in batches of 32 instead of one at a time.

Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
---
drivers/net/ethernet/intel/e1000/e1000_main.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 669060a2e6aa..975a145d48ef 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1953,7 +1953,8 @@ void e1000_free_all_tx_resources(struct e1000_adapter *adapter)
static void
e1000_unmap_and_free_tx_resource(struct e1000_adapter *adapter,
- struct e1000_tx_buffer *buffer_info)
+ struct e1000_tx_buffer *buffer_info,
+ int budget)
{
if (buffer_info->dma) {
if (buffer_info->mapped_as_page)
@@ -1966,7 +1967,7 @@ e1000_unmap_and_free_tx_resource(struct e1000_adapter *adapter,
buffer_info->dma = 0;
}
if (buffer_info->skb) {
- dev_kfree_skb_any(buffer_info->skb);
+ napi_consume_skb(buffer_info->skb, budget);
buffer_info->skb = NULL;
}
buffer_info->time_stamp = 0;
@@ -1990,7 +1991,7 @@ static void e1000_clean_tx_ring(struct e1000_adapter *adapter,
for (i = 0; i < tx_ring->count; i++) {
buffer_info = &tx_ring->buffer_info[i];
- e1000_unmap_and_free_tx_resource(adapter, buffer_info);
+ e1000_unmap_and_free_tx_resource(adapter, buffer_info, 0);
}
netdev_reset_queue(adapter->netdev);
@@ -2958,7 +2959,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
i += tx_ring->count;
i--;
buffer_info = &tx_ring->buffer_info[i];
- e1000_unmap_and_free_tx_resource(adapter, buffer_info);
+ e1000_unmap_and_free_tx_resource(adapter, buffer_info, 0);
}
return 0;
@@ -3856,7 +3857,8 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter,
}
}
- e1000_unmap_and_free_tx_resource(adapter, buffer_info);
+ e1000_unmap_and_free_tx_resource(adapter, buffer_info,
+ 64);
tx_desc->upper.data = 0;
if (unlikely(++i == tx_ring->count))
--
2.33.1
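
For illustration only, the budget semantics described in the commit
message can be modeled in user space roughly as follows. This is a
sketch, not kernel code: toy_cache, toy_consume(), toy_flush_half() and
toy_drain() are hypothetical names, and the cache capacity of 64 is an
assumption picked so that half of it equals the batch of 32 mentioned
in the commit message.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdlib.h>

#define TOY_CACHE_SIZE 64                    /* assumed cache capacity */
#define TOY_CACHE_HALF (TOY_CACHE_SIZE / 2)  /* objects freed per flush: 32 */

struct toy_cache {
	void *slots[TOY_CACHE_SIZE];
	unsigned int count;          /* objects currently cached */
	unsigned int single_frees;   /* objects freed one at a time */
	unsigned int batch_flushes;  /* number of batched flushes */
};

/* Free the upper half of the cache in one pass (stands in for a bulk free). */
static void toy_flush_half(struct toy_cache *c)
{
	for (unsigned int i = TOY_CACHE_HALF; i < TOY_CACHE_SIZE; i++)
		free(c->slots[i]);
	c->count = TOY_CACHE_HALF;
	c->batch_flushes++;
}

/*
 * budget == 0: not in a polling cycle, free the object right away.
 * budget != 0: stash the object in the cache and flush half of the
 *              cache in one batch once it becomes full.
 */
static void toy_consume(struct toy_cache *c, void *obj, int budget)
{
	if (!budget) {
		free(obj);
		c->single_frees++;
		return;
	}

	assert(c->count < TOY_CACHE_SIZE);
	c->slots[c->count++] = obj;
	if (c->count == TOY_CACHE_SIZE)
		toy_flush_half(c);
}

/* Release whatever is still cached, e.g. before the program exits. */
static void toy_drain(struct toy_cache *c)
{
	while (c->count)
		free(c->slots[--c->count]);
}
```

In this model, a caller that passes a non-zero budget pays for one
batched flush per 32 cached objects once the cache has filled up, while
a caller that passes 0 frees every object individually, which mirrors
the slowpath/hotpath split between the call sites in the patch.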
> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Alexander Lobakin
> Sent: Tuesday, November 23, 2021 9:19 AM
> To: [email protected]
> Cc: [email protected]; [email protected]; Jakub Kicinski
> <[email protected]>; David S. Miller <[email protected]>
> Subject: [Intel-wired-lan] [PATCH net-next 1/9] e1000: switch to
> napi_consume_skb()
>
Tested-by: Tony Brelinski <[email protected]>