2021-11-23 17:24:54

by Alexander Lobakin

Subject: [PATCH net-next 0/9] intel: switch to napi_build_skb()

napi_build_skb(), which I introduced earlier this year ([0]), aims
to reduce MM pressure and the overhead of an in-place
kmem_cache_alloc() on each Rx entry by taking skbuff_heads from
the per-CPU NAPI cache, which napi_consume_skb() has filled
beforehand (so it is a direct shortcut for the
free -> mm -> alloc cycle).
Currently, no in-tree drivers use it. Switch all Intel Ethernet
drivers to it to get slight-to-medium performance boosts, depending
on the frame size.

ice driver, 50 Gbps link, pktgen + XDP_PASS (local in) sample:

frame_size/nthreads  64/42   128/20  256/8   512/4   1024/2  1532/1

net-next (Kpps)      46062   34654   18248   9830    5343    2714
series               47438   34708   18330   9875    5435    2777
increase             2.9%    0.15%   0.45%   0.46%   1.72%   2.32%
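
For readers who want to re-derive the "increase" row, here is a minimal sketch of the arithmetic in C. The array and function names are hypothetical and exist only for this illustration; the numbers are copied from the table above. Each entry is computed as (series - net-next) / net-next * 100.

```c
#include <assert.h>

#define KPPS_COLS 6	/* one column per frame_size/nthreads pair */

/* Kpps figures copied from the table above */
static const double net_next_kpps[KPPS_COLS] = {
	46062, 34654, 18248, 9830, 5343, 2714
};
static const double series_kpps[KPPS_COLS] = {
	47438, 34708, 18330, 9875, 5435, 2777
};

/* Fill out[] with the relative increase of each column, in percent */
void kpps_increase_pct(double out[KPPS_COLS])
{
	int i;

	for (i = 0; i < KPPS_COLS; i++) {
		assert(net_next_kpps[i] > 0.0);
		out[i] = (series_kpps[i] - net_next_kpps[i]) /
			 net_next_kpps[i] * 100.0;
	}
}
```

The computed values agree with the table to within one unit of the last printed digit.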

Additionally, e1000 has been switched to napi_consume_skb(), as it
is safe and works fine there, and there is no point in using
napi_build_skb() without a paired NAPI cache feeding point.
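
To illustrate why the pairing matters, below is a userspace toy model in C. It is not kernel code: every name in it (toy_head, toy_cache, toy_consume, toy_build, toy_run) and the cache capacity are assumptions made for this sketch. The in-kernel implementation differs in its details, for instance in how it refills and trims the cache. The model only shows the general idea: heads released on Tx completion are parked in a cache, and the Rx path takes them from there instead of going to the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_CACHE_SIZE 64	/* illustrative capacity, an assumption */

struct toy_head { char pad[256]; };	/* stand-in for an skbuff_head */

struct toy_cache {
	struct toy_head *slot[TOY_CACHE_SIZE];
	size_t count;
	unsigned long allocs;	/* trips to the general-purpose allocator */
};

/* Tx completion side: park the head for reuse (models the feeding point) */
static void toy_consume(struct toy_cache *c, struct toy_head *h)
{
	if (c->count < TOY_CACHE_SIZE)
		c->slot[c->count++] = h;
	else
		free(h);	/* cache full, give the head back */
}

/* Rx side: take a cached head if there is one, else ask the allocator */
static struct toy_head *toy_build(struct toy_cache *c)
{
	struct toy_head *h;

	if (c->count)
		return c->slot[--c->count];

	h = malloc(sizeof(*h));
	if (h)
		c->allocs++;
	return h;
}

/*
 * Run `polls` poll cycles. Each cycle first completes the heads built in
 * the previous cycle (Tx), then builds `budget` new ones (Rx).
 * With feed_cache != 0 the completed heads go to the cache; otherwise they
 * are freed directly. Returns the number of allocator calls made.
 */
unsigned long toy_run(int polls, int budget, int feed_cache)
{
	struct toy_cache cache = { .count = 0 };
	struct toy_head *inflight[TOY_CACHE_SIZE];
	int n_inflight = 0;
	int i;

	assert(budget > 0 && budget <= TOY_CACHE_SIZE);

	while (polls-- > 0) {
		/* Tx completion runs first */
		for (i = 0; i < n_inflight; i++) {
			if (feed_cache)
				toy_consume(&cache, inflight[i]);
			else
				free(inflight[i]);
		}
		n_inflight = 0;

		/* Rx runs second and needs one head per received frame */
		for (i = 0; i < budget; i++) {
			struct toy_head *h = toy_build(&cache);

			if (!h)
				break;
			inflight[n_inflight++] = h;
		}
	}

	/* Release whatever is still held */
	for (i = 0; i < n_inflight; i++)
		free(inflight[i]);
	while (cache.count)
		free(cache.slot[--cache.count]);

	return cache.allocs;
}
```

In this model, toy_run(100, 32, 1) goes to the allocator 32 times (only during the first cycle, while the cache is still cold), whereas toy_run(100, 32, 0) goes there 3200 times, because nothing ever feeds the cache.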

[0] https://lore.kernel.org/all/[email protected]

Alexander Lobakin (9):
  e1000: switch to napi_consume_skb()
  e1000: switch to napi_build_skb()
  i40e: switch to napi_build_skb()
  iavf: switch to napi_build_skb()
  ice: switch to napi_build_skb()
  igb: switch to napi_build_skb()
  igc: switch to napi_build_skb()
  ixgbe: switch to napi_build_skb()
  ixgbevf: switch to napi_build_skb()

 drivers/net/ethernet/intel/e1000/e1000_main.c     | 14 ++++++++------
 drivers/net/ethernet/intel/i40e/i40e_txrx.c       |  2 +-
 drivers/net/ethernet/intel/iavf/iavf_txrx.c       |  2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c         |  2 +-
 drivers/net/ethernet/intel/igb/igb_main.c         |  2 +-
 drivers/net/ethernet/intel/igc/igc_main.c         |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |  2 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  2 +-
 8 files changed, 15 insertions(+), 13 deletions(-)

--
2.33.1



2021-11-23 17:24:57

by Alexander Lobakin

Subject: [PATCH net-next 7/9] igc: switch to napi_build_skb()

napi_build_skb() reuses the per-CPU NAPI skbuff_head cache in order
to save some cycles on freeing/allocating skbuff_heads on every
new Rx or completed Tx.
The igc driver runs its Tx completion polling cycle right before the
Rx one and uses napi_consume_skb() to feed the cache with the
skbuff_heads of completed entries, so the cache is never empty and
always warm at that moment. Switch to napi_build_skb() to relax mm
pressure on heavy Rx.

Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 8e448288ee26..8b13a61ea5c9 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1729,7 +1729,7 @@ static struct sk_buff *igc_build_skb(struct igc_ring *rx_ring,
 	net_prefetch(va);
 
 	/* build an skb around the page buffer */
-	skb = build_skb(va - IGC_SKB_PAD, truesize);
+	skb = napi_build_skb(va - IGC_SKB_PAD, truesize);
 	if (unlikely(!skb))
 		return NULL;

--
2.33.1


2021-11-23 17:25:01

by Alexander Lobakin

Subject: [PATCH net-next 3/9] i40e: switch to napi_build_skb()

napi_build_skb() reuses the per-CPU NAPI skbuff_head cache in order
to save some cycles on freeing/allocating skbuff_heads on every
new Rx or completed Tx.
The i40e driver runs its Tx completion polling cycle right before the
Rx one and uses napi_consume_skb() to feed the cache with the
skbuff_heads of completed entries, so the cache is never empty and
always warm at that moment. Switch to napi_build_skb() to relax mm
pressure on heavy Rx.

Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 10a83e5385c7..9e3991caa5c9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2204,7 +2204,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	net_prefetch(xdp->data_meta);
 
 	/* build an skb around the page buffer */
-	skb = build_skb(xdp->data_hard_start, truesize);
+	skb = napi_build_skb(xdp->data_hard_start, truesize);
 	if (unlikely(!skb))
 		return NULL;

--
2.33.1


2021-12-03 14:45:05

by G, GurucharanX

Subject: RE: [Intel-wired-lan] [PATCH net-next 3/9] i40e: switch to napi_build_skb()



> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Alexander Lobakin
> Sent: Tuesday, November 23, 2021 10:49 PM
> To: [email protected]
> Cc: [email protected]; [email protected]; Jakub Kicinski
> <[email protected]>; David S. Miller <[email protected]>
> Subject: [Intel-wired-lan] [PATCH net-next 3/9] i40e: switch to napi_build_skb()
>
> napi_build_skb() reuses the per-CPU NAPI skbuff_head cache in order to save
> some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx.
> The i40e driver runs its Tx completion polling cycle right before the Rx one
> and uses napi_consume_skb() to feed the cache with the skbuff_heads of
> completed entries, so the cache is never empty and always warm at that moment.
> Switch to napi_build_skb() to relax mm pressure on heavy Rx.
>
> Signed-off-by: Alexander Lobakin <[email protected]>
> Reviewed-by: Michal Swiatkowski <[email protected]>
> ---
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>

Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)

2021-12-08 08:15:46

by Kraus, NechamaX

Subject: Re: [Intel-wired-lan] [PATCH net-next 7/9] igc: switch to napi_build_skb()

On 11/23/2021 19:18, Alexander Lobakin wrote:
> napi_build_skb() reuses the per-CPU NAPI skbuff_head cache in order
> to save some cycles on freeing/allocating skbuff_heads on every
> new Rx or completed Tx.
> The igc driver runs its Tx completion polling cycle right before the
> Rx one and uses napi_consume_skb() to feed the cache with the
> skbuff_heads of completed entries, so the cache is never empty and
> always warm at that moment. Switch to napi_build_skb() to relax mm
> pressure on heavy Rx.
>
> Signed-off-by: Alexander Lobakin <[email protected]>
> Reviewed-by: Michal Swiatkowski <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Tested-by: Nechama Kraus <[email protected]>