2022-07-05 11:54:19

by Guangbin Huang

Subject: [PATCH net-next v3] net: page_pool: optimize page pool page allocation in NUMA scenario

From: Jie Wang <[email protected]>

Currently, NIC packet receive performance based on the page pool
occasionally deteriorates. To analyze the cause of this problem, page
allocation stats were collected. Here are the stats when NIC rx
performance deteriorates:

bandwidth(Gbits/s)              16.8        6.91
rx_pp_alloc_fast                13794308    21141869
rx_pp_alloc_slow                108625      166481
rx_pp_alloc_slow_ho             0           0
rx_pp_alloc_empty               8192        8192
rx_pp_alloc_refill              0           0
rx_pp_alloc_waive               100433      158289
rx_pp_recycle_cached            0           0
rx_pp_recycle_cache_full        0           0
rx_pp_recycle_ring              362400      420281
rx_pp_recycle_ring_full         6064893     9709724
rx_pp_recycle_released_ref      0           0

The rx_pp_alloc_waive count indicates that a large number of pages reside
on a NUMA node different from that of the NIC device. These pages
therefore can't be reused by the page pool. As a result, many new pages
are allocated by __page_pool_alloc_pages_slow(), which is time consuming.
This causes the NIC rx performance fluctuations.
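
As a rough cross-check of that reading, the degraded-run counters can be
compared directly. The sketch below is illustrative userspace C, not kernel
code; the struct and function names are hypothetical, and only the numbers
come from the table above. In both runs rx_pp_alloc_slow equals
rx_pp_alloc_empty plus rx_pp_alloc_waive, which is consistent with (though
it does not by itself prove) each waived page being followed by a slow-path
allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for a few of the counters shown above. */
struct pp_stats {
	unsigned long long alloc_slow;
	unsigned long long alloc_empty;
	unsigned long long alloc_waive;
};

/* Counters copied from the two degraded runs (16.8 and 6.91 Gbits/s). */
static const struct pp_stats degraded_runs[2] = {
	{ .alloc_slow = 108625, .alloc_empty = 8192, .alloc_waive = 100433 },
	{ .alloc_slow = 166481, .alloc_empty = 8192, .alloc_waive = 158289 },
};

/* Slow-path allocations that are not accounted for by an empty ring. */
static unsigned long long slow_minus_empty(const struct pp_stats *s)
{
	assert(s != NULL);
	return s->alloc_slow - s->alloc_empty;
}

/* Waived pages as a percentage of slow-path allocations. */
static double waive_share_pct(const struct pp_stats *s)
{
	assert(s != NULL);
	if (s->alloc_slow == 0)
		return 0.0;
	return 100.0 * (double)s->alloc_waive / (double)s->alloc_slow;
}
```

For the two runs, slow_minus_empty() returns 100433 and 158289, matching
rx_pp_alloc_waive, and waive_share_pct() comes out at roughly 92% and 95%.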

The main reason for the large number of NUMA-mismatched pages in the page
pool is that it uses alloc_pages_bulk_array() to allocate the original
pages. This function is not suitable for page allocation in a NUMA
scenario. So this patch uses alloc_pages_bulk_array_node(), which takes a
NUMA node id as an input parameter, to ensure NUMA consistency between
the NIC device and the allocated pages.
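
The effect of passing a node id can be illustrated with a toy userspace
model. This is a sketch under stated assumptions, not the kernel
implementation: toy_page, toy_alloc(), toy_can_reuse() and
toy_count_waived() are hypothetical names, TOY_NO_NODE stands in for
NUMA_NO_NODE, and the allocator is assumed to place a page on the caller's
node whenever no node is requested.

```c
#include <assert.h>

#define TOY_NO_NODE (-1)	/* hypothetical stand-in for NUMA_NO_NODE */

/* A page reduced to the one property that matters here: its NUMA node. */
struct toy_page {
	int nid;
};

/*
 * Toy allocator (assumption): honor the requested node, or fall back to
 * the node of the CPU doing the allocation when none is requested.
 */
static struct toy_page toy_alloc(int request_nid, int caller_nid)
{
	struct toy_page page;

	page.nid = (request_nid == TOY_NO_NODE) ? caller_nid : request_nid;
	return page;
}

/* Reuse check: a page is kept only if it sits on the pool's node. */
static int toy_can_reuse(const struct toy_page *page, int pool_nid)
{
	return page->nid == pool_nid;
}

/* Allocate nr pages and count how many the pool would have to waive. */
static int toy_count_waived(int nr, int request_nid, int caller_nid,
			    int pool_nid)
{
	int waived = 0;
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++) {
		struct toy_page page = toy_alloc(request_nid, caller_nid);

		if (!toy_can_reuse(&page, pool_nid))
			waived++;
	}
	return waived;
}
```

With the pool on node 0 and the allocating CPU on node 1,
toy_count_waived(64, TOY_NO_NODE, 1, 0) waives all 64 pages, while
toy_count_waived(64, 0, 1, 0) waives none.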

The NIC rx performance test was repeated 40 times. NIC rx bandwidth is
higher and more stable compared to the data above. Here are the stats
from three of the tests: the rx_pp_alloc_waive count is zero, and
rx_pp_alloc_slow, which counts pages allocated from the slow path, is
relatively low.

bandwidth(Gbits/s)              93          93.9        93.8
rx_pp_alloc_fast                60066264    61266386    60938254
rx_pp_alloc_slow                16512       16517       16539
rx_pp_alloc_slow_ho             0           0           0
rx_pp_alloc_empty               16512       16517       16539
rx_pp_alloc_refill              473841      481910      481585
rx_pp_alloc_waive               0           0           0
rx_pp_recycle_cached            0           0           0
rx_pp_recycle_cache_full        0           0           0
rx_pp_recycle_ring              29754145    30358243    30194023
rx_pp_recycle_ring_full         0           0           0
rx_pp_recycle_released_ref      0           0           0

Signed-off-by: Jie Wang <[email protected]>

---
v2->v3:
1, Delete the #ifdefs
2, Use 'pool->p.nid' in the call to alloc_pages_bulk_array_node()

v1->v2:
1, Remove two inappropriate comments.
2, Use NUMA_NO_NODE instead of numa_mem_id() for code maintenance.
---
net/core/page_pool.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f18e6e771993..b74905fcc3a1 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -389,7 +389,8 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
-	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
+	nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
+					       pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;

--
2.33.0


2022-07-07 19:38:36

by Jesper Dangaard Brouer

Subject: Re: [PATCH net-next v3] net: page_pool: optimize page pool page allocation in NUMA scenario


On 05/07/2022 13.35, Guangbin Huang wrote:
> [...]
>
> Signed-off-by: Jie Wang <[email protected]>

Acked-by: Jesper Dangaard Brouer <[email protected]>

> [...]

2022-07-07 21:55:35

by Ilias Apalodimas

Subject: Re: [PATCH net-next v3] net: page_pool: optimize page pool page allocation in NUMA scenario

On Thu, 7 Jul 2022 at 22:14, Jesper Dangaard Brouer <[email protected]> wrote:
>
>
> On 05/07/2022 13.35, Guangbin Huang wrote:
> > [...]
> >
> > Signed-off-by: Jie Wang <[email protected]>
>
> Acked-by: Jesper Dangaard Brouer <[email protected]>
>
> > [...]
>

Acked-by: Ilias Apalodimas <[email protected]>

2022-07-08 00:45:20

by patchwork-bot+netdevbpf

Subject: Re: [PATCH net-next v3] net: page_pool: optimize page pool page allocation in NUMA scenario

Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski <[email protected]>:

On Tue, 5 Jul 2022 19:35:15 +0800 you wrote:
> From: Jie Wang <[email protected]>
>
> Currently NIC packet receiving performance based on page pool deteriorates
> occasionally. To analysis the causes of this problem page allocation stats
> are collected. Here are the stats when NIC rx performance deteriorates:
>
> bandwidth(Gbits/s) 16.8 6.91
> rx_pp_alloc_fast 13794308 21141869
> rx_pp_alloc_slow 108625 166481
> rx_pp_alloc_slow_h 0 0
> rx_pp_alloc_empty 8192 8192
> rx_pp_alloc_refill 0 0
> rx_pp_alloc_waive 100433 158289
> rx_pp_recycle_cached 0 0
> rx_pp_recycle_cache_full 0 0
> rx_pp_recycle_ring 362400 420281
> rx_pp_recycle_ring_full 6064893 9709724
> rx_pp_recycle_released_ref 0 0
>
> [...]

Here is the summary with links:
- [net-next,v3] net: page_pool: optimize page pool page allocation in NUMA scenario
https://git.kernel.org/netdev/net-next/c/d810d367ec40

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html