From: Guangbin Huang
To: , , , , , ,
CC: , , , , , ,
Subject: [PATCH net-next] net: page_pool: optimize page pool page allocation in NUMA scenario
Date: Fri, 24 Jun 2022 17:36:21 +0800
Message-ID: <20220624093621.12505-1-huangguangbin2@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Jie Wang

Currently, NIC packet receiving performance based on page pool deteriorates
occasionally. To analyze the causes of this problem, page allocation stats
were collected. Here are the stats when NIC rx performance deteriorates:

bandwidth(Gbits/s)		16.8		6.91
rx_pp_alloc_fast		13794308	21141869
rx_pp_alloc_slow		108625		166481
rx_pp_alloc_slow_ho		0		0
rx_pp_alloc_empty		8192		8192
rx_pp_alloc_refill		0		0
rx_pp_alloc_waive		100433		158289
rx_pp_recycle_cached		0		0
rx_pp_recycle_cache_full	0		0
rx_pp_recycle_ring		362400		420281
rx_pp_recycle_ring_full		6064893		9709724
rx_pp_recycle_released_ref	0		0

The rx_pp_alloc_waive count indicates that a large number of pages have a
NUMA node inconsistent with that of the NIC device, so these pages cannot
be reused by the page pool. As a result, many new pages are allocated by
__page_pool_alloc_pages_slow(), which is time consuming. This causes the
NIC rx performance fluctuations.

The main reason for the large number of NUMA-mismatched pages in the page
pool is that the page pool uses alloc_pages_bulk_array() to allocate the
original pages. This function is not suitable for page allocation in a
NUMA scenario. So this patch uses alloc_pages_bulk_array_node(), which
takes a NUMA node id as an input parameter, to ensure NUMA consistency
between the NIC device and the allocated pages.
Repeated NIC rx performance tests were performed 40 times. NIC rx bandwidth
is higher and more stable compared to the data above. Here are three test
stats; the rx_pp_alloc_waive count is zero, and rx_pp_alloc_slow, which
indicates pages allocated from the slow path, is relatively low.

bandwidth(Gbits/s)		93		93.9		93.8
rx_pp_alloc_fast		60066264	61266386	60938254
rx_pp_alloc_slow		16512		16517		16539
rx_pp_alloc_slow_ho		0		0		0
rx_pp_alloc_empty		16512		16517		16539
rx_pp_alloc_refill		473841		481910		481585
rx_pp_alloc_waive		0		0		0
rx_pp_recycle_cached		0		0		0
rx_pp_recycle_cache_full	0		0		0
rx_pp_recycle_ring		29754145	30358243	30194023
rx_pp_recycle_ring_full		0		0		0
rx_pp_recycle_released_ref	0		0		0

Signed-off-by: Jie Wang
---
 net/core/page_pool.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f18e6e771993..15997fcd78f3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -377,6 +377,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	unsigned int pp_order = pool->p.order;
 	struct page *page;
 	int i, nr_pages;
+	int pref_nid; /* preferred NUMA node */
 
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
@@ -386,10 +387,18 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	/* Unnecessary as alloc cache is empty, but handles zero count */
 	if (unlikely(pool->alloc.count > 0))
 		return pool->alloc.cache[--pool->alloc.count];
 
+#ifdef CONFIG_NUMA
+	pref_nid = (pool->p.nid == NUMA_NO_NODE) ? numa_mem_id() : pool->p.nid;
+#else
+	/* Ignore pool->p.nid setting if !CONFIG_NUMA, helps compiler */
+	pref_nid = numa_mem_id(); /* will be zero like page_to_nid() */
+#endif
+
 	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
-	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
+	nr_pages = alloc_pages_bulk_array_node(gfp, pref_nid, bulk,
+					       pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;
-- 
2.33.0
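[Editor's note, not part of the patch: the pool->p.nid field consulted above
is supplied by the driver when it creates the pool. A minimal, hypothetical
driver-side sketch follows; the helper name rx_create_pool() and its
parameters are illustrative assumptions, while page_pool_create(),
struct page_pool_params, and dev_to_node() are the existing kernel APIs.]

```c
#include <net/page_pool.h>

/* Hypothetical helper: create an rx page pool bound to the NIC's node. */
static struct page_pool *rx_create_pool(struct device *dev, int ring_size)
{
	struct page_pool_params pp_params = {
		.order		= 0,		/* single pages */
		.pool_size	= ring_size,
		/* Bind allocations to the device's NUMA node; with
		 * NUMA_NO_NODE the pool falls back to the CPU-local node.
		 */
		.nid		= dev_to_node(dev),
		.dev		= dev,
	};

	return page_pool_create(&pp_params);
}
```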