Subject: Re: [PATCH net-next v1 02/12] mm: page_frag: use initial zero offset for page_frag_alloc_align()
From: Yunsheng Lin
To: Alexander Duyck
CC: Andrew Morton
Date: Tue, 9 Apr 2024 15:59:46 +0800
References: <20240407130850.19625-1-linyunsheng@huawei.com>
 <20240407130850.19625-3-linyunsheng@huawei.com>
 <43d99616cd4a2a6fce6a6b97f73d08ebc5361a61.camel@gmail.com>

On 2024/4/9 0:11, Alexander Duyck wrote:
> On Mon, Apr 8, 2024 at 6:39 AM Yunsheng Lin wrote:
>>
>> On 2024/4/8 1:52, Alexander H Duyck wrote:
>>> On Sun, 2024-04-07 at 21:08 +0800, Yunsheng Lin wrote:
>>>> We are about to use the page_frag_alloc_*() API not just to
>>>> allocate memory for skb->data, but also to do the memory
>>>> allocation for skb frags. Currently the page_frag implementation
>>>> in the mm subsystem runs the offset as a countdown rather than a
>>>> count-up value; there may be several advantages to that, as
>>>> mentioned in [1], but it also has some disadvantages, for
>>>> example, it may prevent skb frag coalescing and more correct
>>>> cache prefetching.
>>>>
>>>> We have a trade-off to make in order to have a unified
>>>> implementation and API for page_frag, so use an initial zero
>>>> offset in this patch, and the following patch will try to make
>>>> some optimizations to avoid the disadvantages as much as
>>>> possible.
>>>>
>>>> 1. https://lore.kernel.org/all/f4abe71b3439b39d17a6fb2d410180f367cadf5c.camel@gmail.com/
>>>>
>>>> CC: Alexander Duyck
>>>> Signed-off-by: Yunsheng Lin
>>>> ---
>>>>  mm/page_frag_cache.c | 31 ++++++++++++++-----------------
>>>>  1 file changed, 14 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
>>>> index a0f90ba25200..3e3e88d9af90 100644
>>>> --- a/mm/page_frag_cache.c
>>>> +++ b/mm/page_frag_cache.c
>>>> @@ -67,9 +67,8 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>>>  			      unsigned int fragsz, gfp_t gfp_mask,
>>>>  			      unsigned int align_mask)
>>>>  {
>>>> -	unsigned int size = PAGE_SIZE;
>>>> +	unsigned int size, offset;
>>>>  	struct page *page;
>>>> -	int offset;
>>>>
>>>>  	if (unlikely(!nc->va)) {
>>>>  refill:
>>>> @@ -77,10 +76,6 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>>>  		if (!page)
>>>>  			return NULL;
>>>>
>>>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>>>> -		/* if size can vary use size else just use PAGE_SIZE */
>>>> -		size = nc->size;
>>>> -#endif
>>>>  		/* Even if we own the page, we do not use atomic_set().
>>>>  		 * This would break get_page_unless_zero() users.
>>>>  		 */
>>>> @@ -89,11 +84,18 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
>>>>  		/* reset page count bias and offset to start of new frag */
>>>>  		nc->pfmemalloc = page_is_pfmemalloc(page);
>>>>  		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
>>>> -		nc->offset = size;
>>>> +		nc->offset = 0;
>>>>  	}
>>>>
>>>> -	offset = nc->offset - fragsz;
>>>> -	if (unlikely(offset < 0)) {
>>>> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>>>> +	/* if size can vary use size else just use PAGE_SIZE */
>>>> +	size = nc->size;
>>>> +#else
>>>> +	size = PAGE_SIZE;
>>>> +#endif
>>>> +
>>>> +	offset = ALIGN(nc->offset, -align_mask);
>>>> +	if (unlikely(offset + fragsz > size)) {
>>>
>>> Rather than using "ALIGN" with a negative value it would probably make
>>> more sense to use __ALIGN_KERNEL_MASK with ~align_mask. I am not sure
>>> how well the compiler sorts out the use of negatives to flip values
>>> that are then converted to masks with the "(a) - 1".
>>
>> The next patch will remove the '-' in '-align_mask', as the 'ALIGN'
>> operation is done in the inline helper. I am not sure it matters much
>> to use __ALIGN_KERNEL_MASK with ~align_mask here?
>
> It is a matter of making the negations more obvious. Basically you
> could achieve the same alignment by doing:
>   (offset + (~align_mask)) & ~(~align_mask)
> rather than:
>   (offset + ((-align_mask) - 1)) & ~((-align_mask) - 1)
>
> I'm not sure the compiler will pick up on the fact that the two are
> identical and can save a number of operations. Also my suggested
> approach is closer to how it used to work. Technically the one you are
> using only works if align_mask is a negative power of 2.

In patch 3, we have the below, so the above trick is not really needed
after patch 3:

@@ -94,7 +93,7 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
 	size = PAGE_SIZE;
 #endif

-	offset = ALIGN(nc->offset, -align_mask);
+	offset = nc->offset;
 	if (unlikely(offset + fragsz > size)) {
 		page = virt_to_page(nc->va);
@@ -131,7 +130,7 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc,
 	return nc->va + offset;
 }
-EXPORT_SYMBOL(__page_frag_alloc_align);
+EXPORT_SYMBOL(page_frag_alloc);

..

+static inline void *__page_frag_alloc_align(struct page_frag_cache *nc,
+					    unsigned int fragsz, gfp_t gfp_mask,
+					    unsigned int align)
+{
+	nc->offset = ALIGN(nc->offset, align);
+
+	return page_frag_alloc(nc, fragsz, gfp_mask);
+}

 static inline void *page_frag_alloc_align(struct page_frag_cache *nc,
 					   unsigned int fragsz, gfp_t gfp_mask,
 					   unsigned int align)
 {
 	WARN_ON_ONCE(!is_power_of_2(align));

-	return __page_frag_alloc_align(nc, fragsz, gfp_mask, -align);
-}

> .
>
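For illustration, a minimal userspace sketch (not kernel code, and not
part of the patch) of the equivalence discussed above: when a
power-of-two `align` is passed down by callers as `align_mask = -align`,
ALIGN(offset, -align_mask) and a mask-based alignment using ~align_mask
round the offset up to the same value, because -x - 1 == ~x in two's
complement arithmetic. The ALIGN_MASK()/ALIGN() macros below only mirror
the shape of the kernel's __ALIGN_KERNEL_MASK()/ALIGN() helpers.

/*
 * Standalone sketch: compare the two alignment expressions for every
 * power-of-two alignment up to 4096 and every offset below 8192.
 */
#include <assert.h>
#include <stdio.h>

/* Same shape as the kernel's __ALIGN_KERNEL_MASK()/ALIGN() helpers. */
#define ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))
#define ALIGN(x, a)		ALIGN_MASK((x), (a) - 1)

int main(void)
{
	unsigned int align, offset;

	for (align = 1; align <= 4096; align <<= 1) {
		/* how page_frag_alloc_align() callers build the mask */
		unsigned int align_mask = -align;

		for (offset = 0; offset < 8192; offset++) {
			/* form used in this patch: ALIGN() with the negated mask */
			unsigned int a = ALIGN(offset, -align_mask);
			/* suggested form: mask-based alignment with ~align_mask */
			unsigned int b = ALIGN_MASK(offset, ~align_mask);

			assert(a == b);
		}
	}

	printf("both forms agree for all tested power-of-two alignments\n");
	return 0;
}

The assertion only holds because align_mask is constructed as the
negation of a power of two, which matches the constraint Alexander
notes for the ALIGN(offset, -align_mask) form.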