Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp2280419pxb; Thu, 11 Feb 2021 08:33:23 -0800 (PST) X-Google-Smtp-Source: ABdhPJyNB+ZkeitixblPE1TVQUNlk7Jl+sG1tA96U+/oqADtfB9xgCrsPc4+Q3kJYuSLnGoeUSsC X-Received: by 2002:a05:6402:2022:: with SMTP id ay2mr9082757edb.292.1613061203088; Thu, 11 Feb 2021 08:33:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1613061203; cv=none; d=google.com; s=arc-20160816; b=fgvPBfIvccF2W7ONumkJv4/zbcYtZ46GPVFh0AXHOKzdIBcRn6neDv37IoDyjcvA+1 q0s2bvALHdLAxqMgKK0RtVv2DgF8CFn0fIaT/vUOL39lNG7lOvO5GcZXnlb+NCjK9gDV drQYXz8uW7+wUhIhfVmkyiIJrh/Szom0xm9bMIPYN05nXLYlmQz21/vZwSQHo+opSuZt 1jPmJIE0uT91jViJRHf7tkE1hwD0wdY0hQ1SBJ1brtlVkCc8Rq3UGw0q79rH6IVgUk9C t9ZJKnjlmfUcPoGQVrvoYCp1giJKvqoshLfqmIMVcyKzpTpCAlgvZSb4RjzirECAHm3r uX4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:reply-to:cc:from:to :dkim-signature:date; bh=LVzb0WFXP5BTyndSSDvVZrA0sSqaGkk4EkDkTvrQzYY=; b=nHmoj2QlcvewdcRuEpa3QRho1FsDxuKzd89Nxt6eajwBjjn1uobsq+1a73SML2F00s 9WuBWMxCwqHNvjRfxkL3sJCjdHI6dQzuttGqlQL8OAy06bM+TlT/Nb28MrsHejov7dcK WPPOL5/fr5+GPVGengq7HX55TDkOQJLwH5aaz5pyhiIV+i7bFr9piZaX1DlRG7bTuOMJ aC0HaOLCDh8K0FU0WC+vMxi1VLDe0ek62gtNqw9qxXRY+bX6llx5qHPq6PvoLY3+0KnU alfRh1u+6PxZIjYEBDcgrnLoIRQYDvlIEEvq9FC2UVGhHdHw3okZk+21eBYZdIwjZH6J KWJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@pm.me header.s=protonmail header.b=k4fRLo6T; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=pm.me Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id r25si5201060edc.257.2021.02.11.08.32.59; Thu, 11 Feb 2021 08:33:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@pm.me header.s=protonmail header.b=k4fRLo6T; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=pm.me Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231432AbhBKQcO (ORCPT + 99 others); Thu, 11 Feb 2021 11:32:14 -0500 Received: from mail-40136.protonmail.ch ([185.70.40.136]:38010 "EHLO mail-40136.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230244AbhBKP1x (ORCPT ); Thu, 11 Feb 2021 10:27:53 -0500 Date: Thu, 11 Feb 2021 15:26:46 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1613057213; bh=LVzb0WFXP5BTyndSSDvVZrA0sSqaGkk4EkDkTvrQzYY=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=k4fRLo6TGq5i8/y0neBrbr1UFF/+tigMTucFWh1ZDkibUA0IGp5OYm4ln4611hldw R40OrDgEWU5j2UViTIgwmpEOGIj3j0+m/atEP1c7Kxl+aVqHZ8BB1jXX8BSjx5W93m tzEzTUtQjeuegLRX6/Aosmh5G7aOnr2zdC/bCyRTXMsDQRkpB8N0OTSkRd60CFvjJD 1VNUJG5Je7ylv4wdnlPyy+9C1yDV+QjE2V+FZZB0w8oc9Ul1pvcLE1nQ6w7pJQ+gCb TaHZtiRmr2TZs0PPM2j8xlKjfkb69Fpp8iy7SE8VEbzBerAEYrAJ/zSObgRKpgJO2O zBaV7m6CheHgg== To: Paolo Abeni From: Alexander Lobakin Cc: Alexander Lobakin , "David S. Miller" , Jakub Kicinski , Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?Q?Bj=C3=B6rn_T=C3=B6pel?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: Re: [PATCH v4 net-next 09/11] skbuff: allow to optionally use NAPI cache from __alloc_skb() Message-ID: <20210211152620.3339-1-alobakin@pm.me> In-Reply-To: References: <20210210162732.80467-1-alobakin@pm.me> <20210210162732.80467-10-alobakin@pm.me> <58147c2d36ea7b6e0284d400229cd79185c53463.camel@redhat.com> <20210211142811.1813-1-alobakin@pm.me> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.2 required=10.0 tests=ALL_TRUSTED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no autolearn=disabled version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on mailout.protonmail.ch Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Paolo Abeni Date: Thu, 11 Feb 2021 15:55:04 +0100 > On Thu, 2021-02-11 at 14:28 +0000, Alexander Lobakin wrote: > > From: Paolo Abeni on Thu, 11 Feb 2021 11:16:40 +010= 0 wrote: > > > What about changing __napi_alloc_skb() to always use > > > the __napi_build_skb(), for both kmalloc and page backed skbs? That i= s, > > > always doing the 'data' allocation in __napi_alloc_skb() - either via > > > page_frag or via kmalloc() - and than call __napi_build_skb(). > > >=20 > > > I think that should avoid adding more checks in __alloc_skb() and > > > should probably reduce the number of conditional used > > > by __napi_alloc_skb(). > >=20 > > I thought of this too. But this will introduce conditional branch > > to set or not skb->head_frag. So one branch less in __alloc_skb(), > > one branch more here, and we also lose the ability to __alloc_skb() > > with decached head. >=20 > Just to try to be clear, I mean something alike the following (not even > build tested). In the fast path it has less branches than the current > code - for both kmalloc and page_frag allocation. >=20 > --- > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 785daff48030..a242fbe4730e 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -506,23 +506,12 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct= *napi, unsigned int len, > =09=09=09=09 gfp_t gfp_mask) > { > =09struct napi_alloc_cache *nc; > +=09bool head_frag, pfmemalloc; > =09struct sk_buff *skb; > =09void *data; > =20 > =09len +=3D NET_SKB_PAD + NET_IP_ALIGN; > =20 > -=09/* If requested length is either too small or too big, > -=09 * we use kmalloc() for skb->head allocation. > -=09 */ > -=09if (len <=3D SKB_WITH_OVERHEAD(1024) || > -=09 len > SKB_WITH_OVERHEAD(PAGE_SIZE) || > -=09 (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { > -=09=09skb =3D __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); > -=09=09if (!skb) > -=09=09=09goto skb_fail; > -=09=09goto skb_success; > -=09} > - > =09nc =3D this_cpu_ptr(&napi_alloc_cache); > =09len +=3D SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > =09len =3D SKB_DATA_ALIGN(len); > @@ -530,25 +519,34 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct= *napi, unsigned int len, > =09if (sk_memalloc_socks()) > =09=09gfp_mask |=3D __GFP_MEMALLOC; > =20 > -=09data =3D page_frag_alloc(&nc->page, len, gfp_mask); > +=09if (len <=3D SKB_WITH_OVERHEAD(1024) || > + len > SKB_WITH_OVERHEAD(PAGE_SIZE) || > + (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { > +=09=09data =3D kmalloc_reserve(len, gfp_mask, NUMA_NO_NODE, &pfmemalloc)= ; > +=09=09head_frag =3D 0; > +=09=09len =3D 0; > +=09} else { > +=09=09data =3D page_frag_alloc(&nc->page, len, gfp_mask); > +=09=09pfmemalloc =3D nc->page.pfmemalloc; > +=09=09head_frag =3D 1; > +=09} > =09if (unlikely(!data)) > =09=09return NULL; Sure. I have a separate WIP series that reworks all three *alloc_skb() functions, as there's a nice room for optimization, especially after that tiny skbs now fall back to __alloc_skb(). It will likely hit mailing lists after the merge window and next net-next season, not now. And it's not really connected with NAPI cache reusing. > =09skb =3D __build_skb(data, len); > =09if (unlikely(!skb)) { > -=09=09skb_free_frag(data); > +=09=09if (head_frag) > +=09=09=09skb_free_frag(data); > +=09=09else > +=09=09=09kfree(data); > =09=09return NULL; > =09} > =20 > -=09if (nc->page.pfmemalloc) > -=09=09skb->pfmemalloc =3D 1; > -=09skb->head_frag =3D 1; > +=09skb->pfmemalloc =3D pfmemalloc; > +=09skb->head_frag =3D head_frag; > =20 > -skb_success: > =09skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); > =09skb->dev =3D napi->dev; > - > -skb_fail: > =09return skb; > } > EXPORT_SYMBOL(__napi_alloc_skb); Al