Subject: Re: [PATCH v4 net-next 09/11] skbuff: allow to optionally use NAPI cache from __alloc_skb()
From: Paolo Abeni
To: Alexander Lobakin
Cc: "David S. Miller", Jakub Kicinski, Jonathan Lemon, Eric Dumazet,
    Dmitry Vyukov, Willem de Bruijn, Randy Dunlap, Kevin Hao,
    Pablo Neira Ayuso, Jakub Sitnicki, Marco Elver, Dexuan Cui,
    Jesper Dangaard Brouer, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Taehee Yoo, Cong Wang, Björn Töpel, Miaohe Lin,
    Guillaume Nault, Yonghong Song, zhudi, Michal Kubecek,
    Marcelo Ricardo Leitner, Dmitry Safonov <0x7f454c46@gmail.com>,
    Yang Yingliang, Florian Westphal, Edward Cree,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Date: Thu, 11 Feb 2021 15:55:04 +0100
In-Reply-To: <20210211142811.1813-1-alobakin@pm.me>
References: <20210210162732.80467-1-alobakin@pm.me>
    <20210210162732.80467-10-alobakin@pm.me>
    <58147c2d36ea7b6e0284d400229cd79185c53463.camel@redhat.com>
    <20210211142811.1813-1-alobakin@pm.me>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2021-02-11 at 14:28 +0000, Alexander Lobakin wrote:
> From: Paolo Abeni on Thu, 11 Feb 2021 11:16:40 +0100 wrote:
>
> > What about changing __napi_alloc_skb() to always use
> > the __napi_build_skb(), for both kmalloc and page backed skbs? That is,
> > always doing the 'data' allocation in __napi_alloc_skb() - either via
> > page_frag or via kmalloc() - and then call __napi_build_skb().
> >
> > I think that should avoid adding more checks in __alloc_skb() and
> > should probably reduce the number of conditionals used
> > by __napi_alloc_skb().
>
> I thought of this too. But this will introduce conditional branch
> to set or not skb->head_frag. So one branch less in __alloc_skb(),
> one branch more here, and we also lose the ability to __alloc_skb()
> with decached head.

Just to try to be clear, I mean something like the following (not even
build tested). In the fast path it has fewer branches than the current
code - for both kmalloc and page_frag allocation.

---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 785daff48030..a242fbe4730e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -506,23 +506,12 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 				 gfp_t gfp_mask)
 {
 	struct napi_alloc_cache *nc;
+	bool head_frag, pfmemalloc;
 	struct sk_buff *skb;
 	void *data;
 
 	len += NET_SKB_PAD + NET_IP_ALIGN;
 
-	/* If requested length is either too small or too big,
-	 * we use kmalloc() for skb->head allocation.
-	 */
-	if (len <= SKB_WITH_OVERHEAD(1024) ||
-	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
-	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
-		skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
-		if (!skb)
-			goto skb_fail;
-		goto skb_success;
-	}
-
 	nc = this_cpu_ptr(&napi_alloc_cache);
 	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	len = SKB_DATA_ALIGN(len);
@@ -530,25 +519,34 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
 	if (sk_memalloc_socks())
 		gfp_mask |= __GFP_MEMALLOC;
 
-	data = page_frag_alloc(&nc->page, len, gfp_mask);
+	if (len <= SKB_WITH_OVERHEAD(1024) ||
+	    len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
+	    (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
+		data = kmalloc_reserve(len, gfp_mask, NUMA_NO_NODE, &pfmemalloc);
+		head_frag = 0;
+		len = 0;
+	} else {
+		data = page_frag_alloc(&nc->page, len, gfp_mask);
+		pfmemalloc = nc->page.pfmemalloc;
+		head_frag = 1;
+	}
 	if (unlikely(!data))
 		return NULL;
 
 	skb = __build_skb(data, len);
 	if (unlikely(!skb)) {
-		skb_free_frag(data);
+		if (head_frag)
+			skb_free_frag(data);
+		else
+			kfree(data);
 		return NULL;
 	}
 
-	if (nc->page.pfmemalloc)
-		skb->pfmemalloc = 1;
-	skb->head_frag = 1;
+	skb->pfmemalloc = pfmemalloc;
+	skb->head_frag = head_frag;
 
-skb_success:
 	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
 	skb->dev = napi->dev;
-
-skb_fail:
 	return skb;
 }
 EXPORT_SYMBOL(__napi_alloc_skb);
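
For readers who want to try the control flow outside the kernel, here is a rough user-space sketch of the idea in the diff: a single allocation path where one branch picks the backing store and records the choice in a flag, and the release path is keyed on that flag. It is only an illustration under stated assumptions, not kernel code. All toy_* names, the TOY_* limits, and the bump-allocator arena are hypothetical stand-ins for kmalloc_reserve(), page_frag_alloc(), and the SKB_WITH_OVERHEAD() bounds, and the gfp_mask condition is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounds standing in for SKB_WITH_OVERHEAD(1024) and
 * SKB_WITH_OVERHEAD(PAGE_SIZE); the real values depend on the kernel build.
 */
#define TOY_SMALL_LIMIT	512u
#define TOY_LARGE_LIMIT	3776u

/* Toy fragment pool: a bump allocator standing in for page_frag_alloc(). */
static unsigned char toy_arena[16384];
static size_t toy_arena_off;

struct toy_buf {
	void *head;
	bool head_frag;	/* true when head came from the fragment pool */
};

static void *toy_frag_alloc(size_t len)
{
	void *p;

	if (len > sizeof(toy_arena) - toy_arena_off)
		return NULL;
	p = toy_arena + toy_arena_off;
	toy_arena_off += len;
	return p;
}

/* Single allocation path: one branch picks the backing store and records
 * the choice, so later code needs no second size check.
 */
static bool toy_buf_alloc(struct toy_buf *b, size_t len)
{
	assert(b != NULL);

	if (len <= TOY_SMALL_LIMIT || len > TOY_LARGE_LIMIT) {
		b->head = malloc(len);		/* "kmalloc" side */
		b->head_frag = false;
	} else {
		b->head = toy_frag_alloc(len);	/* "page_frag" side */
		b->head_frag = true;
	}
	return b->head != NULL;
}

/* Release path keyed on the recorded flag, like the error path in the diff. */
static void toy_buf_free(struct toy_buf *b)
{
	assert(b != NULL);

	if (!b->head_frag)
		free(b->head);
	/* Fragment memory is never recycled in this toy model. */
	b->head = NULL;
}
```

Calling toy_buf_alloc() with a mid-sized length (for example 2048) takes the fragment-pool side, while a small or oversized length falls back to malloc(); toy_buf_free() then only needs the stored flag to pick the matching release routine.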