Received: by 2002:a05:7412:b995:b0:f9:9502:5bb8 with SMTP id it21csp7879860rdb; Thu, 4 Jan 2024 10:25:11 -0800 (PST) X-Google-Smtp-Source: AGHT+IHe/PjK5x0saRDXANF3h24fWpi7HyDIis8AnKdMN8wXH5jQmEapMtsAMXl6EsaQjIzbVCTl X-Received: by 2002:a17:903:453:b0:1d4:d350:7932 with SMTP id iw19-20020a170903045300b001d4d3507932mr834146plb.75.1704392711485; Thu, 04 Jan 2024 10:25:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1704392711; cv=none; d=google.com; s=arc-20160816; b=jN1PubP2CpqkB4aih5sYo1fUiU7ABjRZVdAu7KlwimXhLogQnX3yfHebP8zh5V/HQM +Wp+txCGzXz+4haICaL8LSbK0fMqBsVDOJMhBwjoNBBczTLygMAneNkisuN2pQECo8pa TgXsRgOIv9+t1Em0TXXC3EWNzihXMMF6iEHji1l12x3HQvvpdop29oIrH6FEacp/sXWb T8kW3bzSJAXqieG5elIpHa0864exz5BL4JzRQvAebOdxWBE3Pnrn+m/lxAquH0qaKw9a Tt0oDyVnoYbgvYTn5pEEaGhYoFNwK0Tv2YfMJQ3vabJFtgw7vrXAq7iZs4n+XhYMHWdr JtcQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=w906pRFrrBglgFN05sIbeh3rO+uVc2/5xWCV3xYhIB8=; fh=h0MJ5j3WihNNW6X4RNzDYMB4z6dutAQcQnJtFLy5kKg=; b=UcfHGjMSvKq8hFH8y2qsBDYdkjn+tB4H+0yymMmPeoL7Xpa83zUWoyJTm288qsHJxR BT9SZrupI0eOc10xEZLnPtL9AXLjx2l5Q9jWpa4nCY+knNNc/U0WrMmXTXJPJJT3zH7j 5pL5TG9+yhW2S1ECFg8ODU8xUQFxq5VN3+uapsYlSzlvs1vN11LnqDop8ISDzBdgz236 JU8f5PyrduP5GKx6b4cuqquAy4UdpQNrgVbSnYUtaQyHlaO/CSqaxP+EePigZiYgbpZZ DdI06Aj+72Dhg6ApbKSVhoi9/WQrUM6KdwTO0UwA2jkBkkghRlEQ+YNBaUZY1QMa9DG/ 4zGQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=nxGHI1kG; spf=pass (google.com: domain of linux-kernel+bounces-17089-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-17089-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id j17-20020a170902da9100b001d3714f17easi24348462plx.37.2024.01.04.10.25.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jan 2024 10:25:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-17089-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=nxGHI1kG; spf=pass (google.com: domain of linux-kernel+bounces-17089-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-17089-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 3CD2FB24573 for ; Thu, 4 Jan 2024 18:24:57 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 6E8F428DB0; Thu, 4 Jan 2024 18:24:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nxGHI1kG" X-Original-To: linux-kernel@vger.kernel.org Received: from mail-ed1-f47.google.com (mail-ed1-f47.google.com [209.85.208.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14B9C28DD8 for ; Thu, 4 Jan 2024 18:24:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=google.com Received: by mail-ed1-f47.google.com with SMTP id 4fb4d7f45d1cf-55559e26ccfso1022656a12.3 for ; Thu, 04 Jan 2024 10:24:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1704392681; x=1704997481; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=w906pRFrrBglgFN05sIbeh3rO+uVc2/5xWCV3xYhIB8=; b=nxGHI1kGpJdyecUGvE6QDOGs3aN2TRKl0NThtScBMBlo9F3Y8IYqT2pfYzqvBtNxNV oFn8YMnzMINKa6MeSpfn3DCzx2oUBHsBhv722hk2wCqAd+QglEhZbeZiPoyo87dxkIh4 qseus14cA1agPIE2PndGD7LxTD8gKHF6vFnl1h1QTrpTR65C1O6UZfYBmJ3wdTdtqmD2 M17gR2fLHa6nachcQ9vbg/cLWwsL224Dv+j41gAlS+hI3PoF0P4d8uJ+cdkVTkAcWDnD EMiCjpqMfp6FggwYlSJtAupy3qFWiLtG0YP+B2fWoV9STRPnLUWmS6Dcxkkjx5xXmkhy gzEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704392681; x=1704997481; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=w906pRFrrBglgFN05sIbeh3rO+uVc2/5xWCV3xYhIB8=; b=SlWqbQUKmlKqmunnS5ywmqRt2yEyGO+jpsRoWTD+KEe9Waot8D9Ln6JTFyHhBFkuGa 7e9cTCdOtHfqzVs3M1wm8pRm1UAfY9VN11Tx4hDbv332g7jSti9srZscGxieoT0+TWGc Wus5mq5By4v2O8pBvhTXsPXPMsmKIf1X+uC0wjQE5N31UYSPWaV7dB3byFOdhVWgSLo0 pXg9wR2bX8VL6flbLm5Z6eryCF9rHQzhaFGik+3ccwTLjubjcKTGCxeSLq9FaowZDD/m toFhvalglwLyekKG0gCa/J/GstiSVog+qHh4Ld4FCUjuNWT96kX3b1CvqOzxtgQJZGiN qf2g== X-Gm-Message-State: AOJu0YxIb3kHiGTuag4xi5gf/hPWGZ04G8JazTHLKNmxwxkqYfBmH9ZT H7h7ETKTS999/VwjHh3h8coAjq9/cBfn4X3KOBNmOAnG8egg X-Received: by 2002:a17:906:e0d8:b0:a27:6e73:a248 with SMTP id gl24-20020a170906e0d800b00a276e73a248mr344849ejb.68.1704392681125; Thu, 04 Jan 2024 10:24:41 -0800 (PST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20231214020530.2267499-1-almasrymina@google.com> <20231214020530.2267499-5-almasrymina@google.com> <20231215021114.ipvdx2bwtxckrfdg@google.com> <20231215190126.1040fa12@kernel.org> <54f226ef-df2d-9f32-fa3f-e846d6510758@huawei.com> <7c6d35e3-165f-5883-1c1b-fce82c976028@huawei.com> <99817ed2-8ba6-ef8f-3ccb-2a2ab284b4af@huawei.com> In-Reply-To: <99817ed2-8ba6-ef8f-3ccb-2a2ab284b4af@huawei.com> From: Mina Almasry Date: Thu, 4 Jan 2024 10:24:29 -0800 Message-ID: Subject: Re: [RFC PATCH net-next v1 4/4] net: page_pool: use netmem_t instead of struct page in API To: Yunsheng Lin Cc: Shakeel Butt , Jakub Kicinski , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Greg Kroah-Hartman , "Rafael J. Wysocki" , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Michael Chan , "David S. Miller" , Eric Dumazet , Paolo Abeni , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Wei Fang , Shenwei Wang , Clark Wang , NXP Linux Team , Jeroen de Borst , Praveen Kaligineedi , Shailend Chand , Yisen Zhuang , Salil Mehta , Jesse Brandeburg , Tony Nguyen , Thomas Petazzoni , Marcin Wojtas , Russell King , Sunil Goutham , Geetha sowjanya , Subbaraya Sundeep , hariprasad , Felix Fietkau , John Crispin , Sean Wang , Mark Lee , Lorenzo Bianconi , Matthias Brugger , AngeloGioacchino Del Regno , Saeed Mahameed , Leon Romanovsky , Horatiu Vultur , UNGLinuxDriver@microchip.com, "K. Y. Srinivasan" , Haiyang Zhang , Wei Liu , Dexuan Cui , Jassi Brar , Ilias Apalodimas , Alexandre Torgue , Jose Abreu , Maxime Coquelin , Siddharth Vadapalli , Ravi Gunasekaran , Roger Quadros , Jiawen Wu , Mengyuan Lou , Ronak Doshi , VMware PV-Drivers Reviewers , Ryder Lee , Shayne Chen , Kalle Valo , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Stefan Hajnoczi , Stefano Garzarella , Shuah Khan , =?UTF-8?B?TWlja2HDq2wgU2FsYcO8bg==?= , Nathan Chancellor , Nick Desaulniers , Bill Wendling , Justin Stitt , Jason Gunthorpe , Willem de Bruijn Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Jan 4, 2024 at 12:48=E2=80=AFAM Yunsheng Lin wrote: > > On 2024/1/4 2:38, Mina Almasry wrote: > > On Wed, Jan 3, 2024 at 1:47=E2=80=AFAM Yunsheng Lin wrote: > >> > >> On 2024/1/3 0:14, Mina Almasry wrote: > >>> > >>> The idea being that skb_frag_page() can return NULL if the frag is no= t > >>> paged, and the relevant callers are modified to handle that. > >> > >> There are many existing drivers which are not expecting NULL returning= for > >> skb_frag_page() as those drivers are not supporting devmem, adding add= itionl > >> checking overhead in skb_frag_page() for those drivers does not make m= uch > >> sense, IMHO, it may make more sense to introduce a new helper for the = driver > >> supporting devmem or networking core that needing dealing with both no= rmal > >> page and devmem. > >> > >> And we are also able to keep the old non-NULL returning semantic for > >> skb_frag_page(). > > > > I think I'm seeing agreement that the direction we're heading into > > here is that most net stack & drivers should use the abstract netmem > > As far as I see, at least for the drivers, I don't think we have a clear > agreement if we should have a unified driver facing struct or API for bot= h > normal page and devmem yet. > To be honest I definitely read that we have agreement that we should have a unified driver facing struct from the responses in this thread like this one: https://lore.kernel.org/netdev/20231215190126.1040fa12@kernel.org/ But I'll let folks correct me if I'm wrong. > > type, and only specific code that needs a page or devmem (like > > tcp_receive_zerocopy or tcp_recvmsg_dmabuf) will be the ones that > > unpack the netmem and get the underlying page or devmem, using > > skb_frag_page() or something like skb_frag_dmabuf(), etc. > > > > As Jason says repeatedly, I'm not allowed to blindly cast a netmem to > > a page and assume netmem=3D=3Dpage. Netmem can only be cast to a page > > after checking the low bits and verifying the netmem is actually a > > I thought it would be best to avoid casting a netmem or devmem to a > page in the driver, I think the main argument is that it is hard > to audit very single driver doing a checking before doing the casting > in the future? and we can do better auditting if the casting is limited > to a few core functions in the networking core. > Correct, the drivers should never cast directly, but helpers like skb_frag_page() must check that the netmem is a page before doing a cast. > > page. I think any suggestions that blindly cast a netmem to page > > without the checks will get nacked by Jason & Christian, so the > > checking in the specific cases where the code needs to know the > > underlying memory type seems necessary. > > > > IMO I'm not sure the checking is expensive. With likely/unlikely & > > static branches the checks should be very minimal or a straight no-op. > > For example in RFC v2 where we were doing a lot of checks for devmem > > (we don't do that anymore for RFCv5), I had run the page_pool perf > > tests and proved there is little to no perf regression: > > For MAX_SKB_FRAGS being 17, it means we may have 17 additional checking > overhead for the drivers not supporting devmem, not to mention we may > have bigger value for MAX_SKB_FRAGS if BIG TCP is enable. > With static branch the checks should be complete no-ops unless the user's set up enabled devmem. > Even there is no notiable performance degradation for a specific case, > we should avoid the overhead as much as possible for the existing use > case when supporting a new use case. > > > > > https://lore.kernel.org/netdev/CAHS8izM4w2UETAwfnV7w+ZzTMxLkz+FKO+xTgRd= tYKzV8RzqXw@mail.gmail.com/ > > The above test case does not even seems to be testing a code path calling > skb_frag_page() as my understanding. > > > --=20 Thanks, Mina