From: Mina Almasry
Date: Fri, 22 Mar 2024 10:54:54 -0700
Subject: Re: [RFC PATCH net-next v6 02/15] net: page_pool: create hooks for custom page providers
To: Christoph Hellwig
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org, bpf@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
 dri-devel@lists.freedesktop.org, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Jonathan Corbet, Richard Henderson,
 Ivan Kokshaysky, Matt Turner, Thomas Bogendoerfer,
 "James E.J. Bottomley", Helge Deller, Andreas Larsson,
 Jesper Dangaard Brouer, Ilias Apalodimas, Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, Arnd Bergmann, Alexei Starovoitov,
 Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, David Ahern, Willem de Bruijn, Shuah Khan,
 Sumit Semwal, Christian König, Pavel Begunkov, David Wei,
 Jason Gunthorpe, Yunsheng Lin, Shailend Chand, Harshitha Ramamurthy,
 Shakeel Butt, Jeroen de Borst, Praveen Kaligineedi
References: <20240305020153.2787423-1-almasrymina@google.com>
 <20240305020153.2787423-3-almasrymina@google.com>

On Sun, Mar 17, 2024 at 7:03 PM Christoph Hellwig wrote:
>
> On Mon, Mar 04, 2024 at 06:01:37PM -0800, Mina Almasry wrote:
> > From: Jakub Kicinski
> >
> > The page providers which try to reuse the same pages will
> > need to hold onto the ref, even if page gets released from
> > the pool - as in releasing the page from the pp just transfers
> > the "ownership" reference from pp to the provider, and provider
> > will wait for other references to be gone before feeding this
> > page back into the pool.
>
> The word hook always rings a giant warning bell for me, and looking into
> this series I am concerned indeed.
>
> The only provider provided here is the dma-buf one, and that basically
> is the only sensible one for the documented design.

Sorry, I don't mean to argue, but as David mentioned, there are plans,
some in the works and some not, to extend this to other memory types.
David mentioned io_uring and Jakub's huge page use cases, which may want
to reuse this design. I have an additional one in mind: extending
devmem TCP to storage devices. Currently storage devices do not support
dmabuf, and my understanding is that adding that support is very hard;
NVMe uses pci_p2pdma instead. I wonder whether devmem TCP could be
extended in the future to support pci_p2pdma, and through it NVMe
devices.

Additionally, I've been thinking about a use case of limiting the amount
of memory the net stack can use. Currently the page pool is free to
allocate as much memory as it wants from the buddy allocator. This may
be undesirable in very low memory setups such as overcommitted VMs. We
can imagine a memory provider that allows allocation only if the
page_pool is below a certain limit. We can also imagine a memory
provider that preallocates memory and only uses that pinned pool.

None of these are in the works at the moment, but they are examples of
how this can be (reasonably?) extended.

> So instead of
> adding hooks that random proprietary crap can hook into,

To be completely honest, I'm unsure how one would design hooks for
proprietary code to hook into. I think that would be done on the basis
of EXPORT_SYMBOL? We do not export these hooks, nor plan to at the
moment.

> why not hard
> code the dma buf provide and just use a flag? That'll also avoid
> expensive indirect calls.
>

Thankfully the indirect calls do not seem to be an issue. We've been
able to hit 95% line rate with devmem TCP, and I think the remaining 5%
is a bottleneck unrelated to the indirect calls.
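
To make the "allocate only below a limit" provider idea above a bit more
concrete, here is a rough userspace sketch. It is not kernel code and
not the API from this series: struct pool, struct provider_ops,
capped_alloc() and the other names are hypothetical, and malloc() merely
stands in for the buddy allocator. It only illustrates the shape: a pool
that serves recycled buffers from its own cache and makes an indirect
call into the provider when the cache is empty.

```c
/* Userspace sketch only: every name here is hypothetical, not the series' API. */
#include <stddef.h>
#include <stdlib.h>

struct pool;

/* Provider hooks: the pool calls these only when its own cache cannot help. */
struct provider_ops {
	void *(*alloc)(struct pool *p);
	void (*release)(struct pool *p, void *buf);
};

struct pool {
	const struct provider_ops *ops;
	void *cache[8];      /* recycled buffers, served without any hook */
	size_t cached;
	size_t outstanding;  /* buffers taken from the provider, not yet released to it */
	size_t limit;        /* cap enforced by the capped provider */
};

/* Capped provider: refuse to allocate once `limit` buffers are outstanding. */
static void *capped_alloc(struct pool *p)
{
	void *buf;

	if (p->outstanding >= p->limit)
		return NULL;
	buf = malloc(4096);  /* stand-in for a page allocation */
	if (buf)
		p->outstanding++;
	return buf;
}

static void capped_release(struct pool *p, void *buf)
{
	free(buf);
	p->outstanding--;
}

static const struct provider_ops capped_ops = {
	.alloc = capped_alloc,
	.release = capped_release,
};

/* Cache hit: no indirect call. Cache miss: one indirect call into the provider. */
static void *pool_get(struct pool *p)
{
	if (p->cached)
		return p->cache[--p->cached];
	return p->ops->alloc(p);
}

/* Recycle into the cache when there is room, else hand back to the provider. */
static void pool_put(struct pool *p, void *buf)
{
	if (p->cached < sizeof(p->cache) / sizeof(p->cache[0]))
		p->cache[p->cached++] = buf;
	else
		p->ops->release(p, buf);
}
```

A caller would initialize the pool with something like
`struct pool p = { .ops = &capped_ops, .limit = 2 };` and then use
pool_get()/pool_put(). Once two buffers are outstanding, pool_get()
returns NULL until one is recycled. A preallocating provider would, under
the same assumptions, be just a different ops table.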

Page_pool benchmarks show a very minor degradation in the fast path, so
small it may be just noise in the measurement (may!):

https://lore.kernel.org/netdev/20240305020153.2787423-1-almasrymina@google.com/T/#m1c308df9665724879947a345c4b1ec3b51ff6856

The degradation is this small because the indirect calls happen only on
the slow allocation path. The page_pool recycles netmem aggressively.

-- 
Thanks,
Mina