From: Liang Chen
Date: Fri, 16 Jun 2023 11:07:12 +0800
Subject: Re: [PATCH net-next] page pool: not return page to alloc cache during pool destruction
To: Jesper Dangaard Brouer
Cc: Jakub Kicinski, brouer@redhat.com, hawk@kernel.org, ilias.apalodimas@linaro.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, davem@davemloft.net, edumazet@google.com, pabeni@redhat.com
References: <20230615013645.7297-1-liangchen.linux@gmail.com> <20230614212031.7e1b6893@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 15, 2023 at 10:00 PM Jesper Dangaard Brouer wrote:
>
>
> On 15/06/2023 06.20, Jakub Kicinski wrote:
> > On Thu, 15 Jun 2023 09:36:45 +0800 Liang Chen wrote:
> >> When destroying a page pool, the alloc cache and recycle ring are emptied.
> >> If there are inflight pages, the retry process will periodically check the
> >> recycle ring for recently returned pages, but not the alloc cache (the alloc
> >> cache is only emptied once). As a result, any pages returned to the alloc
> >> cache after the page pool destruction will be stuck there and cause the
> >> retry process to continuously look for inflight pages and report warnings.
> >>
> >> To safeguard against this situation, pages should be prevented from
> >> returning to the alloc cache after pool destruction.
> >
> > Let's hear from the page pool maintainers, but I think the driver
> > is supposed to prevent allocations while the pool is getting destroyed.
>
> Yes, this is a driver API violation. Direct returns (allow_direct) can
> only happen from the driver's RX path, e.g. while the driver is actively
> processing packets (in NAPI). When a driver is shutting down a page_pool,
> it MUST have stopped the RX path and NAPI (napi_disable()) before calling
> page_pool_destroy(). Thus, this situation cannot happen, and if it does,
> it is a driver bug.
>
> > Perhaps we can add DEBUG_NET_WARN_ON_ONCE() for this condition to
> > prevent wasting cycles in production builds?
>
> For this page_pool code path ("allow_direct") it is extremely important
> we avoid wasting cycles in production, as this is used for XDP_DROP
> use-cases on 100Gbit/s NICs.
>
> At 100Gbit/s with 64-byte Ethernet frames (84 bytes on the wire), the
> wirespeed is 148.8Mpps, which gives the CPU 6.72 nanoseconds to process
> each packet. The microbench[1] shows (below signature) that
> page_pool_alloc_pages() + page_pool_recycle_direct() cost 4.041 ns
> (or 14 cycles(tsc)). Thus, for this fast path every cycle counts.
>
> In practice, PCIe transactions/sec seem to limit the total system to
> 108Mpps (with multiple RX queues + descriptor compression), thus 9.26
> nanoseconds to process each packet. Individual hardware RX queues seem
> to be limited to around 36Mpps, thus 27.77 nanoseconds to process each
> packet.
>
> Adding a DEBUG_NET_WARN_ON_ONCE will be annoying, as I like to run my
> testlab kernels with CONFIG_DEBUG_NET, which would change this extreme
> fast path slightly (adding some unlikely's affecting code layout to the
> mix).
>
> Question to Liang Chen: Did you hit this bug in practice?
>
> --Jesper
>

Yeah, we hit this problem while implementing page pool support for the
virtio_net driver, where we only enable the page pool for the XDP path,
i.e. we turn the page pool on/off when XDP is enabled/disabled. The
problem turns up when the XDP program is uninstalled while there are
still inflight page pool buffers. When NAPI is enabled again, the driver
starts to process those inflight page pool buffers, so we need to be
aware of the state of the page pool (whether it is being destroyed)
while returning pages to it. That's what motivated us to add this check
to __page_pool_put_page.

Thanks,
Liang
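To make the failure mode easier to see, here is a small standalone C model
of the behaviour discussed above. This is not the kernel page_pool code;
the names (struct pool, destroying, pool_put) are invented for illustration
only. It sketches the idea that once destruction has started, a "direct"
return has to bypass the per-NAPI alloc cache and go to the ring that the
destroy/retry logic actually drains:

/*
 * Standalone model only -- NOT the kernel page_pool code.  All names
 * (struct pool, destroying, pool_put) are invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

#define CACHE_SIZE 8

struct pool {
	void *alloc_cache[CACHE_SIZE];	/* lockless, NAPI-only fast cache */
	int alloc_count;
	void *ring[CACHE_SIZE];		/* stand-in for the recycle ptr_ring */
	int ring_count;
	bool destroying;		/* set once destruction has started */
};

/* Return a buffer to the pool; allow_direct mimics the NAPI fast path. */
static void pool_put(struct pool *p, void *page, bool allow_direct)
{
	if (allow_direct && !p->destroying && p->alloc_count < CACHE_SIZE) {
		p->alloc_cache[p->alloc_count++] = page;	/* fast path */
		return;
	}
	/* Slow path: the ring is what the destroy/retry logic drains. */
	p->ring[p->ring_count++] = page;
}

int main(void)
{
	static struct pool p;
	int page_a, page_b;

	pool_put(&p, &page_a, true);	/* before destroy: lands in the cache */
	p.destroying = true;		/* pool destruction begins */
	pool_put(&p, &page_b, true);	/* after destroy: falls back to the ring */

	printf("cache=%d ring=%d\n", p.alloc_count, p.ring_count);
	return 0;
}

Without the !p->destroying check, the second return would sit in the cache
indefinitely, and retry logic that only looks at the ring would keep
reporting the page as inflight.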
> CPU E5-1650 v4 @ 3.60GHz
> tasklet_page_pool01_fast_path    Per elem: 14 cycles(tsc) 4.041 ns
> tasklet_page_pool02_ptr_ring     Per elem: 49 cycles(tsc) 13.622 ns
> tasklet_page_pool03_slow         Per elem: 162 cycles(tsc) 45.198 ns
>
> [1]
> https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
>
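For reference, the per-packet budget figures Jesper quotes above follow
from simple arithmetic; the small standalone program below (not part of
the thread or of any kernel code) reproduces them:

#include <stdio.h>

static void budget(const char *label, double pps)
{
	printf("%-30s %7.1f Mpps -> %6.2f ns per packet\n",
	       label, pps / 1e6, 1e9 / pps);
}

int main(void)
{
	/* A 64-byte frame occupies 84 bytes of wire time:
	 * 64B frame + 8B preamble/SFD + 12B inter-frame gap. */
	double line_rate_pps = 100e9 / (84 * 8);

	budget("100Gbit/s, 64-byte frames", line_rate_pps);	/* ~148.8 Mpps, 6.72 ns */
	budget("PCIe-limited system", 108e6);			/* ~9.26 ns */
	budget("single hardware RX queue", 36e6);		/* ~27.78 ns */
	return 0;
}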