Message-ID: <9523677f696a6376c79d32cbec7d6e7ceb1b0500.camel@gmail.com>
Subject: Re: [PATCH net-next v3 03/12] iavf: optimize Rx buffer allocation a bunch
From: Alexander H Duyck
To: Alexander Lobakin, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni
Cc: Maciej Fijalkowski, Magnus Karlsson, Michal Kubiak, Larysa Zaremba,
 Jesper Dangaard Brouer, Ilias Apalodimas, Christoph Hellwig, Paul Menzel,
 netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
 linux-kernel@vger.kernel.org
Date: Wed, 31 May 2023 08:37:56 -0700
In-Reply-To: <20230530150035.1943669-4-aleksander.lobakin@intel.com>
References: <20230530150035.1943669-1-aleksander.lobakin@intel.com>
 <20230530150035.1943669-4-aleksander.lobakin@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2023-05-30 at 17:00 +0200, Alexander Lobakin wrote:
> The Rx hotpath code of IAVF is not well-optimized TBH. Before doing any
> further buffer model changes, shake it up a bit. Notably:
>
> 1. Cache more variables on the stack.
>    DMA device, Rx page size, NTC -- these are the most common things
>    used all throughout the hotpath, often in loops on each iteration.
>    Instead of fetching (or even calculating, as with the page size) them
>    from the ring all the time, cache them on the stack at the beginning
>    of the NAPI polling callback. NTC will be written back at the end,
>    the rest are used read-only, so no sync needed.
> 2. Don't move the recycled buffers around the ring.
>    The idea of passing the page of the right-now-recycled-buffer to a
>    different buffer, in this case, the first one that needs to be
>    allocated, moreover, on each new frame, is fundamentally wrong.
>    It involves a few o' fetches, branches and then writes (and one Rx
>    buffer struct is at least 32 bytes) where they're completely unneeded,
>    but does no good -- the result is the same as if we'd recycle it
>    in place, at the same position where it was used. So drop this and let
>    the main refilling function take care of all the buffers, which were
>    processed and now need to be recycled/refilled.
> 3. Don't allocate with %GFP_ATOMIC on ifup.
>    This involved introducing the @gfp parameter to a couple functions.
>    Doesn't change anything for Rx -> softirq.
> 4. 1 budget unit == 1 descriptor, not skb.
>    There could be underflow when receiving a lot of fragmented frames.
>    If each of them would consist of 2 frags, it means that we'd process
>    64 descriptors at the point where we pass the 32nd skb to the stack.
>    But the driver would count that only as a half, which could make NAPI
>    re-enable interrupts prematurely and create unnecessary CPU load.
> 5. Shortcut !size case.
>    It's super rare, but possible -- for example, if the last buffer of
>    the fragmented frame contained only FCS, which was then stripped by
>    the HW. Instead of checking for size several times when processing,
>    quickly reuse the buffer and jump to the skb fields part.
> 6. Refill the ring after finishing the polling loop.
>    Previously, the loop wasn't starting a new iteration after the 64th
>    desc, meaning that we were always leaving 16 buffers non-refilled
>    until the next NAPI poll. It's better to refill them while they're
>    still hot, so do that right after exiting the loop as well.
>    For a full cycle of 64 descs, there will be 4 refills of 16 descs
>    from now on.
>
> Function: add/remove: 4/2 grow/shrink: 0/5 up/down: 473/-647 (-174)
>
> + up to 2% performance.
>
> Signed-off-by: Alexander Lobakin

What is the workload that is showing the performance improvement?
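For anyone following along, the arithmetic in items 4 and 6 of the quoted
description can be modelled outside the driver. This is only a toy sketch in
plain C, not iavf code: budget_used(), poll_model(), struct poll_result and
REFILL_BATCH are hypothetical names, the batch size of 16 is taken from the
description, and the model counts its budget in descriptors.

```c
#include <stdbool.h>

/* Hypothetical refill batch size, taken from the "16 descs" in the quoted text. */
#define REFILL_BATCH 16u

struct poll_result {
	unsigned int refills;         /* number of refill calls made */
	unsigned int left_unrefilled; /* buffers still empty when the poll ends */
};

/*
 * Item 4: budget units reported after processing 'descs' descriptors when
 * every frame spans 'frags_per_frame' descriptors. Counting per skb reports
 * only descs / frags_per_frame; counting per descriptor reports descs.
 */
static unsigned int budget_used(unsigned int descs,
				unsigned int frags_per_frame,
				bool count_descriptors)
{
	if (count_descriptors)
		return descs;

	return frags_per_frame ? descs / frags_per_frame : 0;
}

/*
 * Item 6: a polling loop that refills only at the top of an iteration, with
 * an optional extra refill once the loop has exited.
 */
static struct poll_result poll_model(unsigned int budget, bool refill_after_loop)
{
	struct poll_result res = { 0, 0 };
	unsigned int cleaned = 0; /* buffers consumed but not yet refilled */
	unsigned int done = 0;    /* descriptors processed so far */

	while (done < budget) {
		if (cleaned >= REFILL_BATCH) {
			res.refills++;
			cleaned = 0;
		}
		cleaned++;
		done++;
	}

	/* Without this step the last batch waits for the next poll. */
	if (refill_after_loop && cleaned >= REFILL_BATCH) {
		res.refills++;
		cleaned = 0;
	}

	res.left_unrefilled = cleaned;
	return res;
}
```

In this model, poll_model(64, false) performs 3 refills and leaves 16 buffers
empty, while poll_model(64, true) performs 4 and leaves none. Likewise,
budget_used(64, 2, false) reports 32 units where budget_used(64, 2, true)
reports 64.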
<...>

> @@ -1350,14 +1297,6 @@ static bool iavf_is_non_eop(struct iavf_ring *rx_ring,
>  			    union iavf_rx_desc *rx_desc,
>  			    struct sk_buff *skb)

I am pretty sure the skb pointer here is an unused variable. We needed
it for ixgbe to support RSC. I don't think you have any code that uses
it in this function and I know we removed the variable for i40e, see
commit d06e2f05b4f18 ("i40e: adjust i40e_is_non_eop").
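
To illustrate the shape I have in mind, an end-of-packet check of this kind
can get by with just the ring and the descriptor. The sketch below is a toy
model, not the iavf or i40e code: the toy_* types, the TOY_RXD_EOP bit and
the non_eop_descs counter are all made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit marking the last descriptor of a frame. */
#define TOY_RXD_EOP (1u << 1)

struct toy_rx_desc {
	uint32_t status;        /* assumed layout: status bits only */
};

struct toy_ring {
	uint64_t non_eop_descs; /* assumed per-ring statistic */
};

/*
 * Returns true when more descriptors belong to the current frame.
 * Note that no skb pointer is needed to make that decision.
 */
static bool toy_is_non_eop(struct toy_ring *ring,
			   const struct toy_rx_desc *desc)
{
	/* Last buffer of the frame: nothing else to do. */
	if (desc->status & TOY_RXD_EOP)
		return false;

	ring->non_eop_descs++;
	return true;
}
```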