From: Kal Cutter Conley <kal.conley@dectris.com>
Date: Tue, 25 Apr 2023 00:52:53 +0200
Subject: Re: [PATCH] xsk: Use pool->dma_pages to check for DMA
To: John Fastabend
Cc: Björn Töpel, Magnus Karlsson, Maciej Fijalkowski, Jonathan Lemon,
 "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20230423075335.92597-1-kal.conley@dectris.com> <6446d34f9568_338f220872@john.notmuch>
In-Reply-To: <6446d34f9568_338f220872@john.notmuch>
X-Mailing-List: linux-kernel@vger.kernel.org

> > Compare pool->dma_pages instead of pool->dma_pages_cnt to check for an
> > active DMA mapping.
> > pool->dma_pages needs to be read anyway to access
> > the map so this compiles to more efficient code.
>
> Was it noticable in some sort of performance test?

This patch is part of the patchset found at
https://lore.kernel.org/all/20230412162114.19389-3-kal.conley@dectris.com/
which is being actively discussed and needs to be resubmitted anyway
because of a conflict. While the discussion continues, I am submitting
this patch by itself because I think it's an improvement on its own
(regardless of what happens with the rest of the linked patchset).

On one system, I measured a performance regression of 2-3% with xdpsock
and the linked changes without the current patch. With the current
patch, the performance regression was no longer observed.

> > diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> > index d318c769b445..a8d7b8a3688a 100644
> > --- a/include/net/xsk_buff_pool.h
> > +++ b/include/net/xsk_buff_pool.h
> > @@ -180,7 +180,7 @@ static inline bool xp_desc_crosses_non_contig_pg(struct xsk_buff_pool *pool,
> >  	if (likely(!cross_pg))
> >  		return false;
> >
> > -	return pool->dma_pages_cnt &&
> > +	return pool->dma_pages &&
> >  	       !(pool->dma_pages[addr >> PAGE_SHIFT] & XSK_NEXT_PG_CONTIG_MASK);
> >  }

I would consider the above code part of the "fast path". It may be
executed approximately once per frame in unaligned mode.

> This seems to be used in the setup/tear-down paths so your optimizing
> a control side. Is there a fast path with this code? I walked the
> ice driver. If its just setup code we should do whatever is more
> readable.

It is not only used in setup/tear-down paths (see above). Additionally,
I believe the code is also _more_ readable with this patch applied. In
particular, this patch reduces cognitive complexity since people (and
compilers) reading the code don't need to additionally think about
pool->dma_pages_cnt.

> Both the _alloc_ cases read neighboring free_heads_cnt so your saving
> a load I guess?
> This is so deep into micro-optimizing I'm curious if you could measure it?

It is saving a load which also reduces code size. This will affect
other decisions such as what to inline. Also, in the linked patchset,
dma_pages and dma_pages_cnt do not share a cache line (on x86_64).

> >  	} else {
> >  		xskb = &pool->heads[xp_aligned_extract_idx(pool, addr)];
>
> I'm not actually against optimizing but maybe another idea. Why do we have to
> check at all? Seems if the DMA has been disabled/unmapped the driver shouldn't
> be trying to call xsk_buff_alloc_batch? Then you can just drop the 'if' check.
>
> It feels to me the drivers shouldn't even be calling this after unmapping
> the dma. WDYT?

Many of these code paths are used both for ZC and copy modes. You might
be right that this particular case is only used with DMA.