Date: Thu, 2 May 2024 16:49:48 +0100
Subject: Re: [PATCH net-next v4 6/7] page_pool: check for DMA sync shortcut earlier
To: Alexander Lobakin, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni
Cc: Christoph Hellwig, Marek Szyprowski, Joerg Roedel, Will Deacon,
Wysocki" , Magnus Karlsson , nex.sw.ncis.osdt.itp.upstreaming@intel.com, bpf@vger.kernel.org, netdev@vger.kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org References: <20240423135832.2271696-1-aleksander.lobakin@intel.com> <20240423135832.2271696-7-aleksander.lobakin@intel.com> <81df4e0f-07f3-40f6-8d71-ffad791ab611@intel.com> From: Robin Murphy Content-Language: en-GB In-Reply-To: <81df4e0f-07f3-40f6-8d71-ffad791ab611@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 24/04/2024 9:52 am, Alexander Lobakin wrote: > From: Alexander Lobakin > Date: Tue, 23 Apr 2024 15:58:31 +0200 > >> We can save a couple more function calls in the Page Pool code if we >> check for dma_need_sync() earlier, just when we test pp->p.dma_sync. >> Move both these checks into an inline wrapper and call the PP wrapper >> over the generic DMA sync function only when both are true. >> You can't cache the result of dma_need_sync() in &page_pool, as it may >> change anytime if an SWIOTLB buffer is allocated or mapped. >> >> Signed-off-by: Alexander Lobakin >> --- >> net/core/page_pool.c | 31 +++++++++++++++++-------------- >> 1 file changed, 17 insertions(+), 14 deletions(-) >> >> diff --git a/net/core/page_pool.c b/net/core/page_pool.c >> index 6cf26a68fa91..87319c6365e0 100644 >> --- a/net/core/page_pool.c >> +++ b/net/core/page_pool.c >> @@ -398,16 +398,24 @@ static struct page *__page_pool_get_cached(struct page_pool *pool) >> return page; >> } >> >> -static void page_pool_dma_sync_for_device(const struct page_pool *pool, >> - const struct page *page, >> - unsigned int dma_sync_size) >> +static void __page_pool_dma_sync_for_device(const struct page_pool *pool, >> + const struct page *page, >> + u32 dma_sync_size) >> { >> dma_addr_t dma_addr = page_pool_get_dma_addr(page); >> >> dma_sync_size = min(dma_sync_size, pool->p.max_len); >> - dma_sync_single_range_for_device(pool->p.dev, dma_addr, >> - pool->p.offset, dma_sync_size, >> - pool->p.dma_dir); >> + __dma_sync_single_for_device(pool->p.dev, dma_addr + pool->p.offset, >> + dma_sync_size, pool->p.dma_dir); > > Breh, this breaks builds on !DMA_NEED_SYNC systems, as this function > isn't defined there, and my CI didn't catch it in time... I'll ifdef > this in the next spin after some reviews for this one. Hmm, the way things have ended up, do we even need this change? (Other than factoring out the pool->dma_sync check, which is clearly nice) Since __page_pool_dma_sync_for_device() is never called directly, the flow we always get is: page_pool_dma_sync_for_device() dma_dev_need_sync() __page_pool_dma_sync_for_device() ... // a handful of trivial arithmetic __dma_sync_single_for_device() i.e. still effectively the same order of "if (dma_dev_need_sync()) __dma_sync()" invocations as if we'd just used the standard dma_sync(), so referencing the unwrapped internals only spreads it out a bit more for no real benefit. As an experiment I tried the diff below on top, effectively undoing this problematic part, and it doesn't seem to make any appreciable difference in a before-and-after comparison of the object code, at least for my arm64 build. Thanks, Robin. 
----->8-----
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 27f3b6db800e..b8ab7d4ca229 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -398,24 +398,20 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 	return page;
 }
 
-static void __page_pool_dma_sync_for_device(const struct page_pool *pool,
-					    const struct page *page,
-					    u32 dma_sync_size)
-{
-	dma_addr_t dma_addr = page_pool_get_dma_addr(page);
-
-	dma_sync_size = min(dma_sync_size, pool->p.max_len);
-	__dma_sync_single_for_device(pool->p.dev, dma_addr + pool->p.offset,
-				     dma_sync_size, pool->p.dma_dir);
-}
-
 static __always_inline void
 page_pool_dma_sync_for_device(const struct page_pool *pool,
 			      const struct page *page,
 			      u32 dma_sync_size)
 {
-	if (pool->dma_sync && dma_dev_need_sync(pool->p.dev))
-		__page_pool_dma_sync_for_device(pool, page, dma_sync_size);
+	dma_addr_t dma_addr = page_pool_get_dma_addr(page);
+
+	if (!pool->dma_sync)
+		return;
+
+	dma_sync_size = min(dma_sync_size, pool->p.max_len);
+	dma_sync_single_range_for_device(pool->p.dev, dma_addr,
+					 pool->p.offset, dma_sync_size,
+					 pool->p.dma_dir);
 }
 
 static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
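
The equivalence argued above rests on the generic wrapper already doing
the dma_dev_need_sync() test internally on CONFIG_DMA_NEED_SYNC kernels.
A minimal sketch of that assumption (the sketch_* name is illustrative
only, not the actual <linux/dma-mapping.h> definition):

#include <linux/dma-mapping.h>

/*
 * Rough sketch of the assumed behaviour of the generic sync wrapper when
 * CONFIG_DMA_NEED_SYNC is enabled: the dma_dev_need_sync() check happens
 * inside the inline wrapper itself, so the caller already gets the
 * "if (dma_dev_need_sync()) __dma_sync()" sequence without open-coding it.
 */
static inline void
sketch_dma_sync_single_range_for_device(struct device *dev, dma_addr_t addr,
					unsigned long offset, size_t size,
					enum dma_data_direction dir)
{
	if (dma_dev_need_sync(dev))
		__dma_sync_single_for_device(dev, addr + offset, size, dir);
}

If that holds, inlining page_pool_dma_sync_for_device() around the plain
dma_sync_single_range_for_device() call should produce much the same
branch-and-call sequence as the open-coded version, which would be
consistent with the arm64 object code showing no appreciable difference.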