Message-ID: <2d13134d-1e5c-4534-8686-c0022caeb36c@arm.com>
Date: Wed, 14 Feb 2024 17:58:30 +0000
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v3 3/7] iommu/dma: avoid expensive indirect calls for sync operations
From: Robin Murphy
To: Alexander Lobakin, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni
Cc: Christoph Hellwig, Marek Szyprowski, Joerg Roedel, Will Deacon,
 Greg Kroah-Hartman, "Rafael J. Wysocki", Magnus Karlsson,
 Maciej Fijalkowski, Alexander Duyck, bpf@vger.kernel.org,
 netdev@vger.kernel.org, iommu@lists.linux.dev,
 linux-kernel@vger.kernel.org
References: <20240214162201.4168778-1-aleksander.lobakin@intel.com>
 <20240214162201.4168778-4-aleksander.lobakin@intel.com>
In-Reply-To: <20240214162201.4168778-4-aleksander.lobakin@intel.com>

On 2024-02-14 4:21 pm, Alexander Lobakin wrote:
> When IOMMU is on, the actual synchronization happens in the same cases
> as with the direct DMA. Advertise %DMA_F_CAN_SKIP_SYNC in IOMMU DMA to
> skip sync ops calls (indirect) for non-SWIOTLB buffers.
>
> perf profile before the patch:
>
>      18.53%  [kernel]  [k] gq_rx_skb
>      14.77%  [kernel]  [k] napi_reuse_skb
>       8.95%  [kernel]  [k] skb_release_data
>       5.42%  [kernel]  [k] dev_gro_receive
>       5.37%  [kernel]  [k] memcpy
> <*>   5.26%  [kernel]  [k] iommu_dma_sync_sg_for_cpu
>       4.78%  [kernel]  [k] tcp_gro_receive
> <*>   4.42%  [kernel]  [k] iommu_dma_sync_sg_for_device
>       4.12%  [kernel]  [k] ipv6_gro_receive
>       3.65%  [kernel]  [k] gq_pool_get
>       3.25%  [kernel]  [k] skb_gro_receive
>       2.07%  [kernel]  [k] napi_gro_frags
>       1.98%  [kernel]  [k] tcp6_gro_receive
>       1.27%  [kernel]  [k] gq_rx_prep_buffers
>       1.18%  [kernel]  [k] gq_rx_napi_handler
>       0.99%  [kernel]  [k] csum_partial
>       0.74%  [kernel]  [k] csum_ipv6_magic
>       0.72%  [kernel]  [k] free_pcp_prepare
>       0.60%  [kernel]  [k] __napi_poll
>       0.58%  [kernel]  [k] net_rx_action
>       0.56%  [kernel]  [k] read_tsc
> <*>   0.50%  [kernel]  [k] __x86_indirect_thunk_r11
>       0.45%  [kernel]  [k] memset
>
> After patch, lines with <*> no longer show up, and overall
> cpu usage looks much better (~60% instead of ~72%):
>
>      25.56%  [kernel]  [k] gq_rx_skb
>       9.90%  [kernel]  [k] napi_reuse_skb
>       7.39%  [kernel]  [k] dev_gro_receive
>       6.78%  [kernel]  [k] memcpy
>       6.53%  [kernel]  [k] skb_release_data
>       6.39%  [kernel]  [k] tcp_gro_receive
>       5.71%  [kernel]  [k] ipv6_gro_receive
>       4.35%  [kernel]  [k] napi_gro_frags
>       4.34%  [kernel]  [k] skb_gro_receive
>       3.50%  [kernel]  [k] gq_pool_get
>       3.08%  [kernel]  [k] gq_rx_napi_handler
>       2.35%  [kernel]  [k] tcp6_gro_receive
>       2.06%  [kernel]  [k] gq_rx_prep_buffers
>       1.32%  [kernel]  [k] csum_partial
>       0.93%  [kernel]  [k] csum_ipv6_magic
>       0.65%  [kernel]  [k] net_rx_action
>
> iavf yields +10% of Mpps on Rx. This also unblocks batched allocations
> of XSk buffers when IOMMU is active.

Acked-by: Robin Murphy

> Co-developed-by: Eric Dumazet
> Signed-off-by: Eric Dumazet
> Signed-off-by: Alexander Lobakin
> ---
>  drivers/iommu/dma-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 50ccc4f1ef81..4ab9ac13d362 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1707,7 +1707,8 @@ static size_t iommu_dma_opt_mapping_size(void)
>  }
>
>  static const struct dma_map_ops iommu_dma_ops = {
> -	.flags			= DMA_F_PCI_P2PDMA_SUPPORTED,
> +	.flags			= DMA_F_PCI_P2PDMA_SUPPORTED |
> +				  DMA_F_CAN_SKIP_SYNC,
>  	.alloc			= iommu_dma_alloc,
>  	.free			= iommu_dma_free,
>  	.alloc_pages		= dma_common_alloc_pages,
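
For anyone following the thread without the rest of the series to hand, the idea behind the flag can be sketched in user space roughly as below. This is only an illustration, not the kernel code: every toy_* name and TOY_F_CAN_SKIP_SYNC are hypothetical stand-ins, and the conditions under which skipping is allowed (a coherent device whose mappings have not been bounced) are an assumption based on the commit message's "same cases as with the direct DMA" and "non-SWIOTLB buffers".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define TOY_F_CAN_SKIP_SYNC	(1u << 0)

struct toy_dev;

struct toy_dma_ops {
	unsigned int flags;
	void (*sync_for_cpu)(struct toy_dev *dev, size_t len);
};

struct toy_dev {
	const struct toy_dma_ops *ops;
	bool skip_sync;			/* cached decision, cleared on bounce */
	unsigned int sync_calls;	/* indirect calls actually made */
};

/* The "expensive" callback; here it only counts how often it runs. */
static void toy_sync_for_cpu(struct toy_dev *dev, size_t len)
{
	(void)len;
	dev->sync_calls++;
}

/* The ops advertise that their syncs may be skipped when they are no-ops. */
static const struct toy_dma_ops toy_ops = {
	.flags		= TOY_F_CAN_SKIP_SYNC,
	.sync_for_cpu	= toy_sync_for_cpu,
};

/* Assumption: skipping is only safe for a coherent device. */
static void toy_dev_setup(struct toy_dev *dev, const struct toy_dma_ops *ops,
			  bool coherent)
{
	dev->ops = ops;
	dev->sync_calls = 0;
	dev->skip_sync = coherent && (ops->flags & TOY_F_CAN_SKIP_SYNC);
}

/* A mapping had to be bounced, so syncs are required from now on. */
static void toy_map_bounced(struct toy_dev *dev)
{
	dev->skip_sync = false;
}

/* Wrapper used by drivers: a plain branch replaces the indirect call. */
static inline void toy_dma_sync_for_cpu(struct toy_dev *dev, size_t len)
{
	if (dev->skip_sync)
		return;
	dev->ops->sync_for_cpu(dev, len);
}

/* Walks through both paths and checks the call counts. */
void toy_selfcheck(void)
{
	struct toy_dev dev;

	toy_dev_setup(&dev, &toy_ops, true);
	toy_dma_sync_for_cpu(&dev, 2048);
	assert(dev.sync_calls == 0);	/* fast path, callback never ran */

	toy_map_bounced(&dev);
	toy_dma_sync_for_cpu(&dev, 2048);
	assert(dev.sync_calls == 1);	/* slow path, callback ran once */
}
```

The point of caching the decision per device is that the hot path pays for one predictable branch rather than a retpoline-protected indirect call, which would be consistent with the <*> entries dropping out of the profile quoted above.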