Message-ID: <92cd8f47-054c-938a-0dcb-778ed42805ed@arm.com>
Date: Mon, 25 Sep 2023 12:15:43 +0100
Subject: Re: 
dma_map_resource() has a bad performance in pcie peer to peer transactions when iommu enabled in Linux
To: Kelly Devilliv, joro@8bytes.org, will@kernel.org
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
From: Robin Murphy

On 2023-09-25 04:59, Kelly Devilliv wrote:
> Dear all,
>
> I am working on an ARM-V8 server with two gpu cards on it. Recently, I need to test pcie peer to peer communication between the two gpu cards, but the throughput is only 4GB/s.
>
> After I explored the gpu's kernel mode driver, I found it was using the dma_map_resource() API to map the peer device's MMIO space. The arm iommu driver then will hardcode a 'IOMMU_MMIO' prot in the later dma map:
>
> static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
>                 size_t size, enum dma_data_direction dir, unsigned long attrs)
> {
>         return __iommu_dma_map(dev, phys, size,
>                         dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO,
>                         dma_get_mask(dev));
> }
>
> And that will finally set the 'ARM_LPAE_PTE_MEMATTR_DEV' attribute in PTE, which may have a negative impact on the performance of the pcie peer to peer transactions.
>
>         /*
>          * Note that this logic is structured to accommodate Mali LPAE
>          * having stage-1-like attributes but stage-2-like permissions.
>          */
>         if (data->iop.fmt == ARM_64_LPAE_S2 ||
>             data->iop.fmt == ARM_32_LPAE_S2) {
>                 if (prot & IOMMU_MMIO)
>                         pte |= ARM_LPAE_PTE_MEMATTR_DEV;
>                 else if (prot & IOMMU_CACHE)
>                         pte |= ARM_LPAE_PTE_MEMATTR_OIWB;
>                 else
>                         pte |= ARM_LPAE_PTE_MEMATTR_NC;
>         } else {
>                 if (prot & IOMMU_MMIO)
>                         pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV
>                                 << ARM_LPAE_PTE_ATTRINDX_SHIFT);
>                 else if (prot & IOMMU_CACHE)
>                         pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>                                 << ARM_LPAE_PTE_ATTRINDX_SHIFT);
>         }
>
> I tried to remove the 'IOMMU_MMIO' prot in the dma_map_resource() API and re-compile the linux kernel, the throughput then can be up to 28GB/s.
>
> Is there an elegant way to solve this issue without modifying the linux kernel? e.g., a substitution of dma_map_resource() API?

Not really. Other use-cases for dma_map_resource() include DMA offload
engines accessing FIFO registers, where allowing reordering,
write-gathering, etc. would be a terrible idea. Thus it needs to assume
a "safe" MMIO memory type, which on Arm means Device-nGnRE.

However, the "proper" PCI peer-to-peer support under CONFIG_PCI_P2PDMA
ended up moving away from the dma_map_resource() approach anyway, and
allows this kind of device memory to be treated more like regular memory
(via ZONE_DEVICE) rather than arbitrary MMIO resources, so your best bet
would be to get the GPU driver converted over to using that.

Thanks,
Robin.

>
> Thank you!
>
> Platform info:
> Linux kernel version: 5.10
> PCIE GEN4 x16
>
> Sincerely,
> Kelly
>
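
For anyone following the thread, the interaction between the two quoted snippets can be modelled in user space. The sketch below is not kernel code: the SK_PROT_* flags, enum sk_memtype, and both helper functions are hypothetical names made up for illustration, and the flag bit positions are arbitrary. It only mirrors the priority order of the quoted branches (MMIO is checked before CACHE) and the fact that the resource-mapping path ORs in the MMIO flag unconditionally. Treating "neither flag set" as non-cacheable is an assumption based on the stage-2 branch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the IOMMU_* prot flags; bit values are illustrative. */
enum {
    SK_PROT_READ  = 1 << 0,
    SK_PROT_WRITE = 1 << 1,
    SK_PROT_CACHE = 1 << 2,
    SK_PROT_MMIO  = 1 << 4,
};

/* Coarse memory types corresponding to the three outcomes in the quoted code. */
enum sk_memtype { SK_MEM_DEVICE, SK_MEM_CACHEABLE, SK_MEM_NONCACHEABLE };

/* Same priority order as the quoted PTE logic: MMIO wins over CACHE. */
static enum sk_memtype sk_pick_memtype(int prot)
{
    if (prot & SK_PROT_MMIO)
        return SK_MEM_DEVICE;
    if (prot & SK_PROT_CACHE)
        return SK_MEM_CACHEABLE;
    return SK_MEM_NONCACHEABLE; /* assumed default when neither flag is set */
}

/* Models the quoted iommu_dma_map_resource(): MMIO is always added. */
static int sk_map_resource_prot(int dir_prot)
{
    return dir_prot | SK_PROT_MMIO;
}

/* Small self-check: whatever the caller asks for, the result is Device. */
void sk_self_check(void)
{
    int prot = sk_map_resource_prot(SK_PROT_READ | SK_PROT_WRITE | SK_PROT_CACHE);

    assert(sk_pick_memtype(prot) == SK_MEM_DEVICE);
}
```

The point of the model is that no combination of caller-supplied flags can yield anything but the Device type on this path, which is why the suggestion above is to move the driver to the CONFIG_PCI_P2PDMA infrastructure rather than look for a different flag.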