Received: by 2002:a6b:fb09:0:0:0:0:0 with SMTP id h9csp732533iog; Wed, 29 Jun 2022 09:06:37 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sy4WaA8KaYQV6ni4JzLnqJxxb/0OfmLsBA8ZRUMu1pM0iprgpk2HCE/VevnJ7dG9ouTwar X-Received: by 2002:a17:90b:1d84:b0:1ed:5918:74e3 with SMTP id pf4-20020a17090b1d8400b001ed591874e3mr4545003pjb.173.1656518797684; Wed, 29 Jun 2022 09:06:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1656518797; cv=none; d=google.com; s=arc-20160816; b=w2w2ccEBTWa9beqHhG9cbofBQbAn+2XgLav9BhlPzOZqtf4QpkFNdoGMSkQ50Uuzzi 0VvFC21HXvSz/+CUAH2XGpdohA7pq2yquqMHUv6PlaDBdDVbgPifexz/UawEavvRlax3 EeevLqy4BXIOOxW7gOtdJDXTZCAh1k8Y6VzgYEIkZJuioXfP1yJzwKWWHxEhXQX+918T 4JfLy33ZBX+dv/bWBBJiA4rhS6YSiCZL41dbNgZaO6ZsqSrfCmFD3BD91llxdPeoU1Ws RHUyW9S4I+4QrqOQAcsXBkH0y/FHYAStRCkoCdSOYFa1wfl6M6mINbwhjAHT3K2HKuGa EnaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:content-transfer-encoding:in-reply-to :from:references:cc:to:content-language:user-agent:mime-version:date :message-id:dkim-signature; bh=O0wtNrY7uERkcVmDzn9Qzn0QOjEuS076HETf2WzjME4=; b=MzLjQkhMIfll344rHRXUl4T99bhjD/KImE4QCswGDwF7EIG2SMqMREoVzbEYATM2VE 2g+q3ii/psaq/klmsPnSzPQwBPG0b7bbjZt0WhqxSnEO+1LofSNSdJUns8uplfuHr+15 N9CrSyO4tQTIoP3PhtBdNCuacGyXmqBsjMDyQsSYGuN3uL0Dg49/M3nGpR58x0q1XzFl QTEqa2HGilEADOpOmF9ZZte+44JltXUb/+cE7lWDgcYcabdGrEhbCh8xtiH0wLMs8zG2 fE5mGReY/fetokOkGe1x4EdQyEXlrgAisDaAHpjGYIZcN9+QYbv04s4YX4fvaZqwGiFf 0llA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@deltatee.com header.s=20200525 header.b=fcBq3BeN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=deltatee.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s1-20020a170902ea0100b0016362d70cf1si27851314plg.50.2022.06.29.09.06.15; Wed, 29 Jun 2022 09:06:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@deltatee.com header.s=20200525 header.b=fcBq3BeN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=deltatee.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233775AbiF2Pjj (ORCPT + 99 others); Wed, 29 Jun 2022 11:39:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51768 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233843AbiF2Pjc (ORCPT ); Wed, 29 Jun 2022 11:39:32 -0400 Received: from ale.deltatee.com (ale.deltatee.com [204.191.154.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4923936172; Wed, 29 Jun 2022 08:39:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:In-Reply-To:From:References:Cc:To: MIME-Version:Date:Message-ID:content-disposition; bh=O0wtNrY7uERkcVmDzn9Qzn0QOjEuS076HETf2WzjME4=; b=fcBq3BeNtwnbwtK5Y2Ik7xDHNX R2UXfuLGRfqLKYw2O//6lyhArxbX+eYSq7C1KjshVcSK35zRKyY1/m5BoX90BANeJxKsWQHWzpJfH RLRIfLr01lR+my9+2IpdU/iMUoeB9400EApUEEHjJe6aaYg5O0DxWTZXa/RxPY1IK4XGp8wIPTO+B n0noULDLnowj1XntvqsDZQQRPCkqagswEtT4sxIgVQ1daNBO6u/6PpMAZgin/duxNTYcVdNOEKaTU 0nCYR6bUVRS/Ok2+w20mNj06bY59NCDNF5cCXwNLFbRetVx9eUzGy9o9+U3YM2hjMTGWYbrJXVGen 0sKA+1AQ==; Received: from s0106a84e3fe8c3f3.cg.shawcable.net ([24.64.144.200] helo=[192.168.0.10]) by ale.deltatee.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1o6Zmm-002RkM-SQ; Wed, 29 Jun 2022 09:39:25 -0600 Message-ID: Date: Wed, 29 Jun 2022 09:39:14 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Content-Language: en-CA To: Robin Murphy , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?UTF-8?Q?Christian_K=c3=b6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Chaitanya Kulkarni References: <20220615161233.17527-1-logang@deltatee.com> <20220615161233.17527-2-logang@deltatee.com> From: Logan Gunthorpe In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-SA-Exim-Connect-IP: 24.64.144.200 X-SA-Exim-Rcpt-To: robin.murphy@arm.com, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, dan.j.williams@intel.com, jgg@ziepe.ca, christian.koenig@amd.com, jhubbard@nvidia.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, dave.b.minturn@intel.com, jason@jlekstrand.net, dave.hansen@linux.intel.com, jianxin.xiong@intel.com, helgaas@kernel.org, ira.weiny@intel.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, rcampbell@nvidia.com, kch@nvidia.com X-SA-Exim-Mail-From: logang@deltatee.com X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 Subject: Re: [PATCH v7 01/21] lib/scatterlist: add flag for indicating P2PDMA segments in an SGL X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-06-29 03:05, Robin Murphy wrote: > On 2022-06-15 17:12, Logan Gunthorpe wrote: >> Make use of the third free LSB in scatterlist's page_link on 64bit >> systems. >> >> The extra bit will be used by dma_[un]map_sg_p2pdma() to determine when a >> given SGL segments dma_address points to a PCI bus address. >> dma_unmap_sg_p2pdma() will need to perform different cleanup when a >> segment is marked as a bus address. >> >> The new bit will only be used when CONFIG_PCI_P2PDMA is set; this means >> PCI P2PDMA will require CONFIG_64BIT. This should be acceptable as the >> majority of P2PDMA use cases are restricted to newer root complexes and >> roughly require the extra address space for memory BARs used in the >> transactions. >> >> Signed-off-by: Logan Gunthorpe >> Reviewed-by: Chaitanya Kulkarni >> --- >>   drivers/pci/Kconfig         |  5 +++++ >>   include/linux/scatterlist.h | 44 ++++++++++++++++++++++++++++++++++++- >>   2 files changed, 48 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig >> index 133c73207782..5cc7cba1941f 100644 >> --- a/drivers/pci/Kconfig >> +++ b/drivers/pci/Kconfig >> @@ -164,6 +164,11 @@ config PCI_PASID >>   config PCI_P2PDMA >>       bool "PCI peer-to-peer transfer support" >>       depends on ZONE_DEVICE >> +    # >> +    # The need for the scatterlist DMA bus address flag means PCI P2PDMA >> +    # requires 64bit >> +    # >> +    depends on 64BIT >>       select GENERIC_ALLOCATOR >>       help >>         Enableѕ drivers to do PCI peer-to-peer transactions to and from >> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h >> index 7ff9d6386c12..6561ca8aead8 100644 >> --- a/include/linux/scatterlist.h >> +++ b/include/linux/scatterlist.h >> @@ -64,12 +64,24 @@ struct sg_append_table { >>   #define SG_CHAIN    0x01UL >>   #define SG_END        0x02UL >>   +/* >> + * bit 2 is the third free bit in the page_link on 64bit systems which >> + * is used by dma_unmap_sg() to determine if the dma_address is a >> + * bus address when doing P2PDMA. >> + */ >> +#ifdef CONFIG_PCI_P2PDMA >> +#define SG_DMA_BUS_ADDRESS    0x04UL >> +static_assert(__alignof__(struct page) >= 8); >> +#else >> +#define SG_DMA_BUS_ADDRESS    0x00UL >> +#endif >> + >>   /* >>    * We overload the LSB of the page pointer to indicate whether it's >>    * a valid sg entry, or whether it points to the start of a new >> scatterlist. >>    * Those low bits are there for everyone! (thanks mason :-) >>    */ >> -#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END) >> +#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END | SG_DMA_BUS_ADDRESS) >>     static inline unsigned int __sg_flags(struct scatterlist *sg) >>   { >> @@ -91,6 +103,11 @@ static inline bool sg_is_last(struct scatterlist *sg) >>       return __sg_flags(sg) & SG_END; >>   } >>   +static inline bool sg_is_dma_bus_address(struct scatterlist *sg) >> +{ >> +    return __sg_flags(sg) & SG_DMA_BUS_ADDRESS; >> +} >> + >>   /** >>    * sg_assign_page - Assign a given page to an SG entry >>    * @sg:            SG entry >> @@ -245,6 +262,31 @@ static inline void sg_unmark_end(struct >> scatterlist *sg) >>       sg->page_link &= ~SG_END; >>   } >>   +/** >> + * sg_dma_mark_bus address - Mark the scatterlist entry as a bus address >> + * @sg:         SG entryScatterlist > > entryScatterlist? > >> + * >> + * Description: >> + *   Marks the passed in sg entry to indicate that the dma_address is >> + *   a bus address and doesn't need to be unmapped. >> + **/ >> +static inline void sg_dma_mark_bus_address(struct scatterlist *sg) >> +{ >> +    sg->page_link |= SG_DMA_BUS_ADDRESS; >> +} >> + >> +/** >> + * sg_unmark_pci_p2pdma - Unmark the scatterlist entry as a bus address >> + * @sg:         SG entryScatterlist >> + * >> + * Description: >> + *   Clears the bus address mark. >> + **/ >> +static inline void sg_dma_unmark_bus_address(struct scatterlist *sg) >> +{ >> +    sg->page_link &= ~SG_DMA_BUS_ADDRESS; >> +} > > Does this serve any useful purpose? If a page is determined to be device > memory, it's not going to suddenly stop being device memory, and if the > underlying sg is recycled to point elsewhere then sg_assign_page() will > still (correctly) clear this flag anyway. Trying to reason about this > beyond superficial API symmetry - i.e. why exactly would a caller need > to call it, and what would the implications be of failing to do so - > seems to lead straight to confusion. > > In fact I'd be inclined to have sg_assign_page() be responsible for > setting the flag automatically as well, and thus not need > sg_dma_mark_bus_address() either, however I can see the argument for > doing it this way round to not entangle the APIs too much, so I don't > have any great objection to that. Yes, I think you misunderstand what this is for. The SG_DMA_BUS_ADDDRESS flag doesn't mark the segment for the page, but for the dma address. It cannot be set in sg_assign_page() seeing it's not a property of the page but a property of the dma_address in the sgl. It's not meant for use by regular SG users, it's only meant for use inside DMA mapping implementations. The purpose is to know whether a given dma_address in the SGL is a bus address or regular memory because the two different types must be unmapped differently. We can't rely on the page because, as you know, many dma_map_sg() the dma_address entry in the sgl does not map to the same memory as the page. Or to put it another way: is_pci_p2pdma_page(sg->page) does not imply that sg->dma_address points to a bus address. Does that make sense? Logan