Received: by 2002:a6b:fb09:0:0:0:0:0 with SMTP id h9csp841691iog; Wed, 29 Jun 2022 11:17:00 -0700 (PDT) X-Google-Smtp-Source: AGRyM1u2CteKWt5fXC8Voe3hkBciaVAt3uzCzZPu02R8YbH4JplLJGYQfgp5IxryfGVZQijr74CV X-Received: by 2002:a63:85c6:0:b0:40d:80e9:49e7 with SMTP id u189-20020a6385c6000000b0040d80e949e7mr4077002pgd.478.1656526620570; Wed, 29 Jun 2022 11:17:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1656526620; cv=none; d=google.com; s=arc-20160816; b=Q4fGZJJ65EA56jyZET//XhsFfXvxXeHhfkFbpkV6WKoSorvF1l50bEWhphIL2rZXci NtZyKBNCLU64qMWRDrG3dMFq0dBwGPvzu22dkjUtGldLoLh7oC4qaRHaidcZFW21i1Y1 ED7+DN62J9k+cwraFcBhLTlGYp62LrwljzjAk9Y1XiMyP3dCD/FsTzi1ggZZI+LhbSU/ NCpsM3ZJAPi4txCRPyTYiyYfKUMBI10bhAMNTPvSEb8bJWr7TGBlvnsJgvl9zNBew+Bq zQdXM8jHcv5vml9bzThg64wtEpOuLWLzMA78Y13YTtwDfoYNBUIZUs2vWBkE8+qBTOxY 5rmw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id; bh=VJ5giCnL9gW0Wv51Z6KrKnRSp+wk2wjV+WILMkU4Itc=; b=t2mLFBAjeE4nwJmLHdzeIHI0gvRZaV9mgL9N+r8Ltj4aV9+cdBa8edBEdmk2Dzn+iY C4MWvyq3tKCA9iZ2Qze70G9LGdLn3ZTvS5hHWFJac5pZQ/FOe32p2od6CLJEh6FDXN82 DsNn7NVtUkCvy664Vej36XznhyHU3UeE9wAzBh7L+IhQcPuqE5KlGcZshLTumTMAHdkD qpyjvMNNUImghYxAPCok8DmHH9/5hRxYO1DERLqSNKGcM+OzATZpRspLjzeq/3o79z7C ecYumsksmZXUA/K7dxxGBf82rAOG+9Ha+mzOPaaBJuC4nzawuubWaSJVrDBPmNVjClKA uTyg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c1-20020a630d01000000b0040c9f5a0e43si23592893pgl.338.2022.06.29.11.16.45; Wed, 29 Jun 2022 11:17:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232227AbiF2SDE (ORCPT + 99 others); Wed, 29 Jun 2022 14:03:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43070 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232231AbiF2SCo (ORCPT ); Wed, 29 Jun 2022 14:02:44 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2E3D83A735; Wed, 29 Jun 2022 11:02:43 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E21051480; Wed, 29 Jun 2022 11:02:42 -0700 (PDT) Received: from [10.57.85.71] (unknown [10.57.85.71]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 050C53F5A1; Wed, 29 Jun 2022 11:02:37 -0700 (PDT) Message-ID: Date: Wed, 29 Jun 2022 19:02:33 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH v7 01/21] lib/scatterlist: add flag for indicating P2PDMA segments in an SGL Content-Language: en-GB To: Logan Gunthorpe , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?UTF-8?Q?Christian_K=c3=b6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Chaitanya Kulkarni References: <20220615161233.17527-1-logang@deltatee.com> <20220615161233.17527-2-logang@deltatee.com> From: Robin Murphy In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,NICE_REPLY_A, RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-06-29 16:39, Logan Gunthorpe wrote: > > > > On 2022-06-29 03:05, Robin Murphy wrote: >> On 2022-06-15 17:12, Logan Gunthorpe wrote: >>> Make use of the third free LSB in scatterlist's page_link on 64bit >>> systems. >>> >>> The extra bit will be used by dma_[un]map_sg_p2pdma() to determine when a >>> given SGL segments dma_address points to a PCI bus address. >>> dma_unmap_sg_p2pdma() will need to perform different cleanup when a >>> segment is marked as a bus address. >>> >>> The new bit will only be used when CONFIG_PCI_P2PDMA is set; this means >>> PCI P2PDMA will require CONFIG_64BIT. This should be acceptable as the >>> majority of P2PDMA use cases are restricted to newer root complexes and >>> roughly require the extra address space for memory BARs used in the >>> transactions. >>> >>> Signed-off-by: Logan Gunthorpe >>> Reviewed-by: Chaitanya Kulkarni >>> --- >>>   drivers/pci/Kconfig         |  5 +++++ >>>   include/linux/scatterlist.h | 44 ++++++++++++++++++++++++++++++++++++- >>>   2 files changed, 48 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig >>> index 133c73207782..5cc7cba1941f 100644 >>> --- a/drivers/pci/Kconfig >>> +++ b/drivers/pci/Kconfig >>> @@ -164,6 +164,11 @@ config PCI_PASID >>>   config PCI_P2PDMA >>>       bool "PCI peer-to-peer transfer support" >>>       depends on ZONE_DEVICE >>> +    # >>> +    # The need for the scatterlist DMA bus address flag means PCI P2PDMA >>> +    # requires 64bit >>> +    # >>> +    depends on 64BIT >>>       select GENERIC_ALLOCATOR >>>       help >>>         Enableѕ drivers to do PCI peer-to-peer transactions to and from >>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h >>> index 7ff9d6386c12..6561ca8aead8 100644 >>> --- a/include/linux/scatterlist.h >>> +++ b/include/linux/scatterlist.h >>> @@ -64,12 +64,24 @@ struct sg_append_table { >>>   #define SG_CHAIN    0x01UL >>>   #define SG_END        0x02UL >>>   +/* >>> + * bit 2 is the third free bit in the page_link on 64bit systems which >>> + * is used by dma_unmap_sg() to determine if the dma_address is a >>> + * bus address when doing P2PDMA. >>> + */ >>> +#ifdef CONFIG_PCI_P2PDMA >>> +#define SG_DMA_BUS_ADDRESS    0x04UL >>> +static_assert(__alignof__(struct page) >= 8); >>> +#else >>> +#define SG_DMA_BUS_ADDRESS    0x00UL >>> +#endif >>> + >>>   /* >>>    * We overload the LSB of the page pointer to indicate whether it's >>>    * a valid sg entry, or whether it points to the start of a new >>> scatterlist. >>>    * Those low bits are there for everyone! (thanks mason :-) >>>    */ >>> -#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END) >>> +#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END | SG_DMA_BUS_ADDRESS) >>>     static inline unsigned int __sg_flags(struct scatterlist *sg) >>>   { >>> @@ -91,6 +103,11 @@ static inline bool sg_is_last(struct scatterlist *sg) >>>       return __sg_flags(sg) & SG_END; >>>   } >>>   +static inline bool sg_is_dma_bus_address(struct scatterlist *sg) >>> +{ >>> +    return __sg_flags(sg) & SG_DMA_BUS_ADDRESS; >>> +} >>> + >>>   /** >>>    * sg_assign_page - Assign a given page to an SG entry >>>    * @sg:            SG entry >>> @@ -245,6 +262,31 @@ static inline void sg_unmark_end(struct >>> scatterlist *sg) >>>       sg->page_link &= ~SG_END; >>>   } >>>   +/** >>> + * sg_dma_mark_bus address - Mark the scatterlist entry as a bus address >>> + * @sg:         SG entryScatterlist >> >> entryScatterlist? >> >>> + * >>> + * Description: >>> + *   Marks the passed in sg entry to indicate that the dma_address is >>> + *   a bus address and doesn't need to be unmapped. >>> + **/ >>> +static inline void sg_dma_mark_bus_address(struct scatterlist *sg) >>> +{ >>> +    sg->page_link |= SG_DMA_BUS_ADDRESS; >>> +} >>> + >>> +/** >>> + * sg_unmark_pci_p2pdma - Unmark the scatterlist entry as a bus address >>> + * @sg:         SG entryScatterlist >>> + * >>> + * Description: >>> + *   Clears the bus address mark. >>> + **/ >>> +static inline void sg_dma_unmark_bus_address(struct scatterlist *sg) >>> +{ >>> +    sg->page_link &= ~SG_DMA_BUS_ADDRESS; >>> +} >> >> Does this serve any useful purpose? If a page is determined to be device >> memory, it's not going to suddenly stop being device memory, and if the >> underlying sg is recycled to point elsewhere then sg_assign_page() will >> still (correctly) clear this flag anyway. Trying to reason about this >> beyond superficial API symmetry - i.e. why exactly would a caller need >> to call it, and what would the implications be of failing to do so - >> seems to lead straight to confusion. >> >> In fact I'd be inclined to have sg_assign_page() be responsible for >> setting the flag automatically as well, and thus not need >> sg_dma_mark_bus_address() either, however I can see the argument for >> doing it this way round to not entangle the APIs too much, so I don't >> have any great objection to that. > > Yes, I think you misunderstand what this is for. The SG_DMA_BUS_ADDDRESS > flag doesn't mark the segment for the page, but for the dma address. It > cannot be set in sg_assign_page() seeing it's not a property of the page > but a property of the dma_address in the sgl. > > It's not meant for use by regular SG users, it's only meant for use > inside DMA mapping implementations. The purpose is to know whether a > given dma_address in the SGL is a bus address or regular memory because > the two different types must be unmapped differently. We can't rely on > the page because, as you know, many dma_map_sg() the dma_address entry > in the sgl does not map to the same memory as the page. Or to put it > another way: is_pci_p2pdma_page(sg->page) does not imply that > sg->dma_address points to a bus address. > > Does that make sense? Ah, you're quite right, in trying to take in the whole series at once first thing in the morning I did fail to properly grasp that detail, so indeed the sg_assign_page() thing couldn't possibly work, but as I said that's fine anyway. I still think the lifecycle management is a bit off though - equivalently, a bus address doesn't stop being a bus address, so it would seem appropriate to update this flag appropriately whenever sg_dma_address() is assigned to, and not when it isn't. Thanks, Robin.