Message-ID: <2b4a7aee-1997-cf2f-e8e8-19ab4adc09b5@arm.com>
Date: Tue, 12 Sep 2023 12:57:12 +0100
Subject: Re: [PATCH] dma-mapping: fix dma_addressing_limited if dma_range_map is scanned
From: Robin Murphy
To: Jia He, Christoph Hellwig, Marek Szyprowski, iommu@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
References: <20230912084002.2168-1-justin.he@arm.com>
In-Reply-To: <20230912084002.2168-1-justin.he@arm.com>

On 12/09/2023 9:40 am, Jia He wrote:
> After scanning the dma_range_map, if it is found that not all of the
> system RAM ranges are encompassed within it, an incorrect calculation
> occurs for dma_addressing_limited(), which prevents the nvme device
> dma mapping in the checking path of phys_to_dma().

Nit: the subject and this description aren't very clear - the key point
here is the unusual case that the range map covers right up to the top
of system RAM, but leaves a hole somewhere lower down. There's no issue
with the more typical case where some RAM exists above the top of the
range map, since bus_dma_limit will capture that and work as expected.

> E.g.
> On an Armv8 Ampere server, the dsdt ACPI table is:
>     Method (_DMA, 0, Serialized)  // _DMA: Direct Memory Access
>     {
>         Name (RBUF, ResourceTemplate ()
>         {
>             QWordMemory (ResourceConsumer, PosDecode, MinFixed,
>                 MaxFixed, Cacheable, ReadWrite,
>                 0x0000000000000000, // Granularity
>                 0x0000000000000000, // Range Minimum
>                 0x00000000FFFFFFFF, // Range Maximum
>                 0x0000000000000000, // Translation Offset
>                 0x0000000100000000, // Length
>                 ,, , AddressRangeMemory, TypeStatic)
>             QWordMemory (ResourceConsumer, PosDecode, MinFixed,
>                 MaxFixed, Cacheable, ReadWrite,
>                 0x0000000000000000, // Granularity
>                 0x0000006010200000, // Range Minimum
>                 0x000000602FFFFFFF, // Range Maximum
>                 0x0000000000000000, // Translation Offset
>                 0x000000001FE00000, // Length
>                 ,, , AddressRangeMemory, TypeStatic)
>             QWordMemory (ResourceConsumer, PosDecode, MinFixed,
>                 MaxFixed, Cacheable, ReadWrite,
>                 0x0000000000000000, // Granularity
>                 0x00000060F0000000, // Range Minimum
>                 0x00000060FFFFFFFF, // Range Maximum
>                 0x0000000000000000, // Translation Offset
>                 0x0000000010000000, // Length
>                 ,, , AddressRangeMemory, TypeStatic)
>             QWordMemory (ResourceConsumer, PosDecode, MinFixed,
>                 MaxFixed, Cacheable, ReadWrite,
>                 0x0000000000000000, // Granularity
>                 0x0000007000000000, // Range Minimum
>                 0x000003FFFFFFFFFF, // Range Maximum
>                 0x0000000000000000, // Translation Offset
>                 0x0000039000000000, // Length
>                 ,, , AddressRangeMemory, TypeStatic)
>         })
>
> But the System RAM ranges are:
> cat /proc/iomem |grep -i ram
>   90000000-91ffffff : System RAM
>   92900000-fffbffff : System RAM
>   880000000-fffffffff : System RAM
>   8800000000-bff5990fff : System RAM
>   bff59d0000-bff5a4ffff : System RAM
>   bff8000000-bfffffffff : System RAM
> So some RAM ranges are out of dma_range_map.
>
> Fixes it by checking whether each of the system RAM resources can be
> properly encompassed within the dma_range_map.
>
> Signed-off-by: Jia He
> ---
>  include/linux/dma-mapping.h |  8 +++++--
>  kernel/dma/mapping.c        | 45 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 51 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index f0ccca16a0ac..d9d1c67c8579 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -144,6 +144,7 @@ bool dma_pci_p2pdma_supported(struct device *dev);
>  int dma_set_mask(struct device *dev, u64 mask);
>  int dma_set_coherent_mask(struct device *dev, u64 mask);
>  u64 dma_get_required_mask(struct device *dev);
> +bool all_ram_in_dma_range_map(struct device *dev);
>  size_t dma_max_mapping_size(struct device *dev);
>  size_t dma_opt_mapping_size(struct device *dev);
>  bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> @@ -475,8 +476,11 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask)
>   */
>  static inline bool dma_addressing_limited(struct device *dev)
>  {
> -	return min_not_zero(dma_get_mask(dev), dev->bus_dma_limit) <
> -			dma_get_required_mask(dev);
> +	if (min_not_zero(dma_get_mask(dev), dev->bus_dma_limit) <
> +			dma_get_required_mask(dev))

Nit: indentation

> +		return true;
> +
> +	return !all_ram_in_dma_range_map(dev);
>  }
>
>  static inline unsigned int dma_get_max_seg_size(struct device *dev)
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index e323ca48f7f2..ab407deb81b8 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -14,6 +14,7 @@
>  #include
>  #include
>  #include
> +#include

Nit: please keep the includes sorted alphabetically, however I do wonder
whether this whole thing shouldn't belong in dma-direct anyway.
>  #include "debug.h"
>  #include "direct.h"
>
> @@ -819,6 +820,50 @@ size_t dma_opt_mapping_size(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(dma_opt_mapping_size);
>
> +/*
> + * To check whether all ram resource ranges are mapped in dma range map
> + * Returns 0 when continuous check is needed
> + * Returns 1 if there is some ram range can't be mapped to dma_range_map
> + */
> +static int check_ram_in_range_map(unsigned long start_pfn,
> +				  unsigned long nr_pages, void *data)
> +{
> +	phys_addr_t end_paddr = (start_pfn + nr_pages) << PAGE_SHIFT;

This could still wrap to 0 on 32-bit. I think the robust thing to do is
either spray some extra -1s and +1s around to make all the "end" values
inclusive limits, or maybe just do everything in units of pages rather
than bytes (i.e. use PFN_DOWN() on the bus_dma_region entry values).

> +	phys_addr_t start_paddr = start_pfn << PAGE_SHIFT;
> +	struct device *dev = (struct device *)data;
> +	struct bus_dma_region *region = NULL;
> +	const struct bus_dma_region *m;
> +
> +	while (start_paddr < end_paddr) {
> +		// find region containing start_paddr

Nit: inconsistent comment style (although it's not a particularly
valuable comment anyway, IMO the loop itself is clear enough).

Thanks,
Robin.

> +		for (m = dev->dma_range_map; m->size; m++) {
> +			if (start_paddr >= m->cpu_start
> +			    && start_paddr - m->cpu_start < m->size) {
> +				region = (struct bus_dma_region *)m;
> +				break;
> +			}
> +		}
> +		if (!region)
> +			return 1;
> +
> +		start_paddr = region->cpu_start + region->size;
> +		/* handle overflow of phys_addr_t */
> +		if (start_paddr == 0)
> +			break;
> +	}
> +
> +	return 0;
> +}
> +
> +bool all_ram_in_dma_range_map(struct device *dev)
> +{
> +	if (!dev->dma_range_map)
> +		return 1;
> +
> +	return !walk_system_ram_range(0, ULONG_MAX, dev, check_ram_in_range_map);
> +}
> +EXPORT_SYMBOL_GPL(all_ram_in_dma_range_map);
> +
>  bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
>  {
>  	const struct dma_map_ops *ops = get_dma_ops(dev);