Date: Fri, 16 Feb 2024 16:13:12 +0000
From: Will Deacon
To: Nicolin Chen
Cc: sagi@grimberg.me, hch@lst.de, axboe@kernel.dk, kbusch@kernel.org,
	joro@8bytes.org, robin.murphy@arm.com, jgg@nvidia.com,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
	iommu@lists.linux.dev, murphyt7@tcd.ie, baolu.lu@linux.intel.com
Subject: Re: [PATCH v1 0/2]
 nvme-pci: Fix dma-iommu mapping failures when PAGE_SIZE=64KB
Message-ID: <20240216161312.GA2203@willie-the-truck>
References: <20240214164138.GA31927@willie-the-truck>
	<20240215142208.GA753@willie-the-truck>
	<20240215163544.GA821@willie-the-truck>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.10.1 (2018-07-13)

Hi Nicolin,

Thanks for sharing all the logs, .config etc.

On Thu, Feb 15, 2024 at 04:26:23PM -0800, Nicolin Chen wrote:
> On Thu, Feb 15, 2024 at 04:35:45PM +0000, Will Deacon wrote:
> > On Thu, Feb 15, 2024 at 02:22:09PM +0000, Will Deacon wrote:
> > > On Wed, Feb 14, 2024 at 11:57:32AM -0800, Nicolin Chen wrote:
> > > > On Wed, Feb 14, 2024 at 04:41:38PM +0000, Will Deacon wrote:
> > > > > On Tue, Feb 13, 2024 at 01:53:55PM -0800, Nicolin Chen wrote:
> > > > And it seems to get worse, as even a 64KB mapping is failing:
> > > > [ 0.239821] nvme 0000:00:01.0: swiotlb buffer is full (sz: 65536 bytes), total 32768 (slots), used 0 (slots)
> > > >
> > > > With a printk, I found the iotlb_align_mask isn't correct:
> > > >    swiotlb_area_find_slots:alloc_align_mask 0xffff, iotlb_align_mask 0x800
> > > >
> > > > But fixing the iotlb_align_mask to 0x7ff still fails the 64KB
> > > > mapping..
> > >
> > > Hmm. A mask of 0x7ff doesn't make a lot of sense given that the slabs
> > > are 2KiB aligned. I'll try plugging in some of the constants you have
> > > here, as something definitely isn't right...
> >
> > Sorry, another ask: please can you print 'orig_addr' in the case of the
> > failing allocation?
>
> I added nvme_print_sgl() in the nvme-pci driver before its
> dma_map_sgtable() call, so the orig_addr isn't aligned with
> PAGE_SIZE=64K or NVME_CTRL_PAGE_SIZE=4K:
>  sg[0] phys_addr:0x0000000105774600 offset:17920 length:512 dma_address:0x0000000000000000 dma_length:0
>
> Also attaching some verbose logs, in case you'd like to check:
>    nvme 0000:00:01.0: swiotlb_area_find_slots: dma_get_min_align_mask 0xfff, IO_TLB_SIZE 0xfffff7ff
>    nvme 0000:00:01.0: swiotlb_area_find_slots: alloc_align_mask 0xffff, iotlb_align_mask 0x7ff
>    nvme 0000:00:01.0: swiotlb_area_find_slots: stride 0x20, max 0xffff
>    nvme 0000:00:01.0: swiotlb_area_find_slots: tlb_addr=0xbd830000, iotlb_align_mask=0x7ff, alloc_align_mask=0xffff
> => nvme 0000:00:01.0: swiotlb_area_find_slots: orig_addr=0x105774600, iotlb_align_mask=0x7ff

With my patches, I think 'iotlb_align_mask' will be 0x800 here, so this
particular allocation might be alright. However, I think I'm starting to
see the wider problem. The IOMMU code is asking for a 64k-aligned
allocation so that it can map it safely, but at the same time
dma_get_min_align_mask() is asking for congruence in the 4k NVME page
offset. Now, because we're going to allocate a 64k-aligned mapping and
offset it, I think the NVME alignment will just fall out in the wash and
checking the 'orig_addr' (which includes the offset) is wrong.

So perhaps this diff (which I'm sadly not able to test) will help? You'll
want to apply it on top of my other patches. The idea is to ignore the
bits of 'orig_addr' which will be aligned automatically by offsetting from
the aligned allocation. I fixed the max() thing too, although that's only
an issue for older kernels.

Cheers,

Will

--->8

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 283eea33dd22..4a000d97f568 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -981,8 +981,7 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
 	dma_addr_t tbl_dma_addr =
 		phys_to_dma_unencrypted(dev, pool->start) & boundary_mask;
 	unsigned long max_slots = get_max_slots(boundary_mask);
-	unsigned int iotlb_align_mask =
-		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
+	unsigned int iotlb_align_mask = dma_get_min_align_mask(dev);
 	unsigned int nslots = nr_slots(alloc_size), stride;
 	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
 	unsigned int index, slots_checked, count = 0, i;
@@ -993,6 +992,9 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
 	BUG_ON(!nslots);
 	BUG_ON(area_index >= pool->nareas);
 
+	alloc_align_mask |= (IO_TLB_SIZE - 1);
+	iotlb_align_mask &= ~alloc_align_mask;
+
 	/*
 	 * For mappings with an alignment requirement don't bother looping to
 	 * unaligned slots once we found an aligned one.
@@ -1004,7 +1006,7 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
 	 * allocations.
 	 */
 	if (alloc_size >= PAGE_SIZE)
-		stride = max(stride, PAGE_SHIFT - IO_TLB_SHIFT + 1);
+		stride = umax(stride, PAGE_SHIFT - IO_TLB_SHIFT + 1);
 
 	spin_lock_irqsave(&area->lock, flags);
 	if (unlikely(nslots > pool->area_nslabs - area->used))
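
For anyone following the mask arithmetic, here is a small standalone sketch
(userspace C, not kernel code) that plugs the values from the logs quoted
above into the old and new 'iotlb_align_mask' computations from the diff.
The helper names (iotlb_mask_before(), iotlb_mask_after(), slot_matches(),
check_logged_values()) are made up for illustration, IO_TLB_SHIFT=11 (2KiB
slots) is assumed, and slot_matches() is only a simplified model of the
per-slot congruence check, so treat it as an approximation rather than the
actual search loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 KiB swiotlb slots, i.e. IO_TLB_SHIFT == 11. */
#define IO_TLB_SHIFT 11
#define IO_TLB_SIZE  (1u << IO_TLB_SHIFT)

/* Before the diff: keep only min_align_mask bits at or above the slot size. */
static uint32_t iotlb_mask_before(uint32_t min_align_mask)
{
	return min_align_mask & ~(IO_TLB_SIZE - 1);
}

/*
 * After the diff: additionally drop every bit already guaranteed by the
 * allocation alignment (alloc_align_mask, widened to at least one slot).
 */
static uint32_t iotlb_mask_after(uint32_t min_align_mask,
				 uint32_t alloc_align_mask)
{
	alloc_align_mask |= IO_TLB_SIZE - 1;
	return min_align_mask & ~alloc_align_mask;
}

/* Simplified model: a slot is usable if it agrees with orig_addr under mask. */
static bool slot_matches(uint64_t slot_addr, uint64_t orig_addr, uint32_t mask)
{
	return (slot_addr & mask) == (orig_addr & mask);
}

/* Values taken from the logs quoted in this thread. */
static void check_logged_values(void)
{
	const uint64_t orig_addr = 0x105774600ull;
	const uint64_t tlb_addr = 0xbd830000ull;	/* 64 KiB aligned */
	const uint32_t min_align = 0xfff;		/* 4 KiB NVMe page mask */
	const uint32_t alloc_align = 0xffff;		/* 64 KiB IOMMU granule */

	/* Old computation: only bit 11 of the address is compared. */
	assert(iotlb_mask_before(min_align) == 0x800);
	assert(slot_matches(tlb_addr, orig_addr, 0x800));

	/* A mask of 0x7ff compares the in-slot offset (0x600) against 0. */
	assert(!slot_matches(tlb_addr, orig_addr, 0x7ff));

	/* New computation: nothing left to compare for a 64 KiB alignment. */
	assert(iotlb_mask_after(min_align, alloc_align) == 0);
	assert(slot_matches(tlb_addr, orig_addr, 0));
}
```

Calling check_logged_values() from a main() exercises all of the above. With
no alloc_align_mask requested (0), iotlb_mask_after() gives the same 0x800
as the old computation, so under these assumptions only aligned allocations
change behaviour.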