Date: Mon, 11 Mar 2024 21:05:07 +0100
From: Petr Tesařík
To: Will Deacon, Nicolin Chen
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com, iommu@lists.linux.dev, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Petr Tesarik, Dexuan Cui, Michael Kelley
Subject: Re: [PATCH v6 4/6] swiotlb: Fix alignment checks when both allocation and DMA masks are present
Message-ID: <20240311210507.217daf8b@meshulam.tesarici.cz>
In-Reply-To: <20240308152829.25754-5-will@kernel.org>
References: <20240308152829.25754-1-will@kernel.org> <20240308152829.25754-5-will@kernel.org>

On Fri, 8 Mar 2024 15:28:27 +0000
Will Deacon wrote:

> Nicolin reports that swiotlb buffer allocations fail for an NVME device
> behind an IOMMU using 64KiB pages. This is because we end up with a
> minimum allocation alignment of 64KiB (for the IOMMU to map the buffer
> safely) but a minimum DMA alignment mask corresponding to a 4KiB NVME
> page (i.e. preserving the 4KiB page offset from the original allocation).
> If the original address is not 4KiB-aligned, the allocation will fail
> because swiotlb_search_pool_area() erroneously compares these unmasked
> bits with the 64KiB-aligned candidate allocation.
>
> Tweak swiotlb_search_pool_area() so that the DMA alignment mask is
> reduced based on the required alignment of the allocation.
>
> Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
> Reported-by: Nicolin Chen
> Link: https://lore.kernel.org/r/cover.1707851466.git.nicolinc@nvidia.com
> Tested-by: Nicolin Chen
> Reviewed-by: Michael Kelley
> Signed-off-by: Will Deacon
> ---
>  kernel/dma/swiotlb.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index c20324fba814..c381a7ed718f 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -981,8 +981,7 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
>  	dma_addr_t tbl_dma_addr =
>  		phys_to_dma_unencrypted(dev, pool->start) & boundary_mask;
>  	unsigned long max_slots = get_max_slots(boundary_mask);
> -	unsigned int iotlb_align_mask =
> -		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
> +	unsigned int iotlb_align_mask = dma_get_min_align_mask(dev);
>  	unsigned int nslots = nr_slots(alloc_size), stride;
>  	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
>  	unsigned int index, slots_checked, count = 0, i;
> @@ -993,6 +992,14 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
>  	BUG_ON(!nslots);
>  	BUG_ON(area_index >= pool->nareas);
>
> +	/*
> +	 * Ensure that the allocation is at least slot-aligned and update
> +	 * 'iotlb_align_mask' to ignore bits that will be preserved when
> +	 * offsetting into the allocation.
> +	 */
> +	alloc_align_mask |= (IO_TLB_SIZE - 1);
> +	iotlb_align_mask &= ~alloc_align_mask;
> +

I have started writing the KUnit test suite, and the results look
incorrect to me for this case.

I'm calling swiotlb_tbl_map_single() with:

* alloc_align_mask = 0xfff
* a device with min_align_mask = 0xfff
* the 12 lowest bits of orig_addr are 0xfa0

The min_align_mask becomes zero after the masking added by this patch,
and the 12 lowest bits of the returned address are 0x7a0, i.e. not
equal to 0xfa0.

In other words, the min_align_mask constraint is not honored. Of course,
given the above values, it is not possible to honor both min_align_mask
and alloc_align_mask. I find it somewhat surprising that NVMe does not
in fact require that the NVME_CTRL_PAGE_SHIFT low bits are preserved,
as suggested by Nicolin's successful testing.

Why is that? Does IOMMU do some additional post-processing of the
bounce buffer address to restore the value of bit 11?

Or is this bit always zero in all real-world scenarios?

Petr T
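
For reference, the mask arithmetic described above can be sketched in userspace C. This is a minimal model under stated assumptions, not kernel code: the helper names `patched_iotlb_align_mask()` and `bounce_low_bits()` are hypothetical, the in-slot offset is assumed to be `orig_addr & min_align_mask & (IO_TLB_SIZE - 1)`, and the slot base is assumed to match `orig_addr` only in the bits that remain in `iotlb_align_mask`.

```c
#include <stdint.h>

/* IO_TLB_SHIFT is 11 in include/linux/swiotlb.h, i.e. 2 KiB slots. */
#define IO_TLB_SHIFT 11
#define IO_TLB_SIZE  (1u << IO_TLB_SHIFT)

/*
 * Hypothetical helper: value of 'iotlb_align_mask' after the two
 * statements added by the patch have been applied.
 */
unsigned int patched_iotlb_align_mask(unsigned int min_align_mask,
				      unsigned int alloc_align_mask)
{
	alloc_align_mask |= (IO_TLB_SIZE - 1);
	return min_align_mask & ~alloc_align_mask;
}

/*
 * Hypothetical helper: low bits of the bounce address in this
 * simplified model. Assumption: the slot base agrees with orig_addr
 * only in the bits left in iotlb_align_mask, and the in-slot offset
 * carries the bits below IO_TLB_SIZE.
 */
unsigned int bounce_low_bits(uint64_t orig_addr,
			     unsigned int min_align_mask,
			     unsigned int alloc_align_mask)
{
	unsigned int iotlb_align_mask =
		patched_iotlb_align_mask(min_align_mask, alloc_align_mask);
	unsigned int offset =
		(unsigned int)(orig_addr & min_align_mask & (IO_TLB_SIZE - 1));
	unsigned int base_bits = (unsigned int)(orig_addr & iotlb_align_mask);

	return base_bits | offset;
}
```

With the values from the report, `patched_iotlb_align_mask(0xfff, 0xfff)` evaluates to 0 and `bounce_low_bits(0xfa0, 0xfff, 0xfff)` to 0x7a0, so bit 11 of the original address is lost. With `alloc_align_mask = 0` the same model keeps bit 11 in the mask (0x800) and yields 0xfa0.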