From: Guo Ren
Date: Sat, 6 May 2023 15:25:41 +0800
Subject: Re: [PATCH 09/21] riscv: dma-mapping: skip invalidation before bidirectional DMA
To: Arnd Bergmann
Cc: Arnd Bergmann, Christoph Hellwig, linux-kernel@vger.kernel.org, Vineet Gupta, Will Deacon, Russell King, Neil Armstrong, Linus Walleij, Catalin Marinas, Brian Cain, Geert Uytterhoeven, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Stafford Horne, Helge Deller, Michael Ellerman, Christophe Leroy, Paul Walmsley, Palmer Dabbelt, Rich Felker, John Paul Adrian Glaubitz, "David S. Miller", Max Filippov, Robin Murphy, "Lad, Prabhakar", Conor Dooley, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org
References: <20230327121317.4081816-1-arnd@kernel.org> <20230327121317.4081816-10-arnd@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 5, 2023 at 9:19 PM Arnd Bergmann wrote:
>
> On Fri, May 5, 2023, at 07:47, Guo Ren wrote:
> > On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann wrote:
> >>
> >> riscv also invalidates the caches before the transfer, which does
> >> not appear to serve any purpose.
> > Yes, we can't guarantee that the CPU won't randomly preload cache
> > lines while the DMA is in flight.
> >
> > But I have two reasons to keep the invalidation before the DMA
> > transfer:
> >  - We clearly tell the CPU these cache lines are invalid. The caching
> >    algorithm would use these invalid slots first instead of replacing
> >    valid ones.
> >  - Invalidation is very cheap. Actually, flush and clean have the
> >    same performance on our machine.
>
> The main purpose of the series was to get consistent behavior on
> all machines, so I really don't want a custom optimization on
> one architecture. You make a good point about cacheline reuse
> after invalidation, but if we do that, I'd suggest doing this
> across all architectures.
Yes, invalidation for DMA_FROM_DEVICE in the for_device direction is a
proposal for all architectures.

> > So, how about:
> >
> > diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> > index d919efab6eba..2c52fbc15064 100644
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
> > @@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
> >                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> >                 break;
> >         case DMA_FROM_DEVICE:
> > -               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> > -               break;
> >         case DMA_BIDIRECTIONAL:
> >                 ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> >                 break;
>
> This is something we can consider. Unfortunately, this is something
> that no architecture (except pa-risc, which has other problems)
> does at the moment, so we'd probably need to have a proper debate
> about this.
>
> We already have two conflicting ways to handle DMA_FROM_DEVICE,
> either invalidate/invalidate, or clean/invalidate. I can see

I vote for invalidate/invalidate. My key point is that
DMA_FROM_DEVICE should invalidate in the for_device direction, and
DMA_BIDIRECTIONAL subsumes DMA_FROM_DEVICE. So I would also agree
with:

@@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_FROM_DEVICE:
-               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+               ALT_CMO_OP(invalidate, vaddr, size, riscv_cbom_block_size);
                break;
        case DMA_BIDIRECTIONAL:
                ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
                break;

> that flush/invalidate may be a sensible option as well, but I'd
> want to have that discussion after the series is complete, so
> we can come to a generic solution that has the same documented
> behavior across all architectures.

Yes, I agree we should unify them into a generic solution first. My
proposal can be a separate topic in the future.
For that purpose, I give Acked-by: Guo Ren

> In particular, if we end up moving arm64 and riscv back to the
> traditional invalidate/invalidate for DMA_FROM_DEVICE and
> document that drivers must not rely on buffers getting cleaned

After invalidation, the cache lines are clean as well, right? So why
do we need to document it additionally?

> before a partial DMA_FROM_DEVICE, the question between clean
> or flush becomes moot as well.
>
> > @@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
> >                 break;
> >         case DMA_FROM_DEVICE:
> >         case DMA_BIDIRECTIONAL:
> >         /* I'm not sure all drivers have guaranteed cacheline
> >            alignment. If not, this inval would cause problems */
> > -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> > +               ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
> >                 break;
>
> This is my original patch, and I would not mix it with the other
> change. The problem with non-aligned DMA_BIDIRECTIONAL buffers
> is that both flush and inval would be wrong if you get simultaneous
> writes from device and cpu to the same cache line, so there is
> no way to win this. Using inval instead of flush would at least
> work if the CPU data in the cacheline is read-only from the CPU,
> so that seems better than something that is always wrong.

If the CPU data in the cacheline is read-only, the cacheline would
never be dirty, so yes, it's always safe. Okay, I agree we must keep
buffers cacheline-aligned. I added the comment above only because I
worry that some sloppy drivers couldn't work with the invalidate
approach: their CPU data would be corrupted, and the device data left
in the cacheline is useless.

> The documented API is that sharing the cache line is not allowed
> at all, so anything that would observe a difference between the
> two is also a bug.
> One idea that we have considered already is
> that we could overwrite the unused bits of the cacheline with
> poison values and/or mark them as invalid using KASAN for debugging
> purposes, to find drivers that already violate this.
>
>      Arnd

--
Best Regards
 Guo Ren