References: <20230327121317.4081816-1-arnd@kernel.org>
 <20230327121317.4081816-10-arnd@kernel.org>
Date: Fri, 05 May 2023 15:18:54 +0200
From: "Arnd Bergmann"
To: guoren, "Arnd Bergmann", "Christoph Hellwig"
Cc: linux-kernel@vger.kernel.org, "Vineet Gupta", "Will Deacon",
 "Russell King", "Neil Armstrong", "Linus Walleij", "Catalin Marinas",
 "Brian Cain", "Geert Uytterhoeven", "Michal Simek",
 "Thomas Bogendoerfer", "Dinh Nguyen", "Stafford Horne", "Helge Deller",
 "Michael Ellerman", "Christophe Leroy", "Paul Walmsley",
 "Palmer Dabbelt", "Rich Felker", "John Paul Adrian Glaubitz",
 "David S. Miller", "Max Filippov", "Robin Murphy", "Lad, Prabhakar",
 "Conor Dooley", linux-snps-arc@lists.infradead.org,
 linux-arm-kernel@lists.infradead.org, linux-oxnas@groups.io,
 linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org,
 linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
 linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-xtensa@linux-xtensa.org
Subject: Re: [PATCH 09/21] riscv: dma-mapping: skip invalidation before
 bidirectional DMA

On Fri, May 5, 2023, at 07:47, Guo Ren wrote:
> On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann wrote:
>>
>> riscv also invalidates the caches before the transfer, which does
>> not appear to serve any purpose.
> Yes, we can't guarantee the CPU won't pre-load cache lines at random
> while the DMA transfer is in progress.
>
> But I have two reasons for keeping the invalidation before the dma
> transfer:
> - We clearly tell the CPU these cache lines are invalid. The cache
>   replacement algorithm would then fill these invalid slots first
>   instead of evicting valid ones.
> - Invalidation is very cheap. Actually, flush and clean have the same
>   performance on our machine.

The main purpose of the series was to get consistent behavior on all
machines, so I really don't want a custom optimization on one
architecture. You make a good point about cacheline reuse after
invalidation, but if we do that, I'd suggest doing it across all
architectures.

> So, how about:
>
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index d919efab6eba..2c52fbc15064 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>                 ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>                 break;
>         case DMA_FROM_DEVICE:
> -               ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> -               break;
>         case DMA_BIDIRECTIONAL:
>                 ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
>                 break;

This is something we can consider. Unfortunately, it is something that
no architecture (except parisc, which has other problems) does at the
moment, so we'd probably need a proper debate about it.

We already have two conflicting ways to handle DMA_FROM_DEVICE: either
invalidate/invalidate or clean/invalidate. I can see that
flush/invalidate may be a sensible option as well, but I'd want to have
that discussion after the series is complete, so we can come to a
generic solution with the same documented behavior across all
architectures.

In particular, if we end up moving arm64 and riscv back to the
traditional invalidate/invalidate for DMA_FROM_DEVICE, and document
that drivers must not rely on buffers getting cleaned before a partial
DMA_FROM_DEVICE transfer, the question of clean vs. flush becomes moot
as well.
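To spell out the conventions being compared: the following is a
simplified user-space model, not kernel code. The CONV_* labels are
made up for illustration, and the strings name which cache-block
management op each convention would issue for a given DMA direction.

```c
/* Toy model of the cache maintenance conventions discussed above.
 * "clean" = write back, "inval" = invalidate, "flush" = both. */
#include <assert.h>
#include <string.h>

enum dma_dir { DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL };
enum convention {
	CONV_INVAL_INVAL,	/* traditional: inval before, inval after */
	CONV_CLEAN_INVAL,	/* arm64/riscv today: clean before, inval after */
	CONV_FLUSH_INVAL	/* the proposal above: flush before, inval after */
};

/* op in arch_sync_dma_for_device(), i.e. before the device runs */
static const char *for_device_op(enum convention c, enum dma_dir dir)
{
	if (dir == DMA_TO_DEVICE)
		return "clean";		/* write back dirty CPU data */
	if (dir == DMA_BIDIRECTIONAL)
		return "flush";		/* clean + invalidate */
	switch (c) {			/* only DMA_FROM_DEVICE differs */
	case CONV_INVAL_INVAL:	return "inval";
	case CONV_CLEAN_INVAL:	return "clean";
	case CONV_FLUSH_INVAL:	return "flush";
	}
	return "?";
}

/* op in arch_sync_dma_for_cpu(), i.e. after the device is done */
static const char *for_cpu_op(enum dma_dir dir)
{
	return dir == DMA_TO_DEVICE ? "none" : "inval";
}
```

For DMA_FROM_DEVICE the three options thus come out as inval/inval,
clean/inval and flush/inval; the point above is that whichever one is
picked should be picked by every architecture.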
> @@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
>                 break;
>         case DMA_FROM_DEVICE:
>         case DMA_BIDIRECTIONAL:
>                 /* I'm not sure all drivers have guaranteed cacheline
>                  * alignment. If not, this inval would cause problems */
> -               ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +               ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
>                 break;

This is my original patch, and I would not mix it with the other
change. The problem with non-cacheline-aligned DMA_BIDIRECTIONAL
buffers is that both flush and inval would be wrong if you get
simultaneous writes from device and CPU to the same cache line, so
there is no way to win this. Using inval instead of flush would at
least work if the data in the cacheline is read-only from the CPU's
point of view, so that seems better than something that is always
wrong.

The documented API is that sharing the cache line is not allowed at
all, so anything that would observe a difference between the two is
also a bug. One idea we have already considered is to overwrite the
unused bits of the cacheline with poison values and/or mark them as
invalid using KASAN for debugging purposes, to find drivers that
already violate this.

     Arnd
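The poisoning idea mentioned above could look roughly like the
following user-space sketch. The constants (CACHELINE_SIZE, DMA_POISON)
and helper names are made up for illustration and are not kernel
interfaces; a real implementation would hook into the dma-mapping sync
paths and use the actual cache-block size.

```c
/* Sketch: before handing a buffer to the device, poison the unused
 * head/tail bytes of the enclosing cache lines; check them afterwards
 * to catch anyone sharing a cache line with a DMA buffer. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE_SIZE	64	/* assumed cache-block size */
#define DMA_POISON	0xaa	/* arbitrary poison byte */

/* Poison the bytes of the first/last cache line that lie outside
 * [base + off, base + off + len).  'base' is the start of the backing
 * allocation, assumed to cover whole cache lines. */
static void dma_poison_edges(uint8_t *base, size_t off, size_t len)
{
	size_t start = off & ~(size_t)(CACHELINE_SIZE - 1);
	size_t end = (off + len + CACHELINE_SIZE - 1) &
		     ~(size_t)(CACHELINE_SIZE - 1);

	memset(base + start, DMA_POISON, off - start);		/* head slack */
	memset(base + off + len, DMA_POISON, end - off - len);	/* tail slack */
}

/* Return true if any poisoned slack byte changed, i.e. the CPU or the
 * device touched cache lines that the DMA buffer does not own. */
static bool dma_poison_check(const uint8_t *base, size_t off, size_t len)
{
	size_t start = off & ~(size_t)(CACHELINE_SIZE - 1);
	size_t end = (off + len + CACHELINE_SIZE - 1) &
		     ~(size_t)(CACHELINE_SIZE - 1);
	size_t i;

	for (i = start; i < off; i++)
		if (base[i] != DMA_POISON)
			return true;
	for (i = off + len; i < end; i++)
		if (base[i] != DMA_POISON)
			return true;
	return false;
}
```

A KASAN-based variant would mark the same slack bytes as invalid
instead of poisoning them, trading the explicit check for an immediate
report on the offending access.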