Date: Wed, 6 Dec 2023 12:14:18 +0000
From: Catalin Marinas
To: Marc Zyngier
Cc: Jason Gunthorpe, ankita@nvidia.com, Shameerali Kolothum Thodi,
 oliver.upton@linux.dev, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 will@kernel.org, ardb@kernel.org, akpm@linux-foundation.org,
 gshan@redhat.com, aniketa@nvidia.com, cjia@nvidia.com,
 kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com,
 acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
 danw@nvidia.com, mochs@nvidia.com, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, lpieralisi@kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
In-Reply-To: <868r67blwo.wl-maz@kernel.org>

On Wed, Dec 06, 2023 at 11:39:03AM +0000, Marc Zyngier wrote:
> On Tue, 05 Dec 2023 18:40:42 +0000,
> Catalin Marinas wrote:
> > On Tue, Dec 05, 2023 at 05:50:27PM +0000, Marc Zyngier wrote:
> > > On Tue, 05 Dec 2023 17:33:01 +0000,
> > > Catalin Marinas wrote:
> > > > Ideally we should do this for vfio only but we don't have an easy
> > > > way to convey this to KVM.
> > >
> > > But if we want to limit this to PCIe, we'll have to find out. The
> > > initial proposal (a long while ago) had a flag conveying some
> > > information, and I'd definitely feel more confident having something
> > > like that.
> >
> > We can add a VM_PCI_IO in the high vma flags to be set by
> > vfio_pci_core_mmap(), though it limits it to 64-bit architectures. KVM
> > knows this is PCI and relaxes things a bit. It's not generic though if
> > we need this later for something else.
>
> Either that, or something actually describing the attributes that VFIO
> wants.
>
> And I very much want it to be a buy-in behaviour, not something that
> automagically happens and changes the default behaviour for everyone
> based on some hand-wavy assertions.
>
> If that means a userspace change, fine by me. The VMM better know what
> is happening.

Driving the attributes from a single point like the VFIO driver is
indeed better. The problem is that write-combining on Arm doesn't come
without speculative loads, otherwise we would have solved it by now. I
also recall the VFIO maintainer pushing back on relaxing the
pgprot_noncached() for the user mapping but I don't remember the
reasons.

We could do with a pgprot_maybewritecombine() or
pgprot_writecombinenospec() (similar to Jason's idea but without
changing the semantics of pgprot_device()). For the user mapping on
arm64 this would be Device (even _GRE) since it can't disable
speculation but stage 2 would leave the decision to the guest since the
speculative loads aren't much different from committed loads done
wrongly.

If we want the VMM to drive this entirely, we could add a new mmap()
flag like MAP_WRITECOMBINE or PROT_WRITECOMBINE. They do feel a bit
weird but there is precedent with PROT_MTE to describe a memory type.

One question is whether the VFIO driver still needs to have the
knowledge and sanitise the requests from the VMM within a single BAR. If
there are no security implications to such mappings, the VMM can map
parts of the BAR as pgprot_noncached(), other parts as
pgprot_writecombine() and KVM just follows them (similarly if we need a
cacheable mapping).

The latter has some benefits for DPDK but it's a lot more involved with
having to add device-specific knowledge into the VMM. The VMM would also
have to present the whole BAR contiguously to the guest even if there
are different mapping attributes within the range. So a lot of MAP_FIXED
uses.
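
To give a rough idea of the MAP_FIXED juggling this implies, here is a
minimal userspace sketch, not what any VMM does today. It reserves one
contiguous window and overlays each sub-range with its own mmap() call.
Everything named here is an assumption for illustration: struct
bar_region and map_bar_contiguous() are hypothetical, the fd stands in
for a VFIO device fd, and the per-region "prot" field stands in for the
memory attribute, since no MAP_WRITECOMBINE or PROT_WRITECOMBINE flag
exists.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical description of one sub-range of a BAR. */
struct bar_region {
    size_t offset;  /* page-aligned offset within the BAR */
    size_t length;  /* page-aligned length of the sub-range */
    int    prot;    /* stand-in for the per-region attribute (assumption) */
};

/*
 * Reserve bar_size bytes of contiguous address space, then overlay each
 * region on top of the reservation with MAP_FIXED. fd_base is the offset
 * of the BAR within fd. Returns the window base, or MAP_FAILED on error.
 */
void *map_bar_contiguous(int fd, off_t fd_base, size_t bar_size,
                         const struct bar_region *regions, size_t n)
{
    assert(regions != NULL || n == 0);

    /* Address-space reservation only: inaccessible until overlaid. */
    uint8_t *base = mmap(NULL, bar_size, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    for (size_t i = 0; i < n; i++) {
        const struct bar_region *r = &regions[i];

        /* Refuse regions that would spill outside the reservation. */
        if (r->offset > bar_size || r->length > bar_size - r->offset) {
            munmap(base, bar_size);
            errno = EINVAL;
            return MAP_FAILED;
        }

        /* MAP_FIXED replaces the PROT_NONE pages at exactly this address. */
        void *p = mmap(base + r->offset, r->length, r->prot,
                       MAP_SHARED | MAP_FIXED, fd,
                       fd_base + (off_t)r->offset);
        if (p == MAP_FAILED) {
            int saved = errno;
            munmap(base, bar_size);
            errno = saved;
            return MAP_FAILED;
        }
    }
    return base;
}
```

A caller would build the bar_region table from device-specific knowledge
(which is exactly the part that does not belong in a generic VMM), call
map_bar_contiguous() once per BAR and hand the resulting window to KVM
as a single memslot.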
I'd rather leave this decision with the guest than the VMM; it looks
like more hassle to create those mappings. The VMM or VFIO could only
state that write-combine and speculation are allowed.

-- 
Catalin