Date: Tue, 5 Dec 2023 19:24:37 +0000
From: Catalin Marinas
To: Jason Gunthorpe
Cc: Marc Zyngier, ankita@nvidia.com, Shameerali Kolothum Thodi,
 oliver.upton@linux.dev, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 will@kernel.org, ardb@kernel.org, akpm@linux-foundation.org,
 gshan@redhat.com, aniketa@nvidia.com, cjia@nvidia.com,
 kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com,
 acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
 danw@nvidia.com, mochs@nvidia.com,
 kvmarm@lists.linux.dev, kvm@vger.kernel.org, lpieralisi@kernel.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
References: <20231205033015.10044-1-ankita@nvidia.com> <86fs0hatt3.wl-maz@kernel.org> <20231205130517.GD2692119@nvidia.com> <20231205164318.GG2692119@nvidia.com>
In-Reply-To: <20231205164318.GG2692119@nvidia.com>

On Tue, Dec 05, 2023 at 12:43:18PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 05, 2023 at 04:22:33PM +0000, Catalin Marinas wrote:
> > Yeah, I made this argument in the past. But it's a fair question to ask
> > since the Arm world is different from x86. Just reusing an existing
> > driver in a different context may break its expectations. Does Normal NC
> > access complete by the time a TLBI (for Stage 2) and DSB (DVMsync) is
> > completed? It does reach some point of serialisation with subsequent
> > accesses to the same address but not sure how it is ordered with an
> > access to a different location like the config space used for reset.
> > Maybe it's not a problem at all or it is safe only for PCIe but it would
> > be good to get to the bottom of this.
>
> IMHO, the answer is you can't know architecturally.
> The specific vfio-platform driver must do an analysis of its specific
> SOC and determine what exactly is required to order the reset. The
> primary purpose of the vfio-platform drivers is to provide this reset!
>
> In most cases I would expect some reads from the device to be required
> before the reset.

I can see in the vfio_platform_common.c code that the reset is either
handled by an ACPI _RST method or by some custom function in the case of
DT. Let's consider the ACPI method for now. I assume the AML code pokes
some device registers, but we can't say much about the ordering it
expects without knowing the details. The AML may assume that the ioaddr
mapped as Device-nGnRE (ioremap()) in the kernel has the same attributes
wherever else it is mapped, in user space or in guests. Note that
currently the vfio_platform and vfio_pci drivers only allow
pgprot_noncached() in user space, so they wouldn't have to worry about
other mismatched aliases.

I think PCIe is slightly better documented, but even here we'll have to
rely on the TLBI+DSB to clear any prior writes on different CPUs.

It can be argued that it's the responsibility of whoever grants device
access to know the details. However, it would help if we gave some
guidance: are any expectations broken if an alias is Normal-NC? It's
easier to start with PCIe first until we get some concrete request for
other types of devices.

> > So, I think it would be easier to get this patch upstream if we limit
> > the change to PCIe devices for now. We may relax this further in the
> > future. Do you actually have a need for non-PCIe devices to support WC
> > in the guest or it's more about the complexity of the logic to detect
> > whether it's actually a PCIe BAR we are mapping into the guest? (I can
> > see some Arm GPU folk asking for this but those devices are not easily
> > virtualisable).
>
> The complexity is my concern, and the disruption to the ecosystem with
> some of the ideas given.
>
> If there was a trivial way to convey in the VMA that it is safe then
> sure, no objection from me.

I suggested a new VM_* flag or some way to probe the iomem_resources for
PCIe ranges (if they are described in there, not sure). We can invent
some other tree to search for ranges that get registered by the vfio
driver; I don't think it's that difficult. The question is, do we need
to do this for other types of devices, or is it mostly theoretical at
this point (what's theoretical goes both ways really)?

A more complex way is to change vfio to allow Normal mappings and have
KVM mimic them. You were actually planning to do this for Cacheable
anyway.

> I would turn it around and ask we find a way to restrict platform
> devices when someone comes with a platform device that wants to use
> secure kvm and has a single well defined HW problem that is solved by
> this work.

We end up with a similar search/validation mechanism, so I'm not sure we
gain much.

> What if we change vfio-pci to use pgprot_device() like it already
> really should and say the pgprot_noncached() is enforced as
> DEVICE_nGnRnE and pgprot_device() may be DEVICE_nGnRE or NORMAL_NC?
> Would that be acceptable?

pgprot_device() needs to stay as Device, otherwise you'd get speculative
reads with potential side-effects.

-- 
Catalin
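
For illustration only, the range-search idea mentioned above (checking
whether a mapping falls entirely inside a range that was registered as
safe for Normal-NC, e.g. a PCIe BAR) could be sketched in user-space C
roughly as follows. This is not kernel code and not an existing API:
struct safe_range and range_is_safe() are hypothetical names, and a
sorted array with a binary search stands in for whatever tree the kernel
would actually use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one physical address range registered as safe for Normal-NC. */
struct safe_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Return true if [start, start + size) lies entirely inside one entry of
 * `ranges`. The array must be sorted by start and must not overlap.
 */
static bool range_is_safe(const struct safe_range *ranges, size_t n,
			  uint64_t start, uint64_t size)
{
	size_t lo = 0, hi = n;

	/* Reject empty requests and requests that wrap the address space. */
	if (size == 0 || start + size < start)
		return false;

	/* Find the number of ranges whose start is <= the request start. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ranges[mid].start <= start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Every registered range starts above the request. */
	if (lo == 0)
		return false;

	/* The candidate is the last range starting at or below the request. */
	return start + size <= ranges[lo - 1].end;
}
```

A request that straddles the end of a registered range, or that falls in
a gap between two ranges, is rejected, so a caller would fall back to a
Device mapping in those cases.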