Received: by 10.223.176.5 with SMTP id f5csp1770342wra; Wed, 31 Jan 2018 11:16:05 -0800 (PST) X-Google-Smtp-Source: AH8x226hv5nIijCfVhffS6cxIT27cPVTyzjXcyPC1RrLaI/Ydcc1isAD6lSKXydCnIxXeL3emEF1 X-Received: by 10.99.115.67 with SMTP id d3mr28051129pgn.223.1517426165542; Wed, 31 Jan 2018 11:16:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1517426165; cv=none; d=google.com; s=arc-20160816; b=jvQ5jGU47OMvc0h1/5XtRMf7JUCm4cUgcmKsXaKs9YZZO93vhG3cKRKJ0E/SXuUiNe KVtOVXtW9bvaPQw67Wz3vEEuQAX29bIjMzMy6DLFxzPwZ9GUaMeAHXLGgENBVhXbJqHl d2/nvxKSz+Lol7HyW8uMZSTWKj7wkt6O3m+pybCxge44uQrZExWlzFLtih39UpiWjpdU ajmLzbu/zlk0NFYhrS1SYk/pnTvBYHcEYjTFqrAPoq4dFezwFZXbAqcKs8K105uDetZA JyAL3IbqCwEae18TaDus/Fmjlvtln6mA1l32xwHVYrybi5Ecqo694iFHz8EVqd+pdHZu rx7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=G6JLl7Qh9sAyRbO12LW9l+/dCE8kWomHgQ1RRSoKOhk=; b=q1Jv5kvVm30t2PsSf6XKL5NYILzpns6aQv8kC6bC1/UErag2UCM7klfX1QsrK+/8mq Khfxj/VXVBXiLJdq5J4jcrZhBTVTuLNGBF1YNIpOslnD31EQwSdFR5DEKwCBoVNFQKR0 kLahb351NHD48ksEJrSd1FrEz063VeBEZGwavLB2gTWkwaWlvdiZHGgYEmwO1uvFUo2P k5uB1F+YYwugjz2NGIZQ1JXXv0vSYCc7JMf1z3vcDEouXQkZlK9eDLffywAdGonD7p13 II1PRslsvSAghe7mVDKZ7ssxBILADdvBoLgKWQ8rievRfHkf0BXV/rVFUUJCsS5cBMdJ +fng== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c2-v6si564559plb.439.2018.01.31.11.15.48; Wed, 31 Jan 2018 11:16:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751625AbeAaTPW (ORCPT + 99 others); Wed, 31 Jan 2018 14:15:22 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35694 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751163AbeAaTPV (ORCPT ); Wed, 31 Jan 2018 14:15:21 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1B3F0C03677B; Wed, 31 Jan 2018 19:15:21 +0000 (UTC) Received: from kamzik.brq.redhat.com (unknown [10.43.2.160]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B797260473; Wed, 31 Jan 2018 19:15:18 +0000 (UTC) Date: Wed, 31 Jan 2018 20:15:16 +0100 From: Andrew Jones To: Marc Zyngier Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, Lorenzo Pieralisi , Ard Biesheuvel , Catalin Marinas , Will Deacon , Jon Masters , Robin Murphy Subject: Re: [PATCH v2 07/16] arm/arm64: KVM: Add PSCI version selection API Message-ID: <20180131191516.gnz6jk7pbt25crcx@kamzik.brq.redhat.com> References: <20180129174559.1866-1-marc.zyngier@arm.com> <20180129174559.1866-8-marc.zyngier@arm.com> <20180131173824.leyfy4vlqq3vmd37@kamzik.brq.redhat.com> <1de5a7a8-8118-597f-d1a3-14c406b57b0e@arm.com> <20180131180330.ev5cypq5k5ai43wm@kamzik.brq.redhat.com> <86dc5223-3fa9-688d-fb2d-8bc911356eba@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <86dc5223-3fa9-688d-fb2d-8bc911356eba@arm.com> User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Wed, 31 Jan 2018 19:15:21 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 31, 2018 at 06:36:38PM +0000, Marc Zyngier wrote: > On 31/01/18 18:03, Andrew Jones wrote: > > On Wed, Jan 31, 2018 at 05:45:56PM +0000, Marc Zyngier wrote: > >> On 31/01/18 17:38, Andrew Jones wrote: > >>> On Mon, Jan 29, 2018 at 05:45:50PM +0000, Marc Zyngier wrote: > >>>> Although we've implemented PSCI 1.0 and 1.1, nothing can select them > >>>> Since all the new PSCI versions are backward compatible, we decide to > >>>> default to the latest version of the PSCI implementation. This is no > >>>> different from doing a firmware upgrade on KVM. > >>>> > >>>> But in order to give a chance to hypothetical badly implemented guests > >>>> that would have a fit by discovering something other than PSCI 0.2, > >>>> let's provide a new API that allows userspace to pick one particular > >>>> version of the API. > >>>> > >>>> This is implemented as a new class of "firmware" registers, where > >>>> we expose the PSCI version. This allows the PSCI version to be > >>>> save/restored as part of a guest migration, and also set to > >>>> any supported version if the guest requires it. > >>>> > >>>> Signed-off-by: Marc Zyngier > >>>> --- > >>>> Documentation/virtual/kvm/api.txt | 3 +- > >>>> Documentation/virtual/kvm/arm/psci.txt | 28 ++++++++++++++ > >>>> arch/arm/include/asm/kvm_host.h | 3 ++ > >>>> arch/arm/include/uapi/asm/kvm.h | 6 +++ > >>>> arch/arm/kvm/guest.c | 13 +++++++ > >>>> arch/arm64/include/asm/kvm_host.h | 3 ++ > >>>> arch/arm64/include/uapi/asm/kvm.h | 6 +++ > >>>> arch/arm64/kvm/guest.c | 14 ++++++- > >>>> include/kvm/arm_psci.h | 9 +++++ > >>>> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++- > >>>> 10 files changed, 149 insertions(+), 4 deletions(-) > >>>> create mode 100644 Documentation/virtual/kvm/arm/psci.txt > >>>> > >>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > >>>> index 57d3ee9e4bde..334905202141 100644 > >>>> --- a/Documentation/virtual/kvm/api.txt > >>>> +++ b/Documentation/virtual/kvm/api.txt > >>>> @@ -2493,7 +2493,8 @@ Possible features: > >>>> and execute guest code when KVM_RUN is called. > >>>> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode. > >>>> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only). > >>>> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU. > >>>> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision > >>>> + backward compatible with v0.2) for the CPU. > >>>> Depends on KVM_CAP_ARM_PSCI_0_2. > >>>> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU. > >>>> Depends on KVM_CAP_ARM_PMU_V3. > >>>> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt > >>>> new file mode 100644 > >>>> index 000000000000..2e49a4e9f084 > >>>> --- /dev/null > >>>> +++ b/Documentation/virtual/kvm/arm/psci.txt > >>>> @@ -0,0 +1,28 @@ > >>>> +KVM implements the PSCI (Power State Coordination Interface) > >>>> +specification in order to provide services such as CPU on/off, reset > >>>> +and power-off to the guest. > >>>> + > >>>> +The PSCI specification is regularly updated to provide new features, > >>>> +and KVM implements these updates if they make sense from a virtualization > >>>> +point of view. > >>>> + > >>>> +This means that a guest booted on two different versions of KVM can > >>>> +observe two different "firmware" revisions. This could cause issues if > >>>> +a given guest is tied to a particular PSCI revision (unlikely), or if > >>>> +a migration causes a different PSCI version to be exposed out of the > >>>> +blue to an unsuspecting guest. > >>>> + > >>>> +In order to remedy this situation, KVM exposes a set of "firmware > >>>> +pseuodo-registers" that can be manipulated using the GET/SET_ONE_REG > >>>> +interface. These registers can be saved/restored by userspace, and set > >>>> +to a convenient value if required. > >>> > >>> Userspace can save/restore any state it deems necessary. > >> > >> Only if it has access to it. Until now, that wasn't an option. > >> > >>> Is there another > >>> reason to invent a pseudo register? Considering the PSCI version is VM > >>> state, then maybe a VM "device" attribute[*], like s390 use, would fit > >>> better. > >> > >> Possibly. But that means current userspace won't engage the mitigation > >> by default, and that's pretty bad. > >> > >>> Or, if keeping it VCPU state has some benefit, then we already > >>> have VCPU device attribute support for ARM that we could extend with > >>> another attribute. An advantage of using the device-attr API is that it > >>> has KVM_HAS_DEVICE_ATTR, allowing the new attribute support to be probed. > >>> How should userspace check if the pseudo register is supported? Maybe > >>> by just failing with NOT_SUPPORTED to get/set it? > >>> > >>> [*] Documentation/virtual/kvm/devices/vm.txt > >> > >> Frankly, I have no opinion on the way to implement it. My only > >> requirement is that it becomes enabled by default, without any userspace > >> change. > > > > I think we can turn it on by default in KVM, no matter what API we choose, > > and then enable userspace (with whatever API) to turn it off. Newer > > machine types certainly wouldn't do that, and, as later PSCI is compatible > > with older PSCI, we're probably safe (safer regarding the BTB) not using > > the older PSCI with older machine types either - although that would be up > > for debate and the whole point of adding the API. > > OK. At least we're aligned on this. As for your question about finding > out about the register, here's what I have in kvmtool: Actually I was starting to have some second thoughts about that default, but... > > + reg.addr = (u64)&data; > + reg.id = KVM_REG_ARM_PSCI_VERSION; > + err = ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, ®); > + if (err < 0) > + pr_info("Unable to get PSCI version\n"); > + else > + pr_info("PSCI version %llx\n", data); > > If you get an error, the feature doesn't exist. One of the reasons for > using a register is that migration works with today's QEMU. And if you > try to restore a VM that's aware of the PSCI version (and possibly of > the BTB invalidation) on a host that doesn't support it, you get a fail > without any change of userspace. > > To me, that's a pretty good argument in favour of this method. But if ...I agree that does sound like a good argument. > you have strong feelings about it and would like me to implement > something else, please let me know ASAP. We'd like this series to land > in -rc1, so we only have a few days left... I'll do some experiments tomorrow and let you know if my second thoughts are valid concerns right away. > > >> > >>> > >>>> + > >>>> +The following register is defined: > >>>> + > >>>> +* KVM_REG_ARM_PSCI_VERSION: > >>>> + > >>>> + - Only valid if the vcpu has KVM_ARM_VCPU_PSCI_0_2 feature set > >>>> + - Returns the current PSCI version on GET_ONE_REG > >>>> + - Allows any supported PSCI version compatible with v0.2 to be set > >>>> + with SET_ONE_REG > >>>> + - Affects the whole VM (even if the register view is per-vcpu) > >>> > >>> > >>>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > >>>> index acbf9ec7b396..e9d57060d88c 100644 > >>>> --- a/arch/arm/include/asm/kvm_host.h > >>>> +++ b/arch/arm/include/asm/kvm_host.h > >>>> @@ -75,6 +75,9 @@ struct kvm_arch { > >>>> /* Interrupt controller */ > >>>> struct vgic_dist vgic; > >>>> int max_vcpus; > >>>> + > >>>> + /* Mandated version of PSCI */ > >>>> + u32 psci_version; > >>>> }; > >>>> > >>>> #define KVM_NR_MEM_OBJS 40 > >>>> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h > >>>> index 6edd177bb1c7..47dfc99f5cd0 100644 > >>>> --- a/arch/arm/include/uapi/asm/kvm.h > >>>> +++ b/arch/arm/include/uapi/asm/kvm.h > >>>> @@ -186,6 +186,12 @@ struct kvm_arch_memory_slot { > >>>> #define KVM_REG_ARM_VFP_FPINST 0x1009 > >>>> #define KVM_REG_ARM_VFP_FPINST2 0x100A > >>>> > >>>> +/* KVM-as-firmware specific pseudo-registers */ > >>>> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT) > >>> > >>> Is this 0x14 documented somewhere? I'm just curious how the space for > >>> pseudo registers is reserved. Is there a common place that we should > >>> document that 0x14 is for the firmware (if it's not documented there > >>> already)? > >> > >> No. First come, first served. > > > > Should we document it in Documentation/ somewhere? > We can. I tend to see the uapi/asm/kvm.h file as the reference (and I > feel that a text file would simply be an English translation of that > file), but I'm not opposed to it either. I'm not sure I have the > bandwidth for that at the moment though. Yeah, docs can come later. But now that I look, I think we just need a couple lines in Documentation/virtual/kvm/api.txt like ARM 32-bit firmware registers have the following id bit patterns: 0x4030 0000 0014 0 ARM 64-bit firmware registers have the following id bit patterns: 0x6030 0000 0014 0 (I'm not sure those are correct. I'd expect both to start with 4 like the other entries in that file, but maybe not, or the other 64-bit entries in that file aren't correct?) Thanks, drew