Date: Mon, 14 Mar 2022 19:41:02 +0000
From: Oliver Upton
To: Raghavendra Rao Ananta
Cc: Marc Zyngier, Andrew Jones, James Morse, Alexandru Elisei,
    Suzuki K Poulose, Paolo Bonzini, Catalin Marinas, Will Deacon,
    Peter Shier, Ricardo Koller, Reiji Watanabe, Jing Zhang,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v4 05/13] KVM: arm64: Setup a framework for hypercall bitmap firmware registers
References: <20220224172559.4170192-1-rananta@google.com> <20220224172559.4170192-6-rananta@google.com>
In-Reply-To: <20220224172559.4170192-6-rananta@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 24, 2022 at 05:25:51PM +0000, Raghavendra Rao Ananta wrote:
> KVM regularly introduces new hypercall services to the guests without
> any consent from the userspace. This means, the guests can observe
> hypercall services in and out as they migrate across various host
> kernel versions. This could be a major problem if the guest
> discovered a hypercall, started using it, and after getting migrated
> to an older kernel realizes that it's no longer available. Depending
> on how the guest handles the change, there's a potential chance that
> the guest would just panic.
>
> As a result, there's a need for the userspace to elect the services
> that it wishes the guest to discover. It can elect these services
> based on the kernels spread across its (migration) fleet. To remedy
> this, extend the existing firmware psuedo-registers, such as
> KVM_REG_ARM_PSCI_VERSION, for all the hypercall services available.
>
> These firmware registers are categorized based on the service call
> owners, and unlike the existing firmware psuedo-registers, they hold
> the features supported in the form of a bitmap.
>
> During the VM initialization, the registers holds an upper-limit of
> the features supported by the corresponding registers. It's expected
> that the VMMs discover the features provided by each register via
> GET_ONE_REG, and writeback the desired values using SET_ONE_REG.
> KVM allows this modification only until the VM has started.
>
> Older userspace code can simply ignore the capability and the
> hypercall services will be exposed unconditionally to the guests, thus
> ensuring backward compatibility.
>
> In this patch, the framework adds the register only for ARM's standard
> secure services (owner value 4). Currently, this includes support only
> for ARM True Random Number Generator (TRNG) service, with bit-0 of the
> register representing mandatory features of v1.0. The register is also
> added to the kvm_arm_vm_scope_fw_regs[] list as it maintains its state
> per-VM. Other services are momentarily added in the upcoming patches.
>
> Signed-off-by: Raghavendra Rao Ananta
> ---
>  arch/arm64/include/asm/kvm_host.h | 12 +++++
>  arch/arm64/include/uapi/asm/kvm.h |  8 ++++
>  arch/arm64/kvm/arm.c              |  8 ++++
>  arch/arm64/kvm/guest.c            |  1 +
>  arch/arm64/kvm/hypercalls.c       | 78 +++++++++++++++++++++++++++++++
>  include/kvm/arm_hypercalls.h      |  4 ++
>  6 files changed, 111 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e823571e50cc..1909ced3208f 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -101,6 +101,15 @@ struct kvm_s2_mmu {
>  struct kvm_arch_memory_slot {
>  };
>
> +/**
> + * struct kvm_hvc_desc: KVM ARM64 hypercall descriptor
> + *
> + * @hvc_std_bmap: Bitmap of standard secure service calls
> + */
> +struct kvm_hvc_desc {

nit: maybe call this structure kvm_hypercall_features? When nested comes
along guests will need to use the SVC conduit as HVC traps are always
taken to EL2. Same will need to be true for virtual EL2.

> +	u64 hvc_std_bmap;
> +};
> +
>  struct kvm_arch {
>  	struct kvm_s2_mmu mmu;
>
> @@ -142,6 +151,9 @@ struct kvm_arch {
>
>  	/* Capture first run of the VM */
>  	bool has_run_once;
> +
> +	/* Hypercall firmware register' descriptor */
> +	struct kvm_hvc_desc hvc_desc;
>  };
>
>  struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index c35447cc0e0c..2decc30d6b84 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -287,6 +287,14 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED	3
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
>
> +/* Bitmap firmware registers, extension to the existing psuedo-register space */
> +#define KVM_REG_ARM_FW_BMAP			KVM_REG_ARM_FW_REG(0xff00)

What is the motivation for moving the bitmap register indices so far
away from the rest of the firmware regs?

> +#define KVM_REG_ARM_FW_BMAP_REG(r)		(KVM_REG_ARM_FW_BMAP | (r))

If you are still going to use the index offset, just pass 'r' through to
the other macro:

  #define KVM_REG_ARM_FW_BMAP_REG(r)		KVM_REG_ARM_FW_REG(0xff00 + r)

> +#define KVM_REG_ARM_STD_BMAP			KVM_REG_ARM_FW_BMAP_REG(0)
> +#define KVM_REG_ARM_STD_BIT_TRNG_V1_0		BIT(0)
> +#define KVM_REG_ARM_STD_BMAP_BIT_MAX		0	/* Last valid bit */

Implementation details such as this probably shouldn't live in UAPI
headers. We'll likely need to bump the value in the future.

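To make the macro suggestion concrete, here is a rough standalone sketch
comparing the two forms. All SKETCH_* names are made up, and the constant
values are only assumed to mirror the uapi header; 'r' is also
parenthesized here, which the one-liner above does not do. Not the actual
kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-ins for the uapi definitions. The values are assumed to
 * mirror arch/arm64/include/uapi/asm/kvm.h and exist only so that the
 * sketch builds on its own.
 */
#define SKETCH_REG_ARM64	0x6000000000000000ULL
#define SKETCH_REG_SIZE_U64	0x0030000000000000ULL
#define SKETCH_REG_ARM_FW	(0x0014ULL << 16)
#define SKETCH_FW_REG(r) \
	(SKETCH_REG_ARM64 | SKETCH_REG_SIZE_U64 | SKETCH_REG_ARM_FW | ((r) & 0xffff))

/* As posted: OR the index into a fixed base. */
#define SKETCH_FW_BMAP			SKETCH_FW_REG(0xff00)
#define SKETCH_FW_BMAP_REG_OR(r)	(SKETCH_FW_BMAP | (r))

/* As suggested: pass the index straight through to the FW_REG macro. */
#define SKETCH_FW_BMAP_REG_ADD(r)	SKETCH_FW_REG(0xff00 + (r))

/* Returns 1 if both forms encode the same register ID for index r. */
int sketch_bmap_ids_match(unsigned int r)
{
	uint64_t by_or = SKETCH_FW_BMAP_REG_OR(r);
	uint64_t by_add = SKETCH_FW_BMAP_REG_ADD(r);

	return by_or == by_add;
}

/* Checks every index that fits in the low byte of the 0xff00 window. */
void sketch_bmap_ids_selfcheck(void)
{
	unsigned int r;

	for (r = 0; r < 0x100; r++)
		assert(sketch_bmap_ids_match(r));
}
```

Under these assumptions the two forms agree for indices 0 through 0xff.
They diverge once 'r' no longer fits in the low byte: the OR form aliases
onto an existing bitmap register, while the pass-through form wraps
through the 16-bit mask.
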
> +
>  /* SVE registers */
>  #define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index f61cd8d57eae..e9f9edb1cf55 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -156,6 +156,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
>
>  	set_default_spectre(kvm);
> +	kvm_arm_init_hypercalls(kvm);
>
>  	return ret;
>  out_free_stage2_pgd:
> @@ -635,7 +636,14 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
>  	if (kvm_vm_is_protected(kvm))
>  		kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
>
> +	/*
> +	 * Grab kvm->lock such that the reader of has_run_once can finish
> +	 * the necessary operation atomically, such as deciding whether to
> +	 * block the writes to the firmware registers if the VM has run once.
> +	 */
> +	mutex_lock(&kvm->lock);
>  	kvm->arch.has_run_once = true;
> +	mutex_unlock(&kvm->lock);

Shouldn't this have just grabbed the kvm lock in patch 04/13?

>  	return ret;
>  }
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index eb061e64a7a5..d66e6c742bbe 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -65,6 +65,7 @@ const struct kvm_stats_header kvm_vcpu_stats_header = {
>  static const u64 kvm_arm_vm_scope_fw_regs[] = {
>  	KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
>  	KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> +	KVM_REG_ARM_STD_BMAP,
>  };
>
>  /**
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 8624e6964940..48c126c3da72 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -58,6 +58,29 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
>  	val[3] = lower_32_bits(cycles);
>  }
>
> +static bool kvm_arm_fw_reg_feat_enabled(u64 reg_bmap, u64 feat_bit)
> +{
> +	return reg_bmap & feat_bit;
> +}
> +
> +static bool kvm_hvc_call_supported(struct kvm_vcpu *vcpu, u32 func_id)
> +{
> +	struct kvm_hvc_desc *hvc_desc = &vcpu->kvm->arch.hvc_desc;
> +
> +	switch (func_id) {
> +	case ARM_SMCCC_TRNG_VERSION:
> +	case ARM_SMCCC_TRNG_FEATURES:
> +	case ARM_SMCCC_TRNG_GET_UUID:
> +	case ARM_SMCCC_TRNG_RND32:
> +	case ARM_SMCCC_TRNG_RND64:
> +		return kvm_arm_fw_reg_feat_enabled(hvc_desc->hvc_std_bmap,
> +					KVM_REG_ARM_STD_BIT_TRNG_V1_0);
> +	default:
> +		/* By default, allow the services that aren't listed here */
> +		return true;

I think your default case should really return false. It keeps people
honest when they add new patches to set up a new hypercall bit (no bit?
no call!)

That of course requires that you only return false once all of the
preexisting hypercalls are enumerated, otherwise such a patch would
cause a regression in isolation.

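Roughly the shape I have in mind for a default-deny filter, as a
standalone sketch. The function IDs, the feature bit, and all sketch_*
names are hypothetical placeholders, not the SMCCC values or the KVM
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical function IDs, for illustration only. */
enum sketch_func_id {
	SKETCH_FN_VERSION = 1,	/* stands in for a preexisting, always-on call */
	SKETCH_FN_TRNG_RND,	/* stands in for a call gated by a bitmap bit */
	SKETCH_FN_UNLISTED,	/* a call nobody has enumerated yet */
};

/* Hypothetical feature bit in the standard-service bitmap. */
#define SKETCH_STD_BIT_TRNG	(UINT64_C(1) << 0)

/*
 * Default-deny filter: every call is either explicitly always allowed,
 * explicitly tied to a bitmap bit, or refused.
 */
bool sketch_call_supported(uint64_t std_bmap, uint32_t func_id)
{
	switch (func_id) {
	case SKETCH_FN_VERSION:
		/* Preexisting calls must be listed, or they would regress. */
		return true;
	case SKETCH_FN_TRNG_RND:
		return std_bmap & SKETCH_STD_BIT_TRNG;
	default:
		/* No bit? No call! */
		return false;
	}
}

/* Walks the three cases: always-on, bit-gated, and unlisted. */
void sketch_call_supported_selfcheck(void)
{
	assert(sketch_call_supported(0, SKETCH_FN_VERSION));
	assert(sketch_call_supported(SKETCH_STD_BIT_TRNG, SKETCH_FN_TRNG_RND));
	assert(!sketch_call_supported(0, SKETCH_FN_TRNG_RND));
	assert(!sketch_call_supported(SKETCH_STD_BIT_TRNG, SKETCH_FN_UNLISTED));
}
```

The always-on case is what keeps the switch to default-deny from
regressing existing guests: anything that worked before the bitmap
existed has to appear in the switch before the default flips to false.
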
> +	}
> +}
> +
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  {
>  	u32 func_id = smccc_get_function(vcpu);
> @@ -65,6 +88,9 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  	u32 feature;
>  	gpa_t gpa;
>
> +	if (!kvm_hvc_call_supported(vcpu, func_id))
> +		goto out;
> +
>  	switch (func_id) {
>  	case ARM_SMCCC_VERSION_FUNC_ID:
>  		val[0] = ARM_SMCCC_VERSION_1_1;
> @@ -143,6 +169,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  		return kvm_psci_call(vcpu);
>  	}
>
> +out:
>  	smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
>  	return 1;
>  }
> @@ -151,8 +178,16 @@ static const u64 kvm_arm_fw_reg_ids[] = {
>  	KVM_REG_ARM_PSCI_VERSION,
>  	KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
>  	KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> +	KVM_REG_ARM_STD_BMAP,
>  };
>
> +void kvm_arm_init_hypercalls(struct kvm *kvm)
> +{
> +	struct kvm_hvc_desc *hvc_desc = &kvm->arch.hvc_desc;
> +
> +	hvc_desc->hvc_std_bmap = ARM_SMCCC_STD_FEATURES;
> +}
> +
>  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>  {
>  	return ARRAY_SIZE(kvm_arm_fw_reg_ids);
> @@ -220,6 +255,7 @@ static int get_kernel_wa_level(u64 regid)
>
>  int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> +	struct kvm_hvc_desc *hvc_desc = &vcpu->kvm->arch.hvc_desc;
>  	void __user *uaddr = (void __user *)(long)reg->addr;
>  	u64 val, reg_id = reg->id;
>
> @@ -233,6 +269,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>  		val = get_kernel_wa_level(reg_id) & KVM_REG_FEATURE_LEVEL_MASK;
>  		break;
> +	case KVM_REG_ARM_STD_BMAP:
> +		val = READ_ONCE(hvc_desc->hvc_std_bmap);
> +		break;
>  	default:
>  		return -ENOENT;
>  	}
> @@ -243,6 +282,43 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	return 0;
>  }
>
> +static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
> +{
> +	int ret = 0;
> +	struct kvm *kvm = vcpu->kvm;
> +	struct kvm_hvc_desc *hvc_desc = &kvm->arch.hvc_desc;
> +	u64 *fw_reg_bmap, fw_reg_features;

nit: use reverse fir tree ordering for locals (longest line first,
shortest last).

> +	switch (reg_id) {
> +	case KVM_REG_ARM_STD_BMAP:
> +		fw_reg_bmap = &hvc_desc->hvc_std_bmap;
> +		fw_reg_features = ARM_SMCCC_STD_FEATURES;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	/* Check for unsupported bit */
> +	if (val & ~fw_reg_features)
> +		return -EINVAL;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	/*
> +	 * If the VM (any vCPU) has already started running, return success
> +	 * if there's no change in the value. Else, return -EBUSY.

How about returning -EINVAL instead? We already do this for
KVM_ARM_VCPU_INIT if userspace uses a different target than the one
previously set.

> +	 */
> +	if (kvm_arm_vm_has_run_once(&kvm->arch)) {
> +		ret = *fw_reg_bmap != val ? -EBUSY : 0;
> +		goto out;
> +	}
> +
> +	WRITE_ONCE(*fw_reg_bmap, val);
> +out:
> +	mutex_unlock(&kvm->lock);
> +	return ret;
> +}
> +
>  int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
>  	void __user *uaddr = (void __user *)(long)reg->addr;
> @@ -321,6 +397,8 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  			return -EINVAL;
>
>  		return 0;
> +	case KVM_REG_ARM_STD_BMAP:
> +		return kvm_arm_set_fw_reg_bmap(vcpu, reg_id, val);
>  	default:
>  		return -ENOENT;
>  	}
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 5d38628a8d04..64d30b452809 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -6,6 +6,9 @@
>
>  #include
>
> +#define ARM_SMCCC_STD_FEATURES \
> +	GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
> +

This probably needs KVM_ somewhere in its name for the sake of scoping.

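Going back to the set path, the -EINVAL suggestion and the local ordering
nit could be combined along these lines. This is only a single-threaded
userspace sketch: struct sketch_vm, SKETCH_STD_FEATURES, and the function
names are invented, and the locking is left out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VM state; not the kernel structure. */
struct sketch_vm {
	uint64_t std_bmap;	/* currently selected features */
	bool has_run_once;	/* set once any vCPU has run */
};

/* Assumed upper limit of supported features: bit 0 only. */
#define SKETCH_STD_FEATURES	UINT64_C(0x1)

int sketch_set_std_bmap(struct sketch_vm *vm, uint64_t val)
{
	/* Locals in reverse fir tree order: longest line first. */
	const uint64_t features = SKETCH_STD_FEATURES;
	uint64_t *bmap = &vm->std_bmap;
	int ret = 0;

	/* Reject bits above the supported upper limit. */
	if (val & ~features)
		return -EINVAL;

	/* The patch holds kvm->lock around this; omitted in the sketch. */
	if (vm->has_run_once) {
		/* Rewriting the same value is fine; a change is -EINVAL. */
		if (*bmap != val)
			ret = -EINVAL;
		return ret;
	}

	*bmap = val;
	return ret;
}

/* Walks the cases: unknown bit, write before run, writes after run. */
void sketch_set_std_bmap_selfcheck(void)
{
	struct sketch_vm vm = { .std_bmap = SKETCH_STD_FEATURES };

	assert(sketch_set_std_bmap(&vm, 0x2) == -EINVAL);
	assert(sketch_set_std_bmap(&vm, 0x0) == 0);
	vm.has_run_once = true;
	assert(sketch_set_std_bmap(&vm, 0x0) == 0);
	assert(sketch_set_std_bmap(&vm, 0x1) == -EINVAL);
}
```

With this shape userspace sees one error code for "unsupported bit" and
"too late to change", which matches what I described for
KVM_ARM_VCPU_INIT above.
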
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>
>  static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> @@ -42,6 +45,7 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
>
>  struct kvm_one_reg;
>
> +void kvm_arm_init_hypercalls(struct kvm *kvm);
>  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
>  int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>  int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> --
> 2.35.1.473.g83b2b277ed-goog
>