From: Anirudh Rayabharam
To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, x86@kernel.org, "H. Peter Anvin", Maxim Levitsky, Ilias Stamatis
Cc: mail@anirudhrb.com, kumarpraveen@linux.microsoft.com, Anirudh Rayabharam,
    wei.liu@kernel.org, robert.bradford@intel.com, liuwe@microsoft.com,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2] KVM: nVMX: Don't expose eVMCS unsupported fields to L1
Date: Tue, 28 Jun 2022 16:02:41 +0530
Message-Id: <20220628103241.1785380-1-anrayabh@linux.microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

When running cloud-hypervisor tests, VM entry into an L2 guest on KVM on
Hyper-V fails with this splat (stripped for brevity):

[ 1481.600386] WARNING: CPU: 4 PID: 7641 at arch/x86/kvm/vmx/nested.c:4563 nested_vmx_vmexit+0x70d/0x790 [kvm_intel]
[ 1481.600427] CPU: 4 PID: 7641 Comm: vcpu2 Not tainted 5.15.0-1008-azure #9-Ubuntu
[ 1481.600429] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 07/22/2021
[ 1481.600430] RIP: 0010:nested_vmx_vmexit+0x70d/0x790 [kvm_intel]
[ 1481.600447] Call Trace:
[ 1481.600449]
[ 1481.600451]  nested_vmx_reflect_vmexit+0x10b/0x440 [kvm_intel]
[ 1481.600457]  __vmx_handle_exit+0xef/0x670 [kvm_intel]
[ 1481.600467]  vmx_handle_exit+0x12/0x50 [kvm_intel]
[ 1481.600472]  vcpu_enter_guest+0x83a/0xfd0 [kvm]
[ 1481.600524]  vcpu_run+0x5e/0x240 [kvm]
[ 1481.600560]  kvm_arch_vcpu_ioctl_run+0xd7/0x550 [kvm]
[ 1481.600597]  kvm_vcpu_ioctl+0x29a/0x6d0 [kvm]
[ 1481.600634]  __x64_sys_ioctl+0x91/0xc0
[ 1481.600637]  do_syscall_64+0x5c/0xc0
[ 1481.600667]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1481.600670] RIP: 0033:0x7f688becdaff
[ 1481.600686]

The TSC multiplier field is currently not supported by the eVMCS code in
KVM. Hyper-V previously did not support it either, but has added it
since. Because KVM does not support the field, the "use TSC scaling"
control is filtered out of vmcs_config by evmcs_sanitize_exec_ctrls().

However, nested_vmx_setup_ctls_msrs() still exposes TSC scaling to L1,
because the fields that eVMCS does not support are not sanitized there.
When L1 tries to launch an L2 guest, vmcs12 has TSC scaling enabled, and
this propagates to vmcs02. But KVM doesn't set the TSC multiplier value
because kvm_has_tsc_control is false. As a result, VM entry for the L2
guest fails. (VM entry fails if "use TSC scaling" is 1 but the TSC
multiplier is 0.)

To fix this, sanitize the values read from the MSRs in
nested_vmx_setup_ctls_msrs() by filtering out the fields that are not
supported by eVMCS.

This is a stable-friendly intermediate fix. A more comprehensive fix is
in progress [1] but is probably too complicated to safely apply to
stable.

[1]: https://lore.kernel.org/kvm/20220627160440.31857-1-vkuznets@redhat.com/

Fixes: d041b5ea93352 ("KVM: nVMX: Enable nested TSC scaling")
Signed-off-by: Anirudh Rayabharam
---

Changes since v1:
- Sanitize all eVMCS unsupported fields instead of just TSC scaling.

v1: https://lore.kernel.org/lkml/20220613161611.3567556-1-anrayabh@linux.microsoft.com/

---
 arch/x86/kvm/vmx/nested.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f5cb18e00e78..f88d748c7cc6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6564,6 +6564,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		msrs->pinbased_ctls_high);
 	msrs->pinbased_ctls_low |=
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->pinbased_ctls_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+#endif
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
 		PIN_BASED_NMI_EXITING |
@@ -6580,6 +6584,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->exit_ctls_low =
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
 
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->exit_ctls_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+#endif
 	msrs->exit_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6600,6 +6608,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		msrs->entry_ctls_high);
 	msrs->entry_ctls_low =
 		VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->entry_ctls_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+#endif
 	msrs->entry_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
@@ -6657,6 +6669,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		      msrs->secondary_ctls_high);
 
 	msrs->secondary_ctls_low = 0;
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs))
+		msrs->secondary_ctls_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+#endif
 	msrs->secondary_ctls_high &=
 		SECONDARY_EXEC_DESC |
 		SECONDARY_EXEC_ENABLE_RDTSCP |
-- 
2.34.1
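
For readers who want to see the masking idea in isolation, here is a
minimal userspace sketch, not the kernel code. It assumes the usual
low/high split of a VMX capability MSR, where a set bit in the high word
means L1 is allowed to set that control. Every demo_* / DEMO_* name is
hypothetical, the bit positions are illustrative only, and
DEMO_EVMCS_UNSUPPORTED_2NDEXEC merely stands in for
EVMCS1_UNSUPPORTED_2NDEXEC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real definitions are in the kernel headers. */
#define DEMO_2NDEXEC_RDTSCP		(1u << 3)
#define DEMO_2NDEXEC_TSC_SCALING	(1u << 25)

/* Hypothetical stand-in for EVMCS1_UNSUPPORTED_2NDEXEC. */
#define DEMO_EVMCS_UNSUPPORTED_2NDEXEC	DEMO_2NDEXEC_TSC_SCALING

/* Hypothetical stand-in for one low/high pair in struct nested_vmx_msrs. */
struct demo_ctls {
	uint32_t low;	/* controls that are always on */
	uint32_t high;	/* controls that L1 is allowed to turn on */
};

/* Clear the allowed-1 bits that the enlightened VMCS cannot express. */
static void demo_sanitize(struct demo_ctls *c, bool evmcs_in_use,
			  uint32_t unsupported)
{
	if (evmcs_in_use)
		c->high &= ~unsupported;
}

/* L1 may enable a control only if its bit survives in the high word. */
static bool demo_can_enable(const struct demo_ctls *c, uint32_t bit)
{
	return (c->high & bit) == bit;
}
```

With evmcs_in_use false the capability word is left untouched. With it
true, the TSC scaling bit is cleared before L1 can observe it, so L1
never builds a vmcs12 that enables the control, while unrelated bits
such as the RDTSCP one stay available.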