Received: by 2002:a05:6a10:1287:0:0:0:0 with SMTP id d7csp189108pxv; Wed, 21 Jul 2021 19:45:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz38vptdZSVcOHZN9/c82N7F9KrM1kT/I5JO4UvYjRJki/oaMJc++E7TNSswwJheftHDz6K X-Received: by 2002:a17:906:a24c:: with SMTP id bi12mr42146462ejb.530.1626921952784; Wed, 21 Jul 2021 19:45:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626921952; cv=none; d=google.com; s=arc-20160816; b=gf2B/OFfgFu+LRRd/hOGFjTh+gfowC1XZcT8Z90CXYIalx/ufj5VnhnKFzGEkkTdRv FHf/ILam1pdkypkEZ/R9uyIkoKDax1V69zGYDdAFp/dc4gcjsucXI2otMHqNQRDtLVVO 4sq1EV4O0u/5/Csl80QUKIC9SLDE7ye0ic8hFdz01ywYSb5F/hlnU88Fws20qREm9dzQ zm5maR9QRJVx2jJv9ws9yxOrSfxHMa+CKWudONrVFhAqWfvT47Ly4i6yiEGJdtBdhkPI tKJtaO1UmrG6LQLuu3jjYkmKwPoqat5eT4l9DR3iQ87HPbthKPZTucgzuZCSoIQZPZeB LozA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id; bh=erLXs39tiTaWXXsREYsc+TQK27UqAG/VSI9gTQf7+Fs=; b=tqN/ymTsjwS1r2DbP12FEs81JTVe04PfcajZUb738t0lRBxoRSCpzjAASvE1SaOa7Q UnGdDYqJzm6zSPVgnyDNGEb7WEzjVizvcsweJi/q0O5pd7pm0W8NlXd7zJFC+LGiUDo3 cSwQ8NPbc12tc+Ef+yPX+v5mv7pCPSNflq3hFMNiUdss7Y0yiJPliIDOS45TseFjaCFE kpgG7s/7edFXHeMEK6DMduXb4NylaLex0sl/RHJvIOGohzqj0Lnte6JXVsBs0QJf+v7L PVDdKqaj/J7HABl/2phZYy7HfTlcG7i0+E2dsaASu7V4BaDT1buvcwxLFaVYYpzWW3Rk uNPQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ce16si15458130ejb.307.2021.07.21.19.45.29; Wed, 21 Jul 2021 19:45:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230405AbhGVCDQ (ORCPT + 99 others); Wed, 21 Jul 2021 22:03:16 -0400 Received: from mga01.intel.com ([192.55.52.88]:30913 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229932AbhGVCDQ (ORCPT ); Wed, 21 Jul 2021 22:03:16 -0400 X-IronPort-AV: E=McAfee;i="6200,9189,10052"; a="233363213" X-IronPort-AV: E=Sophos;i="5.84,259,1620716400"; d="scan'208";a="233363213" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jul 2021 19:43:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.84,259,1620716400"; d="scan'208";a="495569549" Received: from sqa-gate.sh.intel.com (HELO robert-ivt.tsp.org) ([10.239.48.212]) by fmsmga004.fm.intel.com with ESMTP; 21 Jul 2021 19:43:48 -0700 Message-ID: <30a8062f9b937b3245b073dd0002b61d99901ed7.camel@linux.intel.com> Subject: Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12 From: Robert Hoo To: Sean Christopherson , "Hu, Robert" Cc: Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" Date: Thu, 22 Jul 2021 10:43:48 +0800 In-Reply-To: References: <20210618214658.2700765-1-seanjc@google.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5 (3.28.5-8.el7) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2021-07-21 at 16:18 +0000, Sean Christopherson wrote: > On Wed, Jul 21, 2021, Hu, Robert wrote: > > > > Queued, thanks. Without having checked the kvm-unit-tests > > > > sources > > > > very thoroughly, this might be a configuration issue in > > > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu > > > > host,migratable=no") should not enable TSC scaling. > > > > > > As noted in the code comments, KVM allows VMREAD/VMWRITE to all > > > defined > > > fields, whether or not the field should actually exist for the > > > vCPU model doesn't > > > enter into the equation. That's technically wrong as there are a > > > number of > > > fields that the SDM explicitly states exist iff a certain feature > > > is supported. > > > > It's right that a number of fields' existence depends on some > > certain feature. > > Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest > > index > > value used in the encoding of any field *supported* by the > > processor", rather than > > *existed*. > > Yes. > > > So my understanding is no matter what VMCS exec control field's > > value is set, > > value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports > > the physical > > CPU's capability rather than runtime VMCS configuration. > > Yes. > > > Back to nested case, L1's VMCS configuration lays the "physical" > > capability > > for L2, right? > > Yes. > > > nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put > > to vcpu > > scope, rather than current kvm global. > > > > Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, > > before > > vcpu features are settled. I think should be moved to later stage > > as well. > > Just moving the helper (or adding another call) wouldn't fix the > underlying > problem that KVM doesn't correctly model the interplay between VMX > features and > VMCS fields. It's arguably less wrong than letting userspace stuff > an incorrect > value, but it's not 100% correct and ignoring/overriding userspace is > sketchy at > best. I think IA32_VMX_VMCS_ENUM MSR shall not be set by QEMU, it is actually already indirectly set by user space CPU feature set. > As suggested below, the full fix is to fail VMREAD/VMWRITE to fields > that > shouldn't exist given the vCPU model. > > > > To fix that we'd need to add a "feature flag" to > > > vmcs_field_to_offset_table > > > that is checked against the vCPU model, though updating the MSR > > > would > > > probably fall onto userspace's shoulders? > > > > > > And FWIW, this is the QEMU code: > > > > > > #define VMCS12_MAX_FIELD_INDEX (0x17) > > > > > > static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray > > > f) > > > { > > > ... > > > > > > /* > > > * Just to be safe, write these with constant values. The > > > CRn_FIXED1 > > > * MSRs are generated by KVM based on the vCPU's CPUID. > > > */ > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0, > > > CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK); > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0, > > > CR4_VMXE_MASK); > > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM, > > > VMCS12_MAX_FIELD_INDEX << 1); > > > }