Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp3159739yba; Mon, 22 Apr 2019 21:38:49 -0700 (PDT) X-Google-Smtp-Source: APXvYqwyMMzOMrtjbgA6ITRO6ceR4nyg+iQiMw49CJGbaqX/Y6lezDc5JkEmYAnmEP3Q2vUkGeyT X-Received: by 2002:a63:844a:: with SMTP id k71mr22654053pgd.138.1555994329594; Mon, 22 Apr 2019 21:38:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1555994329; cv=none; d=google.com; s=arc-20160816; b=BA2kxBst+QTdwslLH6o5xO/2rlbcijiNaUAPbVCSCUwNC5CGbtcPnmomcXcBmzNSVy b0JslSc8Rd7h5jgdBaITB/8m96B+LN1fPMc5l5YhEa0YZseE+5E3jAXVWaE77NEbruvX cq9r8apm84PSjm98x4rFtUecsvrHyqXBcaqL2g34BZjoHXlGt2aLjKzJpLaB5ql9EKAJ oU3uJwhhChq7hFq/gXNl9UoaVpz4sS1/9MAkd/LTxd4j5pq1oVJPb0DfPk8YaP4g5Tno Hx3ZcWgOy/fB5X4jeI7Wj4b3nwW8BlLi3Tlfujh3H3hiCDBTrq/I+kfOzT1c8rO+z42R E8Vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject; bh=iLKnctQDibCkX38MACAkSpavHjyfRmhWRcV6eAkZh3E=; b=I+LkohsNlXFhKUqvHah74yOaiDv+eMPaTb4xi6BXqiKdQfVnOkPBmd0MuIX3ASUKZc DrEXv0+Ltx/GbhEW7mnLEvUz+PbOGX2EPkA9aocQgKYBedNB7PG5LkGSKUZ1tbnIAIfz CYDGZJpEzjp+gp/UtR/TN9r/XlEcwPscYhFxqDiHYr9IEfx+1taTKIZUWFF0JSeluJ8+ 7Bk6mmHZANUJhhgT6DDzBrpszX87XqdagnBrgVFi1F+x9UJfGPIIy/XAz+jdJq6mc3+6 STRI78s8RuM4wojKFqAf8gTCOyH5YoEqXH3tC3t2YUvZ99qSpfaTFpRLPlNf6/TdAWrF UX9w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c68si817602pfa.56.2019.04.22.21.38.34; Mon, 22 Apr 2019 21:38:49 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729906AbfDWDYD (ORCPT + 99 others); Mon, 22 Apr 2019 23:24:03 -0400 Received: from mga04.intel.com ([192.55.52.120]:14834 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728301AbfDWDYD (ORCPT ); Mon, 22 Apr 2019 23:24:03 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Apr 2019 20:24:02 -0700 X-IronPort-AV: E=Sophos;i="5.60,384,1549958400"; d="scan'208";a="136515140" Received: from likexu-mobl1.ccr.corp.intel.com (HELO [10.239.196.121]) ([10.239.196.121]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/AES128-SHA; 22 Apr 2019 20:24:01 -0700 Subject: Re: [PATCH] KVM: x86: Add Intel CPUID.1F cpuid emulation support To: Sean Christopherson Cc: kvm@vger.kernel.org, Paolo Bonzini , Thomas Gleixner , linux-kernel@vger.kernel.org References: <1555915234-2536-1-git-send-email-like.xu@linux.intel.com> <20190422183553.GH1236@linux.intel.com> From: Like Xu Organization: Intel OTC Message-ID: Date: Tue, 23 Apr 2019 11:23:59 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <20190422183553.GH1236@linux.intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2019/4/23 2:35, Sean Christopherson wrote: > On Mon, Apr 22, 2019 at 02:40:34PM +0800, Like Xu wrote: >> Expose Intel V2 Extended Topology Enumeration Leaf to guest only when >> host system has multiple software-visible die within each package. >> >> Signed-off-by: Like Xu >> --- >> arch/x86/kvm/cpuid.c | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c >> index fd39516..9fc14f2 100644 >> --- a/arch/x86/kvm/cpuid.c >> +++ b/arch/x86/kvm/cpuid.c >> @@ -65,6 +65,16 @@ u64 kvm_supported_xcr0(void) >> return xcr0; >> } >> >> +/* We need to check if the host cpu has multi-chip packaging technology. */ >> +static bool kvm_supported_intel_mcp(void) >> +{ >> + u32 eax, ignored; >> + >> + cpuid_count(0x1f, 0, &eax, &ignored, &ignored, &ignored); >> + >> + return boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && (eax != 0); >> +} >> + >> #define F(x) bit(X86_FEATURE_##x) >> >> int kvm_update_cpuid(struct kvm_vcpu *vcpu) >> @@ -426,6 +436,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, >> switch (function) { >> case 0: >> entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd)); >> + entry->eax = kvm_supported_intel_mcp() ? 0x1f : entry->eax; > > This all seems unnecessary. And by 'all', I mean the existing Intel PT > and XSAVE leaf checks, as well as the new mcp check. entry->eax comes > directly from hardware, and unless I missed something, PT and XSAVE are > only exposed to the guest when they're supported in hardware. In other > words, KVM will never need to adjust entry->eax to expose PT or XSAVE. We call this function for both case KVM_GET_SUPPORTED_CPUID and KVM_GET_EMULATED_CPUID although kvm user could reconfig them via KVM_SET_CPUID* path. > > The original min() check was added by commit 0771671749b5 ("KVM: Enhance > guest cpuid management"), which doesn't provide any explicit information > on why KVM does min() in the first place. Exposing cpuid.0.eax in a blind way (with host hardware support) is not a good practice for guest migration and improves compatibility requirements. > Given that the original code > was "entry->eax = min(entry->eax, (u32)0xb);", my *guess* is that the > idea was to always report "Extended Topology Enumeration Leaf" as > supported so that userspace can enumerate the VM's topology to the guest > even when hardware itself doesn't do so. If the host cpu mode is too antiquated to support 0xb, it wouldn't report 0xb for sure. The host cpuid.0.eax has been over 0xb for a long time and reached 0x1f in the latest SDM. AFAICT, the original code keeps minimum cpuid.0.eax out of features guest just used or at least it claimed to use. > > Assuming we want to allow userspace to use "V2 Extended Topology > Enumeration Leaf" regardless of hardware support, then this can simply be: > > entry->eax = min(entry->eax, (u32)0x1f); > > Or am I completely missing something? > >> break; >> case 1: >> entry->edx &= kvm_cpuid_1_edx_x86_features; >> @@ -544,6 +555,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, >> entry->edx = edx.full; >> break; >> } >> + /* function 0x1f has additional index. */ > > The original comment is rather useless, it's obvious from the code that > it has additional indices. No need to repeat its sins. A more useful > comment would be to explain that 0x1f and 0xb have identical formats and > thus can be handled by common code. I agree and let me fix it in next version. > > Which begs the question, why does leaf 0x1f exist? AFAICT the only > difference is that 0x1f supports additional "level types", but 0x1f's > types are backwards compatibile. Any idea why leaf 0xb wasn't simply > extended for the new types? It's not just about backwards compatibility on numerical parsing. So many software (whatever OS and applications) are using 0x1b to get CPU topology. In most cases, they (legacy code) would assume that the next level of CORE is package (at lease for Intel) not die and it's a semantic conflict if we reuse 0xb. As said in SDM, Intel recommends first checking for the existence of Leaf 1FH and using this if available. > >> + case 0x1f: >> /* function 0xb has additional index. */ >> case 0xb: { >> int i, level_type; >> -- >> 1.8.3.1 >> >