Subject: Re: [PATCH v6 04/16] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
To: Sean Christopherson
Cc: Venkatesh Srinivas, Peter Zijlstra, Paolo Bonzini, Borislav Petkov, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, weijiang.yang@intel.com, Kan Liang, ak@linux.intel.com, wei.w.wang@intel.com, eranian@google.com,
    liuxiangdong5@huawei.com, linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org, Yao Yuan, Like Xu
References: <20210511024214.280733-1-like.xu@linux.intel.com> <20210511024214.280733-5-like.xu@linux.intel.com>
From: "Xu, Like"
Message-ID: <5ef2215b-1c43-fc8a-42ef-46c22e093f40@intel.com>
Date: Thu, 13 May 2021 10:50:42 +0800

On 2021/5/12 23:18, Sean Christopherson wrote:
> On Wed, May 12, 2021, Xu, Like wrote:
>> Hi Venkatesh Srinivas,
>>
>> On 2021/5/12 9:58, Venkatesh Srinivas wrote:
>>> On 5/10/21, Like Xu wrote:
>>>> On Intel platforms, the software can use the IA32_MISC_ENABLE[7] bit to
>>>> detect whether the processor supports the performance monitoring facility.
>>>>
>>>> Whether the bit is set depends on whether the PMU is enabled for the
>>>> guest, and a software write to this bit will be ignored.
>>> Is the behavior that writes to IA32_MISC_ENABLE[7] are ignored (rather than #GP)
>>> documented someplace?
>> The bit[7] behavior of the real hardware on the native host is quite
>> suspicious.
> Ugh.  Can you file an SDM bug to get the wording and accessibility updated?  The
> current phrasing is a mess:
>
>   Performance Monitoring Available (R)
>   1 = Performance monitoring enabled.
>   0 = Performance monitoring disabled.
>
> The (R) is ambiguous because most other entries that are read-only use (RO), and
> the "enabled vs. disabled" implies the bit is writable and really does control
> the PMU.  But on my Haswell system, it's read-only.

On your Haswell system, does writing this bit cause a #GP, or is the write
silently ignored?

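One way to answer that question on a scratch machine is to flip the bit through the msr character device and see what comes back. The sketch below is only an illustration, not something from the patch: probe_msr_bit() and the enum are hypothetical names, and it assumes the msr driver is loaded, that the caller has opened /dev/cpu/N/msr read-write as root, and that a faulting WRMSR surfaces as EIO from the driver. Newer kernels may also refuse userspace MSR writes altogether, which this sketch reports as a plain error.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define MSR_IA32_MISC_ENABLE	0x1a0	/* MSR index, used as the file offset */
#define MISC_ENABLE_EMON_BIT	7	/* "Performance Monitoring Available" */

/* Hypothetical result codes for this sketch. */
enum probe_result {
	PROBE_ERROR,	/* could not read, or the write was refused for another reason */
	PROBE_FAULT,	/* the write failed with EIO (assumed to mean the WRMSR faulted) */
	PROBE_IGNORED,	/* the write succeeded but the value read back is unchanged */
	PROBE_WRITABLE,	/* the bit really changed; the old value is written back */
};

/*
 * Flip one bit of the 8-byte register at offset 'msr' on 'fd' and
 * classify what happened.
 */
static enum probe_result probe_msr_bit(int fd, off_t msr, unsigned int bit)
{
	uint64_t old, flipped, now;

	if (pread(fd, &old, sizeof(old), msr) != (ssize_t)sizeof(old))
		return PROBE_ERROR;

	flipped = old ^ (1ULL << bit);
	if (pwrite(fd, &flipped, sizeof(flipped), msr) != (ssize_t)sizeof(flipped))
		return errno == EIO ? PROBE_FAULT : PROBE_ERROR;

	if (pread(fd, &now, sizeof(now), msr) != (ssize_t)sizeof(now))
		return PROBE_ERROR;

	if (now == old)
		return PROBE_IGNORED;

	/* Best-effort restore of the original value. */
	(void)pwrite(fd, &old, sizeof(old), msr);
	return PROBE_WRITABLE;
}
```

It would be called as probe_msr_bit(fd, MSR_IA32_MISC_ENABLE, MISC_ENABLE_EMON_BIT). PROBE_FAULT would match the "#GP" reading of the SDM and PROBE_IGNORED the "silently ignored" one.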
> Assuming the bit is supposed to be a read-only "PMU supported bit", the SDM
> should be:
>
>   Performance Monitoring Available (RO)
>   1 = Performance monitoring supported.
>   0 = Performance monitoring not supported.
>
> And please update the changelog to explain the "why" of whatever the behavior
> ends up being.  The "what" is obvious from the code.

Thanks for your "why" comment.

>
>> To keep the semantics consistent and simple, we propose ignoring write
>> operation in the virtualized world, since whether or not to expose PMU is
>> configured by the hypervisor user space and not by the guest side.
> Making up our own architectural behavior because it's convenient is not a good
> idea.

Sometimes we do change it.

For example, the scope of some MSRs may be "core level share", but we are
likely to keep them as "thread level" variables in KVM out of convenience.

>
>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>> index 9efc1a6b8693..d9dbebe03cae 100644
>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>> @@ -488,6 +488,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
>>>>  	if (!pmu->version)
>>>>  		return;
>>>>
>>>> +	vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
> Hmm, normally I would say overwriting the guest's value is a bad idea, but if
> the bit really is a read-only "PMU supported" bit, then this is the correct
> behavior, albeit weird if userspace does a late CPUID update (though that's
> weird no matter what).
>
>>>>  	perf_get_x86_pmu_capability(&x86_pmu);
>>>>
>>>>  	pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 5bd550eaf683..abe3ea69078c 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -3211,6 +3211,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>>  		}
>>>>  		break;
>>>>  	case MSR_IA32_MISC_ENABLE:
>>>> +		data &= ~MSR_IA32_MISC_ENABLE_EMON;
> However, this is not.  If it's a read-only bit, then toggling the bit should
> cause a #GP.

The proposal here is to make it an unchangeable bit, without raising a #GP
if the guest tries to change it.

This may differ from the host behavior, but it should not cause problems if
some guest code writes the bit while performance monitoring is in use.

Does this make sense to you, or do you want to keep it strictly the same as
the host side?

>
>>>>  		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
>>>>  		    ((vcpu->arch.ia32_misc_enable_msr ^ data) &
>>>>  		      MSR_IA32_MISC_ENABLE_MWAIT)) {
>>>>  			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
>>>> --
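
To make the two options in this thread concrete, here is a small userspace model of a guest write to IA32_MISC_ENABLE with bit 7 treated as read-only. It is a sketch for discussion only and is not the code from the patch: model_wrmsr_misc_enable() and enum ro_policy are made-up names, and the nonzero return value simply stands in for "inject #GP".

```c
#include <stdint.h>

#define MISC_ENABLE_EMON	(1ULL << 7)	/* IA32_MISC_ENABLE[7] */

/* Hypothetical policies for a write that touches the read-only bit. */
enum ro_policy {
	RO_IGNORE_WRITE,	/* keep the current bit, accept the rest of the write */
	RO_GP_ON_TOGGLE,	/* reject any write that would flip the bit */
};

/*
 * Model of a guest WRMSR to IA32_MISC_ENABLE.
 * Returns 0 if the write is accepted and 1 if it would raise #GP.
 * On rejection the stored value is left untouched.
 */
static int model_wrmsr_misc_enable(uint64_t *cur, uint64_t data,
				   enum ro_policy policy)
{
	uint64_t toggled = (*cur ^ data) & MISC_ENABLE_EMON;

	if (toggled && policy == RO_GP_ON_TOGGLE)
		return 1;

	/* Take every bit from the guest except EMON, which keeps its value. */
	*cur = (data & ~MISC_ENABLE_EMON) | (*cur & MISC_ENABLE_EMON);
	return 0;
}
```

Under RO_IGNORE_WRITE a guest that clears bit 7 sees the write succeed and the bit stay set. Under RO_GP_ON_TOGGLE the same write fails, while a write that leaves bit 7 as it was still goes through.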