by Like Xu

[permalink] [raw]

Subject: Re: [RESEND v13 03/10] KVM: x86/pmu: Use IA32_PERF_CAPABILITIES to adjust features visibility

Hi Paolo,

On 2021/1/27 14:04, Like Xu wrote:
> On 2021/1/26 17:42, Paolo Bonzini wrote:
>> On 08/01/21 02:36, Like Xu wrote:
>>>
>>> @@ -401,6 +398,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
>>>          pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
>>>          pmu->fixed_counters[i].current_config = 0;
>>>      }
>>> +
>>> +    vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu,
>>> X86_FEATURE_PDCM) ?
>>> +        vmx_get_perf_capabilities() : 0;
>>
>> There is one thing I don't understand with this patch: intel_pmu_init is
>> not called when CPUID is changed. So I would have thought that anything
>> that uses guest_cpuid_has must stay in intel_pmu_refresh. As I
>> understand it vcpu->arch.perf_capabilities is always set to 0
>> (vmx_get_perf_capabilities is never called), and kvm_set_msr_common
>> would fail to set any bit in the MSR. What am I missing?
>>
>> In addition, the code of patch 4:
>>
>> +    if (!intel_pmu_lbr_is_enabled(vcpu)) {
>> +        vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT;
>> +        lbr_desc->records.nr = 0;
>> +    }
>>
>> is not okay after MSR changes. The value written by the host must be
>> either rejected (with "return 1") or applied unchanged.
>>
>> Fortunately I think this code is dead if you move the check in
>> kvm_set_msr from patch 9 to patch 4. However, in patch 9
>> vmx_get_perf_capabilities() must only set the LBR format bits if
>> intel_pmu_lbr_is_compatible(vcpu).
>
> Thanks for the guidance. How about handling it in this way:
>
> In the intel_pmu_init():
>
>     vcpu->arch.perf_capabilities = 0;
>     lbr_desc->records.nr = 0;
>
> In the intel_pmu_refresh():
>
>     if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM)) {
>         vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();
>         if (!lbr_desc->records.nr)
>             vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT;
>     }
>
> In the vmx_set_msr():
>
>     case MSR_IA32_PERF_CAPABILITIES:
>         // set up lbr_desc->records.nr
>         if (!intel_pmu_lbr_is_compatible(vcpu))
>             return 1;
>         ret = kvm_set_msr_common(vcpu, msr_info);
>
> In the kvm_set_msr_common():
>
>     case MSR_IA32_PERF_CAPABILITIES:
>         vcpu->arch.perf_capabilities = data;
>         kvm_pmu_refresh(vcpu);

The new version will make the vcpu->arch.perf_capabilities crud as simple
as possible.

>
>>
>>
>> The patches look good apart from these issues and the other nits I
>> pointed out. However, you need testcases here, for both kvm-unit-tests
>> and tools/testing/selftests/kvm.
>>
>> For KVM, it would be at least a basic check that looks for the MSR LBR
>> (using the MSR indices for the various processors), does a branch, and
>> checks that the FROM_IP/TO_IP are good. You can write the
>> kvm-unit-tests using the QEMU option "-cpu host,migratable=no": if you
>> do this, QEMU will pick the KVM_GET_SUPPORTED_CPUID bits and move them
>> more or less directly into the guest CPUID.
>>
>> For tools/testing/selftests/kvm, your test need to check the effect of
>> various CPUID settings on the PERF_CAPABILITIES MSR, check that whatever
>> you write with KVM_SET_MSR is _not_ modified and can be retrieved with
>> KVM_GET_MSR, and check that invalid LBR formats are rejected.
>
> Thanks, I will add the above tests in the next version.
>
>>
>> I'm really, really sorry for leaving these patches on my todo list for
>> months, but you guys need to understand the main reason for this: they
>> come with no testcases. A large patch series adding userspace APIs and
>> complicated CPUID/MSR processing *automatically* goes to the bottom of
>> my queue, because:

Please review the new version
https://lore.kernel.org/kvm/[email protected]/T/#t
and kvm-unit-tests:
https://lore.kernel.org/kvm/[email protected],
in case the patch set is not *automatically* processed to the bottom of
your queue.

I'll also add tests for other new vPMU features.

---
thx, likexu

>>
>> - I need to go with a fine comb over all the userspace API changes, I
>> cannot just look at test code and see if it works.
>>
>> - I will have no way to test its correctness after it's committed.
>>
>> For you, the work ends when your patch is accepted. For me, that's when
>> the work begins, and I need to make sure that the patch will be
>> maintainable in the future.
>>
>> Thanks, and sorry again for the delay.
>>
>> Paolo
>>
>