From: Jim Mattson
Date: Wed, 24 May 2023 14:30:22 -0700
Subject: Re: [PATCH 5/5] KVM: x86/pmu: Hide guest counter updates from the VMRUN instruction
To: Sean Christopherson
Cc: Like Xu, Paolo Bonzini, Ravi Bangoria, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20230310105346.12302-1-likexu@tencent.com> <20230310105346.12302-6-likexu@tencent.com>
On Wed, May 24, 2023 at 2:23 PM Sean Christopherson wrote:
>
> On Fri, Mar 10, 2023, Like Xu wrote:
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index adb92fc4d7c9..d6fcbf233cb3 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -561,6 +561,10 @@ struct kvm_pmu {
> >          */
> >         u64 host_cross_mapped_mask;
> >
> > +       /* Flags to track any HW quirks that need to be fixed by vPMU. */
> > +       u64 quirk_flags;
> > +       DECLARE_BITMAP(hide_vmrun_pmc_idx, X86_PMC_IDX_MAX);
>
> Since it sounds like AMD isn't changing the behavior, let's forego the quirk and
> just hardcode the fixup.
>
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 2a0504732966..315dca021d57 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -254,6 +254,7 @@ static void pmc_pause_counter(struct kvm_pmc *pmc)
> >         counter += perf_event_pause(pmc->perf_event, true);
> >         pmc->counter = counter & pmc_bitmask(pmc);
> >         pmc->is_paused = true;
> > +       kvm_mark_pmc_is_quirky(pmc);
> >  }
> >
> >  static bool pmc_resume_counter(struct kvm_pmc *pmc)
> > @@ -822,6 +823,19 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
> >         return r;
> >  }
> >
> > +static inline bool event_is_branch_instruction(struct kvm_pmc *pmc)
>
> How about pmc_is_counting_branches()? The "event" itself isn't a branch
> instruction.

Note that there's a bug in the original code for this that has probably
never been fixed: it ignores CMASK and INV in the PerfEvtSel.

> > +{
> > +       return eventsel_match_perf_hw_id(pmc, PERF_COUNT_HW_INSTRUCTIONS) ||
> > +              eventsel_match_perf_hw_id(pmc,
> > +                                        PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
>
> Let this poke out.
>
> > +}
> > +
> > +static inline bool quirky_pmc_will_count_vmrun(struct kvm_pmc *pmc)
> > +{
> > +       return event_is_branch_instruction(pmc) && event_is_allowed(pmc) &&
> > +              !static_call(kvm_x86_get_cpl)(pmc->vcpu);
> > +}
> > +
> >  void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
> >  {
> >         struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> > @@ -837,6 +851,10 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
> >
> >                 reprogram_counter(pmc);
> >                 kvm_pmu_handle_pmc_overflow(pmc);
> > +
> > +               if (vcpu_has_pmu_quirks(vcpu) &&
> > +                   quirky_pmc_will_count_vmrun(pmc))
> > +                       set_bit(pmc->idx, pmu->hide_vmrun_pmc_idx);
>
> Doesn't this need to adjust the count _before_ handling overflow? I.e. isn't it
> possible for the bogus counts to cause bogus overflow?
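
Going back to the CMASK/INV point above: the check below is only a rough,
untested sketch of one way to guard against that, not something from this
patch. It assumes the ARCH_PERFMON_EVENTSEL_CMASK/INV masks from
<asm/perf_event.h> and uses the pmc_is_counting_branches() name suggested
earlier in this thread.

static inline bool pmc_is_counting_branches(struct kvm_pmc *pmc)
{
        /*
         * A non-zero counter mask or the invert bit changes the counting
         * semantics, so the event is no longer a plain count of retired
         * instructions or branches.
         */
        if (pmc->eventsel & (ARCH_PERFMON_EVENTSEL_CMASK |
                             ARCH_PERFMON_EVENTSEL_INV))
                return false;

        return eventsel_match_perf_hw_id(pmc, PERF_COUNT_HW_INSTRUCTIONS) ||
               eventsel_match_perf_hw_id(pmc, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
}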
>
> > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> > index a47b579667c6..30f6f58f4c38 100644
> > --- a/arch/x86/kvm/pmu.h
> > +++ b/arch/x86/kvm/pmu.h
> > @@ -18,6 +18,9 @@
> >  #define VMWARE_BACKDOOR_PMC_REAL_TIME        0x10001
> >  #define VMWARE_BACKDOOR_PMC_APPARENT_TIME    0x10002
> >
> > +#define X86_PMU_COUNT_VMRUN          BIT_ULL(0)
> > +#define X86_PMU_QUIRKS_MASK          X86_PMU_COUNT_VMRUN
> > +
> >  struct kvm_pmu_ops {
> >         bool (*hw_event_available)(struct kvm_pmc *pmc);
> >         bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
> > @@ -54,14 +57,33 @@ static inline void kvm_pmu_request_counter_reprogram(struct kvm_pmc *pmc)
> >         kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
> >  }
> >
> > +static inline bool vcpu_has_pmu_quirks(struct kvm_vcpu *vcpu)
> > +{
> > +       return vcpu_to_pmu(vcpu)->quirk_flags & X86_PMU_QUIRKS_MASK;
> > +}
> > +
> > +/*
> > + * The time to mark pmc is when the accumulation value returned
> > + * by perf API based on a HW counter has just taken effect.
> > + */
> > +static inline void kvm_mark_pmc_is_quirky(struct kvm_pmc *pmc)
> > +{
> > +       if (!vcpu_has_pmu_quirks(pmc->vcpu))
> > +               return;
> > +
> > +       kvm_pmu_request_counter_reprogram(pmc);
> > +}
> > +
> >  static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
> >  {
> >         u64 counter, enabled, running;
> >
> >         counter = pmc->counter;
> > -       if (pmc->perf_event && !pmc->is_paused)
> > +       if (pmc->perf_event && !pmc->is_paused) {
> >                 counter += perf_event_read_value(pmc->perf_event,
> >                                                  &enabled, &running);
> > +               kvm_mark_pmc_is_quirky(pmc);
> > +       }
> >         /* FIXME: Scaling needed? */
> >         return counter & pmc_bitmask(pmc);
> >  }
> > diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> > index 5fa939e411d8..130991a97f22 100644
> > --- a/arch/x86/kvm/svm/pmu.c
> > +++ b/arch/x86/kvm/svm/pmu.c
> > @@ -187,6 +187,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
> >         pmu->nr_arch_fixed_counters = 0;
> >         pmu->global_status = 0;
> >         bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
> > +       pmu->quirk_flags |= X86_PMU_COUNT_VMRUN;
> >  }
> >
> >  static void amd_pmu_init(struct kvm_vcpu *vcpu)
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index f41d96e638ef..f6b33d172481 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -3919,6 +3919,31 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
> >         return EXIT_FASTPATH_NONE;
> >  }
> >
> > +static void pmu_hide_vmrun(struct kvm_vcpu *vcpu)
>
> This needs to be noinstr.
>
> > +{
> > +       struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> > +       struct kvm_pmc *pmc;
> > +       unsigned int i;
> > +
> > +       for_each_set_bit(i, pmu->hide_vmrun_pmc_idx, X86_PMC_IDX_MAX) {
> > +               clear_bit(i, pmu->hide_vmrun_pmc_idx);
>
> Clearing the bit will hide only the first VMRUN after the guest attempts to read
> the counter, no? The fixup needs to apply to every VMRUN that is executed after
> the PMC is programmed. Or am I misreading the patch?
>
> > +
> > +               /* AMD doesn't have fixed counters at now. */
> > +               if (i >= pmu->nr_arch_gp_counters)
> > +                       continue;
> > +
> > +               /*
> > +                * The prerequisite for fixing HW quirks is that there is indeed
> > +                * HW working and perf has no chance to retrieve the counter.
>
> I don't follow the "perf has no chance to retrieve the counter" part.
>
> > +                */
> > +               pmc = &pmu->gp_counters[i];
> > +               if (!pmc->perf_event || pmc->perf_event->hw.idx < 0)
> > +                       continue;
> > +
> > +               pmc->counter--;
> > +       }
> > +}
> > +
> >  static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
> >  {
> >         struct vcpu_svm *svm = to_svm(vcpu);
> > @@ -3986,6 +4011,9 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
> >
> >         kvm_wait_lapic_expire(vcpu);
> >
> > +       if (vcpu->kvm->arch.enable_pmu && vcpu_has_pmu_quirks(vcpu))
> > +               pmu_hide_vmrun(vcpu);
> > +
> >         /*
> >          * If this vCPU has touched SPEC_CTRL, restore the guest's value if
> >          * it's non-zero. Since vmentry is serialising on affected CPUs, there
> > --
> > 2.39.2
> >
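
For completeness, here is one rough shape the fixup loop could take if it
has to cover every VMRUN executed while a branch-counting PMC stays
programmed and also be noinstr, per the comments above. This is an
untested sketch, not code from the patch; it assumes the bit in
hide_vmrun_pmc_idx is instead cleared when the counter is reprogrammed.

static noinstr void pmu_hide_vmrun(struct kvm_vcpu *vcpu)
{
        struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
        struct kvm_pmc *pmc;
        unsigned int i;

        for_each_set_bit(i, pmu->hide_vmrun_pmc_idx, X86_PMC_IDX_MAX) {
                /* AMD has no fixed counters; only GP counters need fixing. */
                if (i >= pmu->nr_arch_gp_counters)
                        continue;

                /* Only undo a VMRUN that the hardware actually counted. */
                pmc = &pmu->gp_counters[i];
                if (!pmc->perf_event || pmc->perf_event->hw.idx < 0)
                        continue;

                pmc->counter--;
        }
}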