References: <20231205024310.1593100-1-atishp@rivosinc.com> <20231205024310.1593100-9-atishp@rivosinc.com>
In-Reply-To: <20231205024310.1593100-9-atishp@rivosinc.com>
From: Anup Patel
Date: Thu, 14 Dec 2023 21:32:24 +0530
Subject: Re: [RFC 8/9] RISC-V: KVM: Add perf sampling support for guests
To: Atish Patra
Cc: linux-kernel@vger.kernel.org, Alexandre Ghiti, Andrew Jones, Atish Patra, Conor Dooley, Guo Ren, Icenowy Zheng, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-riscv@lists.infradead.org, Mark Rutland, Palmer Dabbelt, Paul Walmsley, Will Deacon
Content-Type: text/plain; charset="UTF-8"
On Tue, Dec 5, 2023 at 8:13 AM Atish Patra wrote:
>
> KVM enables perf for guest via counter virtualization. However, the
> sampling can not be supported as there is no mechanism to enabled
> trap/emulate scountovf in ISA yet. Rely on the SBI PMU snapshot
> to provide the counter overflow data via the shared memory.
>
> In case of sampling event, the host first guest the LCOFI interrupt
> and injects to the guest via irq filtering mechanism defined in AIA
> specification. Thus, ssaia must be enabled in the host in order to
> use perf sampling in the guest. No other AIA dpeendancy w.r.t kernel

s/dpeendancy/dependency/

> is required.
>
> Signed-off-by: Atish Patra
> ---
>  arch/riscv/include/asm/csr.h      |  3 +-
>  arch/riscv/include/uapi/asm/kvm.h |  1 +
>  arch/riscv/kvm/main.c             |  1 +
>  arch/riscv/kvm/vcpu.c             |  8 ++--
>  arch/riscv/kvm/vcpu_onereg.c      |  1 +
>  arch/riscv/kvm/vcpu_pmu.c         | 69 ++++++++++++++++++++++++++++---
>  6 files changed, 73 insertions(+), 10 deletions(-)
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 88cdc8a3e654..bec09b33e2f0 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -168,7 +168,8 @@
>  #define VSIP_TO_HVIP_SHIFT	(IRQ_VS_SOFT - IRQ_S_SOFT)
>  #define VSIP_VALID_MASK		((_AC(1, UL) << IRQ_S_SOFT) | \
>  				 (_AC(1, UL) << IRQ_S_TIMER) | \
> -				 (_AC(1, UL) << IRQ_S_EXT))
> +				 (_AC(1, UL) << IRQ_S_EXT) | \
> +				 (_AC(1, UL) << IRQ_PMU_OVF))
>
>  /* AIA CSR bits */
>  #define TOPI_IID_SHIFT		16
> diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
> index 60d3b21dead7..741c16f4518e 100644
> --- a/arch/riscv/include/uapi/asm/kvm.h
> +++ b/arch/riscv/include/uapi/asm/kvm.h
> @@ -139,6 +139,7 @@ enum KVM_RISCV_ISA_EXT_ID {
>  	KVM_RISCV_ISA_EXT_ZIHPM,
>  	KVM_RISCV_ISA_EXT_SMSTATEEN,
>  	KVM_RISCV_ISA_EXT_ZICOND,
> +	KVM_RISCV_ISA_EXT_SSCOFPMF,
>  	KVM_RISCV_ISA_EXT_MAX,
>  };
>
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 225a435d9c9a..5a3a4cee0e3d 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -43,6 +43,7 @@ int kvm_arch_hardware_enable(void)
>  	csr_write(CSR_HCOUNTEREN, 0x02);
>
>  	csr_write(CSR_HVIP, 0);
> +	csr_write(CSR_HVIEN, 1UL << IRQ_PMU_OVF);
>
>  	kvm_riscv_aia_enable();
>
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index e087c809073c..2d9f252356c3 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -380,7 +380,8 @@ int kvm_riscv_vcpu_set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
>  	if (irq < IRQ_LOCAL_MAX &&
>  	    irq != IRQ_VS_SOFT &&
>  	    irq != IRQ_VS_TIMER &&
> -	    irq != IRQ_VS_EXT)
> +	    irq != IRQ_VS_EXT &&
> +	    irq != IRQ_PMU_OVF)
>  		return -EINVAL;
>
>  	set_bit(irq, vcpu->arch.irqs_pending);
> @@ -395,14 +396,15 @@ int kvm_riscv_vcpu_set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
>  int kvm_riscv_vcpu_unset_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
>  {
>  	/*
> -	 * We only allow VS-mode software, timer, and external
> +	 * We only allow VS-mode software, timer, counter overflow and external
>  	 * interrupts when irq is one of the local interrupts
>  	 * defined by RISC-V privilege specification.
>  	 */
>  	if (irq < IRQ_LOCAL_MAX &&
>  	    irq != IRQ_VS_SOFT &&
>  	    irq != IRQ_VS_TIMER &&
> -	    irq != IRQ_VS_EXT)
> +	    irq != IRQ_VS_EXT &&
> +	    irq != IRQ_PMU_OVF)
>  		return -EINVAL;
>
>  	clear_bit(irq, vcpu->arch.irqs_pending);
> diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
> index f8c9fa0c03c5..19a0e4eaf0df 100644
> --- a/arch/riscv/kvm/vcpu_onereg.c
> +++ b/arch/riscv/kvm/vcpu_onereg.c
> @@ -36,6 +36,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
>  	/* Multi letter extensions (alphabetically sorted) */
>  	KVM_ISA_EXT_ARR(SMSTATEEN),
>  	KVM_ISA_EXT_ARR(SSAIA),
> +	KVM_ISA_EXT_ARR(SSCOFPMF),

Sscofpmf can't be disabled for guest so we should add it
to kvm_riscv_vcpu_isa_disable_allowed(), no ?

>  	KVM_ISA_EXT_ARR(SSTC),
>  	KVM_ISA_EXT_ARR(SVINVAL),
>  	KVM_ISA_EXT_ARR(SVNAPOT),
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 622c4ee89e7b..86c8e92f92d3 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -229,6 +229,47 @@ static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ct
>  	return 0;
>  }
>
> +static void kvm_riscv_pmu_overflow(struct perf_event *perf_event,
> +				   struct perf_sample_data *data,
> +				   struct pt_regs *regs)
> +{
> +	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
> +	struct kvm_vcpu *vcpu = pmc->vcpu;

Ahh, the "vcpu" field is used here. Move that change from
patch7 to this patch.
> +	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +	struct riscv_pmu *rpmu = to_riscv_pmu(perf_event->pmu);
> +	u64 period;
> +
> +	/*
> +	 * Stop the event counting by directly accessing the perf_event.
> +	 * Otherwise, this needs to deferred via a workqueue.
> +	 * That will introduce skew in the counter value because the actual
> +	 * physical counter would start after returning from this function.
> +	 * It will be stopped again once the workqueue is scheduled
> +	 */
> +	rpmu->pmu.stop(perf_event, PERF_EF_UPDATE);
> +
> +	/*
> +	 * The hw counter would start automatically when this function returns.
> +	 * Thus, the host may continue to interrupts and inject it to the guest
> +	 * even without guest configuring the next event. Depending on the hardware
> +	 * the host may some sluggishness only if privilege mode filtering is not
> +	 * available. In an ideal world, where qemu is not the only capable hardware,
> +	 * this can be removed.
> +	 * FYI: ARM64 does this way while x86 doesn't do anything as such.
> +	 * TODO: Should we keep it for RISC-V ?
> +	 */
> +	period = -(local64_read(&perf_event->count));
> +
> +	local64_set(&perf_event->hw.period_left, 0);
> +	perf_event->attr.sample_period = period;
> +	perf_event->hw.sample_period = period;
> +
> +	set_bit(pmc->idx, kvpmu->pmc_overflown);
> +	kvm_riscv_vcpu_set_interrupt(vcpu, IRQ_PMU_OVF);
> +
> +	rpmu->pmu.start(perf_event, PERF_EF_RELOAD);
> +}
> +
>  static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, struct perf_event_attr *attr,
>  				     unsigned long flags, unsigned long eidx, unsigned long evtdata)
>  {
> @@ -247,7 +288,7 @@ static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, struct perf_event_attr
>  	 */
>  	attr->sample_period = kvm_pmu_get_sample_period(pmc);
>
> -	event = perf_event_create_kernel_counter(attr, -1, current, NULL, pmc);
> +	event = perf_event_create_kernel_counter(attr, -1, current, kvm_riscv_pmu_overflow, pmc);
>  	if (IS_ERR(event)) {
>  		pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
>  		return PTR_ERR(event);
> @@ -466,6 +507,12 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  		}
>  	}
>
> +	/* The guest have serviced the interrupt and starting the counter again */
> +	if (test_bit(IRQ_PMU_OVF, vcpu->arch.irqs_pending)) {
> +		clear_bit(pmc_index, kvpmu->pmc_overflown);
> +		kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_PMU_OVF);
> +	}
> +
>  out:
>  	retdata->err_val = sbiret;
>
> @@ -537,7 +584,12 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  		}
>
>  		if (bSnapshot && !sbiret) {
> -			//TODO: Add counter overflow support when sscofpmf support is added
> +			/* The counter and overflow indicies in the snapshot region are w.r.to
> +			 * cbase. Modify the set bit in the counter mask instead of the pmc_index
> +			 * which indicates the absolute counter index.
> +			 */

Use a double winged comment block here.
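[Editor's note: the "double winged" style requested above is the kernel's standard multi-line comment format, with the opening `/*` and the closing `*/` each on their own line. The block in question would look something like this (wording lightly edited for spelling):]

```c
/*
 * The counter and overflow indices in the snapshot region are
 * relative to cbase. Modify the set bit in the counter mask
 * instead of pmc_index, which is the absolute counter index.
 */
```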
> +			if (test_bit(pmc_index, kvpmu->pmc_overflown))
> +				kvpmu->sdata->ctr_overflow_mask |= (1UL << i);
>  			kvpmu->sdata->ctr_values[i] = pmc->counter_val;
>  			kvm_vcpu_write_guest(vcpu, kvpmu->snapshot_addr, kvpmu->sdata,
>  					     sizeof(struct riscv_pmu_snapshot_data));
> @@ -546,15 +598,19 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
>  		if (flags & SBI_PMU_STOP_FLAG_RESET) {
>  			pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
>  			clear_bit(pmc_index, kvpmu->pmc_in_use);
> +			clear_bit(pmc_index, kvpmu->pmc_overflown);
>  			if (bSnapshot) {
>  				/* Clear the snapshot area for the upcoming deletion event */
>  				kvpmu->sdata->ctr_values[i] = 0;
> +				/* Only clear the given counter as the caller is responsible to
> +				 * validate both the overflow mask and configured counters.
> +				 */

Use a double winged comment block here.

> +				kvpmu->sdata->ctr_overflow_mask &= ~(1UL << i);
>  				kvm_vcpu_write_guest(vcpu, kvpmu->snapshot_addr, kvpmu->sdata,
>  						     sizeof(struct riscv_pmu_snapshot_data));
>  			}
>  		}
>  	}
> -
>  out:
>  	retdata->err_val = sbiret;
>
> @@ -729,15 +785,16 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
>  	if (!kvpmu)
>  		return;
>
> -	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
> +	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_KVM_MAX_COUNTERS) {
>  		pmc = &kvpmu->pmc[i];
>  		pmc->counter_val = 0;
>  		kvm_pmu_release_perf_event(pmc);
>  		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
>  	}
> -	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
> +	bitmap_zero(kvpmu->pmc_in_use, RISCV_KVM_MAX_COUNTERS);
> +	bitmap_zero(kvpmu->pmc_overflown, RISCV_KVM_MAX_COUNTERS);
>  	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
> -	kvpmu->snapshot_addr = INVALID_GPA;
> +	kvm_pmu_clear_snapshot_area(vcpu);
>  }
>
>  void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> --
> 2.34.1
>

Regards,
Anup