From: Atish Patra
To: linux-kernel@vger.kernel.org
Cc: Atish Patra, Alexandre Ghiti, Andrew Jones, Anup Patel, Atish Patra, Conor Dooley, Guo Ren, Icenowy Zheng, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-riscv@lists.infradead.org, Mark Rutland, Palmer Dabbelt, Paul Walmsley, Will Deacon
Subject: [RFC 8/9] RISC-V: KVM: Add perf sampling support for guests
Date: Mon, 4 Dec 2023 18:43:09
 -0800
Message-Id: <20231205024310.1593100-9-atishp@rivosinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231205024310.1593100-1-atishp@rivosinc.com>
References: <20231205024310.1593100-1-atishp@rivosinc.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

KVM enables perf for guests via counter virtualization. However,
sampling cannot be supported because the ISA does not yet provide a
mechanism to trap/emulate scountovf. Rely on the SBI PMU snapshot to
provide the counter overflow data via shared memory.

For a sampling event, the host first gets the LCOFI interrupt and
injects it into the guest via the irq filtering mechanism defined in
the AIA specification. Thus, ssaia must be enabled in the host in order
to use perf sampling in the guest. No other AIA dependency w.r.t. the
kernel is required.

Signed-off-by: Atish Patra
---
 arch/riscv/include/asm/csr.h      |  3 +-
 arch/riscv/include/uapi/asm/kvm.h |  1 +
 arch/riscv/kvm/main.c             |  1 +
 arch/riscv/kvm/vcpu.c             |  8 ++--
 arch/riscv/kvm/vcpu_onereg.c      |  1 +
 arch/riscv/kvm/vcpu_pmu.c         | 69 ++++++++++++++++++++++++++++---
 6 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 88cdc8a3e654..bec09b33e2f0 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -168,7 +168,8 @@
 #define VSIP_TO_HVIP_SHIFT	(IRQ_VS_SOFT - IRQ_S_SOFT)
 #define VSIP_VALID_MASK		((_AC(1, UL) << IRQ_S_SOFT) | \
 				 (_AC(1, UL) << IRQ_S_TIMER) | \
-				 (_AC(1, UL) << IRQ_S_EXT))
+				 (_AC(1, UL) << IRQ_S_EXT) | \
+				 (_AC(1, UL) << IRQ_PMU_OVF))

 /* AIA CSR bits */
 #define TOPI_IID_SHIFT		16
diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index 60d3b21dead7..741c16f4518e 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -139,6 +139,7 @@ enum KVM_RISCV_ISA_EXT_ID {
 	KVM_RISCV_ISA_EXT_ZIHPM,
 	KVM_RISCV_ISA_EXT_SMSTATEEN,
 	KVM_RISCV_ISA_EXT_ZICOND,
+	KVM_RISCV_ISA_EXT_SSCOFPMF,
 	KVM_RISCV_ISA_EXT_MAX,
 };

diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 225a435d9c9a..5a3a4cee0e3d 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -43,6 +43,7 @@ int kvm_arch_hardware_enable(void)
 	csr_write(CSR_HCOUNTEREN, 0x02);

 	csr_write(CSR_HVIP, 0);
+	csr_write(CSR_HVIEN, 1UL << IRQ_PMU_OVF);

 	kvm_riscv_aia_enable();

diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index e087c809073c..2d9f252356c3 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -380,7 +380,8 @@ int kvm_riscv_vcpu_set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
 	if (irq < IRQ_LOCAL_MAX &&
 	    irq != IRQ_VS_SOFT &&
 	    irq != IRQ_VS_TIMER &&
-	    irq != IRQ_VS_EXT)
+	    irq != IRQ_VS_EXT &&
+	    irq != IRQ_PMU_OVF)
 		return -EINVAL;

 	set_bit(irq, vcpu->arch.irqs_pending);
@@ -395,14 +396,15 @@ int kvm_riscv_vcpu_set_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
 int kvm_riscv_vcpu_unset_interrupt(struct kvm_vcpu *vcpu, unsigned int irq)
 {
 	/*
-	 * We only allow VS-mode software, timer, and external
+	 * We only allow VS-mode software, timer, counter overflow and external
 	 * interrupts when irq is one of the local interrupts
 	 * defined by RISC-V privilege specification.
 	 */
 	if (irq < IRQ_LOCAL_MAX &&
 	    irq != IRQ_VS_SOFT &&
 	    irq != IRQ_VS_TIMER &&
-	    irq != IRQ_VS_EXT)
+	    irq != IRQ_VS_EXT &&
+	    irq != IRQ_PMU_OVF)
 		return -EINVAL;

 	clear_bit(irq, vcpu->arch.irqs_pending);
diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
index f8c9fa0c03c5..19a0e4eaf0df 100644
--- a/arch/riscv/kvm/vcpu_onereg.c
+++ b/arch/riscv/kvm/vcpu_onereg.c
@@ -36,6 +36,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
 	/* Multi letter extensions (alphabetically sorted) */
 	KVM_ISA_EXT_ARR(SMSTATEEN),
 	KVM_ISA_EXT_ARR(SSAIA),
+	KVM_ISA_EXT_ARR(SSCOFPMF),
 	KVM_ISA_EXT_ARR(SSTC),
 	KVM_ISA_EXT_ARR(SVINVAL),
 	KVM_ISA_EXT_ARR(SVNAPOT),
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 622c4ee89e7b..86c8e92f92d3 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -229,6 +229,47 @@ static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ct
 	return 0;
 }

+static void kvm_riscv_pmu_overflow(struct perf_event *perf_event,
+				   struct perf_sample_data *data,
+				   struct pt_regs *regs)
+{
+	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
+	struct kvm_vcpu *vcpu = pmc->vcpu;
+	struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+	struct riscv_pmu *rpmu = to_riscv_pmu(perf_event->pmu);
+	u64 period;
+
+	/*
+	 * Stop the event counting by directly accessing the perf_event.
+	 * Otherwise, this needs to be deferred via a workqueue.
+	 * That will introduce skew in the counter value because the actual
+	 * physical counter would start after returning from this function.
+	 * It will be stopped again once the workqueue is scheduled.
+	 */
+	rpmu->pmu.stop(perf_event, PERF_EF_UPDATE);
+
+	/*
+	 * The hw counter would start automatically when this function returns.
+	 * Thus, the host may continue to receive interrupts and inject them into
+	 * the guest even without the guest configuring the next event. Depending on
+	 * the hardware, the host may have some sluggishness only if privilege mode
+	 * filtering is not available. In an ideal world, where qemu is not the only
+	 * capable hardware, this can be removed.
+	 * FYI: ARM64 does it this way while x86 doesn't do anything as such.
+	 * TODO: Should we keep it for RISC-V?
+	 */
+	period = -(local64_read(&perf_event->count));
+
+	local64_set(&perf_event->hw.period_left, 0);
+	perf_event->attr.sample_period = period;
+	perf_event->hw.sample_period = period;
+
+	set_bit(pmc->idx, kvpmu->pmc_overflown);
+	kvm_riscv_vcpu_set_interrupt(vcpu, IRQ_PMU_OVF);
+
+	rpmu->pmu.start(perf_event, PERF_EF_RELOAD);
+}
+
 static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, struct perf_event_attr *attr,
 				     unsigned long flags, unsigned long eidx, unsigned long evtdata)
 {
@@ -247,7 +288,7 @@ static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, struct perf_event_attr
 	 */
 	attr->sample_period = kvm_pmu_get_sample_period(pmc);

-	event = perf_event_create_kernel_counter(attr, -1, current, NULL, pmc);
+	event = perf_event_create_kernel_counter(attr, -1, current, kvm_riscv_pmu_overflow, pmc);
 	if (IS_ERR(event)) {
 		pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
 		return PTR_ERR(event);
@@ -466,6 +507,12 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 		}
 	}

+	/* The guest has serviced the interrupt and is starting the counter again */
+	if (test_bit(IRQ_PMU_OVF, vcpu->arch.irqs_pending)) {
+		clear_bit(pmc_index, kvpmu->pmc_overflown);
+		kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_PMU_OVF);
+	}
+
 out:
 	retdata->err_val = sbiret;

@@ -537,7 +584,12 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 		}

 		if (bSnapshot && !sbiret) {
-			//TODO: Add counter overflow support when sscofpmf support is added
+			/* The counter and overflow indices in the snapshot region are relative
+			 * to cbase. Modify the set bit in the counter mask instead of the
+			 * pmc_index, which indicates the absolute counter index.
+			 */
+			if (test_bit(pmc_index, kvpmu->pmc_overflown))
+				kvpmu->sdata->ctr_overflow_mask |= (1UL << i);
 			kvpmu->sdata->ctr_values[i] = pmc->counter_val;
 			kvm_vcpu_write_guest(vcpu, kvpmu->snapshot_addr, kvpmu->sdata,
 					     sizeof(struct riscv_pmu_snapshot_data));
@@ -546,15 +598,19 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
 		if (flags & SBI_PMU_STOP_FLAG_RESET) {
 			pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
 			clear_bit(pmc_index, kvpmu->pmc_in_use);
+			clear_bit(pmc_index, kvpmu->pmc_overflown);
 			if (bSnapshot) {
 				/* Clear the snapshot area for the upcoming deletion event */
 				kvpmu->sdata->ctr_values[i] = 0;
+				/* Only clear the given counter, as the caller is responsible for
+				 * validating both the overflow mask and the configured counters.
+				 */
+				kvpmu->sdata->ctr_overflow_mask &= ~(1UL << i);
 				kvm_vcpu_write_guest(vcpu, kvpmu->snapshot_addr, kvpmu->sdata,
 						     sizeof(struct riscv_pmu_snapshot_data));
 			}
 		}
 	}
-
 out:
 	retdata->err_val = sbiret;

@@ -729,15 +785,16 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
 	if (!kvpmu)
 		return;

-	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
+	for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_KVM_MAX_COUNTERS) {
 		pmc = &kvpmu->pmc[i];
 		pmc->counter_val = 0;
 		kvm_pmu_release_perf_event(pmc);
 		pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
 	}
-	bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
+	bitmap_zero(kvpmu->pmc_in_use, RISCV_KVM_MAX_COUNTERS);
+	bitmap_zero(kvpmu->pmc_overflown, RISCV_KVM_MAX_COUNTERS);
 	memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
-	kvpmu->snapshot_addr = INVALID_GPA;
+	kvm_pmu_clear_snapshot_area(vcpu);
 }

 void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
-- 
2.34.1
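
Illustration only, not part of the patch: the ctr_stop hunk keeps two
overflow bitmaps with different index bases. pmc_overflown is indexed by
the absolute counter index (pmc_index), while ctr_overflow_mask in the
snapshot area is indexed relative to ctr_base (i). The user-space sketch
below models just that translation. Every model_* name and
MODEL_MAX_COUNTERS is hypothetical, and the real code also writes the
snapshot back to guest memory, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter limit for this model; not the kernel's constant. */
#define MODEL_MAX_COUNTERS 64u

/* Minimal stand-in for the two bitmaps the patch maintains. */
struct model_pmu {
	uint64_t pmc_overflown;     /* bit n: absolute counter n overflowed */
	uint64_t ctr_overflow_mask; /* snapshot mask: bit i is relative to ctr_base */
};

/* Models the overflow handler: record the overflow by absolute index. */
static void model_mark_overflow(struct model_pmu *p, unsigned int pmc_index)
{
	assert(pmc_index < MODEL_MAX_COUNTERS);
	p->pmc_overflown |= UINT64_C(1) << pmc_index;
}

/*
 * Models the stop path: for each counter selected by ctr_mask, copy the
 * overflow state into the snapshot mask using the relative bit position.
 * With reset set, both the absolute and the relative bit are cleared
 * afterwards, mirroring the order of operations in the patch.
 */
static void model_ctr_stop(struct model_pmu *p, unsigned int ctr_base,
			   uint64_t ctr_mask, int reset)
{
	assert(ctr_base < MODEL_MAX_COUNTERS);

	for (unsigned int i = 0; ctr_base + i < MODEL_MAX_COUNTERS; i++) {
		unsigned int pmc_index = ctr_base + i;

		if (!(ctr_mask & (UINT64_C(1) << i)))
			continue;

		if (p->pmc_overflown & (UINT64_C(1) << pmc_index))
			p->ctr_overflow_mask |= UINT64_C(1) << i;

		if (reset) {
			p->pmc_overflown &= ~(UINT64_C(1) << pmc_index);
			p->ctr_overflow_mask &= ~(UINT64_C(1) << i);
		}
	}
}
```

In this model, with ctr_base = 3, an overflow recorded on absolute
counter 5 shows up as bit 2 of the snapshot mask, which is the
distinction the comment in the ctr_stop hunk is drawing.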