2023-02-01 23:12:58

by Atish Kumar Patra

Subject: [PATCH v4 00/14] KVM perf support

This series extends perf support for KVM. The KVM implementation relies
on the SBI PMU extension and trap & emulate of the hpmcounter CSRs.
It exposes virtual counters to the guest and internally
manages them using kernel perf counters.

This series doesn't support counter overflow, as the Sscofpmf extension
doesn't allow a trap & emulate mechanism for the scountovf CSR yet. The required
changes to allow that are under discussion. Supporting the overflow interrupt
also requires AIA interrupt filtering support.

1. PATCH 1-5 are generic KVM/PMU driver improvements.
2. PATCH 10 disables hpmcounter access for now. It will be enabled to maintain
the ABI requirement once the ONE_REG interface is settled.

perf stat works in kvm guests with this series.

Here is an example of running perf stat in a KVM guest.

===========================================================================
/ # /host/apps/perf stat -e instructions -e cycles -e r8000000000000005 \
> -e r8000000000000006 -e r8000000000000007 -e r8000000000000008 \
> -e r800000000000000a perf bench sched messaging -g 10 -l 10

# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

Total time: 7.769 [sec]

Performance counter stats for 'perf bench sched messaging -g 10 -l 10':

73556259604 cycles
73387266056 instructions # 1.00 insn per cycle
0 dTLB-store-misses
0 iTLB-load-misses
0 r8000000000000005
2595 r8000000000000006
2272 r8000000000000007
10 r8000000000000008
0 r800000000000000a

12.173720400 seconds time elapsed

1.002716000 seconds user
21.931047000 seconds sys


Note: The SBI_PMU_FW_SET_TIMER count (event ID r8000000000000005) is zero
because the KVM guest supports Sstc now.

This series can be found here as well.
https://github.com/atishp04/linux/tree/kvm_perf_v4

TODO:
1. Add sscofpmf support.
2. Add a ONE_REG interface for the following operations:
   1. Enable/disable the PMU (should it be at the VM level rather than per vcpu?)
   2. Number of hpmcounters and width of the counters
   3. Init PMU
   4. Allow guest user to access cycle & instret without trapping
3. Move the counter mask to a bitmask instead of an unsigned long so that it can
   work for RV32 systems where the total number of counters is more than 32.
   This will also accommodate future systems which may define the maximum
   number of counters to be more than 64.

Changes from v3->v4:
1. Addressed all the comments on v3.
2. Modified the vcpu_pmu_init to void return type.
3. Redirect illegal instruction trap to guest for invalid hpmcounter access
instead of exiting to userspace.
4. Got rid of unnecessary error messages.

Changes v2->v3:
1. Changed the exported functions to GPL only export.
2. Addressed all the nit comments on v2.
3. Split non-kvm related changes into separate patches.
4. Reorganized PATCH 11 and 10 based on Drew's suggestions.

Changes from v1->v2:
1. Addressed comments from Andrew.
2. Removed kvpmu sanity check.
3. Added a kvm pmu init flag and the sanity check to probe function.
4. Improved the linux vs sbi error code handling.


Atish Patra (14):
perf: RISC-V: Define helper functions expose hpm counter width and
count
perf: RISC-V: Improve privilege mode filtering for perf
RISC-V: Improve SBI PMU extension related definitions
RISC-V: KVM: Define a probe function for SBI extension data structures
RISC-V: KVM: Return correct code for hsm stop function
RISC-V: KVM: Modify SBI extension handler to return SBI error code
RISC-V: KVM: Add skeleton support for perf
RISC-V: KVM: Add SBI PMU extension support
RISC-V: KVM: Make PMU functionality depend on Sscofpmf
RISC-V: KVM: Disable all hpmcounter access for VS/VU mode
RISC-V: KVM: Implement trap & emulate for hpmcounters
RISC-V: KVM: Implement perf support without sampling
RISC-V: KVM: Support firmware events
RISC-V: KVM: Increment firmware pmu events

arch/riscv/include/asm/kvm_host.h | 4 +
arch/riscv/include/asm/kvm_vcpu_pmu.h | 110 +++++
arch/riscv/include/asm/kvm_vcpu_sbi.h | 13 +-
arch/riscv/include/asm/sbi.h | 7 +-
arch/riscv/kvm/Makefile | 1 +
arch/riscv/kvm/main.c | 3 +-
arch/riscv/kvm/tlb.c | 4 +
arch/riscv/kvm/vcpu.c | 7 +
arch/riscv/kvm/vcpu_insn.c | 4 +-
arch/riscv/kvm/vcpu_pmu.c | 627 ++++++++++++++++++++++++++
arch/riscv/kvm/vcpu_sbi.c | 72 ++-
arch/riscv/kvm/vcpu_sbi_base.c | 43 +-
arch/riscv/kvm/vcpu_sbi_hsm.c | 28 +-
arch/riscv/kvm/vcpu_sbi_pmu.c | 85 ++++
arch/riscv/kvm/vcpu_sbi_replace.c | 50 +-
arch/riscv/kvm/vcpu_sbi_v01.c | 18 +-
drivers/perf/riscv_pmu_sbi.c | 64 ++-
include/linux/perf/riscv_pmu.h | 5 +
18 files changed, 1029 insertions(+), 116 deletions(-)
create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
create mode 100644 arch/riscv/kvm/vcpu_pmu.c
create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c

--
2.25.1



2023-02-01 23:13:02

by Atish Kumar Patra

Subject: [PATCH v4 01/14] perf: RISC-V: Define helper functions expose hpm counter width and count

The KVM module needs to know how many hardware counters the platform
supports and how wide they are. Otherwise, it will not be able to present
an optimal set of virtual counters to the guest. The virtual hardware
counters also need to have the same width as the logical hardware
counters for simplicity. However, there shouldn't be a mapping between
virtual hardware counters and logical hardware counters. As we don't
support heterogeneous harts or counters with different widths as of now,
the implementation relies on the counter width of the first available
programmable counter.
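
For readers following along outside the kernel tree, the lookup described
above can be sketched roughly as follows. This is a userspace approximation,
not the driver code: struct ctr_info, get_hpm_info() and MAX_COUNTERS are
hypothetical stand-ins for the kernel's sbi_pmu_ctr_info,
riscv_pmu_get_hpm_info() and RISCV_MAX_COUNTERS, and the open-coded bit test
replaces for_each_set_bit().

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COUNTERS 64      /* hypothetical stand-in for RISCV_MAX_COUNTERS */
#define CSR_CYCLE    0xc00
#define CSR_INSTRET  0xc02

enum ctr_type { CTR_TYPE_HW = 0, CTR_TYPE_FW = 1 };

/* Simplified, hypothetical view of one counter description. */
struct ctr_info {
    unsigned int csr;
    unsigned int width;
    enum ctr_type type;
};

/*
 * Walk every counter whose bit is set in cmask. The width comes from the
 * first counter that is neither cycle nor instret; every hardware counter
 * is counted. The list must have an entry for each set bit in cmask.
 */
static int get_hpm_info(const struct ctr_info *list, uint64_t cmask,
                        uint32_t *hw_ctr_width, uint32_t *num_hw_ctr)
{
    uint32_t width = 0, count = 0;
    int i;

    assert(list && hw_ctr_width && num_hw_ctr);

    if (!cmask)
        return -1;  /* no counters were discovered */

    for (i = 0; i < MAX_COUNTERS; i++) {
        const struct ctr_info *info;

        if (!(cmask & (UINT64_C(1) << i)))
            continue;
        info = &list[i];
        if (!width && info->csr != CSR_CYCLE && info->csr != CSR_INSTRET)
            width = info->width;
        if (info->type == CTR_TYPE_HW)
            count++;
    }

    *hw_ctr_width = width;
    *num_hw_ctr = count;
    return 0;
}
```

Taking the width from a single counter only works because all programmable
counters are assumed to have the same width, as stated above.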

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
drivers/perf/riscv_pmu_sbi.c | 37 ++++++++++++++++++++++++++++++++--
include/linux/perf/riscv_pmu.h | 3 +++
2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index f6507ef..6b53adc 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -44,7 +44,7 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
};

/*
- * RISC-V doesn't have hetergenous harts yet. This need to be part of
+ * RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
static union sbi_pmu_ctr_info *pmu_ctr_list;
@@ -52,6 +52,9 @@ static bool riscv_pmu_use_irq;
static unsigned int riscv_pmu_irq_num;
static unsigned int riscv_pmu_irq;

+/* Cache the available counters in a bitmask */
+static unsigned long cmask;
+
struct sbi_pmu_event_data {
union {
union {
@@ -267,6 +270,37 @@ static bool pmu_sbi_ctr_is_fw(int cidx)
return (info->type == SBI_PMU_CTR_TYPE_FW) ? true : false;
}

+/*
+ * Returns the counter width of a programmable counter and number of hardware
+ * counters. As we don't support heterogeneous CPUs yet, it is okay to just
+ * return the counter width of the first programmable counter.
+ */
+int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
+{
+ int i;
+ union sbi_pmu_ctr_info *info;
+ u32 hpm_width = 0, hpm_count = 0;
+
+ if (!cmask)
+ return -EINVAL;
+
+ for_each_set_bit(i, &cmask, RISCV_MAX_COUNTERS) {
+ info = &pmu_ctr_list[i];
+ if (!info)
+ continue;
+ if (!hpm_width && info->csr != CSR_CYCLE && info->csr != CSR_INSTRET)
+ hpm_width = info->width;
+ if (info->type == SBI_PMU_CTR_TYPE_HW)
+ hpm_count++;
+ }
+
+ *hw_ctr_width = hpm_width;
+ *num_hw_ctr = hpm_count;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);
+
static int pmu_sbi_ctr_get_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -812,7 +846,6 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
static int pmu_sbi_device_probe(struct platform_device *pdev)
{
struct riscv_pmu *pmu = NULL;
- unsigned long cmask = 0;
int ret = -ENODEV;
int num_counters;

diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index e17e86a..a1c3f77 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -73,6 +73,9 @@ void riscv_pmu_legacy_skip_init(void);
static inline void riscv_pmu_legacy_skip_init(void) {};
#endif
struct riscv_pmu *riscv_pmu_alloc(void);
+#ifdef CONFIG_RISCV_PMU_SBI
+int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
+#endif

#endif /* CONFIG_RISCV_PMU */

--
2.25.1


2023-02-01 23:13:05

by Atish Kumar Patra

Subject: [PATCH v4 02/14] perf: RISC-V: Improve privilege mode filtering for perf

Currently, the host driver doesn't have any method to identify if the
requested perf event is from KVM or bare metal. As KVM runs in HS
mode, there is no separate hypervisor privilege mode to distinguish
between the attributes for guest/host.

Improve the privilege mode filtering by using the event specific
config1 field.
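
As a rough illustration of the resulting filter selection, here is a
standalone sketch. The struct event_attr type, the FLAG_* values and
get_filter_flags() are hypothetical simplifications of perf_event_attr, the
SBI_PMU_CFG_FLAG_SET_* macros and pmu_sbi_get_filter_flags(); the flag bit
positions are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative inhibit bits; the real values are defined in asm/sbi.h. */
#define FLAG_VUINH (1UL << 3)
#define FLAG_VSINH (1UL << 4)
#define FLAG_UINH  (1UL << 5)
#define FLAG_SINH  (1UL << 6)

#define CONFIG1_GUEST_EVENTS 0x1UL

/* Hypothetical subset of the perf event attributes used for filtering. */
struct event_attr {
    unsigned long config1;
    bool exclude_kernel;
    bool exclude_user;
    bool exclude_hv;
    bool exclude_host;
    bool exclude_guest;
};

static unsigned long get_filter_flags(const struct event_attr *attr)
{
    unsigned long cflags = 0;
    bool guest_events;

    assert(attr);
    guest_events = attr->config1 & CONFIG1_GUEST_EVENTS;

    /* For guest events, kernel/user map to VS/VU instead of S/U. */
    if (attr->exclude_kernel)
        cflags |= guest_events ? FLAG_VSINH : FLAG_SINH;
    if (attr->exclude_user)
        cflags |= guest_events ? FLAG_VUINH : FLAG_UINH;
    /* The hypervisor itself runs in HS mode, so inhibit S for guests. */
    if (guest_events && attr->exclude_hv)
        cflags |= FLAG_SINH;
    if (attr->exclude_host)
        cflags |= FLAG_UINH | FLAG_SINH;
    if (attr->exclude_guest)
        cflags |= FLAG_VSINH | FLAG_VUINH;

    return cflags;
}
```

The point of the config1 bit is that the same exclude_kernel request ends up
inhibiting VS mode for a KVM-owned event but S mode for a bare metal one.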

Reviewed-by: Andrew Jones <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
drivers/perf/riscv_pmu_sbi.c | 27 ++++++++++++++++++++++-----
include/linux/perf/riscv_pmu.h | 2 ++
2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 6b53adc..71174fa 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -301,6 +301,27 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
}
EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);

+static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
+{
+ unsigned long cflags = 0;
+ bool guest_events = false;
+
+ if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)
+ guest_events = true;
+ if (event->attr.exclude_kernel)
+ cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VSINH : SBI_PMU_CFG_FLAG_SET_SINH;
+ if (event->attr.exclude_user)
+ cflags |= guest_events ? SBI_PMU_CFG_FLAG_SET_VUINH : SBI_PMU_CFG_FLAG_SET_UINH;
+ if (guest_events && event->attr.exclude_hv)
+ cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
+ if (event->attr.exclude_host)
+ cflags |= SBI_PMU_CFG_FLAG_SET_UINH | SBI_PMU_CFG_FLAG_SET_SINH;
+ if (event->attr.exclude_guest)
+ cflags |= SBI_PMU_CFG_FLAG_SET_VSINH | SBI_PMU_CFG_FLAG_SET_VUINH;
+
+ return cflags;
+}
+
static int pmu_sbi_ctr_get_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -311,11 +332,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
uint64_t cbase = 0;
unsigned long cflags = 0;

- if (event->attr.exclude_kernel)
- cflags |= SBI_PMU_CFG_FLAG_SET_SINH;
- if (event->attr.exclude_user)
- cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
-
+ cflags = pmu_sbi_get_filter_flags(event);
/* retrieve the available counter index */
#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index a1c3f77..43fc892 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -26,6 +26,8 @@

#define RISCV_PMU_STOP_FLAG_RESET 1

+#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
+
struct cpu_hw_events {
/* currently enabled events */
int n_events;
--
2.25.1


2023-02-01 23:13:07

by Atish Kumar Patra

Subject: [PATCH v4 03/14] RISC-V: Improve SBI PMU extension related definitions

This patch fixes/improves a few minor things in the SBI PMU extension
definitions.

1. Align all the firmware event names.
2. Add macros for bit positions in cache event ID & ops.

The changes were small enough to combine them together instead
of creating one-line patches.

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/sbi.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 4ca7fba..945b7be 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -169,9 +169,9 @@ enum sbi_pmu_fw_generic_events_t {
SBI_PMU_FW_ILLEGAL_INSN = 4,
SBI_PMU_FW_SET_TIMER = 5,
SBI_PMU_FW_IPI_SENT = 6,
- SBI_PMU_FW_IPI_RECVD = 7,
+ SBI_PMU_FW_IPI_RCVD = 7,
SBI_PMU_FW_FENCE_I_SENT = 8,
- SBI_PMU_FW_FENCE_I_RECVD = 9,
+ SBI_PMU_FW_FENCE_I_RCVD = 9,
SBI_PMU_FW_SFENCE_VMA_SENT = 10,
SBI_PMU_FW_SFENCE_VMA_RCVD = 11,
SBI_PMU_FW_SFENCE_VMA_ASID_SENT = 12,
@@ -215,6 +215,9 @@ enum sbi_pmu_ctr_type {
#define SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK 0x06
#define SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK 0x01

+#define SBI_PMU_EVENT_CACHE_ID_SHIFT 3
+#define SBI_PMU_EVENT_CACHE_OP_SHIFT 1
+
#define SBI_PMU_EVENT_IDX_INVALID 0xFFFFFFFF

/* Flags defined for config matching function */
--
2.25.1


2023-02-01 23:13:09

by Atish Kumar Patra

Subject: [PATCH v4 04/14] RISC-V: KVM: Define a probe function for SBI extension data structures

Currently the probe function just checks if an SBI extension is
registered or not. However, the extension may not want to advertise
itself depending on some other condition.
An additional extension specific probe function will allow
extensions to decide if they want to be advertised to the caller or
not. Any extension that does not require additional dependency checks
can avoid implementing this function.
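
A minimal sketch of the intended probe semantics, assuming a simplified
extension table. The names struct sbi_extension, ext_table, find_ext() and
probe_ext(), as well as the extension IDs 0x10 and 0x20, are hypothetical and
only illustrate the lookup order; they are not the KVM data structures.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu;    /* opaque stand-in for struct kvm_vcpu */

/* Hypothetical, reduced version of an SBI extension descriptor. */
struct sbi_extension {
    unsigned long extid_start;
    unsigned long extid_end;
    unsigned long (*probe)(struct vcpu *vcpu);  /* optional */
};

/* An extension that is registered but chooses not to advertise itself. */
static unsigned long probe_hidden(struct vcpu *vcpu)
{
    (void)vcpu;
    return 0;
}

static const struct sbi_extension ext_table[] = {
    { 0x10, 0x10, NULL },           /* no extra dependency checks */
    { 0x20, 0x20, probe_hidden },   /* extension specific probe */
};

static const struct sbi_extension *find_ext(unsigned long extid)
{
    size_t i;

    for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
        const struct sbi_extension *ext = &ext_table[i];

        assert(ext->extid_start <= ext->extid_end);
        if (extid >= ext->extid_start && extid <= ext->extid_end)
            return ext;
    }
    return NULL;
}

/* Value reported to the guest for a probe of extid. */
static unsigned long probe_ext(struct vcpu *vcpu, unsigned long extid)
{
    const struct sbi_extension *ext = find_ext(extid);

    if (!ext)
        return 0;                   /* not registered at all */
    if (ext->probe)
        return ext->probe(vcpu);    /* let the extension decide */
    return 1;                       /* registered, no probe callback */
}
```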

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/kvm_vcpu_sbi.h | 3 +++
arch/riscv/kvm/vcpu_sbi_base.c | 13 +++++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index f79478a..45ba341 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long *out_val, struct kvm_cpu_trap *utrap,
bool *exit);
+
+ /* Extension specific probe function */
+ unsigned long (*probe)(struct kvm_vcpu *vcpu);
};

void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
index 5d65c63..846d518 100644
--- a/arch/riscv/kvm/vcpu_sbi_base.c
+++ b/arch/riscv/kvm/vcpu_sbi_base.c
@@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
{
int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+ const struct kvm_vcpu_sbi_extension *sbi_ext;

switch (cp->a6) {
case SBI_EXT_BASE_GET_SPEC_VERSION:
@@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
*exit = true;
- } else
- *out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
+ } else {
+ sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
+ if (sbi_ext) {
+ if (sbi_ext->probe)
+ *out_val = sbi_ext->probe(vcpu);
+ else
+ *out_val = 1;
+ } else
+ *out_val = 0;
+ }
break;
case SBI_EXT_BASE_GET_MVENDORID:
*out_val = vcpu->arch.mvendorid;
--
2.25.1


2023-02-01 23:13:13

by Atish Kumar Patra

Subject: [PATCH v4 05/14] RISC-V: KVM: Return correct code for hsm stop function

According to the SBI specification, the stop function can only
return error code SBI_ERR_FAILED. However, currently it returns
-EINVAL which will be mapped to SBI_ERR_INVALID_PARAM.

Return a Linux error code that maps to SBI_ERR_FAILED, i.e. one that doesn't
map to any other SBI error code. While EACCES is not the best error code
to describe the situation, it is close enough and will be replaced
with SBI error codes directly anyway.
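
To see why -EACCES has the desired effect, consider the Linux-to-SBI mapping
that the top SBI layer applies at this point in the series. The sketch below
reproduces that switch in userspace; linux_err_map_sbi() is a hypothetical
name for the copy, and the numeric SBI error values are taken from the SBI
specification.

```c
#include <assert.h>
#include <errno.h>

/* SBI error codes as listed in the SBI specification. */
#define SBI_SUCCESS                 0
#define SBI_ERR_FAILURE             (-1)
#define SBI_ERR_NOT_SUPPORTED       (-2)
#define SBI_ERR_INVALID_PARAM       (-3)
#define SBI_ERR_DENIED              (-4)
#define SBI_ERR_INVALID_ADDRESS     (-5)
#define SBI_ERR_ALREADY_AVAILABLE   (-6)

/* Userspace copy of the mapping done by the top SBI layer. */
static int linux_err_map_sbi(int err)
{
    assert(err <= 0);   /* handlers return 0 or a negative errno */

    switch (err) {
    case 0:
        return SBI_SUCCESS;
    case -EPERM:
        return SBI_ERR_DENIED;
    case -EINVAL:
        return SBI_ERR_INVALID_PARAM;
    case -EFAULT:
        return SBI_ERR_INVALID_ADDRESS;
    case -EOPNOTSUPP:
        return SBI_ERR_NOT_SUPPORTED;
    case -EALREADY:
        return SBI_ERR_ALREADY_AVAILABLE;
    default:
        /* Anything without a dedicated mapping, e.g. -EACCES. */
        return SBI_ERR_FAILURE;
    }
}
```

Since -EACCES has no case of its own, it falls through to the default and
the guest sees the generic failure code, whereas -EINVAL would have been
reported as an invalid parameter.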

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
index 2e915ca..619ac0f 100644
--- a/arch/riscv/kvm/vcpu_sbi_hsm.c
+++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
@@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
{
if (vcpu->arch.power_off)
- return -EINVAL;
+ return -EACCES;

kvm_riscv_vcpu_power_off(vcpu);

--
2.25.1


2023-02-01 23:13:16

by Atish Kumar Patra

Subject: [PATCH v4 09/14] RISC-V: KVM: Make PMU functionality depend on Sscofpmf

The privilege mode filtering feature must be available in the host so
that the host can inhibit the counters while the execution is in HS mode.
Otherwise, the guests may have access to critical host information.

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kvm/vcpu_pmu.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 2dad37f..9a531fe 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -79,6 +79,14 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;

+ /*
+ * PMU functionality should be only available to guests if privilege mode
+ * filtering is available in the host. Otherwise, guest will always count
+ * events while the execution is in hypervisor mode.
+ */
+ if (!riscv_isa_extension_available(NULL, SSCOFPMF))
+ return;
+
ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
if (ret < 0 || !hpm_width || !num_hw_ctrs)
return;
--
2.25.1


2023-02-01 23:13:25

by Atish Kumar Patra

Subject: [PATCH v4 06/14] RISC-V: KVM: Modify SBI extension handler to return SBI error code

Currently, the SBI extension handler is expected to return a Linux error code.
The top SBI layer converts the Linux error code to an SBI specific error code
that can be returned to the guest invoking the SBI calls. This model works
as long as Linux and SBI error codes have 1-to-1 mappings between them.
However, that may not always be true. This patch attempts to disassociate
both these error codes by allowing the SBI extension implementation to
return SBI specific error codes as well.

The extension will continue to return the Linux specific error code, which
will indicate any problem *with* the extension emulation, while the
SBI specific error will indicate the problem *of* the emulation, i.e. the
result that is reported to the guest.
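
The split between the two error channels can be sketched as below. This is a
simplified userspace model, not the KVM code: struct sbi_return, struct
guest_regs, demo_handler() and handle_ecall() are hypothetical names, err_val
is a signed long for readability, and the trap redirection and SBI v0.1
special cases are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SBI_ERR_NOT_SUPPORTED (-2)

/* Hypothetical, reduced form of the data a handler hands back. */
struct sbi_return {
    unsigned long out_val;  /* value for guest a1 */
    long err_val;           /* SBI error for guest a0 */
    bool uexit;             /* forward the call to userspace */
};

/* Hypothetical subset of the guest context. */
struct guest_regs {
    long a0;
    unsigned long a1;
    unsigned long sepc;
};

typedef int (*sbi_handler)(unsigned long funcid, struct sbi_return *retdata);

/* Example handler: the return value is a Linux error, err_val an SBI error. */
static int demo_handler(unsigned long funcid, struct sbi_return *retdata)
{
    switch (funcid) {
    case 0:
        retdata->out_val = 42;
        break;
    case 1:
        retdata->uexit = true;
        break;
    case 2:
        return -ENOMEM;     /* problem *with* the emulation */
    default:
        retdata->err_val = SBI_ERR_NOT_SUPPORTED;   /* problem *of* it */
        break;
    }
    return 0;
}

/* Returns <0 on emulation failure, 0 to exit to userspace, 1 to resume. */
static int handle_ecall(sbi_handler handler, unsigned long funcid,
                        struct guest_regs *regs)
{
    struct sbi_return retdata = { 0 };
    int ret;

    assert(handler && regs);

    ret = handler(funcid, &retdata);
    if (ret < 0)
        return ret;         /* guest registers stay untouched */
    if (retdata.uexit)
        return 0;           /* userspace completes the call */

    regs->a0 = retdata.err_val;
    regs->a1 = retdata.out_val;
    regs->sepc += 4;        /* step over the ecall instruction */
    return 1;
}
```

With this layout a handler no longer has to pick an errno value just to reach
a particular SBI error code.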

Suggested-by: Andrew Jones <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++-
arch/riscv/kvm/vcpu_sbi.c | 61 +++++++++++----------------
arch/riscv/kvm/vcpu_sbi_base.c | 36 +++++++---------
arch/riscv/kvm/vcpu_sbi_hsm.c | 28 ++++++------
arch/riscv/kvm/vcpu_sbi_replace.c | 43 +++++++++----------
arch/riscv/kvm/vcpu_sbi_v01.c | 18 ++++----
6 files changed, 90 insertions(+), 106 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index 45ba341..8425556 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -18,6 +18,13 @@ struct kvm_vcpu_sbi_context {
int return_handled;
};

+struct kvm_vcpu_sbi_return {
+ unsigned long out_val;
+ unsigned long err_val;
+ struct kvm_cpu_trap *utrap;
+ bool uexit;
+};
+
struct kvm_vcpu_sbi_extension {
unsigned long extid_start;
unsigned long extid_end;
@@ -27,8 +34,7 @@ struct kvm_vcpu_sbi_extension {
* specific error codes.
*/
int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val, struct kvm_cpu_trap *utrap,
- bool *exit);
+ struct kvm_vcpu_sbi_return *retdata);

/* Extension specific probe function */
unsigned long (*probe)(struct kvm_vcpu *vcpu);
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index f96991d..fe2897e 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -12,26 +12,6 @@
#include <asm/sbi.h>
#include <asm/kvm_vcpu_sbi.h>

-static int kvm_linux_err_map_sbi(int err)
-{
- switch (err) {
- case 0:
- return SBI_SUCCESS;
- case -EPERM:
- return SBI_ERR_DENIED;
- case -EINVAL:
- return SBI_ERR_INVALID_PARAM;
- case -EFAULT:
- return SBI_ERR_INVALID_ADDRESS;
- case -EOPNOTSUPP:
- return SBI_ERR_NOT_SUPPORTED;
- case -EALREADY:
- return SBI_ERR_ALREADY_AVAILABLE;
- default:
- return SBI_ERR_FAILURE;
- };
-}
-
#ifndef CONFIG_RISCV_SBI_V01
static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
.extid_start = -1UL,
@@ -125,11 +105,14 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
int ret = 1;
bool next_sepc = true;
- bool userspace_exit = false;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
const struct kvm_vcpu_sbi_extension *sbi_ext;
- struct kvm_cpu_trap utrap = { 0 };
- unsigned long out_val = 0;
+ struct kvm_cpu_trap utrap = {0};
+ struct kvm_vcpu_sbi_return sbi_ret = {
+ .out_val = 0,
+ .err_val = 0,
+ .utrap = &utrap,
+ };
bool ext_is_v01 = false;

sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
@@ -139,42 +122,46 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
ext_is_v01 = true;
#endif
- ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
+ ret = sbi_ext->handler(vcpu, run, &sbi_ret);
} else {
/* Return error for unsupported SBI calls */
cp->a0 = SBI_ERR_NOT_SUPPORTED;
goto ecall_done;
}

+ /*
+ * When the SBI extension returns a Linux error code, it exits the ioctl
+ * loop and forwards the error to userspace.
+ */
+ if (ret < 0) {
+ next_sepc = false;
+ goto ecall_done;
+ }
+
/* Handle special error cases i.e trap, exit or userspace forward */
- if (utrap.scause) {
+ if (sbi_ret.utrap->scause) {
/* No need to increment sepc or exit ioctl loop */
ret = 1;
- utrap.sepc = cp->sepc;
- kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
+ sbi_ret.utrap->sepc = cp->sepc;
+ kvm_riscv_vcpu_trap_redirect(vcpu, sbi_ret.utrap);
next_sepc = false;
goto ecall_done;
}

/* Exit ioctl loop or Propagate the error code the guest */
- if (userspace_exit) {
+ if (sbi_ret.uexit) {
next_sepc = false;
ret = 0;
} else {
- /**
- * SBI extension handler always returns an Linux error code. Convert
- * it to the SBI specific error code that can be propagated the SBI
- * caller.
- */
- ret = kvm_linux_err_map_sbi(ret);
- cp->a0 = ret;
+ cp->a0 = sbi_ret.err_val;
ret = 1;
}
ecall_done:
if (next_sepc)
cp->sepc += 4;
- if (!ext_is_v01)
- cp->a1 = out_val;
+ /* a1 should only be updated when we continue the ioctl loop */
+ if (!ext_is_v01 && ret == 1)
+ cp->a1 = sbi_ret.out_val;

return ret;
}
diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
index 846d518..69f4202 100644
--- a/arch/riscv/kvm/vcpu_sbi_base.c
+++ b/arch/riscv/kvm/vcpu_sbi_base.c
@@ -14,24 +14,22 @@
#include <asm/kvm_vcpu_sbi.h>

static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *trap, bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
- int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
const struct kvm_vcpu_sbi_extension *sbi_ext;

switch (cp->a6) {
case SBI_EXT_BASE_GET_SPEC_VERSION:
- *out_val = (KVM_SBI_VERSION_MAJOR <<
+ retdata->out_val = (KVM_SBI_VERSION_MAJOR <<
SBI_SPEC_VERSION_MAJOR_SHIFT) |
KVM_SBI_VERSION_MINOR;
break;
case SBI_EXT_BASE_GET_IMP_ID:
- *out_val = KVM_SBI_IMPID;
+ retdata->out_val = KVM_SBI_IMPID;
break;
case SBI_EXT_BASE_GET_IMP_VERSION:
- *out_val = LINUX_VERSION_CODE;
+ retdata->out_val = LINUX_VERSION_CODE;
break;
case SBI_EXT_BASE_PROBE_EXT:
if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
@@ -43,33 +41,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
* forward it to the userspace
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
- *exit = true;
+ retdata->uexit = true;
} else {
sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
if (sbi_ext) {
if (sbi_ext->probe)
- *out_val = sbi_ext->probe(vcpu);
+ retdata->out_val = sbi_ext->probe(vcpu);
else
- *out_val = 1;
+ retdata->out_val = 1;
} else
- *out_val = 0;
+ retdata->out_val = 0;
}
break;
case SBI_EXT_BASE_GET_MVENDORID:
- *out_val = vcpu->arch.mvendorid;
+ retdata->out_val = vcpu->arch.mvendorid;
break;
case SBI_EXT_BASE_GET_MARCHID:
- *out_val = vcpu->arch.marchid;
+ retdata->out_val = vcpu->arch.marchid;
break;
case SBI_EXT_BASE_GET_MIMPID:
- *out_val = vcpu->arch.mimpid;
+ retdata->out_val = vcpu->arch.mimpid;
break;
default:
- ret = -EOPNOTSUPP;
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
break;
}

- return ret;
+ return 0;
}

const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
@@ -79,17 +77,15 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
};

static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
- struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap,
- bool *exit)
+ struct kvm_run *run,
+ struct kvm_vcpu_sbi_return *retdata)
{
/*
* Both SBI experimental and vendor extensions are
* unconditionally forwarded to userspace.
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
- *exit = true;
+ retdata->uexit = true;
return 0;
}

diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
index 619ac0f..7dca0e9 100644
--- a/arch/riscv/kvm/vcpu_sbi_hsm.c
+++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
@@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)

target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
if (!target_vcpu)
- return -EINVAL;
+ return SBI_ERR_INVALID_PARAM;
if (!target_vcpu->arch.power_off)
- return -EALREADY;
+ return SBI_ERR_ALREADY_AVAILABLE;

reset_cntx = &target_vcpu->arch.guest_reset_context;
/* start address */
@@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
{
if (vcpu->arch.power_off)
- return -EACCES;
+ return SBI_ERR_FAILURE;

kvm_riscv_vcpu_power_off(vcpu);

@@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)

target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
if (!target_vcpu)
- return -EINVAL;
+ return SBI_ERR_INVALID_PARAM;
if (!target_vcpu->arch.power_off)
return SBI_HSM_STATE_STARTED;
else if (vcpu->stat.generic.blocking)
@@ -67,9 +67,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
}

static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap,
- bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
@@ -88,27 +86,29 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
case SBI_EXT_HSM_HART_STATUS:
ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
if (ret >= 0) {
- *out_val = ret;
- ret = 0;
+ retdata->out_val = ret;
+ retdata->err_val = 0;
}
- break;
+ return 0;
case SBI_EXT_HSM_HART_SUSPEND:
switch (cp->a0) {
case SBI_HSM_SUSPEND_RET_DEFAULT:
kvm_riscv_vcpu_wfi(vcpu);
break;
case SBI_HSM_SUSPEND_NON_RET_DEFAULT:
- ret = -EOPNOTSUPP;
+ ret = SBI_ERR_NOT_SUPPORTED;
break;
default:
- ret = -EINVAL;
+ ret = SBI_ERR_INVALID_PARAM;
}
break;
default:
- ret = -EOPNOTSUPP;
+ ret = SBI_ERR_NOT_SUPPORTED;
}

- return ret;
+ retdata->err_val = ret;
+
+ return 0;
}

const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 03a0198..38fa4c0 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -14,15 +14,15 @@
#include <asm/kvm_vcpu_sbi.h>

static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap, bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
- int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
u64 next_cycle;

- if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
- return -EINVAL;
+ if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
+ retdata->err_val = SBI_ERR_INVALID_PARAM;
+ return 0;
+ }

#if __riscv_xlen == 32
next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
@@ -31,7 +31,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
#endif
kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);

- return ret;
+ return 0;
}

const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
@@ -41,8 +41,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
};

static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap, bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
int ret = 0;
unsigned long i;
@@ -51,8 +50,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
unsigned long hmask = cp->a0;
unsigned long hbase = cp->a1;

- if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
- return -EINVAL;
+ if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
+ retdata->err_val = SBI_ERR_INVALID_PARAM;
+ return 0;
+ }

kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
if (hbase != -1UL) {
@@ -76,10 +77,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
};

static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap, bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
- int ret = 0;
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long hmask = cp->a0;
unsigned long hbase = cp->a1;
@@ -116,10 +115,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
*/
break;
default:
- ret = -EOPNOTSUPP;
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
}

- return ret;
+ return 0;
}

const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
@@ -130,14 +129,12 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {

static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap, bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long funcid = cp->a6;
u32 reason = cp->a1;
u32 type = cp->a0;
- int ret = 0;

switch (funcid) {
case SBI_EXT_SRST_RESET:
@@ -146,24 +143,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
KVM_SYSTEM_EVENT_SHUTDOWN,
reason);
- *exit = true;
+ retdata->uexit = true;
break;
case SBI_SRST_RESET_TYPE_COLD_REBOOT:
case SBI_SRST_RESET_TYPE_WARM_REBOOT:
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
KVM_SYSTEM_EVENT_RESET,
reason);
- *exit = true;
+ retdata->uexit = true;
break;
default:
- ret = -EOPNOTSUPP;
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
}
break;
default:
- ret = -EOPNOTSUPP;
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
}

- return ret;
+ return 0;
}

const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
index 489f225..0269e08 100644
--- a/arch/riscv/kvm/vcpu_sbi_v01.c
+++ b/arch/riscv/kvm/vcpu_sbi_v01.c
@@ -14,9 +14,7 @@
#include <asm/kvm_vcpu_sbi.h>

static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
- unsigned long *out_val,
- struct kvm_cpu_trap *utrap,
- bool *exit)
+ struct kvm_vcpu_sbi_return *retdata)
{
ulong hmask;
int i, ret = 0;
@@ -33,7 +31,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
* handled in kernel so we forward these to user-space
*/
kvm_riscv_vcpu_sbi_forward(vcpu, run);
- *exit = true;
+ retdata->uexit = true;
break;
case SBI_EXT_0_1_SET_TIMER:
#if __riscv_xlen == 32
@@ -49,10 +47,10 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
case SBI_EXT_0_1_SEND_IPI:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
- utrap);
+ retdata->utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
- if (utrap->scause)
+ if (retdata->utrap->scause)
break;

for_each_set_bit(i, &hmask, BITS_PER_LONG) {
@@ -65,17 +63,17 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
case SBI_EXT_0_1_SHUTDOWN:
kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
KVM_SYSTEM_EVENT_SHUTDOWN, 0);
- *exit = true;
+ retdata->uexit = true;
break;
case SBI_EXT_0_1_REMOTE_FENCE_I:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
if (cp->a0)
hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
- utrap);
+ retdata->utrap);
else
hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
- if (utrap->scause)
+ if (retdata->utrap->scause)
break;

if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
@@ -103,7 +101,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
}
break;
default:
- ret = -EINVAL;
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
break;
}

--
2.25.1


2023-02-01 23:13:31

by Atish Kumar Patra

Subject: [PATCH v4 08/14] RISC-V: KVM: Add SBI PMU extension support

The SBI PMU extension allows KVM guests to configure, start, stop, and
query the PMU counters in a virtualized environment as well.

In order to allow that, KVM implements the entire SBI PMU extension.

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
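Note for readers: on a 32-bit host the handler in this patch builds each
64-bit SBI argument from two guest registers, low word in the
lower-numbered register (a4/a5 for COUNTER_CFG_MATCH, a3/a4 for
COUNTER_START). A minimal standalone sketch of that step follows; the
helper name sbi_arg64() is hypothetical and is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): join two 32-bit register
 * values into one 64-bit SBI argument, low word first, mirroring the
 * CONFIG_32BIT branch of kvm_sbi_ext_pmu_handler().
 */
static uint64_t sbi_arg64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

For example, sbi_arg64(cp_a4, cp_a5) would stand in for the event data
passed to the config-match call, where cp_a4 and cp_a5 are placeholders
for the guest register values.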
arch/riscv/kvm/Makefile | 2 +-
arch/riscv/kvm/vcpu_sbi.c | 11 +++++
arch/riscv/kvm/vcpu_sbi_pmu.c | 85 +++++++++++++++++++++++++++++++++++
3 files changed, 97 insertions(+), 1 deletion(-)
create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c

diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index 5de1053..278e97c 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
kvm-y += vcpu_sbi_replace.o
kvm-y += vcpu_sbi_hsm.o
kvm-y += vcpu_timer.o
-kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
+kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index fe2897e..15fde15 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
};
#endif

+#ifdef CONFIG_RISCV_PMU_SBI
+extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
+#else
+static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
+ .extid_start = -1UL,
+ .extid_end = -1UL,
+ .handler = NULL,
+};
+#endif
+
static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
&vcpu_sbi_ext_v01,
&vcpu_sbi_ext_base,
@@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
&vcpu_sbi_ext_rfence,
&vcpu_sbi_ext_srst,
&vcpu_sbi_ext_hsm,
+ &vcpu_sbi_ext_pmu,
&vcpu_sbi_ext_experimental,
&vcpu_sbi_ext_vendor,
};
diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
new file mode 100644
index 0000000..e028b0a
--- /dev/null
+++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 Rivos Inc
+ *
+ * Authors:
+ * Atish Patra <[email protected]>
+ */
+
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/kvm_host.h>
+#include <asm/csr.h>
+#include <asm/sbi.h>
+#include <asm/kvm_vcpu_sbi.h>
+
+static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ int ret = 0;
+ struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ unsigned long funcid = cp->a6;
+ uint64_t temp;
+
+ /* Return not supported if PMU is not initialized */
+ if (!kvpmu->init_done)
+ return -EINVAL;
+
+ switch (funcid) {
+ case SBI_EXT_PMU_NUM_COUNTERS:
+ ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, retdata);
+ break;
+ case SBI_EXT_PMU_COUNTER_GET_INFO:
+ ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, retdata);
+ break;
+ case SBI_EXT_PMU_COUNTER_CFG_MATCH:
+#if defined(CONFIG_32BIT)
+ temp = ((uint64_t)cp->a5 << 32) | cp->a4;
+#else
+ temp = cp->a4;
+#endif
+ /*
+ * This can fail if the perf core framework fails to create an event.
+ * Forward the error to user space because it is an error that happened
+ * within the host kernel. The other option would be to convert this to
+ * an SBI error and forward it to the guest.
+ */
+ ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
+ cp->a2, cp->a3, temp, retdata);
+ break;
+ case SBI_EXT_PMU_COUNTER_START:
+#if defined(CONFIG_32BIT)
+ temp = ((uint64_t)cp->a4 << 32) | cp->a3;
+#else
+ temp = cp->a3;
+#endif
+ ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
+ temp, retdata);
+ break;
+ case SBI_EXT_PMU_COUNTER_STOP:
+ ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, retdata);
+ break;
+ case SBI_EXT_PMU_COUNTER_FW_READ:
+ ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, retdata);
+ break;
+ default:
+ retdata->err_val = SBI_ERR_NOT_SUPPORTED;
+ }
+
+ return ret;
+}
+
+unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+ return kvpmu->init_done;
+}
+
+const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
+ .extid_start = SBI_EXT_PMU,
+ .extid_end = SBI_EXT_PMU,
+ .handler = kvm_sbi_ext_pmu_handler,
+ .probe = kvm_sbi_ext_pmu_probe,
+};
--
2.25.1


2023-02-01 23:13:37

by Atish Kumar Patra

Subject: [PATCH v4 10/14] RISC-V: KVM: Disable all hpmcounter access for VS/VU mode

A guest must not get access to any hpmcounter, including cycle/instret,
without any checks. We achieve that by clearing all the bits except the
TM bit in hcounteren.

However, instret and cycle access for guest user space can be enabled
upon explicit request (via ONE REG) or on first trap from VU mode
to maintain ABI requirement in the future. This patch doesn't support
that as ONE REG interface is not settled yet.

Reviewed-by: Andrew Jones <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
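Note for readers: hcounteren has one bit per counter CSR (bit 0 = CY,
bit 1 = TM, bit 2 = IR, bits 3..31 = HPM3..HPM31), so the value 0x02
written by this patch leaves only the time counter directly readable.
Below is a rough standalone sketch of that check. The function name is
made up for illustration, and it models only hcounteren; real hardware
also consults mcounteren and scounteren.

```c
#include <stdbool.h>
#include <stdint.h>

#define CSR_CYCLE 0xc00u	/* first counter CSR; time is 0xc01, instret 0xc02 */

/*
 * Illustrative only: true when the given hcounteren value lets a guest
 * read the counter CSR directly, false when the access would trap.
 */
static bool guest_can_read_directly(uint32_t hcounteren, unsigned int csr_num)
{
	unsigned int bit;

	if (csr_num < CSR_CYCLE || csr_num > CSR_CYCLE + 31)
		return false;

	bit = csr_num - CSR_CYCLE;	/* 0 = CY, 1 = TM, 2 = IR, 3..31 = HPMn */
	return (hcounteren >> bit) & 1u;
}
```

With hcounteren set to 0x02, only the time CSR passes this check; cycle,
instret and every hpmcounter trap to the hypervisor.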
arch/riscv/kvm/main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 58c5489..c5d400f 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -49,7 +49,8 @@ int kvm_arch_hardware_enable(void)
hideleg |= (1UL << IRQ_VS_EXT);
csr_write(CSR_HIDELEG, hideleg);

- csr_write(CSR_HCOUNTEREN, -1UL);
+ /* VS should access only the time counter directly. Everything else should trap */
+ csr_write(CSR_HCOUNTEREN, 0x02);

csr_write(CSR_HVIP, 0);

--
2.25.1


2023-02-01 23:13:42

by Atish Kumar Patra

Subject: [PATCH v4 11/14] RISC-V: KVM: Implement trap & emulate for hpmcounters

As KVM guests only see the virtual PMU counters, every hpmcounter
access should trap, and KVM emulates the read on behalf of the guest.

Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
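Note for readers: the emulation path turns the trapped CSR number into a
virtual counter index by subtracting CSR_CYCLE. The sketch below shows
that mapping for the 64-bit case only. It assumes the CSR table matches
a half-open range [base, base + count), which is not shown in this
patch, and it copies the count of 31 from the macro added here. The
function name is hypothetical.

```c
#define CSR_CYCLE	0xc00u	/* first user-level counter CSR */
#define HPM_CSR_COUNT	31u	/* .count value used by the patch's macro */

/*
 * Illustrative only: map a trapped counter CSR number to a virtual
 * counter index, or return -1 when the CSR is outside the range that
 * the table entry is assumed to cover.
 */
static int hpm_csr_to_cidx(unsigned int csr_num)
{
	if (csr_num < CSR_CYCLE || csr_num >= CSR_CYCLE + HPM_CSR_COUNT)
		return -1;

	return (int)(csr_num - CSR_CYCLE);
}
```

A read of instret (0xc02) would therefore be served from virtual counter
2, and a read of hpmcounter3 (0xc03) from virtual counter 3.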
arch/riscv/include/asm/kvm_vcpu_pmu.h | 15 +++++++
arch/riscv/kvm/vcpu_insn.c | 4 +-
arch/riscv/kvm/vcpu_pmu.c | 59 ++++++++++++++++++++++++++-
3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
index e2b4038..2afaaf5 100644
--- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -48,6 +48,19 @@ struct kvm_pmu {
#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))

+#if defined(CONFIG_32BIT)
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = CSR_CYCLEH, .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm }, \
+{ .base = CSR_CYCLE, .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
+#else
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = CSR_CYCLE, .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
+#endif
+
+int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
+ unsigned long *val, unsigned long new_val,
+ unsigned long wr_mask);
+
int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
struct kvm_vcpu_sbi_return *retdata);
@@ -70,6 +83,8 @@ void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
#else
struct kvm_pmu {
};
+#define KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS \
+{ .base = 0, .count = 0, .func = NULL },

static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c
index 0bb5276..f689337 100644
--- a/arch/riscv/kvm/vcpu_insn.c
+++ b/arch/riscv/kvm/vcpu_insn.c
@@ -213,7 +213,9 @@ struct csr_func {
unsigned long wr_mask);
};

-static const struct csr_func csr_funcs[] = { };
+static const struct csr_func csr_funcs[] = {
+ KVM_RISCV_VCPU_HPMCOUNTER_CSR_FUNCS
+};

/**
* kvm_riscv_vcpu_csr_return -- Handle CSR read/write after user space
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 9a531fe..6fa0065 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -17,6 +17,58 @@

#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)

+static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+ unsigned long *out_val)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ struct kvm_pmc *pmc;
+ u64 enabled, running;
+
+ pmc = &kvpmu->pmc[cidx];
+ if (!pmc->perf_event)
+ return -EINVAL;
+
+ pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
+ *out_val = pmc->counter_val;
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
+ unsigned long *val, unsigned long new_val,
+ unsigned long wr_mask)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ int cidx, ret = KVM_INSN_CONTINUE_NEXT_SEPC;
+
+ if (!kvpmu || !kvpmu->init_done) {
+ /*
+ * In the absence of sscofpmf in the platform, the guest OS may use
+ * the legacy PMU driver to read cycle/instret. In that case,
+ * just return 0 to avoid any illegal trap. However, any other
+ * hpmcounter access should result in an illegal trap as they must
+ * be accessed through SBI PMU only.
+ */
+ if (csr_num == CSR_CYCLE || csr_num == CSR_INSTRET) {
+ *val = 0;
+ return ret;
+ } else {
+ return KVM_INSN_ILLEGAL_TRAP;
+ }
+ }
+
+ /* The counter CSRs are read-only. Thus, any write should result in an illegal trap */
+ if (wr_mask)
+ return KVM_INSN_ILLEGAL_TRAP;
+
+ cidx = csr_num - CSR_CYCLE;
+
+ if (pmu_ctr_read(vcpu, cidx, val) < 0)
+ return KVM_INSN_ILLEGAL_TRAP;
+
+ return ret;
+}
+
int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
{
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
@@ -69,7 +121,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
struct kvm_vcpu_sbi_return *retdata)
{
- /* TODO */
+ int ret;
+
+ ret = pmu_ctr_read(vcpu, cidx, &retdata->out_val);
+ if (ret == -EINVAL)
+ retdata->err_val = SBI_ERR_INVALID_PARAM;
+
return 0;
}

--
2.25.1


2023-02-01 23:13:44

by Atish Kumar Patra

Subject: [PATCH v4 12/14] RISC-V: KVM: Implement perf support without sampling

The RISC-V SBI PMU and Sscofpmf ISA extensions allow supporting perf in
a virtualized environment as well. The KVM implementation
relies on the SBI PMU extension for the most part while trapping
and emulating the CSR reads for counter access.

This patch doesn't have the event sampling support yet.

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
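Note for readers: the config-match path first splits the guest's SBI
event index into a type and a code, then maps them to a perf type and
config. A small standalone sketch of the split is below. The masks
follow the event_idx layout in the SBI specification (type in bits
19:16, code in bits 15:0); the macro, enum and function names are local
to this sketch and are not the kernel's.

```c
#include <stdint.h>

#define EVENT_IDX_TYPE_MASK	0xf0000u	/* bits 19:16 */
#define EVENT_IDX_CODE_MASK	0x0ffffu	/* bits 15:0 */

/* Event types as numbered by the SBI PMU extension. */
enum {
	EVT_HW    = 0x0,	/* general hardware event */
	EVT_CACHE = 0x1,	/* hardware cache event */
	EVT_RAW   = 0x2,	/* raw hardware event */
	EVT_FW    = 0xf,	/* firmware event */
};

/* Extract the event type field from an SBI event index. */
static uint32_t event_type(uint32_t eidx)
{
	return (eidx & EVENT_IDX_TYPE_MASK) >> 16;
}

/* Extract the event code field from an SBI event index. */
static uint32_t event_code(uint32_t eidx)
{
	return eidx & EVENT_IDX_CODE_MASK;
}
```

An event index of 0xf0005, for instance, decodes to a firmware event
with code 5.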
arch/riscv/kvm/vcpu_pmu.c | 360 +++++++++++++++++++++++++++++++++++++-
1 file changed, 356 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 6fa0065..473ad80 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -12,10 +12,189 @@
#include <linux/perf/riscv_pmu.h>
#include <asm/csr.h>
#include <asm/kvm_vcpu_sbi.h>
+#include <asm/bitops.h>
#include <asm/kvm_vcpu_pmu.h>
#include <linux/kvm_host.h>

#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
+#define get_event_type(x) (((x) & SBI_PMU_EVENT_IDX_TYPE_MASK) >> 16)
+#define get_event_code(x) ((x) & SBI_PMU_EVENT_IDX_CODE_MASK)
+
+static enum perf_hw_id hw_event_perf_map[SBI_PMU_HW_GENERAL_MAX] = {
+ [SBI_PMU_HW_CPU_CYCLES] = PERF_COUNT_HW_CPU_CYCLES,
+ [SBI_PMU_HW_INSTRUCTIONS] = PERF_COUNT_HW_INSTRUCTIONS,
+ [SBI_PMU_HW_CACHE_REFERENCES] = PERF_COUNT_HW_CACHE_REFERENCES,
+ [SBI_PMU_HW_CACHE_MISSES] = PERF_COUNT_HW_CACHE_MISSES,
+ [SBI_PMU_HW_BRANCH_INSTRUCTIONS] = PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
+ [SBI_PMU_HW_BRANCH_MISSES] = PERF_COUNT_HW_BRANCH_MISSES,
+ [SBI_PMU_HW_BUS_CYCLES] = PERF_COUNT_HW_BUS_CYCLES,
+ [SBI_PMU_HW_STALLED_CYCLES_FRONTEND] = PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
+ [SBI_PMU_HW_STALLED_CYCLES_BACKEND] = PERF_COUNT_HW_STALLED_CYCLES_BACKEND,
+ [SBI_PMU_HW_REF_CPU_CYCLES] = PERF_COUNT_HW_REF_CPU_CYCLES,
+};
+
+static u64 kvm_pmu_get_sample_period(struct kvm_pmc *pmc)
+{
+ u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);
+ u64 sample_period;
+
+ if (!pmc->counter_val)
+ sample_period = counter_val_mask + 1;
+ else
+ sample_period = (-pmc->counter_val) & counter_val_mask;
+
+ return sample_period;
+}
+
+static u32 kvm_pmu_get_perf_event_type(unsigned long eidx)
+{
+ enum sbi_pmu_event_type etype = get_event_type(eidx);
+ u32 type = PERF_TYPE_MAX;
+
+ switch (etype) {
+ case SBI_PMU_EVENT_TYPE_HW:
+ type = PERF_TYPE_HARDWARE;
+ break;
+ case SBI_PMU_EVENT_TYPE_CACHE:
+ type = PERF_TYPE_HW_CACHE;
+ break;
+ case SBI_PMU_EVENT_TYPE_RAW:
+ case SBI_PMU_EVENT_TYPE_FW:
+ type = PERF_TYPE_RAW;
+ break;
+ default:
+ break;
+ }
+
+ return type;
+}
+
+static bool kvm_pmu_is_fw_event(unsigned long eidx)
+{
+ return get_event_type(eidx) == SBI_PMU_EVENT_TYPE_FW;
+}
+
+static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc)
+{
+ if (pmc->perf_event) {
+ perf_event_disable(pmc->perf_event);
+ perf_event_release_kernel(pmc->perf_event);
+ pmc->perf_event = NULL;
+ }
+}
+
+static u64 kvm_pmu_get_perf_event_hw_config(u32 sbi_event_code)
+{
+ return hw_event_perf_map[sbi_event_code];
+}
+
+static u64 kvm_pmu_get_perf_event_cache_config(u32 sbi_event_code)
+{
+ u64 config = U64_MAX;
+ unsigned int cache_type, cache_op, cache_result;
+
+ /* All the cache event masks lie within 0xFF. No separate masking is necessary */
+ cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >>
+ SBI_PMU_EVENT_CACHE_ID_SHIFT;
+ cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >>
+ SBI_PMU_EVENT_CACHE_OP_SHIFT;
+ cache_result = sbi_event_code & SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK;
+
+ if (cache_type >= PERF_COUNT_HW_CACHE_MAX ||
+ cache_op >= PERF_COUNT_HW_CACHE_OP_MAX ||
+ cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+ return config;
+
+ config = cache_type | (cache_op << 8) | (cache_result << 16);
+
+ return config;
+}
+
+static u64 kvm_pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
+{
+ enum sbi_pmu_event_type etype = get_event_type(eidx);
+ u32 ecode = get_event_code(eidx);
+ u64 config = U64_MAX;
+
+ switch (etype) {
+ case SBI_PMU_EVENT_TYPE_HW:
+ if (ecode < SBI_PMU_HW_GENERAL_MAX)
+ config = kvm_pmu_get_perf_event_hw_config(ecode);
+ break;
+ case SBI_PMU_EVENT_TYPE_CACHE:
+ config = kvm_pmu_get_perf_event_cache_config(ecode);
+ break;
+ case SBI_PMU_EVENT_TYPE_RAW:
+ config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
+ break;
+ case SBI_PMU_EVENT_TYPE_FW:
+ if (ecode < SBI_PMU_FW_MAX)
+ config = (1ULL << 63) | ecode;
+ break;
+ default:
+ break;
+ }
+
+ return config;
+}
+
+static int kvm_pmu_get_fixed_pmc_index(unsigned long eidx)
+{
+ u32 etype = kvm_pmu_get_perf_event_type(eidx);
+ u32 ecode = get_event_code(eidx);
+
+ if (etype != SBI_PMU_EVENT_TYPE_HW)
+ return -EINVAL;
+
+ if (ecode == SBI_PMU_HW_CPU_CYCLES)
+ return 0;
+ else if (ecode == SBI_PMU_HW_INSTRUCTIONS)
+ return 2;
+ else
+ return -EINVAL;
+}
+
+static int kvm_pmu_get_programmable_pmc_index(struct kvm_pmu *kvpmu, unsigned long eidx,
+ unsigned long cbase, unsigned long cmask)
+{
+ int ctr_idx = -1;
+ int i, pmc_idx;
+ int min, max;
+
+ if (kvm_pmu_is_fw_event(eidx)) {
+ /* Firmware counters are mapped 1:1 starting from num_hw_ctrs for simplicity */
+ min = kvpmu->num_hw_ctrs;
+ max = min + kvpmu->num_fw_ctrs;
+ } else {
+ /* First 3 counters are reserved for fixed counters */
+ min = 3;
+ max = kvpmu->num_hw_ctrs;
+ }
+
+ for_each_set_bit(i, &cmask, BITS_PER_LONG) {
+ pmc_idx = i + cbase;
+ if ((pmc_idx >= min && pmc_idx < max) &&
+ !test_bit(pmc_idx, kvpmu->pmc_in_use)) {
+ ctr_idx = pmc_idx;
+ break;
+ }
+ }
+
+ return ctr_idx;
+}
+
+static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
+ unsigned long cbase, unsigned long cmask)
+{
+ int ret;
+
+ /* Fixed counters need to have a fixed mapping as they have a different width */
+ ret = kvm_pmu_get_fixed_pmc_index(eidx);
+ if (ret >= 0)
+ return ret;
+
+ return kvm_pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
+}

static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
unsigned long *out_val)
@@ -34,6 +213,16 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
return 0;
}

+static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ctr_base,
+ unsigned long ctr_mask)
+{
+ /* Make sure we have a valid counter mask requested from the caller */
+ if (!ctr_mask || (ctr_base + __fls(ctr_mask) >= kvm_pmu_num_counters(kvpmu)))
+ return -EINVAL;
+
+ return 0;
+}
+
int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
unsigned long *val, unsigned long new_val,
unsigned long wr_mask)
@@ -97,7 +286,39 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
unsigned long ctr_mask, unsigned long flag, uint64_t ival,
struct kvm_vcpu_sbi_return *retdata)
{
- /* TODO */
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ int i, pmc_index, sbiret = 0;
+ struct kvm_pmc *pmc;
+
+ if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ goto out;
+ }
+
+ /* Start the counters that have been configured and requested by the guest */
+ for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
+ pmc_index = i + ctr_base;
+ if (!test_bit(pmc_index, kvpmu->pmc_in_use))
+ continue;
+ pmc = &kvpmu->pmc[pmc_index];
+ if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
+ pmc->counter_val = ival;
+ if (pmc->perf_event) {
+ if (unlikely(pmc->started)) {
+ sbiret = SBI_ERR_ALREADY_STARTED;
+ continue;
+ }
+ perf_event_period(pmc->perf_event, kvm_pmu_get_sample_period(pmc));
+ perf_event_enable(pmc->perf_event);
+ pmc->started = true;
+ } else {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ }
+ }
+
+out:
+ retdata->err_val = sbiret;
+
return 0;
}

@@ -105,7 +326,45 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
unsigned long ctr_mask, unsigned long flag,
struct kvm_vcpu_sbi_return *retdata)
{
- /* TODO */
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ int i, pmc_index, sbiret = 0;
+ u64 enabled, running;
+ struct kvm_pmc *pmc;
+
+ if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ goto out;
+ }
+
+ /* Stop the counters that have been configured and requested by the guest */
+ for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
+ pmc_index = i + ctr_base;
+ if (!test_bit(pmc_index, kvpmu->pmc_in_use))
+ continue;
+ pmc = &kvpmu->pmc[pmc_index];
+ if (pmc->perf_event) {
+ if (pmc->started) {
+ /* Stop counting the counter */
+ perf_event_disable(pmc->perf_event);
+ pmc->started = false;
+ } else
+ sbiret = SBI_ERR_ALREADY_STOPPED;
+
+ if (flag & SBI_PMU_STOP_FLAG_RESET) {
+ /* Release the counter if this is a reset request */
+ pmc->counter_val += perf_event_read_value(pmc->perf_event,
+ &enabled, &running);
+ kvm_pmu_release_perf_event(pmc);
+ clear_bit(pmc_index, kvpmu->pmc_in_use);
+ }
+ } else {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ }
+ }
+
+out:
+ retdata->err_val = sbiret;
+
return 0;
}

@@ -114,7 +373,88 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
unsigned long eidx, uint64_t evtdata,
struct kvm_vcpu_sbi_return *retdata)
{
- /* TODO */
+ int ctr_idx, sbiret = 0;
+ u64 config;
+ u32 etype = kvm_pmu_get_perf_event_type(eidx);
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ struct perf_event *event;
+ struct kvm_pmc *pmc;
+ struct perf_event_attr attr = {
+ .type = etype,
+ .size = sizeof(struct perf_event_attr),
+ .pinned = true,
+ /*
+ * It should never reach here if the platform doesn't support the sscofpmf
+ * extension as mode filtering won't work without it.
+ */
+ .exclude_host = true,
+ .exclude_hv = true,
+ .exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH),
+ .exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH),
+ .config1 = RISCV_PMU_CONFIG1_GUEST_EVENTS,
+ };
+
+ if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ goto out;
+ }
+
+ if (kvm_pmu_is_fw_event(eidx)) {
+ sbiret = SBI_ERR_NOT_SUPPORTED;
+ goto out;
+ }
+
+ /*
+ * SKIP_MATCH flag indicates the caller is aware of the assigned counter
+ * for this event. Just do a sanity check that it is already marked used.
+ */
+ if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
+ if (!test_bit(ctr_base + __ffs(ctr_mask), kvpmu->pmc_in_use)) {
+ sbiret = SBI_ERR_FAILURE;
+ goto out;
+ }
+ ctr_idx = ctr_base + __ffs(ctr_mask);
+ } else {
+
+ ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
+ if (ctr_idx < 0) {
+ sbiret = SBI_ERR_NOT_SUPPORTED;
+ goto out;
+ }
+ }
+
+ pmc = &kvpmu->pmc[ctr_idx];
+ kvm_pmu_release_perf_event(pmc);
+ pmc->idx = ctr_idx;
+
+ config = kvm_pmu_get_perf_event_config(eidx, evtdata);
+ attr.config = config;
+ if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
+ //TODO: Do we really want to clear the value in hardware counter
+ pmc->counter_val = 0;
+ }
+
+ /*
+ * Set the default sample_period for now. The guest specified value
+ * will be updated in the start call.
+ */
+ attr.sample_period = kvm_pmu_get_sample_period(pmc);
+
+ event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
+ if (IS_ERR(event)) {
+ pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
+ return PTR_ERR(event);
+ }
+
+ set_bit(ctr_idx, kvpmu->pmc_in_use);
+ pmc->perf_event = event;
+ if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
+ perf_event_enable(pmc->perf_event);
+
+ retdata->out_val = ctr_idx;
+out:
+ retdata->err_val = sbiret;
+
return 0;
}

@@ -192,7 +532,19 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)

void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
{
- /* TODO */
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ struct kvm_pmc *pmc;
+ int i;
+
+ if (!kvpmu)
+ return;
+
+ for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
+ pmc = &kvpmu->pmc[i];
+ pmc->counter_val = 0;
+ kvm_pmu_release_perf_event(pmc);
+ }
+ bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
}

void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
--
2.25.1


2023-02-01 23:13:47

by Atish Kumar Patra

Subject: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

This patch only adds the barebone structure of the perf implementation.
Most of the functions return zero at this point and will be implemented
fully in the future.

Signed-off-by: Atish Patra <[email protected]>
---
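Note for readers: the init code in this patch lays out the virtual
counters as follows: index 0 and 2 are the fixed cycle/instret counters,
index 1 (time) is skipped, indexes 3 up to num_hw_ctrs - 1 are
programmable hardware counters, and firmware counters follow from
num_hw_ctrs onwards. The standalone sketch below restates that layout.
The enum and function names are invented for illustration.

```c
/* Illustrative classification of a virtual counter index. */
enum ctr_kind {
	CTR_FIXED,		/* cycle (0) or instret (2) */
	CTR_TIME_RESERVED,	/* index 1, never exposed through perf */
	CTR_HW_PROGRAMMABLE,	/* hpmcounter3 and up */
	CTR_FIRMWARE,		/* num_hw .. num_hw + num_fw - 1 */
	CTR_INVALID,		/* beyond the last counter */
};

static enum ctr_kind classify_counter(unsigned int idx, unsigned int num_hw,
				      unsigned int num_fw)
{
	if (idx >= num_hw + num_fw)
		return CTR_INVALID;
	if (idx == 1)
		return CTR_TIME_RESERVED;
	if (idx >= num_hw)
		return CTR_FIRMWARE;
	if (idx < 3)
		return CTR_FIXED;
	return CTR_HW_PROGRAMMABLE;
}
```

With 8 hardware counters and 32 firmware counters, indexes 8 through 39
are firmware counters and index 40 is out of range.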
arch/riscv/include/asm/kvm_host.h | 4 +
arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
arch/riscv/kvm/Makefile | 1 +
arch/riscv/kvm/vcpu.c | 7 ++
arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
5 files changed, 226 insertions(+)
create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
create mode 100644 arch/riscv/kvm/vcpu_pmu.c

diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 93f43a3..b90be9a 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -18,6 +18,7 @@
#include <asm/kvm_vcpu_insn.h>
#include <asm/kvm_vcpu_sbi.h>
#include <asm/kvm_vcpu_timer.h>
+#include <asm/kvm_vcpu_pmu.h>

#define KVM_MAX_VCPUS 1024

@@ -228,6 +229,9 @@ struct kvm_vcpu_arch {

/* Don't run the VCPU (blocked) */
bool pause;
+
+ /* Performance monitoring context */
+ struct kvm_pmu pmu_context;
};

static inline void kvm_arch_hardware_unsetup(void) {}
diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
new file mode 100644
index 0000000..e2b4038
--- /dev/null
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023 Rivos Inc
+ *
+ * Authors:
+ * Atish Patra <[email protected]>
+ */
+
+#ifndef __KVM_VCPU_RISCV_PMU_H
+#define __KVM_VCPU_RISCV_PMU_H
+
+#include <linux/perf/riscv_pmu.h>
+#include <asm/kvm_vcpu_sbi.h>
+#include <asm/sbi.h>
+
+#ifdef CONFIG_RISCV_PMU_SBI
+#define RISCV_KVM_MAX_FW_CTRS 32
+
+#if RISCV_KVM_MAX_FW_CTRS > 32
+#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
+#endif
+
+#define RISCV_MAX_COUNTERS 64
+
+/* Per virtual pmu counter data */
+struct kvm_pmc {
+ u8 idx;
+ struct perf_event *perf_event;
+ uint64_t counter_val;
+ union sbi_pmu_ctr_info cinfo;
+ /* Event monitoring status */
+ bool started;
+};
+
+/* PMU data structure per vcpu */
+struct kvm_pmu {
+ struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
+ /* Number of the virtual firmware counters available */
+ int num_fw_ctrs;
+ /* Number of the virtual hardware counters available */
+ int num_hw_ctrs;
+ /* A flag to indicate that pmu initialization is done */
+ bool init_done;
+ /* Bitmap of all the virtual counters in use */
+ DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
+};
+
+#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
+#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))
+
+int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
+int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
+ struct kvm_vcpu_sbi_return *retdata);
+int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag, uint64_t ival,
+ struct kvm_vcpu_sbi_return *retdata);
+int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag,
+ struct kvm_vcpu_sbi_return *retdata);
+int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag,
+ unsigned long eidx, uint64_t evtdata,
+ struct kvm_vcpu_sbi_return *retdata);
+int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+ struct kvm_vcpu_sbi_return *retdata);
+void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
+
+#else
+struct kvm_pmu {
+};
+
+static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
+static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
+static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
+#endif /* CONFIG_RISCV_PMU_SBI */
+#endif /* !__KVM_VCPU_RISCV_PMU_H */
diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index 019df920..5de1053 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
kvm-y += vcpu_sbi_replace.o
kvm-y += vcpu_sbi_hsm.o
kvm-y += vcpu_timer.o
+kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index 7c08567..7d010b0 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -138,6 +138,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
WRITE_ONCE(vcpu->arch.irqs_pending, 0);
WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);

+ kvm_riscv_vcpu_pmu_reset(vcpu);
+
vcpu->arch.hfence_head = 0;
vcpu->arch.hfence_tail = 0;
memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
@@ -194,6 +196,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
/* Setup VCPU timer */
kvm_riscv_vcpu_timer_init(vcpu);

+ /* setup performance monitoring */
+ kvm_riscv_vcpu_pmu_init(vcpu);
+
/* Reset VCPU */
kvm_riscv_reset_vcpu(vcpu);

@@ -216,6 +221,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
/* Cleanup VCPU timer */
kvm_riscv_vcpu_timer_deinit(vcpu);

+ kvm_riscv_vcpu_pmu_deinit(vcpu);
+
/* Free unused pages pre-allocated for G-stage page table mappings */
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
}
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
new file mode 100644
index 0000000..2dad37f
--- /dev/null
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 Rivos Inc
+ *
+ * Authors:
+ * Atish Patra <[email protected]>
+ */
+
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/kvm_host.h>
+#include <linux/perf/riscv_pmu.h>
+#include <asm/csr.h>
+#include <asm/kvm_vcpu_sbi.h>
+#include <asm/kvm_vcpu_pmu.h>
+#include <linux/kvm_host.h>
+
+#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
+
+int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+ retdata->out_val = kvm_pmu_num_counters(kvpmu);
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+
+ if (cidx >= RISCV_MAX_COUNTERS || cidx == 1) {
+ retdata->err_val = SBI_ERR_INVALID_PARAM;
+ return 0;
+ }
+
+ retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag, uint64_t ival,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ /* TODO */
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ /* TODO */
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
+ unsigned long ctr_mask, unsigned long flag,
+ unsigned long eidx, uint64_t evtdata,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ /* TODO */
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ /* TODO */
+ return 0;
+}
+
+void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
+{
+ int i = 0, ret, num_hw_ctrs = 0, hpm_width = 0;
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ struct kvm_pmc *pmc;
+
+ ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
+ if (ret < 0 || !hpm_width || !num_hw_ctrs)
+ return;
+
+ /*
+ * It is guaranteed that RISCV_KVM_MAX_FW_CTRS can't exceed 32, as
+ * otherwise the total number of counters could exceed RISCV_MAX_COUNTERS.
+ */
+ kvpmu->num_hw_ctrs = num_hw_ctrs;
+ kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
+
+ /*
+ * There is no correlation between the logical hardware counter and virtual counters.
+ * However, we need to encode a hpmcounter CSR in the counter info field so that
+ * KVM can trap and emulate the read. This works well in the migration use case as
+ * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
+ */
+ for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
+ /* TIME CSR shouldn't be read from perf interface */
+ if (i == 1)
+ continue;
+ pmc = &kvpmu->pmc[i];
+ pmc->idx = i;
+ if (i < kvpmu->num_hw_ctrs) {
+ pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
+ if (i < 3)
+ /* CY, IR counters */
+ pmc->cinfo.width = 63;
+ else
+ pmc->cinfo.width = hpm_width;
+ /*
+ * The CSR number doesn't have any relation with the logical
+ * hardware counters. The CSR numbers are encoded sequentially
+ * to avoid maintaining a map between the virtual counter
+ * and CSR number.
+ */
+ pmc->cinfo.csr = CSR_CYCLE + i;
+ } else {
+ pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
+ pmc->cinfo.width = BITS_PER_LONG - 1;
+ }
+ }
+
+ kvpmu->init_done = true;
+}
+
+void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
+{
+ /* TODO */
+}
+
+void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
+{
+ kvm_riscv_vcpu_pmu_deinit(vcpu);
+}
--
2.25.1


2023-02-01 23:14:02

by Atish Kumar Patra

Subject: [PATCH v4 13/14] RISC-V: KVM: Support firmware events

The SBI PMU extension defines a set of firmware events which can provide
useful information to guests about the number of SBI calls. As the
hypervisor implements the SBI PMU extension, these firmware events
correspond to ecall invocations from VS mode to HS mode. All other firmware
events will always report zero if monitored, as KVM doesn't implement them.

This patch adds all the infrastructure required to support firmware
events.

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/kvm_vcpu_pmu.h | 17 +++
arch/riscv/kvm/vcpu_pmu.c | 142 ++++++++++++++++++++------
2 files changed, 125 insertions(+), 34 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
index 2afaaf5..a1d8b7d 100644
--- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -22,6 +22,14 @@

#define RISCV_MAX_COUNTERS 64

+struct kvm_fw_event {
+ /* Current value of the event */
+ unsigned long value;
+
+ /* Event monitoring status */
+ bool started;
+};
+
/* Per virtual pmu counter data */
struct kvm_pmc {
u8 idx;
@@ -30,11 +38,14 @@ struct kvm_pmc {
union sbi_pmu_ctr_info cinfo;
/* Event monitoring status */
bool started;
+ /* Monitoring event ID */
+ unsigned long event_idx;
};

/* PMU data structure per vcpu */
struct kvm_pmu {
struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
+ struct kvm_fw_event fw_event[RISCV_KVM_MAX_FW_CTRS];
/* Number of the virtual firmware counters available */
int num_fw_ctrs;
/* Number of the virtual hardware counters available */
@@ -57,6 +68,7 @@ struct kvm_pmu {
{ .base = CSR_CYCLE, .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
#endif

+int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid);
int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
unsigned long *val, unsigned long new_val,
unsigned long wr_mask);
@@ -87,6 +99,11 @@ struct kvm_pmu {
{ .base = 0, .count = 0, .func = NULL },

static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
+static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
+{
+ return 0;
+}
+
static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
#endif /* CONFIG_RISCV_PMU_SBI */
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 473ad80..dd16e60 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -202,12 +202,15 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;
u64 enabled, running;
+ int fevent_code;

pmc = &kvpmu->pmc[cidx];
- if (!pmc->perf_event)
- return -EINVAL;

- pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
+ if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ fevent_code = get_event_code(pmc->event_idx);
+ pmc->counter_val = kvpmu->fw_event[fevent_code].value;
+ } else if (pmc->perf_event)
+ pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
*out_val = pmc->counter_val;

return 0;
@@ -223,6 +226,55 @@ static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ct
return 0;
}

+static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, int ctr_idx,
+ struct perf_event_attr *attr, unsigned long flag,
+ unsigned long eidx, unsigned long evtdata)
+{
+ struct perf_event *event;
+
+ kvm_pmu_release_perf_event(pmc);
+ pmc->idx = ctr_idx;
+
+ attr->config = kvm_pmu_get_perf_event_config(eidx, evtdata);
+ if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
+ //TODO: Do we really want to clear the value in hardware counter
+ pmc->counter_val = 0;
+ }
+
+ /*
+ * Set the default sample_period for now. The guest specified value
+ * will be updated in the start call.
+ */
+ attr->sample_period = kvm_pmu_get_sample_period(pmc);
+
+ event = perf_event_create_kernel_counter(attr, -1, current, NULL, pmc);
+ if (IS_ERR(event)) {
+ pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
+ return PTR_ERR(event);
+ }
+
+ pmc->perf_event = event;
+ if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
+ perf_event_enable(pmc->perf_event);
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
+{
+ struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
+ struct kvm_fw_event *fevent;
+
+ if (!kvpmu || fid >= SBI_PMU_FW_MAX)
+ return -EINVAL;
+
+ fevent = &kvpmu->fw_event[fid];
+ if (fevent->started)
+ fevent->value++;
+
+ return 0;
+}
+
int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
unsigned long *val, unsigned long new_val,
unsigned long wr_mask)
@@ -289,6 +341,7 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
int i, pmc_index, sbiret = 0;
struct kvm_pmc *pmc;
+ int fevent_code;

if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
sbiret = SBI_ERR_INVALID_PARAM;
@@ -303,7 +356,22 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
pmc = &kvpmu->pmc[pmc_index];
if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
pmc->counter_val = ival;
- if (pmc->perf_event) {
+ if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ fevent_code = get_event_code(pmc->event_idx);
+ if (fevent_code >= SBI_PMU_FW_MAX) {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ goto out;
+ }
+
+ /* Check if the counter was already started for some reason */
+ if (kvpmu->fw_event[fevent_code].started) {
+ sbiret = SBI_ERR_ALREADY_STARTED;
+ continue;
+ }
+
+ kvpmu->fw_event[fevent_code].started = true;
+ kvpmu->fw_event[fevent_code].value = pmc->counter_val;
+ } else if (pmc->perf_event) {
if (unlikely(pmc->started)) {
sbiret = SBI_ERR_ALREADY_STARTED;
continue;
@@ -330,6 +398,7 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
int i, pmc_index, sbiret = 0;
u64 enabled, running;
struct kvm_pmc *pmc;
+ int fevent_code;

if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
sbiret = SBI_ERR_INVALID_PARAM;
@@ -342,7 +411,18 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
if (!test_bit(pmc_index, kvpmu->pmc_in_use))
continue;
pmc = &kvpmu->pmc[pmc_index];
- if (pmc->perf_event) {
+ if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ fevent_code = get_event_code(pmc->event_idx);
+ if (fevent_code >= SBI_PMU_FW_MAX) {
+ sbiret = SBI_ERR_INVALID_PARAM;
+ goto out;
+ }
+
+ if (!kvpmu->fw_event[fevent_code].started)
+ sbiret = SBI_ERR_ALREADY_STOPPED;
+
+ kvpmu->fw_event[fevent_code].started = false;
+ } else if (pmc->perf_event) {
if (pmc->started) {
/* Stop counting the counter */
perf_event_disable(pmc->perf_event);
@@ -355,11 +435,14 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
pmc->counter_val += perf_event_read_value(pmc->perf_event,
&enabled, &running);
kvm_pmu_release_perf_event(pmc);
- clear_bit(pmc_index, kvpmu->pmc_in_use);
}
} else {
sbiret = SBI_ERR_INVALID_PARAM;
}
+ if (flag & SBI_PMU_STOP_FLAG_RESET) {
+ pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
+ clear_bit(pmc_index, kvpmu->pmc_in_use);
+ }
}

out:
@@ -373,12 +456,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
unsigned long eidx, uint64_t evtdata,
struct kvm_vcpu_sbi_return *retdata)
{
- int ctr_idx, sbiret = 0;
- u64 config;
+ int ctr_idx, ret, sbiret = 0;
+ bool is_fevent;
+ unsigned long event_code;
u32 etype = kvm_pmu_get_perf_event_type(eidx);
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
- struct perf_event *event;
- struct kvm_pmc *pmc;
+ struct kvm_pmc *pmc = NULL;
struct perf_event_attr attr = {
.type = etype,
.size = sizeof(struct perf_event_attr),
@@ -399,7 +482,9 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
goto out;
}

- if (kvm_pmu_is_fw_event(eidx)) {
+ event_code = get_event_code(eidx);
+ is_fevent = kvm_pmu_is_fw_event(eidx);
+ if (is_fevent && event_code >= SBI_PMU_FW_MAX) {
sbiret = SBI_ERR_NOT_SUPPORTED;
goto out;
}
@@ -424,33 +509,18 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
}

pmc = &kvpmu->pmc[ctr_idx];
- kvm_pmu_release_perf_event(pmc);
- pmc->idx = ctr_idx;
-
- config = kvm_pmu_get_perf_event_config(eidx, evtdata);
- attr.config = config;
- if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
- //TODO: Do we really want to clear the value in hardware counter
- pmc->counter_val = 0;
- }
-
- /*
- * Set the default sample_period for now. The guest specified value
- * will be updated in the start call.
- */
- attr.sample_period = kvm_pmu_get_sample_period(pmc);
-
- event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
- if (IS_ERR(event)) {
- pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
- return PTR_ERR(event);
+ if (is_fevent) {
+ if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
+ kvpmu->fw_event[event_code].started = true;
+ } else {
+ ret = kvm_pmu_create_perf_event(pmc, ctr_idx, &attr, flag, eidx, evtdata);
+ if (ret)
+ return ret;
}

set_bit(ctr_idx, kvpmu->pmc_in_use);
- pmc->perf_event = event;
- if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
- perf_event_enable(pmc->perf_event);

+ pmc->event_idx = eidx;
retdata->out_val = ctr_idx;
out:
retdata->err_val = sbiret;
@@ -494,6 +564,7 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
*/
kvpmu->num_hw_ctrs = num_hw_ctrs;
kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
+ memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));

/*
* There is no correlation between the logical hardware counter and virtual counters.
@@ -507,6 +578,7 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
continue;
pmc = &kvpmu->pmc[i];
pmc->idx = i;
+ pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
if (i < kvpmu->num_hw_ctrs) {
pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
if (i < 3)
@@ -543,8 +615,10 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
pmc = &kvpmu->pmc[i];
pmc->counter_val = 0;
kvm_pmu_release_perf_event(pmc);
+ pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
}
bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
+ memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
}

void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
--
2.25.1
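
The firmware-event bookkeeping in this patch boils down to a value and a started flag per event. A minimal standalone sketch of that state machine follows; the `sketch_*` names and the table size are hypothetical, and the initial value is passed directly rather than taken from a counter's saved value as the patch does.

```c
#include <stdbool.h>

#define SKETCH_FW_MAX 4	/* hypothetical; stands in for SBI_PMU_FW_MAX */

struct sketch_fw_event {
	unsigned long value;	/* current value of the event */
	bool started;		/* event monitoring status */
};

/* Returns 0 on success, -1 if fid is out of range or already started. */
static int sketch_fw_start(struct sketch_fw_event *ev, unsigned long fid,
			   unsigned long ival)
{
	if (fid >= SKETCH_FW_MAX || ev[fid].started)
		return -1;
	ev[fid].started = true;
	ev[fid].value = ival;
	return 0;
}

/* Returns 0 on success, -1 if fid is out of range or already stopped. */
static int sketch_fw_stop(struct sketch_fw_event *ev, unsigned long fid)
{
	if (fid >= SKETCH_FW_MAX || !ev[fid].started)
		return -1;
	ev[fid].started = false;
	return 0;
}

/* Called from the emulation paths; counts only while the event is started. */
static void sketch_fw_incr(struct sketch_fw_event *ev, unsigned long fid)
{
	if (fid < SKETCH_FW_MAX && ev[fid].started)
		ev[fid].value++;
}
```

An increment that arrives before start or after stop is dropped, which mirrors the `fevent->started` check in kvm_riscv_vcpu_pmu_incr_fw().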


2023-02-01 23:14:13

by Atish Kumar Patra

Subject: [PATCH v4 14/14] RISC-V: KVM: Increment firmware pmu events

KVM supports firmware events now. Invoke the firmware event increment
function from appropriate places.

Reviewed-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/kvm/tlb.c | 4 ++++
arch/riscv/kvm/vcpu_sbi_replace.c | 7 +++++++
2 files changed, 11 insertions(+)

diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c
index 309d79b..b797f7c 100644
--- a/arch/riscv/kvm/tlb.c
+++ b/arch/riscv/kvm/tlb.c
@@ -181,6 +181,7 @@ void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu)

void kvm_riscv_fence_i_process(struct kvm_vcpu *vcpu)
{
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_RCVD);
local_flush_icache_all();
}

@@ -264,15 +265,18 @@ void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu)
d.addr, d.size, d.order);
break;
case KVM_RISCV_HFENCE_VVMA_ASID_GVA:
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
kvm_riscv_local_hfence_vvma_asid_gva(
READ_ONCE(v->vmid), d.asid,
d.addr, d.size, d.order);
break;
case KVM_RISCV_HFENCE_VVMA_ASID_ALL:
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_RCVD);
kvm_riscv_local_hfence_vvma_asid_all(
READ_ONCE(v->vmid), d.asid);
break;
case KVM_RISCV_HFENCE_VVMA_GVA:
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_RCVD);
kvm_riscv_local_hfence_vvma_gva(
READ_ONCE(v->vmid),
d.addr, d.size, d.order);
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 38fa4c0..7c4d5d3 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -11,6 +11,7 @@
#include <linux/kvm_host.h>
#include <asm/sbi.h>
#include <asm/kvm_vcpu_timer.h>
+#include <asm/kvm_vcpu_pmu.h>
#include <asm/kvm_vcpu_sbi.h>

static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
@@ -24,6 +25,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
return 0;
}

+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_SET_TIMER);
#if __riscv_xlen == 32
next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
#else
@@ -55,6 +57,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
return 0;
}

+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_IPI_SENT);
kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
if (hbase != -1UL) {
if (tmp->vcpu_id < hbase)
@@ -65,6 +68,7 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
ret = kvm_riscv_vcpu_set_interrupt(tmp, IRQ_VS_SOFT);
if (ret < 0)
break;
+ kvm_riscv_vcpu_pmu_incr_fw(tmp, SBI_PMU_FW_IPI_RCVD);
}

return ret;
@@ -87,6 +91,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
switch (funcid) {
case SBI_EXT_RFENCE_REMOTE_FENCE_I:
kvm_riscv_fence_i(vcpu->kvm, hbase, hmask);
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_FENCE_I_SENT);
break;
case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
if (cp->a2 == 0 && cp->a3 == 0)
@@ -94,6 +99,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
else
kvm_riscv_hfence_vvma_gva(vcpu->kvm, hbase, hmask,
cp->a2, cp->a3, PAGE_SHIFT);
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_SENT);
break;
case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
if (cp->a2 == 0 && cp->a3 == 0)
@@ -104,6 +110,7 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
hbase, hmask,
cp->a2, cp->a3,
PAGE_SHIFT, cp->a4);
+ kvm_riscv_vcpu_pmu_incr_fw(vcpu, SBI_PMU_FW_HFENCE_VVMA_ASID_SENT);
break;
case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
--
2.25.1
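
The IPI accounting above counts one "sent" event on the calling vCPU per SBI call and one "received" event on every targeted vCPU. A standalone sketch of that pattern is below. The `sketch_*` names and the fixed vCPU count are hypothetical, and the hart-mask test is an assumption modelled on the SBI hart mask convention (hbase of -1UL selects all harts), since that part of the handler is not shown in the hunk.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_VCPUS  4	/* hypothetical VM size */
#define SKETCH_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

struct sketch_vcpu {
	unsigned long ipi_sent;
	unsigned long ipi_rcvd;
};

static void sketch_send_ipi(struct sketch_vcpu *vcpus, size_t sender,
			    unsigned long hmask, unsigned long hbase)
{
	size_t id;

	vcpus[sender].ipi_sent++;	/* one SENT per SBI call */

	for (id = 0; id < SKETCH_NR_VCPUS; id++) {
		if (hbase != -1UL) {
			/* Skip vCPUs outside the window described by hbase. */
			if (id < hbase || id - hbase >= SKETCH_LONG_BITS)
				continue;
			if (!(hmask & (1UL << (id - hbase))))
				continue;
		}
		vcpus[id].ipi_rcvd++;	/* one RCVD per targeted vCPU */
	}
}
```

For example, hmask 0x5 with hbase 1 targets vCPUs 1 and 3, so the sender records one sent event while two vCPUs each record one received event.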


2023-02-02 04:01:06

by Anup Patel

Subject: Re: [PATCH v4 03/14] RISC-V: Improve SBI PMU extension related definitions

On Thu, Feb 2, 2023 at 4:42 AM Atish Patra <[email protected]> wrote:
>
> This patch fixes/improves a few minor things in the SBI PMU extension
> definitions.
>
> 1. Align all the firmware event names.
> 2. Add macros for bit positions in cache event ID & ops.
>
> The changes were small enough to combine them instead of
> creating one-line patches.
>
> Signed-off-by: Atish Patra <[email protected]>

Looks good to me.

Reviewed-by: Anup Patel <[email protected]>

Regards,
Anup

> ---
> arch/riscv/include/asm/sbi.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 4ca7fba..945b7be 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -169,9 +169,9 @@ enum sbi_pmu_fw_generic_events_t {
> SBI_PMU_FW_ILLEGAL_INSN = 4,
> SBI_PMU_FW_SET_TIMER = 5,
> SBI_PMU_FW_IPI_SENT = 6,
> - SBI_PMU_FW_IPI_RECVD = 7,
> + SBI_PMU_FW_IPI_RCVD = 7,
> SBI_PMU_FW_FENCE_I_SENT = 8,
> - SBI_PMU_FW_FENCE_I_RECVD = 9,
> + SBI_PMU_FW_FENCE_I_RCVD = 9,
> SBI_PMU_FW_SFENCE_VMA_SENT = 10,
> SBI_PMU_FW_SFENCE_VMA_RCVD = 11,
> SBI_PMU_FW_SFENCE_VMA_ASID_SENT = 12,
> @@ -215,6 +215,9 @@ enum sbi_pmu_ctr_type {
> #define SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK 0x06
> #define SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK 0x01
>
> +#define SBI_PMU_EVENT_CACHE_ID_SHIFT 3
> +#define SBI_PMU_EVENT_CACHE_OP_SHIFT 1
> +
> #define SBI_PMU_EVENT_IDX_INVALID 0xFFFFFFFF
>
> /* Flags defined for config matching function */
> --
> 2.25.1
>
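
For context on the two new shift macros: together with the existing masks they describe how a cache event code is packed, with the result in bit 0, the operation in bits 2:1 and the cache ID from bit 3 upward. A minimal sketch of packing and unpacking under that reading follows; the `sketch_*` helpers are hypothetical and the constants simply mirror the values quoted in the diff.

```c
/* Values mirror the macros in the quoted diff; the names are hypothetical. */
#define SKETCH_CACHE_ID_SHIFT    3
#define SKETCH_CACHE_OP_SHIFT    1
#define SKETCH_CACHE_OP_MASK     0x06
#define SKETCH_CACHE_RESULT_MASK 0x01

/* Pack a cache event code from its three fields. */
static unsigned int sketch_cache_code(unsigned int cache_id, unsigned int op,
				      unsigned int result)
{
	return (cache_id << SKETCH_CACHE_ID_SHIFT) |
	       ((op << SKETCH_CACHE_OP_SHIFT) & SKETCH_CACHE_OP_MASK) |
	       (result & SKETCH_CACHE_RESULT_MASK);
}

static unsigned int sketch_cache_id(unsigned int code)
{
	return code >> SKETCH_CACHE_ID_SHIFT;
}

static unsigned int sketch_cache_op(unsigned int code)
{
	return (code & SKETCH_CACHE_OP_MASK) >> SKETCH_CACHE_OP_SHIFT;
}

static unsigned int sketch_cache_result(unsigned int code)
{
	return code & SKETCH_CACHE_RESULT_MASK;
}
```

Cache ID 4 with operation 1 and result 1 packs to (4 << 3) | (1 << 1) | 1 = 35, and the three accessors recover the original fields.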

2023-02-02 04:02:06

by Anup Patel

Subject: Re: [PATCH v4 06/14] RISC-V: KVM: Modify SBI extension handler to return SBI error code

On Thu, Feb 2, 2023 at 4:42 AM Atish Patra <[email protected]> wrote:
>
> Currently, the SBI extension handler is expected to return a Linux error code.
> The top SBI layer converts the Linux error code to an SBI specific error code
> that can be returned to the guest invoking the SBI calls. This model works
> as long as the Linux and SBI error codes have a 1-to-1 mapping between them.
> However, that may not always be true. This patch attempts to disassociate
> both these error codes by allowing the SBI extension implementation to
> return SBI specific error codes as well.
>
> The extension will continue to return the Linux specific error code, which
> will indicate any problem *with* the extension emulation, while the
> SBI specific error will indicate the problem *of* the emulation.
>
> Suggested-by: Andrew Jones <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++-
> arch/riscv/kvm/vcpu_sbi.c | 61 +++++++++++----------------
> arch/riscv/kvm/vcpu_sbi_base.c | 36 +++++++---------
> arch/riscv/kvm/vcpu_sbi_hsm.c | 28 ++++++------
> arch/riscv/kvm/vcpu_sbi_replace.c | 43 +++++++++----------
> arch/riscv/kvm/vcpu_sbi_v01.c | 18 ++++----
> 6 files changed, 90 insertions(+), 106 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> index 45ba341..8425556 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> @@ -18,6 +18,13 @@ struct kvm_vcpu_sbi_context {
> int return_handled;
> };
>
> +struct kvm_vcpu_sbi_return {
> + unsigned long out_val;
> + unsigned long err_val;
> + struct kvm_cpu_trap *utrap;

This should not be a pointer.

> + bool uexit;
> +};
> +
> struct kvm_vcpu_sbi_extension {
> unsigned long extid_start;
> unsigned long extid_end;
> @@ -27,8 +34,7 @@ struct kvm_vcpu_sbi_extension {
> * specific error codes.
> */
> int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val, struct kvm_cpu_trap *utrap,
> - bool *exit);
> + struct kvm_vcpu_sbi_return *retdata);
>
> /* Extension specific probe function */
> unsigned long (*probe)(struct kvm_vcpu *vcpu);
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index f96991d..fe2897e 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -12,26 +12,6 @@
> #include <asm/sbi.h>
> #include <asm/kvm_vcpu_sbi.h>
>
> -static int kvm_linux_err_map_sbi(int err)
> -{
> - switch (err) {
> - case 0:
> - return SBI_SUCCESS;
> - case -EPERM:
> - return SBI_ERR_DENIED;
> - case -EINVAL:
> - return SBI_ERR_INVALID_PARAM;
> - case -EFAULT:
> - return SBI_ERR_INVALID_ADDRESS;
> - case -EOPNOTSUPP:
> - return SBI_ERR_NOT_SUPPORTED;
> - case -EALREADY:
> - return SBI_ERR_ALREADY_AVAILABLE;
> - default:
> - return SBI_ERR_FAILURE;
> - };
> -}
> -
> #ifndef CONFIG_RISCV_SBI_V01
> static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> .extid_start = -1UL,
> @@ -125,11 +105,14 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> int ret = 1;
> bool next_sepc = true;
> - bool userspace_exit = false;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> const struct kvm_vcpu_sbi_extension *sbi_ext;
> - struct kvm_cpu_trap utrap = { 0 };
> - unsigned long out_val = 0;
> + struct kvm_cpu_trap utrap = {0};
> + struct kvm_vcpu_sbi_return sbi_ret = {
> + .out_val = 0,
> + .err_val = 0,
> + .utrap = &utrap,
> + };
> bool ext_is_v01 = false;
>
> sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
> @@ -139,42 +122,46 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
> ext_is_v01 = true;
> #endif
> - ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
> + ret = sbi_ext->handler(vcpu, run, &sbi_ret);
> } else {
> /* Return error for unsupported SBI calls */
> cp->a0 = SBI_ERR_NOT_SUPPORTED;
> goto ecall_done;
> }
>
> + /*
> + * When the SBI extension returns a Linux error code, it exits the ioctl
> + * loop and forwards the error to userspace.
> + */
> + if (ret < 0) {
> + next_sepc = false;
> + goto ecall_done;
> + }
> +
> /* Handle special error cases i.e trap, exit or userspace forward */
> - if (utrap.scause) {
> + if (sbi_ret.utrap->scause) {
> /* No need to increment sepc or exit ioctl loop */
> ret = 1;
> - utrap.sepc = cp->sepc;
> - kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
> + sbi_ret.utrap->sepc = cp->sepc;
> + kvm_riscv_vcpu_trap_redirect(vcpu, sbi_ret.utrap);
> next_sepc = false;
> goto ecall_done;
> }
>
> /* Exit ioctl loop or Propagate the error code the guest */
> - if (userspace_exit) {
> + if (sbi_ret.uexit) {
> next_sepc = false;
> ret = 0;
> } else {
> - /**
> - * SBI extension handler always returns an Linux error code. Convert
> - * it to the SBI specific error code that can be propagated the SBI
> - * caller.
> - */
> - ret = kvm_linux_err_map_sbi(ret);
> - cp->a0 = ret;
> + cp->a0 = sbi_ret.err_val;
> ret = 1;
> }
> ecall_done:
> if (next_sepc)
> cp->sepc += 4;
> - if (!ext_is_v01)
> - cp->a1 = out_val;
> + /* a1 should only be updated when we continue the ioctl loop */
> + if (!ext_is_v01 && ret == 1)
> + cp->a1 = sbi_ret.out_val;
>
> return ret;
> }
> diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> index 846d518..69f4202 100644
> --- a/arch/riscv/kvm/vcpu_sbi_base.c
> +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> @@ -14,24 +14,22 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *trap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> const struct kvm_vcpu_sbi_extension *sbi_ext;
>
> switch (cp->a6) {
> case SBI_EXT_BASE_GET_SPEC_VERSION:
> - *out_val = (KVM_SBI_VERSION_MAJOR <<
> + retdata->out_val = (KVM_SBI_VERSION_MAJOR <<
> SBI_SPEC_VERSION_MAJOR_SHIFT) |
> KVM_SBI_VERSION_MINOR;
> break;
> case SBI_EXT_BASE_GET_IMP_ID:
> - *out_val = KVM_SBI_IMPID;
> + retdata->out_val = KVM_SBI_IMPID;
> break;
> case SBI_EXT_BASE_GET_IMP_VERSION:
> - *out_val = LINUX_VERSION_CODE;
> + retdata->out_val = LINUX_VERSION_CODE;
> break;
> case SBI_EXT_BASE_PROBE_EXT:
> if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
> @@ -43,33 +41,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> * forward it to the userspace
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> } else {
> sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> if (sbi_ext) {
> if (sbi_ext->probe)
> - *out_val = sbi_ext->probe(vcpu);
> + retdata->out_val = sbi_ext->probe(vcpu);
> else
> - *out_val = 1;
> + retdata->out_val = 1;
> } else
> - *out_val = 0;
> + retdata->out_val = 0;
> }
> break;
> case SBI_EXT_BASE_GET_MVENDORID:
> - *out_val = vcpu->arch.mvendorid;
> + retdata->out_val = vcpu->arch.mvendorid;
> break;
> case SBI_EXT_BASE_GET_MARCHID:
> - *out_val = vcpu->arch.marchid;
> + retdata->out_val = vcpu->arch.marchid;
> break;
> case SBI_EXT_BASE_GET_MIMPID:
> - *out_val = vcpu->arch.mimpid;
> + retdata->out_val = vcpu->arch.mimpid;
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> break;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> @@ -79,17 +77,15 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> };
>
> static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
> - struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_run *run,
> + struct kvm_vcpu_sbi_return *retdata)
> {
> /*
> * Both SBI experimental and vendor extensions are
> * unconditionally forwarded to userspace.
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> return 0;
> }
>
> diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> index 619ac0f..7dca0e9 100644
> --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> @@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
>
> target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> if (!target_vcpu)
> - return -EINVAL;
> + return SBI_ERR_INVALID_PARAM;
> if (!target_vcpu->arch.power_off)
> - return -EALREADY;
> + return SBI_ERR_ALREADY_AVAILABLE;
>
> reset_cntx = &target_vcpu->arch.guest_reset_context;
> /* start address */
> @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> {
> if (vcpu->arch.power_off)
> - return -EACCES;
> + return SBI_ERR_FAILURE;
>
> kvm_riscv_vcpu_power_off(vcpu);
>
> @@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
>
> target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> if (!target_vcpu)
> - return -EINVAL;
> + return SBI_ERR_INVALID_PARAM;
> if (!target_vcpu->arch.power_off)
> return SBI_HSM_STATE_STARTED;
> else if (vcpu->stat.generic.blocking)
> @@ -67,9 +67,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> }
>
> static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> @@ -88,27 +86,29 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_HSM_HART_STATUS:
> ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
> if (ret >= 0) {
> - *out_val = ret;
> - ret = 0;
> + retdata->out_val = ret;
> + retdata->err_val = 0;
> }
> - break;
> + return 0;
> case SBI_EXT_HSM_HART_SUSPEND:
> switch (cp->a0) {
> case SBI_HSM_SUSPEND_RET_DEFAULT:
> kvm_riscv_vcpu_wfi(vcpu);
> break;
> case SBI_HSM_SUSPEND_NON_RET_DEFAULT:
> - ret = -EOPNOTSUPP;
> + ret = SBI_ERR_NOT_SUPPORTED;
> break;
> default:
> - ret = -EINVAL;
> + ret = SBI_ERR_INVALID_PARAM;
> }
> break;
> default:
> - ret = -EOPNOTSUPP;
> + ret = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + retdata->err_val = ret;
> +
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index 03a0198..38fa4c0 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -14,15 +14,15 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> u64 next_cycle;
>
> - if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
> - return -EINVAL;
> + if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
>
> #if __riscv_xlen == 32
> next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> @@ -31,7 +31,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> #endif
> kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> @@ -41,8 +41,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> };
>
> static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> int ret = 0;
> unsigned long i;
> @@ -51,8 +50,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> unsigned long hmask = cp->a0;
> unsigned long hbase = cp->a1;
>
> - if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
> - return -EINVAL;
> + if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
>
> kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> if (hbase != -1UL) {
> @@ -76,10 +77,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
> };
>
> static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> unsigned long hmask = cp->a0;
> unsigned long hbase = cp->a1;
> @@ -116,10 +115,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> */
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> @@ -130,14 +129,12 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
>
> static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> unsigned long funcid = cp->a6;
> u32 reason = cp->a1;
> u32 type = cp->a0;
> - int ret = 0;
>
> switch (funcid) {
> case SBI_EXT_SRST_RESET:
> @@ -146,24 +143,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_SHUTDOWN,
> reason);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_SRST_RESET_TYPE_COLD_REBOOT:
> case SBI_SRST_RESET_TYPE_WARM_REBOOT:
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_RESET,
> reason);
> - *exit = true;
> + retdata->uexit = true;
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
> diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> index 489f225..0269e08 100644
> --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> @@ -14,9 +14,7 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> ulong hmask;
> int i, ret = 0;
> @@ -33,7 +31,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> * handled in kernel so we forward these to user-space
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_EXT_0_1_SET_TIMER:
> #if __riscv_xlen == 32
> @@ -49,10 +47,10 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_0_1_SEND_IPI:
> if (cp->a0)
> hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> - utrap);
> + retdata->utrap);
> else
> hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> - if (utrap->scause)
> + if (retdata->utrap->scause)
> break;
>
> for_each_set_bit(i, &hmask, BITS_PER_LONG) {
> @@ -65,17 +63,17 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_0_1_SHUTDOWN:
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_SHUTDOWN, 0);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_EXT_0_1_REMOTE_FENCE_I:
> case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
> case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
> if (cp->a0)
> hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> - utrap);
> + retdata->utrap);
> else
> hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> - if (utrap->scause)
> + if (retdata->utrap->scause)
> break;
>
> if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> @@ -103,7 +101,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> }
> break;
> default:
> - ret = -EINVAL;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> break;
> }
>
> --
> 2.25.1
>

Regards,
Anup
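
The split described in the commit message, a Linux error for problems with the emulation and an SBI error for problems of the emulated call, can be sketched in standalone C as below. Everything named `sketch_*` is hypothetical; only the value -2 for "not supported" is taken from the SBI specification, and `err_val` is a signed long here purely to keep the comparison simple.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_SBI_ERR_NOT_SUPPORTED (-2L)	/* SBI_ERR_NOT_SUPPORTED */

struct sketch_sbi_return {
	unsigned long out_val;	/* value handed back to the guest in a1 */
	long err_val;		/* SBI error handed back to the guest in a0 */
	bool uexit;		/* request an exit to userspace */
};

typedef int (*sketch_handler_t)(unsigned long funcid,
				struct sketch_sbi_return *ret);

/* Example handler: one working call, one userspace exit, one host failure. */
static int sketch_handler(unsigned long funcid, struct sketch_sbi_return *ret)
{
	switch (funcid) {
	case 0:
		ret->out_val = 42;
		break;
	case 1:
		ret->uexit = true;
		break;
	case 2:
		return -EIO;	/* problem *with* the emulation */
	default:
		ret->err_val = SKETCH_SBI_ERR_NOT_SUPPORTED;
	}
	return 0;
}

/* Returns <0 on host error, 0 to exit to userspace, 1 to resume the guest. */
static int sketch_ecall(sketch_handler_t handler, unsigned long funcid,
			long *a0, unsigned long *a1)
{
	struct sketch_sbi_return ret = { 0, 0, false };
	int err = handler(funcid, &ret);

	if (err < 0)
		return err;	/* guest registers are left untouched */
	if (ret.uexit)
		return 0;
	*a0 = ret.err_val;	/* problem *of* the emulated call, if any */
	*a1 = ret.out_val;
	return 1;
}
```

An unsupported function ID still resumes the guest, with the SBI error in a0, while a host-side failure returns a negative errno and leaves a0 and a1 as they were.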

2023-02-02 04:03:19

by Anup Patel

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Thu, Feb 2, 2023 at 4:42 AM Atish Patra <[email protected]> wrote:
>
> This patch only adds the barebones structure of the perf implementation.
> Most of the functions return zero at this point and will be implemented
> fully in the future.
>
> Signed-off-by: Atish Patra <[email protected]>

Looks good to me.

Reviewed-by: Anup Patel <[email protected]>

Regards,
Anup

> ---
> arch/riscv/include/asm/kvm_host.h | 4 +
> arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> arch/riscv/kvm/Makefile | 1 +
> arch/riscv/kvm/vcpu.c | 7 ++
> arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> 5 files changed, 226 insertions(+)
> create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> create mode 100644 arch/riscv/kvm/vcpu_pmu.c
>
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 93f43a3..b90be9a 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -18,6 +18,7 @@
> #include <asm/kvm_vcpu_insn.h>
> #include <asm/kvm_vcpu_sbi.h>
> #include <asm/kvm_vcpu_timer.h>
> +#include <asm/kvm_vcpu_pmu.h>
>
> #define KVM_MAX_VCPUS 1024
>
> @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
>
> /* Don't run the VCPU (blocked) */
> bool pause;
> +
> + /* Performance monitoring context */
> + struct kvm_pmu pmu_context;
> };
>
> static inline void kvm_arch_hardware_unsetup(void) {}
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> new file mode 100644
> index 0000000..e2b4038
> --- /dev/null
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#ifndef __KVM_VCPU_RISCV_PMU_H
> +#define __KVM_VCPU_RISCV_PMU_H
> +
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/sbi.h>
> +
> +#ifdef CONFIG_RISCV_PMU_SBI
> +#define RISCV_KVM_MAX_FW_CTRS 32
> +
> +#if RISCV_KVM_MAX_FW_CTRS > 32
> +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
> +#endif
> +
> +#define RISCV_MAX_COUNTERS 64
> +
> +/* Per virtual pmu counter data */
> +struct kvm_pmc {
> + u8 idx;
> + struct perf_event *perf_event;
> + uint64_t counter_val;
> + union sbi_pmu_ctr_info cinfo;
> + /* Event monitoring status */
> + bool started;
> +};
> +
> +/* PMU data structure per vcpu */
> +struct kvm_pmu {
> + struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> + /* Number of the virtual firmware counters available */
> + int num_fw_ctrs;
> + /* Number of the virtual hardware counters available */
> + int num_hw_ctrs;
> + /* A flag to indicate that pmu initialization is done */
> + bool init_done;
> + /* Bitmap of all the virtual counters in use */
> + DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> +};
> +
> +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
> +#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + unsigned long eidx, uint64_t evtdata,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata);
> +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> +
> +#else
> +struct kvm_pmu {
> +};
> +
> +static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> +#endif /* CONFIG_RISCV_PMU_SBI */
> +#endif /* !__KVM_VCPU_RISCV_PMU_H */
> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 019df920..5de1053 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
> kvm-y += vcpu_sbi_replace.o
> kvm-y += vcpu_sbi_hsm.o
> kvm-y += vcpu_timer.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 7c08567..7d010b0 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -138,6 +138,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
> WRITE_ONCE(vcpu->arch.irqs_pending, 0);
> WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
>
> + kvm_riscv_vcpu_pmu_reset(vcpu);
> +
> vcpu->arch.hfence_head = 0;
> vcpu->arch.hfence_tail = 0;
> memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
> @@ -194,6 +196,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> /* Setup VCPU timer */
> kvm_riscv_vcpu_timer_init(vcpu);
>
> + /* setup performance monitoring */
> + kvm_riscv_vcpu_pmu_init(vcpu);
> +
> /* Reset VCPU */
> kvm_riscv_reset_vcpu(vcpu);
>
> @@ -216,6 +221,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> /* Cleanup VCPU timer */
> kvm_riscv_vcpu_timer_deinit(vcpu);
>
> + kvm_riscv_vcpu_pmu_deinit(vcpu);
> +
> /* Free unused pages pre-allocated for G-stage page table mappings */
> kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> }
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> new file mode 100644
> index 0000000..2dad37f
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/csr.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/kvm_vcpu_pmu.h>
> +#include <linux/kvm_host.h>
> +
> +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + retdata->out_val = kvm_pmu_num_counters(kvpmu);
> +
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + if (cidx > RISCV_MAX_COUNTERS || cidx == 1) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
> +
> + retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
> +
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + unsigned long eidx, uint64_t evtdata,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> +{
> + int i = 0, ret, num_hw_ctrs = 0, hpm_width = 0;
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + struct kvm_pmc *pmc;
> +
> + ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> + if (ret < 0 || !hpm_width || !num_hw_ctrs)
> + return;
> +
> + /*
> + * It is guaranteed that RISCV_KVM_MAX_FW_CTRS can't exceed 32, as
> + * otherwise the total number of counters could exceed RISCV_MAX_COUNTERS
> + */
> + kvpmu->num_hw_ctrs = num_hw_ctrs;
> + kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> +
> + /*
> + * There is no correlation between the logical hardware counter and virtual counters.
> + * However, we need to encode a hpmcounter CSR in the counter info field so that
> + * KVM can trap and emulate the read. This works well in the migration use case as
> + * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> + */
> + for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> + /* TIME CSR shouldn't be read from perf interface */
> + if (i == 1)
> + continue;
> + pmc = &kvpmu->pmc[i];
> + pmc->idx = i;
> + if (i < kvpmu->num_hw_ctrs) {
> + pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
> + if (i < 3)
> + /* CY, IR counters */
> + pmc->cinfo.width = 63;
> + else
> + pmc->cinfo.width = hpm_width;
> + /*
> + * The CSR number doesn't have any relation with the logical
> + * hardware counters. The CSR numbers are encoded sequentially
> + * to avoid maintaining a map between the virtual counter
> + * and CSR number.
> + */
> + pmc->cinfo.csr = CSR_CYCLE + i;
> + } else {
> + pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> + pmc->cinfo.width = BITS_PER_LONG - 1;
> + }
> + }
> +
> + kvpmu->init_done = true;
> +}
> +
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> +{
> + /* TODO */
> +}
> +
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> +{
> + kvm_riscv_vcpu_pmu_deinit(vcpu);
> +}
> --
> 2.25.1
>
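
For readers following the counter layout built by kvm_riscv_vcpu_pmu_init() in the patch quoted above, here is a minimal user-space sketch of the same idea. It is not the kernel code: struct vctr, struct vpmu, vpmu_init() and vpmu_ctr_lookup() are hypothetical names, the bits_per_long parameter stands in for BITS_PER_LONG, and the extra guard on the total counter count is an assumption of this sketch. Note that the lookup rejects an index equal to the array size, which is stricter than the `>` comparison in the quoted kvm_riscv_vcpu_pmu_ctr_info().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the definitions in the quoted patch. */
#define MAX_COUNTERS 64     /* mirrors RISCV_MAX_COUNTERS */
#define MAX_FW_CTRS  32     /* mirrors RISCV_KVM_MAX_FW_CTRS */
#define CSR_CYCLE    0xc00  /* CSR number of the cycle counter */

enum ctr_type { CTR_TYPE_HW = 0, CTR_TYPE_FW = 1 };

/* Simplified per-counter info; the patch packs this into cinfo. */
struct vctr {
	int valid;           /* 0 for the skipped TIME slot and unused slots */
	enum ctr_type type;
	unsigned int width;  /* width field, copied as-is from the inputs */
	unsigned int csr;    /* only meaningful for hardware counters */
};

struct vpmu {
	struct vctr ctr[MAX_COUNTERS];
	int num_hw;
	int num_fw;
};

/*
 * Lay out virtual counters: hardware counters first, firmware counters
 * after them, with index 1 (the TIME CSR slot) left unused.
 * Returns 0 on success, -1 if the inputs cannot be laid out.
 */
static int vpmu_init(struct vpmu *p, int num_hw, unsigned int hpm_width,
		     unsigned int bits_per_long)
{
	int i;

	assert(p != NULL);
	memset(p, 0, sizeof(*p));

	/* Sketch-only guard: never index past the end of the array. */
	if (num_hw <= 0 || hpm_width == 0 || num_hw + MAX_FW_CTRS > MAX_COUNTERS)
		return -1;

	p->num_hw = num_hw;
	p->num_fw = MAX_FW_CTRS;

	for (i = 0; i < p->num_hw + p->num_fw; i++) {
		struct vctr *c = &p->ctr[i];

		if (i == 1)	/* TIME is not exposed through the perf interface */
			continue;

		c->valid = 1;
		if (i < p->num_hw) {
			c->type = CTR_TYPE_HW;
			/* Slots 0 and 2 (cycle, instret) use a fixed width value. */
			c->width = (i < 3) ? 63 : hpm_width;
			/* CSR numbers are assigned sequentially from CSR_CYCLE. */
			c->csr = CSR_CYCLE + i;
		} else {
			c->type = CTR_TYPE_FW;
			c->width = bits_per_long - 1;
		}
	}
	return 0;
}

/* Return the counter slot for cidx, or NULL for an invalid index. */
static const struct vctr *vpmu_ctr_lookup(const struct vpmu *p, unsigned long cidx)
{
	if (cidx >= MAX_COUNTERS || cidx == 1)
		return NULL;
	return &p->ctr[cidx];
}
```

With 8 hardware counters, slots 0 and 2 to 7 are hardware counters, slots 8 to 39 are firmware counters, and slot 1 stays unused, so no map between virtual counter index and CSR number is needed.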

2023-02-02 07:53:12

by Conor Dooley

Subject: Re: [PATCH v4 08/14] RISC-V: KVM: Add SBI PMU extension support

Hey Atish,

On Wed, Feb 01, 2023 at 03:12:44PM -0800, Atish Patra wrote:
> The SBI PMU extension allows KVM guests to configure, start, stop, and query
> the PMU counters in a virtualized environment as well.
>
> In order to allow that, KVM implements the entire SBI PMU extension.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>

Couple of complaints from CI for you:

+ 1 ../arch/riscv/kvm/vcpu_sbi_pmu.c:73:15: warning: no previous prototype for 'kvm_sbi_ext_pmu_probe' [-Wmissing-prototypes]
+ 1 ../arch/riscv/kvm/vcpu_sbi_pmu.c:73:15: warning: symbol 'kvm_sbi_ext_pmu_probe' was not declared. Should it be static?
+ 1 ../arch/riscv/kvm/vcpu_sbi_pmu.c:80:37: warning: symbol 'vcpu_sbi_ext_pmu' was not declared. Should it be static?

Thanks,
Conor.

> ---
> arch/riscv/kvm/Makefile | 2 +-
> arch/riscv/kvm/vcpu_sbi.c | 11 +++++
> arch/riscv/kvm/vcpu_sbi_pmu.c | 85 +++++++++++++++++++++++++++++++++++
> 3 files changed, 97 insertions(+), 1 deletion(-)
> create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c
>
> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 5de1053..278e97c 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
> kvm-y += vcpu_sbi_replace.o
> kvm-y += vcpu_sbi_hsm.o
> kvm-y += vcpu_timer.o
> -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index fe2897e..15fde15 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> };
> #endif
>
> +#ifdef CONFIG_RISCV_PMU_SBI
> +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
> +#else
> +static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> + .extid_start = -1UL,
> + .extid_end = -1UL,
> + .handler = NULL,
> +};
> +#endif
> +
> static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> &vcpu_sbi_ext_v01,
> &vcpu_sbi_ext_base,
> @@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> &vcpu_sbi_ext_rfence,
> &vcpu_sbi_ext_srst,
> &vcpu_sbi_ext_hsm,
> + &vcpu_sbi_ext_pmu,
> &vcpu_sbi_ext_experimental,
> &vcpu_sbi_ext_vendor,
> };
> diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
> new file mode 100644
> index 0000000..e028b0a
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <asm/csr.h>
> +#include <asm/sbi.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +
> +static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + int ret = 0;
> + struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + unsigned long funcid = cp->a6;
> + uint64_t temp;
> +
> + /* Return not supported if PMU is not initialized */
> + if (!kvpmu->init_done)
> + return -EINVAL;
> +
> + switch (funcid) {
> + case SBI_EXT_PMU_NUM_COUNTERS:
> + ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_GET_INFO:
> + ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_CFG_MATCH:
> +#if defined(CONFIG_32BIT)
> + temp = ((uint64_t)cp->a5 << 32) | cp->a4;
> +#else
> + temp = cp->a4;
> +#endif
> + /*
> + * This can fail if the perf core framework fails to create an event.
> + * Forward the error to user space because it is an error that happened
> + * within the host kernel. The other option would be to convert this to
> + * an SBI error and forward it to the guest.
> + */
> + ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
> + cp->a2, cp->a3, temp, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_START:
> +#if defined(CONFIG_32BIT)
> + temp = ((uint64_t)cp->a4 << 32) | cp->a3;
> +#else
> + temp = cp->a3;
> +#endif
> + ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
> + temp, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_STOP:
> + ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_FW_READ:
> + ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, retdata);
> + break;
> + default:
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> + }
> +
> + return ret;
> +}
> +
> +unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + return kvpmu->init_done;
> +}
> +
> +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> + .extid_start = SBI_EXT_PMU,
> + .extid_end = SBI_EXT_PMU,
> + .handler = kvm_sbi_ext_pmu_handler,
> + .probe = kvm_sbi_ext_pmu_probe,
> +};
> --
> 2.25.1
>
>
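
The handler quoted above builds 64-bit arguments differently on 32-bit and 64-bit guests: COUNTER_CFG_MATCH reads the event data from a4 (plus a5 on 32-bit), and COUNTER_START reads the initial value from a3 (plus a4 on 32-bit). A minimal sketch of that register pairing, assuming a hypothetical helper sbi_arg64() that takes the register width as a run-time parameter instead of the CONFIG_32BIT preprocessor check:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a 64-bit SBI argument from guest registers.
 * On a 32-bit guest the value is split across two registers, low word
 * first; on a 64-bit guest the first register already holds all of it.
 */
static uint64_t sbi_arg64(uint64_t reg_lo, uint64_t reg_hi, int xlen)
{
	assert(xlen == 32 || xlen == 64);

	if (xlen == 32)
		return ((reg_hi & 0xffffffffULL) << 32) |
		       (reg_lo & 0xffffffffULL);

	/* reg_hi is ignored: it carries the next argument, not a high word. */
	return reg_lo;
}
```

For COUNTER_CFG_MATCH this would be called with (a4, a5), and for COUNTER_START with (a3, a4).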



2023-02-02 08:53:08

by Anup Patel

Subject: Re: [PATCH v4 06/14] RISC-V: KVM: Modify SBI extension handler to return SBI error code

On Thu, Feb 2, 2023 at 9:31 AM Anup Patel <[email protected]> wrote:
>
> On Thu, Feb 2, 2023 at 4:42 AM Atish Patra <[email protected]> wrote:
> >
> > Currently, the SBI extension handler is expected to return a Linux error code.
> > The top SBI layer converts the Linux error code to an SBI-specific error code
> > that can be returned to the guest invoking the SBI calls. This model works
> > as long as Linux and SBI error codes have a 1-to-1 mapping between them.
> > However, that may not always be true. This patch attempts to disassociate
> > both these error codes by allowing the SBI extension implementation to
> > return SBI-specific error codes as well.
> >
> > The extension will continue to return the Linux-specific error code, which
> > will indicate any problem *with* the extension emulation, while the
> > SBI-specific error will indicate the problem *of* the emulation.
> >
> > Suggested-by: Andrew Jones <[email protected]>
> > Signed-off-by: Atish Patra <[email protected]>
> > ---
> > arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++-
> > arch/riscv/kvm/vcpu_sbi.c | 61 +++++++++++----------------
> > arch/riscv/kvm/vcpu_sbi_base.c | 36 +++++++---------
> > arch/riscv/kvm/vcpu_sbi_hsm.c | 28 ++++++------
> > arch/riscv/kvm/vcpu_sbi_replace.c | 43 +++++++++----------
> > arch/riscv/kvm/vcpu_sbi_v01.c | 18 ++++----
> > 6 files changed, 90 insertions(+), 106 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > index 45ba341..8425556 100644
> > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > @@ -18,6 +18,13 @@ struct kvm_vcpu_sbi_context {
> > int return_handled;
> > };
> >
> > +struct kvm_vcpu_sbi_return {
> > + unsigned long out_val;
> > + unsigned long err_val;
> > + struct kvm_cpu_trap *utrap;
>
> This should not be a pointer.

Ignore this comment. A pointer is better here; otherwise we will have a
circular header dependency.

Reviewed-by: Anup Patel <[email protected]>

Thanks,
Anup

>
> > + bool uexit;
> > +};
> > +
> > struct kvm_vcpu_sbi_extension {
> > unsigned long extid_start;
> > unsigned long extid_end;
> > @@ -27,8 +34,7 @@ struct kvm_vcpu_sbi_extension {
> > * specific error codes.
> > */
> > int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val, struct kvm_cpu_trap *utrap,
> > - bool *exit);
> > + struct kvm_vcpu_sbi_return *retdata);
> >
> > /* Extension specific probe function */
> > unsigned long (*probe)(struct kvm_vcpu *vcpu);
> > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> > index f96991d..fe2897e 100644
> > --- a/arch/riscv/kvm/vcpu_sbi.c
> > +++ b/arch/riscv/kvm/vcpu_sbi.c
> > @@ -12,26 +12,6 @@
> > #include <asm/sbi.h>
> > #include <asm/kvm_vcpu_sbi.h>
> >
> > -static int kvm_linux_err_map_sbi(int err)
> > -{
> > - switch (err) {
> > - case 0:
> > - return SBI_SUCCESS;
> > - case -EPERM:
> > - return SBI_ERR_DENIED;
> > - case -EINVAL:
> > - return SBI_ERR_INVALID_PARAM;
> > - case -EFAULT:
> > - return SBI_ERR_INVALID_ADDRESS;
> > - case -EOPNOTSUPP:
> > - return SBI_ERR_NOT_SUPPORTED;
> > - case -EALREADY:
> > - return SBI_ERR_ALREADY_AVAILABLE;
> > - default:
> > - return SBI_ERR_FAILURE;
> > - };
> > -}
> > -
> > #ifndef CONFIG_RISCV_SBI_V01
> > static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> > .extid_start = -1UL,
> > @@ -125,11 +105,14 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > {
> > int ret = 1;
> > bool next_sepc = true;
> > - bool userspace_exit = false;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > const struct kvm_vcpu_sbi_extension *sbi_ext;
> > - struct kvm_cpu_trap utrap = { 0 };
> > - unsigned long out_val = 0;
> > + struct kvm_cpu_trap utrap = {0};
> > + struct kvm_vcpu_sbi_return sbi_ret = {
> > + .out_val = 0,
> > + .err_val = 0,
> > + .utrap = &utrap,
> > + };
> > bool ext_is_v01 = false;
> >
> > sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
> > @@ -139,42 +122,46 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
> > ext_is_v01 = true;
> > #endif
> > - ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
> > + ret = sbi_ext->handler(vcpu, run, &sbi_ret);
> > } else {
> > /* Return error for unsupported SBI calls */
> > cp->a0 = SBI_ERR_NOT_SUPPORTED;
> > goto ecall_done;
> > }
> >
> > + /*
> > + * When the SBI extension returns a Linux error code, it exits the ioctl
> > + * loop and forwards the error to userspace.
> > + */
> > + if (ret < 0) {
> > + next_sepc = false;
> > + goto ecall_done;
> > + }
> > +
> > /* Handle special error cases i.e trap, exit or userspace forward */
> > - if (utrap.scause) {
> > + if (sbi_ret.utrap->scause) {
> > /* No need to increment sepc or exit ioctl loop */
> > ret = 1;
> > - utrap.sepc = cp->sepc;
> > - kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
> > + sbi_ret.utrap->sepc = cp->sepc;
> > + kvm_riscv_vcpu_trap_redirect(vcpu, sbi_ret.utrap);
> > next_sepc = false;
> > goto ecall_done;
> > }
> >
> > /* Exit ioctl loop or Propagate the error code the guest */
> > - if (userspace_exit) {
> > + if (sbi_ret.uexit) {
> > next_sepc = false;
> > ret = 0;
> > } else {
> > - /**
> > - * SBI extension handler always returns an Linux error code. Convert
> > - * it to the SBI specific error code that can be propagated the SBI
> > - * caller.
> > - */
> > - ret = kvm_linux_err_map_sbi(ret);
> > - cp->a0 = ret;
> > + cp->a0 = sbi_ret.err_val;
> > ret = 1;
> > }
> > ecall_done:
> > if (next_sepc)
> > cp->sepc += 4;
> > - if (!ext_is_v01)
> > - cp->a1 = out_val;
> > + /* a1 should only be updated when we continue the ioctl loop */
> > + if (!ext_is_v01 && ret == 1)
> > + cp->a1 = sbi_ret.out_val;
> >
> > return ret;
> > }
> > diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> > index 846d518..69f4202 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_base.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> > @@ -14,24 +14,22 @@
> > #include <asm/kvm_vcpu_sbi.h>
> >
> > static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *trap, bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > - int ret = 0;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > const struct kvm_vcpu_sbi_extension *sbi_ext;
> >
> > switch (cp->a6) {
> > case SBI_EXT_BASE_GET_SPEC_VERSION:
> > - *out_val = (KVM_SBI_VERSION_MAJOR <<
> > + retdata->out_val = (KVM_SBI_VERSION_MAJOR <<
> > SBI_SPEC_VERSION_MAJOR_SHIFT) |
> > KVM_SBI_VERSION_MINOR;
> > break;
> > case SBI_EXT_BASE_GET_IMP_ID:
> > - *out_val = KVM_SBI_IMPID;
> > + retdata->out_val = KVM_SBI_IMPID;
> > break;
> > case SBI_EXT_BASE_GET_IMP_VERSION:
> > - *out_val = LINUX_VERSION_CODE;
> > + retdata->out_val = LINUX_VERSION_CODE;
> > break;
> > case SBI_EXT_BASE_PROBE_EXT:
> > if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
> > @@ -43,33 +41,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > * forward it to the userspace
> > */
> > kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > - *exit = true;
> > + retdata->uexit = true;
> > } else {
> > sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> > if (sbi_ext) {
> > if (sbi_ext->probe)
> > - *out_val = sbi_ext->probe(vcpu);
> > + retdata->out_val = sbi_ext->probe(vcpu);
> > else
> > - *out_val = 1;
> > + retdata->out_val = 1;
> > } else
> > - *out_val = 0;
> > + retdata->out_val = 0;
> > }
> > break;
> > case SBI_EXT_BASE_GET_MVENDORID:
> > - *out_val = vcpu->arch.mvendorid;
> > + retdata->out_val = vcpu->arch.mvendorid;
> > break;
> > case SBI_EXT_BASE_GET_MARCHID:
> > - *out_val = vcpu->arch.marchid;
> > + retdata->out_val = vcpu->arch.marchid;
> > break;
> > case SBI_EXT_BASE_GET_MIMPID:
> > - *out_val = vcpu->arch.mimpid;
> > + retdata->out_val = vcpu->arch.mimpid;
> > break;
> > default:
> > - ret = -EOPNOTSUPP;
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > break;
> > }
> >
> > - return ret;
> > + return 0;
> > }
> >
> > const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> > @@ -79,17 +77,15 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> > };
> >
> > static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
> > - struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap,
> > - bool *exit)
> > + struct kvm_run *run,
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > /*
> > * Both SBI experimental and vendor extensions are
> > * unconditionally forwarded to userspace.
> > */
> > kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > - *exit = true;
> > + retdata->uexit = true;
> > return 0;
> > }
> >
> > diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > index 619ac0f..7dca0e9 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> > @@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> >
> > target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> > if (!target_vcpu)
> > - return -EINVAL;
> > + return SBI_ERR_INVALID_PARAM;
> > if (!target_vcpu->arch.power_off)
> > - return -EALREADY;
> > + return SBI_ERR_ALREADY_AVAILABLE;
> >
> > reset_cntx = &target_vcpu->arch.guest_reset_context;
> > /* start address */
> > @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> > static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> > {
> > if (vcpu->arch.power_off)
> > - return -EACCES;
> > + return SBI_ERR_FAILURE;
> >
> > kvm_riscv_vcpu_power_off(vcpu);
> >
> > @@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> >
> > target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> > if (!target_vcpu)
> > - return -EINVAL;
> > + return SBI_ERR_INVALID_PARAM;
> > if (!target_vcpu->arch.power_off)
> > return SBI_HSM_STATE_STARTED;
> > else if (vcpu->stat.generic.blocking)
> > @@ -67,9 +67,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> > }
> >
> > static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap,
> > - bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > int ret = 0;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > @@ -88,27 +86,29 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > case SBI_EXT_HSM_HART_STATUS:
> > ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
> > if (ret >= 0) {
> > - *out_val = ret;
> > - ret = 0;
> > + retdata->out_val = ret;
> > + retdata->err_val = 0;
> > }
> > - break;
> > + return 0;
> > case SBI_EXT_HSM_HART_SUSPEND:
> > switch (cp->a0) {
> > case SBI_HSM_SUSPEND_RET_DEFAULT:
> > kvm_riscv_vcpu_wfi(vcpu);
> > break;
> > case SBI_HSM_SUSPEND_NON_RET_DEFAULT:
> > - ret = -EOPNOTSUPP;
> > + ret = SBI_ERR_NOT_SUPPORTED;
> > break;
> > default:
> > - ret = -EINVAL;
> > + ret = SBI_ERR_INVALID_PARAM;
> > }
> > break;
> > default:
> > - ret = -EOPNOTSUPP;
> > + ret = SBI_ERR_NOT_SUPPORTED;
> > }
> >
> > - return ret;
> > + retdata->err_val = ret;
> > +
> > + return 0;
> > }
> >
> > const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
> > diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> > index 03a0198..38fa4c0 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> > @@ -14,15 +14,15 @@
> > #include <asm/kvm_vcpu_sbi.h>
> >
> > static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap, bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > - int ret = 0;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > u64 next_cycle;
> >
> > - if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
> > - return -EINVAL;
> > + if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
> > + retdata->err_val = SBI_ERR_INVALID_PARAM;
> > + return 0;
> > + }
> >
> > #if __riscv_xlen == 32
> > next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> > @@ -31,7 +31,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > #endif
> > kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
> >
> > - return ret;
> > + return 0;
> > }
> >
> > const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> > @@ -41,8 +41,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> > };
> >
> > static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap, bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > int ret = 0;
> > unsigned long i;
> > @@ -51,8 +50,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > unsigned long hmask = cp->a0;
> > unsigned long hbase = cp->a1;
> >
> > - if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
> > - return -EINVAL;
> > + if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
> > + retdata->err_val = SBI_ERR_INVALID_PARAM;
> > + return 0;
> > + }
> >
> > kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > if (hbase != -1UL) {
> > @@ -76,10 +77,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
> > };
> >
> > static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap, bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > - int ret = 0;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > unsigned long hmask = cp->a0;
> > unsigned long hbase = cp->a1;
> > @@ -116,10 +115,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> > */
> > break;
> > default:
> > - ret = -EOPNOTSUPP;
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > }
> >
> > - return ret;
> > + return 0;
> > }
> >
> > const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> > @@ -130,14 +129,12 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> >
> > static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> > struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap, bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > unsigned long funcid = cp->a6;
> > u32 reason = cp->a1;
> > u32 type = cp->a0;
> > - int ret = 0;
> >
> > switch (funcid) {
> > case SBI_EXT_SRST_RESET:
> > @@ -146,24 +143,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> > kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> > KVM_SYSTEM_EVENT_SHUTDOWN,
> > reason);
> > - *exit = true;
> > + retdata->uexit = true;
> > break;
> > case SBI_SRST_RESET_TYPE_COLD_REBOOT:
> > case SBI_SRST_RESET_TYPE_WARM_REBOOT:
> > kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> > KVM_SYSTEM_EVENT_RESET,
> > reason);
> > - *exit = true;
> > + retdata->uexit = true;
> > break;
> > default:
> > - ret = -EOPNOTSUPP;
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > }
> > break;
> > default:
> > - ret = -EOPNOTSUPP;
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > }
> >
> > - return ret;
> > + return 0;
> > }
> >
> > const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
> > diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> > index 489f225..0269e08 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> > @@ -14,9 +14,7 @@
> > #include <asm/kvm_vcpu_sbi.h>
> >
> > static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > - unsigned long *out_val,
> > - struct kvm_cpu_trap *utrap,
> > - bool *exit)
> > + struct kvm_vcpu_sbi_return *retdata)
> > {
> > ulong hmask;
> > int i, ret = 0;
> > @@ -33,7 +31,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > * handled in kernel so we forward these to user-space
> > */
> > kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > - *exit = true;
> > + retdata->uexit = true;
> > break;
> > case SBI_EXT_0_1_SET_TIMER:
> > #if __riscv_xlen == 32
> > @@ -49,10 +47,10 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > case SBI_EXT_0_1_SEND_IPI:
> > if (cp->a0)
> > hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> > - utrap);
> > + retdata->utrap);
> > else
> > hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> > - if (utrap->scause)
> > + if (retdata->utrap->scause)
> > break;
> >
> > for_each_set_bit(i, &hmask, BITS_PER_LONG) {
> > @@ -65,17 +63,17 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > case SBI_EXT_0_1_SHUTDOWN:
> > kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> > KVM_SYSTEM_EVENT_SHUTDOWN, 0);
> > - *exit = true;
> > + retdata->uexit = true;
> > break;
> > case SBI_EXT_0_1_REMOTE_FENCE_I:
> > case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
> > case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
> > if (cp->a0)
> > hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> > - utrap);
> > + retdata->utrap);
> > else
> > hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> > - if (utrap->scause)
> > + if (retdata->utrap->scause)
> > break;
> >
> > if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> > @@ -103,7 +101,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > }
> > break;
> > default:
> > - ret = -EINVAL;
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > break;
> > }
> >
> > --
> > 2.25.1
> >
>
> Regards,
> Anup

2023-02-02 11:34:29

by Conor Dooley

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> This patch only adds barebone structure of perf implementation. Most of
> the function returns zero at this point and will be implemented
> fully in the future.
>
> Signed-off-by: Atish Patra <[email protected]>
> +/* Per virtual pmu counter data */
> +struct kvm_pmc {
> + u8 idx;
> + struct perf_event *perf_event;
> + uint64_t counter_val;

CI also complained that here, and elsewhere, you used uint64_t rather
than u64. Am I missing a reason for not using the regular types?

Thanks,
Conor.

> + union sbi_pmu_ctr_info cinfo;
> + /* Event monitoring status */
> + bool started;



2023-02-02 11:41:29

by Conor Dooley

Subject: Re: [PATCH v4 13/14] RISC-V: KVM: Support firmware events

Hey Atish,

On Wed, Feb 01, 2023 at 03:12:49PM -0800, Atish Patra wrote:
> SBI PMU extension defines a set of firmware events which can provide
> useful information to guests about the number of SBI calls. As
> hypervisor implements the SBI PMU extension, these firmware events
> correspond to ecall invocations between VS->HS mode. All other firmware
> events will always report zero if monitored as KVM doesn't implement them.
>
> This patch adds all the infrastructure required to support firmware
> events.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 473ad80..dd16e60 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -202,12 +202,15 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> struct kvm_pmc *pmc;
> u64 enabled, running;
> + int fevent_code;
>
> pmc = &kvpmu->pmc[cidx];
> - if (!pmc->perf_event)
> - return -EINVAL;
>
> - pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> + if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> + fevent_code = get_event_code(pmc->event_idx);
> + pmc->counter_val = kvpmu->fw_event[fevent_code].value;
> + } else if (pmc->perf_event)
> + pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);

Here, and elsewhere, all branches of an if/else must use {} if one
branch needs them.
Patches 4 & 12 have similar issues, which checkpatch in the patchwork
CI stuff also complained about.
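
For reference, a minimal standalone sketch of the brace rule from the kernel coding style (if one branch of a conditional needs braces, every branch gets them). The struct and function names are simplified, hypothetical stand-ins for the patch's code, and the userspace types are only there so the snippet builds on its own:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-counter state used in the patch. */
struct demo_pmc {
	bool is_fw;		/* stands in for cinfo.type == SBI_PMU_CTR_TYPE_FW */
	bool has_event;		/* stands in for pmc->perf_event != NULL */
	uint64_t fw_value;	/* stands in for kvpmu->fw_event[code].value */
	uint64_t delta;		/* stands in for perf_event_read_value() */
	uint64_t counter_val;
};

static uint64_t demo_ctr_read(struct demo_pmc *pmc)
{
	if (pmc->is_fw) {
		/* Two statements, so this arm needs braces... */
		uint64_t value = pmc->fw_value;

		pmc->counter_val = value;
	} else if (pmc->has_event) {
		/* ...and therefore the single-statement arm gets them too. */
		pmc->counter_val += pmc->delta;
	}

	return pmc->counter_val;
}
```

checkpatch should stop flagging the construct once every arm is braced like this.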

Thanks,
Conor.



2023-02-02 15:00:02

by Andrew Jones

Subject: Re: [PATCH v4 01/14] perf: RISC-V: Define helper functions expose hpm counter width and count

On Wed, Feb 01, 2023 at 03:12:37PM -0800, Atish Patra wrote:
> KVM module needs to know how many hardware counters and the counter
> width that the platform supports. Otherwise, it will not be able to show
> optimal value of virtual counters to the guest. The virtual hardware
> counters also need to have the same width as the logical hardware
> counters for simplicity. However, there shouldn't be mapping between
> virtual hardware counters and logical hardware counters. As we don't
> support hetergeneous harts or counters with different width as of now,
> the implementation relies on the counter width of the first available
> programmable counter.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> drivers/perf/riscv_pmu_sbi.c | 37 ++++++++++++++++++++++++++++++++--
> include/linux/perf/riscv_pmu.h | 3 +++
> 2 files changed, 38 insertions(+), 2 deletions(-)
>

Reviewed-by: Andrew Jones <[email protected]>

2023-02-02 15:02:01

by Andrew Jones

Subject: Re: [PATCH v4 03/14] RISC-V: Improve SBI PMU extension related definitions

On Wed, Feb 01, 2023 at 03:12:39PM -0800, Atish Patra wrote:
> This patch fixes/improve few minor things in SBI PMU extension
> definition.
>
> 1. Align all the firmware event names.
> 2. Add macros for bit positions in cache event ID & ops.
>
> The changes were small enough to combine them together instead
> of creating 1 liner patches.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/sbi.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 4ca7fba..945b7be 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -169,9 +169,9 @@ enum sbi_pmu_fw_generic_events_t {
> SBI_PMU_FW_ILLEGAL_INSN = 4,
> SBI_PMU_FW_SET_TIMER = 5,
> SBI_PMU_FW_IPI_SENT = 6,
> - SBI_PMU_FW_IPI_RECVD = 7,
> + SBI_PMU_FW_IPI_RCVD = 7,
> SBI_PMU_FW_FENCE_I_SENT = 8,
> - SBI_PMU_FW_FENCE_I_RECVD = 9,
> + SBI_PMU_FW_FENCE_I_RCVD = 9,
> SBI_PMU_FW_SFENCE_VMA_SENT = 10,
> SBI_PMU_FW_SFENCE_VMA_RCVD = 11,
> SBI_PMU_FW_SFENCE_VMA_ASID_SENT = 12,
> @@ -215,6 +215,9 @@ enum sbi_pmu_ctr_type {
> #define SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK 0x06
> #define SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK 0x01
>
> +#define SBI_PMU_EVENT_CACHE_ID_SHIFT 3
> +#define SBI_PMU_EVENT_CACHE_OP_SHIFT 1
> +
> #define SBI_PMU_EVENT_IDX_INVALID 0xFFFFFFFF
>
> /* Flags defined for config matching function */
> --
> 2.25.1
>

Reviewed-by: Andrew Jones <[email protected]>

2023-02-02 15:14:53

by Andrew Jones

Subject: Re: [PATCH v4 04/14] RISC-V: KVM: Define a probe function for SBI extension data structures

On Wed, Feb 01, 2023 at 03:12:40PM -0800, Atish Patra wrote:
> Currently the probe function just checks if an SBI extension is
> registered or not. However, the extension may not want to advertise
> itself depending on some other condition.
> An additional extension specific probe function will allow
> extensions to decide if they want to be advertised to the caller or
> not. Any extension that does not require additional dependency checks
> can avoid implementing this function.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/kvm_vcpu_sbi.h | 3 +++
> arch/riscv/kvm/vcpu_sbi_base.c | 13 +++++++++++--
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> index f79478a..45ba341 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> @@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
> int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> unsigned long *out_val, struct kvm_cpu_trap *utrap,
> bool *exit);
> +
> + /* Extension specific probe function */
> + unsigned long (*probe)(struct kvm_vcpu *vcpu);
> };
>
> void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
> diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> index 5d65c63..846d518 100644
> --- a/arch/riscv/kvm/vcpu_sbi_base.c
> +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> @@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> {
> int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> + const struct kvm_vcpu_sbi_extension *sbi_ext;
>
> switch (cp->a6) {
> case SBI_EXT_BASE_GET_SPEC_VERSION:
> @@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> *exit = true;
> - } else
> - *out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
> + } else {
> + sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> + if (sbi_ext) {
> + if (sbi_ext->probe)
> + *out_val = sbi_ext->probe(vcpu);
> + else
> + *out_val = 1;
> + } else
> + *out_val = 0;

Conor points out elsewhere that we need {} on both arms if one arm needs
it. We actually don't need {} on either arm, though, or even the if, if
we rewrite as

*out_val = sbi_ext && sbi_ext->probe ? sbi_ext->probe(vcpu) : !!sbi_ext;
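
To check that the one-liner matches the nested form in all three cases (no extension, extension without a probe, extension with a probe), here is a small standalone sketch. `struct demo_ext` and the helper names are hypothetical stand-ins, not the KVM definitions:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu_sbi_extension. */
struct demo_ext {
	unsigned long (*probe)(void *vcpu);
};

/* Nested form from the patch, with braces on both outer arms. */
static unsigned long probe_nested(const struct demo_ext *ext, void *vcpu)
{
	unsigned long out_val;

	if (ext) {
		if (ext->probe)
			out_val = ext->probe(vcpu);
		else
			out_val = 1;
	} else {
		out_val = 0;
	}

	return out_val;
}

/* Single-expression form; && binds tighter than ?: so no parentheses needed. */
static unsigned long probe_ternary(const struct demo_ext *ext, void *vcpu)
{
	return ext && ext->probe ? ext->probe(vcpu) : !!ext;
}

/* Example probe callback that reports the extension as unavailable. */
static unsigned long probe_unavailable(void *vcpu)
{
	(void)vcpu;
	return 0;
}
```

A probe that returns 0 is the interesting case, since it is the only one where the result differs from the plain "is it registered" answer.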

Thanks,
drew

> + }
> break;
> case SBI_EXT_BASE_GET_MVENDORID:
> *out_val = vcpu->arch.mvendorid;
> --
> 2.25.1
>

2023-02-02 15:16:51

by Andrew Jones

Subject: Re: [PATCH v4 04/14] RISC-V: KVM: Define a probe function for SBI extension data structures

On Thu, Feb 02, 2023 at 04:14:35PM +0100, Andrew Jones wrote:
> On Wed, Feb 01, 2023 at 03:12:40PM -0800, Atish Patra wrote:
> > Currently the probe function just checks if an SBI extension is
> > registered or not. However, the extension may not want to advertise
> > itself depending on some other condition.
> > An additional extension specific probe function will allow
> > extensions to decide if they want to be advertised to the caller or
> > not. Any extension that does not require additional dependency checks
> > can avoid implementing this function.
> >
> > Reviewed-by: Anup Patel <[email protected]>
> > Signed-off-by: Atish Patra <[email protected]>
> > ---
> > arch/riscv/include/asm/kvm_vcpu_sbi.h | 3 +++
> > arch/riscv/kvm/vcpu_sbi_base.c | 13 +++++++++++--
> > 2 files changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > index f79478a..45ba341 100644
> > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> > @@ -29,6 +29,9 @@ struct kvm_vcpu_sbi_extension {
> > int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > unsigned long *out_val, struct kvm_cpu_trap *utrap,
> > bool *exit);
> > +
> > + /* Extension specific probe function */
> > + unsigned long (*probe)(struct kvm_vcpu *vcpu);
> > };
> >
> > void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
> > diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> > index 5d65c63..846d518 100644
> > --- a/arch/riscv/kvm/vcpu_sbi_base.c
> > +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> > @@ -19,6 +19,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > {
> > int ret = 0;
> > struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > + const struct kvm_vcpu_sbi_extension *sbi_ext;
> >
> > switch (cp->a6) {
> > case SBI_EXT_BASE_GET_SPEC_VERSION:
> > @@ -43,8 +44,16 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > */
> > kvm_riscv_vcpu_sbi_forward(vcpu, run);
> > *exit = true;
> > - } else
> > - *out_val = kvm_vcpu_sbi_find_ext(cp->a0) ? 1 : 0;
> > + } else {
> > + sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> > + if (sbi_ext) {
> > + if (sbi_ext->probe)
> > + *out_val = sbi_ext->probe(vcpu);
> > + else
> > + *out_val = 1;
> > + } else
> > + *out_val = 0;
>
> Conor points out elsewhere that we need {} on both arms if one arm needs
> it. We actually don't need {} on either arm, though, or even the if, if
> we rewrite as
>
> *out_val = sbi_ext && sbi_ext->probe ? sbi_ext->probe(vcpu) : !!sbi_ext;

I sent that too soon. I meant to add

In any case,

Reviewed-by: Andrew Jones <[email protected]>

Thanks,
drew


>
> Thanks,
> drew
>
> > + }
> > break;
> > case SBI_EXT_BASE_GET_MVENDORID:
> > *out_val = vcpu->arch.mvendorid;
> > --
> > 2.25.1
> >

2023-02-02 15:27:05

by Andrew Jones

Subject: Re: [PATCH v4 05/14] RISC-V: KVM: Return correct code for hsm stop function

On Wed, Feb 01, 2023 at 03:12:41PM -0800, Atish Patra wrote:
> According to the SBI specification, the stop function can only
> return error code SBI_ERR_FAILED. However, currently it returns
> -EINVAL which will be mapped SBI_ERR_INVALID_PARAM.
>
> Return an linux error code that maps to SBI_ERR_FAILED i.e doesn't map
> to any other SBI error code. While EACCES is not the best error code
> to describe the situation, it is close enough and will be replaced
> with SBI error codes directly anyways.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> index 2e915ca..619ac0f 100644
> --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> {
> if (vcpu->arch.power_off)
> - return -EINVAL;
> + return -EACCES;
>
> kvm_riscv_vcpu_power_off(vcpu);
>
> --
> 2.25.1
>

Reviewed-by: Andrew Jones <[email protected]>

2023-02-02 16:06:06

by Andrew Jones

Subject: Re: [PATCH v4 06/14] RISC-V: KVM: Modify SBI extension handler to return SBI error code

On Wed, Feb 01, 2023 at 03:12:42PM -0800, Atish Patra wrote:
> Currently, the SBI extension handle is expected to return Linux error code.
> The top SBI layer converts the Linux error code to SBI specific error code
> that can be returned to guest invoking the SBI calls. This model works
> as long as SBI error codes have 1-to-1 mappings between them.
> However, that may not be true always. This patch attempts to disassociate
> both these error codes by allowing the SBI extension implementation to
> return SBI specific error codes as well.
>
> The extension will continue to return the Linux error specific code which
> will indicate any problem *with* the extension emulation while the
> SBI specific error will indicate the problem *of* the emulation.
>
> Suggested-by: Andrew Jones <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/kvm_vcpu_sbi.h | 10 ++++-
> arch/riscv/kvm/vcpu_sbi.c | 61 +++++++++++----------------
> arch/riscv/kvm/vcpu_sbi_base.c | 36 +++++++---------
> arch/riscv/kvm/vcpu_sbi_hsm.c | 28 ++++++------
> arch/riscv/kvm/vcpu_sbi_replace.c | 43 +++++++++----------
> arch/riscv/kvm/vcpu_sbi_v01.c | 18 ++++----
> 6 files changed, 90 insertions(+), 106 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> index 45ba341..8425556 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
> @@ -18,6 +18,13 @@ struct kvm_vcpu_sbi_context {
> int return_handled;
> };
>
> +struct kvm_vcpu_sbi_return {
> + unsigned long out_val;
> + unsigned long err_val;
> + struct kvm_cpu_trap *utrap;
> + bool uexit;
> +};
> +
> struct kvm_vcpu_sbi_extension {
> unsigned long extid_start;
> unsigned long extid_end;
> @@ -27,8 +34,7 @@ struct kvm_vcpu_sbi_extension {
> * specific error codes.
> */
> int (*handler)(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val, struct kvm_cpu_trap *utrap,
> - bool *exit);
> + struct kvm_vcpu_sbi_return *retdata);
>
> /* Extension specific probe function */
> unsigned long (*probe)(struct kvm_vcpu *vcpu);
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index f96991d..fe2897e 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -12,26 +12,6 @@
> #include <asm/sbi.h>
> #include <asm/kvm_vcpu_sbi.h>
>
> -static int kvm_linux_err_map_sbi(int err)
> -{
> - switch (err) {
> - case 0:
> - return SBI_SUCCESS;
> - case -EPERM:
> - return SBI_ERR_DENIED;
> - case -EINVAL:
> - return SBI_ERR_INVALID_PARAM;
> - case -EFAULT:
> - return SBI_ERR_INVALID_ADDRESS;
> - case -EOPNOTSUPP:
> - return SBI_ERR_NOT_SUPPORTED;
> - case -EALREADY:
> - return SBI_ERR_ALREADY_AVAILABLE;
> - default:
> - return SBI_ERR_FAILURE;
> - };
> -}
> -
> #ifndef CONFIG_RISCV_SBI_V01
> static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> .extid_start = -1UL,
> @@ -125,11 +105,14 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> int ret = 1;
> bool next_sepc = true;
> - bool userspace_exit = false;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> const struct kvm_vcpu_sbi_extension *sbi_ext;
> - struct kvm_cpu_trap utrap = { 0 };
> - unsigned long out_val = 0;
> + struct kvm_cpu_trap utrap = {0};
> + struct kvm_vcpu_sbi_return sbi_ret = {
> + .out_val = 0,
> + .err_val = 0,
> + .utrap = &utrap,
> + };
> bool ext_is_v01 = false;
>
> sbi_ext = kvm_vcpu_sbi_find_ext(cp->a7);
> @@ -139,42 +122,46 @@ int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run)
> cp->a7 <= SBI_EXT_0_1_SHUTDOWN)
> ext_is_v01 = true;
> #endif
> - ret = sbi_ext->handler(vcpu, run, &out_val, &utrap, &userspace_exit);
> + ret = sbi_ext->handler(vcpu, run, &sbi_ret);
> } else {
> /* Return error for unsupported SBI calls */
> cp->a0 = SBI_ERR_NOT_SUPPORTED;
> goto ecall_done;
> }
>
> + /*
> + * When the SBI extension returns a Linux error code, it exits the ioctl
> + * loop and forwards the error to userspace.
> + */
> + if (ret < 0) {
> + next_sepc = false;
> + goto ecall_done;
> + }
> +
> /* Handle special error cases i.e trap, exit or userspace forward */
> - if (utrap.scause) {
> + if (sbi_ret.utrap->scause) {
> /* No need to increment sepc or exit ioctl loop */
> ret = 1;
> - utrap.sepc = cp->sepc;
> - kvm_riscv_vcpu_trap_redirect(vcpu, &utrap);
> + sbi_ret.utrap->sepc = cp->sepc;
> + kvm_riscv_vcpu_trap_redirect(vcpu, sbi_ret.utrap);
> next_sepc = false;
> goto ecall_done;
> }
>
> /* Exit ioctl loop or Propagate the error code the guest */
> - if (userspace_exit) {
> + if (sbi_ret.uexit) {
> next_sepc = false;
> ret = 0;
> } else {
> - /**
> - * SBI extension handler always returns an Linux error code. Convert
> - * it to the SBI specific error code that can be propagated the SBI
> - * caller.
> - */
> - ret = kvm_linux_err_map_sbi(ret);
> - cp->a0 = ret;
> + cp->a0 = sbi_ret.err_val;
> ret = 1;
> }
> ecall_done:
> if (next_sepc)
> cp->sepc += 4;
> - if (!ext_is_v01)
> - cp->a1 = out_val;
> + /* a1 should only be updated when we continue the ioctl loop */
> + if (!ext_is_v01 && ret == 1)
> + cp->a1 = sbi_ret.out_val;
>
> return ret;
> }
> diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c
> index 846d518..69f4202 100644
> --- a/arch/riscv/kvm/vcpu_sbi_base.c
> +++ b/arch/riscv/kvm/vcpu_sbi_base.c
> @@ -14,24 +14,22 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *trap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> const struct kvm_vcpu_sbi_extension *sbi_ext;
>
> switch (cp->a6) {
> case SBI_EXT_BASE_GET_SPEC_VERSION:
> - *out_val = (KVM_SBI_VERSION_MAJOR <<
> + retdata->out_val = (KVM_SBI_VERSION_MAJOR <<
> SBI_SPEC_VERSION_MAJOR_SHIFT) |
> KVM_SBI_VERSION_MINOR;
> break;
> case SBI_EXT_BASE_GET_IMP_ID:
> - *out_val = KVM_SBI_IMPID;
> + retdata->out_val = KVM_SBI_IMPID;
> break;
> case SBI_EXT_BASE_GET_IMP_VERSION:
> - *out_val = LINUX_VERSION_CODE;
> + retdata->out_val = LINUX_VERSION_CODE;
> break;
> case SBI_EXT_BASE_PROBE_EXT:
> if ((cp->a0 >= SBI_EXT_EXPERIMENTAL_START &&
> @@ -43,33 +41,33 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> * forward it to the userspace
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> } else {
> sbi_ext = kvm_vcpu_sbi_find_ext(cp->a0);
> if (sbi_ext) {
> if (sbi_ext->probe)
> - *out_val = sbi_ext->probe(vcpu);
> + retdata->out_val = sbi_ext->probe(vcpu);
> else
> - *out_val = 1;
> + retdata->out_val = 1;
> } else
> - *out_val = 0;
> + retdata->out_val = 0;
> }
> break;
> case SBI_EXT_BASE_GET_MVENDORID:
> - *out_val = vcpu->arch.mvendorid;
> + retdata->out_val = vcpu->arch.mvendorid;
> break;
> case SBI_EXT_BASE_GET_MARCHID:
> - *out_val = vcpu->arch.marchid;
> + retdata->out_val = vcpu->arch.marchid;
> break;
> case SBI_EXT_BASE_GET_MIMPID:
> - *out_val = vcpu->arch.mimpid;
> + retdata->out_val = vcpu->arch.mimpid;
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> break;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> @@ -79,17 +77,15 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_base = {
> };
>
> static int kvm_sbi_ext_forward_handler(struct kvm_vcpu *vcpu,
> - struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_run *run,
> + struct kvm_vcpu_sbi_return *retdata)
> {
> /*
> * Both SBI experimental and vendor extensions are
> * unconditionally forwarded to userspace.
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> return 0;
> }
>
> diff --git a/arch/riscv/kvm/vcpu_sbi_hsm.c b/arch/riscv/kvm/vcpu_sbi_hsm.c
> index 619ac0f..7dca0e9 100644
> --- a/arch/riscv/kvm/vcpu_sbi_hsm.c
> +++ b/arch/riscv/kvm/vcpu_sbi_hsm.c
> @@ -21,9 +21,9 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
>
> target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> if (!target_vcpu)
> - return -EINVAL;
> + return SBI_ERR_INVALID_PARAM;
> if (!target_vcpu->arch.power_off)
> - return -EALREADY;
> + return SBI_ERR_ALREADY_AVAILABLE;
>
> reset_cntx = &target_vcpu->arch.guest_reset_context;
> /* start address */
> @@ -42,7 +42,7 @@ static int kvm_sbi_hsm_vcpu_start(struct kvm_vcpu *vcpu)
> static int kvm_sbi_hsm_vcpu_stop(struct kvm_vcpu *vcpu)
> {
> if (vcpu->arch.power_off)
> - return -EACCES;
> + return SBI_ERR_FAILURE;
>
> kvm_riscv_vcpu_power_off(vcpu);
>
> @@ -57,7 +57,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
>
> target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, target_vcpuid);
> if (!target_vcpu)
> - return -EINVAL;
> + return SBI_ERR_INVALID_PARAM;
> if (!target_vcpu->arch.power_off)
> return SBI_HSM_STATE_STARTED;
> else if (vcpu->stat.generic.blocking)
> @@ -67,9 +67,7 @@ static int kvm_sbi_hsm_vcpu_get_status(struct kvm_vcpu *vcpu)
> }
>
> static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> @@ -88,27 +86,29 @@ static int kvm_sbi_ext_hsm_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_HSM_HART_STATUS:
> ret = kvm_sbi_hsm_vcpu_get_status(vcpu);
> if (ret >= 0) {
> - *out_val = ret;
> - ret = 0;
> + retdata->out_val = ret;
> + retdata->err_val = 0;
> }
> - break;
> + return 0;
> case SBI_EXT_HSM_HART_SUSPEND:
> switch (cp->a0) {
> case SBI_HSM_SUSPEND_RET_DEFAULT:
> kvm_riscv_vcpu_wfi(vcpu);
> break;
> case SBI_HSM_SUSPEND_NON_RET_DEFAULT:
> - ret = -EOPNOTSUPP;
> + ret = SBI_ERR_NOT_SUPPORTED;
> break;
> default:
> - ret = -EINVAL;
> + ret = SBI_ERR_INVALID_PARAM;
> }
> break;
> default:
> - ret = -EOPNOTSUPP;
> + ret = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + retdata->err_val = ret;
> +
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm = {
> diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
> index 03a0198..38fa4c0 100644
> --- a/arch/riscv/kvm/vcpu_sbi_replace.c
> +++ b/arch/riscv/kvm/vcpu_sbi_replace.c
> @@ -14,15 +14,15 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> u64 next_cycle;
>
> - if (cp->a6 != SBI_EXT_TIME_SET_TIMER)
> - return -EINVAL;
> + if (cp->a6 != SBI_EXT_TIME_SET_TIMER) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
>
> #if __riscv_xlen == 32
> next_cycle = ((u64)cp->a1 << 32) | (u64)cp->a0;
> @@ -31,7 +31,7 @@ static int kvm_sbi_ext_time_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> #endif
> kvm_riscv_vcpu_timer_next_event(vcpu, next_cycle);
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> @@ -41,8 +41,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_time = {
> };
>
> static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> int ret = 0;
> unsigned long i;
> @@ -51,8 +50,10 @@ static int kvm_sbi_ext_ipi_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> unsigned long hmask = cp->a0;
> unsigned long hbase = cp->a1;
>
> - if (cp->a6 != SBI_EXT_IPI_SEND_IPI)
> - return -EINVAL;
> + if (cp->a6 != SBI_EXT_IPI_SEND_IPI) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
>
> kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> if (hbase != -1UL) {
> @@ -76,10 +77,8 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_ipi = {
> };
>
> static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> - int ret = 0;
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> unsigned long hmask = cp->a0;
> unsigned long hbase = cp->a1;
> @@ -116,10 +115,10 @@ static int kvm_sbi_ext_rfence_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
> */
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
> @@ -130,14 +129,12 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_rfence = {
>
> static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap, bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {
> struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> unsigned long funcid = cp->a6;
> u32 reason = cp->a1;
> u32 type = cp->a0;
> - int ret = 0;
>
> switch (funcid) {
> case SBI_EXT_SRST_RESET:
> @@ -146,24 +143,24 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_SHUTDOWN,
> reason);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_SRST_RESET_TYPE_COLD_REBOOT:
> case SBI_SRST_RESET_TYPE_WARM_REBOOT:
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_RESET,
> reason);
> - *exit = true;
> + retdata->uexit = true;
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
> break;
> default:
> - ret = -EOPNOTSUPP;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> }
>
> - return ret;
> + return 0;
> }
>
> const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst = {
> diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
> index 489f225..0269e08 100644
> --- a/arch/riscv/kvm/vcpu_sbi_v01.c
> +++ b/arch/riscv/kvm/vcpu_sbi_v01.c
> @@ -14,9 +14,7 @@
> #include <asm/kvm_vcpu_sbi.h>
>
> static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> - unsigned long *out_val,
> - struct kvm_cpu_trap *utrap,
> - bool *exit)
> + struct kvm_vcpu_sbi_return *retdata)
> {

ubernit: Could do

struct kvm_cpu_trap *utrap = retdata->utrap;

here and then half the changes below would go away, and we
wouldn't have so many ->'s

Looking back up, I see the same ubernit could apply to
kvm_sbi_ext_base_handler() as well, but with

unsigned long *out_val = &retdata->out_val;
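
Roughly what I have in mind, as a standalone sketch: the structs are trimmed-down, hypothetical stand-ins for the KVM ones and `demo_unpriv_read()` is invented for the example. The point is only that the handler body keeps using `utrap` and `*out_val` exactly as it did before the signature change:

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins for the KVM structures. */
struct demo_trap {
	unsigned long scause;
};

struct demo_sbi_return {
	unsigned long out_val;
	unsigned long err_val;
	struct demo_trap *utrap;
	bool uexit;
};

/* Pretend unprivileged read: a zero address "faults" and sets scause. */
static unsigned long demo_unpriv_read(unsigned long addr, struct demo_trap *trap)
{
	if (!addr) {
		trap->scause = 13;	/* load page fault, chosen for the demo */
		return 0;
	}

	return addr & 0xff;
}

static int demo_handler(unsigned long a0, struct demo_sbi_return *retdata)
{
	/* Local aliases: the rest of the body needs no retdata-> prefixes. */
	struct demo_trap *utrap = retdata->utrap;
	unsigned long *out_val = &retdata->out_val;
	unsigned long hmask;

	hmask = demo_unpriv_read(a0, utrap);
	if (utrap->scause)
		return 0;

	*out_val = hmask;
	return 0;
}
```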

> ulong hmask;
> int i, ret = 0;
> @@ -33,7 +31,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> * handled in kernel so we forward these to user-space
> */
> kvm_riscv_vcpu_sbi_forward(vcpu, run);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_EXT_0_1_SET_TIMER:
> #if __riscv_xlen == 32
> @@ -49,10 +47,10 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_0_1_SEND_IPI:
> if (cp->a0)
> hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> - utrap);
> + retdata->utrap);
> else
> hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> - if (utrap->scause)
> + if (retdata->utrap->scause)
> break;
>
> for_each_set_bit(i, &hmask, BITS_PER_LONG) {
> @@ -65,17 +63,17 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> case SBI_EXT_0_1_SHUTDOWN:
> kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
> KVM_SYSTEM_EVENT_SHUTDOWN, 0);
> - *exit = true;
> + retdata->uexit = true;
> break;
> case SBI_EXT_0_1_REMOTE_FENCE_I:
> case SBI_EXT_0_1_REMOTE_SFENCE_VMA:
> case SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID:
> if (cp->a0)
> hmask = kvm_riscv_vcpu_unpriv_read(vcpu, false, cp->a0,
> - utrap);
> + retdata->utrap);
> else
> hmask = (1UL << atomic_read(&kvm->online_vcpus)) - 1;
> - if (utrap->scause)
> + if (retdata->utrap->scause)
> break;
>
> if (cp->a7 == SBI_EXT_0_1_REMOTE_FENCE_I)
> @@ -103,7 +101,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> }
> break;
> default:
> - ret = -EINVAL;
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> break;
> }
>
> --
> 2.25.1
>

Other than ubernits,

Reviewed-by: Andrew Jones <[email protected]>

Thanks,
drew

2023-02-02 17:03:54

by Andrew Jones

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> This patch only adds barebone structure of perf implementation. Most of
> the function returns zero at this point and will be implemented
> fully in the future.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/kvm_host.h | 4 +
> arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> arch/riscv/kvm/Makefile | 1 +
> arch/riscv/kvm/vcpu.c | 7 ++
> arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> 5 files changed, 226 insertions(+)
> create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> create mode 100644 arch/riscv/kvm/vcpu_pmu.c
>
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 93f43a3..b90be9a 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -18,6 +18,7 @@
> #include <asm/kvm_vcpu_insn.h>
> #include <asm/kvm_vcpu_sbi.h>
> #include <asm/kvm_vcpu_timer.h>
> +#include <asm/kvm_vcpu_pmu.h>
>
> #define KVM_MAX_VCPUS 1024
>
> @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
>
> /* Don't run the VCPU (blocked) */
> bool pause;
> +
> + /* Performance monitoring context */
> + struct kvm_pmu pmu_context;
> };
>
> static inline void kvm_arch_hardware_unsetup(void) {}
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> new file mode 100644
> index 0000000..e2b4038
> --- /dev/null
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#ifndef __KVM_VCPU_RISCV_PMU_H
> +#define __KVM_VCPU_RISCV_PMU_H
> +
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/sbi.h>
> +
> +#ifdef CONFIG_RISCV_PMU_SBI
> +#define RISCV_KVM_MAX_FW_CTRS 32
> +
> +#if RISCV_KVM_MAX_FW_CTRS > 32
> +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"

"The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"

> +#endif
> +
> +#define RISCV_MAX_COUNTERS 64

But instead of that message, what I think we need is something like

#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_FW_CTRS 32
#define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)

static_assert(RISCV_MAX_COUNTERS <= 64);

And then pmu_sbi_device_probe() should ensure

num_counters <= RISCV_MAX_COUNTERS

and pmu_sbi_get_ctrinfo() should ensure

num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS

which has to be done at runtime.
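
A minimal standalone sketch of the split described above, assuming the two limits are defined as suggested. The helper name check_counter_limits() is hypothetical and only stands in for the runtime checks that would live in pmu_sbi_device_probe() and pmu_sbi_get_ctrinfo(); it is not taken from the driver.

```c
#include <assert.h>

/* Limits as proposed in the review comment above. */
#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_FW_CTRS 32
#define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)

/* Compile-time part: the sum of both limits must fit a 64-bit counter mask. */
static_assert(RISCV_MAX_COUNTERS <= 64, "RISCV_MAX_COUNTERS must not exceed 64");

/*
 * Runtime part (hypothetical helper): the counts reported by the platform
 * are only known at probe time, so they cannot be checked by the compiler.
 * Returns 0 when both counts are within their limits, -1 otherwise.
 */
static int check_counter_limits(int num_hw_ctr, int num_fw_ctr)
{
	if (num_hw_ctr < 0 || num_fw_ctr < 0)
		return -1;
	if (num_hw_ctr > RISCV_KVM_MAX_HW_CTRS)
		return -1;
	if (num_fw_ctr > RISCV_KVM_MAX_FW_CTRS)
		return -1;
	/* Implied by the two checks above, kept to mirror the probe-time check. */
	if (num_hw_ctr + num_fw_ctr > RISCV_MAX_COUNTERS)
		return -1;
	return 0;
}
```

With both per-type limits enforced, the total can never exceed RISCV_MAX_COUNTERS, so the last check is redundant here; it is shown only because the suggestion places it in a different function.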

> +
> +/* Per virtual pmu counter data */
> +struct kvm_pmc {
> + u8 idx;
> + struct perf_event *perf_event;
> + uint64_t counter_val;
> + union sbi_pmu_ctr_info cinfo;
> + /* Event monitoring status */
> + bool started;
> +};
> +
> +/* PMU data structure per vcpu */
> +struct kvm_pmu {
> + struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> + /* Number of the virtual firmware counters available */
> + int num_fw_ctrs;
> + /* Number of the virtual hardware counters available */
> + int num_hw_ctrs;
> + /* A flag to indicate that pmu initialization is done */
> + bool init_done;
> + /* Bit map of all the virtual counter used */
> + DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> +};
> +
> +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
> +#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + struct kvm_vcpu_sbi_return *retdata);
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + unsigned long eidx, uint64_t evtdata,
> + struct kvm_vcpu_sbi_return *retdata);

s/flag/flags/ for all the above prototypes and all the implementations
below.

> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata);
> +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> +
> +#else
> +struct kvm_pmu {
> +};
> +
> +static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> +#endif /* CONFIG_RISCV_PMU_SBI */
> +#endif /* !__KVM_VCPU_RISCV_PMU_H */
> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 019df920..5de1053 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
> kvm-y += vcpu_sbi_replace.o
> kvm-y += vcpu_sbi_hsm.o
> kvm-y += vcpu_timer.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 7c08567..7d010b0 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -138,6 +138,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
> WRITE_ONCE(vcpu->arch.irqs_pending, 0);
> WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
>
> + kvm_riscv_vcpu_pmu_reset(vcpu);
> +
> vcpu->arch.hfence_head = 0;
> vcpu->arch.hfence_tail = 0;
> memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
> @@ -194,6 +196,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> /* Setup VCPU timer */
> kvm_riscv_vcpu_timer_init(vcpu);
>
> + /* setup performance monitoring */
> + kvm_riscv_vcpu_pmu_init(vcpu);
> +
> /* Reset VCPU */
> kvm_riscv_reset_vcpu(vcpu);
>
> @@ -216,6 +221,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> /* Cleanup VCPU timer */
> kvm_riscv_vcpu_timer_deinit(vcpu);
>
> + kvm_riscv_vcpu_pmu_deinit(vcpu);
> +
> /* Free unused pages pre-allocated for G-stage page table mappings */
> kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> }
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> new file mode 100644
> index 0000000..2dad37f
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <linux/perf/riscv_pmu.h>
> +#include <asm/csr.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +#include <asm/kvm_vcpu_pmu.h>
> +#include <linux/kvm_host.h>
> +
> +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> +
> +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + retdata->out_val = kvm_pmu_num_counters(kvpmu);
> +
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + if (cidx > RISCV_MAX_COUNTERS || cidx == 1) {
> + retdata->err_val = SBI_ERR_INVALID_PARAM;
> + return 0;
> + }
> +
> + retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
> +
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> + unsigned long ctr_mask, unsigned long flag,
> + unsigned long eidx, uint64_t evtdata,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> +{
> + int i = 0, ret, num_hw_ctrs = 0, hpm_width = 0;
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + struct kvm_pmc *pmc;
> +
> + ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> + if (ret < 0 || !hpm_width || !num_hw_ctrs)
> + return;
> +
> + /*
> + * It is guranteed that RISCV_KVM_MAX_FW_CTRS can't exceed 32 as
> + * that may exceed total number of counters more than RISCV_MAX_COUNTERS
> + */
> + kvpmu->num_hw_ctrs = num_hw_ctrs;
> + kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;

If we sanity check that num_hw_ctrs <= 32 and num_fw_ctrs <= 32 at sbi_pmu
probe time, then we can also return num_fw_ctrs (or num_ctrs) along with
num_hw_ctrs from riscv_pmu_get_hpm_info(). Then, we can put the exact
number here into kvpmu->num_fw_ctrs, rather than using its max.
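
One possible shape for that, as a standalone sketch only. The struct mock_pmu_probe and the function mock_get_hpm_info() are hypothetical stand-ins; the real riscv_pmu_get_hpm_info() in this series takes only the width and the hardware counter count, and the extra out-parameter is an assumption about how it could be extended.

```c
#include <stddef.h>

#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_FW_CTRS 32

/* Hypothetical probe-time state; the real driver would fill this from SBI. */
struct mock_pmu_probe {
	int hpm_width;
	int num_hw_ctrs;
	int num_fw_ctrs;
};

/*
 * Hypothetical variant of riscv_pmu_get_hpm_info() that also reports the
 * firmware counter count. Returns 0 on success, -1 on a NULL argument or
 * when a count exceeds its limit (the probe-time sanity check).
 */
static int mock_get_hpm_info(const struct mock_pmu_probe *probe,
			     int *hpm_width, int *num_hw_ctrs, int *num_fw_ctrs)
{
	if (!probe || !hpm_width || !num_hw_ctrs || !num_fw_ctrs)
		return -1;
	if (probe->num_hw_ctrs > RISCV_KVM_MAX_HW_CTRS ||
	    probe->num_fw_ctrs > RISCV_KVM_MAX_FW_CTRS)
		return -1;

	*hpm_width = probe->hpm_width;
	*num_hw_ctrs = probe->num_hw_ctrs;
	*num_fw_ctrs = probe->num_fw_ctrs;
	return 0;
}
```

The caller could then store the reported firmware counter count directly instead of RISCV_KVM_MAX_FW_CTRS.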

> +
> + /*
> + * There is no correlation between the logical hardware counter and virtual counters.
> + * However, we need to encode a hpmcounter CSR in the counter info field so that
> + * KVM can trap n emulate the read. This works well in the migration use case as
> + * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> + */
> + for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> + /* TIME CSR shouldn't be read from perf interface */
> + if (i == 1)
> + continue;
> + pmc = &kvpmu->pmc[i];
> + pmc->idx = i;
> + if (i < kvpmu->num_hw_ctrs) {
> + pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
> + if (i < 3)
> + /* CY, IR counters */
> + pmc->cinfo.width = 63;
> + else
> + pmc->cinfo.width = hpm_width;
> + /*
> + * The CSR number doesn't have any relation with the logical
> + * hardware counters. The CSR numbers are encoded sequentially
> + * to avoid maintaining a map between the virtual counter
> + * and CSR number.
> + */
> + pmc->cinfo.csr = CSR_CYCLE + i;
> + } else {
> + pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> + pmc->cinfo.width = BITS_PER_LONG - 1;
> + }
> + }
> +
> + kvpmu->init_done = true;
> +}
> +
> +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> +{
> + /* TODO */
> +}
> +
> +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> +{
> + kvm_riscv_vcpu_pmu_deinit(vcpu);
> +}
> --
> 2.25.1
>

Thanks,
drew

2023-02-02 17:29:35

by Andrew Jones

Subject: Re: [PATCH v4 08/14] RISC-V: KVM: Add SBI PMU extension support

On Wed, Feb 01, 2023 at 03:12:44PM -0800, Atish Patra wrote:
> SBI PMU extension allows KVM guests to configure/start/stop/query about
> the PMU counters in virtualized enviornment as well.
>
> In order to allow that, KVM implements the entire SBI PMU extension.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kvm/Makefile | 2 +-
> arch/riscv/kvm/vcpu_sbi.c | 11 +++++
> arch/riscv/kvm/vcpu_sbi_pmu.c | 85 +++++++++++++++++++++++++++++++++++
> 3 files changed, 97 insertions(+), 1 deletion(-)
> create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c
>
> diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> index 5de1053..278e97c 100644
> --- a/arch/riscv/kvm/Makefile
> +++ b/arch/riscv/kvm/Makefile
> @@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
> kvm-y += vcpu_sbi_replace.o
> kvm-y += vcpu_sbi_hsm.o
> kvm-y += vcpu_timer.o
> -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> index fe2897e..15fde15 100644
> --- a/arch/riscv/kvm/vcpu_sbi.c
> +++ b/arch/riscv/kvm/vcpu_sbi.c
> @@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> };
> #endif
>
> +#ifdef CONFIG_RISCV_PMU_SBI
> +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
> +#else
> +static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> + .extid_start = -1UL,
> + .extid_end = -1UL,
> + .handler = NULL,
> +};
> +#endif
> +
> static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> &vcpu_sbi_ext_v01,
> &vcpu_sbi_ext_base,
> @@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> &vcpu_sbi_ext_rfence,
> &vcpu_sbi_ext_srst,
> &vcpu_sbi_ext_hsm,
> + &vcpu_sbi_ext_pmu,
> &vcpu_sbi_ext_experimental,
> &vcpu_sbi_ext_vendor,
> };
> diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
> new file mode 100644
> index 0000000..e028b0a
> --- /dev/null
> +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2023 Rivos Inc
> + *
> + * Authors:
> + * Atish Patra <[email protected]>
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/err.h>
> +#include <linux/kvm_host.h>
> +#include <asm/csr.h>
> +#include <asm/sbi.h>
> +#include <asm/kvm_vcpu_sbi.h>
> +
> +static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> + struct kvm_vcpu_sbi_return *retdata)
> +{
> + int ret = 0;
> + struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + unsigned long funcid = cp->a6;
> + uint64_t temp;
> +
> + /* Return not supported if PMU is not initialized */
> + if (!kvpmu->init_done)
> + return -EINVAL;

Shouldn't this be the following?

if (!kvpmu->init_done) {
	retdata->err_val = SBI_ERR_NOT_SUPPORTED;
	return 0;
}
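
To make the distinction concrete, here is a standalone sketch assuming the usual convention that the handler's return value is a host-side errno while the SBI error travels back to the guest in retdata. The struct and function names are simplified, hypothetical stand-ins for the kernel ones; SBI_ERR_NOT_SUPPORTED is -2 per the SBI specification.

```c
#include <stdbool.h>

#define SBI_SUCCESS 0
#define SBI_ERR_NOT_SUPPORTED (-2)

/* Simplified stand-ins for struct kvm_vcpu_sbi_return and struct kvm_pmu. */
struct mock_sbi_return {
	long err_val;		/* SBI error reported to the guest */
	unsigned long out_val;	/* SBI return value reported to the guest */
};

struct mock_pmu {
	bool init_done;
};

/*
 * Hypothetical handler skeleton. The return value is for the host: 0 means
 * the call was handled and the guest resumes. An uninitialized PMU is not a
 * host failure, so it is reported to the guest through err_val.
 */
static int mock_pmu_handler(const struct mock_pmu *pmu,
			    struct mock_sbi_return *retdata)
{
	if (!pmu->init_done) {
		retdata->err_val = SBI_ERR_NOT_SUPPORTED;
		return 0;
	}

	retdata->err_val = SBI_SUCCESS;
	return 0;
}
```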

> +
> + switch (funcid) {
> + case SBI_EXT_PMU_NUM_COUNTERS:
> + ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_GET_INFO:
> + ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_CFG_MATCH:
> +#if defined(CONFIG_32BIT)
> + temp = ((uint64_t)cp->a5 << 32) | cp->a4;
> +#else
> + temp = cp->a4;
> +#endif
> + /*
> + * This can fail if perf core framework fails to create an event.
> + * Forward the error to the user space because its an error happened

"Forward the error to userspace because it's an error which happened within
the host kernel."

> + * within host kernel. The other option would be convert this to
s/would be convert/would be to convert/
> + * an SBI error and forward to the guest.
> + */
> + ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
> + cp->a2, cp->a3, temp, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_START:
> +#if defined(CONFIG_32BIT)
> + temp = ((uint64_t)cp->a4 << 32) | cp->a3;
> +#else
> + temp = cp->a3;
> +#endif
> + ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
> + temp, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_STOP:
> + ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, retdata);
> + break;
> + case SBI_EXT_PMU_COUNTER_FW_READ:
> + ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, retdata);
> + break;
> + default:
> + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> + }
> +
> + return ret;
> +}
> +
> +unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> +
> + return kvpmu->init_done;
> +}
> +
> +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> + .extid_start = SBI_EXT_PMU,
> + .extid_end = SBI_EXT_PMU,
> + .handler = kvm_sbi_ext_pmu_handler,
> + .probe = kvm_sbi_ext_pmu_probe,
> +};
> --
> 2.25.1
>

Thanks,
drew

2023-02-02 17:30:54

by Andrew Jones

Subject: Re: [PATCH v4 09/14] RISC-V: KVM: Make PMU functionality depend on Sscofpmf

On Wed, Feb 01, 2023 at 03:12:45PM -0800, Atish Patra wrote:
> The privilege mode filtering feature must be available in the host so
> that the host can inhibit the counters while the execution is in HS mode.
> Otherwise, the guests may have access to critical guest information.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kvm/vcpu_pmu.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 2dad37f..9a531fe 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -79,6 +79,14 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> struct kvm_pmc *pmc;
>
> + /*
> + * PMU functionality should be only available to guests if privilege mode
> + * filtering is available in the host. Otherwise, guest will always count
> + * events while the execution is in hypervisor mode.
> + */
> + if (!riscv_isa_extension_available(NULL, SSCOFPMF))
> + return;
> +
> ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> if (ret < 0 || !hpm_width || !num_hw_ctrs)
> return;
> --
> 2.25.1
>

Reviewed-by: Andrew Jones <[email protected]>

2023-02-02 18:45:09

by Andrew Jones

Subject: Re: [PATCH v4 12/14] RISC-V: KVM: Implement perf support without sampling

On Wed, Feb 01, 2023 at 03:12:48PM -0800, Atish Patra wrote:
> RISC-V SBI PMU & Sscofpmf ISA extension allows supporting perf in
> the virtualization enviornment as well. KVM implementation
> relies on SBI PMU extension for the most part while trapping
> & emulating the CSRs read for counter access.
>
> This patch doesn't have the event sampling support yet.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kvm/vcpu_pmu.c | 360 +++++++++++++++++++++++++++++++++++++-
> 1 file changed, 356 insertions(+), 4 deletions(-)
>
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 6fa0065..473ad80 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -12,10 +12,189 @@
> #include <linux/perf/riscv_pmu.h>
> #include <asm/csr.h>
> #include <asm/kvm_vcpu_sbi.h>
> +#include <asm/bitops.h>
> #include <asm/kvm_vcpu_pmu.h>
> #include <linux/kvm_host.h>
>
> #define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> +#define get_event_type(x) (((x) & SBI_PMU_EVENT_IDX_TYPE_MASK) >> 16)
> +#define get_event_code(x) ((x) & SBI_PMU_EVENT_IDX_CODE_MASK)
> +
> +static enum perf_hw_id hw_event_perf_map[SBI_PMU_HW_GENERAL_MAX] = {
> + [SBI_PMU_HW_CPU_CYCLES] = PERF_COUNT_HW_CPU_CYCLES,
> + [SBI_PMU_HW_INSTRUCTIONS] = PERF_COUNT_HW_INSTRUCTIONS,
> + [SBI_PMU_HW_CACHE_REFERENCES] = PERF_COUNT_HW_CACHE_REFERENCES,
> + [SBI_PMU_HW_CACHE_MISSES] = PERF_COUNT_HW_CACHE_MISSES,
> + [SBI_PMU_HW_BRANCH_INSTRUCTIONS] = PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
> + [SBI_PMU_HW_BRANCH_MISSES] = PERF_COUNT_HW_BRANCH_MISSES,
> + [SBI_PMU_HW_BUS_CYCLES] = PERF_COUNT_HW_BUS_CYCLES,
> + [SBI_PMU_HW_STALLED_CYCLES_FRONTEND] = PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
> + [SBI_PMU_HW_STALLED_CYCLES_BACKEND] = PERF_COUNT_HW_STALLED_CYCLES_BACKEND,
> + [SBI_PMU_HW_REF_CPU_CYCLES] = PERF_COUNT_HW_REF_CPU_CYCLES,
> +};
> +
> +static u64 kvm_pmu_get_sample_period(struct kvm_pmc *pmc)
> +{
> + u64 counter_val_mask = GENMASK(pmc->cinfo.width, 0);
> + u64 sample_period;
> +
> + if (!pmc->counter_val)
> + sample_period = counter_val_mask + 1;
> + else
> + sample_period = (-pmc->counter_val) & counter_val_mask;
> +
> + return sample_period;
> +}
> +
> +static u32 kvm_pmu_get_perf_event_type(unsigned long eidx)
> +{
> + enum sbi_pmu_event_type etype = get_event_type(eidx);
> + u32 type = PERF_TYPE_MAX;
> +
> + switch (etype) {
> + case SBI_PMU_EVENT_TYPE_HW:
> + type = PERF_TYPE_HARDWARE;
> + break;
> + case SBI_PMU_EVENT_TYPE_CACHE:
> + type = PERF_TYPE_HW_CACHE;
> + break;
> + case SBI_PMU_EVENT_TYPE_RAW:
> + case SBI_PMU_EVENT_TYPE_FW:
> + type = PERF_TYPE_RAW;
> + break;
> + default:
> + break;
> + }
> +
> + return type;
> +}
> +
> +static bool kvm_pmu_is_fw_event(unsigned long eidx)
> +{
> + return get_event_type(eidx) == SBI_PMU_EVENT_TYPE_FW;
> +}
> +
> +static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc)
> +{
> + if (pmc->perf_event) {
> + perf_event_disable(pmc->perf_event);
> + perf_event_release_kernel(pmc->perf_event);
> + pmc->perf_event = NULL;
> + }
> +}
> +
> +static u64 kvm_pmu_get_perf_event_hw_config(u32 sbi_event_code)
> +{
> + return hw_event_perf_map[sbi_event_code];
> +}
> +
> +static u64 kvm_pmu_get_perf_event_cache_config(u32 sbi_event_code)
> +{
> + u64 config = U64_MAX;
> + unsigned int cache_type, cache_op, cache_result;
> +
> + /* All the cache event masks lie within 0xFF. No separate masking is necessary */
> + cache_type = (sbi_event_code & SBI_PMU_EVENT_CACHE_ID_CODE_MASK) >>
> + SBI_PMU_EVENT_CACHE_ID_SHIFT;
> + cache_op = (sbi_event_code & SBI_PMU_EVENT_CACHE_OP_ID_CODE_MASK) >>
> + SBI_PMU_EVENT_CACHE_OP_SHIFT;
> + cache_result = sbi_event_code & SBI_PMU_EVENT_CACHE_RESULT_ID_CODE_MASK;
> +
> + if (cache_type >= PERF_COUNT_HW_CACHE_MAX ||
> + cache_op >= PERF_COUNT_HW_CACHE_OP_MAX ||
> + cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
> + return config;
> +
> + config = cache_type | (cache_op << 8) | (cache_result << 16);
> +
> + return config;
> +}
> +
> +static u64 kvm_pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
> +{
> + enum sbi_pmu_event_type etype = get_event_type(eidx);
> + u32 ecode = get_event_code(eidx);
> + u64 config = U64_MAX;
> +
> + switch (etype) {
> + case SBI_PMU_EVENT_TYPE_HW:
> + if (ecode < SBI_PMU_HW_GENERAL_MAX)
> + config = kvm_pmu_get_perf_event_hw_config(ecode);
> + break;
> + case SBI_PMU_EVENT_TYPE_CACHE:
> + config = kvm_pmu_get_perf_event_cache_config(ecode);
> + break;
> + case SBI_PMU_EVENT_TYPE_RAW:
> + config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
> + break;
> + case SBI_PMU_EVENT_TYPE_FW:
> + if (ecode < SBI_PMU_FW_MAX)
> + config = (1ULL << 63) | ecode;
> + break;
> + default:
> + break;
> + }
> +
> + return config;
> +}
> +
> +static int kvm_pmu_get_fixed_pmc_index(unsigned long eidx)
> +{
> + u32 etype = kvm_pmu_get_perf_event_type(eidx);
> + u32 ecode = get_event_code(eidx);
> +
> + if (etype != SBI_PMU_EVENT_TYPE_HW)
> + return -EINVAL;
> +
> + if (ecode == SBI_PMU_HW_CPU_CYCLES)
> + return 0;
> + else if (ecode == SBI_PMU_HW_INSTRUCTIONS)
> + return 2;
> + else
> + return -EINVAL;
> +}
> +
> +static int kvm_pmu_get_programmable_pmc_index(struct kvm_pmu *kvpmu, unsigned long eidx,
> + unsigned long cbase, unsigned long cmask)
> +{
> + int ctr_idx = -1;
> + int i, pmc_idx;
> + int min, max;
> +
> + if (kvm_pmu_is_fw_event(eidx)) {
> + /* Firmware counters are mapped 1:1 starting from num_hw_ctrs for simplicity */
> + min = kvpmu->num_hw_ctrs;
> + max = min + kvpmu->num_fw_ctrs;
> + } else {
> + /* First 3 counters are reserved for fixed counters */
> + min = 3;
> + max = kvpmu->num_hw_ctrs;
> + }
> +
> + for_each_set_bit(i, &cmask, BITS_PER_LONG) {
> + pmc_idx = i + cbase;
> + if ((pmc_idx >= min && pmc_idx < max) &&
> + !test_bit(pmc_idx, kvpmu->pmc_in_use)) {
> + ctr_idx = pmc_idx;
> + break;
> + }
> + }
> +
> + return ctr_idx;
> +}
> +
> +static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx,
> + unsigned long cbase, unsigned long cmask)
> +{
> + int ret;
> +
> + /* Fixed counters need to be have fixed mapping as they have different width */
> + ret = kvm_pmu_get_fixed_pmc_index(eidx);
> + if (ret >= 0)
> + return ret;
> +
> + return kvm_pmu_get_programmable_pmc_index(pmu, eidx, cbase, cmask);
> +}
>
> static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> unsigned long *out_val)
> @@ -34,6 +213,16 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> return 0;
> }
>
> +static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ctr_base,
> + unsigned long ctr_mask)
> +{
> + /* Make sure the we have a valid counter mask requested from the caller */
> + if (!ctr_mask || (ctr_base + __fls(ctr_mask) >= kvm_pmu_num_counters(kvpmu)))
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> unsigned long *val, unsigned long new_val,
> unsigned long wr_mask)
> @@ -97,7 +286,39 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> struct kvm_vcpu_sbi_return *retdata)
> {
> - /* TODO */
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + int i, pmc_index, sbiret = 0;
> + struct kvm_pmc *pmc;
> +
> + if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + goto out;
> + }
> +
> + /* Start the counters that have been configured and requested by the guest */
> + for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> + pmc_index = i + ctr_base;
> + if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> + continue;
> + pmc = &kvpmu->pmc[pmc_index];
> + if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> + pmc->counter_val = ival;
> + if (pmc->perf_event) {
> + if (unlikely(pmc->started)) {
> + sbiret = SBI_ERR_ALREADY_STARTED;
> + continue;
> + }
> + perf_event_period(pmc->perf_event, kvm_pmu_get_sample_period(pmc));
> + perf_event_enable(pmc->perf_event);
> + pmc->started = true;
> + } else {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + }
> + }
> +
> +out:
> + retdata->err_val = sbiret;
> +
> return 0;
> }
>
> @@ -105,7 +326,45 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> unsigned long ctr_mask, unsigned long flag,
> struct kvm_vcpu_sbi_return *retdata)
> {
> - /* TODO */
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + int i, pmc_index, sbiret = 0;
> + u64 enabled, running;
> + struct kvm_pmc *pmc;
> +
> + if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + goto out;
> + }
> +
> + /* Stop the counters that have been configured and requested by the guest */
> + for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
> + pmc_index = i + ctr_base;
> + if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> + continue;
> + pmc = &kvpmu->pmc[pmc_index];
> + if (pmc->perf_event) {
> + if (pmc->started) {
> + /* Stop counting the counter */
> + perf_event_disable(pmc->perf_event);
> + pmc->started = false;
> + } else
> + sbiret = SBI_ERR_ALREADY_STOPPED;
> +
> + if (flag & SBI_PMU_STOP_FLAG_RESET) {
> + /* Relase the counter if this is a reset request */
> + pmc->counter_val += perf_event_read_value(pmc->perf_event,
> + &enabled, &running);
> + kvm_pmu_release_perf_event(pmc);
> + clear_bit(pmc_index, kvpmu->pmc_in_use);
> + }
> + } else {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + }
> + }
> +
> +out:
> + retdata->err_val = sbiret;
> +
> return 0;
> }
>
> @@ -114,7 +373,88 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> unsigned long eidx, uint64_t evtdata,
> struct kvm_vcpu_sbi_return *retdata)
> {
> - /* TODO */
> + int ctr_idx, sbiret = 0;
> + u64 config;
> + u32 etype = kvm_pmu_get_perf_event_type(eidx);
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + struct perf_event *event;
> + struct kvm_pmc *pmc;
> + struct perf_event_attr attr = {
> + .type = etype,
> + .size = sizeof(struct perf_event_attr),
> + .pinned = true,
> + /*
> + * It should never reach here if the platform doesn't support the sscofpmf
> + * extension as mode filtering won't work without it.
> + */
> + .exclude_host = true,
> + .exclude_hv = true,
> + .exclude_user = !!(flag & SBI_PMU_CFG_FLAG_SET_UINH),
> + .exclude_kernel = !!(flag & SBI_PMU_CFG_FLAG_SET_SINH),
> + .config1 = RISCV_PMU_CONFIG1_GUEST_EVENTS,
> + };
> +
> + if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + goto out;
> + }
> +
> + if (kvm_pmu_is_fw_event(eidx)) {
> + sbiret = SBI_ERR_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + /*
> + * SKIP_MATCH flag indicates the caller is aware of the assigned counter
> + * for this event. Just do a sanity check if it already marked used.
> + */
> + if (flag & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> + if (!test_bit(ctr_base + __ffs(ctr_mask), kvpmu->pmc_in_use)) {
> + sbiret = SBI_ERR_FAILURE;
> + goto out;
> + }
> + ctr_idx = ctr_base + __ffs(ctr_mask);
> + } else {
> +

extra blank line here

> + ctr_idx = pmu_get_pmc_index(kvpmu, eidx, ctr_base, ctr_mask);
> + if (ctr_idx < 0) {
> + sbiret = SBI_ERR_NOT_SUPPORTED;
> + goto out;
> + }
> + }
> +
> + pmc = &kvpmu->pmc[ctr_idx];
> + kvm_pmu_release_perf_event(pmc);
> + pmc->idx = ctr_idx;
> +
> + config = kvm_pmu_get_perf_event_config(eidx, evtdata);
> + attr.config = config;
> + if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
> + //TODO: Do we really want to clear the value in hardware counter
> + pmc->counter_val = 0;
> + }
> +
> + /*
> + * Set the default sample_period for now. The guest specified value
> + * will be updated in the start call.
> + */
> + attr.sample_period = kvm_pmu_get_sample_period(pmc);
> +
> + event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
> + if (IS_ERR(event)) {
> + pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
> + return PTR_ERR(event);
> + }
> +
> + set_bit(ctr_idx, kvpmu->pmc_in_use);
> + pmc->perf_event = event;
> + if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> + perf_event_enable(pmc->perf_event);
> +
> + retdata->out_val = ctr_idx;
> +out:
> + retdata->err_val = sbiret;
> +
> return 0;
> }
>
> @@ -192,7 +532,19 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
>
> void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> {
> - /* TODO */
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + struct kvm_pmc *pmc;
> + int i;
> +
> + if (!kvpmu)
> + return;
> +
> + for_each_set_bit(i, kvpmu->pmc_in_use, RISCV_MAX_COUNTERS) {
> + pmc = &kvpmu->pmc[i];
> + pmc->counter_val = 0;
> + kvm_pmu_release_perf_event(pmc);
> + }
> + bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
> }
>
> void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> --
> 2.25.1
>

Other than the extra blank line,

Reviewed-by: Andrew Jones <[email protected]>

Thanks,
drew

2023-02-02 18:49:06

by Andrew Jones

Subject: Re: [PATCH v4 14/14] RISC-V: KVM: Increment firmware pmu events

On Wed, Feb 01, 2023 at 03:12:50PM -0800, Atish Patra wrote:
> KVM supports firmware events now. Invoke the firmware event increment
> function from appropriate places.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/kvm/tlb.c | 4 ++++
> arch/riscv/kvm/vcpu_sbi_replace.c | 7 +++++++
> 2 files changed, 11 insertions(+)
>

Reviewed-by: Andrew Jones <[email protected]>

2023-02-03 08:06:26

by Atish Patra

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Thu, Feb 2, 2023 at 3:34 AM Conor Dooley <[email protected]> wrote:
>
> On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > This patch only adds barebone structure of perf implementation. Most of
> > the function returns zero at this point and will be implemented
> > fully in the future.
> >
> > Signed-off-by: Atish Patra <[email protected]>
> > +/* Per virtual pmu counter data */
> > +struct kvm_pmc {
> > + u8 idx;
> > + struct perf_event *perf_event;
> > + uint64_t counter_val;
>
> CI also complained that here, and elsewhere, you used uint64_t rather
> than u64. Am I missing a reason for not using the regular types?
>

Nope. It was a simple oversight. I will fix it.
Do you have a link to the CI report so that I can address them all in v5?

> Thanks,
> Conor.
>
> > + union sbi_pmu_ctr_info cinfo;
> > + /* Event monitoring status */
> > + bool started;



--
Regards,
Atish

2023-02-03 08:11:00

by Conor Dooley

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf



On 3 February 2023 08:04:00 GMT, Atish Patra <[email protected]> wrote:
>On Thu, Feb 2, 2023 at 3:34 AM Conor Dooley <[email protected]> wrote:
>>
>> On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
>> > This patch only adds barebone structure of perf implementation. Most of
>> > the function returns zero at this point and will be implemented
>> > fully in the future.
>> >
>> > Signed-off-by: Atish Patra <[email protected]>
>> > +/* Per virtual pmu counter data */
>> > +struct kvm_pmc {
>> > + u8 idx;
>> > + struct perf_event *perf_event;
>> > + uint64_t counter_val;
>>
>> CI also complained that here, and elsewhere, you used uint64_t rather
>> than u64. Am I missing a reason for not using the regular types?
>>
>
>Nope. It was a simple oversight. I will fix it.
>Do you have a link to the CI report so that I can address them all in v5 ?

Try:
:%s/uint64_t/u64
It was just this patch, and checkpatch --strict should show it.

>
>> Thanks,
>> Conor.
>>
>> > + union sbi_pmu_ctr_info cinfo;
>> > + /* Event monitoring status */
>> > + bool started;
>
>
>

2023-02-03 08:47:44

by Atish Patra

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Thu, Feb 2, 2023 at 9:03 AM Andrew Jones <[email protected]> wrote:
>
> On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > This patch only adds barebone structure of perf implementation. Most of
> > the function returns zero at this point and will be implemented
> > fully in the future.
> >
> > Signed-off-by: Atish Patra <[email protected]>
> > ---
> > arch/riscv/include/asm/kvm_host.h | 4 +
> > arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> > arch/riscv/kvm/Makefile | 1 +
> > arch/riscv/kvm/vcpu.c | 7 ++
> > arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> > 5 files changed, 226 insertions(+)
> > create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> > create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> >
> > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > index 93f43a3..b90be9a 100644
> > --- a/arch/riscv/include/asm/kvm_host.h
> > +++ b/arch/riscv/include/asm/kvm_host.h
> > @@ -18,6 +18,7 @@
> > #include <asm/kvm_vcpu_insn.h>
> > #include <asm/kvm_vcpu_sbi.h>
> > #include <asm/kvm_vcpu_timer.h>
> > +#include <asm/kvm_vcpu_pmu.h>
> >
> > #define KVM_MAX_VCPUS 1024
> >
> > @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
> >
> > /* Don't run the VCPU (blocked) */
> > bool pause;
> > +
> > + /* Performance monitoring context */
> > + struct kvm_pmu pmu_context;
> > };
> >
> > static inline void kvm_arch_hardware_unsetup(void) {}
> > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > new file mode 100644
> > index 0000000..e2b4038
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > @@ -0,0 +1,78 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (c) 2023 Rivos Inc
> > + *
> > + * Authors:
> > + * Atish Patra <[email protected]>
> > + */
> > +
> > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > +#define __KVM_VCPU_RISCV_PMU_H
> > +
> > +#include <linux/perf/riscv_pmu.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +#include <asm/sbi.h>
> > +
> > +#ifdef CONFIG_RISCV_PMU_SBI
> > +#define RISCV_KVM_MAX_FW_CTRS 32
> > +
> > +#if RISCV_KVM_MAX_FW_CTRS > 32
> > +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
>
> "The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"
>
> > +#endif
> > +
> > +#define RISCV_MAX_COUNTERS 64
>
> But instead of that message, what I think we need is something like
>
> #define RISCV_KVM_MAX_HW_CTRS 32
> #define RISCV_KVM_MAX_FW_CTRS 32
> #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
>
> static_assert(RISCV_MAX_COUNTERS <= 64)
>
> And then in pmu_sbi_device_probe() should ensure
>
> num_counters <= RISCV_MAX_COUNTERS
>
> and pmu_sbi_get_ctrinfo() should ensure
>
> num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
> num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS
>
> which has to be done at runtime.
>

Sure. I will add the additional sanity checks.
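
For illustration, the limits discussed above could be expressed roughly
as follows. This is only a sketch: check_counter_limits() is a
hypothetical helper, not a function from the series or the PMU driver,
and the macro values are the ones proposed in the review.

```c
#include <assert.h>
#include <errno.h>

/* Values proposed in the review above. */
#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_FW_CTRS 32
#define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)

/* Compile-time bound on the total, as suggested in the review. */
static_assert(RISCV_MAX_COUNTERS <= 64, "too many counters");

/*
 * Hypothetical helper: the counts are only known once the SBI
 * implementation has been queried, so this part has to run at
 * probe time rather than at build time.
 */
static int check_counter_limits(int num_hw_ctr, int num_fw_ctr)
{
	if (num_hw_ctr < 0 || num_fw_ctr < 0)
		return -EINVAL;
	if (num_hw_ctr > RISCV_KVM_MAX_HW_CTRS)
		return -EINVAL;
	if (num_fw_ctr > RISCV_KVM_MAX_FW_CTRS)
		return -EINVAL;
	return 0;
}
```

The split mirrors the review comment: the sum of the two maxima is a
build-time constant, while the actual counts need a runtime check.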

> > +
> > +/* Per virtual pmu counter data */
> > +struct kvm_pmc {
> > + u8 idx;
> > + struct perf_event *perf_event;
> > + uint64_t counter_val;
> > + union sbi_pmu_ctr_info cinfo;
> > + /* Event monitoring status */
> > + bool started;
> > +};
> > +
> > +/* PMU data structure per vcpu */
> > +struct kvm_pmu {
> > + struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> > + /* Number of the virtual firmware counters available */
> > + int num_fw_ctrs;
> > + /* Number of the virtual hardware counters available */
> > + int num_hw_ctrs;
> > + /* A flag to indicate that pmu initialization is done */
> > + bool init_done;
> > + /* Bit map of all the virtual counter used */
> > + DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> > +};
> > +
> > +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
> > +#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))
> > +
> > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
> > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > + struct kvm_vcpu_sbi_return *retdata);
> > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > + struct kvm_vcpu_sbi_return *retdata);
> > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag,
> > + struct kvm_vcpu_sbi_return *retdata);
> > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag,
> > + unsigned long eidx, uint64_t evtdata,
> > + struct kvm_vcpu_sbi_return *retdata);
>
> s/flag/flags/ for all the above prototypes and all the implementations
> below.
>

Fixed.

> > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > + struct kvm_vcpu_sbi_return *retdata);
> > +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> > +
> > +#else
> > +struct kvm_pmu {
> > +};
> > +
> > +static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> > +#endif /* CONFIG_RISCV_PMU_SBI */
> > +#endif /* !__KVM_VCPU_RISCV_PMU_H */
> > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> > index 019df920..5de1053 100644
> > --- a/arch/riscv/kvm/Makefile
> > +++ b/arch/riscv/kvm/Makefile
> > @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
> > kvm-y += vcpu_sbi_replace.o
> > kvm-y += vcpu_sbi_hsm.o
> > kvm-y += vcpu_timer.o
> > +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> > index 7c08567..7d010b0 100644
> > --- a/arch/riscv/kvm/vcpu.c
> > +++ b/arch/riscv/kvm/vcpu.c
> > @@ -138,6 +138,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
> > WRITE_ONCE(vcpu->arch.irqs_pending, 0);
> > WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
> >
> > + kvm_riscv_vcpu_pmu_reset(vcpu);
> > +
> > vcpu->arch.hfence_head = 0;
> > vcpu->arch.hfence_tail = 0;
> > memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
> > @@ -194,6 +196,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > /* Setup VCPU timer */
> > kvm_riscv_vcpu_timer_init(vcpu);
> >
> > + /* setup performance monitoring */
> > + kvm_riscv_vcpu_pmu_init(vcpu);
> > +
> > /* Reset VCPU */
> > kvm_riscv_reset_vcpu(vcpu);
> >
> > @@ -216,6 +221,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > /* Cleanup VCPU timer */
> > kvm_riscv_vcpu_timer_deinit(vcpu);
> >
> > + kvm_riscv_vcpu_pmu_deinit(vcpu);
> > +
> > /* Free unused pages pre-allocated for G-stage page table mappings */
> > kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > }
> > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> > new file mode 100644
> > index 0000000..2dad37f
> > --- /dev/null
> > +++ b/arch/riscv/kvm/vcpu_pmu.c
> > @@ -0,0 +1,136 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2023 Rivos Inc
> > + *
> > + * Authors:
> > + * Atish Patra <[email protected]>
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/err.h>
> > +#include <linux/kvm_host.h>
> > +#include <linux/perf/riscv_pmu.h>
> > +#include <asm/csr.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +#include <asm/kvm_vcpu_pmu.h>
> > +#include <linux/kvm_host.h>
> > +
> > +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> > +
> > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > + retdata->out_val = kvm_pmu_num_counters(kvpmu);
> > +
> > + return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > + if (cidx > RISCV_MAX_COUNTERS || cidx == 1) {
> > + retdata->err_val = SBI_ERR_INVALID_PARAM;
> > + return 0;
> > + }
> > +
> > + retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
> > +
> > + return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + /* TODO */
> > + return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + /* TODO */
> > + return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > + unsigned long ctr_mask, unsigned long flag,
> > + unsigned long eidx, uint64_t evtdata,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + /* TODO */
> > + return 0;
> > +}
> > +
> > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + /* TODO */
> > + return 0;
> > +}
> > +
> > +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> > +{
> > + int i = 0, ret, num_hw_ctrs = 0, hpm_width = 0;
> > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > + struct kvm_pmc *pmc;
> > +
> > + ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> > + if (ret < 0 || !hpm_width || !num_hw_ctrs)
> > + return;
> > +
> > + /*
> > + * It is guranteed that RISCV_KVM_MAX_FW_CTRS can't exceed 32 as
> > + * that may exceed total number of counters more than RISCV_MAX_COUNTERS
> > + */
> > + kvpmu->num_hw_ctrs = num_hw_ctrs;
> > + kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
>
> If we sanity check that num_hw_ctrs <= 32 and num_fw_ctrs <= 32 at sbi_pmu
> probe time, then we can also return num_fw_ctrs (or num_ctrs) along with
> num_hw_ctrs from riscv_pmu_get_hpm_info(). Then, we can put the exact
> number here into kvmpmu->num_fw_ctrs, rather than using its max.
>

The firmware counter information retrieved from the PMU driver is the
number of firmware counters the host supports (i.e. what the M-mode
firmware supports). The number of counters supported for a guest is
entirely up to the hypervisor. There shouldn't be any relation to the
host's firmware counters.

Looking at it again, we should probably set kvpmu->num_fw_ctrs to
SBI_PMU_FW_MAX instead of RISCV_KVM_MAX_FW_CTRS.
We already have a sanity check for SBI_PMU_FW_MAX in the code.
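
The resulting virtual counter layout can be sketched as below. The
helper vctr_classify() and its enum are hypothetical names used only for
illustration. The point is that the firmware counter count is a
parameter the hypervisor picks, not a value read from the host.

```c
#include <assert.h>

enum vctr_type { VCTR_NONE, VCTR_HW, VCTR_FW };

/*
 * Hypothetical helper following the loop in kvm_riscv_vcpu_pmu_init():
 * indices below num_hw_ctrs are hardware counters, the rest are
 * firmware counters, and index 1 (TIME) is skipped.
 */
static enum vctr_type vctr_classify(int idx, int num_hw_ctrs, int num_fw_ctrs)
{
	assert(num_hw_ctrs >= 0 && num_fw_ctrs >= 0);

	if (idx < 0 || idx >= num_hw_ctrs + num_fw_ctrs)
		return VCTR_NONE;
	if (idx == 1)	/* TIME is not read through the perf interface */
		return VCTR_NONE;
	return idx < num_hw_ctrs ? VCTR_HW : VCTR_FW;
}
```

With 8 hardware and 4 firmware counters, for example, indices 0 and 2-7
are hardware counters and indices 8-11 are firmware counters.
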
> > +
> > + /*
> > + * There is no correlation between the logical hardware counter and virtual counters.
> > + * However, we need to encode a hpmcounter CSR in the counter info field so that
> > + * KVM can trap n emulate the read. This works well in the migration use case as
> > + * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> > + */
> > + for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> > + /* TIME CSR shouldn't be read from perf interface */
> > + if (i == 1)
> > + continue;
> > + pmc = &kvpmu->pmc[i];
> > + pmc->idx = i;
> > + if (i < kvpmu->num_hw_ctrs) {
> > + pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
> > + if (i < 3)
> > + /* CY, IR counters */
> > + pmc->cinfo.width = 63;
> > + else
> > + pmc->cinfo.width = hpm_width;
> > + /*
> > + * The CSR number doesn't have any relation with the logical
> > + * hardware counters. The CSR numbers are encoded sequentially
> > + * to avoid maintaining a map between the virtual counter
> > + * and CSR number.
> > + */
> > + pmc->cinfo.csr = CSR_CYCLE + i;
> > + } else {
> > + pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> > + pmc->cinfo.width = BITS_PER_LONG - 1;
> > + }
> > + }
> > +
> > + kvpmu->init_done = true;
> > +}
> > +
> > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> > +{
> > + /* TODO */
> > +}
> > +
> > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> > +{
> > + kvm_riscv_vcpu_pmu_deinit(vcpu);
> > +}
> > --
> > 2.25.1
> >
>
> Thanks,
> drew



--
Regards,
Atish

2023-02-03 09:08:18

by Atish Patra

Subject: Re: [PATCH v4 08/14] RISC-V: KVM: Add SBI PMU extension support

On Thu, Feb 2, 2023 at 9:29 AM Andrew Jones <[email protected]> wrote:
>
> On Wed, Feb 01, 2023 at 03:12:44PM -0800, Atish Patra wrote:
> > SBI PMU extension allows KVM guests to configure/start/stop/query about
> > the PMU counters in virtualized enviornment as well.
> >
> > In order to allow that, KVM implements the entire SBI PMU extension.
> >
> > Reviewed-by: Anup Patel <[email protected]>
> > Signed-off-by: Atish Patra <[email protected]>
> > ---
> > arch/riscv/kvm/Makefile | 2 +-
> > arch/riscv/kvm/vcpu_sbi.c | 11 +++++
> > arch/riscv/kvm/vcpu_sbi_pmu.c | 85 +++++++++++++++++++++++++++++++++++
> > 3 files changed, 97 insertions(+), 1 deletion(-)
> > create mode 100644 arch/riscv/kvm/vcpu_sbi_pmu.c
> >
> > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> > index 5de1053..278e97c 100644
> > --- a/arch/riscv/kvm/Makefile
> > +++ b/arch/riscv/kvm/Makefile
> > @@ -25,4 +25,4 @@ kvm-y += vcpu_sbi_base.o
> > kvm-y += vcpu_sbi_replace.o
> > kvm-y += vcpu_sbi_hsm.o
> > kvm-y += vcpu_timer.o
> > -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> > +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
> > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
> > index fe2897e..15fde15 100644
> > --- a/arch/riscv/kvm/vcpu_sbi.c
> > +++ b/arch/riscv/kvm/vcpu_sbi.c
> > @@ -20,6 +20,16 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
> > };
> > #endif
> >
> > +#ifdef CONFIG_RISCV_PMU_SBI
> > +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
> > +#else
> > +static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> > + .extid_start = -1UL,
> > + .extid_end = -1UL,
> > + .handler = NULL,
> > +};
> > +#endif
> > +
> > static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> > &vcpu_sbi_ext_v01,
> > &vcpu_sbi_ext_base,
> > @@ -28,6 +38,7 @@ static const struct kvm_vcpu_sbi_extension *sbi_ext[] = {
> > &vcpu_sbi_ext_rfence,
> > &vcpu_sbi_ext_srst,
> > &vcpu_sbi_ext_hsm,
> > + &vcpu_sbi_ext_pmu,
> > &vcpu_sbi_ext_experimental,
> > &vcpu_sbi_ext_vendor,
> > };
> > diff --git a/arch/riscv/kvm/vcpu_sbi_pmu.c b/arch/riscv/kvm/vcpu_sbi_pmu.c
> > new file mode 100644
> > index 0000000..e028b0a
> > --- /dev/null
> > +++ b/arch/riscv/kvm/vcpu_sbi_pmu.c
> > @@ -0,0 +1,85 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2023 Rivos Inc
> > + *
> > + * Authors:
> > + * Atish Patra <[email protected]>
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/err.h>
> > +#include <linux/kvm_host.h>
> > +#include <asm/csr.h>
> > +#include <asm/sbi.h>
> > +#include <asm/kvm_vcpu_sbi.h>
> > +
> > +static int kvm_sbi_ext_pmu_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
> > + struct kvm_vcpu_sbi_return *retdata)
> > +{
> > + int ret = 0;
> > + struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
> > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > + unsigned long funcid = cp->a6;
> > + uint64_t temp;
> > +
> > + /* Return not supported if PMU is not initialized */
> > + if (!kvpmu->init_done)
> > + return -EINVAL;
>
> Shouldn't this be the following?
>
> if (!kvpmu->init_done) {
> retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> return 0;
> }
>

This condition is an additional sanity check. Hitting it indicates a bug
in the guest code: it invokes PMU extension calls even though the probe
function already returned NOT_SUPPORTED.

Earlier, I was in two minds about whether this should just return not
supported to the guest or report an error. However, given that it's just
a driver in the guest, the error shouldn't be fatal. It should just
return the SBI error back to the guest.

Thanks for catching this. I have fixed it in v5.
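
A minimal sketch of the two error paths follows. struct sbi_return is a
simplified stand-in for struct kvm_vcpu_sbi_return, and pmu_check_init()
is a hypothetical name. The error value is the one the SBI specification
assigns to SBI_ERR_NOT_SUPPORTED.

```c
#include <assert.h>
#include <stdbool.h>

#define SBI_SUCCESS		0
#define SBI_ERR_NOT_SUPPORTED	(-2)	/* value from the SBI specification */

/* Simplified stand-in for struct kvm_vcpu_sbi_return. */
struct sbi_return {
	long err_val;		/* SBI error handed back to the guest */
	unsigned long out_val;	/* SBI return value */
};

/*
 * Hypothetical handler prologue. The function's return value is kept
 * for host-side failures. A PMU that was never initialized is reported
 * to the guest through err_val instead.
 */
static int pmu_check_init(bool init_done, struct sbi_return *retdata)
{
	assert(retdata);

	retdata->err_val = SBI_SUCCESS;
	retdata->out_val = 0;

	if (!init_done)
		retdata->err_val = SBI_ERR_NOT_SUPPORTED;

	return 0;	/* not fatal: the guest keeps running */
}
```

Returning 0 in both cases keeps the vcpu running. Only the value the
guest sees in its error register differs.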

> > +
> > + switch (funcid) {
> > + case SBI_EXT_PMU_NUM_COUNTERS:
> > + ret = kvm_riscv_vcpu_pmu_num_ctrs(vcpu, retdata);
> > + break;
> > + case SBI_EXT_PMU_COUNTER_GET_INFO:
> > + ret = kvm_riscv_vcpu_pmu_ctr_info(vcpu, cp->a0, retdata);
> > + break;
> > + case SBI_EXT_PMU_COUNTER_CFG_MATCH:
> > +#if defined(CONFIG_32BIT)
> > + temp = ((uint64_t)cp->a5 << 32) | cp->a4;
> > +#else
> > + temp = cp->a4;
> > +#endif
> > + /*
> > + * This can fail if perf core framework fails to create an event.
> > + * Forward the error to the user space because its an error happened
>
> "Forward the error to userspace because it's an error which happened within
> the host kernel."
>

Fixed.

> > + * within host kernel. The other option would be convert this to
> ^ to
> > + * an SBI error and forward to the guest.
> > + */
> > + ret = kvm_riscv_vcpu_pmu_ctr_cfg_match(vcpu, cp->a0, cp->a1,
> > + cp->a2, cp->a3, temp, retdata);
> > + break;
> > + case SBI_EXT_PMU_COUNTER_START:
> > +#if defined(CONFIG_32BIT)
> > + temp = ((uint64_t)cp->a4 << 32) | cp->a3;
> > +#else
> > + temp = cp->a3;
> > +#endif
> > + ret = kvm_riscv_vcpu_pmu_ctr_start(vcpu, cp->a0, cp->a1, cp->a2,
> > + temp, retdata);
> > + break;
> > + case SBI_EXT_PMU_COUNTER_STOP:
> > + ret = kvm_riscv_vcpu_pmu_ctr_stop(vcpu, cp->a0, cp->a1, cp->a2, retdata);
> > + break;
> > + case SBI_EXT_PMU_COUNTER_FW_READ:
> > + ret = kvm_riscv_vcpu_pmu_ctr_read(vcpu, cp->a0, retdata);
> > + break;
> > + default:
> > + retdata->err_val = SBI_ERR_NOT_SUPPORTED;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +unsigned long kvm_sbi_ext_pmu_probe(struct kvm_vcpu *vcpu)
> > +{
> > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > +
> > + return kvpmu->init_done;
> > +}
> > +
> > +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
> > + .extid_start = SBI_EXT_PMU,
> > + .extid_end = SBI_EXT_PMU,
> > + .handler = kvm_sbi_ext_pmu_handler,
> > + .probe = kvm_sbi_ext_pmu_probe,
> > +};
> > --
> > 2.25.1
> >
>
> Thanks,
> drew



--
Regards,
Atish

2023-02-03 10:14:32

by Andrew Jones

Subject: Re: [PATCH v4 13/14] RISC-V: KVM: Support firmware events

On Wed, Feb 01, 2023 at 03:12:49PM -0800, Atish Patra wrote:
> SBI PMU extension defines a set of firmware events which can provide
> useful information to guests about the number of SBI calls. As
> hypervisor implements the SBI PMU extension, these firmware events
> correspond to ecall invocations between VS->HS mode. All other firmware
> events will always report zero if monitored as KVM doesn't implement them.
>
> This patch adds all the infrastructure required to support firmware
> events.
>
> Reviewed-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/kvm_vcpu_pmu.h | 17 +++
> arch/riscv/kvm/vcpu_pmu.c | 142 ++++++++++++++++++++------
> 2 files changed, 125 insertions(+), 34 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> index 2afaaf5..a1d8b7d 100644
> --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
> +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> @@ -22,6 +22,14 @@
>
> #define RISCV_MAX_COUNTERS 64
>
> +struct kvm_fw_event {
> + /* Current value of the event */
> + unsigned long value;
> +
> + /* Event monitoring status */
> + bool started;
> +};
> +
> /* Per virtual pmu counter data */
> struct kvm_pmc {
> u8 idx;
> @@ -30,11 +38,14 @@ struct kvm_pmc {
> union sbi_pmu_ctr_info cinfo;
> /* Event monitoring status */
> bool started;
> + /* Monitoring event ID */
> + unsigned long event_idx;
> };
>
> /* PMU data structure per vcpu */
> struct kvm_pmu {
> struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> + struct kvm_fw_event fw_event[RISCV_KVM_MAX_FW_CTRS];
> /* Number of the virtual firmware counters available */
> int num_fw_ctrs;
> /* Number of the virtual hardware counters available */
> @@ -57,6 +68,7 @@ struct kvm_pmu {
> { .base = CSR_CYCLE, .count = 31, .func = kvm_riscv_vcpu_pmu_read_hpm },
> #endif
>
> +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid);
> int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> unsigned long *val, unsigned long new_val,
> unsigned long wr_mask);
> @@ -87,6 +99,11 @@ struct kvm_pmu {
> { .base = 0, .count = 0, .func = NULL },
>
> static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
> +static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> +{
> + return 0;
> +}
> +
> static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> #endif /* CONFIG_RISCV_PMU_SBI */
> diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> index 473ad80..dd16e60 100644
> --- a/arch/riscv/kvm/vcpu_pmu.c
> +++ b/arch/riscv/kvm/vcpu_pmu.c
> @@ -202,12 +202,15 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> struct kvm_pmc *pmc;
> u64 enabled, running;
> + int fevent_code;
>
> pmc = &kvpmu->pmc[cidx];
> - if (!pmc->perf_event)
> - return -EINVAL;
>
> - pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);
> + if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> + fevent_code = get_event_code(pmc->event_idx);
> + pmc->counter_val = kvpmu->fw_event[fevent_code].value;
> + } else if (pmc->perf_event)
> + pmc->counter_val += perf_event_read_value(pmc->perf_event, &enabled, &running);

We also need

else {
return -EINVAL;
}
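
The control flow with the extra branch would look roughly like this.
This is a sketch with simplified stand-ins: vctr_read(), struct vctr and
the fw_value/perf_delta parameters are hypothetical. In the patch the
values come from kvpmu->fw_event[] and perf_event_read_value().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum ctr_type { CTR_TYPE_HW, CTR_TYPE_FW };

/* Simplified stand-in for struct kvm_pmc; a flag replaces the pointer. */
struct vctr {
	enum ctr_type type;
	int has_perf_event;
	uint64_t counter_val;
};

static int vctr_read(struct vctr *c, uint64_t fw_value, uint64_t perf_delta,
		     uint64_t *out_val)
{
	assert(c && out_val);

	if (c->type == CTR_TYPE_FW) {
		/* Firmware counters take the tracked event value as is. */
		c->counter_val = fw_value;
	} else if (c->has_perf_event) {
		/* Hardware counters accumulate what perf reports. */
		c->counter_val += perf_delta;
	} else {
		/* Hardware counter with no backing event: nothing to read. */
		return -EINVAL;
	}

	*out_val = c->counter_val;
	return 0;
}
```

Without the final else, a hardware counter that has no perf event would
fall through and report its stale counter_val as if the read succeeded.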

> *out_val = pmc->counter_val;
>
> return 0;
> @@ -223,6 +226,55 @@ static int kvm_pmu_validate_counter_mask(struct kvm_pmu *kvpmu, unsigned long ct
> return 0;
> }
>
> +static int kvm_pmu_create_perf_event(struct kvm_pmc *pmc, int ctr_idx,
> + struct perf_event_attr *attr, unsigned long flag,
> + unsigned long eidx, unsigned long evtdata)
> +{
> + struct perf_event *event;
> +
> + kvm_pmu_release_perf_event(pmc);
> + pmc->idx = ctr_idx;
> +
> + attr->config = kvm_pmu_get_perf_event_config(eidx, evtdata);
> + if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
> + //TODO: Do we really want to clear the value in hardware counter
> + pmc->counter_val = 0;
> + }
> +
> + /*
> + * Set the default sample_period for now. The guest specified value
> + * will be updated in the start call.
> + */
> + attr->sample_period = kvm_pmu_get_sample_period(pmc);
> +
> + event = perf_event_create_kernel_counter(attr, -1, current, NULL, pmc);
> + if (IS_ERR(event)) {
> + pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
> + return PTR_ERR(event);
> + }
> +
> + pmc->perf_event = event;
> + if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> + perf_event_enable(pmc->perf_event);
> +
> + return 0;
> +}
> +
> +int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid)
> +{
> + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> + struct kvm_fw_event *fevent;
> +
> + if (!kvpmu || fid >= SBI_PMU_FW_MAX)
> + return -EINVAL;
> +
> + fevent = &kvpmu->fw_event[fid];
> + if (fevent->started)
> + fevent->value++;
> +
> + return 0;
> +}
> +
> int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num,
> unsigned long *val, unsigned long new_val,
> unsigned long wr_mask)
> @@ -289,6 +341,7 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> int i, pmc_index, sbiret = 0;
> struct kvm_pmc *pmc;
> + int fevent_code;
>
> if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
> sbiret = SBI_ERR_INVALID_PARAM;
> @@ -303,7 +356,22 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> pmc = &kvpmu->pmc[pmc_index];
> if (flag & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> pmc->counter_val = ival;
> - if (pmc->perf_event) {
> + if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> + fevent_code = get_event_code(pmc->event_idx);
> + if (fevent_code >= SBI_PMU_FW_MAX) {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + goto out;
> + }
> +
> + /* Check if the counter was already started for some reason */
> + if (kvpmu->fw_event[fevent_code].started) {
> + sbiret = SBI_ERR_ALREADY_STARTED;
> + continue;
> + }
> +
> + kvpmu->fw_event[fevent_code].started = true;
> + kvpmu->fw_event[fevent_code].value = pmc->counter_val;
> + } else if (pmc->perf_event) {
> if (unlikely(pmc->started)) {
> sbiret = SBI_ERR_ALREADY_STARTED;
> continue;
> @@ -330,6 +398,7 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> int i, pmc_index, sbiret = 0;
> u64 enabled, running;
> struct kvm_pmc *pmc;
> + int fevent_code;
>
> if (kvm_pmu_validate_counter_mask(kvpmu, ctr_base, ctr_mask) < 0) {
> sbiret = SBI_ERR_INVALID_PARAM;
> @@ -342,7 +411,18 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> if (!test_bit(pmc_index, kvpmu->pmc_in_use))
> continue;
> pmc = &kvpmu->pmc[pmc_index];
> - if (pmc->perf_event) {
> + if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> + fevent_code = get_event_code(pmc->event_idx);
> + if (fevent_code >= SBI_PMU_FW_MAX) {
> + sbiret = SBI_ERR_INVALID_PARAM;
> + goto out;
> + }
> +
> + if (!kvpmu->fw_event[fevent_code].started)
> + sbiret = SBI_ERR_ALREADY_STOPPED;
> +
> + kvpmu->fw_event[fevent_code].started = false;
> + } else if (pmc->perf_event) {
> if (pmc->started) {
> /* Stop counting the counter */
> perf_event_disable(pmc->perf_event);
> @@ -355,11 +435,14 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> pmc->counter_val += perf_event_read_value(pmc->perf_event,
> &enabled, &running);
> kvm_pmu_release_perf_event(pmc);
> - clear_bit(pmc_index, kvpmu->pmc_in_use);
> }
> } else {
> sbiret = SBI_ERR_INVALID_PARAM;
> }
> + if (flag & SBI_PMU_STOP_FLAG_RESET) {
> + pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> + clear_bit(pmc_index, kvpmu->pmc_in_use);
> + }
> }
>
> out:
> @@ -373,12 +456,12 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> unsigned long eidx, uint64_t evtdata,
> struct kvm_vcpu_sbi_return *retdata)
> {
> - int ctr_idx, sbiret = 0;
> - u64 config;
> + int ctr_idx, ret, sbiret = 0;
> + bool is_fevent;
> + unsigned long event_code;
> u32 etype = kvm_pmu_get_perf_event_type(eidx);
> struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> - struct perf_event *event;
> - struct kvm_pmc *pmc;
> + struct kvm_pmc *pmc = NULL;

I don't think this change initializing pmc is necessary, but OK.

> struct perf_event_attr attr = {
> .type = etype,
> .size = sizeof(struct perf_event_attr),
> @@ -399,7 +482,9 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> goto out;
> }
>
> - if (kvm_pmu_is_fw_event(eidx)) {
> + event_code = get_event_code(eidx);
> + is_fevent = kvm_pmu_is_fw_event(eidx);
> + if (is_fevent && event_code >= SBI_PMU_FW_MAX) {
> sbiret = SBI_ERR_NOT_SUPPORTED;
> goto out;
> }
> @@ -424,33 +509,18 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
> }
>
> pmc = &kvpmu->pmc[ctr_idx];
> - kvm_pmu_release_perf_event(pmc);
> - pmc->idx = ctr_idx;
> -
> - config = kvm_pmu_get_perf_event_config(eidx, evtdata);
> - attr.config = config;
> - if (flag & SBI_PMU_CFG_FLAG_CLEAR_VALUE) {
> - //TODO: Do we really want to clear the value in hardware counter
> - pmc->counter_val = 0;
> - }
> -
> - /*
> - * Set the default sample_period for now. The guest specified value
> - * will be updated in the start call.
> - */
> - attr.sample_period = kvm_pmu_get_sample_period(pmc);
> -
> - event = perf_event_create_kernel_counter(&attr, -1, current, NULL, pmc);
> - if (IS_ERR(event)) {
> - pr_err("kvm pmu event creation failed for eidx %lx: %ld\n", eidx, PTR_ERR(event));
> - return PTR_ERR(event);
> + if (is_fevent) {
> + if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> + kvpmu->fw_event[event_code].started = true;
> + } else {
> + ret = kvm_pmu_create_perf_event(pmc, ctr_idx, &attr, flag, eidx, evtdata);
> + if (ret)
> + return ret;
> }
>
> set_bit(ctr_idx, kvpmu->pmc_in_use);
> - pmc->perf_event = event;
> - if (flag & SBI_PMU_CFG_FLAG_AUTO_START)
> - perf_event_enable(pmc->perf_event);
>
> + pmc->event_idx = eidx;
> retdata->out_val = ctr_idx;
> out:
> retdata->err_val = sbiret;
> @@ -494,6 +564,7 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> */
> kvpmu->num_hw_ctrs = num_hw_ctrs;
> kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> + memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
>
> /*
> * There is no correlation between the logical hardware counter and virtual counters.
> @@ -507,6 +578,7 @@ void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> continue;
> pmc = &kvpmu->pmc[i];
> pmc->idx = i;
> + pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> if (i < kvpmu->num_hw_ctrs) {
> pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
> if (i < 3)
> @@ -543,8 +615,10 @@ void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> pmc = &kvpmu->pmc[i];
> pmc->counter_val = 0;
> kvm_pmu_release_perf_event(pmc);
> + pmc->event_idx = SBI_PMU_EVENT_IDX_INVALID;
> }
> bitmap_zero(kvpmu->pmc_in_use, RISCV_MAX_COUNTERS);
> + memset(&kvpmu->fw_event, 0, SBI_PMU_FW_MAX * sizeof(struct kvm_fw_event));
> }
>
> void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> --
> 2.25.1
>

Thanks,
drew

2023-02-05 07:38:07

by Atish Patra

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Fri, Feb 3, 2023 at 12:47 AM Atish Patra <[email protected]> wrote:
>
> On Thu, Feb 2, 2023 at 9:03 AM Andrew Jones <[email protected]> wrote:
> >
> > On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > > This patch only adds barebone structure of perf implementation. Most of
> > > the function returns zero at this point and will be implemented
> > > fully in the future.
> > >
> > > Signed-off-by: Atish Patra <[email protected]>
> > > ---
> > > arch/riscv/include/asm/kvm_host.h | 4 +
> > > arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> > > arch/riscv/kvm/Makefile | 1 +
> > > arch/riscv/kvm/vcpu.c | 7 ++
> > > arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> > > 5 files changed, 226 insertions(+)
> > > create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> > >
> > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > > index 93f43a3..b90be9a 100644
> > > --- a/arch/riscv/include/asm/kvm_host.h
> > > +++ b/arch/riscv/include/asm/kvm_host.h
> > > @@ -18,6 +18,7 @@
> > > #include <asm/kvm_vcpu_insn.h>
> > > #include <asm/kvm_vcpu_sbi.h>
> > > #include <asm/kvm_vcpu_timer.h>
> > > +#include <asm/kvm_vcpu_pmu.h>
> > >
> > > #define KVM_MAX_VCPUS 1024
> > >
> > > @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
> > >
> > > /* Don't run the VCPU (blocked) */
> > > bool pause;
> > > +
> > > + /* Performance monitoring context */
> > > + struct kvm_pmu pmu_context;
> > > };
> > >
> > > static inline void kvm_arch_hardware_unsetup(void) {}
> > > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > new file mode 100644
> > > index 0000000..e2b4038
> > > --- /dev/null
> > > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > @@ -0,0 +1,78 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +/*
> > > + * Copyright (c) 2023 Rivos Inc
> > > + *
> > > + * Authors:
> > > + * Atish Patra <[email protected]>
> > > + */
> > > +
> > > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > > +#define __KVM_VCPU_RISCV_PMU_H
> > > +
> > > +#include <linux/perf/riscv_pmu.h>
> > > +#include <asm/kvm_vcpu_sbi.h>
> > > +#include <asm/sbi.h>
> > > +
> > > +#ifdef CONFIG_RISCV_PMU_SBI
> > > +#define RISCV_KVM_MAX_FW_CTRS 32
> > > +
> > > +#if RISCV_KVM_MAX_FW_CTRS > 32
> > > +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
> >
> > "The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"
> >
> > > +#endif
> > > +
> > > +#define RISCV_MAX_COUNTERS 64
> >
> > But instead of that message, what I think we need is something like
> >
> > #define RISCV_KVM_MAX_HW_CTRS 32
> > #define RISCV_KVM_MAX_FW_CTRS 32
> > #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
> >
> > static_assert(RISCV_MAX_COUNTERS <= 64)
> >
> > And then in pmu_sbi_device_probe() should ensure
> >
> > num_counters <= RISCV_MAX_COUNTERS
> >
> > and pmu_sbi_get_ctrinfo() should ensure
> >
> > num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
> > num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS
> >
> > which has to be done at runtime.
> >
>
> Sure. I will add the additional sanity checks.
>

As explained earlier in the thread, I feel we shouldn't mix up the number
of firmware counters the host gets with the number it exposes to a guest.
So I have not included this suggestion in v5.
I have changed num_fw_ctrs to SBI_PMU_FW_MAX though, to accurately
reflect the firmware counters KVM is actually using.
I don't know if there is any benefit to static_assert over #error.
Please let me know if you feel strongly about that.
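
One practical difference shows up once the bound is an enum constant
rather than a macro, which SBI_PMU_FW_MAX appears to be in asm/sbi.h.
The sketch below uses a hypothetical enum and macro to show it. It is
not code from the series.

```c
#include <assert.h>

/* Hypothetical stand-in for an enum of firmware event IDs. */
enum fw_event { FW_EVENT_A, FW_EVENT_B, FW_EVENT_MAX };

#define MAX_FW_CTRS 32

/*
 * Inside #if, an identifier that is not a macro evaluates to 0, so
 * this check can never fire for an enum constant, whatever its value.
 * (Compilers may warn about it when -Wundef is enabled.)
 */
#if FW_EVENT_MAX > MAX_FW_CTRS
#error "not reachable: the preprocessor sees FW_EVENT_MAX as 0"
#endif

/* static_assert is evaluated by the compiler and sees the real value. */
static_assert(FW_EVENT_MAX <= MAX_FW_CTRS,
	      "firmware events exceed firmware counters");

/* Same comparison at runtime, only so the result can be observed. */
static int fw_event_limit_ok(void)
{
	return FW_EVENT_MAX <= MAX_FW_CTRS;
}
```

For a bound that is a plain macro, such as RISCV_KVM_MAX_FW_CTRS, the
two forms behave the same.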


> > > +
> > > +/* Per virtual pmu counter data */
> > > +struct kvm_pmc {
> > > + u8 idx;
> > > + struct perf_event *perf_event;
> > > + uint64_t counter_val;
> > > + union sbi_pmu_ctr_info cinfo;
> > > + /* Event monitoring status */
> > > + bool started;
> > > +};
> > > +
> > > +/* PMU data structure per vcpu */
> > > +struct kvm_pmu {
> > > + struct kvm_pmc pmc[RISCV_MAX_COUNTERS];
> > > + /* Number of the virtual firmware counters available */
> > > + int num_fw_ctrs;
> > > + /* Number of the virtual hardware counters available */
> > > + int num_hw_ctrs;
> > > + /* A flag to indicate that pmu initialization is done */
> > > + bool init_done;
> > > + /* Bit map of all the virtual counter used */
> > > + DECLARE_BITMAP(pmc_in_use, RISCV_MAX_COUNTERS);
> > > +};
> > > +
> > > +#define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu_context)
> > > +#define pmu_to_vcpu(pmu) (container_of((pmu), struct kvm_vcpu, arch.pmu_context))
> > > +
> > > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata);
> > > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > > + struct kvm_vcpu_sbi_return *retdata);
> > > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > > + struct kvm_vcpu_sbi_return *retdata);
> > > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag,
> > > + struct kvm_vcpu_sbi_return *retdata);
> > > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag,
> > > + unsigned long eidx, uint64_t evtdata,
> > > + struct kvm_vcpu_sbi_return *retdata);
> >
> > s/flag/flags/ for all the above prototypes and all the implementations
> > below.
> >
>
> Fixed.
>
> > > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > > + struct kvm_vcpu_sbi_return *retdata);
> > > +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu);
> > > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu);
> > > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu);
> > > +
> > > +#else
> > > +struct kvm_pmu {
> > > +};
> > > +
> > > +static inline void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu) {}
> > > +static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
> > > +static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
> > > +#endif /* CONFIG_RISCV_PMU_SBI */
> > > +#endif /* !__KVM_VCPU_RISCV_PMU_H */
> > > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
> > > index 019df920..5de1053 100644
> > > --- a/arch/riscv/kvm/Makefile
> > > +++ b/arch/riscv/kvm/Makefile
> > > @@ -25,3 +25,4 @@ kvm-y += vcpu_sbi_base.o
> > > kvm-y += vcpu_sbi_replace.o
> > > kvm-y += vcpu_sbi_hsm.o
> > > kvm-y += vcpu_timer.o
> > > +kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
> > > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> > > index 7c08567..7d010b0 100644
> > > --- a/arch/riscv/kvm/vcpu.c
> > > +++ b/arch/riscv/kvm/vcpu.c
> > > @@ -138,6 +138,8 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
> > > WRITE_ONCE(vcpu->arch.irqs_pending, 0);
> > > WRITE_ONCE(vcpu->arch.irqs_pending_mask, 0);
> > >
> > > + kvm_riscv_vcpu_pmu_reset(vcpu);
> > > +
> > > vcpu->arch.hfence_head = 0;
> > > vcpu->arch.hfence_tail = 0;
> > > memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
> > > @@ -194,6 +196,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > > /* Setup VCPU timer */
> > > kvm_riscv_vcpu_timer_init(vcpu);
> > >
> > > + /* setup performance monitoring */
> > > + kvm_riscv_vcpu_pmu_init(vcpu);
> > > +
> > > /* Reset VCPU */
> > > kvm_riscv_reset_vcpu(vcpu);
> > >
> > > @@ -216,6 +221,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> > > /* Cleanup VCPU timer */
> > > kvm_riscv_vcpu_timer_deinit(vcpu);
> > >
> > > + kvm_riscv_vcpu_pmu_deinit(vcpu);
> > > +
> > > /* Free unused pages pre-allocated for G-stage page table mappings */
> > > kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> > > }
> > > diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
> > > new file mode 100644
> > > index 0000000..2dad37f
> > > --- /dev/null
> > > +++ b/arch/riscv/kvm/vcpu_pmu.c
> > > @@ -0,0 +1,136 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (c) 2023 Rivos Inc
> > > + *
> > > + * Authors:
> > > + * Atish Patra <[email protected]>
> > > + */
> > > +
> > > +#include <linux/errno.h>
> > > +#include <linux/err.h>
> > > +#include <linux/kvm_host.h>
> > > +#include <linux/perf/riscv_pmu.h>
> > > +#include <asm/csr.h>
> > > +#include <asm/kvm_vcpu_sbi.h>
> > > +#include <asm/kvm_vcpu_pmu.h>
> > > +#include <linux/kvm_host.h>
> > > +
> > > +#define kvm_pmu_num_counters(pmu) ((pmu)->num_hw_ctrs + (pmu)->num_fw_ctrs)
> > > +
> > > +int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > > +
> > > + retdata->out_val = kvm_pmu_num_counters(kvpmu);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
> > > + struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > > +
> > > + if (cidx > RISCV_MAX_COUNTERS || cidx == 1) {
> > > + retdata->err_val = SBI_ERR_INVALID_PARAM;
> > > + return 0;
> > > + }
> > > +
> > > + retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag, uint64_t ival,
> > > + struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + /* TODO */
> > > + return 0;
> > > +}
> > > +
> > > +int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag,
> > > + struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + /* TODO */
> > > + return 0;
> > > +}
> > > +
> > > +int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_base,
> > > + unsigned long ctr_mask, unsigned long flag,
> > > + unsigned long eidx, uint64_t evtdata,
> > > + struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + /* TODO */
> > > + return 0;
> > > +}
> > > +
> > > +int kvm_riscv_vcpu_pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
> > > + struct kvm_vcpu_sbi_return *retdata)
> > > +{
> > > + /* TODO */
> > > + return 0;
> > > +}
> > > +
> > > +void kvm_riscv_vcpu_pmu_init(struct kvm_vcpu *vcpu)
> > > +{
> > > + int i = 0, ret, num_hw_ctrs = 0, hpm_width = 0;
> > > + struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
> > > + struct kvm_pmc *pmc;
> > > +
> > > + ret = riscv_pmu_get_hpm_info(&hpm_width, &num_hw_ctrs);
> > > + if (ret < 0 || !hpm_width || !num_hw_ctrs)
> > > + return;
> > > +
> > > + /*
> > > > + * It is guaranteed that RISCV_KVM_MAX_FW_CTRS can't exceed 32 as
> > > + * that may exceed total number of counters more than RISCV_MAX_COUNTERS
> > > + */
> > > + kvpmu->num_hw_ctrs = num_hw_ctrs;
> > > + kvpmu->num_fw_ctrs = RISCV_KVM_MAX_FW_CTRS;
> >
> > If we sanity check that num_hw_ctrs <= 32 and num_fw_ctrs <= 32 at sbi_pmu
> > probe time, then we can also return num_fw_ctrs (or num_ctrs) along with
> > num_hw_ctrs from riscv_pmu_get_hpm_info(). Then, we can put the exact
> > number here into kvmpmu->num_fw_ctrs, rather than using its max.
> >
>
> The firmware counter information retrieved from PMU driver will be the
> number of firmware
> counter host supports (i.e. M-mode firmware supports). The number of
> counters supported for a
> guest is entirely up to the hypervisor. There shouldn't be any
> relation with the host's firmware counter.
>
> Looking at it again, we should probably set kvpmu->num_fw_ctrs to
> SBI_PMU_FW_MAX instead of RISCV_KVM_MAX_FW_CTRS.
> We already have a sanity check for SBI_PMU_FW_MAX in the code.
> > > +
> > > + /*
> > > + * There is no correlation between the logical hardware counter and virtual counters.
> > > + * However, we need to encode a hpmcounter CSR in the counter info field so that
> > > + * KVM can trap n emulate the read. This works well in the migration use case as
> > > + * KVM doesn't care if the actual hpmcounter is available in the hardware or not.
> > > + */
> > > + for (i = 0; i < kvm_pmu_num_counters(kvpmu); i++) {
> > > + /* TIME CSR shouldn't be read from perf interface */
> > > + if (i == 1)
> > > + continue;
> > > + pmc = &kvpmu->pmc[i];
> > > + pmc->idx = i;
> > > + if (i < kvpmu->num_hw_ctrs) {
> > > + pmc->cinfo.type = SBI_PMU_CTR_TYPE_HW;
> > > + if (i < 3)
> > > + /* CY, IR counters */
> > > + pmc->cinfo.width = 63;
> > > + else
> > > + pmc->cinfo.width = hpm_width;
> > > + /*
> > > + * The CSR number doesn't have any relation with the logical
> > > + * hardware counters. The CSR numbers are encoded sequentially
> > > + * to avoid maintaining a map between the virtual counter
> > > + * and CSR number.
> > > + */
> > > + pmc->cinfo.csr = CSR_CYCLE + i;
> > > + } else {
> > > + pmc->cinfo.type = SBI_PMU_CTR_TYPE_FW;
> > > + pmc->cinfo.width = BITS_PER_LONG - 1;
> > > + }
> > > + }
> > > +
> > > + kvpmu->init_done = true;
> > > +}
> > > +
> > > +void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu)
> > > +{
> > > + /* TODO */
> > > +}
> > > +
> > > +void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu)
> > > +{
> > > + kvm_riscv_vcpu_pmu_deinit(vcpu);
> > > +}
> > > --
> > > 2.25.1
> > >
> >
> > Thanks,
> > drew
>
>
>
> --
> Regards,
> Atish



--
Regards,
Atish

2023-02-06 09:22:13

by Andrew Jones

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Sat, Feb 04, 2023 at 11:37:47PM -0800, Atish Patra wrote:
> On Fri, Feb 3, 2023 at 12:47 AM Atish Patra <[email protected]> wrote:
> >
> > On Thu, Feb 2, 2023 at 9:03 AM Andrew Jones <[email protected]> wrote:
> > >
> > > On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > > > This patch only adds barebone structure of perf implementation. Most of
> > > > the function returns zero at this point and will be implemented
> > > > fully in the future.
> > > >
> > > > Signed-off-by: Atish Patra <[email protected]>
> > > > ---
> > > > arch/riscv/include/asm/kvm_host.h | 4 +
> > > > arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> > > > arch/riscv/kvm/Makefile | 1 +
> > > > arch/riscv/kvm/vcpu.c | 7 ++
> > > > arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> > > > 5 files changed, 226 insertions(+)
> > > > create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> > > >
> > > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > > > index 93f43a3..b90be9a 100644
> > > > --- a/arch/riscv/include/asm/kvm_host.h
> > > > +++ b/arch/riscv/include/asm/kvm_host.h
> > > > @@ -18,6 +18,7 @@
> > > > #include <asm/kvm_vcpu_insn.h>
> > > > #include <asm/kvm_vcpu_sbi.h>
> > > > #include <asm/kvm_vcpu_timer.h>
> > > > +#include <asm/kvm_vcpu_pmu.h>
> > > >
> > > > #define KVM_MAX_VCPUS 1024
> > > >
> > > > @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
> > > >
> > > > /* Don't run the VCPU (blocked) */
> > > > bool pause;
> > > > +
> > > > + /* Performance monitoring context */
> > > > + struct kvm_pmu pmu_context;
> > > > };
> > > >
> > > > static inline void kvm_arch_hardware_unsetup(void) {}
> > > > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > new file mode 100644
> > > > index 0000000..e2b4038
> > > > --- /dev/null
> > > > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > @@ -0,0 +1,78 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +/*
> > > > + * Copyright (c) 2023 Rivos Inc
> > > > + *
> > > > + * Authors:
> > > > + * Atish Patra <[email protected]>
> > > > + */
> > > > +
> > > > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > > > +#define __KVM_VCPU_RISCV_PMU_H
> > > > +
> > > > +#include <linux/perf/riscv_pmu.h>
> > > > +#include <asm/kvm_vcpu_sbi.h>
> > > > +#include <asm/sbi.h>
> > > > +
> > > > +#ifdef CONFIG_RISCV_PMU_SBI
> > > > +#define RISCV_KVM_MAX_FW_CTRS 32
> > > > +
> > > > +#if RISCV_KVM_MAX_FW_CTRS > 32
> > > > +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
> > >
> > > "The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"
> > >
> > > > +#endif
> > > > +
> > > > +#define RISCV_MAX_COUNTERS 64
> > >
> > > But instead of that message, what I think we need is something like
> > >
> > > #define RISCV_KVM_MAX_HW_CTRS 32
> > > #define RISCV_KVM_MAX_FW_CTRS 32
> > > #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
> > >
> > > static_assert(RISCV_MAX_COUNTERS <= 64)
> > >
> > > And then in pmu_sbi_device_probe() should ensure
> > >
> > > num_counters <= RISCV_MAX_COUNTERS
> > >
> > > and pmu_sbi_get_ctrinfo() should ensure
> > >
> > > num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
> > > num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS
> > >
> > > which has to be done at runtime.
> > >
> >
> > Sure. I will add the additional sanity checks.
> >
>
> As explained above, I feel we shouldn't mix the firmware number of
> counters that the host gets and it exposes to a guest.
> So I have not included this suggestion in the v5.
> I have changed the num_fw_ctrs to PMU_FW_MAX though to accurately
> reflect the firmware counters KVM is actually using.

Sounds good

> I don't know if there is any benefit of static_assert over #error.
> Please let me know if you feel strongly about that.

One "normal" line vs. three #-lines?
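
To make the comparison concrete, here is a small userspace sketch of both
forms. Everything prefixed with EXAMPLE_/example_ is hypothetical and only
stands in for the real definitions. It assumes the limit is derived from an
enum constant, which is where the two forms behave differently: the
preprocessor does not know enum values and treats the name as 0. In the kernel
static_assert() comes from <linux/build_bug.h> and the message is optional;
plain C11 via <assert.h> requires it.

```c
#include <assert.h> /* static_assert macro (C11) */

/* Hypothetical stand-in for an enum-terminated list of firmware events. */
enum example_fw_event {
    EXAMPLE_FW_EVENT_A,
    EXAMPLE_FW_EVENT_B,
    EXAMPLE_FW_MAX, /* number of firmware events: 2 */
};

#define EXAMPLE_MAX_FW_CTRS EXAMPLE_FW_MAX

/*
 * Preprocessor form: three lines. EXAMPLE_FW_MAX is an enum constant, not
 * a macro, so the preprocessor evaluates it as 0 and this check can never
 * fire, whatever the enum grows to.
 */
#if EXAMPLE_MAX_FW_CTRS > 32
#error "The number of firmware counters cannot exceed 32"
#endif

/* Compiler form: one line, evaluated with the real enum value. */
static_assert(EXAMPLE_MAX_FW_CTRS <= 32, "too many firmware counters");

/* Small accessor so the limit can be inspected at runtime. */
static inline unsigned int example_max_fw_ctrs(void)
{
    return EXAMPLE_MAX_FW_CTRS;
}
```

With a plain numeric #define both forms catch the same mistakes, so in that
case the difference is mostly the line count.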

Thanks,
drew

2023-02-06 11:40:05

by Andrew Jones

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Mon, Feb 06, 2023 at 10:22:04AM +0100, Andrew Jones wrote:
> On Sat, Feb 04, 2023 at 11:37:47PM -0800, Atish Patra wrote:
> > On Fri, Feb 3, 2023 at 12:47 AM Atish Patra <[email protected]> wrote:
> > >
> > > On Thu, Feb 2, 2023 at 9:03 AM Andrew Jones <[email protected]> wrote:
> > > >
> > > > On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > > > > This patch only adds barebone structure of perf implementation. Most of
> > > > > the function returns zero at this point and will be implemented
> > > > > fully in the future.
> > > > >
> > > > > Signed-off-by: Atish Patra <[email protected]>
> > > > > ---
> > > > > arch/riscv/include/asm/kvm_host.h | 4 +
> > > > > arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> > > > > arch/riscv/kvm/Makefile | 1 +
> > > > > arch/riscv/kvm/vcpu.c | 7 ++
> > > > > arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> > > > > 5 files changed, 226 insertions(+)
> > > > > create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> > > > >
> > > > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > > > > index 93f43a3..b90be9a 100644
> > > > > --- a/arch/riscv/include/asm/kvm_host.h
> > > > > +++ b/arch/riscv/include/asm/kvm_host.h
> > > > > @@ -18,6 +18,7 @@
> > > > > #include <asm/kvm_vcpu_insn.h>
> > > > > #include <asm/kvm_vcpu_sbi.h>
> > > > > #include <asm/kvm_vcpu_timer.h>
> > > > > +#include <asm/kvm_vcpu_pmu.h>
> > > > >
> > > > > #define KVM_MAX_VCPUS 1024
> > > > >
> > > > > @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
> > > > >
> > > > > /* Don't run the VCPU (blocked) */
> > > > > bool pause;
> > > > > +
> > > > > + /* Performance monitoring context */
> > > > > + struct kvm_pmu pmu_context;
> > > > > };
> > > > >
> > > > > static inline void kvm_arch_hardware_unsetup(void) {}
> > > > > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > new file mode 100644
> > > > > index 0000000..e2b4038
> > > > > --- /dev/null
> > > > > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > @@ -0,0 +1,78 @@
> > > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > > +/*
> > > > > + * Copyright (c) 2023 Rivos Inc
> > > > > + *
> > > > > + * Authors:
> > > > > + * Atish Patra <[email protected]>
> > > > > + */
> > > > > +
> > > > > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > > > > +#define __KVM_VCPU_RISCV_PMU_H
> > > > > +
> > > > > +#include <linux/perf/riscv_pmu.h>
> > > > > +#include <asm/kvm_vcpu_sbi.h>
> > > > > +#include <asm/sbi.h>
> > > > > +
> > > > > +#ifdef CONFIG_RISCV_PMU_SBI
> > > > > +#define RISCV_KVM_MAX_FW_CTRS 32
> > > > > +
> > > > > +#if RISCV_KVM_MAX_FW_CTRS > 32
> > > > > +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
> > > >
> > > > "The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"
> > > >
> > > > > +#endif
> > > > > +
> > > > > +#define RISCV_MAX_COUNTERS 64
> > > >
> > > > But instead of that message, what I think we need is something like
> > > >
> > > > #define RISCV_KVM_MAX_HW_CTRS 32
> > > > #define RISCV_KVM_MAX_FW_CTRS 32
> > > > #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
> > > >
> > > > static_assert(RISCV_MAX_COUNTERS <= 64)
> > > >
> > > > And then in pmu_sbi_device_probe() should ensure
> > > >
> > > > num_counters <= RISCV_MAX_COUNTERS
> > > >
> > > > and pmu_sbi_get_ctrinfo() should ensure
> > > >
> > > > num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
> > > > num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS
> > > >
> > > > which has to be done at runtime.
> > > >
> > >
> > > Sure. I will add the additional sanity checks.
> > >
> >
> > As explained above, I feel we shouldn't mix the firmware number of
> > counters that the host gets and it exposes to a guest.
> > So I have not included this suggestion in the v5.
> > I have changed the num_fw_ctrs to PMU_FW_MAX though to accurately
> > reflect the firmware counters KVM is actually using.
>
> Sounds good

I just looked at v5. IMO, much of what I proposed above still makes
sense, since what I'm proposing is that the relationship between
RISCV_KVM_MAX_HW_CTRS, RISCV_KVM_MAX_FW_CTRS, RISCV_MAX_COUNTERS, and 64
(our current max bitmap size) be explicitly checked. So, even if we want
RISCV_KVM_MAX_FW_CTRS to be SBI_PMU_FW_MAX, it'd be good to have

#define RISCV_KVM_MAX_HW_CTRS 32

(And a runtime check confirming num_hw_ctrs + 1 <= RISCV_KVM_MAX_HW_CTRS,
and then either silently capping or issuing a warning and capping)

And, to be sure the sum of RISCV_KVM_MAX_FW_CTRS and RISCV_KVM_MAX_HW_CTRS
doesn't exceed the size of the bitmap

#define RISCV_KVM_MAX_FW_CTRS SBI_PMU_FW_MAX
#define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
static_assert(RISCV_MAX_COUNTERS <= 64)
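
Put together, and only as a rough userspace sketch rather than the actual
patch: EXAMPLE_SBI_PMU_FW_MAX and its value are placeholders for
SBI_PMU_FW_MAX, and example_cap_hw_ctrs() is a hypothetical helper showing the
"warn and cap" variant of the runtime check. The "+ 1" simply mirrors the
check as written above.

```c
#include <assert.h> /* static_assert macro (C11) */
#include <stdio.h>

/* Placeholder value; the real limit would come from SBI_PMU_FW_MAX. */
#define EXAMPLE_SBI_PMU_FW_MAX 16

#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_FW_CTRS EXAMPLE_SBI_PMU_FW_MAX
#define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)

/* Build-time check: the in-use bitmap currently holds at most 64 counters. */
static_assert(RISCV_MAX_COUNTERS <= 64, "counters must fit the 64-bit bitmap");

/*
 * Runtime check (hypothetical helper): the host-reported number of hardware
 * counters is only known at probe/init time, so it is capped here such that
 * num_hw_ctrs + 1 <= RISCV_KVM_MAX_HW_CTRS always holds afterwards.
 */
int example_cap_hw_ctrs(int num_hw_ctrs)
{
    if (num_hw_ctrs + 1 > RISCV_KVM_MAX_HW_CTRS) {
        fprintf(stderr, "capping %d hardware counters to %d\n",
                num_hw_ctrs, RISCV_KVM_MAX_HW_CTRS - 1);
        return RISCV_KVM_MAX_HW_CTRS - 1;
    }

    return num_hw_ctrs;
}
```

The silent variant would be the same function without the fprintf().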

Thanks,
drew

>
> > I don't know if there is any benefit of static_assert over #error.
> > Please let me know if you feel strongly about that.
>
> One "normal" line vs. three #-lines?
>
> Thanks,
> drew

2023-02-07 09:20:59

by Atish Patra

Subject: Re: [PATCH v4 07/14] RISC-V: KVM: Add skeleton support for perf

On Mon, Feb 6, 2023 at 3:39 AM Andrew Jones <[email protected]> wrote:
>
> On Mon, Feb 06, 2023 at 10:22:04AM +0100, Andrew Jones wrote:
> > On Sat, Feb 04, 2023 at 11:37:47PM -0800, Atish Patra wrote:
> > > On Fri, Feb 3, 2023 at 12:47 AM Atish Patra <[email protected]> wrote:
> > > >
> > > > On Thu, Feb 2, 2023 at 9:03 AM Andrew Jones <[email protected]> wrote:
> > > > >
> > > > > On Wed, Feb 01, 2023 at 03:12:43PM -0800, Atish Patra wrote:
> > > > > > This patch only adds barebone structure of perf implementation. Most of
> > > > > > the function returns zero at this point and will be implemented
> > > > > > fully in the future.
> > > > > >
> > > > > > Signed-off-by: Atish Patra <[email protected]>
> > > > > > ---
> > > > > > arch/riscv/include/asm/kvm_host.h | 4 +
> > > > > > arch/riscv/include/asm/kvm_vcpu_pmu.h | 78 +++++++++++++++
> > > > > > arch/riscv/kvm/Makefile | 1 +
> > > > > > arch/riscv/kvm/vcpu.c | 7 ++
> > > > > > arch/riscv/kvm/vcpu_pmu.c | 136 ++++++++++++++++++++++++++
> > > > > > 5 files changed, 226 insertions(+)
> > > > > > create mode 100644 arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > > create mode 100644 arch/riscv/kvm/vcpu_pmu.c
> > > > > >
> > > > > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> > > > > > index 93f43a3..b90be9a 100644
> > > > > > --- a/arch/riscv/include/asm/kvm_host.h
> > > > > > +++ b/arch/riscv/include/asm/kvm_host.h
> > > > > > @@ -18,6 +18,7 @@
> > > > > > #include <asm/kvm_vcpu_insn.h>
> > > > > > #include <asm/kvm_vcpu_sbi.h>
> > > > > > #include <asm/kvm_vcpu_timer.h>
> > > > > > +#include <asm/kvm_vcpu_pmu.h>
> > > > > >
> > > > > > #define KVM_MAX_VCPUS 1024
> > > > > >
> > > > > > @@ -228,6 +229,9 @@ struct kvm_vcpu_arch {
> > > > > >
> > > > > > /* Don't run the VCPU (blocked) */
> > > > > > bool pause;
> > > > > > +
> > > > > > + /* Performance monitoring context */
> > > > > > + struct kvm_pmu pmu_context;
> > > > > > };
> > > > > >
> > > > > > static inline void kvm_arch_hardware_unsetup(void) {}
> > > > > > diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > > new file mode 100644
> > > > > > index 0000000..e2b4038
> > > > > > --- /dev/null
> > > > > > +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
> > > > > > @@ -0,0 +1,78 @@
> > > > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > > > +/*
> > > > > > + * Copyright (c) 2023 Rivos Inc
> > > > > > + *
> > > > > > + * Authors:
> > > > > > + * Atish Patra <[email protected]>
> > > > > > + */
> > > > > > +
> > > > > > +#ifndef __KVM_VCPU_RISCV_PMU_H
> > > > > > +#define __KVM_VCPU_RISCV_PMU_H
> > > > > > +
> > > > > > +#include <linux/perf/riscv_pmu.h>
> > > > > > +#include <asm/kvm_vcpu_sbi.h>
> > > > > > +#include <asm/sbi.h>
> > > > > > +
> > > > > > +#ifdef CONFIG_RISCV_PMU_SBI
> > > > > > +#define RISCV_KVM_MAX_FW_CTRS 32
> > > > > > +
> > > > > > +#if RISCV_KVM_MAX_FW_CTRS > 32
> > > > > > +#error "Maximum firmware counter can't exceed 32 without increasing the RISCV_MAX_COUNTERS"
> > > > >
> > > > > "The number of firmware counters cannot exceed 32 without increasing RISCV_MAX_COUNTERS"
> > > > >
> > > > > > +#endif
> > > > > > +
> > > > > > +#define RISCV_MAX_COUNTERS 64
> > > > >
> > > > > But instead of that message, what I think we need is something like
> > > > >
> > > > > #define RISCV_KVM_MAX_HW_CTRS 32
> > > > > #define RISCV_KVM_MAX_FW_CTRS 32
> > > > > #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
> > > > >
> > > > > static_assert(RISCV_MAX_COUNTERS <= 64)
> > > > >
> > > > > And then in pmu_sbi_device_probe() should ensure
> > > > >
> > > > > num_counters <= RISCV_MAX_COUNTERS
> > > > >
> > > > > and pmu_sbi_get_ctrinfo() should ensure
> > > > >
> > > > > num_hw_ctr <= RISCV_KVM_MAX_HW_CTRS
> > > > > num_fw_ctr <= RISCV_KVM_MAX_FW_CTRS
> > > > >
> > > > > which has to be done at runtime.
> > > > >
> > > >
> > > > Sure. I will add the additional sanity checks.
> > > >
> > >
> > > As explained above, I feel we shouldn't mix the firmware number of
> > > counters that the host gets and it exposes to a guest.
> > > So I have not included this suggestion in the v5.
> > > I have changed the num_fw_ctrs to PMU_FW_MAX though to accurately
> > > reflect the firmware counters KVM is actually using.
> >
> > Sounds good
>
> I just looked at v5. IMO, much of what I proposed above still makes
> sense, since what I'm proposing is that the relationship between
> RISCV_KVM_MAX_HW_CTRS, RISCV_KVM_MAX_FW_CTRS, RISCV_MAX_COUNTERS, and 64
> (our current max bitmap size) be explicitly checked. So, even if we want
> RISCV_KVM_MAX_FW_CTRS to be SBI_PMU_FW_MAX, it'd be good to have
>
> #define RISCV_KVM_MAX_HW_CTRS 32
>
> (And a runtime check confirming num_hw_ctrs + 1 <= RISCV_KVM_MAX_HW_CTRS,
> and then either silently capping or issuing a warning and capping)
>
> And, to be sure the sum of RISCV_KVM_MAX_FW_CTRS and RISCV_KVM_MAX_HW_CTRS
> doesn't exceed the size of the bitmap
>
> #define RISCV_KVM_MAX_FW_CTRS SBI_PMU_FW_MAX
> #define RISCV_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
> static_assert(RISCV_MAX_COUNTERS <= 64)
>

OK. I have added those changes, but I have renamed RISCV_MAX_COUNTERS
to RISCV_KVM_MAX_COUNTERS to avoid clashing with the RISCV_MAX_COUNTERS
already defined for the host.
Logically, the host and the guest can have separate counters anyway.

> Thanks,
> drew
>
> >
> > > I don't know if there is any benefit of static_assert over #error.
> > > Please let me know if you feel strongly about that.
> >
> > One "normal" line vs. three #-lines?
> >

Fair enough. Fixed.

> > Thanks,
> > drew



--
Regards,
Atish