Hello,
The goal of this series is to allow userspace to limit the number
of PMU event counters on the vCPU. We need this to support migration
across systems that implement different numbers of counters.
The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, its value will be the same as
the current PE by default. Even with the current KVM, userspace can
set PMCR_EL0.N for the vCPU to any value using KVM_SET_ONE_REG.
However, it is practically unsupported, as KVM resets PMCR_EL0.N
to the host value on vCPU reset and some KVM code uses the host
value to identify (un)implemented event counters on the vCPU.
This series will ensure that the PMCR_EL0.N value is preserved
on vCPU reset and that KVM doesn't use the host value
to identify (un)implemented event counters on the vCPU.
This allows userspace to limit the number of PMU event
counters on the vCPU.
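
As an illustration of the migration use case, here is a minimal userspace
sketch of how a VMM might compute the PMCR_EL0 value to write with
KVM_SET_ONE_REG. The helper names and the "smallest N in the migration
pool" policy are assumptions made for this example and are not part of
the series. Only the field layout (N in bits [15:11]) comes from the
architecture.

```c
#include <assert.h>
#include <stdint.h>

/* PMCR_EL0.N occupies bits [15:11] (architectural layout). */
#define PMCR_N_SHIFT 11
#define PMCR_N_MASK  0x1fULL

/* Hypothetical helper: extract the N field from a PMCR_EL0 value. */
static inline uint64_t pmcr_get_n(uint64_t pmcr)
{
	return (pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK;
}

/* Hypothetical helper: return pmcr with its N field replaced by n. */
static inline uint64_t pmcr_set_n(uint64_t pmcr, uint64_t n)
{
	assert(n <= PMCR_N_MASK);
	pmcr &= ~(PMCR_N_MASK << PMCR_N_SHIFT);
	return pmcr | (n << PMCR_N_SHIFT);
}

/*
 * Hypothetical policy: expose no more counters than the smallest host
 * in the migration pool implements.
 */
static inline uint64_t pmcr_limit_n(uint64_t pmcr, uint64_t pool_min_n)
{
	uint64_t n = pmcr_get_n(pmcr);

	return pmcr_set_n(pmcr, n < pool_min_n ? n : pool_min_n);
}
```

A VMM would read PMCR_EL0 with KVM_GET_ONE_REG, pass the value through
something like pmcr_limit_n(), and write the result back before the
first vCPU run.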
The series is based on kvmarm/next @0a3a1665cbc59 to include the
vCPU reset and feature flags cleanup/fixes series [1] and the
new sysreg definitions [2] that the selftests in this series utilize.
Patch 1 adds helper functions to set a PMU for the guest. These
helpers will make it easier for the following patches to modify
the code for that process.
Patch 2 sets the default PMU for the guest before the first
vCPU reset.
Patch 3 adds a helper to read vCPU's PMCR_EL0.
Patch 4 changes the code to use the guest's PMCR_EL0.N, instead
of the PE's PMCR_EL0.N.
Patch 5 adds userspace handlers for PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} that take the guest's PMCR.N into account.
Patch 6 sanitizes the PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} registers
before the first run of the guest based on the number of counters
configured.
Patch 7 adds support for userspace modifying PMCR_EL0.N.
Patches 8-13 add a selftest to verify reading and writing PMU registers
for implemented or unimplemented PMU event counters on the vCPU.
v8: Thanks, Oliver, Sebastian, and Eric for suggestions
- Drop v7 patches 3 and 4, and bring back initializing the
PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} registers with unknown
values. (Eric)
- Implement {get,set}_user callbacks for PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} registers. (Oliver)
- Sanitize PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} registers
before starting the first vCPU run. (Oliver)
- Rename kvm_vcpu_set_pmu() to kvm_setup_vcpu(). (Oliver)
- Rename kvm_arm_get_num_counters() to kvm_arm_pmu_get_max_counters()
and squash it into the caller's patch. (Oliver)
- In set_pmcr() implementation, do not initialize the pmcr register
with kvm_vcpu_read_pmcr(). (Oliver)
- Introduce test_create_vpmu_vm_with_pmcr_n() in the selftest to
carry the commonly used code of creating a VM and configuring
its PMCR.N field. (Eric)
- Add a selftest scenario to check the immutable behavior of
the registers. (Sebastian)
- Add a selftest scenario to check the valid behavior of
PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} registers when accessed
from userspace.
- Address other nits.
v7: Thanks, Oliver for the suggestions
- Rebase the series onto kvmarm/next.
- Move the logic to set the default PMU for the guest from
kvm_reset_vcpu() to __kvm_vcpu_set_target() to deal with the
error returned.
- Add a helper, kvm_arm_get_num_counters(), to read the number
of general-purpose counters.
- Use this helper to fix the error reported by kernel test robot [3].
v6: Thanks, Oliver and Shaoqin for the suggestions
- Split the previously defined kvm_arm_set_vm_pmu() into separate
functions: default arm_pmu and a caller requested arm_pmu.
- Send -EINVAL from kvm_reset_vcpu(), instead of -ENODEV for the
case where KVM fails to set a default arm_pmu, to remain consistent
with the existing behavior.
- Drop the v5 patch-5/12 that removes ARMV8_PMU_PMCR_N_MASK and adds
ARMV8_PMU_PMCR_N. Make corresponding changes to v5 patch-6/12.
- Do not introduce 'pmcr_n_limit' in kvm->arch as a member to
be accessed later in 'set_pmcr()'. Instead, directly obtain the
value by accessing the saved 'arm_pmu'.
- 'set_pmcr()' ignores the error when userspace tries to set PMCR.N
greater than the hardware limit to keep the existing API behavior.
- 'set_pmcr()' ignores modifications to the register after the VM has
started and returns a success to userspace.
- Introduce [get|set]_pmcr_n() helpers in the selftest to make
modifications to the field easier.
- Define the 'vpmu_vm' globally in the selftest, instead of allocating
it every time a VM is created.
- Use the new printf style __GUEST_ASSERT()s in the selftest.
v5:
https://lore.kernel.org/all/[email protected]/
- Drop the patches (v4 3,4) related to PMU version fixes as it's
now being handled in a separate series [4].
- Switch to config_lock, instead of kvm->lock, while configuring
the guest PMU.
- Instead of continuing after a WARN_ON() for the return value of
kvm_arm_set_vm_pmu() in kvm_arm_pmu_v3_set_pmu(), patch-1 now
returns from the function immediately with the error code.
- Fix WARN_ON() logic in kvm_host_pmu_init() (patch v4 9/14).
- Instead of returning 0, return -ENODEV from the
kvm_arm_set_vm_pmu() stub function.
- Do not define the PMEVN_CASE() and PMEVN_SWITCH() macros in
the selftest code as they are now included in the imported
arm_pmuv3.h header.
- Since the (initial) purpose of the selftest is to test the
accessibility of the counter registers, remove the functional
test at the end of test_access_pmc_regs(). It'll be added
later in a separate series.
- Introduce additional helper functions (destroy_vpmu_vm(),
PMC_ACC_TO_IDX()) in the selftest for ease of maintenance
and debugging.
v4:
https://lore.kernel.org/all/[email protected]/
- Fix the selftest bug in patch 13 (have test_access_pmc_regs()
specify the pmc index for test_bitmap_pmu_regs() instead of a
bit-shifted value). Thank you Raghavendra for reporting the issue!
v3:
https://lore.kernel.org/all/[email protected]/
- Remove reset_pmu_reg(), and use reset_val() instead. [Marc]
- Fixed the initial value of PMCR_EL0.N on heterogeneous
PMU systems. [Oliver]
- Fixed PMUVer issues on heterogeneous PMU systems.
- Fixed typos [Shaoqin]
v2:
https://lore.kernel.org/all/[email protected]/
- Added the sys_reg's set_user() handler for the PMCR_EL0 to
prevent userspace from setting PMCR_EL0.N for the vCPU to a value
that is greater than the host value (and added a new test
case for this behavior). [Oliver]
- Added to the commit log of patch 2 that PMUSERENR_EL0 and
PMCCFILTR_EL0 have UNKNOWN reset values.
v1:
https://lore.kernel.org/all/[email protected]/
Thank you.
Raghavendra
[1]: https://lore.kernel.org/all/[email protected]/
[2]: https://lore.kernel.org/all/[email protected]/
[3]: https://lore.kernel.org/all/[email protected]/
[4]: https://lore.kernel.org/all/[email protected]/
Raghavendra Rao Ananta (6):
KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR},
PMOVS{SET,CLR}
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first
run
tools: Import arm_pmuv3.h
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU test for immutability
Reiji Watanabe (7):
KVM: arm64: PMU: Introduce helpers to set the guest's PMU
KVM: arm64: PMU: Set the default PMU for the guest before vCPU reset
KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: selftests: aarch64: Introduce vpmu_counter_access test
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: vPMU register test for unimplemented counters
arch/arm64/include/asm/kvm_host.h | 3 +
arch/arm64/kvm/arm.c | 22 +-
arch/arm64/kvm/pmu-emul.c | 112 ++-
arch/arm64/kvm/sys_regs.c | 180 ++++-
include/kvm/arm_pmu.h | 20 +
tools/include/perf/arm_pmuv3.h | 308 ++++++++
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/vpmu_counter_access.c | 726 ++++++++++++++++++
.../selftests/kvm/include/aarch64/processor.h | 1 +
9 files changed, 1319 insertions(+), 54 deletions(-)
create mode 100644 tools/include/perf/arm_pmuv3.h
create mode 100644 tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
base-commit: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
Introduce new helper functions to set the guest's PMU
(kvm->arch.arm_pmu) either to a default probed instance or to a
caller requested one, and use it when the guest's PMU needs to
be set. These helpers will make it easier for the following
patches to modify the relevant code.
No functional change intended.
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
arch/arm64/kvm/pmu-emul.c | 50 +++++++++++++++++++++++++++------------
1 file changed, 35 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 3afb281ed8d2c..eb5dcb12dafe9 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -874,6 +874,36 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
return true;
}
+static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
+{
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ kvm->arch.arm_pmu = arm_pmu;
+}
+
+/**
+ * kvm_arm_set_default_pmu - No PMU set, get the default one.
+ * @kvm: The kvm pointer
+ *
+ * The observant among you will notice that the supported_cpus
+ * mask does not get updated for the default PMU even though it
+ * is quite possible the selected instance supports only a
+ * subset of cores in the system. This is intentional, and
+ * upholds the preexisting behavior on heterogeneous systems
+ * where vCPUs can be scheduled on any core but the guest
+ * counters could stop working.
+ */
+static int kvm_arm_set_default_pmu(struct kvm *kvm)
+{
+ struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu();
+
+ if (!arm_pmu)
+ return -ENODEV;
+
+ kvm_arm_set_pmu(kvm, arm_pmu);
+ return 0;
+}
+
static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
{
struct kvm *kvm = vcpu->kvm;
@@ -893,7 +923,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
break;
}
- kvm->arch.arm_pmu = arm_pmu;
+ kvm_arm_set_pmu(kvm, arm_pmu);
cpumask_copy(kvm->arch.supported_cpus, &arm_pmu->supported_cpus);
ret = 0;
break;
@@ -917,20 +947,10 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
return -EBUSY;
if (!kvm->arch.arm_pmu) {
- /*
- * No PMU set, get the default one.
- *
- * The observant among you will notice that the supported_cpus
- * mask does not get updated for the default PMU even though it
- * is quite possible the selected instance supports only a
- * subset of cores in the system. This is intentional, and
- * upholds the preexisting behavior on heterogeneous systems
- * where vCPUs can be scheduled on any core but the guest
- * counters could stop working.
- */
- kvm->arch.arm_pmu = kvm_pmu_probe_armpmu();
- if (!kvm->arch.arm_pmu)
- return -ENODEV;
+ int ret = kvm_arm_set_default_pmu(kvm);
+
+ if (ret)
+ return ret;
}
switch (attr->attr) {
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
The following patches will use the number of counters reported by
the arm_pmu to set PMCR.N for the guest during vCPU reset. However,
the guest is not associated with any arm_pmu until userspace
configures the vPMU device attributes, and a reset can happen
before that. Hence, assign a default PMU to the guest just before
doing the reset.
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/kvm/arm.c | 19 +++++++++++++++++++
arch/arm64/kvm/pmu-emul.c | 16 ++++------------
include/kvm/arm_pmu.h | 6 ++++++
3 files changed, 29 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c6cad400490f9..08c2f76983b9d 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1319,6 +1319,21 @@ static bool kvm_vcpu_init_changed(struct kvm_vcpu *vcpu,
KVM_VCPU_MAX_FEATURES);
}
+static int kvm_setup_vcpu(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+
+ /*
+ * When the vCPU has a PMU, but no PMU is set for the guest
+ * yet, set the default one.
+ */
+ if (kvm_vcpu_has_pmu(vcpu) && !kvm->arch.arm_pmu &&
+ kvm_arm_set_default_pmu(kvm))
+ return -EINVAL;
+
+ return 0;
+}
+
static int __kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
const struct kvm_vcpu_init *init)
{
@@ -1334,6 +1349,10 @@ static int __kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
bitmap_copy(kvm->arch.vcpu_features, &features, KVM_VCPU_MAX_FEATURES);
+ ret = kvm_setup_vcpu(vcpu);
+ if (ret)
+ goto out_unlock;
+
/* Now we know what it is, we can reset it. */
kvm_reset_vcpu(vcpu);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index eb5dcb12dafe9..66c244021ff08 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -717,10 +717,9 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
* It is still necessary to get a valid cpu, though, to probe for the
* default PMU instance as userspace is not required to specify a PMU
* type. In order to uphold the preexisting behavior KVM selects the
- * PMU instance for the core where the first call to the
- * KVM_ARM_VCPU_PMU_V3_CTRL attribute group occurs. A dependent use case
- * would be a user with disdain of all things big.LITTLE that affines
- * the VMM to a particular cluster of cores.
+ * PMU instance for the core just before the vcpu reset. A dependent use
+ * case would be a user with disdain of all things big.LITTLE that
+ * affines the VMM to a particular cluster of cores.
*
* In any case, userspace should just do the sane thing and use the UAPI
* to select a PMU type directly. But, be wary of the baggage being
@@ -893,7 +892,7 @@ static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
* where vCPUs can be scheduled on any core but the guest
* counters could stop working.
*/
-static int kvm_arm_set_default_pmu(struct kvm *kvm)
+int kvm_arm_set_default_pmu(struct kvm *kvm)
{
struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu();
@@ -946,13 +945,6 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
if (vcpu->arch.pmu.created)
return -EBUSY;
- if (!kvm->arch.arm_pmu) {
- int ret = kvm_arm_set_default_pmu(kvm);
-
- if (ret)
- return ret;
- }
-
switch (attr->attr) {
case KVM_ARM_VCPU_PMU_V3_IRQ: {
int __user *uaddr = (int __user *)(long)attr->addr;
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 3546ebc469ad7..858ed9ce828a6 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -101,6 +101,7 @@ void kvm_vcpu_pmu_resync_el0(void);
})
u8 kvm_arm_pmu_get_pmuver_limit(void);
+int kvm_arm_set_default_pmu(struct kvm *kvm);
#else
struct kvm_pmu {
@@ -174,6 +175,11 @@ static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
}
static inline void kvm_vcpu_pmu_resync_el0(void) {}
+static inline int kvm_arm_set_default_pmu(struct kvm *kvm)
+{
+ return -ENODEV;
+}
+
#endif
#endif
--
2.42.0.655.g421f12c284-goog
For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
Hence, to ensure that KVM's PMU emulation is correct, mask out the
bits in these registers for the unimplemented counters before the
first vCPU run.
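
The masking described above can be modelled outside the kernel roughly as
follows. This is a sketch under stated assumptions: the function names are
made up for the example, and the only facts relied on are that the cycle
counter uses bit 31 of these registers and that event counter i uses bit i.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLE_IDX 31	/* the cycle counter uses bit 31 of these registers */

/* Bits for the cycle counter plus event counters 0..n-1. */
static inline uint64_t valid_counter_mask(unsigned int n)
{
	uint64_t mask = 1ULL << CYCLE_IDX;

	assert(n <= 31);
	if (n)
		mask |= (1ULL << n) - 1;
	return mask;
}

/* Clear the bits of unimplemented counters in a saved register value. */
static inline uint64_t sanitize_counter_bits(uint64_t reg, unsigned int n)
{
	return reg & valid_counter_mask(n);
}
```

With n = 3, only bits 0-2 and bit 31 survive, which is what a guest
limited to three event counters should observe.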
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/kvm/arm.c | 2 +-
arch/arm64/kvm/pmu-emul.c | 11 +++++++++++
include/kvm/arm_pmu.h | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e3074a9e23a8b..3c0bb80483fb1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -857,7 +857,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
}
if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
- kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
+ kvm_vcpu_handle_request_reload_pmu(vcpu);
if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
kvm_vcpu_pmu_restore_guest(vcpu);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 9e24581206c24..31e4933293b76 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -788,6 +788,17 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
return val & mask;
}
+void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu)
+{
+ u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+
+ kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
+
+ __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask;
+ __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask;
+ __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask;
+}
+
int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
{
if (!kvm_vcpu_has_pmu(vcpu))
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 2e90f38090e6d..567dc288a5ddb 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -63,6 +63,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
u64 select_idx);
+void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu);
int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr);
int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu,
@@ -142,6 +143,7 @@ static inline void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val) {}
static inline void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val) {}
static inline void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu,
u64 data, u64 select_idx) {}
+static inline void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu) {}
static inline int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr)
{
--
2.42.0.655.g421f12c284-goog
The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, the value is set to the same
value as the current PE on every vCPU reset. Unless the vCPU is
pinned to PEs that have the PMU associated with the guest from the
initial vCPU reset, the value might be different from the PMU's
PMCR_EL0.N on heterogeneous PMU systems.
Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
value. Track the PMCR_EL0.N per guest, as only one PMU can be set
for the guest (PMCR_EL0.N must be the same for all vCPUs of the
guest), and it is convenient for updating the value.
To achieve this, the patch introduces a helper,
kvm_arm_pmu_get_max_counters(), that reads the maximum number of
counters from the arm_pmu associated with the VM. Make the function
global, as upcoming patches will need the value when setting
the PMCR.N of the guest from userspace.
KVM does not yet support userspace modifying PMCR_EL0.N.
The following patch will add support for that.
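
A standalone model of the scheme may help when reading the diff below.
It is only a sketch: the toy_* types and functions are invented for this
example and stand in for struct kvm, struct kvm_vcpu and the real helpers.
It shows the two ideas of this patch, namely deriving N from num_events
and substituting the per-guest N whenever the vCPU's PMCR_EL0 is read.

```c
#include <assert.h>
#include <stdint.h>

#define PMCR_N_SHIFT 11
#define PMCR_N_MASK  0x1fULL

/* Toy stand-ins for the per-VM and per-vCPU state (not the KVM structs). */
struct toy_vm   { uint8_t pmcr_n; };
struct toy_vcpu { struct toy_vm *vm; uint64_t pmcr; };

/* num_events counts the cycle counter too, so drop one for PMCR_EL0.N. */
static inline uint8_t toy_max_counters(int num_events)
{
	assert(num_events > 0 && num_events <= 32);
	return (uint8_t)(num_events - 1);
}

/* Stored N bits are ignored; the per-VM value is substituted on read. */
static inline uint64_t toy_read_pmcr(const struct toy_vcpu *vcpu)
{
	uint64_t pmcr = vcpu->pmcr & ~(PMCR_N_MASK << PMCR_N_SHIFT);

	return pmcr | ((uint64_t)vcpu->vm->pmcr_n << PMCR_N_SHIFT);
}
```

Because every read goes through one function, whatever N bits happen to be
stored in the per-vCPU register never leak to the guest or to userspace.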
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 3 +++
arch/arm64/kvm/pmu-emul.c | 26 +++++++++++++++++++++++++-
arch/arm64/kvm/sys_regs.c | 28 ++++++++++++++--------------
include/kvm/arm_pmu.h | 6 ++++++
4 files changed, 48 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 846a7706e925c..5653d3553e3ee 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -290,6 +290,9 @@ struct kvm_arch {
cpumask_var_t supported_cpus;
+ /* PMCR_EL0.N value for the guest */
+ u8 pmcr_n;
+
/* Hypercall features firmware registers' descriptor */
struct kvm_smccc_features smccc_feat;
struct maple_tree smccc_filter;
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 097bf7122130d..9e24581206c24 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -690,6 +690,9 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
if (!entry)
goto out_unlock;
+ WARN_ON((pmu->num_events <= 0) ||
+ (pmu->num_events > ARMV8_PMU_MAX_COUNTERS));
+
entry->arm_pmu = pmu;
list_add_tail(&entry->entry, &arm_pmus);
@@ -873,11 +876,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
return true;
}
+/**
+ * kvm_arm_pmu_get_max_counters - Return the max number of PMU counters.
+ * @kvm: The kvm pointer
+ */
+int kvm_arm_pmu_get_max_counters(struct kvm *kvm)
+{
+ struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
+
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ /*
+ * The arm_pmu->num_events considers the cycle counter as well.
+ * Ignore that and return only the general-purpose counters.
+ */
+ return arm_pmu->num_events - 1;
+}
+
static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
{
lockdep_assert_held(&kvm->arch.config_lock);
kvm->arch.arm_pmu = arm_pmu;
+ kvm->arch.pmcr_n = kvm_arm_pmu_get_max_counters(kvm);
}
/**
@@ -1091,5 +1112,8 @@ u8 kvm_arm_pmu_get_pmuver_limit(void)
*/
u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
{
- return __vcpu_sys_reg(vcpu, PMCR_EL0);
+ u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0) &
+ ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
+
+ return pmcr | ((u64)vcpu->kvm->arch.pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a31cecb3d29fb..faf97878dfbbb 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -721,12 +721,7 @@ static u64 reset_pmu_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
u64 n, mask = BIT(ARMV8_PMU_CYCLE_IDX);
- /* No PMU available, any PMU reg may UNDEF... */
- if (!kvm_arm_support_pmu_v3())
- return 0;
-
- n = read_sysreg(pmcr_el0) >> ARMV8_PMU_PMCR_N_SHIFT;
- n &= ARMV8_PMU_PMCR_N_MASK;
+ n = vcpu->kvm->arch.pmcr_n;
if (n)
mask |= GENMASK(n - 1, 0);
@@ -762,17 +757,15 @@ static u64 reset_pmselr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
static u64 reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
- u64 pmcr;
+ u64 pmcr = 0;
- /* No PMU available, PMCR_EL0 may UNDEF... */
- if (!kvm_arm_support_pmu_v3())
- return 0;
-
- /* Only preserve PMCR_EL0.N, and reset the rest to 0 */
- pmcr = read_sysreg(pmcr_el0) & (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
if (!kvm_supports_32bit_el0())
pmcr |= ARMV8_PMU_PMCR_LC;
+ /*
+ * The value of PMCR.N field is included when the
+ * vCPU register is read via kvm_vcpu_read_pmcr().
+ */
__vcpu_sys_reg(vcpu, r->reg) = pmcr;
return __vcpu_sys_reg(vcpu, r->reg);
@@ -1103,6 +1096,13 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
+static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ *val = kvm_vcpu_read_pmcr(vcpu);
+ return 0;
+}
+
/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
#define DBG_BCR_BVR_WCR_WVR_EL1(n) \
{ SYS_DESC(SYS_DBGBVRn_EL1(n)), \
@@ -2235,7 +2235,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_SVCR), undef_access },
{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
- .reset = reset_pmcr, .reg = PMCR_EL0 },
+ .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
{ PMU_SYS_REG(PMCNTENSET_EL0),
.access = access_pmcnten, .reg = PMCNTENSET_EL0 },
{ PMU_SYS_REG(PMCNTENCLR_EL0),
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index cd980d78b86b5..2e90f38090e6d 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -102,6 +102,7 @@ void kvm_vcpu_pmu_resync_el0(void);
u8 kvm_arm_pmu_get_pmuver_limit(void);
int kvm_arm_set_default_pmu(struct kvm *kvm);
+int kvm_arm_pmu_get_max_counters(struct kvm *kvm);
u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu);
#else
@@ -181,6 +182,11 @@ static inline int kvm_arm_set_default_pmu(struct kvm *kvm)
return -ENODEV;
}
+static inline int kvm_arm_pmu_get_max_counters(struct kvm *kvm)
+{
+ return -ENODEV;
+}
+
static inline u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
{
return 0;
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
KVM does not yet support userspace modifying PMCR_EL0.N (with
the previous patch, KVM ignores what is written by userspace).
Add support for userspace limiting PMCR_EL0.N.
Prevent userspace from setting PMCR_EL0.N to a value that is greater
than the host value, as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns success in these two cases.
Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).
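
The write rules above can be summarised with a small userspace model.
This is a simplified sketch, not the kernel code: the toy_vm type and
toy_set_pmcr() are hypothetical, locking is omitted, and the RES1
handling of the LC bit is left out. It assumes, as the comment in the
diff states, that only PMCR.N and bits [7:0] are mutable from userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMCR_N_SHIFT  11
#define PMCR_N_MASK   0x1fULL
#define PMCR_LOW_MASK 0xffULL	/* bits [7:0] */

/* Toy per-VM state (not struct kvm). */
struct toy_vm {
	uint8_t pmcr_n;		/* N currently exposed to the guest */
	uint8_t pmcr_n_limit;	/* counters the host PMU implements */
	bool has_run;		/* a vCPU has already been run */
};

/* Returns the value to store in the per-vCPU register; never fails. */
static inline uint64_t toy_set_pmcr(struct toy_vm *vm, uint64_t cur,
				    uint64_t val)
{
	const uint64_t mutable_mask = PMCR_LOW_MASK |
				      (PMCR_N_MASK << PMCR_N_SHIFT);
	uint64_t new_n;

	assert(vm->pmcr_n <= vm->pmcr_n_limit);

	/* Immutable once the VM has run: silently ignore the write. */
	if (vm->has_run)
		return cur;

	/* An N above the host limit is ignored rather than rejected. */
	new_n = (val >> PMCR_N_SHIFT) & PMCR_N_MASK;
	if (new_n <= vm->pmcr_n_limit)
		vm->pmcr_n = (uint8_t)new_n;

	/* Keep the non-mutable bits of the current value. */
	return (val & mutable_mask) | (cur & ~mutable_mask);
}
```

Note that both rejected cases look like success to the caller, so
userspace has to read the register back to learn which N took effect.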
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
1 file changed, 55 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e5d497596ef8..a2c5f210b3d6b 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1176,6 +1176,59 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
return 0;
}
+static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 val)
+{
+ struct kvm *kvm = vcpu->kvm;
+ u64 new_n, mutable_mask;
+
+ mutex_lock(&kvm->arch.config_lock);
+
+ /*
+ * Make PMCR immutable once the VM has started running, but
+ * do not return an error to meet the existing expectations.
+ */
+ if (kvm_vm_has_ran_once(vcpu->kvm)) {
+ mutex_unlock(&kvm->arch.config_lock);
+ return 0;
+ }
+
+ new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
+ if (new_n != kvm->arch.pmcr_n) {
+ u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);
+
+ /*
+ * The vCPU can't have more counters than the PMU hardware
+ * implements. Ignore this error to maintain compatibility
+ * with the existing KVM behavior.
+ */
+ if (new_n <= pmcr_n_limit)
+ kvm->arch.pmcr_n = new_n;
+ }
+ mutex_unlock(&kvm->arch.config_lock);
+
+ /*
+ * Ignore writes to RES0 bits, read only bits that are cleared on
+ * vCPU reset, and writable bits that KVM doesn't support yet.
+ * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
+ * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
+ * But, we leave the bit as it is here, as the vCPU's PMUver might
+ * be changed later (NOTE: the bit will be cleared on first vCPU run
+ * if necessary).
+ */
+ mutable_mask = (ARMV8_PMU_PMCR_MASK |
+ (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));
+ val &= mutable_mask;
+ val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
+
+ /* The LC bit is RES1 when AArch32 is not supported */
+ if (!kvm_supports_32bit_el0())
+ val |= ARMV8_PMU_PMCR_LC;
+
+ __vcpu_sys_reg(vcpu, r->reg) = val;
+ return 0;
+}
+
/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
#define DBG_BCR_BVR_WCR_WVR_EL1(n) \
{ SYS_DESC(SYS_DBGBVRn_EL1(n)), \
@@ -2309,8 +2362,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_CTR_EL0), access_ctr },
{ SYS_DESC(SYS_SVCR), undef_access },
- { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
- .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
+ { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
+ .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
{ PMU_SYS_REG(PMCNTENSET_EL0),
.access = access_pmcnten, .reg = PMCNTENSET_EL0,
.get_user = get_pmcnten, .set_user = set_pmcnten },
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
reads a vCPU's PMCR_EL0.
Currently, the PMCR_EL0 value is tracked per vCPU. The following
patches will track (only) PMCR_EL0.N per guest. Having the
new helper will be useful to combine the PMCR_EL0.N field
(tracked per guest) and the other fields (tracked per vCPU)
to provide the value of PMCR_EL0.
No functional change intended.
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
arch/arm64/kvm/arm.c | 3 +--
arch/arm64/kvm/pmu-emul.c | 21 +++++++++++++++------
arch/arm64/kvm/sys_regs.c | 6 +++---
include/kvm/arm_pmu.h | 6 ++++++
4 files changed, 25 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 08c2f76983b9d..e3074a9e23a8b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -857,8 +857,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
}
if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
- kvm_pmu_handle_pmcr(vcpu,
- __vcpu_sys_reg(vcpu, PMCR_EL0));
+ kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
kvm_vcpu_pmu_restore_guest(vcpu);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 66c244021ff08..097bf7122130d 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -72,7 +72,7 @@ static bool kvm_pmc_is_64bit(struct kvm_pmc *pmc)
static bool kvm_pmc_has_64bit_overflow(struct kvm_pmc *pmc)
{
- u64 val = __vcpu_sys_reg(kvm_pmc_to_vcpu(pmc), PMCR_EL0);
+ u64 val = kvm_vcpu_read_pmcr(kvm_pmc_to_vcpu(pmc));
return (pmc->idx < ARMV8_PMU_CYCLE_IDX && (val & ARMV8_PMU_PMCR_LP)) ||
(pmc->idx == ARMV8_PMU_CYCLE_IDX && (val & ARMV8_PMU_PMCR_LC));
@@ -250,7 +250,7 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu)
u64 kvm_pmu_valid_counter_mask(struct kvm_vcpu *vcpu)
{
- u64 val = __vcpu_sys_reg(vcpu, PMCR_EL0) >> ARMV8_PMU_PMCR_N_SHIFT;
+ u64 val = kvm_vcpu_read_pmcr(vcpu) >> ARMV8_PMU_PMCR_N_SHIFT;
val &= ARMV8_PMU_PMCR_N_MASK;
if (val == 0)
@@ -272,7 +272,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_vcpu_has_pmu(vcpu))
return;
- if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E) || !val)
+ if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) || !val)
return;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
@@ -324,7 +324,7 @@ static u64 kvm_pmu_overflow_status(struct kvm_vcpu *vcpu)
{
u64 reg = 0;
- if ((__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E)) {
+ if ((kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E)) {
reg = __vcpu_sys_reg(vcpu, PMOVSSET_EL0);
reg &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
reg &= __vcpu_sys_reg(vcpu, PMINTENSET_EL1);
@@ -426,7 +426,7 @@ static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu,
{
int i;
- if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
+ if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E))
return;
/* Weed out disabled counters */
@@ -569,7 +569,7 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
{
struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
- return (__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E) &&
+ return (kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) &&
(__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & BIT(pmc->idx));
}
@@ -1084,3 +1084,12 @@ u8 kvm_arm_pmu_get_pmuver_limit(void)
ID_AA64DFR0_EL1_PMUVer_V3P5);
return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), tmp);
}
+
+/**
+ * kvm_vcpu_read_pmcr - Read PMCR_EL0 register for the vCPU
+ * @vcpu: The vcpu pointer
+ */
+u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
+{
+ return __vcpu_sys_reg(vcpu, PMCR_EL0);
+}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index ce1bb97d35176..a31cecb3d29fb 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -822,7 +822,7 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
* Only update writeable bits of PMCR (continuing into
* kvm_pmu_handle_pmcr() as well)
*/
- val = __vcpu_sys_reg(vcpu, PMCR_EL0);
+ val = kvm_vcpu_read_pmcr(vcpu);
val &= ~ARMV8_PMU_PMCR_MASK;
val |= p->regval & ARMV8_PMU_PMCR_MASK;
if (!kvm_supports_32bit_el0())
@@ -830,7 +830,7 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
kvm_pmu_handle_pmcr(vcpu, val);
} else {
/* PMCR.P & PMCR.C are RAZ */
- val = __vcpu_sys_reg(vcpu, PMCR_EL0)
+ val = kvm_vcpu_read_pmcr(vcpu)
& ~(ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C);
p->regval = val;
}
@@ -879,7 +879,7 @@ static bool pmu_counter_idx_valid(struct kvm_vcpu *vcpu, u64 idx)
{
u64 pmcr, val;
- pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0);
+ pmcr = kvm_vcpu_read_pmcr(vcpu);
val = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX) {
kvm_inject_undefined(vcpu);
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 858ed9ce828a6..cd980d78b86b5 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -103,6 +103,7 @@ void kvm_vcpu_pmu_resync_el0(void);
u8 kvm_arm_pmu_get_pmuver_limit(void);
int kvm_arm_set_default_pmu(struct kvm *kvm);
+u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu);
#else
struct kvm_pmu {
};
@@ -180,6 +181,11 @@ static inline int kvm_arm_set_default_pmu(struct kvm *kvm)
return -ENODEV;
}
+static inline u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
+{
+ return 0;
+}
+
#endif
#endif
--
2.42.0.655.g421f12c284-goog
Import kernel's include/linux/perf/arm_pmuv3.h, with the
definition of PMEVN_SWITCH() additionally including an assert()
for the 'default' case. The following patches will use macros
defined in this header.
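
As a rough illustration of why the 'default' case matters, the sketch
below models the PMEVN_SWITCH() pattern in plain userspace C. System
register names are pasted together at compile time, so a runtime index
has to be dispatched through a switch. The fake_pmevcntr[] array,
RETURN_READ_FAKE() and read_fake_pmevcntr() are hypothetical stand-ins
(not part of the imported header), and the switch is shortened to four
cases where the real macro covers 0..30.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PMEVCNTR<n>_EL0 registers. */
static uint64_t fake_pmevcntr[4];

#define PMEVN_CASE(n, case_macro) \
	case n: case_macro(n); break

/* Shortened to four cases for illustration only. */
#define PMEVN_SWITCH(x, case_macro)			\
	do {						\
		switch (x) {				\
		PMEVN_CASE(0, case_macro);		\
		PMEVN_CASE(1, case_macro);		\
		PMEVN_CASE(2, case_macro);		\
		PMEVN_CASE(3, case_macro);		\
		default:				\
			/* An out-of-range index is a test bug */ \
			assert(0);			\
		}					\
	} while (0)

/* The case macro receives the index as a literal token. */
#define RETURN_READ_FAKE(n) \
	return fake_pmevcntr[n]

static uint64_t read_fake_pmevcntr(int n)
{
	PMEVN_SWITCH(n, RETURN_READ_FAKE);
	/* Only reached if assert() is compiled out and n is invalid */
	return 0;
}
```

With assert() in the default case, an invalid index aborts the caller
right away instead of silently returning 0.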
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
tools/include/perf/arm_pmuv3.h | 308 +++++++++++++++++++++++++++++++++
1 file changed, 308 insertions(+)
create mode 100644 tools/include/perf/arm_pmuv3.h
diff --git a/tools/include/perf/arm_pmuv3.h b/tools/include/perf/arm_pmuv3.h
new file mode 100644
index 0000000000000..e822d49fb5b88
--- /dev/null
+++ b/tools/include/perf/arm_pmuv3.h
@@ -0,0 +1,308 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __PERF_ARM_PMUV3_H
+#define __PERF_ARM_PMUV3_H
+
+#include <assert.h>
+#include <asm/bug.h>
+
+#define ARMV8_PMU_MAX_COUNTERS 32
+#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)
+
+/*
+ * Common architectural and microarchitectural event numbers.
+ */
+#define ARMV8_PMUV3_PERFCTR_SW_INCR 0x0000
+#define ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL 0x0001
+#define ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL 0x0002
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL 0x0003
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE 0x0004
+#define ARMV8_PMUV3_PERFCTR_L1D_TLB_REFILL 0x0005
+#define ARMV8_PMUV3_PERFCTR_LD_RETIRED 0x0006
+#define ARMV8_PMUV3_PERFCTR_ST_RETIRED 0x0007
+#define ARMV8_PMUV3_PERFCTR_INST_RETIRED 0x0008
+#define ARMV8_PMUV3_PERFCTR_EXC_TAKEN 0x0009
+#define ARMV8_PMUV3_PERFCTR_EXC_RETURN 0x000A
+#define ARMV8_PMUV3_PERFCTR_CID_WRITE_RETIRED 0x000B
+#define ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED 0x000C
+#define ARMV8_PMUV3_PERFCTR_BR_IMMED_RETIRED 0x000D
+#define ARMV8_PMUV3_PERFCTR_BR_RETURN_RETIRED 0x000E
+#define ARMV8_PMUV3_PERFCTR_UNALIGNED_LDST_RETIRED 0x000F
+#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED 0x0010
+#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES 0x0011
+#define ARMV8_PMUV3_PERFCTR_BR_PRED 0x0012
+#define ARMV8_PMUV3_PERFCTR_MEM_ACCESS 0x0013
+#define ARMV8_PMUV3_PERFCTR_L1I_CACHE 0x0014
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_WB 0x0015
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE 0x0016
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_REFILL 0x0017
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_WB 0x0018
+#define ARMV8_PMUV3_PERFCTR_BUS_ACCESS 0x0019
+#define ARMV8_PMUV3_PERFCTR_MEMORY_ERROR 0x001A
+#define ARMV8_PMUV3_PERFCTR_INST_SPEC 0x001B
+#define ARMV8_PMUV3_PERFCTR_TTBR_WRITE_RETIRED 0x001C
+#define ARMV8_PMUV3_PERFCTR_BUS_CYCLES 0x001D
+#define ARMV8_PMUV3_PERFCTR_CHAIN 0x001E
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_ALLOCATE 0x001F
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_ALLOCATE 0x0020
+#define ARMV8_PMUV3_PERFCTR_BR_RETIRED 0x0021
+#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED 0x0022
+#define ARMV8_PMUV3_PERFCTR_STALL_FRONTEND 0x0023
+#define ARMV8_PMUV3_PERFCTR_STALL_BACKEND 0x0024
+#define ARMV8_PMUV3_PERFCTR_L1D_TLB 0x0025
+#define ARMV8_PMUV3_PERFCTR_L1I_TLB 0x0026
+#define ARMV8_PMUV3_PERFCTR_L2I_CACHE 0x0027
+#define ARMV8_PMUV3_PERFCTR_L2I_CACHE_REFILL 0x0028
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_ALLOCATE 0x0029
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL 0x002A
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE 0x002B
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_WB 0x002C
+#define ARMV8_PMUV3_PERFCTR_L2D_TLB_REFILL 0x002D
+#define ARMV8_PMUV3_PERFCTR_L2I_TLB_REFILL 0x002E
+#define ARMV8_PMUV3_PERFCTR_L2D_TLB 0x002F
+#define ARMV8_PMUV3_PERFCTR_L2I_TLB 0x0030
+#define ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS 0x0031
+#define ARMV8_PMUV3_PERFCTR_LL_CACHE 0x0032
+#define ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS 0x0033
+#define ARMV8_PMUV3_PERFCTR_DTLB_WALK 0x0034
+#define ARMV8_PMUV3_PERFCTR_ITLB_WALK 0x0035
+#define ARMV8_PMUV3_PERFCTR_LL_CACHE_RD 0x0036
+#define ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD 0x0037
+#define ARMV8_PMUV3_PERFCTR_REMOTE_ACCESS_RD 0x0038
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_LMISS_RD 0x0039
+#define ARMV8_PMUV3_PERFCTR_OP_RETIRED 0x003A
+#define ARMV8_PMUV3_PERFCTR_OP_SPEC 0x003B
+#define ARMV8_PMUV3_PERFCTR_STALL 0x003C
+#define ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND 0x003D
+#define ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND 0x003E
+#define ARMV8_PMUV3_PERFCTR_STALL_SLOT 0x003F
+
+/* Statistical profiling extension microarchitectural events */
+#define ARMV8_SPE_PERFCTR_SAMPLE_POP 0x4000
+#define ARMV8_SPE_PERFCTR_SAMPLE_FEED 0x4001
+#define ARMV8_SPE_PERFCTR_SAMPLE_FILTRATE 0x4002
+#define ARMV8_SPE_PERFCTR_SAMPLE_COLLISION 0x4003
+
+/* AMUv1 architecture events */
+#define ARMV8_AMU_PERFCTR_CNT_CYCLES 0x4004
+#define ARMV8_AMU_PERFCTR_STALL_BACKEND_MEM 0x4005
+
+/* long-latency read miss events */
+#define ARMV8_PMUV3_PERFCTR_L1I_CACHE_LMISS 0x4006
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_LMISS_RD 0x4009
+#define ARMV8_PMUV3_PERFCTR_L2I_CACHE_LMISS 0x400A
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD 0x400B
+
+/* Trace buffer events */
+#define ARMV8_PMUV3_PERFCTR_TRB_WRAP 0x400C
+#define ARMV8_PMUV3_PERFCTR_TRB_TRIG 0x400E
+
+/* Trace unit events */
+#define ARMV8_PMUV3_PERFCTR_TRCEXTOUT0 0x4010
+#define ARMV8_PMUV3_PERFCTR_TRCEXTOUT1 0x4011
+#define ARMV8_PMUV3_PERFCTR_TRCEXTOUT2 0x4012
+#define ARMV8_PMUV3_PERFCTR_TRCEXTOUT3 0x4013
+#define ARMV8_PMUV3_PERFCTR_CTI_TRIGOUT4 0x4018
+#define ARMV8_PMUV3_PERFCTR_CTI_TRIGOUT5 0x4019
+#define ARMV8_PMUV3_PERFCTR_CTI_TRIGOUT6 0x401A
+#define ARMV8_PMUV3_PERFCTR_CTI_TRIGOUT7 0x401B
+
+/* additional latency from alignment events */
+#define ARMV8_PMUV3_PERFCTR_LDST_ALIGN_LAT 0x4020
+#define ARMV8_PMUV3_PERFCTR_LD_ALIGN_LAT 0x4021
+#define ARMV8_PMUV3_PERFCTR_ST_ALIGN_LAT 0x4022
+
+/* Armv8.5 Memory Tagging Extension events */
+#define ARMV8_MTE_PERFCTR_MEM_ACCESS_CHECKED 0x4024
+#define ARMV8_MTE_PERFCTR_MEM_ACCESS_CHECKED_RD 0x4025
+#define ARMV8_MTE_PERFCTR_MEM_ACCESS_CHECKED_WR 0x4026
+
+/* ARMv8 recommended implementation defined event types */
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_RD 0x0040
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WR 0x0041
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_RD 0x0042
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_WR 0x0043
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_INNER 0x0044
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_OUTER 0x0045
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_VICTIM 0x0046
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_CLEAN 0x0047
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_INVAL 0x0048
+
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_RD 0x004C
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_WR 0x004D
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_RD 0x004E
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_WR 0x004F
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_RD 0x0050
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WR 0x0051
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_RD 0x0052
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_WR 0x0053
+
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_VICTIM 0x0056
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_CLEAN 0x0057
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_INVAL 0x0058
+
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_RD 0x005C
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_WR 0x005D
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_RD 0x005E
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_WR 0x005F
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_RD 0x0060
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_WR 0x0061
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_SHARED 0x0062
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NOT_SHARED 0x0063
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NORMAL 0x0064
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_PERIPH 0x0065
+#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_RD 0x0066
+#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_WR 0x0067
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LD_SPEC 0x0068
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_ST_SPEC 0x0069
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LDST_SPEC 0x006A
+
+#define ARMV8_IMPDEF_PERFCTR_LDREX_SPEC 0x006C
+#define ARMV8_IMPDEF_PERFCTR_STREX_PASS_SPEC 0x006D
+#define ARMV8_IMPDEF_PERFCTR_STREX_FAIL_SPEC 0x006E
+#define ARMV8_IMPDEF_PERFCTR_STREX_SPEC 0x006F
+#define ARMV8_IMPDEF_PERFCTR_LD_SPEC 0x0070
+#define ARMV8_IMPDEF_PERFCTR_ST_SPEC 0x0071
+#define ARMV8_IMPDEF_PERFCTR_LDST_SPEC 0x0072
+#define ARMV8_IMPDEF_PERFCTR_DP_SPEC 0x0073
+#define ARMV8_IMPDEF_PERFCTR_ASE_SPEC 0x0074
+#define ARMV8_IMPDEF_PERFCTR_VFP_SPEC 0x0075
+#define ARMV8_IMPDEF_PERFCTR_PC_WRITE_SPEC 0x0076
+#define ARMV8_IMPDEF_PERFCTR_CRYPTO_SPEC 0x0077
+#define ARMV8_IMPDEF_PERFCTR_BR_IMMED_SPEC 0x0078
+#define ARMV8_IMPDEF_PERFCTR_BR_RETURN_SPEC 0x0079
+#define ARMV8_IMPDEF_PERFCTR_BR_INDIRECT_SPEC 0x007A
+
+#define ARMV8_IMPDEF_PERFCTR_ISB_SPEC 0x007C
+#define ARMV8_IMPDEF_PERFCTR_DSB_SPEC 0x007D
+#define ARMV8_IMPDEF_PERFCTR_DMB_SPEC 0x007E
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_UNDEF 0x0081
+#define ARMV8_IMPDEF_PERFCTR_EXC_SVC 0x0082
+#define ARMV8_IMPDEF_PERFCTR_EXC_PABORT 0x0083
+#define ARMV8_IMPDEF_PERFCTR_EXC_DABORT 0x0084
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_IRQ 0x0086
+#define ARMV8_IMPDEF_PERFCTR_EXC_FIQ 0x0087
+#define ARMV8_IMPDEF_PERFCTR_EXC_SMC 0x0088
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_HVC 0x008A
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_PABORT 0x008B
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_DABORT 0x008C
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_OTHER 0x008D
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_IRQ 0x008E
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_FIQ 0x008F
+#define ARMV8_IMPDEF_PERFCTR_RC_LD_SPEC 0x0090
+#define ARMV8_IMPDEF_PERFCTR_RC_ST_SPEC 0x0091
+
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_RD 0x00A0
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WR 0x00A1
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_RD 0x00A2
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_WR 0x00A3
+
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_VICTIM 0x00A6
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_CLEAN 0x00A7
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_INVAL 0x00A8
+
+/*
+ * Per-CPU PMCR: config reg
+ */
+#define ARMV8_PMU_PMCR_E (1 << 0) /* Enable all counters */
+#define ARMV8_PMU_PMCR_P (1 << 1) /* Reset all counters */
+#define ARMV8_PMU_PMCR_C (1 << 2) /* Cycle counter reset */
+#define ARMV8_PMU_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */
+#define ARMV8_PMU_PMCR_X (1 << 4) /* Export to ETM */
+#define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/
+#define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */
+#define ARMV8_PMU_PMCR_LP (1 << 7) /* Long event counter enable */
+#define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */
+#define ARMV8_PMU_PMCR_N_MASK 0x1f
+#define ARMV8_PMU_PMCR_MASK 0xff /* Mask for writable bits */
+
+/*
+ * PMOVSR: counters overflow flag status reg
+ */
+#define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */
+#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK
+
+/*
+ * PMXEVTYPER: Event selection reg
+ */
+#define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */
+#define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */
+
+/*
+ * Event filters for PMUv3
+ */
+#define ARMV8_PMU_EXCLUDE_EL1 (1U << 31)
+#define ARMV8_PMU_EXCLUDE_EL0 (1U << 30)
+#define ARMV8_PMU_INCLUDE_EL2 (1U << 27)
+
+/*
+ * PMUSERENR: user enable reg
+ */
+#define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */
+#define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */
+#define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */
+#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
+#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
+
+/* PMMIR_EL1.SLOTS mask */
+#define ARMV8_PMU_SLOTS_MASK 0xff
+
+#define ARMV8_PMU_BUS_SLOTS_SHIFT 8
+#define ARMV8_PMU_BUS_SLOTS_MASK 0xff
+#define ARMV8_PMU_BUS_WIDTH_SHIFT 16
+#define ARMV8_PMU_BUS_WIDTH_MASK 0xf
+
+/*
+ * This code is really good
+ */
+
+#define PMEVN_CASE(n, case_macro) \
+ case n: case_macro(n); break
+
+#define PMEVN_SWITCH(x, case_macro) \
+ do { \
+ switch (x) { \
+ PMEVN_CASE(0, case_macro); \
+ PMEVN_CASE(1, case_macro); \
+ PMEVN_CASE(2, case_macro); \
+ PMEVN_CASE(3, case_macro); \
+ PMEVN_CASE(4, case_macro); \
+ PMEVN_CASE(5, case_macro); \
+ PMEVN_CASE(6, case_macro); \
+ PMEVN_CASE(7, case_macro); \
+ PMEVN_CASE(8, case_macro); \
+ PMEVN_CASE(9, case_macro); \
+ PMEVN_CASE(10, case_macro); \
+ PMEVN_CASE(11, case_macro); \
+ PMEVN_CASE(12, case_macro); \
+ PMEVN_CASE(13, case_macro); \
+ PMEVN_CASE(14, case_macro); \
+ PMEVN_CASE(15, case_macro); \
+ PMEVN_CASE(16, case_macro); \
+ PMEVN_CASE(17, case_macro); \
+ PMEVN_CASE(18, case_macro); \
+ PMEVN_CASE(19, case_macro); \
+ PMEVN_CASE(20, case_macro); \
+ PMEVN_CASE(21, case_macro); \
+ PMEVN_CASE(22, case_macro); \
+ PMEVN_CASE(23, case_macro); \
+ PMEVN_CASE(24, case_macro); \
+ PMEVN_CASE(25, case_macro); \
+ PMEVN_CASE(26, case_macro); \
+ PMEVN_CASE(27, case_macro); \
+ PMEVN_CASE(28, case_macro); \
+ PMEVN_CASE(29, case_macro); \
+ PMEVN_CASE(30, case_macro); \
+ default: \
+ WARN(1, "Invalid PMEV* index\n"); \
+ assert(0); \
+ } \
+ } while (0)
+
+#endif
--
2.42.0.655.g421f12c284-goog
For unimplemented counters, the bits in the PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} registers are expected to be RAZ. To honor this,
explicitly implement the {get,set}_user functions for these
registers to mask out unimplemented counters for userspace reads
and writes.
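
The masking can be sketched in standalone C as follows. This is only a
model under stated assumptions: valid_counter_mask() and
apply_user_write() are hypothetical names, the mask is assumed to be
the low N bits plus bit 31 for the cycle counter, and the locking and
"VM has run" checks of the real handlers are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: bit 31 of these registers is the cycle counter. */
#define CYCLE_COUNTER_IDX	31

/* Bits for the n implemented event counters plus the cycle counter. */
static uint64_t valid_counter_mask(unsigned int n)
{
	uint64_t mask;

	assert(n <= 31);
	mask = (n == 0) ? 0 : ((UINT64_C(1) << n) - 1);
	return mask | (UINT64_C(1) << CYCLE_COUNTER_IDX);
}

/*
 * Model a userspace write to a SET (set != 0) or CLR (set == 0)
 * register: bits for unimplemented counters are dropped first.
 */
static uint64_t apply_user_write(uint64_t reg, uint64_t val,
				 unsigned int n, int set)
{
	val &= valid_counter_mask(n);
	return set ? (reg | val) : (reg & ~val);
}
```

For example, with three counters implemented, writing 0xffffffff to a
SET register in this model leaves only bits 0-2 and bit 31 set.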
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/kvm/sys_regs.c | 91 ++++++++++++++++++++++++++++++++++++---
1 file changed, 85 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index faf97878dfbbb..2e5d497596ef8 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -987,6 +987,45 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
+static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
+ u64 reg, u64 val, bool set)
+{
+ struct kvm *kvm = vcpu->kvm;
+
+ mutex_lock(&kvm->arch.config_lock);
+
+ /* Make the register immutable once the VM has started running */
+ if (kvm_vm_has_ran_once(kvm)) {
+ mutex_unlock(&kvm->arch.config_lock);
+ return;
+ }
+
+ val &= kvm_pmu_valid_counter_mask(vcpu);
+ mutex_unlock(&kvm->arch.config_lock);
+
+ if (set)
+ __vcpu_sys_reg(vcpu, reg) |= val;
+ else
+ __vcpu_sys_reg(vcpu, reg) &= ~val;
+}
+
+static int get_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+
+ *val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
+ return 0;
+}
+
+static int set_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 val)
+{
+ /* r->Op2 & 0x1: true for PMCNTENSET_EL0, else PMCNTENCLR_EL0 */
+ set_pmreg_for_valid_counters(vcpu, PMCNTENSET_EL0, val, r->Op2 & 0x1);
+ return 0;
+}
+
static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
@@ -1015,6 +1054,23 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
+static int get_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+
+ *val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask;
+ return 0;
+}
+
+static int set_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 val)
+{
+ /* r->Op2 & 0x1: true for PMINTENSET_EL1, else PMINTENCLR_EL1 */
+ set_pmreg_for_valid_counters(vcpu, PMINTENSET_EL1, val, r->Op2 & 0x1);
+ return 0;
+}
+
static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
@@ -1039,6 +1095,23 @@ static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
+static int set_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 val)
+{
+ /* r->CRm & 0x2: true for PMOVSSET_EL0, else PMOVSCLR_EL0 */
+ set_pmreg_for_valid_counters(vcpu, PMOVSSET_EL0, val, r->CRm & 0x2);
+ return 0;
+}
+
+static int get_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
+ u64 *val)
+{
+ u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+
+ *val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0) & mask;
+ return 0;
+}
+
static bool access_pmovs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
@@ -2184,9 +2257,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
/* PMBIDR_EL1 is not trapped */
{ PMU_SYS_REG(PMINTENSET_EL1),
- .access = access_pminten, .reg = PMINTENSET_EL1 },
+ .access = access_pminten, .reg = PMINTENSET_EL1,
+ .get_user = get_pminten, .set_user = set_pminten },
{ PMU_SYS_REG(PMINTENCLR_EL1),
- .access = access_pminten, .reg = PMINTENSET_EL1 },
+ .access = access_pminten, .reg = PMINTENSET_EL1,
+ .get_user = get_pminten, .set_user = set_pminten },
{ SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
@@ -2237,11 +2312,14 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
.reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
{ PMU_SYS_REG(PMCNTENSET_EL0),
- .access = access_pmcnten, .reg = PMCNTENSET_EL0 },
+ .access = access_pmcnten, .reg = PMCNTENSET_EL0,
+ .get_user = get_pmcnten, .set_user = set_pmcnten },
{ PMU_SYS_REG(PMCNTENCLR_EL0),
- .access = access_pmcnten, .reg = PMCNTENSET_EL0 },
+ .access = access_pmcnten, .reg = PMCNTENSET_EL0,
+ .get_user = get_pmcnten, .set_user = set_pmcnten },
{ PMU_SYS_REG(PMOVSCLR_EL0),
- .access = access_pmovs, .reg = PMOVSSET_EL0 },
+ .access = access_pmovs, .reg = PMOVSSET_EL0,
+ .get_user = get_pmovs, .set_user = set_pmovs },
/*
* PM_SWINC_EL0 is exposed to userspace as RAZ/WI, as it was
* previously (and pointlessly) advertised in the past...
@@ -2269,7 +2347,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(PMUSERENR_EL0), .access = access_pmuserenr,
.reset = reset_val, .reg = PMUSERENR_EL0, .val = 0 },
{ PMU_SYS_REG(PMOVSSET_EL0),
- .access = access_pmovs, .reg = PMOVSSET_EL0 },
+ .access = access_pmovs, .reg = PMOVSSET_EL0,
+ .get_user = get_pmovs, .set_user = set_pmovs },
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
Add a new test case to the vpmu_counter_access test to check if PMU
registers or their bits for implemented counters on the vCPU are
readable/writable as expected, and can be programmed to count events.
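
The accessor-combination idea used by the test can be sketched in
standalone C as below. It is a simplified model, not the selftest
itself: the counters are a plain array, fake_selr stands in for
PMSELR_EL0, and every name in the block is hypothetical. Each table
entry pairs a read path with a write path, and every pair must read
back what it wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_FAKE_COUNTERS	4

static uint64_t fake_cntr[NR_FAKE_COUNTERS];
static int fake_selr;	/* models the counter selector register */

/* "Direct" accessors name the counter themselves. */
static uint64_t read_direct(int idx)
{
	assert(idx >= 0 && idx < NR_FAKE_COUNTERS);
	return fake_cntr[idx];
}

static void write_direct(int idx, uint64_t val)
{
	assert(idx >= 0 && idx < NR_FAKE_COUNTERS);
	fake_cntr[idx] = val;
}

/* "Indirect" accessors program the selector first, then go through it. */
static uint64_t read_indirect(int idx)
{
	assert(idx >= 0 && idx < NR_FAKE_COUNTERS);
	fake_selr = idx;
	return fake_cntr[fake_selr];
}

static void write_indirect(int idx, uint64_t val)
{
	assert(idx >= 0 && idx < NR_FAKE_COUNTERS);
	fake_selr = idx;
	fake_cntr[fake_selr] = val;
}

struct accessor {
	uint64_t (*read)(int idx);
	void (*write)(int idx, uint64_t val);
};

static const struct accessor accessors[] = {
	{ read_direct,   write_direct },	/* all direct */
	{ read_indirect, write_indirect },	/* all indirect */
	{ read_direct,   write_indirect },	/* mixed */
	{ read_indirect, write_direct },	/* mixed */
};

/* Returns how many (accessor, counter) pairs failed to read back. */
static int check_all_accessors(void)
{
	int failures = 0;

	for (size_t a = 0; a < sizeof(accessors) / sizeof(accessors[0]); a++) {
		for (int idx = 0; idx < NR_FAKE_COUNTERS; idx++) {
			uint64_t val = 0x12345 + idx + a;

			accessors[a].write(idx, val);
			if (accessors[a].read(idx) != val)
				failures++;
		}
	}
	return failures;
}
```

Mixing direct and indirect paths in the same table entry is what lets
a mismatch between the two views of a counter show up as a failure.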
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
.../kvm/aarch64/vpmu_counter_access.c | 270 +++++++++++++++++-
1 file changed, 266 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
index 4c6e1fe87e0e6..a579286b6f116 100644
--- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -5,7 +5,8 @@
* Copyright (c) 2023 Google LLC.
*
* This test checks if the guest can see the same number of the PMU event
- * counters (PMCR_EL0.N) that userspace sets.
+ * counters (PMCR_EL0.N) that userspace sets, and if the guest can access
+ * those counters.
* This test runs only when KVM_CAP_ARM_PMU_V3 is supported on the host.
*/
#include <kvm_util.h>
@@ -37,6 +38,255 @@ static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
*pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
}
+/* Read PMEVCNTR<n>_EL0 through PMXEVCNTR_EL0 */
+static inline unsigned long read_sel_evcntr(int sel)
+{
+ write_sysreg(sel, pmselr_el0);
+ isb();
+ return read_sysreg(pmxevcntr_el0);
+}
+
+/* Write PMEVCNTR<n>_EL0 through PMXEVCNTR_EL0 */
+static inline void write_sel_evcntr(int sel, unsigned long val)
+{
+ write_sysreg(sel, pmselr_el0);
+ isb();
+ write_sysreg(val, pmxevcntr_el0);
+ isb();
+}
+
+/* Read PMEVTYPER<n>_EL0 through PMXEVTYPER_EL0 */
+static inline unsigned long read_sel_evtyper(int sel)
+{
+ write_sysreg(sel, pmselr_el0);
+ isb();
+ return read_sysreg(pmxevtyper_el0);
+}
+
+/* Write PMEVTYPER<n>_EL0 through PMXEVTYPER_EL0 */
+static inline void write_sel_evtyper(int sel, unsigned long val)
+{
+ write_sysreg(sel, pmselr_el0);
+ isb();
+ write_sysreg(val, pmxevtyper_el0);
+ isb();
+}
+
+static inline void enable_counter(int idx)
+{
+ uint64_t v = read_sysreg(pmcntenset_el0);
+
+ write_sysreg(BIT(idx) | v, pmcntenset_el0);
+ isb();
+}
+
+static inline void disable_counter(int idx)
+{
+ /* Write-one-to-clear: write only @idx's bit so other counters stay enabled */
+
+ write_sysreg(BIT(idx), pmcntenclr_el0);
+ isb();
+}
+
+static void pmu_disable_reset(void)
+{
+ uint64_t pmcr = read_sysreg(pmcr_el0);
+
+ /* Reset all counters, disabling them */
+ pmcr &= ~ARMV8_PMU_PMCR_E;
+ write_sysreg(pmcr | ARMV8_PMU_PMCR_P, pmcr_el0);
+ isb();
+}
+
+#define RETURN_READ_PMEVCNTRN(n) \
+ return read_sysreg(pmevcntr##n##_el0)
+static unsigned long read_pmevcntrn(int n)
+{
+ PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
+ return 0;
+}
+
+#define WRITE_PMEVCNTRN(n) \
+ write_sysreg(val, pmevcntr##n##_el0)
+static void write_pmevcntrn(int n, unsigned long val)
+{
+ PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
+ isb();
+}
+
+#define READ_PMEVTYPERN(n) \
+ return read_sysreg(pmevtyper##n##_el0)
+static unsigned long read_pmevtypern(int n)
+{
+ PMEVN_SWITCH(n, READ_PMEVTYPERN);
+ return 0;
+}
+
+#define WRITE_PMEVTYPERN(n) \
+ write_sysreg(val, pmevtyper##n##_el0)
+static void write_pmevtypern(int n, unsigned long val)
+{
+ PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
+ isb();
+}
+
+/*
+ * The pmc_accessor structure has pointers to PMEV{CNTR,TYPER}<n>_EL0
+ * accessors that test cases will use. Each of the accessors
+ * either directly reads/writes PMEV{CNTR,TYPER}<n>_EL0
+ * (i.e. {read,write}_pmev{cnt,type}rn()), or reads/writes them through
+ * PMXEV{CNTR,TYPER}_EL0 (i.e. {read,write}_sel_ev{cnt,type}r()).
+ *
+ * This is used to test that combinations of those accessors provide
+ * consistent behavior.
+ */
+struct pmc_accessor {
+ /* A function to be used to read PMEVCNTR<n>_EL0 */
+ unsigned long (*read_cntr)(int idx);
+ /* A function to be used to write PMEVCNTR<n>_EL0 */
+ void (*write_cntr)(int idx, unsigned long val);
+ /* A function to be used to read PMEVTYPER<n>_EL0 */
+ unsigned long (*read_typer)(int idx);
+ /* A function to be used to write PMEVTYPER<n>_EL0 */
+ void (*write_typer)(int idx, unsigned long val);
+};
+
+struct pmc_accessor pmc_accessors[] = {
+ /* test with all direct accesses */
+ { read_pmevcntrn, write_pmevcntrn, read_pmevtypern, write_pmevtypern },
+ /* test with all indirect accesses */
+ { read_sel_evcntr, write_sel_evcntr, read_sel_evtyper, write_sel_evtyper },
+ /* read with direct accesses, and write with indirect accesses */
+ { read_pmevcntrn, write_sel_evcntr, read_pmevtypern, write_sel_evtyper },
+ /* read with indirect accesses, and write with direct accesses */
+ { read_sel_evcntr, write_pmevcntrn, read_sel_evtyper, write_pmevtypern },
+};
+
+/*
+ * Convert a pointer to a pmc_accessor into its index in pmc_accessors[],
+ * assuming that the pointer is one of the entries in pmc_accessors[].
+ */
+#define PMC_ACC_TO_IDX(acc) (acc - &pmc_accessors[0])
+
+#define GUEST_ASSERT_BITMAP_REG(regname, mask, set_expected) \
+{ \
+ uint64_t _tval = read_sysreg(regname); \
+ \
+ if (set_expected) \
+ __GUEST_ASSERT((_tval & mask), \
+ "tval: 0x%lx; mask: 0x%lx; set_expected: 0x%lx", \
+ _tval, mask, set_expected); \
+ else \
+ __GUEST_ASSERT(!(_tval & mask), \
+ "tval: 0x%lx; mask: 0x%lx; set_expected: 0x%lx", \
+ _tval, mask, set_expected); \
+}
+
+/*
+ * Check if @mask bits in {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers
+ * are set or cleared as specified in @set_expected.
+ */
+static void check_bitmap_pmu_regs(uint64_t mask, bool set_expected)
+{
+ GUEST_ASSERT_BITMAP_REG(pmcntenset_el0, mask, set_expected);
+ GUEST_ASSERT_BITMAP_REG(pmcntenclr_el0, mask, set_expected);
+ GUEST_ASSERT_BITMAP_REG(pmintenset_el1, mask, set_expected);
+ GUEST_ASSERT_BITMAP_REG(pmintenclr_el1, mask, set_expected);
+ GUEST_ASSERT_BITMAP_REG(pmovsset_el0, mask, set_expected);
+ GUEST_ASSERT_BITMAP_REG(pmovsclr_el0, mask, set_expected);
+}
+
+/*
+ * Check if the bit in {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers corresponding
+ * to the specified counter (@pmc_idx) can be read/written as expected.
+ * When @set_op is true, it tries to set the bit for the counter in
+ * those registers by writing the SET registers (the bit won't be set
+ * if the counter is not implemented though).
+ * Otherwise, it tries to clear the bits in the registers by writing
+ * the CLR registers.
+ * Then, it checks if the values indicated in the registers are as expected.
+ */
+static void test_bitmap_pmu_regs(int pmc_idx, bool set_op)
+{
+ uint64_t pmcr_n, test_bit = BIT(pmc_idx);
+ bool set_expected = false;
+
+ if (set_op) {
+ write_sysreg(test_bit, pmcntenset_el0);
+ write_sysreg(test_bit, pmintenset_el1);
+ write_sysreg(test_bit, pmovsset_el0);
+
+ /* The bit will be set only if the counter is implemented */
+ pmcr_n = get_pmcr_n(read_sysreg(pmcr_el0));
+ set_expected = (pmc_idx < pmcr_n) ? true : false;
+ } else {
+ write_sysreg(test_bit, pmcntenclr_el0);
+ write_sysreg(test_bit, pmintenclr_el1);
+ write_sysreg(test_bit, pmovsclr_el0);
+ }
+ check_bitmap_pmu_regs(test_bit, set_expected);
+}
+
+/*
+ * Tests for reading/writing registers for the (implemented) event counter
+ * specified by @pmc_idx.
+ */
+static void test_access_pmc_regs(struct pmc_accessor *acc, int pmc_idx)
+{
+ uint64_t write_data, read_data;
+
+ /* Disable all PMCs and reset all PMCs to zero. */
+ pmu_disable_reset();
+
+ /*
+ * Tests for reading/writing the {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers.
+ */
+
+ /* Make sure that the bit in those registers is set to 0 */
+ test_bitmap_pmu_regs(pmc_idx, false);
+ /* Test if setting the bit in those registers works */
+ test_bitmap_pmu_regs(pmc_idx, true);
+ /* Test if clearing the bit in those registers works */
+ test_bitmap_pmu_regs(pmc_idx, false);
+
+ /*
+ * Tests for reading/writing the event type register.
+ */
+
+ /*
+ * Set the event type register to an arbitrary value just for testing
+ * of reading/writing the register.
+ * The Arm ARM says that for events 0x0000 to 0x003F,
+ * the value indicated in the PMEVTYPER<n>_EL0.evtCount field is
+ * the value written to the field even when the specified event
+ * is not supported.
+ */
+ write_data = (ARMV8_PMU_EXCLUDE_EL1 | ARMV8_PMUV3_PERFCTR_INST_RETIRED);
+ acc->write_typer(pmc_idx, write_data);
+ read_data = acc->read_typer(pmc_idx);
+ __GUEST_ASSERT(read_data == write_data,
+ "pmc_idx: 0x%lx; acc_idx: 0x%lx; read_data: 0x%lx; write_data: 0x%lx",
+ pmc_idx, PMC_ACC_TO_IDX(acc), read_data, write_data);
+
+ /*
+ * Tests for reading/writing the event count register.
+ */
+
+ read_data = acc->read_cntr(pmc_idx);
+
+ /* The count value must be 0, as it is disabled and reset */
+ __GUEST_ASSERT(read_data == 0,
+ "pmc_idx: 0x%lx; acc_idx: 0x%lx; read_data: 0x%lx",
+ pmc_idx, PMC_ACC_TO_IDX(acc), read_data);
+
+ write_data = read_data + pmc_idx + 0x12345;
+ acc->write_cntr(pmc_idx, write_data);
+ read_data = acc->read_cntr(pmc_idx);
+ __GUEST_ASSERT(read_data == write_data,
+ "pmc_idx: 0x%lx; acc_idx: 0x%lx; read_data: 0x%lx; write_data: 0x%lx",
+ pmc_idx, PMC_ACC_TO_IDX(acc), read_data, write_data);
+}
+
static void guest_sync_handler(struct ex_regs *regs)
{
uint64_t esr, ec;
@@ -49,11 +299,14 @@ static void guest_sync_handler(struct ex_regs *regs)
/*
* The guest is configured with PMUv3 with @expected_pmcr_n number of
* event counters.
- * Check if @expected_pmcr_n is consistent with PMCR_EL0.N.
+ * Check if @expected_pmcr_n is consistent with PMCR_EL0.N, and
+ * if reading/writing PMU registers for implemented counters works
+ * as expected.
*/
static void guest_code(uint64_t expected_pmcr_n)
{
uint64_t pmcr, pmcr_n;
+ int i, pmc;
__GUEST_ASSERT(expected_pmcr_n <= ARMV8_PMU_MAX_GENERAL_COUNTERS,
"Expected PMCR.N: 0x%lx; ARMv8 general counters: 0x%lx",
@@ -67,6 +320,15 @@ static void guest_code(uint64_t expected_pmcr_n)
"Expected PMCR.N: 0x%lx, PMCR.N: 0x%lx",
expected_pmcr_n, pmcr_n);
+ /*
+ * Tests for reading/writing PMU registers for implemented counters.
+ * Use each combination of PMEV{CNTR,TYPER}<n>_EL0 accessor functions.
+ */
+ for (i = 0; i < ARRAY_SIZE(pmc_accessors); i++) {
+ for (pmc = 0; pmc < pmcr_n; pmc++)
+ test_access_pmc_regs(&pmc_accessors[i], pmc);
+ }
+
GUEST_DONE();
}
@@ -179,7 +441,7 @@ static void test_create_vpmu_vm_with_pmcr_n(uint64_t pmcr_n, bool expect_fail)
* Create a guest with one vCPU, set the PMCR_EL0.N for the vCPU to @pmcr_n,
* and run the test.
*/
-static void run_test(uint64_t pmcr_n)
+static void run_access_test(uint64_t pmcr_n)
{
uint64_t sp;
struct kvm_vcpu *vcpu;
@@ -246,7 +508,7 @@ int main(void)
pmcr_n = get_pmcr_n_limit();
for (i = 0; i <= pmcr_n; i++)
- run_test(i);
+ run_access_test(i);
for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
run_error_test(i);
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
Introduce the vpmu_counter_access test for arm64 platforms.
The test configures PMUv3 for a vCPU, sets PMCR_EL0.N for the vCPU,
and checks whether the guest consistently sees the same number of
PMU event counters (PMCR_EL0.N) that userspace sets.
The test case is run with each PMCR_EL0.N value from 0 to 31.
For PMCR_EL0.N values greater than the host value, the test
expects KVM_SET_ONE_REG for PMCR_EL0 to fail.
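
For reference, the PMCR_EL0.N field handling the test relies on can be
sketched as standalone C. The shift (11) and mask (0x1f) match the
ARMV8_PMU_PMCR_N_* definitions in the imported header. Unlike the
test's set_pmcr_n(), which updates the value through a pointer, the
with_pmcr_n() helper here is a hypothetical variant that returns the
new value so the behavior is easy to check in isolation.

```c
#include <assert.h>
#include <stdint.h>

#define PMCR_N_SHIFT	11
#define PMCR_N_MASK	0x1f

/* Extract the number of event counters from a PMCR_EL0 value. */
static uint64_t get_pmcr_n(uint64_t pmcr)
{
	return (pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK;
}

/* Return @pmcr with its N field replaced by @n; other bits are kept. */
static uint64_t with_pmcr_n(uint64_t pmcr, uint64_t n)
{
	assert(n <= PMCR_N_MASK);
	pmcr &= ~((uint64_t)PMCR_N_MASK << PMCR_N_SHIFT);
	return pmcr | (n << PMCR_N_SHIFT);
}
```

The field is five bits wide, which is why the test iterates over
values 0 to 31.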
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
.../kvm/aarch64/vpmu_counter_access.c | 255 ++++++++++++++++++
2 files changed, 256 insertions(+)
create mode 100644 tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4f4f6ad025f4b..f047eda7b1c0b 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -161,6 +161,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/smccc_filter
TEST_GEN_PROGS_aarch64 += aarch64/vcpu_width_config
TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
TEST_GEN_PROGS_aarch64 += aarch64/vgic_irq
+TEST_GEN_PROGS_aarch64 += aarch64/vpmu_counter_access
TEST_GEN_PROGS_aarch64 += access_tracking_perf_test
TEST_GEN_PROGS_aarch64 += demand_paging_test
TEST_GEN_PROGS_aarch64 += dirty_log_test
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
new file mode 100644
index 0000000000000..4c6e1fe87e0e6
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vpmu_counter_access - Test vPMU event counter access
+ *
+ * Copyright (c) 2023 Google LLC.
+ *
+ * This test checks if the guest can see the same number of the PMU event
+ * counters (PMCR_EL0.N) that userspace sets.
+ * This test runs only when KVM_CAP_ARM_PMU_V3 is supported on the host.
+ */
+#include <kvm_util.h>
+#include <processor.h>
+#include <test_util.h>
+#include <vgic.h>
+#include <perf/arm_pmuv3.h>
+#include <linux/bitfield.h>
+
+/* The max number of the PMU event counters (excluding the cycle counter) */
+#define ARMV8_PMU_MAX_GENERAL_COUNTERS (ARMV8_PMU_MAX_COUNTERS - 1)
+
+struct vpmu_vm {
+ struct kvm_vm *vm;
+ struct kvm_vcpu *vcpu;
+ int gic_fd;
+};
+
+static struct vpmu_vm vpmu_vm;
+
+static uint64_t get_pmcr_n(uint64_t pmcr)
+{
+ return (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
+}
+
+static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
+{
+ *pmcr = *pmcr & ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
+ *pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
+}
+
+static void guest_sync_handler(struct ex_regs *regs)
+{
+ uint64_t esr, ec;
+
+ esr = read_sysreg(esr_el1);
+ ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
+ __GUEST_ASSERT(0, "PC: 0x%lx; ESR: 0x%lx; EC: 0x%lx", regs->pc, esr, ec);
+}
+
+/*
+ * The guest is configured with PMUv3 with @expected_pmcr_n number of
+ * event counters.
+ * Check if @expected_pmcr_n is consistent with PMCR_EL0.N.
+ */
+static void guest_code(uint64_t expected_pmcr_n)
+{
+ uint64_t pmcr, pmcr_n;
+
+ __GUEST_ASSERT(expected_pmcr_n <= ARMV8_PMU_MAX_GENERAL_COUNTERS,
+ "Expected PMCR.N: 0x%lx; ARMv8 general counters: 0x%lx",
+ expected_pmcr_n, ARMV8_PMU_MAX_GENERAL_COUNTERS);
+
+ pmcr = read_sysreg(pmcr_el0);
+ pmcr_n = get_pmcr_n(pmcr);
+
+ /* Make sure that PMCR_EL0.N indicates the value userspace set */
+ __GUEST_ASSERT(pmcr_n == expected_pmcr_n,
+ "Expected PMCR.N: 0x%lx, PMCR.N: 0x%lx",
+ expected_pmcr_n, pmcr_n);
+
+ GUEST_DONE();
+}
+
+#define GICD_BASE_GPA 0x8000000ULL
+#define GICR_BASE_GPA 0x80A0000ULL
+
+/* Create a VM that has one vCPU with PMUv3 configured. */
+static void create_vpmu_vm(void *guest_code)
+{
+ struct kvm_vcpu_init init;
+ uint8_t pmuver, ec;
+ uint64_t dfr0, irq = 23;
+ struct kvm_device_attr irq_attr = {
+ .group = KVM_ARM_VCPU_PMU_V3_CTRL,
+ .attr = KVM_ARM_VCPU_PMU_V3_IRQ,
+ .addr = (uint64_t)&irq,
+ };
+ struct kvm_device_attr init_attr = {
+ .group = KVM_ARM_VCPU_PMU_V3_CTRL,
+ .attr = KVM_ARM_VCPU_PMU_V3_INIT,
+ };
+
+ /* The test creates the vpmu_vm multiple times. Ensure a clean state */
+ memset(&vpmu_vm, 0, sizeof(vpmu_vm));
+
+ vpmu_vm.vm = vm_create(1);
+ vm_init_descriptor_tables(vpmu_vm.vm);
+ for (ec = 0; ec < ESR_EC_NUM; ec++) {
+ vm_install_sync_handler(vpmu_vm.vm, VECTOR_SYNC_CURRENT, ec,
+ guest_sync_handler);
+ }
+
+ /* Create vCPU with PMUv3 */
+ vm_ioctl(vpmu_vm.vm, KVM_ARM_PREFERRED_TARGET, &init);
+ init.features[0] |= (1 << KVM_ARM_VCPU_PMU_V3);
+ vpmu_vm.vcpu = aarch64_vcpu_add(vpmu_vm.vm, 0, &init, guest_code);
+ vcpu_init_descriptor_tables(vpmu_vm.vcpu);
+ vpmu_vm.gic_fd = vgic_v3_setup(vpmu_vm.vm, 1, 64,
+ GICD_BASE_GPA, GICR_BASE_GPA);
+ __TEST_REQUIRE(vpmu_vm.gic_fd >= 0,
+ "Failed to create vgic-v3, skipping");
+
+ /* Make sure that PMUv3 support is indicated in the ID register */
+ vcpu_get_reg(vpmu_vm.vcpu,
+ KVM_ARM64_SYS_REG(SYS_ID_AA64DFR0_EL1), &dfr0);
+ pmuver = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), dfr0);
+ TEST_ASSERT(pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF &&
+ pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP,
+ "Unexpected PMUVER (0x%x) on the vCPU with PMUv3", pmuver);
+
+ /* Initialize vPMU */
+ vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &irq_attr);
+ vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &init_attr);
+}
+
+static void destroy_vpmu_vm(void)
+{
+ close(vpmu_vm.gic_fd);
+ kvm_vm_free(vpmu_vm.vm);
+}
+
+static void run_vcpu(struct kvm_vcpu *vcpu, uint64_t pmcr_n)
+{
+ struct ucall uc;
+
+ vcpu_args_set(vcpu, 1, pmcr_n);
+ vcpu_run(vcpu);
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ break;
+ case UCALL_DONE:
+ break;
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ break;
+ }
+}
+
+static void test_create_vpmu_vm_with_pmcr_n(uint64_t pmcr_n, bool expect_fail)
+{
+ struct kvm_vcpu *vcpu;
+ uint64_t pmcr, pmcr_orig;
+
+ create_vpmu_vm(guest_code);
+ vcpu = vpmu_vm.vcpu;
+
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0), &pmcr_orig);
+ pmcr = pmcr_orig;
+
+ /*
+ * Setting a larger value of PMCR.N should not modify the field, and
+ * return a success.
+ */
+ set_pmcr_n(&pmcr, pmcr_n);
+ vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0), pmcr);
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0), &pmcr);
+
+ if (expect_fail)
+ TEST_ASSERT(pmcr_orig == pmcr,
+ "PMCR.N modified by KVM to a larger value (PMCR: 0x%lx) for pmcr_n: 0x%lx\n",
+ pmcr, pmcr_n);
+ else
+ TEST_ASSERT(pmcr_n == get_pmcr_n(pmcr),
+ "Failed to update PMCR.N to %lu (received: %lu)\n",
+ pmcr_n, get_pmcr_n(pmcr));
+}
+
+/*
+ * Create a guest with one vCPU, set the PMCR_EL0.N for the vCPU to @pmcr_n,
+ * and run the test.
+ */
+static void run_test(uint64_t pmcr_n)
+{
+ uint64_t sp;
+ struct kvm_vcpu *vcpu;
+ struct kvm_vcpu_init init;
+
+ pr_debug("Test with pmcr_n %lu\n", pmcr_n);
+
+ test_create_vpmu_vm_with_pmcr_n(pmcr_n, false);
+ vcpu = vpmu_vm.vcpu;
+
+ /* Save the initial sp so that it can be restored later to run the guest again */
+ vcpu_get_reg(vcpu, ARM64_CORE_REG(sp_el1), &sp);
+
+ run_vcpu(vcpu, pmcr_n);
+
+ /*
+ * Reset and re-initialize the vCPU, and run the guest code again to
+ * check if PMCR_EL0.N is preserved.
+ */
+ vm_ioctl(vpmu_vm.vm, KVM_ARM_PREFERRED_TARGET, &init);
+ init.features[0] |= (1 << KVM_ARM_VCPU_PMU_V3);
+ aarch64_vcpu_setup(vcpu, &init);
+ vcpu_init_descriptor_tables(vcpu);
+ vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), sp);
+ vcpu_set_reg(vcpu, ARM64_CORE_REG(regs.pc), (uint64_t)guest_code);
+
+ run_vcpu(vcpu, pmcr_n);
+
+ destroy_vpmu_vm();
+}
+
+/*
+ * Create a guest with one vCPU, and attempt to set the PMCR_EL0.N for
+ * the vCPU to @pmcr_n, which is larger than the host value.
+ * The attempt should fail as @pmcr_n is too big to set for the vCPU.
+ */
+static void run_error_test(uint64_t pmcr_n)
+{
+ pr_debug("Error test with pmcr_n %lu (larger than the host)\n", pmcr_n);
+
+ test_create_vpmu_vm_with_pmcr_n(pmcr_n, true);
+ destroy_vpmu_vm();
+}
+
+/*
+ * Return the default number of implemented PMU event counters excluding
+ * the cycle counter (i.e. PMCR_EL0.N value) for the guest.
+ */
+static uint64_t get_pmcr_n_limit(void)
+{
+ uint64_t pmcr;
+
+ create_vpmu_vm(guest_code);
+ vcpu_get_reg(vpmu_vm.vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0), &pmcr);
+ destroy_vpmu_vm();
+ return get_pmcr_n(pmcr);
+}
+
+int main(void)
+{
+ uint64_t i, pmcr_n;
+
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
+
+ pmcr_n = get_pmcr_n_limit();
+ for (i = 0; i <= pmcr_n; i++)
+ run_test(i);
+
+ for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
+ run_error_test(i);
+
+ return 0;
+}
--
2.42.0.655.g421f12c284-goog
From: Reiji Watanabe <[email protected]>
Add a new test case to the vpmu_counter_access test to check
if PMU registers or their bits for unimplemented counters are not
accessible or are RAZ, as expected.
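As a rough illustration of the RAZ check, the bits for unimplemented
counters can be computed as below. This is a standalone sketch, not the
selftest code: unimplemented_mask() and raz_bits_clear() are hypothetical
names, and the mask is built by hand rather than with GENMASK_ULL().

```c
#include <assert.h>
#include <stdint.h>

/* Max number of event counters, excluding the cycle counter (bit 31). */
#define MAX_GENERAL_COUNTERS	31

/*
 * Bits [MAX_GENERAL_COUNTERS - 1 : n] set, i.e. one bit per event
 * counter that is not implemented when PMCR_EL0.N == n.
 * The result is 0 when n == MAX_GENERAL_COUNTERS.
 */
static uint64_t unimplemented_mask(unsigned int n)
{
	uint64_t all, implemented;

	assert(n <= MAX_GENERAL_COUNTERS);
	all = (1ULL << MAX_GENERAL_COUNTERS) - 1;	/* bits [30:0] */
	implemented = (1ULL << n) - 1;			/* bits [n-1:0] */
	return all & ~implemented;
}

/* Non-zero if every bit for an unimplemented counter reads as zero. */
static int raz_bits_clear(uint64_t reg, unsigned int n)
{
	return (reg & unimplemented_mask(n)) == 0;
}
```

The cycle counter bit (31) is deliberately left out of the mask, so a
register value with only implemented counters and the cycle counter set
still passes the check.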
Signed-off-by: Reiji Watanabe <[email protected]>
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
.../kvm/aarch64/vpmu_counter_access.c | 93 +++++++++++++++++--
.../selftests/kvm/include/aarch64/processor.h | 1 +
2 files changed, 87 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
index a579286b6f116..d5143925690db 100644
--- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -5,8 +5,9 @@
* Copyright (c) 2023 Google LLC.
*
* This test checks if the guest can see the same number of the PMU event
- * counters (PMCR_EL0.N) that userspace sets, and if the guest can access
- * those counters.
+ * counters (PMCR_EL0.N) that userspace sets, if the guest can access
+ * those counters, and if the guest is prevented from accessing any
+ * other counters.
* This test runs only when KVM_CAP_ARM_PMU_V3 is supported on the host.
*/
#include <kvm_util.h>
@@ -287,25 +288,85 @@ static void test_access_pmc_regs(struct pmc_accessor *acc, int pmc_idx)
pmc_idx, PMC_ACC_TO_IDX(acc), read_data, write_data);
}
+#define INVALID_EC (-1ul)
+uint64_t expected_ec = INVALID_EC;
+uint64_t op_end_addr;
+
static void guest_sync_handler(struct ex_regs *regs)
{
uint64_t esr, ec;
esr = read_sysreg(esr_el1);
ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
- __GUEST_ASSERT(0, "PC: 0x%lx; ESR: 0x%lx; EC: 0x%lx", regs->pc, esr, ec);
+
+ __GUEST_ASSERT(op_end_addr && (expected_ec == ec),
+ "PC: 0x%lx; ESR: 0x%lx; EC: 0x%lx; EC expected: 0x%lx",
+ regs->pc, esr, ec, expected_ec);
+
+ /* Will go back to op_end_addr after the handler exits */
+ regs->pc = op_end_addr;
+
+ /*
+ * Clear op_end_addr, and set expected_ec to INVALID_EC
+ * as a sign that an exception has occurred.
+ */
+ op_end_addr = 0;
+ expected_ec = INVALID_EC;
+}
+
+/*
+ * Run the given operation that should trigger an exception with the
+ * given exception class. The exception handler (guest_sync_handler)
+ * will reset op_end_addr to 0, and expected_ec to INVALID_EC, and
+ * will come back to the instruction at the @done_label.
+ * The @done_label must be a unique label in this test program.
+ */
+#define TEST_EXCEPTION(ec, ops, done_label) \
+{ \
+ extern int done_label; \
+ \
+ WRITE_ONCE(op_end_addr, (uint64_t)&done_label); \
+ GUEST_ASSERT(ec != INVALID_EC); \
+ WRITE_ONCE(expected_ec, ec); \
+ dsb(ish); \
+ ops; \
+ asm volatile(#done_label":"); \
+ GUEST_ASSERT(!op_end_addr); \
+ GUEST_ASSERT(expected_ec == INVALID_EC); \
+}
+
+/*
+ * Tests for reading/writing registers for the unimplemented event counter
+ * specified by @pmc_idx (>= PMCR_EL0.N).
+ */
+static void test_access_invalid_pmc_regs(struct pmc_accessor *acc, int pmc_idx)
+{
+ /*
+ * Reading/writing the event count/type registers should cause
+ * an UNDEFINED exception.
+ */
+ TEST_EXCEPTION(ESR_EC_UNKNOWN, acc->read_cntr(pmc_idx), inv_rd_cntr);
+ TEST_EXCEPTION(ESR_EC_UNKNOWN, acc->write_cntr(pmc_idx, 0), inv_wr_cntr);
+ TEST_EXCEPTION(ESR_EC_UNKNOWN, acc->read_typer(pmc_idx), inv_rd_typer);
+ TEST_EXCEPTION(ESR_EC_UNKNOWN, acc->write_typer(pmc_idx, 0), inv_wr_typer);
+ /*
+ * The bit corresponding to the (unimplemented) counter in
+ * {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers should be RAZ.
+ */
+ test_bitmap_pmu_regs(pmc_idx, 1);
+ test_bitmap_pmu_regs(pmc_idx, 0);
}
/*
* The guest is configured with PMUv3 with @expected_pmcr_n number of
* event counters.
* Check if @expected_pmcr_n is consistent with PMCR_EL0.N, and
- * if reading/writing PMU registers for implemented counters works
- * as expected.
+ * if reading/writing PMU registers for implemented or unimplemented
+ * counters works as expected.
*/
static void guest_code(uint64_t expected_pmcr_n)
{
- uint64_t pmcr, pmcr_n;
+ uint64_t pmcr, pmcr_n, unimp_mask;
int i, pmc;
__GUEST_ASSERT(expected_pmcr_n <= ARMV8_PMU_MAX_GENERAL_COUNTERS,
@@ -320,15 +381,33 @@ static void guest_code(uint64_t expected_pmcr_n)
"Expected PMCR.N: 0x%lx, PMCR.N: 0x%lx",
expected_pmcr_n, pmcr_n);
+ /*
+ * Make sure that (RAZ) bits corresponding to unimplemented event
+ * counters in {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers are reset
+ * to zero.
+ * (NOTE: bits for implemented event counters are reset to UNKNOWN)
+ */
+ unimp_mask = GENMASK_ULL(ARMV8_PMU_MAX_GENERAL_COUNTERS - 1, pmcr_n);
+ check_bitmap_pmu_regs(unimp_mask, false);
+
/*
* Tests for reading/writing PMU registers for implemented counters.
- * Use each combination of PMEVT{CNTR,TYPER}<n>_EL0 accessor functions.
+ * Use each combination of PMEV{CNTR,TYPER}<n>_EL0 accessor functions.
*/
for (i = 0; i < ARRAY_SIZE(pmc_accessors); i++) {
for (pmc = 0; pmc < pmcr_n; pmc++)
test_access_pmc_regs(&pmc_accessors[i], pmc);
}
+ /*
+ * Tests for reading/writing PMU registers for unimplemented counters.
+ * Use each combination of PMEV{CNTR,TYPER}<n>_EL0 accessor functions.
+ */
+ for (i = 0; i < ARRAY_SIZE(pmc_accessors); i++) {
+ for (pmc = pmcr_n; pmc < ARMV8_PMU_MAX_GENERAL_COUNTERS; pmc++)
+ test_access_invalid_pmc_regs(&pmc_accessors[i], pmc);
+ }
+
GUEST_DONE();
}
diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index cb537253a6b9c..c42d683102c7a 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -104,6 +104,7 @@ enum {
#define ESR_EC_SHIFT 26
#define ESR_EC_MASK (ESR_EC_NUM - 1)
+#define ESR_EC_UNKNOWN 0x0
#define ESR_EC_SVC64 0x15
#define ESR_EC_IABT 0x21
#define ESR_EC_DABT 0x25
--
2.42.0.655.g421f12c284-goog
Add a vPMU test scenario to validate the userspace accesses for
the registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} to ensure
that KVM honors the architectural definitions of these registers
for a given PMCR.N.
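The behaviour being validated can be sketched with a small standalone
model, assuming that a write through the SET variant ORs in only the
bits of valid counters and a write through the CLR variant clears only
those bits. valid_mask(), model_write_set() and model_write_clr() are
hypothetical helpers for illustration, not KVM functions.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLE_IDX	31	/* cycle counter bit in the bitmap registers */

/* Cycle counter bit plus one bit per implemented event counter. */
static uint64_t valid_mask(unsigned int n)
{
	uint64_t mask = 1ULL << CYCLE_IDX;

	assert(n <= 32);
	if (n)
		mask |= (1ULL << n) - 1;	/* bits [n-1:0] */
	return mask;
}

/* Model of a write to a SET register: only valid bits can be set. */
static uint64_t model_write_set(uint64_t reg, uint64_t val, unsigned int n)
{
	return reg | (val & valid_mask(n));
}

/* Model of a write to a CLR register: only valid bits can be cleared. */
static uint64_t model_write_clr(uint64_t reg, uint64_t val, unsigned int n)
{
	return reg & ~(val & valid_mask(n));
}
```

Writing the all-counters mask through the SET variant with N = 3 leaves
only bits 0-2 and bit 31 set, which is the condition the test asserts
with (reg_val & ~valid_counters_mask) == 0.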
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
.../kvm/aarch64/vpmu_counter_access.c | 87 ++++++++++++++++++-
1 file changed, 86 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
index d5143925690db..2b697b144e677 100644
--- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -8,6 +8,8 @@
* counters (PMCR_EL0.N) that userspace sets, if the guest can access
* those counters, and if the guest is prevented from accessing any
* other counters.
+ * It also checks if the userspace accesses to the PMU registers honor the
+ * PMCR.N value that's set for the guest.
* This test runs only when KVM_CAP_ARM_PMU_V3 is supported on the host.
*/
#include <kvm_util.h>
@@ -20,6 +22,9 @@
/* The max number of the PMU event counters (excluding the cycle counter) */
#define ARMV8_PMU_MAX_GENERAL_COUNTERS (ARMV8_PMU_MAX_COUNTERS - 1)
+/* The cycle counter bit position that's common among the PMU registers */
+#define ARMV8_PMU_CYCLE_IDX 31
+
struct vpmu_vm {
struct kvm_vm *vm;
struct kvm_vcpu *vcpu;
@@ -28,6 +33,13 @@ struct vpmu_vm {
static struct vpmu_vm vpmu_vm;
+struct pmreg_sets {
+ uint64_t set_reg_id;
+ uint64_t clr_reg_id;
+};
+
+#define PMREG_SET(set, clr) {.set_reg_id = set, .clr_reg_id = clr}
+
static uint64_t get_pmcr_n(uint64_t pmcr)
{
return (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
@@ -39,6 +51,15 @@ static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
*pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
}
+static uint64_t get_counters_mask(uint64_t n)
+{
+ uint64_t mask = BIT(ARMV8_PMU_CYCLE_IDX);
+
+ if (n)
+ mask |= GENMASK(n - 1, 0);
+ return mask;
+}
+
/* Read PMEVTCNTR<n>_EL0 through PMXEVCNTR_EL0 */
static inline unsigned long read_sel_evcntr(int sel)
{
@@ -552,6 +573,68 @@ static void run_access_test(uint64_t pmcr_n)
destroy_vpmu_vm();
}
+static struct pmreg_sets validity_check_reg_sets[] = {
+ PMREG_SET(SYS_PMCNTENSET_EL0, SYS_PMCNTENCLR_EL0),
+ PMREG_SET(SYS_PMINTENSET_EL1, SYS_PMINTENCLR_EL1),
+ PMREG_SET(SYS_PMOVSSET_EL0, SYS_PMOVSCLR_EL0),
+};
+
+/*
+ * Create a VM, and check if KVM handles the userspace accesses of
+ * the PMU register sets in @validity_check_reg_sets[] correctly.
+ */
+static void run_pmregs_validity_test(uint64_t pmcr_n)
+{
+ int i;
+ struct kvm_vcpu *vcpu;
+ uint64_t set_reg_id, clr_reg_id, reg_val;
+ uint64_t valid_counters_mask, max_counters_mask;
+
+ test_create_vpmu_vm_with_pmcr_n(pmcr_n, false);
+ vcpu = vpmu_vm.vcpu;
+
+ valid_counters_mask = get_counters_mask(pmcr_n);
+ max_counters_mask = get_counters_mask(ARMV8_PMU_MAX_COUNTERS);
+
+ for (i = 0; i < ARRAY_SIZE(validity_check_reg_sets); i++) {
+ set_reg_id = validity_check_reg_sets[i].set_reg_id;
+ clr_reg_id = validity_check_reg_sets[i].clr_reg_id;
+
+ /*
+ * Test if the 'set' and 'clr' variants of the registers
+ * are initialized based on the number of valid counters.
+ */
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(set_reg_id), &reg_val);
+ TEST_ASSERT((reg_val & (~valid_counters_mask)) == 0,
+ "Initial read of set_reg: 0x%llx has unimplemented counters enabled: 0x%lx\n",
+ KVM_ARM64_SYS_REG(set_reg_id), reg_val);
+
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(clr_reg_id), &reg_val);
+ TEST_ASSERT((reg_val & (~valid_counters_mask)) == 0,
+ "Initial read of clr_reg: 0x%llx has unimplemented counters enabled: 0x%lx\n",
+ KVM_ARM64_SYS_REG(clr_reg_id), reg_val);
+
+ /*
+ * Using the 'set' variant, force-set the register to the
+ * max number of possible counters and test if KVM discards
+ * the bits for unimplemented counters as it should.
+ */
+ vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(set_reg_id), max_counters_mask);
+
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(set_reg_id), &reg_val);
+ TEST_ASSERT((reg_val & (~valid_counters_mask)) == 0,
+ "Read of set_reg: 0x%llx has unimplemented counters enabled: 0x%lx\n",
+ KVM_ARM64_SYS_REG(set_reg_id), reg_val);
+
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(clr_reg_id), &reg_val);
+ TEST_ASSERT((reg_val & (~valid_counters_mask)) == 0,
+ "Read of clr_reg: 0x%llx has unimplemented counters enabled: 0x%lx\n",
+ KVM_ARM64_SYS_REG(clr_reg_id), reg_val);
+ }
+
+ destroy_vpmu_vm();
+}
+
/*
* Create a guest with one vCPU, and attempt to set the PMCR_EL0.N for
* the vCPU to @pmcr_n, which is larger than the host value.
@@ -586,8 +669,10 @@ int main(void)
TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
pmcr_n = get_pmcr_n_limit();
- for (i = 0; i <= pmcr_n; i++)
+ for (i = 0; i <= pmcr_n; i++) {
run_access_test(i);
+ run_pmregs_validity_test(i);
+ }
for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
run_error_test(i);
--
2.42.0.655.g421f12c284-goog
KVM marks some of the vPMU registers as immutable to
userspace once the vCPU has started running. Add a test
scenario to check this behavior.
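The check can be pictured with the standalone model below, which assumes
that a write after the first run is silently ignored while still
reporting success (the test uses vcpu_set_reg(), which expects the ioctl
to succeed). struct reg_model and its helpers are hypothetical and only
mimic the write-complement-then-read-back pattern of the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a register guarded by a "has run" flag. */
struct reg_model {
	uint64_t val;
	bool ran_once;
};

/* Writes are dropped, but still return 0, once the vCPU has run. */
static int reg_model_set(struct reg_model *r, uint64_t val)
{
	assert(r);
	if (!r->ran_once)
		r->val = val;
	return 0;
}

/* Write the complement of the current value and check it did not stick. */
static bool reg_model_is_immutable(struct reg_model *r)
{
	uint64_t orig = r->val;

	reg_model_set(r, ~orig);
	return r->val == orig;
}
```

Writing the complement guarantees that every bit differs from the
original value, so any accepted write is visible on read-back.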
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
.../kvm/aarch64/vpmu_counter_access.c | 47 ++++++++++++++++++-
1 file changed, 46 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
index 2b697b144e677..f87d76c614e8b 100644
--- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -9,7 +9,8 @@
* those counters, and if the guest is prevented from accessing any
* other counters.
* It also checks if the userspace accesses to the PMU registers honor the
- * PMCR.N value that's set for the guest.
+ * PMCR.N value that's set for the guest and if these accesses are immutable
+ * after KVM has run once.
* This test runs only when KVM_CAP_ARM_PMU_V3 is supported on the host.
*/
#include <kvm_util.h>
@@ -648,6 +649,48 @@ static void run_error_test(uint64_t pmcr_n)
destroy_vpmu_vm();
}
+static uint64_t immutable_regs[] = {
+ SYS_PMCR_EL0,
+ SYS_PMCNTENSET_EL0,
+ SYS_PMCNTENCLR_EL0,
+ SYS_PMINTENSET_EL1,
+ SYS_PMINTENCLR_EL1,
+ SYS_PMOVSSET_EL0,
+ SYS_PMOVSCLR_EL0,
+};
+
+/*
+ * Create a guest with one vCPU, run it, and then make an attempt to update
+ * the registers in @immutable_regs[] (with their complements). KVM shouldn't
+ * allow updating these registers once vCPU starts running. Hence, the test
+ * fails if that's not the case.
+ */
+static void run_immutable_test(uint64_t pmcr_n)
+{
+ int i;
+ struct kvm_vcpu *vcpu;
+ uint64_t reg_id, reg_val, reg_val_orig;
+
+ create_vpmu_vm(guest_code);
+ vcpu = vpmu_vm.vcpu;
+
+ run_vcpu(vcpu, pmcr_n);
+
+ for (i = 0; i < ARRAY_SIZE(immutable_regs); i++) {
+ reg_id = immutable_regs[i];
+
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(reg_id), &reg_val_orig);
+ vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(reg_id), ~reg_val_orig);
+ vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(reg_id), &reg_val);
+
+ TEST_ASSERT(reg_val == reg_val_orig,
+ "Register 0x%llx value updated after vCPU run: 0x%lx; expected: 0x%lx\n",
+ KVM_ARM64_SYS_REG(reg_id), reg_val, reg_val_orig);
+ }
+
+ destroy_vpmu_vm();
+}
+
/*
* Return the default number of implemented PMU event counters excluding
* the cycle counter (i.e. PMCR_EL0.N value) for the guest.
@@ -677,5 +720,7 @@ int main(void)
for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
run_error_test(i);
+ run_immutable_test(pmcr_n);
+
return 0;
}
--
2.42.0.655.g421f12c284-goog
On Fri, 20 Oct 2023 22:40:42 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> From: Reiji Watanabe <[email protected]>
>
> The following patches will use the number of counters information
> from the arm_pmu and use this to set the PMCR.N for the guest
> during vCPU reset. However, since the guest is not associated
> with any arm_pmu until userspace configures the vPMU device
> attributes, and a reset can happen before this event, assign a
> default PMU to the guest just before doing the reset.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/kvm/arm.c | 19 +++++++++++++++++++
> arch/arm64/kvm/pmu-emul.c | 16 ++++------------
> include/kvm/arm_pmu.h | 6 ++++++
> 3 files changed, 29 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c6cad400490f9..08c2f76983b9d 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1319,6 +1319,21 @@ static bool kvm_vcpu_init_changed(struct kvm_vcpu *vcpu,
> KVM_VCPU_MAX_FEATURES);
> }
>
> +static int kvm_setup_vcpu(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> +
> + /*
> + * When the vCPU has a PMU, but no PMU is set for the guest
> + * yet, set the default one.
> + */
> + if (kvm_vcpu_has_pmu(vcpu) && !kvm->arch.arm_pmu &&
> + kvm_arm_set_default_pmu(kvm))
> + return -EINVAL;
nit: I'm not keen on re-interpreting the error code. If
kvm_arm_set_default_pmu() returns an error, we should return *that*
particular error, and not any other. Something like:
static int kvm_setup_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
int err = 0;
/*
* When the vCPU has a PMU, but no PMU is set for the guest
* yet, set the default one.
*/
if (kvm_vcpu_has_pmu(vcpu) && !kvm->arch.arm_pmu)
err = kvm_arm_set_default_pmu(kvm);
return err;
}
> +
> + return 0;
> +}
> +
> static int __kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> const struct kvm_vcpu_init *init)
> {
> @@ -1334,6 +1349,10 @@ static int __kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
>
> bitmap_copy(kvm->arch.vcpu_features, &features, KVM_VCPU_MAX_FEATURES);
>
> + ret = kvm_setup_vcpu(vcpu);
> + if (ret)
> + goto out_unlock;
> +
Hmmm. Contrary to what the commit message says, the default PMU is not
picked at reset time, but at the point where the target is set (the
very first vcpu init). Which is pretty different from reset, which
happens more than once.
I also can't say I'm over the moon with yet another function that does
a very tiny bit of initialisation outside of the rest of the code that
performs the vcpu init. Following things is an absolute maze...
> /* Now we know what it is, we can reset it. */
> kvm_reset_vcpu(vcpu);
>
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index eb5dcb12dafe9..66c244021ff08 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -717,10 +717,9 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
> * It is still necessary to get a valid cpu, though, to probe for the
> * default PMU instance as userspace is not required to specify a PMU
> * type. In order to uphold the preexisting behavior KVM selects the
> - * PMU instance for the core where the first call to the
> - * KVM_ARM_VCPU_PMU_V3_CTRL attribute group occurs. A dependent use case
> - * would be a user with disdain of all things big.LITTLE that affines
> - * the VMM to a particular cluster of cores.
> + * PMU instance for the core just before the vcpu reset. A dependent use
> + * case would be a user with disdain of all things big.LITTLE that
> + * affines the VMM to a particular cluster of cores.
Same problem, see above.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On Fri, 20 Oct 2023 22:40:44 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> The number of PMU event counters is indicated in PMCR_EL0.N.
> For a vCPU with PMUv3 configured, the value is set to the same
> value as the current PE on every vCPU reset. Unless the vCPU is
> pinned to PEs that has the PMU associated to the guest from the
> initial vCPU reset, the value might be different from the PMU's
> PMCR_EL0.N on heterogeneous PMU systems.
>
> Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
> value. Track the PMCR_EL0.N per guest, as only one PMU can be set
> for the guest (PMCR_EL0.N must be the same for all vCPUs of the
> guest), and it is convenient for updating the value.
>
> To achieve this, the patch introduces a helper,
> kvm_arm_pmu_get_max_counters(), that reads the maximum number of
> counters from the arm_pmu associated to the VM. Make the function
> global as upcoming patches will be interested to know the value
> while setting the PMCR.N of the guest from userspace.
>
> KVM does not yet support userspace modifying PMCR_EL0.N.
> The following patch will add support for that.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/include/asm/kvm_host.h | 3 +++
> arch/arm64/kvm/pmu-emul.c | 26 +++++++++++++++++++++++++-
> arch/arm64/kvm/sys_regs.c | 28 ++++++++++++++--------------
> include/kvm/arm_pmu.h | 6 ++++++
> 4 files changed, 48 insertions(+), 15 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 846a7706e925c..5653d3553e3ee 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -290,6 +290,9 @@ struct kvm_arch {
>
> cpumask_var_t supported_cpus;
>
> + /* PMCR_EL0.N value for the guest */
> + u8 pmcr_n;
> +
> /* Hypercall features firmware registers' descriptor */
> struct kvm_smccc_features smccc_feat;
> struct maple_tree smccc_filter;
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 097bf7122130d..9e24581206c24 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -690,6 +690,9 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
> if (!entry)
> goto out_unlock;
>
> + WARN_ON((pmu->num_events <= 0) ||
> + (pmu->num_events > ARMV8_PMU_MAX_COUNTERS));
> +
So if we find a PMU that is completely bonkers (we *know* we cannot
make use of it), we still pick it? What is the point?
Honestly, I don't think this warning adds any value, and doesn't seem
to be required for this patch anyway.
> entry->arm_pmu = pmu;
> list_add_tail(&entry->entry, &arm_pmus);
>
> @@ -873,11 +876,29 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
> return true;
> }
>
> +/**
> + * kvm_arm_pmu_get_max_counters - Return the max number of PMU counters.
> + * @kvm: The kvm pointer
> + */
> +int kvm_arm_pmu_get_max_counters(struct kvm *kvm)
> +{
> + struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
> +
> + lockdep_assert_held(&kvm->arch.config_lock);
> +
> + /*
> + * The arm_pmu->num_events considers the cycle counter as well.
> + * Ignore that and return only the general-purpose counters.
> + */
> + return arm_pmu->num_events - 1;
How is that going to work when the PMU supports a fixed instruction
counter, as it is the case with FEAT_PMUv3_ICNTR? The kernel doesn't
support it yet, but this will eventually be the case, and this little
game will break.
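To make the concern concrete, here is a standalone sketch that is not
kernel code. It assumes a hypothetical per-PMU counter bitmap
(cntr_mask) in which bits [30:0] are event counters, bit 31 is the cycle
counter, and bit 32 stands for a fixed instruction counter; the bit 32
placement is an assumption made for illustration. Counting only the
event-counter bits stays correct when a fixed counter is added, whereas
"number of counters minus one" does not.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLE_IDX	31	/* cycle counter bit, as in the patches */
#define INSTR_IDX	32	/* assumed bit for a fixed instruction counter */
#define GENERAL_MASK	((1ULL << CYCLE_IDX) - 1)	/* bits [30:0] */

/* Portable population count. */
static unsigned int count_bits(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Count only the event counters, ignoring any fixed counters. */
static unsigned int general_counters(uint64_t cntr_mask)
{
	return count_bits(cntr_mask & GENERAL_MASK);
}

/* The "total minus the cycle counter" approach, for comparison. */
static unsigned int total_minus_one(uint64_t cntr_mask)
{
	unsigned int total = count_bits(cntr_mask);

	assert(total > 0);
	return total - 1;
}
```

With six event counters plus the cycle counter both functions agree;
once the assumed instruction counter bit is also set, total_minus_one()
over-reports by one while general_counters() is unchanged.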
> +}
> +
> static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
> {
> lockdep_assert_held(&kvm->arch.config_lock);
>
> kvm->arch.arm_pmu = arm_pmu;
> + kvm->arch.pmcr_n = kvm_arm_pmu_get_max_counters(kvm);
Can you make the return type of kvm_arm_pmu_get_max_counters()
homogeneous with that of pmcr_n?
> }
>
> /**
> @@ -1091,5 +1112,8 @@ u8 kvm_arm_pmu_get_pmuver_limit(void)
> */
> u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
> {
> - return __vcpu_sys_reg(vcpu, PMCR_EL0);
> + u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0) &
> + ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
> +
> + return pmcr | ((u64)vcpu->kvm->arch.pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
> }
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index a31cecb3d29fb..faf97878dfbbb 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -721,12 +721,7 @@ static u64 reset_pmu_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
> {
> u64 n, mask = BIT(ARMV8_PMU_CYCLE_IDX);
>
> - /* No PMU available, any PMU reg may UNDEF... */
> - if (!kvm_arm_support_pmu_v3())
> - return 0;
> -
> - n = read_sysreg(pmcr_el0) >> ARMV8_PMU_PMCR_N_SHIFT;
> - n &= ARMV8_PMU_PMCR_N_MASK;
> + n = vcpu->kvm->arch.pmcr_n;
> if (n)
> mask |= GENMASK(n - 1, 0);
>
> @@ -762,17 +757,15 @@ static u64 reset_pmselr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
>
> static u64 reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
> {
> - u64 pmcr;
> + u64 pmcr = 0;
>
> - /* No PMU available, PMCR_EL0 may UNDEF... */
> - if (!kvm_arm_support_pmu_v3())
> - return 0;
> -
> - /* Only preserve PMCR_EL0.N, and reset the rest to 0 */
> - pmcr = read_sysreg(pmcr_el0) & (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
> if (!kvm_supports_32bit_el0())
> pmcr |= ARMV8_PMU_PMCR_LC;
>
> + /*
> + * The value of PMCR.N field is included when the
> + * vCPU register is read via kvm_vcpu_read_pmcr().
> + */
> __vcpu_sys_reg(vcpu, r->reg) = pmcr;
>
> return __vcpu_sys_reg(vcpu, r->reg);
> @@ -1103,6 +1096,13 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> +static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> + u64 *val)
> +{
> + *val = kvm_vcpu_read_pmcr(vcpu);
> + return 0;
> +}
> +
> /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
> #define DBG_BCR_BVR_WCR_WVR_EL1(n) \
> { SYS_DESC(SYS_DBGBVRn_EL1(n)), \
> @@ -2235,7 +2235,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> { SYS_DESC(SYS_SVCR), undef_access },
>
> { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
> - .reset = reset_pmcr, .reg = PMCR_EL0 },
> + .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
So since you don't provide a set_user() callback, userspace can still
write anything it wants. Should we take this opportunity to sanitise
things a bit?
> { PMU_SYS_REG(PMCNTENSET_EL0),
> .access = access_pmcnten, .reg = PMCNTENSET_EL0 },
> { PMU_SYS_REG(PMCNTENCLR_EL0),
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index cd980d78b86b5..2e90f38090e6d 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -102,6 +102,7 @@ void kvm_vcpu_pmu_resync_el0(void);
>
> u8 kvm_arm_pmu_get_pmuver_limit(void);
> int kvm_arm_set_default_pmu(struct kvm *kvm);
> +int kvm_arm_pmu_get_max_counters(struct kvm *kvm);
>
> u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu);
> #else
> @@ -181,6 +182,11 @@ static inline int kvm_arm_set_default_pmu(struct kvm *kvm)
> return -ENODEV;
> }
>
> +static inline int kvm_arm_pmu_get_max_counters(struct kvm *kvm)
> +{
> + return -ENODEV;
> +}
> +
> static inline u64 kvm_vcpu_read_pmcr(struct kvm_vcpu *vcpu)
> {
> return 0;
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On Fri, 20 Oct 2023 22:40:45 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> For unimplemented counters, the bits in PM{C,I}NTEN{SET,CLR} and
> PMOVS{SET,CLR} registers are expected to RAZ. To honor this,
> explicitly implement the {get,set}_user functions for these
> registers to mask out unimplemented counters for userspace reads
> and writes.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/kvm/sys_regs.c | 91 ++++++++++++++++++++++++++++++++++++---
> 1 file changed, 85 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index faf97878dfbbb..2e5d497596ef8 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -987,6 +987,45 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> +static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
> + u64 reg, u64 val, bool set)
> +{
> + struct kvm *kvm = vcpu->kvm;
> +
> + mutex_lock(&kvm->arch.config_lock);
> +
> + /* Make the register immutable once the VM has started running */
> + if (kvm_vm_has_ran_once(kvm)) {
> + mutex_unlock(&kvm->arch.config_lock);
> + return;
> + }
> +
> + val &= kvm_pmu_valid_counter_mask(vcpu);
> + mutex_unlock(&kvm->arch.config_lock);
> +
> + if (set)
> + __vcpu_sys_reg(vcpu, reg) |= val;
> + else
> + __vcpu_sys_reg(vcpu, reg) &= ~val;
> +}
> +
> +static int get_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> + u64 *val)
> +{
> + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> +
> + *val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
> + return 0;
> +}
> +
> +static int set_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> + u64 val)
> +{
> + /* r->Op2 & 0x1: true for PMCNTENSET_EL0, else PMCNTENCLR_EL0 */
> + set_pmreg_for_valid_counters(vcpu, PMCNTENSET_EL0, val, r->Op2 & 0x1);
> + return 0;
> +}
Huh, this is really ugly. Why the explosion of pointless helpers when
the whole design of the sysreg infrastructure is to have *common* helpers
for registers that behave the same way?
I'd expect something like the hack below instead.
M.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a2c5f210b3d6..8f560a2496f2 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -987,42 +987,46 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
-static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
- u64 reg, u64 val, bool set)
+static int set_pmreg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, u64 val)
{
struct kvm *kvm = vcpu->kvm;
+ bool set;
mutex_lock(&kvm->arch.config_lock);
/* Make the register immutable once the VM has started running */
if (kvm_vm_has_ran_once(kvm)) {
mutex_unlock(&kvm->arch.config_lock);
- return;
+ return 0;
}
val &= kvm_pmu_valid_counter_mask(vcpu);
mutex_unlock(&kvm->arch.config_lock);
+ switch(r->reg) {
+ case PMOVSSET_EL0:
+ /* CRm[1] being set indicates a SET register, and CLR otherwise */
+ set = r->CRm & 2;
+ break;
+ default:
+ /* Op2[0] being set indicates a SET register, and CLR otherwise */
+ set = r->Op2 & 1;
+ break;
+ }
+
if (set)
- __vcpu_sys_reg(vcpu, reg) |= val;
+ __vcpu_sys_reg(vcpu, r->reg) |= val;
else
- __vcpu_sys_reg(vcpu, reg) &= ~val;
-}
-
-static int get_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 *val)
-{
- u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+ __vcpu_sys_reg(vcpu, r->reg) &= ~val;
- *val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
return 0;
}
-static int set_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 val)
+static int get_pmreg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, u64 *val)
{
- /* r->Op2 & 0x1: true for PMCNTENSET_EL0, else PMCNTENCLR_EL0 */
- set_pmreg_for_valid_counters(vcpu, PMCNTENSET_EL0, val, r->Op2 & 0x1);
+ u64 mask = kvm_pmu_valid_counter_mask(vcpu);
+
+ *val = __vcpu_sys_reg(vcpu, r->reg) & mask;
return 0;
}
@@ -1054,23 +1058,6 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
-static int get_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 *val)
-{
- u64 mask = kvm_pmu_valid_counter_mask(vcpu);
-
- *val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask;
- return 0;
-}
-
-static int set_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 val)
-{
- /* r->Op2 & 0x1: true for PMINTENSET_EL1, else PMINTENCLR_EL1 */
- set_pmreg_for_valid_counters(vcpu, PMINTENSET_EL1, val, r->Op2 & 0x1);
- return 0;
-}
-
static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
@@ -1095,23 +1082,6 @@ static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
-static int set_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 val)
-{
- /* r->CRm & 0x2: true for PMOVSSET_EL0, else PMOVSCLR_EL0 */
- set_pmreg_for_valid_counters(vcpu, PMOVSSET_EL0, val, r->CRm & 0x2);
- return 0;
-}
-
-static int get_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
- u64 *val)
-{
- u64 mask = kvm_pmu_valid_counter_mask(vcpu);
-
- *val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0) & mask;
- return 0;
-}
-
static bool access_pmovs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
@@ -2311,10 +2281,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ PMU_SYS_REG(PMINTENSET_EL1),
.access = access_pminten, .reg = PMINTENSET_EL1,
- .get_user = get_pminten, .set_user = set_pminten },
+ .get_user = get_pmreg, .set_user = set_pmreg },
{ PMU_SYS_REG(PMINTENCLR_EL1),
.access = access_pminten, .reg = PMINTENSET_EL1,
- .get_user = get_pminten, .set_user = set_pminten },
+ .get_user = get_pmreg, .set_user = set_pmreg },
{ SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
@@ -2366,13 +2336,13 @@ static const struct sys_reg_desc sys_reg_descs[] = {
.reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
{ PMU_SYS_REG(PMCNTENSET_EL0),
.access = access_pmcnten, .reg = PMCNTENSET_EL0,
- .get_user = get_pmcnten, .set_user = set_pmcnten },
+ .get_user = get_pmreg, .set_user = set_pmreg },
{ PMU_SYS_REG(PMCNTENCLR_EL0),
.access = access_pmcnten, .reg = PMCNTENSET_EL0,
- .get_user = get_pmcnten, .set_user = set_pmcnten },
+ .get_user = get_pmreg, .set_user = set_pmreg },
{ PMU_SYS_REG(PMOVSCLR_EL0),
.access = access_pmovs, .reg = PMOVSSET_EL0,
- .get_user = get_pmovs, .set_user = set_pmovs },
+ .get_user = get_pmreg, .set_user = set_pmreg },
/*
* PM_SWINC_EL0 is exposed to userspace as RAZ/WI, as it was
* previously (and pointlessly) advertised in the past...
@@ -2401,7 +2371,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
.reset = reset_val, .reg = PMUSERENR_EL0, .val = 0 },
{ PMU_SYS_REG(PMOVSSET_EL0),
.access = access_pmovs, .reg = PMOVSSET_EL0,
- .get_user = get_pmovs, .set_user = set_pmovs },
+ .get_user = get_pmreg, .set_user = set_pmreg },
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
On Fri, 20 Oct 2023 22:40:46 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
> and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
> Hence to ensure correct KVM's PMU emulation, mask out the bits in
> these registers for these unimplemented counters before the first
> vCPU run.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/kvm/arm.c | 2 +-
> arch/arm64/kvm/pmu-emul.c | 11 +++++++++++
> include/kvm/arm_pmu.h | 2 ++
> 3 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e3074a9e23a8b..3c0bb80483fb1 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -857,7 +857,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> }
>
> if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
> - kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> + kvm_vcpu_handle_request_reload_pmu(vcpu);
Please rename this to kvm_vcpu_reload_pmu(). That's long enough. But
see below.
>
> if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
> kvm_vcpu_pmu_restore_guest(vcpu);
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 9e24581206c24..31e4933293b76 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -788,6 +788,17 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
> return val & mask;
> }
>
> +void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu)
> +{
> + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> +
> + kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> +
> + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask;
> + __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask;
> + __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask;
> +}
Why is this done on a vcpu request? Why can't it be done upfront, when
we're requesting the reload? Or when assigning the PMU? Or when
setting PMCR_EL0?
M.
On Fri, 20 Oct 2023 22:40:47 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> From: Reiji Watanabe <[email protected]>
>
> KVM does not yet support userspace modifying PMCR_EL0.N (With
> the previous patch, KVM ignores what is written by userspace).
> Add support for userspace limiting PMCR_EL0.N.
>
> Disallow userspace to set PMCR_EL0.N to a value that is greater
> than the host value as KVM doesn't support more event counters
> than what the host HW implements. Also, make this register
> immutable after the VM has started running. To maintain the
> existing expectations, instead of returning an error, KVM
> returns a success for these two cases.
>
> Finally, ignore writes to read-only bits that are cleared on
> vCPU reset, and RES{0,1} bits (including writable bits that
> KVM doesn't support yet), as those bits shouldn't be modified
> (at least with the current KVM).
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 2e5d497596ef8..a2c5f210b3d6b 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1176,6 +1176,59 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> return 0;
> }
>
> +static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> + u64 val)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + u64 new_n, mutable_mask;
Really, this lacks consistency. Either you make N a u8 everywhere, or
a u64 everywhere. I don't mind either, but the type confusion is not
great.
> +
> + mutex_lock(&kvm->arch.config_lock);
> +
> + /*
> + * Make PMCR immutable once the VM has started running, but
> + * do not return an error to meet the existing expectations.
> + */
> + if (kvm_vm_has_ran_once(vcpu->kvm)) {
> + mutex_unlock(&kvm->arch.config_lock);
> + return 0;
> + }
> +
> + new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
> + if (new_n != kvm->arch.pmcr_n) {
Why do we need to check this?
> + u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);
Can you see why I'm annoyed?
> +
> + /*
> + * The vCPU can't have more counters than the PMU hardware
> + * implements. Ignore this error to maintain compatibility
> + * with the existing KVM behavior.
> + */
> + if (new_n <= pmcr_n_limit)
Isn't this the only thing that actually matters?
> + kvm->arch.pmcr_n = new_n;
> + }
> + mutex_unlock(&kvm->arch.config_lock);
> +
> + /*
> + * Ignore writes to RES0 bits, read only bits that are cleared on
> + * vCPU reset, and writable bits that KVM doesn't support yet.
> + * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
> + * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
> + * But, we leave the bit as it is here, as the vCPU's PMUver might
> + * be changed later (NOTE: the bit will be cleared on first vCPU run
> + * if necessary).
> + */
> + mutable_mask = (ARMV8_PMU_PMCR_MASK |
> + (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));
Why is N part of the 'mutable' mask? The only bits that should make it
into the register are ARMV8_PMU_PMCR_MASK.
> + val &= mutable_mask;
> + val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
> +
> + /* The LC bit is RES1 when AArch32 is not supported */
> + if (!kvm_supports_32bit_el0())
> + val |= ARMV8_PMU_PMCR_LC;
> +
> + __vcpu_sys_reg(vcpu, r->reg) = val;
> + return 0;
I think this should be rewritten as:
val &= ARMV8_PMU_PMCR_MASK;
/* The LC bit is RES1 when AArch32 is not supported */
if (!kvm_supports_32bit_el0())
val |= ARMV8_PMU_PMCR_LC;
__vcpu_sys_reg(vcpu, r->reg) = val;
return 0;
And that's it. Drop this 'mutable_mask' nonsense, as we should be
getting the correct value (merge of the per-vcpu register and VM-wide
N) since patch 4.
M.
On Fri, 20 Oct 2023 22:40:40 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> Hello,
>
> The goal of this series is to allow userspace to limit the number
> of PMU event counters on the vCPU. We need this to support migration
> across systems that implement different numbers of counters.
[...]
I've gone through the initial patches, and stopped before the tests
(which I usually can't be bothered to review anyway).
The comments I have are relatively minor and could be applied as fixes
on top if Oliver can be convinced to do so. Note that patch #4 has an
attribution issue.
> base-commit: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
maz@valley-girl:~/hot-poop/arm-platforms$ git describe 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
fatal: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3 is neither a commit nor blob
Can you please make an effort to base your postings on a known, stable
commit? A tagged -rc would be best, but certainly not a random commit.
This sort of information is just as useful as "No functional change
intended"...
M.
On Fri, 20 Oct 2023, Raghavendra Rao Ananta wrote:
> From: Reiji Watanabe <[email protected]>
>
> Introduce new helper functions to set the guest's PMU
> (kvm->arch.arm_pmu) either to a default probed instance or to a
> caller requested one, and use it when the guest's PMU needs to
> be set. These helpers will make it easier for the following
> patches to modify the relevant code.
>
> No functional change intended.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> Reviewed-by: Eric Auger <[email protected]>
Reviewed-by: Sebastian Ott <[email protected]>
On Fri, 20 Oct 2023, Raghavendra Rao Ananta wrote:
> From: Reiji Watanabe <[email protected]>
>
> The following patches will use the number of counters information
> from the arm_pmu and use this to set the PMCR.N for the guest
> during vCPU reset. However, since the guest is not associated
> with any arm_pmu until userspace configures the vPMU device
> attributes, and a reset can happen before this event, assign a
> default PMU to the guest just before doing the reset.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Sebastian Ott <[email protected]>
On Fri, 20 Oct 2023, Raghavendra Rao Ananta wrote:
> From: Reiji Watanabe <[email protected]>
>
> Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
> reads a vCPU's PMCR_EL0.
>
> Currently, the PMCR_EL0 value is tracked per vCPU. The following
> patches will make (only) PMCR_EL0.N track per guest. Having the
> new helper will be useful to combine the PMCR_EL0.N field
> (tracked per guest) and the other fields (tracked per vCPU)
> to provide the value of PMCR_EL0.
>
> No functional change intended.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> Reviewed-by: Eric Auger <[email protected]>
Reviewed-by: Sebastian Ott <[email protected]>
On Fri, 20 Oct 2023, Raghavendra Rao Ananta wrote:
> The number of PMU event counters is indicated in PMCR_EL0.N.
> For a vCPU with PMUv3 configured, the value is set to the same
> value as the current PE on every vCPU reset. Unless the vCPU is
> pinned to PEs that have the PMU associated to the guest from the
> initial vCPU reset, the value might be different from the PMU's
> PMCR_EL0.N on heterogeneous PMU systems.
>
> Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
> value. Track the PMCR_EL0.N per guest, as only one PMU can be set
> for the guest (PMCR_EL0.N must be the same for all vCPUs of the
> guest), and it is convenient for updating the value.
>
> To achieve this, the patch introduces a helper,
> kvm_arm_pmu_get_max_counters(), that reads the maximum number of
> counters from the arm_pmu associated to the VM. Make the function
> global as upcoming patches will be interested to know the value
> while setting the PMCR.N of the guest from userspace.
>
> KVM does not yet support userspace modifying PMCR_EL0.N.
> The following patch will add support for that.
>
> Signed-off-by: Reiji Watanabe <[email protected]>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Sebastian Ott <[email protected]>
On Mon, Oct 23, 2023 at 5:31 AM Marc Zyngier <[email protected]> wrote:
>
> On Fri, 20 Oct 2023 22:40:45 +0100,
> Raghavendra Rao Ananta <[email protected]> wrote:
> >
> > For unimplemented counters, the bits in PM{C,I}NTEN{SET,CLR} and
> > PMOVS{SET,CLR} registers are expected to RAZ. To honor this,
> > explicitly implement the {get,set}_user functions for these
> > registers to mask out unimplemented counters for userspace reads
> > and writes.
> >
> > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > ---
> > arch/arm64/kvm/sys_regs.c | 91 ++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 85 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index faf97878dfbbb..2e5d497596ef8 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -987,6 +987,45 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> > return true;
> > }
> >
> > +static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
> > + u64 reg, u64 val, bool set)
> > +{
> > + struct kvm *kvm = vcpu->kvm;
> > +
> > + mutex_lock(&kvm->arch.config_lock);
> > +
> > + /* Make the register immutable once the VM has started running */
> > + if (kvm_vm_has_ran_once(kvm)) {
> > + mutex_unlock(&kvm->arch.config_lock);
> > + return;
> > + }
> > +
> > + val &= kvm_pmu_valid_counter_mask(vcpu);
> > + mutex_unlock(&kvm->arch.config_lock);
> > +
> > + if (set)
> > + __vcpu_sys_reg(vcpu, reg) |= val;
> > + else
> > + __vcpu_sys_reg(vcpu, reg) &= ~val;
> > +}
> > +
> > +static int get_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > + u64 *val)
> > +{
> > + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> > +
> > + *val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
> > + return 0;
> > +}
> > +
> > +static int set_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > + u64 val)
> > +{
> > + /* r->Op2 & 0x1: true for PMCNTENSET_EL0, else PMCNTENCLR_EL0 */
> > + set_pmreg_for_valid_counters(vcpu, PMCNTENSET_EL0, val, r->Op2 & 0x1);
> > + return 0;
> > +}
>
> Huh, this is really ugly. Why the explosion of pointless helpers when
> the whole design of the sysreg infrastructure is to have *common* helpers
> for registers that behave the same way?
>
> I'd expect something like the hack below instead.
>
> M.
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index a2c5f210b3d6..8f560a2496f2 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -987,42 +987,46 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> -static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
> - u64 reg, u64 val, bool set)
> +static int set_pmreg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, u64 val)
> {
> struct kvm *kvm = vcpu->kvm;
> + bool set;
>
> mutex_lock(&kvm->arch.config_lock);
>
> /* Make the register immutable once the VM has started running */
> if (kvm_vm_has_ran_once(kvm)) {
> mutex_unlock(&kvm->arch.config_lock);
> - return;
> + return 0;
> }
>
> val &= kvm_pmu_valid_counter_mask(vcpu);
> mutex_unlock(&kvm->arch.config_lock);
>
> + switch(r->reg) {
> + case PMOVSSET_EL0:
> + /* CRm[1] being set indicates a SET register, and CLR otherwise */
> + set = r->CRm & 2;
> + break;
> + default:
> + /* Op2[0] being set indicates a SET register, and CLR otherwise */
> + set = r->Op2 & 1;
> + break;
> + }
> +
> if (set)
> - __vcpu_sys_reg(vcpu, reg) |= val;
> + __vcpu_sys_reg(vcpu, r->reg) |= val;
> else
> - __vcpu_sys_reg(vcpu, reg) &= ~val;
> -}
> -
> -static int get_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 *val)
> -{
> - u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> + __vcpu_sys_reg(vcpu, r->reg) &= ~val;
>
> - *val = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
> return 0;
> }
>
> -static int set_pmcnten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 val)
> +static int get_pmreg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, u64 *val)
> {
> - /* r->Op2 & 0x1: true for PMCNTENSET_EL0, else PMCNTENCLR_EL0 */
> - set_pmreg_for_valid_counters(vcpu, PMCNTENSET_EL0, val, r->Op2 & 0x1);
> + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> +
> + *val = __vcpu_sys_reg(vcpu, r->reg) & mask;
> return 0;
> }
>
> @@ -1054,23 +1058,6 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> -static int get_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 *val)
> -{
> - u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> -
> - *val = __vcpu_sys_reg(vcpu, PMINTENSET_EL1) & mask;
> - return 0;
> -}
> -
> -static int set_pminten(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 val)
> -{
> - /* r->Op2 & 0x1: true for PMINTENSET_EL1, else PMINTENCLR_EL1 */
> - set_pmreg_for_valid_counters(vcpu, PMINTENSET_EL1, val, r->Op2 & 0x1);
> - return 0;
> -}
> -
> static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> const struct sys_reg_desc *r)
> {
> @@ -1095,23 +1082,6 @@ static bool access_pminten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> -static int set_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 val)
> -{
> - /* r->CRm & 0x2: true for PMOVSSET_EL0, else PMOVSCLR_EL0 */
> - set_pmreg_for_valid_counters(vcpu, PMOVSSET_EL0, val, r->CRm & 0x2);
> - return 0;
> -}
> -
> -static int get_pmovs(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> - u64 *val)
> -{
> - u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> -
> - *val = __vcpu_sys_reg(vcpu, PMOVSSET_EL0) & mask;
> - return 0;
> -}
> -
> static bool access_pmovs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> const struct sys_reg_desc *r)
> {
> @@ -2311,10 +2281,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>
> { PMU_SYS_REG(PMINTENSET_EL1),
> .access = access_pminten, .reg = PMINTENSET_EL1,
> - .get_user = get_pminten, .set_user = set_pminten },
> + .get_user = get_pmreg, .set_user = set_pmreg },
> { PMU_SYS_REG(PMINTENCLR_EL1),
> .access = access_pminten, .reg = PMINTENSET_EL1,
> - .get_user = get_pminten, .set_user = set_pminten },
> + .get_user = get_pmreg, .set_user = set_pmreg },
> { SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
>
> { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
> @@ -2366,13 +2336,13 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
> { PMU_SYS_REG(PMCNTENSET_EL0),
> .access = access_pmcnten, .reg = PMCNTENSET_EL0,
> - .get_user = get_pmcnten, .set_user = set_pmcnten },
> + .get_user = get_pmreg, .set_user = set_pmreg },
> { PMU_SYS_REG(PMCNTENCLR_EL0),
> .access = access_pmcnten, .reg = PMCNTENSET_EL0,
> - .get_user = get_pmcnten, .set_user = set_pmcnten },
> + .get_user = get_pmreg, .set_user = set_pmreg },
> { PMU_SYS_REG(PMOVSCLR_EL0),
> .access = access_pmovs, .reg = PMOVSSET_EL0,
> - .get_user = get_pmovs, .set_user = set_pmovs },
> + .get_user = get_pmreg, .set_user = set_pmreg },
> /*
> * PM_SWINC_EL0 is exposed to userspace as RAZ/WI, as it was
> * previously (and pointlessly) advertised in the past...
> @@ -2401,7 +2371,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> .reset = reset_val, .reg = PMUSERENR_EL0, .val = 0 },
> { PMU_SYS_REG(PMOVSSET_EL0),
> .access = access_pmovs, .reg = PMOVSSET_EL0,
> - .get_user = get_pmovs, .set_user = set_pmovs },
> + .get_user = get_pmreg, .set_user = set_pmreg },
>
> { SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
> { SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
>
Thanks for the suggestion. I'll consider this in the next iteration.
- Raghavendra
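For readers following the thread, the common-helper idea suggested above can be sketched outside the kernel roughly as follows. This is only an illustration under stated assumptions: struct reg_desc, enum shadow_reg, write_pmreg() and read_pmreg() are hypothetical stand-ins (not KVM's struct sys_reg_desc or its accessors), the CRm/Op2 values are the encodings as recalled from the Arm ARM, and the config_lock and "VM has run once" checks are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical index of the backing storage shared by each SET/CLR pair. */
enum shadow_reg { SHADOW_PMCNTENSET, SHADOW_PMINTENSET, SHADOW_PMOVSSET };

/* Hypothetical stand-in for the few sys_reg_desc fields used here. */
struct reg_desc {
	enum shadow_reg reg;	/* both aliases point at the SET storage */
	uint8_t crm;
	uint8_t op2;
};

/* CRm/Op2 encodings, assumed from the architecture manual. */
static const struct reg_desc PMCNTENSET = { SHADOW_PMCNTENSET, 12, 1 };
static const struct reg_desc PMCNTENCLR = { SHADOW_PMCNTENSET, 12, 2 };
static const struct reg_desc PMINTENSET = { SHADOW_PMINTENSET, 14, 1 };
static const struct reg_desc PMINTENCLR = { SHADOW_PMINTENSET, 14, 2 };
static const struct reg_desc PMOVSCLR   = { SHADOW_PMOVSSET,   12, 3 };
static const struct reg_desc PMOVSSET   = { SHADOW_PMOVSSET,   14, 3 };

/* Decide from the encoding whether the descriptor is the SET alias. */
static bool is_set_alias(const struct reg_desc *r)
{
	if (r->reg == SHADOW_PMOVSSET)
		return (r->crm & 2) != 0;	/* CRm[1] set means SET */
	return (r->op2 & 1) != 0;		/* Op2[0] set means SET */
}

/* One write helper for all six registers: mask, then set or clear bits. */
static void write_pmreg(uint64_t *shadow, const struct reg_desc *r,
			uint64_t val, uint64_t valid_mask)
{
	val &= valid_mask;	/* drop bits of unimplemented counters */
	if (is_set_alias(r))
		shadow[r->reg] |= val;
	else
		shadow[r->reg] &= ~val;
}

/* One read helper: unimplemented counters read as zero. */
static uint64_t read_pmreg(const uint64_t *shadow, const struct reg_desc *r,
			   uint64_t valid_mask)
{
	return shadow[r->reg] & valid_mask;
}
```

The point of the design is that the descriptor already carries everything needed (backing register and encoding), so a single get/set pair can serve every SET/CLR register instead of one pair per register.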
On Mon, Oct 23, 2023 at 5:42 AM Marc Zyngier <[email protected]> wrote:
>
> On Fri, 20 Oct 2023 22:40:46 +0100,
> Raghavendra Rao Ananta <[email protected]> wrote:
> >
> > For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
> > and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
> > Hence to ensure correct KVM's PMU emulation, mask out the bits in
> > these registers for these unimplemented counters before the first
> > vCPU run.
> >
> > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > ---
> > arch/arm64/kvm/arm.c | 2 +-
> > arch/arm64/kvm/pmu-emul.c | 11 +++++++++++
> > include/kvm/arm_pmu.h | 2 ++
> > 3 files changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e3074a9e23a8b..3c0bb80483fb1 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -857,7 +857,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> > }
> >
> > if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
> > - kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> > + kvm_vcpu_handle_request_reload_pmu(vcpu);
>
> Please rename this to kvm_vcpu_reload_pmu(). That's long enough. But
> see below.
>
Sounds good.
> >
> > if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
> > kvm_vcpu_pmu_restore_guest(vcpu);
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index 9e24581206c24..31e4933293b76 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -788,6 +788,17 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
> > return val & mask;
> > }
> >
> > +void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu)
> > +{
> > + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> > +
> > + kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> > +
> > + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask;
> > + __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask;
> > + __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask;
> > +}
>
> Why is this done on a vcpu request? Why can't it be done upfront, when
> we're requesting the reload? Or when assigning the PMU? Or when
> setting PMCR_EL0?
>
The idea was to do this only once, after userspace has configured the
PMCR.N (and has no option to change it), but before we run the guest
for the first time. So, I guess this can be done when we are
requesting the reload, if you prefer.
When assigning the PMU, it could be too early to sanitize, as
userspace would not have configured PMCR.N yet.
It can probably be done when userspace configures PMCR.N, but since
this field is per-guest, we may have to apply the setting for all the
vCPUs during the ioctl, which may get a little ugly.
Thank you.
Raghavendra
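As a rough illustration of the sanitization being discussed, the sketch below derives a valid-counter mask from N and applies it to the three shadow registers. It is a standalone approximation, not the kernel code: valid_counter_mask(), struct pmu_shadow and sanitize_pmu_regs() are hypothetical names, the cycle counter is assumed to sit at bit 31, and N is assumed to be at most 31.

```c
#include <stdint.h>

/* Assumed bit position of the cycle counter in these registers. */
#define CYCLE_COUNTER_IDX 31

/* Bits for event counters 0..n-1 plus the cycle counter; n <= 31 assumed. */
static uint64_t valid_counter_mask(uint8_t n)
{
	uint64_t cycle = UINT64_C(1) << CYCLE_COUNTER_IDX;

	if (n == 0)
		return cycle;
	return ((UINT64_C(1) << n) - 1) | cycle;
}

/* Hypothetical per-vCPU shadow copies of the SET registers. */
struct pmu_shadow {
	uint64_t pmovsset;
	uint64_t pmintenset;
	uint64_t pmcntenset;
};

/* Clear the bits that belong to unimplemented counters. */
static void sanitize_pmu_regs(struct pmu_shadow *s, uint8_t n)
{
	uint64_t mask = valid_counter_mask(n);

	s->pmovsset &= mask;
	s->pmintenset &= mask;
	s->pmcntenset &= mask;
}
```

Whichever point in the flow ends up calling it, the operation itself only depends on the final N, which is why it has to run after userspace can no longer change PMCR.N.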
On Mon, Oct 23, 2023 at 6:00 AM Marc Zyngier <[email protected]> wrote:
>
> On Fri, 20 Oct 2023 22:40:47 +0100,
> Raghavendra Rao Ananta <[email protected]> wrote:
> >
> > From: Reiji Watanabe <[email protected]>
> >
> > KVM does not yet support userspace modifying PMCR_EL0.N (With
> > the previous patch, KVM ignores what is written by userspace).
> > Add support for userspace limiting PMCR_EL0.N.
> >
> > Disallow userspace to set PMCR_EL0.N to a value that is greater
> > than the host value as KVM doesn't support more event counters
> > than what the host HW implements. Also, make this register
> > immutable after the VM has started running. To maintain the
> > existing expectations, instead of returning an error, KVM
> > returns a success for these two cases.
> >
> > Finally, ignore writes to read-only bits that are cleared on
> > vCPU reset, and RES{0,1} bits (including writable bits that
> > KVM doesn't support yet), as those bits shouldn't be modified
> > (at least with the current KVM).
> >
> > Signed-off-by: Reiji Watanabe <[email protected]>
> > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > ---
> > arch/arm64/kvm/sys_regs.c | 57 +++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 55 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 2e5d497596ef8..a2c5f210b3d6b 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1176,6 +1176,59 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > return 0;
> > }
> >
> > +static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > + u64 val)
> > +{
> > + struct kvm *kvm = vcpu->kvm;
> > + u64 new_n, mutable_mask;
>
> Really, this lacks consistency. Either you make N a u8 everywhere, or
> a u64 everywhere. I don't mind either, but the type confusion is not
> great.
>
Sorry about that. I'll make it u8 across the board.
> > +
> > + mutex_lock(&kvm->arch.config_lock);
> > +
> > + /*
> > + * Make PMCR immutable once the VM has started running, but
> > + * do not return an error to meet the existing expectations.
> > + */
> > + if (kvm_vm_has_ran_once(vcpu->kvm)) {
> > + mutex_unlock(&kvm->arch.config_lock);
> > + return 0;
> > + }
> > +
> > + new_n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
> > + if (new_n != kvm->arch.pmcr_n) {
>
> Why do we need to check this?
>
Hmm, it may be redundant. I guess we can skip this, check for the
limit, and directly write new_n to kvm->arch.pmcr_n.
> > + u8 pmcr_n_limit = kvm_arm_pmu_get_max_counters(kvm);
>
> Can you see why I'm annoyed?
>
Yes. I'll make these consistent.
> > +
> > + /*
> > + * The vCPU can't have more counters than the PMU hardware
> > + * implements. Ignore this error to maintain compatibility
> > + * with the existing KVM behavior.
> > + */
> > + if (new_n <= pmcr_n_limit)
>
> Isn't this the only thing that actually matters?
>
Yes, I'll remove the above check.
> > + kvm->arch.pmcr_n = new_n;
> > + }
> > + mutex_unlock(&kvm->arch.config_lock);
> > +
> > + /*
> > + * Ignore writes to RES0 bits, read only bits that are cleared on
> > + * vCPU reset, and writable bits that KVM doesn't support yet.
> > + * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
> > + * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
> > + * But, we leave the bit as it is here, as the vCPU's PMUver might
> > + * be changed later (NOTE: the bit will be cleared on first vCPU run
> > + * if necessary).
> > + */
> > + mutable_mask = (ARMV8_PMU_PMCR_MASK |
> > + (ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT));
>
> Why is N part of the 'mutable' mask? The only bits that should make it
> into the register are ARMV8_PMU_PMCR_MASK.
>
> > + val &= mutable_mask;
> > + val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
> > +
> > + /* The LC bit is RES1 when AArch32 is not supported */
> > + if (!kvm_supports_32bit_el0())
> > + val |= ARMV8_PMU_PMCR_LC;
> > +
> > + __vcpu_sys_reg(vcpu, r->reg) = val;
> > + return 0;
>
> I think this should be rewritten as:
>
> val &= ARMV8_PMU_PMCR_MASK;
> /* The LC bit is RES1 when AArch32 is not supported */
> if (!kvm_supports_32bit_el0())
> val |= ARMV8_PMU_PMCR_LC;
>
> __vcpu_sys_reg(vcpu, r->reg) = val;
> return 0;
>
> And that's it. Drop this 'mutable_mask' nonsense, as we should be
> getting the correct value (merge of the per-vcpu register and VM-wide
> N) since patch 4.
>
Sure, I'll consider this.
Thank you.
Raghavendra
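To make the agreed direction concrete, here is a minimal standalone sketch of the simplified write path: N is tracked per VM and only updated when within the limit, and only the low control bits reach the per-vCPU register. All names (struct vm_state, set_pmcr_sketch(), read_pmcr_sketch()) are hypothetical, the constants are assumed to mirror the kernel's ARMV8_PMU_PMCR_* definitions, and locking is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror ARMV8_PMU_PMCR_MASK, _LC, _N_SHIFT and _N_MASK. */
#define PMCR_MASK	0xffULL
#define PMCR_LC		(1ULL << 6)
#define PMCR_N_SHIFT	11
#define PMCR_N_MASK	0x1fULL

/* Hypothetical VM-wide state; N uses u8 consistently. */
struct vm_state {
	bool has_run;		/* VM has started running at least once */
	uint8_t pmcr_n;		/* number of counters exposed to the guest */
	uint8_t pmcr_n_limit;	/* number of counters the host PMU has */
};

/* Userspace write: silently ignore late writes and out-of-range N. */
static void set_pmcr_sketch(struct vm_state *vm, uint64_t *vcpu_pmcr,
			    uint64_t val, bool has_aarch32_el0)
{
	uint8_t new_n;

	if (vm->has_run)
		return;

	new_n = (val >> PMCR_N_SHIFT) & PMCR_N_MASK;
	if (new_n <= vm->pmcr_n_limit)
		vm->pmcr_n = new_n;

	/* Only the low control bits are stored per vCPU. */
	val &= PMCR_MASK;
	/* The LC bit is RES1 when AArch32 is not supported. */
	if (!has_aarch32_el0)
		val |= PMCR_LC;

	*vcpu_pmcr = val;
}

/* Read: merge the per-vCPU bits with the VM-wide N. */
static uint64_t read_pmcr_sketch(const struct vm_state *vm, uint64_t vcpu_pmcr)
{
	return (vcpu_pmcr & ~(PMCR_N_MASK << PMCR_N_SHIFT)) |
	       ((uint64_t)vm->pmcr_n << PMCR_N_SHIFT);
}
```

Because the read side always merges in the VM-wide N, the write side has no need to carry N in a separate "mutable" mask.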
On Mon, Oct 23, 2023 at 6:09 AM Marc Zyngier <[email protected]> wrote:
>
> On Fri, 20 Oct 2023 22:40:40 +0100,
> Raghavendra Rao Ananta <[email protected]> wrote:
> >
> > Hello,
> >
> > The goal of this series is to allow userspace to limit the number
> > of PMU event counters on the vCPU. We need this to support migration
> > across systems that implement different numbers of counters.
>
> [...]
>
> I've gone through the initial patches, and stopped before the tests
> (which I usually can't be bothered to review anyway).
>
> The comments I have are relatively minor and could be applied as fixes
> on top if Oliver can be convinced to do so. Note that patch #4 has an
> attribution issue.
>
> > base-commit: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
>
> maz@valley-girl:~/hot-poop/arm-platforms$ git describe 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
> fatal: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3 is neither a commit nor blob
>
> Can you please make an effort to base your postings on a known, stable
> commit? A tagged -rc would be best, but certainly not a random commit.
>
I usually do base on a known -rc. But this series needed a couple of
series from kvmarm/next (mentioned in the original patch), and hence I
rebased on top of them. How do you suggest I handle this in the
future? Rebase to a known -rc on mainline, apply the required series,
and then my series on top?
Thank you.
Raghavendra
On Mon, 23 Oct 2023 18:42:43 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> On Mon, Oct 23, 2023 at 5:42 AM Marc Zyngier <[email protected]> wrote:
> >
> > On Fri, 20 Oct 2023 22:40:46 +0100,
> > Raghavendra Rao Ananta <[email protected]> wrote:
> > >
> > > For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
> > > and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
> > > > Hence, to ensure that KVM's PMU emulation is correct, mask out the bits in
> > > these registers for these unimplemented counters before the first
> > > vCPU run.
> > >
> > > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > > ---
> > > arch/arm64/kvm/arm.c | 2 +-
> > > arch/arm64/kvm/pmu-emul.c | 11 +++++++++++
> > > include/kvm/arm_pmu.h | 2 ++
> > > 3 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index e3074a9e23a8b..3c0bb80483fb1 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -857,7 +857,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> > > }
> > >
> > > if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
> > > - kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> > > + kvm_vcpu_handle_request_reload_pmu(vcpu);
> >
> > Please rename this to kvm_vcpu_reload_pmu(). That's long enough. But
> > see below.
> >
> Sounds good.
>
> > >
> > > if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
> > > kvm_vcpu_pmu_restore_guest(vcpu);
> > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > > index 9e24581206c24..31e4933293b76 100644
> > > --- a/arch/arm64/kvm/pmu-emul.c
> > > +++ b/arch/arm64/kvm/pmu-emul.c
> > > @@ -788,6 +788,17 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
> > > return val & mask;
> > > }
> > >
> > > +void kvm_vcpu_handle_request_reload_pmu(struct kvm_vcpu *vcpu)
> > > +{
> > > + u64 mask = kvm_pmu_valid_counter_mask(vcpu);
> > > +
> > > + kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));
> > > +
> > > + __vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask;
> > > + __vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask;
> > > + __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask;
> > > +}
> >
> > Why is this done on a vcpu request? Why can't it be done upfront, when
> > we're requesting the reload? Or when assigning the PMU? Or when
> > setting PMCR_EL0?
> >
> The idea was to do this only once, after userspace has configured the
> PMCR.N (and has no option to change it), but before we run the guest
> for the first time. So, I guess this can be done when we are
> requesting the reload, if you prefer.
Well, I'm trying to limit the proliferation of these one-off "helpers"
that make the code hard to follow. So it isn't "what I prefer", but
what makes the code easier to understand without having to follow a
maze of pointless abstraction.
> When assigning the PMU, it could be too early to sanitize as the
> userspace would not have configured the PMCR.N yet.
> It can probably be done when userspace configures PMCR.N, but since
> this field is per-guest, we may have to apply the setting for all the
> vCPUs during the ioctl, which may get a little ugly.
Right. So it has to happen at the point where userspace cannot write
to PMCR_EL0 anymore, for which any of the options I mentioned is too
early. What you have is thus correct. But it would have helped if that
rationale was captured in the commit message.
M.
--
Without deviation from the norm, progress is not possible.
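The sanitisation step discussed above can be illustrated with a small user-space model. It is a sketch under assumptions, not the kernel code: the sketch_* names are hypothetical, and it assumes the valid-counter bitmap is the low N bits plus the cycle counter at bit 31.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CYCLE_IDX 31u	/* assumed cycle counter bit in the bitmaps */

/* Bitmap of counters a guest with n event counters is allowed to see. */
static uint64_t sketch_valid_counter_mask(unsigned int n)
{
	uint64_t cycle = UINT64_C(1) << SKETCH_CYCLE_IDX;

	assert(n <= 31);
	if (n == 0)
		return cycle;

	return ((UINT64_C(1) << n) - 1) | cycle;
}

/* Hypothetical container for the three "SET" views of the bitmap registers. */
struct sketch_pmu_regs {
	uint64_t pmovsset;
	uint64_t pmintenset;
	uint64_t pmcntenset;
};

/* Clear the bits of unimplemented counters, once N can no longer change. */
static void sketch_sanitise_regs(struct sketch_pmu_regs *regs, unsigned int n)
{
	uint64_t mask = sketch_valid_counter_mask(n);

	regs->pmovsset &= mask;
	regs->pmintenset &= mask;
	regs->pmcntenset &= mask;
}
```

The point of running this late is the one made in the thread: the mask depends on N, so it can only be applied after userspace has lost the ability to change N.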
On Mon, 23 Oct 2023 18:58:19 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> On Mon, Oct 23, 2023 at 6:09 AM Marc Zyngier <[email protected]> wrote:
> >
> > On Fri, 20 Oct 2023 22:40:40 +0100,
> > Raghavendra Rao Ananta <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > The goal of this series is to allow userspace to limit the number
> > > of PMU event counters on the vCPU. We need this to support migration
> > > across systems that implement different numbers of counters.
> >
> > [...]
> >
> > I've gone through the initial patches, and stopped before the tests
> > (which I usually can't be bothered to review anyway).
> >
> > The comments I have are relatively minor and could be applied as fixes
> > on top if Oliver can be convinced to do so. Note that patch #4 has an
> > attribution issue.
> >
> > > base-commit: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
> >
> > maz@valley-girl:~/hot-poop/arm-platforms$ git describe 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3
> > fatal: 0a3a1665cbc59ee8d6326aa6c0b4a8d1cd67dda3 is neither a commit nor blob
> >
> > Can you please make an effort to base your postings on a known, stable
> > commit? A tagged -rc would be best, but certainly not a random commit.
> >
> I usually do base on a known -rc. But this series needed a couple of
> series from kvmarm/next (mentioned in the original patch), and hence I
> rebased on top of them.
Well, that commit has since disappeared, as git cannot find it (as
demonstrated above). Which is why I insist on a public tag as a base,
as everything else is completely volatile.
> How do you suggest I handle this in the future? Rebase to a known
> -rc on mainline, apply the required series, and then my series on
> top?
No. You base your own series on an -rc (ideally, -rc3). If there is a
conflict with another series, it is our job (Oliver and I) to fix it
(bonus points if you indicate a resolution for the conflict in the
cover letter).
If there is a hard dependency (something that would actively prevent
your series from working at all), you cherry-pick the minimal set of
patches that makes your own series functional as a *prefix*, and post
the whole thing, including the patches you depend on. Oliver and I
will make sure the common prefix is dealt with without duplication.
And for what it is worth, this series directly applies on v6.6-rc3
without a conflict.
M.
--
Without deviation from the norm, progress is not possible.
On Mon, Oct 23, 2023 at 11:40:50AM +0100, Marc Zyngier wrote:
[...]
> > +static int kvm_setup_vcpu(struct kvm_vcpu *vcpu)
> > +{
> > + struct kvm *kvm = vcpu->kvm;
> > +
> > + /*
> > + * When the vCPU has a PMU, but no PMU is set for the guest
> > + * yet, set the default one.
> > + */
> > + if (kvm_vcpu_has_pmu(vcpu) && !kvm->arch.arm_pmu &&
> > + kvm_arm_set_default_pmu(kvm))
> > + return -EINVAL;
>
> nit: I'm not keen on re-interpreting the error code. If
> kvm_arm_set_default_pmu() returns an error, we should return *that*
> particular error, and not any other. Something like:
The code took this shape because I had an issue with returning ENODEV on
the KVM_ARM_VCPU_INIT ioctl, which is not a documented error code.
Now that the vCPU flags are sanitised early in the ioctl, KVM has
decided at this point that vPMU is a supported feature.
Given that, I think ENODEV is fine now as the unexpected return value
would indicate a bug in KVM.
> Hmmm. Contrary to what the commit message says, the default PMU is not
> picked at reset time, but at the point where the target is set (the
> very first vcpu init). Which is pretty different from reset, which
> happens more than once.
>
> I also can't say I'm over the moon with yet another function that does
> a very tiny bit of initialisation outside of the rest of the code that
> performs the vcpu init. Following things is an absolute maze...
I'm fine with this being inlined into __kvm_vcpu_set_target() so long as
we maintain the clear distinction between one-time setup and vCPU reset.
--
Thanks,
Oliver
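The error-handling point above (return the callee's error rather than re-interpreting it as -EINVAL) can be sketched as follows. The snippet Marc proposed is elided in the quote, so this is not a reconstruction of it; the sketch_* functions and their boolean parameters are hypothetical stand-ins for the vCPU and VM state.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the helper that picks a default PMU. */
static int sketch_set_default_pmu(bool host_pmu_available)
{
	return host_pmu_available ? 0 : -ENODEV;
}

/*
 * One-time setup: if the vCPU has a PMU and none is set for the guest
 * yet, pick the default one and hand back whatever error the helper
 * reported, unchanged.
 */
static int sketch_setup_vcpu(bool vcpu_has_pmu, bool vm_pmu_already_set,
			     bool host_pmu_available)
{
	if (!vcpu_has_pmu || vm_pmu_already_set)
		return 0;

	return sketch_set_default_pmu(host_pmu_available);
}
```

In this model the caller sees -ENODEV when no PMU can be found, which matches Oliver's remark that such a value would now point at a bug rather than a user error.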
On Fri, 20 Oct 2023 22:40:40 +0100,
Raghavendra Rao Ananta <[email protected]> wrote:
>
> Hello,
>
> The goal of this series is to allow userspace to limit the number
> of PMU event counters on the vCPU. We need this to support migration
> across systems that implement different numbers of counters.
FWIW, I've pushed out a branch[1] with a set of fixes that address
some of the comments I had on this series. Feel free to squash them in
your series as you see fit.
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu_pmcr_n
--
Without deviation from the norm, progress is not possible.
On Fri, Oct 20, 2023 at 09:40:45PM +0000, Raghavendra Rao Ananta wrote:
> For unimplemented counters, the bits in PM{C,I}NTEN{SET,CLR} and
> PMOVS{SET,CLR} registers are expected to be RAZ. To honor this,
> explicitly implement the {get,set}_user functions for these
> registers to mask out unimplemented counters for userspace reads
> and writes.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/kvm/sys_regs.c | 91 ++++++++++++++++++++++++++++++++++++---
> 1 file changed, 85 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index faf97878dfbbb..2e5d497596ef8 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -987,6 +987,45 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
> return true;
> }
>
> +static void set_pmreg_for_valid_counters(struct kvm_vcpu *vcpu,
> + u64 reg, u64 val, bool set)
> +{
> + struct kvm *kvm = vcpu->kvm;
> +
> + mutex_lock(&kvm->arch.config_lock);
> +
> + /* Make the register immutable once the VM has started running */
This is a considerable change from the existing behavior and lacks
justification. These registers, or rather the state that these aliases
update, are mutable from the guest. I see no reason for excluding
userspace from this behavior.
> + if (kvm_vm_has_ran_once(kvm)) {
> + mutex_unlock(&kvm->arch.config_lock);
> + return;
> + }
> +
> + val &= kvm_pmu_valid_counter_mask(vcpu);
> + mutex_unlock(&kvm->arch.config_lock);
I'm not entirely sold on taking the config_lock here.
- If userspace is doing these ioctls in parallel then it cannot guarantee
ordering in the first place, even w/ locking under the hood. Any
garbage values will be discarded by KVM_REQ_RELOAD_PMU.
- If the VM has already started PMCR.N is immutable, so there is no
race.
--
Thanks,
Oliver
On Fri, Oct 20, 2023 at 09:40:44PM +0000, Raghavendra Rao Ananta wrote:
[...]
> +int kvm_arm_pmu_get_max_counters(struct kvm *kvm)
> +{
> + struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
> +
> + lockdep_assert_held(&kvm->arch.config_lock);
This lockdep assertion is misleading. Readers of kvm_arch::arm_pmu *are
not* serialized by the config_lock.
--
Thanks,
Oliver
On Fri, Oct 20, 2023 at 09:40:53PM +0000, Raghavendra Rao Ananta wrote:
> KVM marks some of the vPMU registers as immutable to
> userspace once the vCPU has started running. Add a test
> scenario to check this behavior.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Now that PMCR_EL0.N is the only thing that's getting the immutability
treatment this patch fails. I'll probably drop it.
--
Thanks,
Oliver
On Fri, Oct 20, 2023 at 09:40:51PM +0000, Raghavendra Rao Ananta wrote:
[...]
> +#define INVALID_EC (-1ul)
> +uint64_t expected_ec = INVALID_EC;
> +uint64_t op_end_addr;
> +
> static void guest_sync_handler(struct ex_regs *regs)
> {
> uint64_t esr, ec;
>
> esr = read_sysreg(esr_el1);
> ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
> - __GUEST_ASSERT(0, "PC: 0x%lx; ESR: 0x%lx; EC: 0x%lx", regs->pc, esr, ec);
> +
> + __GUEST_ASSERT(op_end_addr && (expected_ec == ec),
> + "PC: 0x%lx; ESR: 0x%lx; EC: 0x%lx; EC expected: 0x%lx",
> + regs->pc, esr, ec, expected_ec);
> +
> + /* Will go back to op_end_addr after the handler exits */
> + regs->pc = op_end_addr;
This sort of game is exceedingly fragile, and actually causes the test
to fail when I build it with clang. The test body is written in C, so
you don't know if the label you've chosen as the return address is
actually the next instruction after the sysreg access.
A64 instructions are guaranteed to be 32 bits wide, so we can just increment
PC by 4 here.
--
Thanks,
Oliver
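A minimal sketch of the alternative suggested above, stepping over the faulting instruction instead of jumping to a recorded label. The struct and function names are hypothetical, and the handler is modelled as returning a status rather than asserting, so that it can be exercised outside a guest.

```c
#include <stdint.h>

#define SKETCH_A64_INSN_SIZE 4u	/* every A64 instruction is 4 bytes */

/* Hypothetical cut-down version of the selftest's saved register state. */
struct sketch_ex_regs {
	uint64_t pc;
};

/*
 * If the exception class is the expected one, resume at the instruction
 * following the trapped sysreg access. Returns 0 on success and -1 when
 * the exception class does not match (pc is left untouched).
 */
static int sketch_sync_handler(struct sketch_ex_regs *regs,
			       uint64_t ec, uint64_t expected_ec)
{
	if (ec != expected_ec)
		return -1;

	regs->pc += SKETCH_A64_INSN_SIZE;
	return 0;
}
```

This avoids relying on where the compiler places a C label relative to the access, which is the fragility called out for the clang build.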
On Fri, Oct 20, 2023 at 09:40:47PM +0000, Raghavendra Rao Ananta wrote:
[...]
> + /*
> + * Make PMCR immutable once the VM has started running, but
> + * do not return an error to meet the existing expectations.
> + */
> + if (kvm_vm_has_ran_once(vcpu->kvm)) {
> + mutex_unlock(&kvm->arch.config_lock);
> + return 0;
> + }
Marc pointed out offline that PMCR_EL0 needs to be mutable at runtime as
well. I'll admit, the architecture isn't very helpful as it is both used
for identification _and_ configuration.
What I had in mind a few revisions ago when I gave the suggestion was to
ignore writes to _just_ the PMCR_EL0.N field after the VM has started.
--
Thanks,
Oliver
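The behaviour described above, ignoring writes to only the N field once the VM has started, could look roughly like this. It is a sketch with hypothetical names; it assumes N occupies bits [15:11] of PMCR_EL0 and leaves out the check that a new N does not exceed what the hardware provides.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PMCR_N_SHIFT 11
#define SKETCH_PMCR_N_MASK  (UINT64_C(0x1f) << SKETCH_PMCR_N_SHIFT)

/*
 * Compute the PMCR value to store for a userspace write. Before the
 * first run the whole value is taken; afterwards the current N is kept
 * and only the remaining bits follow the write. No error is reported.
 */
static uint64_t sketch_set_pmcr(uint64_t cur, uint64_t val, bool vm_has_run)
{
	if (vm_has_run) {
		val &= ~SKETCH_PMCR_N_MASK;
		val |= cur & SKETCH_PMCR_N_MASK;
	}

	return val;
}
```

The configuration bits stay writable at runtime while the identification part of the register is frozen, which is the split the thread asks for.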
On Mon, Oct 23, 2023 at 07:35:44PM +0100, Marc Zyngier wrote:
> On Fri, 20 Oct 2023 22:40:40 +0100,
> Raghavendra Rao Ananta <[email protected]> wrote:
> >
> > Hello,
> >
> > The goal of this series is to allow userspace to limit the number
> > of PMU event counters on the vCPU. We need this to support migration
> > across systems that implement different numbers of counters.
>
> FWIW, I've pushed out a branch[1] with a set of fixes that address
> some of the comments I had on this series. Feel free to squash them in
> your series as you see fit.
I did a second round of fixes on top of what Marc has and pushed that to
a branch [*]. If everything looks good I'll take it for 6.7.
[*] https://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux.git/log/?h=kvm-arm64/pmu_pmcr_n
--
Thanks,
Oliver
On Fri, 20 Oct 2023 21:40:40 +0000, Raghavendra Rao Ananta wrote:
> The goal of this series is to allow userspace to limit the number
> of PMU event counters on the vCPU. We need this to support migration
> across systems that implement different numbers of counters.
>
> The number of PMU event counters is indicated in PMCR_EL0.N.
> For a vCPU with PMUv3 configured, its value will be the same as
> the current PE by default. Userspace can set PMCR_EL0.N for the
> vCPU to any value even with the current KVM using KVM_SET_ONE_REG.
> However, it is practically unsupported, as KVM resets PMCR_EL0.N
> to the host value on vCPU reset and some KVM code uses the host
> value to identify (un)implemented event counters on the vCPU.
>
> [...]
I've applied this with Marc's and my fixes. I'm happy to toss any fixes
on top of this series if folks spot issues.
[01/13] KVM: arm64: PMU: Introduce helpers to set the guest's PMU
https://git.kernel.org/kvmarm/kvmarm/c/1616ca6f3c10
[02/13] KVM: arm64: PMU: Set the default PMU for the guest before vCPU reset
https://git.kernel.org/kvmarm/kvmarm/c/427733579744
[03/13] KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
https://git.kernel.org/kvmarm/kvmarm/c/57fc267f1b5c
[04/13] KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
https://git.kernel.org/kvmarm/kvmarm/c/4d20debf9ca1
[05/13] KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
https://git.kernel.org/kvmarm/kvmarm/c/a45f41d754e0
[06/13] KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
https://git.kernel.org/kvmarm/kvmarm/c/27131b199f9f
[07/13] KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
https://git.kernel.org/kvmarm/kvmarm/c/ea9ca904d24f
[08/13] tools: Import arm_pmuv3.h
https://git.kernel.org/kvmarm/kvmarm/c/9f4b3273dfbe
[09/13] KVM: selftests: aarch64: Introduce vpmu_counter_access test
https://git.kernel.org/kvmarm/kvmarm/c/8d0aebe1ca2b
[10/13] KVM: selftests: aarch64: vPMU register test for implemented counters
https://git.kernel.org/kvmarm/kvmarm/c/ada1ae68262d
[11/13] KVM: selftests: aarch64: vPMU register test for unimplemented counters
https://git.kernel.org/kvmarm/kvmarm/c/e1cc87206348
[12/13] KVM: selftests: aarch64: vPMU test for validating user accesses
https://git.kernel.org/kvmarm/kvmarm/c/62708be351fe
--
Best,
Oliver