Hello,
Continuing the discussion from [1], the series tries to add support
for the userspace to elect the hypercall services that it wishes
to expose to the guest, rather than the guest discovering them
unconditionally. The idea employed by the series was taken from
[1] as suggested by Marc Z.
In a broad sense, the concept is similar to the current implementation
of PSCI interface- create a 'firmware psuedo-register' to handle the
firmware revisions. The series extends this idea to all the other
hypercalls such as TRNG (True Random Number Generator), PV_TIME
(Paravirtualized Time), and PTP (Precision Time protocol).
For better categorization and future scaling, these firmware registers
are categorized based on the service call owners. Also, unlike the
existing firmware psuedo-registers, they hold the features supported
in the form of a bitmap.
During the VM initialization, the registers holds an upper-limit of
the features supported by each one of them. It's expected that the
userspace discover the features provided by each register via GET_ONE_REG,
and writeback the desired values using SET_ONE_REG. KVM allows this
modification only until the VM has started.
Some of the standard function-ids, such as ARM_SMCCC_VERSION_FUNC_ID,
need not be associated with a feature bit. For such ids, the series
introduced an allowed-list, hvc_func_default_allowed_list[], that holds
all such ids. As a result, the functions that are not elected by userspace,
or if they are not a part of this allowed-list, will be denied for when
the guests invoke them.
Older VMMs can simply ignore this interface and the hypercall services
will be exposed unconditionally to the guests, thus ensuring backward
compatibility.
The patches are based off of mainline kernel 5.18-rc1, with the selftest
patches from [2] applied.
Patch-1 factors out the non-PSCI related interface from psci.c to
hypercalls.c, as the series would extend the list in the upcoming
patches.
Patch-2 sets up the framework for the bitmap firmware psuedo-registers.
It includes read/write support for the registers, and a helper to check
if a particular hypercall service is supported for the guest.
It also adds the register KVM_REG_ARM_STD_HYP_BMAP to support ARM's
standard secure services.
Patch-3 introduces the firmware register, KVM_REG_ARM_STD_HYP_BMAP,
which holds the standard hypervisor services (such as PV_TIME).
Patch-4 introduces the firmware register, KVM_REG_ARM_VENDOR_HYP_BMAP,
which holds the vendor specific hypercall services.
Patch-5,6 Add the necessary documentation for the newly added firmware
registers.
Patch-7 imports the SMCCC definitions from linux/arm-smccc.h into tools/
for further use in selftests.
Patch-8 adds the selftest to test the guest (using 'hvc') and userspace
interfaces (SET/GET_ONE_REG).
Patch-9 adds these firmware registers into the get-reg-list selftest.
Patch-10 is unrelated to the series, but adds KVM_REG_ARM_FW_REG(3)
to base_regs[] of get-regs-list selftest for the sake of completion.
[1]: https://lore.kernel.org/kvmarm/[email protected]/T/
[2]: https://lore.kernel.org/kvmarm/[email protected]/
Regards,
Raghavendra
v4 -> v5:
Addressed comments by Oliver (thank you!):
- Rebased the series to accommodate ARM_SMCCC_ARCH_WORKAROUND_3
and PSCI 1.1 changes, and capturing VM's first run.
- Removed the patches related to register scoping (v4 02/13 and
03/13). I plan to re-introduce them in its own series.
- Dropped the patch that captures VM's first run.
- Moved the bitmap feature firmware registers to its own CORPOC
space (0x0016).
- Move the KVM_REG_ARM_*_BIT_MAX definitions from uapi header
to internal header (arm_hypercalls.h).
- Renamed the hypercall descriptor to 'struct kvm_smccc_features',
and kvm_hvc_call_supported() to kvm_hvc_call_allowed().
- Introduced an allowed-list to hold the function-ids that aren't
represented by feature-bits.
- Introduced kvm_psci_func_id_is_valid() to check if a given
function-id is a valid PSCI id, which is used in
kvm_hvc_call_allowed().
- Introduced KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT as bit-0 of
KVM_REG_ARM_VENDOR_HYP_BMAP register and
KVM_REG_ARM_VENDOR_HYP_BIT_PTP is moved to bit-1.
- Updated the arm-smccc.h import to include the definition of
ARM_SMCCC_ARCH_WORKAROUND_3.
- Introduced the KVM_REG_ARM_FW_FEAT_BMAP COPROC definition to
get-reg-list selftest.
- Created a new patch to include KVM_REG_ARM_FW_REG(3) in
get-reg-list.
v3 -> v4
Addressed comments and took suggestions by Reiji, Oliver, Marc,
Sean and Jim:
- Renamed and moved the VM has run once check to arm64.
- Introduced the capability to dynamically modify the register
encodings to include the scope information.
- Replaced mutex_lock with READ_ONCE and WRITE_ONCE when the
bitmaps are accessed.
- The hypercalls selftest re-runs with KVM_CAP_ARM_REG_SCOPE
enabled.
v2 -> v3
Addressed comments by Marc and Andrew:
- Dropped kvm_vcpu_has_run_once() implementation.
- Redifined kvm_vm_has_run_once() as kvm_vm_has_started() in the core
KVM code that introduces a new field, 'vm_started', to track this.
- KVM_CAP_ARM_HVC_FW_REG_BMAP returns the number of psuedo-firmware
bitmap registers upon a 'read'. Support for 'write' removed.
- Removed redundant spinlock, 'fw_reg_bmap_enabled' fields from the
hypercall descriptor structure.
- A separate sub-struct to hold the bitmap info is removed. The bitmap
info is directly stored in the hypercall descriptor structure
(struct kvm_hvc_desc).
v1 -> v2
Addressed comments by Oliver (thanks!):
- Introduced kvm_vcpu_has_run_once() and kvm_vm_has_run_once() in the
core kvm code, rather than relying on ARM specific
vcpu->arch.has_run_once.
- Writing to KVM_REG_ARM_PSCI_VERSION is done in hypercalls.c itself,
rather than separating out to psci.c.
- Introduced KVM_CAP_ARM_HVC_FW_REG_BMAP to enable the extension.
- Tracks the register accesses from VMM to decide whether to sanitize
a register or not, as opposed to sanitizing upon the first 'write'
in v1.
- kvm_hvc_call_supported() is implemented using a direct switch-case
statement, instead of looping over all the registers to pick the
register for the function-id.
- Replaced the register bit definitions with #defines, instead of enums.
- Removed the patch v1-06/08 that imports the firmware register
definitions as it's not needed.
- Separated out the documentations in its own patch, and the renaming
of hypercalls.rst to psci.rst into another patch.
- Add the new firmware registers to get-reg-list KVM selftest.
v1: https://lore.kernel.org/kvmarm/[email protected]/
v2: https://lore.kernel.org/kvmarm/[email protected]/
v3: https://lore.kernel.org/linux-arm-kernel/[email protected]/
v4: https://lore.kernel.org/lkml/[email protected]/
Raghavendra Rao Ananta (10):
KVM: arm64: Factor out firmware register handling from psci.c
KVM: arm64: Setup a framework for hypercall bitmap firmware registers
KVM: arm64: Add standard hypervisor firmware register
KVM: arm64: Add vendor hypervisor firmware register
Docs: KVM: Rename psci.rst to hypercalls.rst
Docs: KVM: Add doc for the bitmap firmware registers
tools: Import ARM SMCCC definitions
selftests: KVM: aarch64: Introduce hypercall ABI test
selftests: KVM: aarch64: Add the bitmap firmware registers to
get-reg-list
selftests: KVM: aarch64: Add KVM_REG_ARM_FW_REG(3) to get-reg-list
Documentation/virt/kvm/api.rst | 17 +
Documentation/virt/kvm/arm/hypercalls.rst | 136 +++++++
Documentation/virt/kvm/arm/psci.rst | 77 ----
arch/arm64/include/asm/kvm_host.h | 16 +
arch/arm64/include/uapi/asm/kvm.h | 16 +
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/guest.c | 10 +-
arch/arm64/kvm/hypercalls.c | 321 +++++++++++++++-
arch/arm64/kvm/psci.c | 183 ----------
include/kvm/arm_hypercalls.h | 22 ++
include/kvm/arm_psci.h | 17 +-
tools/include/linux/arm-smccc.h | 193 ++++++++++
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/aarch64/get-reg-list.c | 9 +
.../selftests/kvm/aarch64/hypercalls.c | 344 ++++++++++++++++++
16 files changed, 1092 insertions(+), 272 deletions(-)
create mode 100644 Documentation/virt/kvm/arm/hypercalls.rst
delete mode 100644 Documentation/virt/kvm/arm/psci.rst
create mode 100644 tools/include/linux/arm-smccc.h
create mode 100644 tools/testing/selftests/kvm/aarch64/hypercalls.c
--
2.35.1.1094.g7c7d902a7c-goog
Since the doc also covers general hypercalls' details,
rather than just PSCI, and the fact that the bitmap firmware
registers' details will be added to this doc, rename the file
to a more appropriate name- hypercalls.rst.
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
---
Documentation/virt/kvm/arm/{psci.rst => hypercalls.rst} | 0
1 file changed, 0 insertions(+), 0 deletions(-)
rename Documentation/virt/kvm/arm/{psci.rst => hypercalls.rst} (100%)
diff --git a/Documentation/virt/kvm/arm/psci.rst b/Documentation/virt/kvm/arm/hypercalls.rst
similarity index 100%
rename from Documentation/virt/kvm/arm/psci.rst
rename to Documentation/virt/kvm/arm/hypercalls.rst
--
2.35.1.1094.g7c7d902a7c-goog
Common hypercall firmware register handing is currently employed
by psci.c. Since the upcoming patches add more of these registers,
it's better to move the generic handling to hypercall.c for a
cleaner presentation.
While we are at it, collect all the firmware registers under
fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.
No functional change intended.
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
---
arch/arm64/kvm/guest.c | 2 +-
arch/arm64/kvm/hypercalls.c | 185 +++++++++++++++++++++++++++++++++++
arch/arm64/kvm/psci.c | 183 ----------------------------------
include/kvm/arm_hypercalls.h | 7 ++
include/kvm/arm_psci.h | 7 --
5 files changed, 193 insertions(+), 191 deletions(-)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 7e15b03fbdf8..0d5cca56cbda 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -18,7 +18,7 @@
#include <linux/string.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
-#include <kvm/arm_psci.h>
+#include <kvm/arm_hypercalls.h>
#include <asm/cputype.h>
#include <linux/uaccess.h>
#include <asm/fpsimd.h>
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 202b8c455724..fa6d9378d8e7 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -158,3 +158,188 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
return 1;
}
+
+static const u64 kvm_arm_fw_reg_ids[] = {
+ KVM_REG_ARM_PSCI_VERSION,
+ KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
+ KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
+ KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
+};
+
+int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
+{
+ return ARRAY_SIZE(kvm_arm_fw_reg_ids);
+}
+
+int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(kvm_arm_fw_reg_ids); i++) {
+ if (put_user(kvm_arm_fw_reg_ids[i], uindices++))
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+#define KVM_REG_FEATURE_LEVEL_WIDTH 4
+#define KVM_REG_FEATURE_LEVEL_MASK GENMASK(KVM_REG_FEATURE_LEVEL_WIDTH, 0)
+
+/*
+ * Convert the workaround level into an easy-to-compare number, where higher
+ * values mean better protection.
+ */
+static int get_kernel_wa_level(u64 regid)
+{
+ switch (regid) {
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
+ switch (arm64_get_spectre_v2_state()) {
+ case SPECTRE_VULNERABLE:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
+ case SPECTRE_MITIGATED:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
+ case SPECTRE_UNAFFECTED:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
+ }
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
+ switch (arm64_get_spectre_v4_state()) {
+ case SPECTRE_MITIGATED:
+ /*
+ * As for the hypercall discovery, we pretend we
+ * don't have any FW mitigation if SSBS is there at
+ * all times.
+ */
+ if (cpus_have_final_cap(ARM64_SSBS))
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+ fallthrough;
+ case SPECTRE_UNAFFECTED:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
+ case SPECTRE_VULNERABLE:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+ }
+ break;
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
+ switch (arm64_get_spectre_bhb_state()) {
+ case SPECTRE_VULNERABLE:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
+ case SPECTRE_MITIGATED:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
+ case SPECTRE_UNAFFECTED:
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
+ }
+ return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
+ }
+
+ return -EINVAL;
+}
+
+int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+ void __user *uaddr = (void __user *)(long)reg->addr;
+ u64 val;
+
+ switch (reg->id) {
+ case KVM_REG_ARM_PSCI_VERSION:
+ val = kvm_psci_version(vcpu);
+ break;
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
+ val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
+ break;
+ default:
+ return -ENOENT;
+ }
+
+ if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ return 0;
+}
+
+int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+ void __user *uaddr = (void __user *)(long)reg->addr;
+ u64 val;
+ int wa_level;
+
+ if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ switch (reg->id) {
+ case KVM_REG_ARM_PSCI_VERSION:
+ {
+ bool wants_02;
+
+ wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
+
+ switch (val) {
+ case KVM_ARM_PSCI_0_1:
+ if (wants_02)
+ return -EINVAL;
+ vcpu->kvm->arch.psci_version = val;
+ return 0;
+ case KVM_ARM_PSCI_0_2:
+ case KVM_ARM_PSCI_1_0:
+ case KVM_ARM_PSCI_1_1:
+ if (!wants_02)
+ return -EINVAL;
+ vcpu->kvm->arch.psci_version = val;
+ return 0;
+ }
+ break;
+ }
+
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
+ if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
+ return -EINVAL;
+
+ if (get_kernel_wa_level(reg->id) < val)
+ return -EINVAL;
+
+ return 0;
+
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
+ if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
+ KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
+ return -EINVAL;
+
+ /* The enabled bit must not be set unless the level is AVAIL. */
+ if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
+ (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
+ return -EINVAL;
+
+ /*
+ * Map all the possible incoming states to the only two we
+ * really want to deal with.
+ */
+ switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
+ wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+ break;
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
+ case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
+ wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /*
+ * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
+ * other way around.
+ */
+ if (get_kernel_wa_level(reg->id) < wa_level)
+ return -EINVAL;
+
+ return 0;
+ default:
+ return -ENOENT;
+ }
+
+ return -EINVAL;
+}
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 372da09a2fab..bdfa93ca57d1 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -439,186 +439,3 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
return -EINVAL;
}
}
-
-int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
-{
- return 4; /* PSCI version and three workaround registers */
-}
-
-int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
-{
- if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
- return -EFAULT;
-
- if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
- return -EFAULT;
-
- if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
- return -EFAULT;
-
- if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3, uindices++))
- return -EFAULT;
-
- return 0;
-}
-
-#define KVM_REG_FEATURE_LEVEL_WIDTH 4
-#define KVM_REG_FEATURE_LEVEL_MASK (BIT(KVM_REG_FEATURE_LEVEL_WIDTH) - 1)
-
-/*
- * Convert the workaround level into an easy-to-compare number, where higher
- * values mean better protection.
- */
-static int get_kernel_wa_level(u64 regid)
-{
- switch (regid) {
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
- switch (arm64_get_spectre_v2_state()) {
- case SPECTRE_VULNERABLE:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
- case SPECTRE_MITIGATED:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
- case SPECTRE_UNAFFECTED:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
- }
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
- switch (arm64_get_spectre_v4_state()) {
- case SPECTRE_MITIGATED:
- /*
- * As for the hypercall discovery, we pretend we
- * don't have any FW mitigation if SSBS is there at
- * all times.
- */
- if (cpus_have_final_cap(ARM64_SSBS))
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
- fallthrough;
- case SPECTRE_UNAFFECTED:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
- case SPECTRE_VULNERABLE:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
- }
- break;
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
- switch (arm64_get_spectre_bhb_state()) {
- case SPECTRE_VULNERABLE:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
- case SPECTRE_MITIGATED:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
- case SPECTRE_UNAFFECTED:
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
- }
- return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
- }
-
- return -EINVAL;
-}
-
-int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
-{
- void __user *uaddr = (void __user *)(long)reg->addr;
- u64 val;
-
- switch (reg->id) {
- case KVM_REG_ARM_PSCI_VERSION:
- val = kvm_psci_version(vcpu);
- break;
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
- val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
- break;
- default:
- return -ENOENT;
- }
-
- if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
- return -EFAULT;
-
- return 0;
-}
-
-int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
-{
- void __user *uaddr = (void __user *)(long)reg->addr;
- u64 val;
- int wa_level;
-
- if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
- return -EFAULT;
-
- switch (reg->id) {
- case KVM_REG_ARM_PSCI_VERSION:
- {
- bool wants_02;
-
- wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
-
- switch (val) {
- case KVM_ARM_PSCI_0_1:
- if (wants_02)
- return -EINVAL;
- vcpu->kvm->arch.psci_version = val;
- return 0;
- case KVM_ARM_PSCI_0_2:
- case KVM_ARM_PSCI_1_0:
- case KVM_ARM_PSCI_1_1:
- if (!wants_02)
- return -EINVAL;
- vcpu->kvm->arch.psci_version = val;
- return 0;
- }
- break;
- }
-
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
- if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
- return -EINVAL;
-
- if (get_kernel_wa_level(reg->id) < val)
- return -EINVAL;
-
- return 0;
-
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
- if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
- KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
- return -EINVAL;
-
- /* The enabled bit must not be set unless the level is AVAIL. */
- if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
- (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
- return -EINVAL;
-
- /*
- * Map all the possible incoming states to the only two we
- * really want to deal with.
- */
- switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
- wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
- break;
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
- case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
- wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
- break;
- default:
- return -EINVAL;
- }
-
- /*
- * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
- * other way around.
- */
- if (get_kernel_wa_level(reg->id) < wa_level)
- return -EINVAL;
-
- return 0;
- default:
- return -ENOENT;
- }
-
- return -EINVAL;
-}
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 0e2509d27910..5d38628a8d04 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -40,4 +40,11 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
vcpu_set_reg(vcpu, 3, a3);
}
+struct kvm_one_reg;
+
+int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
+int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
+int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+
#endif
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 68b96c3826c3..6e55b9283789 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -39,11 +39,4 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
int kvm_psci_call(struct kvm_vcpu *vcpu);
-struct kvm_one_reg;
-
-int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
-int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
-int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
-int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
-
#endif /* __KVM_ARM_PSCI_H__ */
--
2.35.1.1094.g7c7d902a7c-goog
Add the psuedo-firmware registers KVM_REG_ARM_STD_BMAP,
KVM_REG_ARM_STD_HYP_BMAP, and KVM_REG_ARM_VENDOR_HYP_BMAP to
the base_regs[] list.
Also, add the COPROC support for KVM_REG_ARM_FW_FEAT_BMAP.
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
index f12147c43464..281c08b3fdd2 100644
--- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
@@ -294,6 +294,11 @@ static void print_reg(struct vcpu_config *c, __u64 id)
"%s: Unexpected bits set in FW reg id: 0x%llx", config_name(c), id);
printf("\tKVM_REG_ARM_FW_REG(%lld),\n", id & 0xffff);
break;
+ case KVM_REG_ARM_FW_FEAT_BMAP:
+ TEST_ASSERT(id == KVM_REG_ARM_FW_FEAT_BMAP_REG(id & 0xffff),
+ "%s: Unexpected bits set in the bitmap feature FW reg id: 0x%llx", config_name(c), id);
+ printf("\tKVM_REG_ARM_FW_FEAT_BMAP_REG(%lld),\n", id & 0xffff);
+ break;
case KVM_REG_ARM64_SVE:
if (has_cap(c, KVM_CAP_ARM_SVE))
printf("\t%s,\n", sve_id_to_str(c, id));
@@ -686,6 +691,9 @@ static __u64 base_regs[] = {
KVM_REG_ARM_FW_REG(0),
KVM_REG_ARM_FW_REG(1),
KVM_REG_ARM_FW_REG(2),
+ KVM_REG_ARM_FW_FEAT_BMAP_REG(0), /* KVM_REG_ARM_STD_BMAP */
+ KVM_REG_ARM_FW_FEAT_BMAP_REG(1), /* KVM_REG_ARM_STD_HYP_BMAP */
+ KVM_REG_ARM_FW_FEAT_BMAP_REG(2), /* KVM_REG_ARM_VENDOR_HYP_BMAP */
ARM64_SYS_REG(3, 3, 14, 3, 1), /* CNTV_CTL_EL0 */
ARM64_SYS_REG(3, 3, 14, 3, 2), /* CNTV_CVAL_EL0 */
ARM64_SYS_REG(3, 3, 14, 0, 2),
--
2.35.1.1094.g7c7d902a7c-goog
Introduce the firmware register to hold the vendor specific
hypervisor service calls (owner value 6) as a bitmap. The
bitmap represents the features that'll be enabled for the
guest, as configured by the user-space. Currently, this
includes support for KVM-vendor features, and Precision Time
Protocol (PTP), represented by bit-0 and bit-1 respectively.
Signed-off-by: Raghavendra Rao Ananta <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 2 ++
arch/arm64/include/uapi/asm/kvm.h | 4 ++++
arch/arm64/kvm/hypercalls.c | 21 +++++++++++++++++----
include/kvm/arm_hypercalls.h | 4 ++++
4 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 20165242ebd9..b79161bad69a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -106,10 +106,12 @@ struct kvm_arch_memory_slot {
*
* @std_bmap: Bitmap of standard secure service calls
* @std_hyp_bmap: Bitmap of standard hypervisor service calls
+ * @vendor_hyp_bmap: Bitmap of vendor specific hypervisor service calls
*/
struct kvm_smccc_features {
u64 std_bmap;
u64 std_hyp_bmap;
+ u64 vendor_hyp_bmap;
};
struct kvm_arch {
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 67353bf4e69d..9a5ac0ed4113 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -344,6 +344,10 @@ struct kvm_arm_copy_mte_tags {
#define KVM_REG_ARM_STD_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(1)
#define KVM_REG_ARM_STD_HYP_BIT_PV_TIME BIT(0)
+#define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
+#define KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT BIT(0)
+#define KVM_REG_ARM_VENDOR_HYP_BIT_PTP BIT(1)
+
/* Device Control API: ARM VGIC */
#define KVM_DEV_ARM_VGIC_GRP_ADDR 0
#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 64ae6c7e7145..80836c341fd3 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -66,8 +66,6 @@ static const u32 hvc_func_default_allowed_list[] = {
ARM_SMCCC_VERSION_FUNC_ID,
ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID,
- ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
- ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
};
static bool kvm_hvc_call_default_allowed(struct kvm_vcpu *vcpu, u32 func_id)
@@ -102,6 +100,12 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
case ARM_SMCCC_HV_PV_TIME_ST:
return kvm_arm_fw_reg_feat_enabled(smccc_feat->std_hyp_bmap,
KVM_REG_ARM_STD_HYP_BIT_PV_TIME);
+ case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
+ return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
+ KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT);
+ case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
+ return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
+ KVM_REG_ARM_VENDOR_HYP_BIT_PTP);
default:
return kvm_hvc_call_default_allowed(vcpu, func_id);
}
@@ -194,8 +198,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
break;
case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
- val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
- val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
+ val[0] = smccc_feat->vendor_hyp_bmap;
break;
case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
kvm_ptp_get_time(vcpu, val);
@@ -222,6 +225,7 @@ static const u64 kvm_arm_fw_reg_ids[] = {
KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
KVM_REG_ARM_STD_BMAP,
KVM_REG_ARM_STD_HYP_BMAP,
+ KVM_REG_ARM_VENDOR_HYP_BMAP,
};
void kvm_arm_init_hypercalls(struct kvm *kvm)
@@ -230,6 +234,7 @@ void kvm_arm_init_hypercalls(struct kvm *kvm)
smccc_feat->std_bmap = KVM_ARM_SMCCC_STD_FEATURES;
smccc_feat->std_hyp_bmap = KVM_ARM_SMCCC_STD_HYP_FEATURES;
+ smccc_feat->vendor_hyp_bmap = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
}
int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
@@ -322,6 +327,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
case KVM_REG_ARM_STD_HYP_BMAP:
val = READ_ONCE(smccc_feat->std_hyp_bmap);
break;
+ case KVM_REG_ARM_VENDOR_HYP_BMAP:
+ val = READ_ONCE(smccc_feat->vendor_hyp_bmap);
+ break;
default:
return -ENOENT;
}
@@ -348,6 +356,10 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
fw_reg_bmap = &smccc_feat->std_hyp_bmap;
fw_reg_features = KVM_ARM_SMCCC_STD_HYP_FEATURES;
break;
+ case KVM_REG_ARM_VENDOR_HYP_BMAP:
+ fw_reg_bmap = &smccc_feat->vendor_hyp_bmap;
+ fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
+ break;
default:
return -ENOENT;
}
@@ -453,6 +465,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
return 0;
case KVM_REG_ARM_STD_BMAP:
case KVM_REG_ARM_STD_HYP_BMAP:
+ case KVM_REG_ARM_VENDOR_HYP_BMAP:
return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val);
default:
return -ENOENT;
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index b0915d8c5b81..eaf4f6b318a8 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -9,6 +9,7 @@
/* Last valid bits of the bitmapped firmware registers */
#define KVM_REG_ARM_STD_BMAP_BIT_MAX 0
#define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0
+#define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1
#define KVM_ARM_SMCCC_STD_FEATURES \
GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
@@ -16,6 +17,9 @@
#define KVM_ARM_SMCCC_STD_HYP_FEATURES \
GENMASK_ULL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX, 0)
+#define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \
+ GENMASK_ULL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX, 0)
+
int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
--
2.35.1.1094.g7c7d902a7c-goog
Hi Gavin,
On Tue, Apr 12, 2022 at 12:07 AM Gavin Shan <[email protected]> wrote:
>
> Hi Raghavendra,
>
> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> > Common hypercall firmware register handing is currently employed
> > by psci.c. Since the upcoming patches add more of these registers,
> > it's better to move the generic handling to hypercall.c for a
> > cleaner presentation.
> >
> > While we are at it, collect all the firmware registers under
> > fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
> > kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
> > KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.
> >
> > No functional change intended.
> >
> > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > Reviewed-by: Oliver Upton <[email protected]>
> > ---
> > arch/arm64/kvm/guest.c | 2 +-
> > arch/arm64/kvm/hypercalls.c | 185 +++++++++++++++++++++++++++++++++++
> > arch/arm64/kvm/psci.c | 183 ----------------------------------
> > include/kvm/arm_hypercalls.h | 7 ++
> > include/kvm/arm_psci.h | 7 --
> > 5 files changed, 193 insertions(+), 191 deletions(-)
> >
>
> Apart from the below nits:
>
> Reviewed-by: Gavin Shan <[email protected]>
>
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 7e15b03fbdf8..0d5cca56cbda 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -18,7 +18,7 @@
> > #include <linux/string.h>
> > #include <linux/vmalloc.h>
> > #include <linux/fs.h>
> > -#include <kvm/arm_psci.h>
> > +#include <kvm/arm_hypercalls.h>
> > #include <asm/cputype.h>
> > #include <linux/uaccess.h>
> > #include <asm/fpsimd.h>
> > diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> > index 202b8c455724..fa6d9378d8e7 100644
> > --- a/arch/arm64/kvm/hypercalls.c
> > +++ b/arch/arm64/kvm/hypercalls.c
> > @@ -158,3 +158,188 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> > smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
> > return 1;
> > }
> > +
> > +static const u64 kvm_arm_fw_reg_ids[] = {
> > + KVM_REG_ARM_PSCI_VERSION,
> > + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> > + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> > + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
> > +};
> > +
> > +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> > +{
> > + return ARRAY_SIZE(kvm_arm_fw_reg_ids);
> > +}
> > +
> > +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < ARRAY_SIZE(kvm_arm_fw_reg_ids); i++) {
> > + if (put_user(kvm_arm_fw_reg_ids[i], uindices++))
> > + return -EFAULT;
> > + }
> > +
> > + return 0;
> > +}
> > +
>
> Since we're here, I think we can make this function to return 'ARRAY_SIZE(kvm_arm_fw_reg_ids)',
> to be consistent with copy_{core, sve}_reg_indices(). With the return value fixed, additional
> patch can use @ret in kvm_arm_copy_reg_indices().
>
Well we can, however, since this is mostly refactoring, I didn't want
to change the original functionality of the code.
The only caller of this is kvm_arm_copy_reg_indices()
(arch/arm64/kvm/guest.c), which only checks for 'ret < 0'.
Also, do you have a need for it? If yes, I can change it in the next revision.
> > +#define KVM_REG_FEATURE_LEVEL_WIDTH 4
> > +#define KVM_REG_FEATURE_LEVEL_MASK GENMASK(KVM_REG_FEATURE_LEVEL_WIDTH, 0)
> > +
>
> It seems 'BIT()' is replaced with 'GENMASK' in the movement, but it's not mentioned
> in the commit log. I guess it'd better to mention it if you agree.
>
The last sentence of the commit text mentions this :)
Please let me know if it's not clear.
> > +/*
> > + * Convert the workaround level into an easy-to-compare number, where higher
> > + * values mean better protection.
> > + */
> > +static int get_kernel_wa_level(u64 regid)
> > +{
> > + switch (regid) {
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > + switch (arm64_get_spectre_v2_state()) {
> > + case SPECTRE_VULNERABLE:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> > + case SPECTRE_MITIGATED:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> > + case SPECTRE_UNAFFECTED:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
> > + }
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > + switch (arm64_get_spectre_v4_state()) {
> > + case SPECTRE_MITIGATED:
> > + /*
> > + * As for the hypercall discovery, we pretend we
> > + * don't have any FW mitigation if SSBS is there at
> > + * all times.
> > + */
> > + if (cpus_have_final_cap(ARM64_SSBS))
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > + fallthrough;
> > + case SPECTRE_UNAFFECTED:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> > + case SPECTRE_VULNERABLE:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > + }
> > + break;
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > + switch (arm64_get_spectre_bhb_state()) {
> > + case SPECTRE_VULNERABLE:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> > + case SPECTRE_MITIGATED:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
> > + case SPECTRE_UNAFFECTED:
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
> > + }
> > + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> > + }
> > +
> > + return -EINVAL;
> > +}
> > +
> > +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > + void __user *uaddr = (void __user *)(long)reg->addr;
> > + u64 val;
> > +
> > + switch (reg->id) {
> > + case KVM_REG_ARM_PSCI_VERSION:
> > + val = kvm_psci_version(vcpu);
> > + break;
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > + val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
> > + break;
> > + default:
> > + return -ENOENT;
> > + }
> > +
> > + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> > +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > + void __user *uaddr = (void __user *)(long)reg->addr;
> > + u64 val;
> > + int wa_level;
> > +
> > + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> > + return -EFAULT;
> > +
> > + switch (reg->id) {
> > + case KVM_REG_ARM_PSCI_VERSION:
> > + {
> > + bool wants_02;
> > +
> > + wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
> > +
> > + switch (val) {
> > + case KVM_ARM_PSCI_0_1:
> > + if (wants_02)
> > + return -EINVAL;
> > + vcpu->kvm->arch.psci_version = val;
> > + return 0;
> > + case KVM_ARM_PSCI_0_2:
> > + case KVM_ARM_PSCI_1_0:
> > + case KVM_ARM_PSCI_1_1:
> > + if (!wants_02)
> > + return -EINVAL;
> > + vcpu->kvm->arch.psci_version = val;
> > + return 0;
> > + }
> > + break;
> > + }
> > +
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > + if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
> > + return -EINVAL;
> > +
> > + if (get_kernel_wa_level(reg->id) < val)
> > + return -EINVAL;
> > +
> > + return 0;
> > +
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > + if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
> > + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> > + return -EINVAL;
> > +
> > + /* The enabled bit must not be set unless the level is AVAIL. */
> > + if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
> > + (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
> > + return -EINVAL;
> > +
> > + /*
> > + * Map all the possible incoming states to the only two we
> > + * really want to deal with.
> > + */
> > + switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
> > + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > + break;
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> > + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
> > + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + /*
> > + * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
> > + * other way around.
> > + */
> > + if (get_kernel_wa_level(reg->id) < wa_level)
> > + return -EINVAL;
> > +
> > + return 0;
> > + default:
> > + return -ENOENT;
> > + }
> > +
> > + return -EINVAL;
> > +}
> > diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> > index 372da09a2fab..bdfa93ca57d1 100644
> > --- a/arch/arm64/kvm/psci.c
> > +++ b/arch/arm64/kvm/psci.c
> > @@ -439,186 +439,3 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
> > return -EINVAL;
> > }
> > }
> > -
> > -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> > -{
> > - return 4; /* PSCI version and three workaround registers */
> > -}
> > -
> > -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> > -{
> > - if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> > - return -EFAULT;
> > -
> > - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
> > - return -EFAULT;
> > -
> > - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
> > - return -EFAULT;
> > -
> > - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3, uindices++))
> > - return -EFAULT;
> > -
> > - return 0;
> > -}
> > -
> > -#define KVM_REG_FEATURE_LEVEL_WIDTH 4
> > -#define KVM_REG_FEATURE_LEVEL_MASK (BIT(KVM_REG_FEATURE_LEVEL_WIDTH) - 1)
> > -
> > -/*
> > - * Convert the workaround level into an easy-to-compare number, where higher
> > - * values mean better protection.
> > - */
> > -static int get_kernel_wa_level(u64 regid)
> > -{
> > - switch (regid) {
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > - switch (arm64_get_spectre_v2_state()) {
> > - case SPECTRE_VULNERABLE:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> > - case SPECTRE_MITIGATED:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> > - case SPECTRE_UNAFFECTED:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
> > - }
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > - switch (arm64_get_spectre_v4_state()) {
> > - case SPECTRE_MITIGATED:
> > - /*
> > - * As for the hypercall discovery, we pretend we
> > - * don't have any FW mitigation if SSBS is there at
> > - * all times.
> > - */
> > - if (cpus_have_final_cap(ARM64_SSBS))
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > - fallthrough;
> > - case SPECTRE_UNAFFECTED:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> > - case SPECTRE_VULNERABLE:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > - }
> > - break;
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > - switch (arm64_get_spectre_bhb_state()) {
> > - case SPECTRE_VULNERABLE:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> > - case SPECTRE_MITIGATED:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
> > - case SPECTRE_UNAFFECTED:
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
> > - }
> > - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> > - }
> > -
> > - return -EINVAL;
> > -}
> > -
> > -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > -{
> > - void __user *uaddr = (void __user *)(long)reg->addr;
> > - u64 val;
> > -
> > - switch (reg->id) {
> > - case KVM_REG_ARM_PSCI_VERSION:
> > - val = kvm_psci_version(vcpu);
> > - break;
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > - val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
> > - break;
> > - default:
> > - return -ENOENT;
> > - }
> > -
> > - if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> > - return -EFAULT;
> > -
> > - return 0;
> > -}
> > -
> > -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > -{
> > - void __user *uaddr = (void __user *)(long)reg->addr;
> > - u64 val;
> > - int wa_level;
> > -
> > - if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> > - return -EFAULT;
> > -
> > - switch (reg->id) {
> > - case KVM_REG_ARM_PSCI_VERSION:
> > - {
> > - bool wants_02;
> > -
> > - wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
> > -
> > - switch (val) {
> > - case KVM_ARM_PSCI_0_1:
> > - if (wants_02)
> > - return -EINVAL;
> > - vcpu->kvm->arch.psci_version = val;
> > - return 0;
> > - case KVM_ARM_PSCI_0_2:
> > - case KVM_ARM_PSCI_1_0:
> > - case KVM_ARM_PSCI_1_1:
> > - if (!wants_02)
> > - return -EINVAL;
> > - vcpu->kvm->arch.psci_version = val;
> > - return 0;
> > - }
> > - break;
> > - }
> > -
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> > - if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
> > - return -EINVAL;
> > -
> > - if (get_kernel_wa_level(reg->id) < val)
> > - return -EINVAL;
> > -
> > - return 0;
> > -
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> > - if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
> > - KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> > - return -EINVAL;
> > -
> > - /* The enabled bit must not be set unless the level is AVAIL. */
> > - if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
> > - (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
> > - return -EINVAL;
> > -
> > - /*
> > - * Map all the possible incoming states to the only two we
> > - * really want to deal with.
> > - */
> > - switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
> > - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> > - break;
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> > - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
> > - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> > - break;
> > - default:
> > - return -EINVAL;
> > - }
> > -
> > - /*
> > - * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
> > - * other way around.
> > - */
> > - if (get_kernel_wa_level(reg->id) < wa_level)
> > - return -EINVAL;
> > -
> > - return 0;
> > - default:
> > - return -ENOENT;
> > - }
> > -
> > - return -EINVAL;
> > -}
> > diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> > index 0e2509d27910..5d38628a8d04 100644
> > --- a/include/kvm/arm_hypercalls.h
> > +++ b/include/kvm/arm_hypercalls.h
> > @@ -40,4 +40,11 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
> > vcpu_set_reg(vcpu, 3, a3);
> > }
> >
> > +struct kvm_one_reg;
> > +
> > +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> > +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> > +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> > +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> > +
> > #endif
> > diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> > index 68b96c3826c3..6e55b9283789 100644
> > --- a/include/kvm/arm_psci.h
> > +++ b/include/kvm/arm_psci.h
> > @@ -39,11 +39,4 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
> >
> > int kvm_psci_call(struct kvm_vcpu *vcpu);
> >
> > -struct kvm_one_reg;
> > -
> > -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> > -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> > -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> > -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> > -
> > #endif /* __KVM_ARM_PSCI_H__ */
> >
>
> Thanks,
> Gavin
Thank you for the review, Gavin.
Regards,
Raghavendra
>
Hi Raghavendra,
On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> Common hypercall firmware register handing is currently employed
> by psci.c. Since the upcoming patches add more of these registers,
> it's better to move the generic handling to hypercall.c for a
> cleaner presentation.
>
> While we are at it, collect all the firmware registers under
> fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
> kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
> KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.
>
> No functional change intended.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> Reviewed-by: Oliver Upton <[email protected]>
> ---
> arch/arm64/kvm/guest.c | 2 +-
> arch/arm64/kvm/hypercalls.c | 185 +++++++++++++++++++++++++++++++++++
> arch/arm64/kvm/psci.c | 183 ----------------------------------
> include/kvm/arm_hypercalls.h | 7 ++
> include/kvm/arm_psci.h | 7 --
> 5 files changed, 193 insertions(+), 191 deletions(-)
>
Apart from the below nits:
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 7e15b03fbdf8..0d5cca56cbda 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -18,7 +18,7 @@
> #include <linux/string.h>
> #include <linux/vmalloc.h>
> #include <linux/fs.h>
> -#include <kvm/arm_psci.h>
> +#include <kvm/arm_hypercalls.h>
> #include <asm/cputype.h>
> #include <linux/uaccess.h>
> #include <asm/fpsimd.h>
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 202b8c455724..fa6d9378d8e7 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -158,3 +158,188 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
> return 1;
> }
> +
> +static const u64 kvm_arm_fw_reg_ids[] = {
> + KVM_REG_ARM_PSCI_VERSION,
> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
> +};
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> +{
> + return ARRAY_SIZE(kvm_arm_fw_reg_ids);
> +}
> +
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> +{
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(kvm_arm_fw_reg_ids); i++) {
> + if (put_user(kvm_arm_fw_reg_ids[i], uindices++))
> + return -EFAULT;
> + }
> +
> + return 0;
> +}
> +
Since we're here, I think we can make this function to return 'ARRAY_SIZE(kvm_arm_fw_reg_ids)',
to be consistent with copy_{core, sve}_reg_indices(). With the return value fixed, additional
patch can use @ret in kvm_arm_copy_reg_indices().
> +#define KVM_REG_FEATURE_LEVEL_WIDTH 4
> +#define KVM_REG_FEATURE_LEVEL_MASK GENMASK(KVM_REG_FEATURE_LEVEL_WIDTH, 0)
> +
It seems 'BIT()' is replaced with 'GENMASK' in the movement, but it's not mentioned
in the commit log. I guess it'd better to mention it if you agree.
> +/*
> + * Convert the workaround level into an easy-to-compare number, where higher
> + * values mean better protection.
> + */
> +static int get_kernel_wa_level(u64 regid)
> +{
> + switch (regid) {
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> + switch (arm64_get_spectre_v2_state()) {
> + case SPECTRE_VULNERABLE:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> + case SPECTRE_MITIGATED:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> + case SPECTRE_UNAFFECTED:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
> + }
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> + switch (arm64_get_spectre_v4_state()) {
> + case SPECTRE_MITIGATED:
> + /*
> + * As for the hypercall discovery, we pretend we
> + * don't have any FW mitigation if SSBS is there at
> + * all times.
> + */
> + if (cpus_have_final_cap(ARM64_SSBS))
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> + fallthrough;
> + case SPECTRE_UNAFFECTED:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> + case SPECTRE_VULNERABLE:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> + }
> + break;
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> + switch (arm64_get_spectre_bhb_state()) {
> + case SPECTRE_VULNERABLE:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> + case SPECTRE_MITIGATED:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
> + case SPECTRE_UNAFFECTED:
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
> + }
> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> + }
> +
> + return -EINVAL;
> +}
> +
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> +
> + switch (reg->id) {
> + case KVM_REG_ARM_PSCI_VERSION:
> + val = kvm_psci_version(vcpu);
> + break;
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> + val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
> + break;
> + default:
> + return -ENOENT;
> + }
> +
> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> + int wa_level;
> +
> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + switch (reg->id) {
> + case KVM_REG_ARM_PSCI_VERSION:
> + {
> + bool wants_02;
> +
> + wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
> +
> + switch (val) {
> + case KVM_ARM_PSCI_0_1:
> + if (wants_02)
> + return -EINVAL;
> + vcpu->kvm->arch.psci_version = val;
> + return 0;
> + case KVM_ARM_PSCI_0_2:
> + case KVM_ARM_PSCI_1_0:
> + case KVM_ARM_PSCI_1_1:
> + if (!wants_02)
> + return -EINVAL;
> + vcpu->kvm->arch.psci_version = val;
> + return 0;
> + }
> + break;
> + }
> +
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> + if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
> + return -EINVAL;
> +
> + if (get_kernel_wa_level(reg->id) < val)
> + return -EINVAL;
> +
> + return 0;
> +
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> + if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> + return -EINVAL;
> +
> + /* The enabled bit must not be set unless the level is AVAIL. */
> + if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
> + (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
> + return -EINVAL;
> +
> + /*
> + * Map all the possible incoming states to the only two we
> + * really want to deal with.
> + */
> + switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
> + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> + break;
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
> + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + /*
> + * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
> + * other way around.
> + */
> + if (get_kernel_wa_level(reg->id) < wa_level)
> + return -EINVAL;
> +
> + return 0;
> + default:
> + return -ENOENT;
> + }
> +
> + return -EINVAL;
> +}
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 372da09a2fab..bdfa93ca57d1 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -439,186 +439,3 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
> return -EINVAL;
> }
> }
> -
> -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> -{
> - return 4; /* PSCI version and three workaround registers */
> -}
> -
> -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> -{
> - if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> - return -EFAULT;
> -
> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
> - return -EFAULT;
> -
> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
> - return -EFAULT;
> -
> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3, uindices++))
> - return -EFAULT;
> -
> - return 0;
> -}
> -
> -#define KVM_REG_FEATURE_LEVEL_WIDTH 4
> -#define KVM_REG_FEATURE_LEVEL_MASK (BIT(KVM_REG_FEATURE_LEVEL_WIDTH) - 1)
> -
> -/*
> - * Convert the workaround level into an easy-to-compare number, where higher
> - * values mean better protection.
> - */
> -static int get_kernel_wa_level(u64 regid)
> -{
> - switch (regid) {
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> - switch (arm64_get_spectre_v2_state()) {
> - case SPECTRE_VULNERABLE:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> - case SPECTRE_MITIGATED:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> - case SPECTRE_UNAFFECTED:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
> - }
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> - switch (arm64_get_spectre_v4_state()) {
> - case SPECTRE_MITIGATED:
> - /*
> - * As for the hypercall discovery, we pretend we
> - * don't have any FW mitigation if SSBS is there at
> - * all times.
> - */
> - if (cpus_have_final_cap(ARM64_SSBS))
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> - fallthrough;
> - case SPECTRE_UNAFFECTED:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> - case SPECTRE_VULNERABLE:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> - }
> - break;
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> - switch (arm64_get_spectre_bhb_state()) {
> - case SPECTRE_VULNERABLE:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> - case SPECTRE_MITIGATED:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
> - case SPECTRE_UNAFFECTED:
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
> - }
> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
> - }
> -
> - return -EINVAL;
> -}
> -
> -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> -{
> - void __user *uaddr = (void __user *)(long)reg->addr;
> - u64 val;
> -
> - switch (reg->id) {
> - case KVM_REG_ARM_PSCI_VERSION:
> - val = kvm_psci_version(vcpu);
> - break;
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> - val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
> - break;
> - default:
> - return -ENOENT;
> - }
> -
> - if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> - return -EFAULT;
> -
> - return 0;
> -}
> -
> -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> -{
> - void __user *uaddr = (void __user *)(long)reg->addr;
> - u64 val;
> - int wa_level;
> -
> - if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> - return -EFAULT;
> -
> - switch (reg->id) {
> - case KVM_REG_ARM_PSCI_VERSION:
> - {
> - bool wants_02;
> -
> - wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
> -
> - switch (val) {
> - case KVM_ARM_PSCI_0_1:
> - if (wants_02)
> - return -EINVAL;
> - vcpu->kvm->arch.psci_version = val;
> - return 0;
> - case KVM_ARM_PSCI_0_2:
> - case KVM_ARM_PSCI_1_0:
> - case KVM_ARM_PSCI_1_1:
> - if (!wants_02)
> - return -EINVAL;
> - vcpu->kvm->arch.psci_version = val;
> - return 0;
> - }
> - break;
> - }
> -
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
> - if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
> - return -EINVAL;
> -
> - if (get_kernel_wa_level(reg->id) < val)
> - return -EINVAL;
> -
> - return 0;
> -
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> - if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
> - KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> - return -EINVAL;
> -
> - /* The enabled bit must not be set unless the level is AVAIL. */
> - if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
> - (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
> - return -EINVAL;
> -
> - /*
> - * Map all the possible incoming states to the only two we
> - * really want to deal with.
> - */
> - switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
> - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> - break;
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
> - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
> - break;
> - default:
> - return -EINVAL;
> - }
> -
> - /*
> - * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
> - * other way around.
> - */
> - if (get_kernel_wa_level(reg->id) < wa_level)
> - return -EINVAL;
> -
> - return 0;
> - default:
> - return -ENOENT;
> - }
> -
> - return -EINVAL;
> -}
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 0e2509d27910..5d38628a8d04 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -40,4 +40,11 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
> vcpu_set_reg(vcpu, 3, a3);
> }
>
> +struct kvm_one_reg;
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +
> #endif
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 68b96c3826c3..6e55b9283789 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -39,11 +39,4 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
>
> int kvm_psci_call(struct kvm_vcpu *vcpu);
>
> -struct kvm_one_reg;
> -
> -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> -
> #endif /* __KVM_ARM_PSCI_H__ */
>
Thanks,
Gavin
Hi Raghavendra,
On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> Introduce the firmware register to hold the vendor specific
> hypervisor service calls (owner value 6) as a bitmap. The
> bitmap represents the features that'll be enabled for the
> guest, as configured by the user-space. Currently, this
> includes support for KVM-vendor features, and Precision Time
> Protocol (PTP), represented by bit-0 and bit-1 respectively.
>
> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> ---
> arch/arm64/include/asm/kvm_host.h | 2 ++
> arch/arm64/include/uapi/asm/kvm.h | 4 ++++
> arch/arm64/kvm/hypercalls.c | 21 +++++++++++++++++----
> include/kvm/arm_hypercalls.h | 4 ++++
> 4 files changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 20165242ebd9..b79161bad69a 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -106,10 +106,12 @@ struct kvm_arch_memory_slot {
> *
> * @std_bmap: Bitmap of standard secure service calls
> * @std_hyp_bmap: Bitmap of standard hypervisor service calls
> + * @vendor_hyp_bmap: Bitmap of vendor specific hypervisor service calls
> */
> struct kvm_smccc_features {
> u64 std_bmap;
> u64 std_hyp_bmap;
> + u64 vendor_hyp_bmap;
> };
>
> struct kvm_arch {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 67353bf4e69d..9a5ac0ed4113 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -344,6 +344,10 @@ struct kvm_arm_copy_mte_tags {
> #define KVM_REG_ARM_STD_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(1)
> #define KVM_REG_ARM_STD_HYP_BIT_PV_TIME BIT(0)
>
> +#define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
> +#define KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT BIT(0)
> +#define KVM_REG_ARM_VENDOR_HYP_BIT_PTP BIT(1)
> +
> /* Device Control API: ARM VGIC */
> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 64ae6c7e7145..80836c341fd3 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -66,8 +66,6 @@ static const u32 hvc_func_default_allowed_list[] = {
> ARM_SMCCC_VERSION_FUNC_ID,
> ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID,
> - ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
> - ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
> };
>
> static bool kvm_hvc_call_default_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> @@ -102,6 +100,12 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> case ARM_SMCCC_HV_PV_TIME_ST:
> return kvm_arm_fw_reg_feat_enabled(smccc_feat->std_hyp_bmap,
> KVM_REG_ARM_STD_HYP_BIT_PV_TIME);
> + case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> + KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT);
> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> + KVM_REG_ARM_VENDOR_HYP_BIT_PTP);
> default:
> return kvm_hvc_call_default_allowed(vcpu, func_id);
> }
I guess we may return SMCCC_RET_NOT_SUPPORTED for ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT isn't set? Otherwise, we need explain it
in the commit log.
KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other.
I think PTP can't be on if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT is off.
> @@ -194,8 +198,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
> break;
> case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> - val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
> - val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
> + val[0] = smccc_feat->vendor_hyp_bmap;
> break;
> case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> kvm_ptp_get_time(vcpu, val);
> @@ -222,6 +225,7 @@ static const u64 kvm_arm_fw_reg_ids[] = {
> KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
> KVM_REG_ARM_STD_BMAP,
> KVM_REG_ARM_STD_HYP_BMAP,
> + KVM_REG_ARM_VENDOR_HYP_BMAP,
> };
>
> void kvm_arm_init_hypercalls(struct kvm *kvm)
> @@ -230,6 +234,7 @@ void kvm_arm_init_hypercalls(struct kvm *kvm)
>
> smccc_feat->std_bmap = KVM_ARM_SMCCC_STD_FEATURES;
> smccc_feat->std_hyp_bmap = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> + smccc_feat->vendor_hyp_bmap = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> }
>
> int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> @@ -322,6 +327,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> case KVM_REG_ARM_STD_HYP_BMAP:
> val = READ_ONCE(smccc_feat->std_hyp_bmap);
> break;
> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> + val = READ_ONCE(smccc_feat->vendor_hyp_bmap);
> + break;
> default:
> return -ENOENT;
> }
> @@ -348,6 +356,10 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
> fw_reg_bmap = &smccc_feat->std_hyp_bmap;
> fw_reg_features = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> break;
> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> + fw_reg_bmap = &smccc_feat->vendor_hyp_bmap;
> + fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> + break;
> default:
> return -ENOENT;
> }
If KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other,
special code is needed to gurantee PTP is cleared if VENDOR_HYP is disabled.
> @@ -453,6 +465,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> return 0;
> case KVM_REG_ARM_STD_BMAP:
> case KVM_REG_ARM_STD_HYP_BMAP:
> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val);
> default:
> return -ENOENT;
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index b0915d8c5b81..eaf4f6b318a8 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -9,6 +9,7 @@
> /* Last valid bits of the bitmapped firmware registers */
> #define KVM_REG_ARM_STD_BMAP_BIT_MAX 0
> #define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0
> +#define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1
>
> #define KVM_ARM_SMCCC_STD_FEATURES \
> GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
> @@ -16,6 +17,9 @@
> #define KVM_ARM_SMCCC_STD_HYP_FEATURES \
> GENMASK_ULL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX, 0)
>
> +#define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \
> + GENMASK_ULL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX, 0)
> +
> int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>
> static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
>
Thanks,
Gavin
Hi Raghavendra,
On 4/13/22 12:41 AM, Raghavendra Rao Ananta wrote:
> On Tue, Apr 12, 2022 at 12:07 AM Gavin Shan <[email protected]> wrote:
>> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
>>> Common hypercall firmware register handing is currently employed
>>> by psci.c. Since the upcoming patches add more of these registers,
>>> it's better to move the generic handling to hypercall.c for a
>>> cleaner presentation.
>>>
>>> While we are at it, collect all the firmware registers under
>>> fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and
>>> kvm_arm_copy_fw_reg_indices() in a generic way. Also, define
>>> KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead.
>>>
>>> No functional change intended.
>>>
>>> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
>>> Reviewed-by: Oliver Upton <[email protected]>
>>> ---
>>> arch/arm64/kvm/guest.c | 2 +-
>>> arch/arm64/kvm/hypercalls.c | 185 +++++++++++++++++++++++++++++++++++
>>> arch/arm64/kvm/psci.c | 183 ----------------------------------
>>> include/kvm/arm_hypercalls.h | 7 ++
>>> include/kvm/arm_psci.h | 7 --
>>> 5 files changed, 193 insertions(+), 191 deletions(-)
>>>
>>
>> Apart from the below nits:
>>
>> Reviewed-by: Gavin Shan <[email protected]>
>>
>>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>>> index 7e15b03fbdf8..0d5cca56cbda 100644
>>> --- a/arch/arm64/kvm/guest.c
>>> +++ b/arch/arm64/kvm/guest.c
>>> @@ -18,7 +18,7 @@
>>> #include <linux/string.h>
>>> #include <linux/vmalloc.h>
>>> #include <linux/fs.h>
>>> -#include <kvm/arm_psci.h>
>>> +#include <kvm/arm_hypercalls.h>
>>> #include <asm/cputype.h>
>>> #include <linux/uaccess.h>
>>> #include <asm/fpsimd.h>
>>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>>> index 202b8c455724..fa6d9378d8e7 100644
>>> --- a/arch/arm64/kvm/hypercalls.c
>>> +++ b/arch/arm64/kvm/hypercalls.c
>>> @@ -158,3 +158,188 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>>> smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
>>> return 1;
>>> }
>>> +
>>> +static const u64 kvm_arm_fw_reg_ids[] = {
>>> + KVM_REG_ARM_PSCI_VERSION,
>>> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
>>> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
>>> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
>>> +};
>>> +
>>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>>> +{
>>> + return ARRAY_SIZE(kvm_arm_fw_reg_ids);
>>> +}
>>> +
>>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>>> +{
>>> + int i;
>>> +
>>> + for (i = 0; i < ARRAY_SIZE(kvm_arm_fw_reg_ids); i++) {
>>> + if (put_user(kvm_arm_fw_reg_ids[i], uindices++))
>>> + return -EFAULT;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>
>> Since we're here, I think we can make this function to return 'ARRAY_SIZE(kvm_arm_fw_reg_ids)',
>> to be consistent with copy_{core, sve}_reg_indices(). With the return value fixed, additional
>> patch can use @ret in kvm_arm_copy_reg_indices().
>>
> Well we can, however, since this is mostly refactoring, I didn't want
> to change the original functionality of the code.
> The only caller of this is kvm_arm_copy_reg_indices()
> (arch/arm64/kvm/guest.c), which only checks for 'ret < 0'.
> Also, do you have a need for it? If yes, I can change it in the next revision.
>
The current implementation isn't wrong. With the return value fixed,
The individual snippets in kvm_arm_copy_reg_indices() looks similar.
It's not a big deal. If you plan to fix the return value for
kvm_arm_copy_fw_reg_indices() and copy_timer_indices(), a separate
patch can make the changes before this patch [v5 01/10]. Or we
can ignore this and fix it after this series is merged :)
>>> +#define KVM_REG_FEATURE_LEVEL_WIDTH 4
>>> +#define KVM_REG_FEATURE_LEVEL_MASK GENMASK(KVM_REG_FEATURE_LEVEL_WIDTH, 0)
>>> +
>>
>> It seems 'BIT()' is replaced with 'GENMASK' in the movement, but it's not mentioned
>> in the commit log. I guess it'd better to mention it if you agree.
>>
> The last sentence of the commit text mentions this :)
> Please let me know if it's not clear.
>
Exactly and it's clear. Sorry that I missed it :)
>>> +/*
>>> + * Convert the workaround level into an easy-to-compare number, where higher
>>> + * values mean better protection.
>>> + */
>>> +static int get_kernel_wa_level(u64 regid)
>>> +{
>>> + switch (regid) {
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> + switch (arm64_get_spectre_v2_state()) {
>>> + case SPECTRE_VULNERABLE:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
>>> + case SPECTRE_MITIGATED:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
>>> + case SPECTRE_UNAFFECTED:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
>>> + }
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> + switch (arm64_get_spectre_v4_state()) {
>>> + case SPECTRE_MITIGATED:
>>> + /*
>>> + * As for the hypercall discovery, we pretend we
>>> + * don't have any FW mitigation if SSBS is there at
>>> + * all times.
>>> + */
>>> + if (cpus_have_final_cap(ARM64_SSBS))
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> + fallthrough;
>>> + case SPECTRE_UNAFFECTED:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
>>> + case SPECTRE_VULNERABLE:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> + }
>>> + break;
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> + switch (arm64_get_spectre_bhb_state()) {
>>> + case SPECTRE_VULNERABLE:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
>>> + case SPECTRE_MITIGATED:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
>>> + case SPECTRE_UNAFFECTED:
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
>>> + }
>>> + return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
>>> + }
>>> +
>>> + return -EINVAL;
>>> +}
>>> +
>>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> +{
>>> + void __user *uaddr = (void __user *)(long)reg->addr;
>>> + u64 val;
>>> +
>>> + switch (reg->id) {
>>> + case KVM_REG_ARM_PSCI_VERSION:
>>> + val = kvm_psci_version(vcpu);
>>> + break;
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> + val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
>>> + break;
>>> + default:
>>> + return -ENOENT;
>>> + }
>>> +
>>> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
>>> + return -EFAULT;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> +{
>>> + void __user *uaddr = (void __user *)(long)reg->addr;
>>> + u64 val;
>>> + int wa_level;
>>> +
>>> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
>>> + return -EFAULT;
>>> +
>>> + switch (reg->id) {
>>> + case KVM_REG_ARM_PSCI_VERSION:
>>> + {
>>> + bool wants_02;
>>> +
>>> + wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
>>> +
>>> + switch (val) {
>>> + case KVM_ARM_PSCI_0_1:
>>> + if (wants_02)
>>> + return -EINVAL;
>>> + vcpu->kvm->arch.psci_version = val;
>>> + return 0;
>>> + case KVM_ARM_PSCI_0_2:
>>> + case KVM_ARM_PSCI_1_0:
>>> + case KVM_ARM_PSCI_1_1:
>>> + if (!wants_02)
>>> + return -EINVAL;
>>> + vcpu->kvm->arch.psci_version = val;
>>> + return 0;
>>> + }
>>> + break;
>>> + }
>>> +
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> + if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
>>> + return -EINVAL;
>>> +
>>> + if (get_kernel_wa_level(reg->id) < val)
>>> + return -EINVAL;
>>> +
>>> + return 0;
>>> +
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> + if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
>>> + KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
>>> + return -EINVAL;
>>> +
>>> + /* The enabled bit must not be set unless the level is AVAIL. */
>>> + if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
>>> + (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
>>> + return -EINVAL;
>>> +
>>> + /*
>>> + * Map all the possible incoming states to the only two we
>>> + * really want to deal with.
>>> + */
>>> + switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
>>> + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> + break;
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
>>> + case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
>>> + wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
>>> + break;
>>> + default:
>>> + return -EINVAL;
>>> + }
>>> +
>>> + /*
>>> + * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
>>> + * other way around.
>>> + */
>>> + if (get_kernel_wa_level(reg->id) < wa_level)
>>> + return -EINVAL;
>>> +
>>> + return 0;
>>> + default:
>>> + return -ENOENT;
>>> + }
>>> +
>>> + return -EINVAL;
>>> +}
>>> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
>>> index 372da09a2fab..bdfa93ca57d1 100644
>>> --- a/arch/arm64/kvm/psci.c
>>> +++ b/arch/arm64/kvm/psci.c
>>> @@ -439,186 +439,3 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
>>> return -EINVAL;
>>> }
>>> }
>>> -
>>> -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>>> -{
>>> - return 4; /* PSCI version and three workaround registers */
>>> -}
>>> -
>>> -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>>> -{
>>> - if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
>>> - return -EFAULT;
>>> -
>>> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
>>> - return -EFAULT;
>>> -
>>> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
>>> - return -EFAULT;
>>> -
>>> - if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3, uindices++))
>>> - return -EFAULT;
>>> -
>>> - return 0;
>>> -}
>>> -
>>> -#define KVM_REG_FEATURE_LEVEL_WIDTH 4
>>> -#define KVM_REG_FEATURE_LEVEL_MASK (BIT(KVM_REG_FEATURE_LEVEL_WIDTH) - 1)
>>> -
>>> -/*
>>> - * Convert the workaround level into an easy-to-compare number, where higher
>>> - * values mean better protection.
>>> - */
>>> -static int get_kernel_wa_level(u64 regid)
>>> -{
>>> - switch (regid) {
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> - switch (arm64_get_spectre_v2_state()) {
>>> - case SPECTRE_VULNERABLE:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
>>> - case SPECTRE_MITIGATED:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
>>> - case SPECTRE_UNAFFECTED:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
>>> - }
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> - switch (arm64_get_spectre_v4_state()) {
>>> - case SPECTRE_MITIGATED:
>>> - /*
>>> - * As for the hypercall discovery, we pretend we
>>> - * don't have any FW mitigation if SSBS is there at
>>> - * all times.
>>> - */
>>> - if (cpus_have_final_cap(ARM64_SSBS))
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> - fallthrough;
>>> - case SPECTRE_UNAFFECTED:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
>>> - case SPECTRE_VULNERABLE:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> - }
>>> - break;
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> - switch (arm64_get_spectre_bhb_state()) {
>>> - case SPECTRE_VULNERABLE:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
>>> - case SPECTRE_MITIGATED:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_AVAIL;
>>> - case SPECTRE_UNAFFECTED:
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_REQUIRED;
>>> - }
>>> - return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3_NOT_AVAIL;
>>> - }
>>> -
>>> - return -EINVAL;
>>> -}
>>> -
>>> -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> -{
>>> - void __user *uaddr = (void __user *)(long)reg->addr;
>>> - u64 val;
>>> -
>>> - switch (reg->id) {
>>> - case KVM_REG_ARM_PSCI_VERSION:
>>> - val = kvm_psci_version(vcpu);
>>> - break;
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> - val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
>>> - break;
>>> - default:
>>> - return -ENOENT;
>>> - }
>>> -
>>> - if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
>>> - return -EFAULT;
>>> -
>>> - return 0;
>>> -}
>>> -
>>> -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> -{
>>> - void __user *uaddr = (void __user *)(long)reg->addr;
>>> - u64 val;
>>> - int wa_level;
>>> -
>>> - if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
>>> - return -EFAULT;
>>> -
>>> - switch (reg->id) {
>>> - case KVM_REG_ARM_PSCI_VERSION:
>>> - {
>>> - bool wants_02;
>>> -
>>> - wants_02 = test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features);
>>> -
>>> - switch (val) {
>>> - case KVM_ARM_PSCI_0_1:
>>> - if (wants_02)
>>> - return -EINVAL;
>>> - vcpu->kvm->arch.psci_version = val;
>>> - return 0;
>>> - case KVM_ARM_PSCI_0_2:
>>> - case KVM_ARM_PSCI_1_0:
>>> - case KVM_ARM_PSCI_1_1:
>>> - if (!wants_02)
>>> - return -EINVAL;
>>> - vcpu->kvm->arch.psci_version = val;
>>> - return 0;
>>> - }
>>> - break;
>>> - }
>>> -
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
>>> - if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
>>> - return -EINVAL;
>>> -
>>> - if (get_kernel_wa_level(reg->id) < val)
>>> - return -EINVAL;
>>> -
>>> - return 0;
>>> -
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>>> - if (val & ~(KVM_REG_FEATURE_LEVEL_MASK |
>>> - KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
>>> - return -EINVAL;
>>> -
>>> - /* The enabled bit must not be set unless the level is AVAIL. */
>>> - if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
>>> - (val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
>>> - return -EINVAL;
>>> -
>>> - /*
>>> - * Map all the possible incoming states to the only two we
>>> - * really want to deal with.
>>> - */
>>> - switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
>>> - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
>>> - break;
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
>>> - case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
>>> - wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
>>> - break;
>>> - default:
>>> - return -EINVAL;
>>> - }
>>> -
>>> - /*
>>> - * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
>>> - * other way around.
>>> - */
>>> - if (get_kernel_wa_level(reg->id) < wa_level)
>>> - return -EINVAL;
>>> -
>>> - return 0;
>>> - default:
>>> - return -ENOENT;
>>> - }
>>> -
>>> - return -EINVAL;
>>> -}
>>> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
>>> index 0e2509d27910..5d38628a8d04 100644
>>> --- a/include/kvm/arm_hypercalls.h
>>> +++ b/include/kvm/arm_hypercalls.h
>>> @@ -40,4 +40,11 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
>>> vcpu_set_reg(vcpu, 3, a3);
>>> }
>>>
>>> +struct kvm_one_reg;
>>> +
>>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
>>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>>> +
>>> #endif
>>> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
>>> index 68b96c3826c3..6e55b9283789 100644
>>> --- a/include/kvm/arm_psci.h
>>> +++ b/include/kvm/arm_psci.h
>>> @@ -39,11 +39,4 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
>>>
>>> int kvm_psci_call(struct kvm_vcpu *vcpu);
>>>
>>> -struct kvm_one_reg;
>>> -
>>> -int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
>>> -int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>>> -int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>>> -int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>>> -
>>> #endif /* __KVM_ARM_PSCI_H__ */
>>>
>>
[...]
> Thank you for the review, Gavin.
>
No worries. The SDEI virtualization series depends on this to
some extent :)
Thanks,
Gavin
On Tue, Apr 12, 2022 at 8:59 PM Gavin Shan <[email protected]> wrote:
>
> Hi Raghavendra,
>
> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> > Introduce the firmware register to hold the vendor specific
> > hypervisor service calls (owner value 6) as a bitmap. The
> > bitmap represents the features that'll be enabled for the
> > guest, as configured by the user-space. Currently, this
> > includes support for KVM-vendor features, and Precision Time
> > Protocol (PTP), represented by bit-0 and bit-1 respectively.
> >
> > Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> > ---
> > arch/arm64/include/asm/kvm_host.h | 2 ++
> > arch/arm64/include/uapi/asm/kvm.h | 4 ++++
> > arch/arm64/kvm/hypercalls.c | 21 +++++++++++++++++----
> > include/kvm/arm_hypercalls.h | 4 ++++
> > 4 files changed, 27 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 20165242ebd9..b79161bad69a 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -106,10 +106,12 @@ struct kvm_arch_memory_slot {
> > *
> > * @std_bmap: Bitmap of standard secure service calls
> > * @std_hyp_bmap: Bitmap of standard hypervisor service calls
> > + * @vendor_hyp_bmap: Bitmap of vendor specific hypervisor service calls
> > */
> > struct kvm_smccc_features {
> > u64 std_bmap;
> > u64 std_hyp_bmap;
> > + u64 vendor_hyp_bmap;
> > };
> >
> > struct kvm_arch {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 67353bf4e69d..9a5ac0ed4113 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -344,6 +344,10 @@ struct kvm_arm_copy_mte_tags {
> > #define KVM_REG_ARM_STD_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(1)
> > #define KVM_REG_ARM_STD_HYP_BIT_PV_TIME BIT(0)
> >
> > +#define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
> > +#define KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT BIT(0)
> > +#define KVM_REG_ARM_VENDOR_HYP_BIT_PTP BIT(1)
> > +
> > /* Device Control API: ARM VGIC */
> > #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> > #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> > diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> > index 64ae6c7e7145..80836c341fd3 100644
> > --- a/arch/arm64/kvm/hypercalls.c
> > +++ b/arch/arm64/kvm/hypercalls.c
> > @@ -66,8 +66,6 @@ static const u32 hvc_func_default_allowed_list[] = {
> > ARM_SMCCC_VERSION_FUNC_ID,
> > ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> > ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID,
> > - ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
> > - ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
> > };
> >
> > static bool kvm_hvc_call_default_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> > @@ -102,6 +100,12 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> > case ARM_SMCCC_HV_PV_TIME_ST:
> > return kvm_arm_fw_reg_feat_enabled(smccc_feat->std_hyp_bmap,
> > KVM_REG_ARM_STD_HYP_BIT_PV_TIME);
> > + case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> > + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> > + KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT);
> > + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> > + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> > + KVM_REG_ARM_VENDOR_HYP_BIT_PTP);
> > default:
> > return kvm_hvc_call_default_allowed(vcpu, func_id);
> > }
>
> I guess we may return SMCCC_RET_NOT_SUPPORTED for ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
> if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT isn't set? Otherwise, we need explain it
> in the commit log.
>
ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID is a part of the hvc
allowed-list (hvc_func_default_allowed_list[]), which means it's not
associated with any feature bit and is always enabled. If the guest
were to issue ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, we'd end up in
the 'default' case and the kvm_hvc_call_default_allowed() would return
'true'. This is documented in patch 2/10.
> KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other.
> I think PTP can't be on if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT is off.
>
Actually we went through this scenario [1]. Of course, we can build
some logic around it to make sure that the userspace does the right
thing, but at this point the consensus is that, unless it's an issue
for KVM, it's treated as a userspace bug.
> > @@ -194,8 +198,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> > val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
> > break;
> > case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> > - val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
> > - val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
> > + val[0] = smccc_feat->vendor_hyp_bmap;
> > break;
> > case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> > kvm_ptp_get_time(vcpu, val);
> > @@ -222,6 +225,7 @@ static const u64 kvm_arm_fw_reg_ids[] = {
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
> > KVM_REG_ARM_STD_BMAP,
> > KVM_REG_ARM_STD_HYP_BMAP,
> > + KVM_REG_ARM_VENDOR_HYP_BMAP,
> > };
> >
> > void kvm_arm_init_hypercalls(struct kvm *kvm)
> > @@ -230,6 +234,7 @@ void kvm_arm_init_hypercalls(struct kvm *kvm)
> >
> > smccc_feat->std_bmap = KVM_ARM_SMCCC_STD_FEATURES;
> > smccc_feat->std_hyp_bmap = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> > + smccc_feat->vendor_hyp_bmap = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> > }
> >
> > int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> > @@ -322,6 +327,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > case KVM_REG_ARM_STD_HYP_BMAP:
> > val = READ_ONCE(smccc_feat->std_hyp_bmap);
> > break;
> > + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> > + val = READ_ONCE(smccc_feat->vendor_hyp_bmap);
> > + break;
> > default:
> > return -ENOENT;
> > }
> > @@ -348,6 +356,10 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
> > fw_reg_bmap = &smccc_feat->std_hyp_bmap;
> > fw_reg_features = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> > break;
> > + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> > + fw_reg_bmap = &smccc_feat->vendor_hyp_bmap;
> > + fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> > + break;
> > default:
> > return -ENOENT;
> > }
>
> If KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other,
> special code is needed to gurantee PTP is cleared if VENDOR_HYP is disabled.
>
Please see the above comment :)
> > @@ -453,6 +465,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > return 0;
> > case KVM_REG_ARM_STD_BMAP:
> > case KVM_REG_ARM_STD_HYP_BMAP:
> > + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> > return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val);
> > default:
> > return -ENOENT;
> > diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> > index b0915d8c5b81..eaf4f6b318a8 100644
> > --- a/include/kvm/arm_hypercalls.h
> > +++ b/include/kvm/arm_hypercalls.h
> > @@ -9,6 +9,7 @@
> > /* Last valid bits of the bitmapped firmware registers */
> > #define KVM_REG_ARM_STD_BMAP_BIT_MAX 0
> > #define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0
> > +#define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1
> >
> > #define KVM_ARM_SMCCC_STD_FEATURES \
> > GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
> > @@ -16,6 +17,9 @@
> > #define KVM_ARM_SMCCC_STD_HYP_FEATURES \
> > GENMASK_ULL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX, 0)
> >
> > +#define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \
> > + GENMASK_ULL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX, 0)
> > +
> > int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
> >
> > static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> >
>
> Thanks,
> Gavin
>
Thanks for the review.
Regards,
Raghavendra
[1]: https://lore.kernel.org/lkml/[email protected]/
Hi Raghavendra,
On 4/14/22 12:59 AM, Raghavendra Rao Ananta wrote:
> On Tue, Apr 12, 2022 at 8:59 PM Gavin Shan <[email protected]> wrote:
>> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
>>> Introduce the firmware register to hold the vendor specific
>>> hypervisor service calls (owner value 6) as a bitmap. The
>>> bitmap represents the features that'll be enabled for the
>>> guest, as configured by the user-space. Currently, this
>>> includes support for KVM-vendor features, and Precision Time
>>> Protocol (PTP), represented by bit-0 and bit-1 respectively.
>>>
>>> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
>>> ---
>>> arch/arm64/include/asm/kvm_host.h | 2 ++
>>> arch/arm64/include/uapi/asm/kvm.h | 4 ++++
>>> arch/arm64/kvm/hypercalls.c | 21 +++++++++++++++++----
>>> include/kvm/arm_hypercalls.h | 4 ++++
>>> 4 files changed, 27 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>> index 20165242ebd9..b79161bad69a 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -106,10 +106,12 @@ struct kvm_arch_memory_slot {
>>> *
>>> * @std_bmap: Bitmap of standard secure service calls
>>> * @std_hyp_bmap: Bitmap of standard hypervisor service calls
>>> + * @vendor_hyp_bmap: Bitmap of vendor specific hypervisor service calls
>>> */
>>> struct kvm_smccc_features {
>>> u64 std_bmap;
>>> u64 std_hyp_bmap;
>>> + u64 vendor_hyp_bmap;
>>> };
>>>
>>> struct kvm_arch {
>>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>>> index 67353bf4e69d..9a5ac0ed4113 100644
>>> --- a/arch/arm64/include/uapi/asm/kvm.h
>>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>>> @@ -344,6 +344,10 @@ struct kvm_arm_copy_mte_tags {
>>> #define KVM_REG_ARM_STD_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(1)
>>> #define KVM_REG_ARM_STD_HYP_BIT_PV_TIME BIT(0)
>>>
>>> +#define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
>>> +#define KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT BIT(0)
>>> +#define KVM_REG_ARM_VENDOR_HYP_BIT_PTP BIT(1)
>>> +
>>> /* Device Control API: ARM VGIC */
>>> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
>>> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
>>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>>> index 64ae6c7e7145..80836c341fd3 100644
>>> --- a/arch/arm64/kvm/hypercalls.c
>>> +++ b/arch/arm64/kvm/hypercalls.c
>>> @@ -66,8 +66,6 @@ static const u32 hvc_func_default_allowed_list[] = {
>>> ARM_SMCCC_VERSION_FUNC_ID,
>>> ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
>>> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID,
>>> - ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
>>> - ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
>>> };
>>>
>>> static bool kvm_hvc_call_default_allowed(struct kvm_vcpu *vcpu, u32 func_id)
>>> @@ -102,6 +100,12 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
>>> case ARM_SMCCC_HV_PV_TIME_ST:
>>> return kvm_arm_fw_reg_feat_enabled(smccc_feat->std_hyp_bmap,
>>> KVM_REG_ARM_STD_HYP_BIT_PV_TIME);
>>> + case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
>>> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
>>> + KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT);
>>> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
>>> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
>>> + KVM_REG_ARM_VENDOR_HYP_BIT_PTP);
>>> default:
>>> return kvm_hvc_call_default_allowed(vcpu, func_id);
>>> }
>>
>> I guess we may return SMCCC_RET_NOT_SUPPORTED for ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
>> if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT isn't set? Otherwise, we need explain it
>> in the commit log.
>>
> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID is a part of the hvc
> allowed-list (hvc_func_default_allowed_list[]), which means it's not
> associated with any feature bit and is always enabled. If the guest
> were to issue ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, we'd end up in
> the 'default' case and the kvm_hvc_call_default_allowed() would return
> 'true'. This is documented in patch 2/10.
>
I think I might not make myself clear and sorry for that. The point is
the following hvc calls should be belonging to 'Vendor Specific Hypervisor
Service', or I'm wrong. If I'm correct, VENDOR_HYP_CALL_UID_FUNC_ID
should be disallowed if bit#0 isn't set in @vendor_hyp_bmap.
ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID
ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID
ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID was introduced by commit 6e085e0ac9cf
("arm/arm64: Probe for the presence of KVM hypervisor"). According to the
commit log, the identifier and supported (vendor specific) feature list
is returned by this call and ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID.
So the users depend on both calls to probe the supported features or
services. So it seems incorrect to allow ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
even the 'Vendor Specific Hypervisor Service' is disabled and bit#0
is cleared in @vendor_hyp_bmap by users.
>> KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other.
>> I think PTP can't be on if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT is off.
>>
> Actually we went through this scenario [1]. Of course, we can build
> some logic around it to make sure that the userspace does the right
> thing, but at this point the consensus is that, unless it's an issue
> for KVM, it's treated as a userspace bug.
>
Thanks for the pointer. I chime in late and I didn't check the reviewing
history on this series. Hopefully I didn't bring too much confusing comments
to you.
I think it's fine by treating it as a userspace bug, but it would be nice
to add comments somewhere if you agree.
>>> @@ -194,8 +198,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>>> val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
>>> break;
>>> case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
>>> - val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
>>> - val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
>>> + val[0] = smccc_feat->vendor_hyp_bmap;
>>> break;
>>> case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
>>> kvm_ptp_get_time(vcpu, val);
>>> @@ -222,6 +225,7 @@ static const u64 kvm_arm_fw_reg_ids[] = {
>>> KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
>>> KVM_REG_ARM_STD_BMAP,
>>> KVM_REG_ARM_STD_HYP_BMAP,
>>> + KVM_REG_ARM_VENDOR_HYP_BMAP,
>>> };
>>>
>>> void kvm_arm_init_hypercalls(struct kvm *kvm)
>>> @@ -230,6 +234,7 @@ void kvm_arm_init_hypercalls(struct kvm *kvm)
>>>
>>> smccc_feat->std_bmap = KVM_ARM_SMCCC_STD_FEATURES;
>>> smccc_feat->std_hyp_bmap = KVM_ARM_SMCCC_STD_HYP_FEATURES;
>>> + smccc_feat->vendor_hyp_bmap = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
>>> }
>>>
>>> int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>>> @@ -322,6 +327,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> case KVM_REG_ARM_STD_HYP_BMAP:
>>> val = READ_ONCE(smccc_feat->std_hyp_bmap);
>>> break;
>>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
>>> + val = READ_ONCE(smccc_feat->vendor_hyp_bmap);
>>> + break;
>>> default:
>>> return -ENOENT;
>>> }
>>> @@ -348,6 +356,10 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
>>> fw_reg_bmap = &smccc_feat->std_hyp_bmap;
>>> fw_reg_features = KVM_ARM_SMCCC_STD_HYP_FEATURES;
>>> break;
>>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
>>> + fw_reg_bmap = &smccc_feat->vendor_hyp_bmap;
>>> + fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
>>> + break;
>>> default:
>>> return -ENOENT;
>>> }
>>
>> If KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other,
>> special code is needed to gurantee PTP is cleared if VENDOR_HYP is disabled.
>>
> Please see the above comment :)
>
Thanks for the pointer and explanation :)
>>> @@ -453,6 +465,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>> return 0;
>>> case KVM_REG_ARM_STD_BMAP:
>>> case KVM_REG_ARM_STD_HYP_BMAP:
>>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
>>> return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val);
>>> default:
>>> return -ENOENT;
>>> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
>>> index b0915d8c5b81..eaf4f6b318a8 100644
>>> --- a/include/kvm/arm_hypercalls.h
>>> +++ b/include/kvm/arm_hypercalls.h
>>> @@ -9,6 +9,7 @@
>>> /* Last valid bits of the bitmapped firmware registers */
>>> #define KVM_REG_ARM_STD_BMAP_BIT_MAX 0
>>> #define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0
>>> +#define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1
>>>
>>> #define KVM_ARM_SMCCC_STD_FEATURES \
>>> GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
>>> @@ -16,6 +17,9 @@
>>> #define KVM_ARM_SMCCC_STD_HYP_FEATURES \
>>> GENMASK_ULL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX, 0)
>>>
>>> +#define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \
>>> + GENMASK_ULL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX, 0)
>>> +
>>> int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>>>
>>> static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
>>>
>>
>
> Thanks for the review.
>
No worries and sorry for the late chime-in :)
>
> [1]: https://lore.kernel.org/lkml/[email protected]/
>
Thanks,
Gavin
On Fri, 15 Apr 2022 07:44:55 +0100,
Gavin Shan <[email protected]> wrote:
>
> Hi Raghavendra,
>
> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> > Continuing the discussion from [1], the series tries to add support
> > for the userspace to elect the hypercall services that it wishes
> > to expose to the guest, rather than the guest discovering them
> > unconditionally. The idea employed by the series was taken from
> > [1] as suggested by Marc Z.
> >
> > In a broad sense, the concept is similar to the current implementation
> > of PSCI interface- create a 'firmware psuedo-register' to handle the
> > firmware revisions. The series extends this idea to all the other
> > hypercalls such as TRNG (True Random Number Generator), PV_TIME
> > (Paravirtualized Time), and PTP (Precision Time protocol).
> >
> > For better categorization and future scaling, these firmware registers
> > are categorized based on the service call owners. Also, unlike the
> > existing firmware psuedo-registers, they hold the features supported
> > in the form of a bitmap.
> >
> > During the VM initialization, the registers holds an upper-limit of
> > the features supported by each one of them. It's expected that the
> > userspace discover the features provided by each register via GET_ONE_REG,
> > and writeback the desired values using SET_ONE_REG. KVM allows this
> > modification only until the VM has started.
> >
> > Some of the standard function-ids, such as ARM_SMCCC_VERSION_FUNC_ID,
> > need not be associated with a feature bit. For such ids, the series
> > introduced an allowed-list, hvc_func_default_allowed_list[], that holds
> > all such ids. As a result, the functions that are not elected by userspace,
> > or if they are not a part of this allowed-list, will be denied for when
> > the guests invoke them.
> >
> > Older VMMs can simply ignore this interface and the hypercall services
> > will be exposed unconditionally to the guests, thus ensuring backward
> > compatibility.
> >
>
> [...]
>
> I rethinking about the design again and just get one question. Hopefully,
> someone have the answer for us. The newly added 3 pseudo registers and
> the existing ones like KVM_REG_ARM_PSCI_VERSION are all tied up with
> vcpu, instead of VM. I don't think it's correct. I'm not sure if VM-scoped
> pseudo registers aren't allowed by ARM architecture or the effort isn't
> worthy to support it.
We have had that discussion before (around version 2 of this series,
if I remember well).
>
> These pseudo registers are introduced to present the available hypercalls,
> and then they can be disabled from userspace. In the implementation, these 3
> registers are vcpu scoped. It means that multiple vcpus can be asymmetric
> in terms of usable hypercalls. For example, ARM_SMCCC_TRNG hypercalls
> can be enabled on vcpu0, but disabled on vcpu1. I don't think it's expected.
No, that's not the way this is supposed to work. These hypercalls are
of course global, even if the accessor is per-vcpu. This is similar to
tons of other things, such as some of the PMU data, the timer virtual
offset... the list goes on. If that's not what this code does, then it
is a bug and it needs to be fixed.
> On the other hand, the information stored in these 3 registers needs to
> be migrated through {GET,SET}_ONE_REG by VMM (QEMU). all the information
> stored in these 3 registers are all same on all vcpus, which is exactly
> as we expect. In migration circumstance, we're transporting identical
> information for all vcpus and it's unnecessary.
Yes, we all understand that. My response to that was (and still is):
- There is no need to invent a new userspace interface. The one we
have is terrible enough, and we don't need another square wheel that
would need to be maintained beside the existing one.
- Let's say we have 1024 new pseudo-registers, 1024 vcpus, 64bit regs:
that's 8MB worth of extra data. This is not insignificant, but also
not really a problem given that such a large VM is probably attached
to a proportionally large amount of memory. In practice, we're
talking of less than 10 registers, and less than 100 vcpus. A crazy
8kB at most. Who cares?
- If this is eventually deemed to be a *real* scalability problem, we
can always expose a map of registers that are global, and let
userspace know that it can elide the rest. Problem solved, backward
compatibility preserved. And I'm willing to bet that we won't need
it in my lifetime.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On Wed, Apr 13, 2022 at 6:04 PM Gavin Shan <[email protected]> wrote:
>
> Hi Raghavendra,
>
> On 4/14/22 12:59 AM, Raghavendra Rao Ananta wrote:
> > On Tue, Apr 12, 2022 at 8:59 PM Gavin Shan <[email protected]> wrote:
> >> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> >>> Introduce the firmware register to hold the vendor specific
> >>> hypervisor service calls (owner value 6) as a bitmap. The
> >>> bitmap represents the features that'll be enabled for the
> >>> guest, as configured by the user-space. Currently, this
> >>> includes support for KVM-vendor features, and Precision Time
> >>> Protocol (PTP), represented by bit-0 and bit-1 respectively.
> >>>
> >>> Signed-off-by: Raghavendra Rao Ananta <[email protected]>
> >>> ---
> >>> arch/arm64/include/asm/kvm_host.h | 2 ++
> >>> arch/arm64/include/uapi/asm/kvm.h | 4 ++++
> >>> arch/arm64/kvm/hypercalls.c | 21 +++++++++++++++++----
> >>> include/kvm/arm_hypercalls.h | 4 ++++
> >>> 4 files changed, 27 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >>> index 20165242ebd9..b79161bad69a 100644
> >>> --- a/arch/arm64/include/asm/kvm_host.h
> >>> +++ b/arch/arm64/include/asm/kvm_host.h
> >>> @@ -106,10 +106,12 @@ struct kvm_arch_memory_slot {
> >>> *
> >>> * @std_bmap: Bitmap of standard secure service calls
> >>> * @std_hyp_bmap: Bitmap of standard hypervisor service calls
> >>> + * @vendor_hyp_bmap: Bitmap of vendor specific hypervisor service calls
> >>> */
> >>> struct kvm_smccc_features {
> >>> u64 std_bmap;
> >>> u64 std_hyp_bmap;
> >>> + u64 vendor_hyp_bmap;
> >>> };
> >>>
> >>> struct kvm_arch {
> >>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> >>> index 67353bf4e69d..9a5ac0ed4113 100644
> >>> --- a/arch/arm64/include/uapi/asm/kvm.h
> >>> +++ b/arch/arm64/include/uapi/asm/kvm.h
> >>> @@ -344,6 +344,10 @@ struct kvm_arm_copy_mte_tags {
> >>> #define KVM_REG_ARM_STD_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(1)
> >>> #define KVM_REG_ARM_STD_HYP_BIT_PV_TIME BIT(0)
> >>>
> >>> +#define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
> >>> +#define KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT BIT(0)
> >>> +#define KVM_REG_ARM_VENDOR_HYP_BIT_PTP BIT(1)
> >>> +
> >>> /* Device Control API: ARM VGIC */
> >>> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> >>> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> >>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> >>> index 64ae6c7e7145..80836c341fd3 100644
> >>> --- a/arch/arm64/kvm/hypercalls.c
> >>> +++ b/arch/arm64/kvm/hypercalls.c
> >>> @@ -66,8 +66,6 @@ static const u32 hvc_func_default_allowed_list[] = {
> >>> ARM_SMCCC_VERSION_FUNC_ID,
> >>> ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> >>> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID,
> >>> - ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID,
> >>> - ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
> >>> };
> >>>
> >>> static bool kvm_hvc_call_default_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> >>> @@ -102,6 +100,12 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
> >>> case ARM_SMCCC_HV_PV_TIME_ST:
> >>> return kvm_arm_fw_reg_feat_enabled(smccc_feat->std_hyp_bmap,
> >>> KVM_REG_ARM_STD_HYP_BIT_PV_TIME);
> >>> + case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> >>> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> >>> + KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT);
> >>> + case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> >>> + return kvm_arm_fw_reg_feat_enabled(smccc_feat->vendor_hyp_bmap,
> >>> + KVM_REG_ARM_VENDOR_HYP_BIT_PTP);
> >>> default:
> >>> return kvm_hvc_call_default_allowed(vcpu, func_id);
> >>> }
> >>
> >> I guess we may return SMCCC_RET_NOT_SUPPORTED for ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
> >> if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT isn't set? Otherwise, we need explain it
> >> in the commit log.
> >>
> > ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID is a part of the hvc
> > allowed-list (hvc_func_default_allowed_list[]), which means it's not
> > associated with any feature bit and is always enabled. If the guest
> > were to issue ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, we'd end up in
> > the 'default' case and the kvm_hvc_call_default_allowed() would return
> > 'true'. This is documented in patch 2/10.
> >
>
> I think I might not make myself clear and sorry for that. The point is
> the following hvc calls should be belonging to 'Vendor Specific Hypervisor
> Service', or I'm wrong. If I'm correct, VENDOR_HYP_CALL_UID_FUNC_ID
> should be disallowed if bit#0 isn't set in @vendor_hyp_bmap.
>
> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
> ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID
> ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID
>
> ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID was introduced by commit 6e085e0ac9cf
> ("arm/arm64: Probe for the presence of KVM hypervisor"). According to the
> commit log, the identifier and supported (vendor specific) feature list
> is returned by this call and ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID.
> So the users depend on both calls to probe the supported features or
> services. So it seems incorrect to allow ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID
> even the 'Vendor Specific Hypervisor Service' is disabled and bit#0
> is cleared in @vendor_hyp_bmap by users.
>
Hm, it was a grey area for me since the FEATURES_FUNC_ID didn't
broadcast the presence of UID_FUNC_ID. But what you said makes sense.
UID_FUNC_ID should tag along with FEATURES_FUNC_ID. I can merge both
of them under bit-0.
Thanks for sharing the background.
Regards,
Raghavendra
> >> KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other.
> >> I think PTP can't be on if KVM_REG_ARM_VENDOR_HYP_BIT_FUNC_FEAT is off.
> >>
> > Actually we went through this scenario [1]. Of course, we can build
> > some logic around it to make sure that the userspace does the right
> > thing, but at this point the consensus is that, unless it's an issue
> > for KVM, it's treated as a userspace bug.
> >
>
> Thanks for the pointer. I chime in late and I didn't check the reviewing
> history on this series. Hopefully I didn't bring too much confusing comments
> to you.
>
> I think it's fine by treating it as a userspace bug, but it would be nice
> to add comments somewhere if you agree.
>
> >>> @@ -194,8 +198,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> >>> val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
> >>> break;
> >>> case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> >>> - val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
> >>> - val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
> >>> + val[0] = smccc_feat->vendor_hyp_bmap;
> >>> break;
> >>> case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> >>> kvm_ptp_get_time(vcpu, val);
> >>> @@ -222,6 +225,7 @@ static const u64 kvm_arm_fw_reg_ids[] = {
> >>> KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3,
> >>> KVM_REG_ARM_STD_BMAP,
> >>> KVM_REG_ARM_STD_HYP_BMAP,
> >>> + KVM_REG_ARM_VENDOR_HYP_BMAP,
> >>> };
> >>>
> >>> void kvm_arm_init_hypercalls(struct kvm *kvm)
> >>> @@ -230,6 +234,7 @@ void kvm_arm_init_hypercalls(struct kvm *kvm)
> >>>
> >>> smccc_feat->std_bmap = KVM_ARM_SMCCC_STD_FEATURES;
> >>> smccc_feat->std_hyp_bmap = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> >>> + smccc_feat->vendor_hyp_bmap = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> >>> }
> >>>
> >>> int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> >>> @@ -322,6 +327,9 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>> case KVM_REG_ARM_STD_HYP_BMAP:
> >>> val = READ_ONCE(smccc_feat->std_hyp_bmap);
> >>> break;
> >>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> >>> + val = READ_ONCE(smccc_feat->vendor_hyp_bmap);
> >>> + break;
> >>> default:
> >>> return -ENOENT;
> >>> }
> >>> @@ -348,6 +356,10 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
> >>> fw_reg_bmap = &smccc_feat->std_hyp_bmap;
> >>> fw_reg_features = KVM_ARM_SMCCC_STD_HYP_FEATURES;
> >>> break;
> >>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> >>> + fw_reg_bmap = &smccc_feat->vendor_hyp_bmap;
> >>> + fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES;
> >>> + break;
> >>> default:
> >>> return -ENOENT;
> >>> }
> >>
> >> If KVM_REG_ARM_VENDOR_HYP_BIT_{FUNC_FEAT, PTP} aren't parallel to each other,
> >> special code is needed to gurantee PTP is cleared if VENDOR_HYP is disabled.
> >>
> > Please see the above comment :)
> >
>
> Thanks for the pointer and explanation :)
>
> >>> @@ -453,6 +465,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>> return 0;
> >>> case KVM_REG_ARM_STD_BMAP:
> >>> case KVM_REG_ARM_STD_HYP_BMAP:
> >>> + case KVM_REG_ARM_VENDOR_HYP_BMAP:
> >>> return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val);
> >>> default:
> >>> return -ENOENT;
> >>> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> >>> index b0915d8c5b81..eaf4f6b318a8 100644
> >>> --- a/include/kvm/arm_hypercalls.h
> >>> +++ b/include/kvm/arm_hypercalls.h
> >>> @@ -9,6 +9,7 @@
> >>> /* Last valid bits of the bitmapped firmware registers */
> >>> #define KVM_REG_ARM_STD_BMAP_BIT_MAX 0
> >>> #define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0
> >>> +#define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1
> >>>
> >>> #define KVM_ARM_SMCCC_STD_FEATURES \
> >>> GENMASK_ULL(KVM_REG_ARM_STD_BMAP_BIT_MAX, 0)
> >>> @@ -16,6 +17,9 @@
> >>> #define KVM_ARM_SMCCC_STD_HYP_FEATURES \
> >>> GENMASK_ULL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX, 0)
> >>>
> >>> +#define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \
> >>> + GENMASK_ULL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX, 0)
> >>> +
> >>> int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
> >>>
> >>> static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> >>>
> >>
> >
> > Thanks for the review.
> >
>
> No worries and sorry for the late chime-in :)
>
> >
> > [1]: https://lore.kernel.org/lkml/[email protected]/
> >
>
> Thanks,
> Gavin
>
Hi Raghavendra,
On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
> Continuing the discussion from [1], the series tries to add support
> for the userspace to elect the hypercall services that it wishes
> to expose to the guest, rather than the guest discovering them
> unconditionally. The idea employed by the series was taken from
> [1] as suggested by Marc Z.
>
> In a broad sense, the concept is similar to the current implementation
> of PSCI interface- create a 'firmware psuedo-register' to handle the
> firmware revisions. The series extends this idea to all the other
> hypercalls such as TRNG (True Random Number Generator), PV_TIME
> (Paravirtualized Time), and PTP (Precision Time protocol).
>
> For better categorization and future scaling, these firmware registers
> are categorized based on the service call owners. Also, unlike the
> existing firmware psuedo-registers, they hold the features supported
> in the form of a bitmap.
>
> During the VM initialization, the registers holds an upper-limit of
> the features supported by each one of them. It's expected that the
> userspace discover the features provided by each register via GET_ONE_REG,
> and writeback the desired values using SET_ONE_REG. KVM allows this
> modification only until the VM has started.
>
> Some of the standard function-ids, such as ARM_SMCCC_VERSION_FUNC_ID,
> need not be associated with a feature bit. For such ids, the series
> introduced an allowed-list, hvc_func_default_allowed_list[], that holds
> all such ids. As a result, the functions that are not elected by userspace,
> or if they are not a part of this allowed-list, will be denied for when
> the guests invoke them.
>
> Older VMMs can simply ignore this interface and the hypercall services
> will be exposed unconditionally to the guests, thus ensuring backward
> compatibility.
>
[...]
I rethinking about the design again and just get one question. Hopefully,
someone have the answer for us. The newly added 3 pseudo registers and
the existing ones like KVM_REG_ARM_PSCI_VERSION are all tied up with
vcpu, instead of VM. I don't think it's correct. I'm not sure if VM-scoped
pseudo registers aren't allowed by ARM architecture or the effort isn't
worthy to support it.
These pseudo registers are introduced to present the available hypercalls,
and then they can be disabled from userspace. In the implementation, these 3
registers are vcpu scoped. It means that multiple vcpus can be asymmetric
in terms of usable hypercalls. For example, ARM_SMCCC_TRNG hypercalls
can be enabled on vcpu0, but disabled on vcpu1. I don't think it's expected.
On the other hand, the information stored in these 3 registers needs to
be migrated through {GET,SET}_ONE_REG by VMM (QEMU). all the information
stored in these 3 registers are all same on all vcpus, which is exactly
as we expect. In migration circumstance, we're transporting identical
information for all vcpus and it's unnecessary.
Thanks,
Gavin
Hi Marc,
On 4/15/22 4:58 PM, Marc Zyngier wrote:
> On Fri, 15 Apr 2022 07:44:55 +0100,
> Gavin Shan <[email protected]> wrote:
>> On 4/7/22 9:15 AM, Raghavendra Rao Ananta wrote:
>>> Continuing the discussion from [1], the series tries to add support
>>> for the userspace to elect the hypercall services that it wishes
>>> to expose to the guest, rather than the guest discovering them
>>> unconditionally. The idea employed by the series was taken from
>>> [1] as suggested by Marc Z.
>>>
>>> In a broad sense, the concept is similar to the current implementation
>>> of PSCI interface- create a 'firmware psuedo-register' to handle the
>>> firmware revisions. The series extends this idea to all the other
>>> hypercalls such as TRNG (True Random Number Generator), PV_TIME
>>> (Paravirtualized Time), and PTP (Precision Time protocol).
>>>
>>> For better categorization and future scaling, these firmware registers
>>> are categorized based on the service call owners. Also, unlike the
>>> existing firmware psuedo-registers, they hold the features supported
>>> in the form of a bitmap.
>>>
>>> During the VM initialization, the registers holds an upper-limit of
>>> the features supported by each one of them. It's expected that the
>>> userspace discover the features provided by each register via GET_ONE_REG,
>>> and writeback the desired values using SET_ONE_REG. KVM allows this
>>> modification only until the VM has started.
>>>
>>> Some of the standard function-ids, such as ARM_SMCCC_VERSION_FUNC_ID,
>>> need not be associated with a feature bit. For such ids, the series
>>> introduced an allowed-list, hvc_func_default_allowed_list[], that holds
>>> all such ids. As a result, the functions that are not elected by userspace,
>>> or if they are not a part of this allowed-list, will be denied for when
>>> the guests invoke them.
>>>
>>> Older VMMs can simply ignore this interface and the hypercall services
>>> will be exposed unconditionally to the guests, thus ensuring backward
>>> compatibility.
>>>
>>
>> [...]
>>
>> I rethinking about the design again and just get one question. Hopefully,
>> someone have the answer for us. The newly added 3 pseudo registers and
>> the existing ones like KVM_REG_ARM_PSCI_VERSION are all tied up with
>> vcpu, instead of VM. I don't think it's correct. I'm not sure if VM-scoped
>> pseudo registers aren't allowed by ARM architecture or the effort isn't
>> worthy to support it.
>
> We have had that discussion before (around version 2 of this series,
> if I remember well).
>
Yeah, I'm chime-in this series lately. There must be some discussions,
including this topic, I missed :)
>>
>> These pseudo registers are introduced to present the available hypercalls,
>> and then they can be disabled from userspace. In the implementation, these 3
>> registers are vcpu scoped. It means that multiple vcpus can be asymmetric
>> in terms of usable hypercalls. For example, ARM_SMCCC_TRNG hypercalls
>> can be enabled on vcpu0, but disabled on vcpu1. I don't think it's expected.
>
> No, that's not the way this is supposed to work. These hypercalls are
> of course global, even if the accessor is per-vcpu. This is similar to
> tons of other things, such as some of the PMU data, the timer virtual
> offset... the list goes on. If that's not what this code does, then it
> is a bug and it needs to be fixed.
>
Ok.
>> On the other hand, the information stored in these 3 registers needs to
>> be migrated through {GET,SET}_ONE_REG by VMM (QEMU). all the information
>> stored in these 3 registers are all same on all vcpus, which is exactly
>> as we expect. In migration circumstance, we're transporting identical
>> information for all vcpus and it's unnecessary.
>
> Yes, we all understand that. My response to that was (and still is):
>
> - There is no need to invent a new userspace interface. The one we
> have is terrible enough, and we don't need another square wheel that
> would need to be maintained beside the existing one.
>
> - Let's say we have 1024 new pseudo-registers, 1024 vcpus, 64bit regs:
> that's 8MB worth of extra data. This is not insignificant, but also
> not really a problem given that such a large VM is probably attached
> to a proportionally large amount of memory. In practice, we're
> talking of less than 10 registers, and less than 100 vcpus. A crazy
> 8kB at most. Who cares?
>
> - If this is eventually deemed to be a *real* scalability problem, we
> can always expose a map of registers that are global, and let
> userspace know that it can elide the rest. Problem solved, backward
> compatibility preserved. And I'm willing to bet that we won't need
> it in my lifetime.
>
The reason why I raised question is just to check if it's a missed
point in the design. As I said, I obviously missed the previous
discussions and glad that this has been discussed through.
Thanks for the details. Yes, it's totally fine to migrate 8KB data.
Besides, VMM (QEMU) can choose to do migration on one single vcpu,
instead of all of them, as you said.
Thanks,
Gavin