2018-02-01 11:54:11

by Marc Zyngier

Subject: [PATCH v3 00/18] arm64: Add SMCCC v1.1 support and CVE-2017-5715 (Spectre variant 2) mitigation

ARM has recently published an SMC Calling Convention (SMCCC)
specification update[1] that provides an optimised calling convention
and optional, discoverable support for mitigating CVE-2017-5715. ARM
Trusted Firmware (ATF) has already gained such an implementation[2].

This series addresses a few things:

- It provides a KVM implementation of PSCI v1.0, which is a
prerequisite for being able to discover SMCCC v1.1, together with a
new userspace API to control the PSCI revision number that the guest
sees.

- It allows KVM to advertise SMCCC v1.1, which is de-facto supported
already (it never corrupts any of the guest registers).

- It implements KVM support for the ARCH_WORKAROUND_1 function that is
used to mitigate CVE-2017-5715 in a guest (if such mitigation is
available on the host).

- It implements SMCCC v1.1 and ARCH_WORKAROUND_1 discovery support in
the kernel itself.

- It finally provides firmware callbacks for CVE-2017-5715 for both
kernel and KVM and drops the initial PSCI_GET_VERSION based
mitigation.

Patch 1 is already merged, and is included here for reference. The
patches are based on arm64/for-next/core. Tested on Seattle and Juno,
the latter with ATF implementing SMCCC v1.1.

[1]: https://developer.arm.com/support/security-update/downloads/

[2]: https://github.com/ARM-software/arm-trusted-firmware/pull/1240

* From v2:
- Fixed SMC handling in KVM
- PSCI fixes and tidying up
- SMCCC primitive rework for better code generation (both efficiency
and correctness)
- Remove PSCI_GET_VERSION as a mitigation vector

* From v1:
- Fixed 32bit build
- Fix function number sign extension (Ard)
- Inline SMCCC v1.1 primitives (cpp soup)
- Prevent SMCCC spamming on feature probing
- Random fixes and tidying up

Marc Zyngier (18):
arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
arm64: KVM: Increment PC after handling an SMC trap
arm/arm64: KVM: Consolidate the PSCI include files
arm/arm64: KVM: Add PSCI_VERSION helper
arm/arm64: KVM: Add smccc accessors to PSCI code
arm/arm64: KVM: Implement PSCI 1.0 support
arm/arm64: KVM: Add PSCI version selection API
arm/arm64: KVM: Advertise SMCCC v1.1
arm/arm64: KVM: Turn kvm_psci_version into a static inline
arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
firmware/psci: Expose PSCI conduit
firmware/psci: Expose SMCCC version through psci_ops
arm/arm64: smccc: Make function identifiers an unsigned quantity
arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
arm64: Kill PSCI_GET_VERSION as a variant-2 workaround

Documentation/virtual/kvm/api.txt | 3 +-
Documentation/virtual/kvm/arm/psci.txt | 30 +++++
arch/arm/include/asm/kvm_host.h | 10 ++
arch/arm/include/asm/kvm_psci.h | 27 -----
arch/arm/include/uapi/asm/kvm.h | 6 +
arch/arm/kvm/guest.c | 13 +++
arch/arm/kvm/handle_exit.c | 17 ++-
arch/arm64/include/asm/kvm_host.h | 9 ++
arch/arm64/include/asm/kvm_psci.h | 27 -----
arch/arm64/include/uapi/asm/kvm.h | 6 +
arch/arm64/kernel/bpi.S | 44 ++++----
arch/arm64/kernel/cpu_errata.c | 77 ++++++++++---
arch/arm64/kvm/guest.c | 14 ++-
arch/arm64/kvm/handle_exit.c | 18 ++-
arch/arm64/kvm/hyp/hyp-entry.S | 20 +++-
arch/arm64/kvm/hyp/switch.c | 14 +--
drivers/firmware/psci.c | 47 +++++++-
include/kvm/arm_psci.h | 63 +++++++++++
include/linux/arm-smccc.h | 167 +++++++++++++++++++++++++++-
include/linux/psci.h | 13 +++
include/uapi/linux/psci.h | 3 +
virt/kvm/arm/arm.c | 2 +-
virt/kvm/arm/psci.c | 196 +++++++++++++++++++++++++++++----
23 files changed, 677 insertions(+), 149 deletions(-)
create mode 100644 Documentation/virtual/kvm/arm/psci.txt
delete mode 100644 arch/arm/include/asm/kvm_psci.h
delete mode 100644 arch/arm64/include/asm/kvm_psci.h
create mode 100644 include/kvm/arm_psci.h

--
2.14.2



2018-02-01 11:49:03

by Marc Zyngier

Subject: [PATCH v3 01/18] arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls

KVM doesn't follow the SMCCC when it comes to unimplemented calls,
and injects an UNDEF instead of returning an error. Since firmware
calls are now used for security mitigation, they are becoming more
common, and the UNDEF is counterproductive.

Instead, let's follow the SMCCC which states that -1 must be returned
to the caller when getting an unknown function number.

Cc: <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
---
arch/arm64/kvm/handle_exit.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index c09fc5a576c7..520b0dad3c62 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -53,7 +53,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

ret = kvm_psci_call(vcpu);
if (ret < 0) {
- kvm_inject_undefined(vcpu);
+ vcpu_set_reg(vcpu, 0, ~0UL);
return 1;
}

@@ -62,7 +62,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
- kvm_inject_undefined(vcpu);
+ vcpu_set_reg(vcpu, 0, ~0UL);
return 1;
}

--
2.14.2
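
The behaviour change in the patch above can be modelled outside the
kernel. What follows is a minimal sketch, not KVM code: struct
sketch_vcpu, sketch_psci_call() and sketch_handle_hvc() are hypothetical
stand-ins for the real vcpu, kvm_psci_call() and handle_hvc(). It only
shows an unknown function ID yielding -1 in the first guest register
rather than an exception.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest register file (x0-x3 only). */
struct sketch_vcpu {
	uint64_t regs[4];
};

/* PSCI_VERSION is the only call this toy model implements. */
#define SKETCH_FN_PSCI_VERSION	0x84000000u

/* Returns 0 when the call was handled, -1 for an unknown function ID. */
static int sketch_psci_call(struct sketch_vcpu *vcpu)
{
	if ((uint32_t)vcpu->regs[0] == SKETCH_FN_PSCI_VERSION) {
		vcpu->regs[0] = 2;	/* PSCI v0.2: major 0, minor 2 */
		return 0;
	}
	return -1;
}

/* Models the patched handler: report -1 instead of injecting an UNDEF. */
static int sketch_handle_hvc(struct sketch_vcpu *vcpu)
{
	assert(vcpu);

	if (sketch_psci_call(vcpu) < 0)
		vcpu->regs[0] = ~UINT64_C(0);	/* all ones, i.e. -1 */

	return 1;	/* resume the guest in both cases */
}
```

A guest that probes for an optional firmware service this way can test
the returned value and carry on, instead of having to survive an
undefined-instruction exception.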


2018-02-01 11:49:08

by Marc Zyngier

Subject: [PATCH v3 07/18] arm/arm64: KVM: Implement PSCI 1.0 support

PSCI 1.0 can be trivially implemented by having PSCI 0.2 and
the FEATURES call. Oh, and returning 1.0 as the PSCI version.

We happily ignore everything else, as it is optional.

Signed-off-by: Marc Zyngier <[email protected]>
---
include/kvm/arm_psci.h | 1 +
virt/kvm/arm/psci.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)

diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 5659343580a3..5446435457c2 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -22,6 +22,7 @@

#define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
#define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
+#define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)

int kvm_psci_version(struct kvm_vcpu *vcpu);
int kvm_psci_call(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index c41553d35110..291874cff85e 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -313,6 +313,47 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
return ret;
}

+static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
+{
+ u32 psci_fn = smccc_get_function(vcpu);
+ u32 feature;
+ unsigned long val;
+ int ret = 1;
+
+ switch(psci_fn) {
+ case PSCI_0_2_FN_PSCI_VERSION:
+ val = KVM_ARM_PSCI_1_0;
+ break;
+ case PSCI_1_0_FN_PSCI_FEATURES:
+ feature = smccc_get_arg1(vcpu);
+ switch(feature) {
+ case PSCI_0_2_FN_PSCI_VERSION:
+ case PSCI_0_2_FN_CPU_SUSPEND:
+ case PSCI_0_2_FN64_CPU_SUSPEND:
+ case PSCI_0_2_FN_CPU_OFF:
+ case PSCI_0_2_FN_CPU_ON:
+ case PSCI_0_2_FN64_CPU_ON:
+ case PSCI_0_2_FN_AFFINITY_INFO:
+ case PSCI_0_2_FN64_AFFINITY_INFO:
+ case PSCI_0_2_FN_MIGRATE_INFO_TYPE:
+ case PSCI_0_2_FN_SYSTEM_OFF:
+ case PSCI_0_2_FN_SYSTEM_RESET:
+ case PSCI_1_0_FN_PSCI_FEATURES:
+ val = 0;
+ break;
+ default:
+ val = PSCI_RET_NOT_SUPPORTED;
+ break;
+ }
+ break;
+ default:
+ return kvm_psci_0_2_call(vcpu);
+ }
+
+ smccc_set_retval(vcpu, val, 0, 0, 0);
+ return ret;
+}
+
static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -355,6 +396,8 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
int kvm_psci_call(struct kvm_vcpu *vcpu)
{
switch (kvm_psci_version(vcpu)) {
+ case KVM_ARM_PSCI_1_0:
+ return kvm_psci_1_0_call(vcpu);
case KVM_ARM_PSCI_0_2:
return kvm_psci_0_2_call(vcpu);
case KVM_ARM_PSCI_0_1:
--
2.14.2
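
As a standalone illustration of the two things the patch above adds (the
1.0 version value and the FEATURES query), here is a minimal sketch. All
SKETCH_* names and sketch_psci_features() are hypothetical; only a
handful of function IDs are modelled, and the assumption is the usual
PSCI encoding of major version in bits [31:16] and minor in [15:0].

```c
#include <assert.h>
#include <stdint.h>

/* Version encoding used by PSCI: major in bits [31:16], minor in [15:0]. */
#define SKETCH_PSCI_VERSION(maj, min) \
	((((uint32_t)(maj)) << 16) | ((uint32_t)(min) & 0xffffu))

#define SKETCH_RET_SUCCESS		0
#define SKETCH_RET_NOT_SUPPORTED	(-1)

/* A few 32-bit PSCI function IDs (0x84000000 + function number). */
#define SKETCH_FN_PSCI_VERSION	0x84000000u
#define SKETCH_FN_CPU_OFF	0x84000002u
#define SKETCH_FN_MIGRATE	0x84000005u	/* absent from the patch's list */
#define SKETCH_FN_PSCI_FEATURES	0x8400000au

/* Answer a FEATURES query: 0 if the function exists, -1 otherwise. */
static int32_t sketch_psci_features(uint32_t fn)
{
	switch (fn) {
	case SKETCH_FN_PSCI_VERSION:
	case SKETCH_FN_CPU_OFF:
	case SKETCH_FN_PSCI_FEATURES:
		return SKETCH_RET_SUCCESS;
	default:
		return SKETCH_RET_NOT_SUPPORTED;
	}
}
```

Anything not recognised by the 1.0 layer falls through to the 0.2
handler in the real patch; the sketch leaves that part out.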


2018-02-01 11:49:26

by Marc Zyngier

Subject: [PATCH v3 13/18] firmware/psci: Expose PSCI conduit

In order to call into the firmware to apply workarounds, it is
useful to find out whether we're using HVC or SMC. Let's expose
this through the psci_ops.

Acked-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/firmware/psci.c | 28 +++++++++++++++++++++++-----
include/linux/psci.h | 7 +++++++
2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index 8b25d31e8401..e9493da2b111 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -59,7 +59,9 @@ bool psci_tos_resident_on(int cpu)
return cpu == resident_cpu;
}

-struct psci_operations psci_ops;
+struct psci_operations psci_ops = {
+ .conduit = PSCI_CONDUIT_NONE,
+};

typedef unsigned long (psci_fn)(unsigned long, unsigned long,
unsigned long, unsigned long);
@@ -210,6 +212,22 @@ static unsigned long psci_migrate_info_up_cpu(void)
0, 0, 0);
}

+static void set_conduit(enum psci_conduit conduit)
+{
+ switch (conduit) {
+ case PSCI_CONDUIT_HVC:
+ invoke_psci_fn = __invoke_psci_fn_hvc;
+ break;
+ case PSCI_CONDUIT_SMC:
+ invoke_psci_fn = __invoke_psci_fn_smc;
+ break;
+ default:
+ WARN(1, "Unexpected PSCI conduit %d\n", conduit);
+ }
+
+ psci_ops.conduit = conduit;
+}
+
static int get_set_conduit_method(struct device_node *np)
{
const char *method;
@@ -222,9 +240,9 @@ static int get_set_conduit_method(struct device_node *np)
}

if (!strcmp("hvc", method)) {
- invoke_psci_fn = __invoke_psci_fn_hvc;
+ set_conduit(PSCI_CONDUIT_HVC);
} else if (!strcmp("smc", method)) {
- invoke_psci_fn = __invoke_psci_fn_smc;
+ set_conduit(PSCI_CONDUIT_SMC);
} else {
pr_warn("invalid \"method\" property: %s\n", method);
return -EINVAL;
@@ -654,9 +672,9 @@ int __init psci_acpi_init(void)
pr_info("probing for conduit method from ACPI.\n");

if (acpi_psci_use_hvc())
- invoke_psci_fn = __invoke_psci_fn_hvc;
+ set_conduit(PSCI_CONDUIT_HVC);
else
- invoke_psci_fn = __invoke_psci_fn_smc;
+ set_conduit(PSCI_CONDUIT_SMC);

return psci_probe();
}
diff --git a/include/linux/psci.h b/include/linux/psci.h
index f724fd8c78e8..f2679e5faa4f 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -25,6 +25,12 @@ bool psci_tos_resident_on(int cpu);
int psci_cpu_init_idle(unsigned int cpu);
int psci_cpu_suspend_enter(unsigned long index);

+enum psci_conduit {
+ PSCI_CONDUIT_NONE,
+ PSCI_CONDUIT_SMC,
+ PSCI_CONDUIT_HVC,
+};
+
struct psci_operations {
u32 (*get_version)(void);
int (*cpu_suspend)(u32 state, unsigned long entry_point);
@@ -34,6 +40,7 @@ struct psci_operations {
int (*affinity_info)(unsigned long target_affinity,
unsigned long lowest_affinity_level);
int (*migrate_info_type)(void);
+ enum psci_conduit conduit;
};

extern struct psci_operations psci_ops;
--
2.14.2
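
The reason for recording the conduit, as described in the patch above,
is that later code has to issue the same instruction PSCI uses. A
minimal userspace sketch follows, assuming hypothetical helpers
sketch_parse_method() and sketch_conduit_insn() that are not part of the
series; the enum mirrors the one the patch introduces.

```c
#include <assert.h>
#include <string.h>

enum sketch_conduit {
	SKETCH_CONDUIT_NONE,
	SKETCH_CONDUIT_SMC,
	SKETCH_CONDUIT_HVC,
};

/* Map the firmware "method" string onto a conduit, NONE if invalid. */
static enum sketch_conduit sketch_parse_method(const char *method)
{
	if (!strcmp(method, "hvc"))
		return SKETCH_CONDUIT_HVC;
	if (!strcmp(method, "smc"))
		return SKETCH_CONDUIT_SMC;
	return SKETCH_CONDUIT_NONE;
}

/* A consumer of the recorded conduit: which instruction to emit. */
static const char *sketch_conduit_insn(enum sketch_conduit conduit)
{
	switch (conduit) {
	case SKETCH_CONDUIT_HVC:
		return "hvc #0";
	case SKETCH_CONDUIT_SMC:
		return "smc #0";
	default:
		return NULL;	/* no firmware interface available */
	}
}
```

Defaulting to NONE means a consumer that runs before (or without) PSCI
probing sees an explicit "no conduit" value rather than guessing.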


2018-02-01 11:49:30

by Marc Zyngier

Subject: [PATCH v3 15/18] arm/arm64: smccc: Make function identifiers an unsigned quantity

Function identifiers are a 32bit, unsigned quantity. But we never
tell the compiler so, resulting in the following:

4ac: b26187e0 mov x0, #0xffffffff80000001

We thus rely on the firmware narrowing it for us, which is not
always a reasonable expectation.

Cc: [email protected]
Reported-by: Ard Biesheuvel <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Tested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
include/linux/arm-smccc.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index e1ef944ef1da..dd44d8458c04 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -14,14 +14,16 @@
#ifndef __LINUX_ARM_SMCCC_H
#define __LINUX_ARM_SMCCC_H

+#include <uapi/linux/const.h>
+
/*
* This file provides common defines for ARM SMC Calling Convention as
* specified in
* http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
*/

-#define ARM_SMCCC_STD_CALL 0
-#define ARM_SMCCC_FAST_CALL 1
+#define ARM_SMCCC_STD_CALL _AC(0,U)
+#define ARM_SMCCC_FAST_CALL _AC(1,U)
#define ARM_SMCCC_TYPE_SHIFT 31

#define ARM_SMCCC_SMC_32 0
--
2.14.2
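
The sign extension described in the patch above can be reproduced in
plain C. This is a minimal sketch with hypothetical helper names; it
uses a function ID whose top bit is set, the same value that appears in
the disassembly quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The ID is held in a signed 32-bit type. Converting a negative value
 * to uint64_t is defined as reduction modulo 2^64, which yields the
 * same bit pattern as sign extension.
 */
static uint64_t sketch_widen_signed(int32_t id)
{
	return (uint64_t)id;
}

/* The ID is held in an unsigned 32-bit type: zero extension. */
static uint64_t sketch_widen_unsigned(uint32_t id)
{
	return (uint64_t)id;
}
```

With INT32_MIN + 1 (bit pattern 0x80000001) the signed variant produces
0xffffffff80000001, while the unsigned variant keeps the upper 32 bits
clear, which is what the firmware should be handed.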


2018-02-01 11:50:09

by Marc Zyngier

Subject: [PATCH v3 16/18] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive

One of the major improvements of SMCCC v1.1 is that it only clobbers
the first 4 registers, both on 32 and 64bit. This means that it
becomes very easy to provide an inline version of the SMC call
primitive, and avoid performing a function call to stash the
registers that would otherwise be clobbered by SMCCC v1.0.

Signed-off-by: Marc Zyngier <[email protected]>
---
include/linux/arm-smccc.h | 143 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 143 insertions(+)

diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index dd44d8458c04..575aabe85905 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -150,5 +150,148 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,

#define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)

+/* SMCCC v1.1 implementation madness follows */
+#ifdef CONFIG_ARM64
+
+#define SMCCC_SMC_INST "smc #0"
+#define SMCCC_HVC_INST "hvc #0"
+
+#endif
+
+#ifdef CONFIG_ARM
+#include <asm/opcodes-sec.h>
+#include <asm/opcodes-virt.h>
+
+#define SMCCC_SMC_INST __SMC(0)
+#define SMCCC_HVC_INST __HVC(0)
+
+#endif
+
+#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
+
+#define __count_args(...) \
+ ___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0)
+
+#define __constraint_write_0 \
+ "+r" (r0), "=&r" (r1), "=&r" (r2), "=&r" (r3)
+#define __constraint_write_1 \
+ "+r" (r0), "+r" (r1), "=&r" (r2), "=&r" (r3)
+#define __constraint_write_2 \
+ "+r" (r0), "+r" (r1), "+r" (r2), "=&r" (r3)
+#define __constraint_write_3 \
+ "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
+#define __constraint_write_4 __constraint_write_3
+#define __constraint_write_5 __constraint_write_4
+#define __constraint_write_6 __constraint_write_5
+#define __constraint_write_7 __constraint_write_6
+
+#define __constraint_read_0
+#define __constraint_read_1
+#define __constraint_read_2
+#define __constraint_read_3
+#define __constraint_read_4 "r" (r4)
+#define __constraint_read_5 __constraint_read_4, "r" (r5)
+#define __constraint_read_6 __constraint_read_5, "r" (r6)
+#define __constraint_read_7 __constraint_read_6, "r" (r7)
+
+#define __declare_arg_0(a0, res) \
+ struct arm_smccc_res *___res = res; \
+ register u32 r0 asm("r0") = a0; \
+ register unsigned long r1 asm("r1"); \
+ register unsigned long r2 asm("r2"); \
+ register unsigned long r3 asm("r3")
+
+#define __declare_arg_1(a0, a1, res) \
+ struct arm_smccc_res *___res = res; \
+ register u32 r0 asm("r0") = a0; \
+ register typeof(a1) r1 asm("r1") = a1; \
+ register unsigned long r2 asm("r2"); \
+ register unsigned long r3 asm("r3")
+
+#define __declare_arg_2(a0, a1, a2, res) \
+ struct arm_smccc_res *___res = res; \
+ register u32 r0 asm("r0") = a0; \
+ register typeof(a1) r1 asm("r1") = a1; \
+ register typeof(a2) r2 asm("r2") = a2; \
+ register unsigned long r3 asm("r3")
+
+#define __declare_arg_3(a0, a1, a2, a3, res) \
+ struct arm_smccc_res *___res = res; \
+ register u32 r0 asm("r0") = a0; \
+ register typeof(a1) r1 asm("r1") = a1; \
+ register typeof(a2) r2 asm("r2") = a2; \
+ register typeof(a3) r3 asm("r3") = a3
+
+#define __declare_arg_4(a0, a1, a2, a3, a4, res) \
+ __declare_arg_3(a0, a1, a2, a3, res); \
+ register typeof(a4) r4 asm("r4") = a4
+
+#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res) \
+ __declare_arg_4(a0, a1, a2, a3, a4, res); \
+ register typeof(a5) r5 asm("r5") = a5
+
+#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res) \
+ __declare_arg_5(a0, a1, a2, a3, a4, a5, res); \
+ register typeof(a6) r6 asm("r6") = a6
+
+#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \
+ __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res); \
+ register typeof(a7) r7 asm("r7") = a7
+
+#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
+#define __declare_args(count, ...) ___declare_args(count, __VA_ARGS__)
+
+#define ___constraints(count) \
+ : __constraint_write_ ## count \
+ : __constraint_read_ ## count \
+ : "memory"
+#define __constraints(count) ___constraints(count)
+
+/*
+ * We have an output list that is not necessarily used, and GCC feels
+ * entitled to optimise the whole sequence away. "volatile" is what
+ * makes it stick.
+ */
+#define __arm_smccc_1_1(inst, ...) \
+ do { \
+ __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
+ asm volatile(inst "\n" \
+ __constraints(__count_args(__VA_ARGS__))); \
+ if (___res) \
+ *___res = (typeof(*___res)){r0, r1, r2, r3}; \
+ } while (0)
+
+/*
+ * arm_smccc_1_1_smc() - make an SMCCC v1.1 compliant SMC call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro is used to make SMC calls following SMC Calling Convention v1.1.
+ * The content of the supplied param are copied to registers 0 to 7 prior
+ * to the SMC instruction. The return values are updated with the content
+ * from register 0 to 3 on return from the SMC instruction if not NULL.
+ */
+#define arm_smccc_1_1_smc(...) __arm_smccc_1_1(SMCCC_SMC_INST, __VA_ARGS__)
+
+/*
+ * arm_smccc_1_1_hvc() - make an SMCCC v1.1 compliant HVC call
+ *
+ * This is a variadic macro taking one to eight source arguments, and
+ * an optional return structure.
+ *
+ * @a0-a7: arguments passed in registers 0 to 7
+ * @res: result values from registers 0 to 3
+ *
+ * This macro is used to make HVC calls following SMC Calling Convention v1.1.
+ * The content of the supplied param are copied to registers 0 to 7 prior
+ * to the HVC instruction. The return values are updated with the content
+ * from register 0 to 3 on return from the HVC instruction if not NULL.
+ */
+#define arm_smccc_1_1_hvc(...) __arm_smccc_1_1(SMCCC_HVC_INST, __VA_ARGS__)
+
#endif /*__ASSEMBLY__*/
#endif /*__LINUX_ARM_SMCCC_H*/
--
2.14.2
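
The preprocessor trick the patch above relies on (count the arguments,
then paste the count onto a macro name) can be tried in isolation. This
is a minimal sketch: the SKETCH_* names are hypothetical, and unlike the
patch it appends a trailing "pad" token so that the variadic part is
never empty under strict ISO C.

```c
#include <assert.h>

/* Select the tenth argument, whatever precedes it. */
#define SKETCH_PICK_10TH(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x

/*
 * Number of arguments beyond the function ID and the result pointer.
 * Each extra argument pushes a higher number into the tenth slot.
 */
#define SKETCH_COUNT_ARGS(...) \
	SKETCH_PICK_10TH(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0, pad)

/* One macro per count, standing in for the __declare_arg_N family. */
#define SKETCH_INPUT_REGS_0 1
#define SKETCH_INPUT_REGS_1 2
#define SKETCH_INPUT_REGS_2 3
#define SKETCH_INPUT_REGS_3 4
#define SKETCH_INPUT_REGS_4 5
#define SKETCH_INPUT_REGS_5 6
#define SKETCH_INPUT_REGS_6 7
#define SKETCH_INPUT_REGS_7 8

/* Two levels, so the count is fully expanded before being pasted. */
#define SKETCH_CAT_(a, b) a ## b
#define SKETCH_CAT(a, b) SKETCH_CAT_(a, b)

/* Number of input registers a given call would load. */
#define SKETCH_INPUT_REGS(...) \
	SKETCH_CAT(SKETCH_INPUT_REGS_, SKETCH_COUNT_ARGS(__VA_ARGS__))
```

For example, SKETCH_INPUT_REGS(id, arg, res) pastes to
SKETCH_INPUT_REGS_1, i.e. two input registers. The indirection through
SKETCH_CAT plays the same role as the ___declare_args/__declare_args
pair in the patch.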


2018-02-01 11:50:40

by Marc Zyngier

Subject: [PATCH v3 17/18] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support

Add the detection and runtime code for ARM_SMCCC_ARCH_WORKAROUND_1.
It is lovely. Really.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/kernel/bpi.S | 20 +++++++++++++
arch/arm64/kernel/cpu_errata.c | 68 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
index 76225c2611ea..fdeed629f2c6 100644
--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -17,6 +17,7 @@
*/

#include <linux/linkage.h>
+#include <linux/arm-smccc.h>

.macro ventry target
.rept 31
@@ -85,3 +86,22 @@ ENTRY(__qcom_hyp_sanitize_link_stack_start)
.endr
ldp x29, x30, [sp], #16
ENTRY(__qcom_hyp_sanitize_link_stack_end)
+
+.macro smccc_workaround_1 inst
+ sub sp, sp, #(8 * 4)
+ stp x2, x3, [sp, #(8 * 0)]
+ stp x0, x1, [sp, #(8 * 2)]
+ mov w0, #ARM_SMCCC_ARCH_WORKAROUND_1
+ \inst #0
+ ldp x2, x3, [sp, #(8 * 0)]
+ ldp x0, x1, [sp, #(8 * 2)]
+ add sp, sp, #(8 * 4)
+.endm
+
+ENTRY(__smccc_workaround_1_smc_start)
+ smccc_workaround_1 smc
+ENTRY(__smccc_workaround_1_smc_end)
+
+ENTRY(__smccc_workaround_1_hvc_start)
+ smccc_workaround_1 hvc
+ENTRY(__smccc_workaround_1_hvc_end)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index ed6881882231..9e77809a3b23 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -70,6 +70,10 @@ DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
extern char __qcom_hyp_sanitize_link_stack_start[];
extern char __qcom_hyp_sanitize_link_stack_end[];
+extern char __smccc_workaround_1_smc_start[];
+extern char __smccc_workaround_1_smc_end[];
+extern char __smccc_workaround_1_hvc_start[];
+extern char __smccc_workaround_1_hvc_end[];

static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
@@ -116,6 +120,10 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
#define __psci_hyp_bp_inval_end NULL
#define __qcom_hyp_sanitize_link_stack_start NULL
#define __qcom_hyp_sanitize_link_stack_end NULL
+#define __smccc_workaround_1_smc_start NULL
+#define __smccc_workaround_1_smc_end NULL
+#define __smccc_workaround_1_hvc_start NULL
+#define __smccc_workaround_1_hvc_end NULL

static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
const char *hyp_vecs_start,
@@ -142,17 +150,75 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
}

+#include <uapi/linux/psci.h>
+#include <linux/arm-smccc.h>
#include <linux/psci.h>

+static void call_smc_arch_workaround_1(void)
+{
+ arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
+}
+
+static void call_hvc_arch_workaround_1(void)
+{
+ arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
+}
+
+static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
+{
+ bp_hardening_cb_t cb;
+ void *smccc_start, *smccc_end;
+ struct arm_smccc_res res;
+
+ if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+ return false;
+
+ if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
+ return false;
+
+ switch (psci_ops.conduit) {
+ case PSCI_CONDUIT_HVC:
+ arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+ ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+ if (res.a0)
+ return false;
+ cb = call_hvc_arch_workaround_1;
+ smccc_start = __smccc_workaround_1_hvc_start;
+ smccc_end = __smccc_workaround_1_hvc_end;
+ break;
+
+ case PSCI_CONDUIT_SMC:
+ arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+ ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+ if (res.a0)
+ return false;
+ cb = call_smc_arch_workaround_1;
+ smccc_start = __smccc_workaround_1_smc_start;
+ smccc_end = __smccc_workaround_1_smc_end;
+ break;
+
+ default:
+ return false;
+ }
+
+ install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);
+
+ return true;
+}
+
static int enable_psci_bp_hardening(void *data)
{
const struct arm64_cpu_capabilities *entry = data;

- if (psci_ops.get_version)
+ if (psci_ops.get_version) {
+ if (check_smccc_arch_workaround_1(entry))
+ return 0;
+
install_bp_hardening_cb(entry,
(bp_hardening_cb_t)psci_ops.get_version,
__psci_hyp_bp_inval_start,
__psci_hyp_bp_inval_end);
+ }

return 0;
}
--
2.14.2
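
The detection order used by the patch above (SMCCC version first, then
the conduit, then the ARCH_FEATURES reply) can be sketched as a pure
function. Everything here is hypothetical scaffolding: the firmware
reply is passed in as a parameter instead of being obtained with a real
SMC or HVC, and the callbacks are empty placeholders.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_conduit {
	SKETCH_CONDUIT_NONE,
	SKETCH_CONDUIT_SMC,
	SKETCH_CONDUIT_HVC,
};

enum sketch_smccc_version {
	SKETCH_SMCCC_1_0,
	SKETCH_SMCCC_1_1,
};

typedef void (*sketch_cb_t)(void);

/* Placeholders for the per-conduit workaround callbacks. */
static void sketch_call_smc_workaround(void) { }
static void sketch_call_hvc_workaround(void) { }

/*
 * features_ret models the first result register of
 * ARCH_FEATURES(ARCH_WORKAROUND_1); as in the patch, any non-zero
 * value is treated as "not available".
 * Returns the callback to install, or NULL if the workaround cannot
 * be used.
 */
static sketch_cb_t sketch_pick_workaround(enum sketch_smccc_version ver,
					  enum sketch_conduit conduit,
					  long features_ret)
{
	if (ver == SKETCH_SMCCC_1_0)
		return NULL;	/* ARCH_FEATURES needs SMCCC v1.1 */

	if (features_ret)
		return NULL;	/* firmware does not implement it */

	switch (conduit) {
	case SKETCH_CONDUIT_HVC:
		return sketch_call_hvc_workaround;
	case SKETCH_CONDUIT_SMC:
		return sketch_call_smc_workaround;
	default:
		return NULL;	/* no way to reach the firmware */
	}
}
```

In the patch itself a NULL result here corresponds to falling back to
the older PSCI_GET_VERSION based callback.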


2018-02-01 11:50:41

by Marc Zyngier

Subject: [PATCH v3 14/18] firmware/psci: Expose SMCCC version through psci_ops

Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
let's do that at boot time, and expose the version of the calling
convention as part of the psci_ops structure.

Acked-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/firmware/psci.c | 19 +++++++++++++++++++
include/linux/psci.h | 6 ++++++
2 files changed, 25 insertions(+)

diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index e9493da2b111..8631906c414c 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -61,6 +61,7 @@ bool psci_tos_resident_on(int cpu)

struct psci_operations psci_ops = {
.conduit = PSCI_CONDUIT_NONE,
+ .smccc_version = SMCCC_VERSION_1_0,
};

typedef unsigned long (psci_fn)(unsigned long, unsigned long,
@@ -511,6 +512,23 @@ static void __init psci_init_migrate(void)
pr_info("Trusted OS resident on physical CPU 0x%lx\n", cpuid);
}

+static void __init psci_init_smccc(u32 ver)
+{
+ int feature;
+
+ feature = psci_features(ARM_SMCCC_VERSION_FUNC_ID);
+
+ if (feature != PSCI_RET_NOT_SUPPORTED) {
+ ver = invoke_psci_fn(ARM_SMCCC_VERSION_FUNC_ID, 0, 0, 0);
+ if (ver != ARM_SMCCC_VERSION_1_1)
+ psci_ops.smccc_version = SMCCC_VERSION_1_0;
+ else
+ psci_ops.smccc_version = SMCCC_VERSION_1_1;
+ }
+
+ pr_info("SMC Calling Convention v1.%d\n", psci_ops.smccc_version);
+}
+
static void __init psci_0_2_set_functions(void)
{
pr_info("Using standard PSCI v0.2 function IDs\n");
@@ -559,6 +577,7 @@ static int __init psci_probe(void)
psci_init_migrate();

if (PSCI_VERSION_MAJOR(ver) >= 1) {
+ psci_init_smccc(ver);
psci_init_cpu_suspend();
psci_init_system_suspend();
}
diff --git a/include/linux/psci.h b/include/linux/psci.h
index f2679e5faa4f..8b1b3b5935ab 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -31,6 +31,11 @@ enum psci_conduit {
PSCI_CONDUIT_HVC,
};

+enum smccc_version {
+ SMCCC_VERSION_1_0,
+ SMCCC_VERSION_1_1,
+};
+
struct psci_operations {
u32 (*get_version)(void);
int (*cpu_suspend)(u32 state, unsigned long entry_point);
@@ -41,6 +46,7 @@ struct psci_operations {
unsigned long lowest_affinity_level);
int (*migrate_info_type)(void);
enum psci_conduit conduit;
+ enum smccc_version smccc_version;
};

extern struct psci_operations psci_ops;
--
2.14.2
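
The indirect probing described in the patch above goes in two steps: ask
PSCI_FEATURES whether SMCCC_VERSION exists, and only then call it. A
minimal sketch follows, with a hypothetical struct sketch_firmware
standing in for the real firmware replies; 0x10001 is assumed to be the
encoding of v1.1 (major 1, minor 1).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PSCI_RET_NOT_SUPPORTED	(-1)
#define SKETCH_SMCCC_VERSION_1_1	0x10001u

enum sketch_smccc_version {
	SKETCH_SMCCC_1_0,
	SKETCH_SMCCC_1_1,
};

/* Hypothetical model of what the firmware would answer. */
struct sketch_firmware {
	int32_t features_ret;	/* PSCI_FEATURES(SMCCC_VERSION) reply */
	uint32_t smccc_version;	/* SMCCC_VERSION reply, if implemented */
};

/* Anything other than a confirmed v1.1 is treated as v1.0. */
static enum sketch_smccc_version
sketch_probe_smccc(const struct sketch_firmware *fw)
{
	if (fw->features_ret == SKETCH_PSCI_RET_NOT_SUPPORTED)
		return SKETCH_SMCCC_1_0;

	if (fw->smccc_version == SKETCH_SMCCC_VERSION_1_1)
		return SKETCH_SMCCC_1_1;

	return SKETCH_SMCCC_1_0;
}
```

Because the enum values are 0 and 1, the patch can print the result
directly as the minor number in its "v1.%d" message.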


2018-02-01 11:51:15

by Marc Zyngier

Subject: [PATCH v3 11/18] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support

A new feature of SMCCC 1.1 is that it offers firmware-based CPU
workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
BP hardening for CVE-2017-5715.

If the host has some mitigation for this issue, report that
we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
host workaround on every guest exit.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/kvm_host.h | 7 +++++++
arch/arm64/include/asm/kvm_host.h | 6 ++++++
include/linux/arm-smccc.h | 5 +++++
virt/kvm/arm/psci.c | 9 ++++++++-
4 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index e9d57060d88c..6c05e3b13081 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -309,4 +309,11 @@ static inline void kvm_fpsimd_flush_cpu_state(void) {}

static inline void kvm_arm_vhe_guest_enter(void) {}
static inline void kvm_arm_vhe_guest_exit(void) {}
+
+static inline bool kvm_arm_harden_branch_predictor(void)
+{
+ /* No way to detect it yet, pretend it is not there. */
+ return false;
+}
+
#endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 10af386642c6..448d3b9a58cb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -418,4 +418,10 @@ static inline void kvm_arm_vhe_guest_exit(void)
{
local_daif_restore(DAIF_PROCCTX_NOIRQ);
}
+
+static inline bool kvm_arm_harden_branch_predictor(void)
+{
+ return cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR);
+}
+
#endif /* __ARM64_KVM_HOST_H__ */
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index dc68aa5a7261..e1ef944ef1da 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -73,6 +73,11 @@
ARM_SMCCC_SMC_32, \
0, 1)

+#define ARM_SMCCC_ARCH_WORKAROUND_1 \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_32, \
+ 0, 0x8000)
+
#ifndef __ASSEMBLY__

#include <linux/linkage.h>
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 2efacbe7b1a2..22c24561d07d 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -406,13 +406,20 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
{
u32 func_id = smccc_get_function(vcpu);
u32 val = PSCI_RET_NOT_SUPPORTED;
+ u32 feature;

switch (func_id) {
case ARM_SMCCC_VERSION_FUNC_ID:
val = ARM_SMCCC_VERSION_1_1;
break;
case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
- /* Nothing supported yet */
+ feature = smccc_get_arg1(vcpu);
+ switch(feature) {
+ case ARM_SMCCC_ARCH_WORKAROUND_1:
+ if (kvm_arm_harden_branch_predictor())
+ val = 0;
+ break;
+ }
break;
default:
return kvm_psci_call(vcpu);
--
2.14.2
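
To make the function ID introduced by the patch above concrete, the
following sketch rebuilds it from its fields and models the
ARCH_FEATURES reply. SKETCH_CALL_VAL and sketch_arch_features() are
hypothetical; the field layout assumed is call type in bit 31, calling
convention in bit 30, owner in bits [29:24] and function number in
bits [15:0].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FAST_CALL	1u
#define SKETCH_SMC_32		0u
#define SKETCH_OWNER_ARCH	0u

/* Assemble a 32-bit, unsigned SMCCC function identifier. */
#define SKETCH_CALL_VAL(type, cc, owner, func)		\
	(((uint32_t)(type) << 31) |			\
	 (((uint32_t)(cc) & 0x1u) << 30) |		\
	 (((uint32_t)(owner) & 0x3fu) << 24) |		\
	 ((uint32_t)(func) & 0xffffu))

#define SKETCH_ARCH_FEATURES \
	SKETCH_CALL_VAL(SKETCH_FAST_CALL, SKETCH_SMC_32, SKETCH_OWNER_ARCH, 1)
#define SKETCH_ARCH_WORKAROUND_1 \
	SKETCH_CALL_VAL(SKETCH_FAST_CALL, SKETCH_SMC_32, SKETCH_OWNER_ARCH, 0x8000)

#define SKETCH_RET_NOT_SUPPORTED	(-1)

/* Advertise the workaround to the guest only if the host is hardened. */
static int32_t sketch_arch_features(uint32_t feature, bool host_hardened)
{
	if (feature == SKETCH_ARCH_WORKAROUND_1 && host_hardened)
		return 0;

	return SKETCH_RET_NOT_SUPPORTED;
}
```

On a host without any mitigation the guest therefore sees the same
"not supported" answer it would get for an unknown feature.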


2018-02-01 11:51:27

by Marc Zyngier

Subject: [PATCH v3 10/18] arm/arm64: KVM: Turn kvm_psci_version into a static inline

We're about to need kvm_psci_version in HYP too. So let's turn it
into a static inline, and pass the kvm structure as a second
parameter (so that HYP can do a kern_hyp_va on it).

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/kvm/hyp/switch.c | 20 ++++++++++++--------
include/kvm/arm_psci.h | 26 +++++++++++++++++++++++++-
virt/kvm/arm/psci.c | 25 +++----------------------
3 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 036e1f3d77a6..408c04d789a5 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -19,6 +19,8 @@
#include <linux/jump_label.h>
#include <uapi/linux/psci.h>

+#include <kvm/arm_psci.h>
+
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
@@ -350,14 +352,16 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)

if (exit_code == ARM_EXCEPTION_TRAP &&
(kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
- kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
- vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
- u64 val = PSCI_RET_NOT_SUPPORTED;
- if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
- val = 2;
-
- vcpu_set_reg(vcpu, 0, val);
- goto again;
+ kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32)) {
+ u32 val = vcpu_get_reg(vcpu, 0);
+
+ if (val == PSCI_0_2_FN_PSCI_VERSION) {
+ val = kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
+ if (unlikely(val == KVM_ARM_PSCI_0_1))
+ val = PSCI_RET_NOT_SUPPORTED;
+ vcpu_set_reg(vcpu, 0, val);
+ goto again;
+ }
}

if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 7b2e12697d4f..9b699f91171f 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -18,6 +18,7 @@
#ifndef __KVM_ARM_PSCI_H__
#define __KVM_ARM_PSCI_H__

+#include <linux/kvm_host.h>
#include <uapi/linux/psci.h>

#define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
@@ -26,7 +27,30 @@

#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0

-int kvm_psci_version(struct kvm_vcpu *vcpu);
+/*
+ * We need the KVM pointer independently from the vcpu as we can call
+ * this from HYP, and need to apply kern_hyp_va on it...
+ */
+static inline int kvm_psci_version(struct kvm_vcpu *vcpu, struct kvm *kvm)
+{
+ /*
+ * Our PSCI implementation stays the same across versions from
+ * v0.2 onward, only adding the few mandatory functions (such
+ * as FEATURES with 1.0) that are required by newer
+ * revisions. It is thus safe to return the latest, unless
+ * userspace has instructed us otherwise.
+ */
+ if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
+ if (kvm->arch.psci_version)
+ return kvm->arch.psci_version;
+
+ return KVM_ARM_PSCI_LATEST;
+ }
+
+ return KVM_ARM_PSCI_0_1;
+}
+
+
int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);

struct kvm_one_reg;
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 53272e8e0d37..2efacbe7b1a2 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -124,7 +124,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
if (!vcpu)
return PSCI_RET_INVALID_PARAMS;
if (!vcpu->arch.power_off) {
- if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
+ if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
return PSCI_RET_ALREADY_ON;
else
return PSCI_RET_INVALID_PARAMS;
@@ -233,25 +233,6 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
}

-int kvm_psci_version(struct kvm_vcpu *vcpu)
-{
- /*
- * Our PSCI implementation stays the same across versions from
- * v0.2 onward, only adding the few mandatory functions (such
- * as FEATURES with 1.0) that are required by newer
- * revisions. It is thus safe to return the latest, unless
- * userspace has instructed us otherwise.
- */
- if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
- if (vcpu->kvm->arch.psci_version)
- return vcpu->kvm->arch.psci_version;
-
- return KVM_ARM_PSCI_LATEST;
- }
-
- return KVM_ARM_PSCI_0_1;
-}
-
static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -409,7 +390,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
*/
static int kvm_psci_call(struct kvm_vcpu *vcpu)
{
- switch (kvm_psci_version(vcpu)) {
+ switch (kvm_psci_version(vcpu, vcpu->kvm)) {
case KVM_ARM_PSCI_1_0:
return kvm_psci_1_0_call(vcpu);
case KVM_ARM_PSCI_0_2:
@@ -460,7 +441,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
void __user *uaddr = (void __user *)(long)reg->addr;
u64 val;

- val = kvm_psci_version(vcpu);
+ val = kvm_psci_version(vcpu, vcpu->kvm);
if (val == KVM_ARM_PSCI_0_1)
return -EINVAL;
if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
--
2.14.2


2018-02-01 11:51:37

by Marc Zyngier

Subject: [PATCH v3 18/18] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround

Now that we've standardised on SMCCC v1.1 to perform the branch
prediction invalidation, let's drop the previous band-aid.
If vendors haven't updated their firmware to do SMCCC 1.1, they
haven't updated PSCI either, so we don't lose anything.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/kernel/bpi.S | 24 -----------------------
arch/arm64/kernel/cpu_errata.c | 43 ++++++++++++------------------------------
arch/arm64/kvm/hyp/switch.c | 14 --------------
3 files changed, 12 insertions(+), 69 deletions(-)

diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
index fdeed629f2c6..e5de33513b5d 100644
--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -54,30 +54,6 @@ ENTRY(__bp_harden_hyp_vecs_start)
vectors __kvm_hyp_vector
.endr
ENTRY(__bp_harden_hyp_vecs_end)
-ENTRY(__psci_hyp_bp_inval_start)
- sub sp, sp, #(8 * 18)
- stp x16, x17, [sp, #(16 * 0)]
- stp x14, x15, [sp, #(16 * 1)]
- stp x12, x13, [sp, #(16 * 2)]
- stp x10, x11, [sp, #(16 * 3)]
- stp x8, x9, [sp, #(16 * 4)]
- stp x6, x7, [sp, #(16 * 5)]
- stp x4, x5, [sp, #(16 * 6)]
- stp x2, x3, [sp, #(16 * 7)]
- stp x0, x1, [sp, #(16 * 8)]
- mov x0, #0x84000000
- smc #0
- ldp x16, x17, [sp, #(16 * 0)]
- ldp x14, x15, [sp, #(16 * 1)]
- ldp x12, x13, [sp, #(16 * 2)]
- ldp x10, x11, [sp, #(16 * 3)]
- ldp x8, x9, [sp, #(16 * 4)]
- ldp x6, x7, [sp, #(16 * 5)]
- ldp x4, x5, [sp, #(16 * 6)]
- ldp x2, x3, [sp, #(16 * 7)]
- ldp x0, x1, [sp, #(16 * 8)]
- add sp, sp, #(8 * 18)
-ENTRY(__psci_hyp_bp_inval_end)

ENTRY(__qcom_hyp_sanitize_link_stack_start)
stp x29, x30, [sp, #-16]!
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 9e77809a3b23..b8279a11f57b 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -67,7 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused)
DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);

#ifdef CONFIG_KVM
-extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
extern char __qcom_hyp_sanitize_link_stack_start[];
extern char __qcom_hyp_sanitize_link_stack_end[];
extern char __smccc_workaround_1_smc_start[];
@@ -116,8 +115,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
spin_unlock(&bp_lock);
}
#else
-#define __psci_hyp_bp_inval_start NULL
-#define __psci_hyp_bp_inval_end NULL
#define __qcom_hyp_sanitize_link_stack_start NULL
#define __qcom_hyp_sanitize_link_stack_end NULL
#define __smccc_workaround_1_smc_start NULL
@@ -164,14 +161,15 @@ static void call_hvc_arch_workaround_1(void)
arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}

-static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
+static int smccc_arch_workaround_1(void *data)
{
+ const struct arm64_cpu_capabilities *entry = data;
bp_hardening_cb_t cb;
void *smccc_start, *smccc_end;
struct arm_smccc_res res;

if (!entry->matches(entry, SCOPE_LOCAL_CPU))
- return false;
+ return 0;

if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
return false;
@@ -181,7 +179,7 @@ static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *e
arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
if (res.a0)
- return false;
+ return 0;
cb = call_hvc_arch_workaround_1;
smccc_start = __smccc_workaround_1_hvc_start;
smccc_end = __smccc_workaround_1_hvc_end;
@@ -191,35 +189,18 @@ static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *e
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
if (res.a0)
- return false;
+ return 0;
cb = call_smc_arch_workaround_1;
smccc_start = __smccc_workaround_1_smc_start;
smccc_end = __smccc_workaround_1_smc_end;
break;

default:
- return false;
+ return 0;
}

install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);

- return true;
-}
-
-static int enable_psci_bp_hardening(void *data)
-{
- const struct arm64_cpu_capabilities *entry = data;
-
- if (psci_ops.get_version) {
- if (check_smccc_arch_workaround_1(entry))
- return 0;
-
- install_bp_hardening_cb(entry,
- (bp_hardening_cb_t)psci_ops.get_version,
- __psci_hyp_bp_inval_start,
- __psci_hyp_bp_inval_end);
- }
-
return 0;
}

@@ -399,22 +380,22 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
@@ -428,12 +409,12 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
- .enable = enable_psci_bp_hardening,
+ .enable = smccc_arch_workaround_1,
},
#endif
{
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 408c04d789a5..cac6a0500162 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -350,20 +350,6 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
goto again;

- if (exit_code == ARM_EXCEPTION_TRAP &&
- (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
- kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32)) {
- u32 val = vcpu_get_reg(vcpu, 0);
-
- if (val == PSCI_0_2_FN_PSCI_VERSION) {
- val = kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
- if (unlikely(val == KVM_ARM_PSCI_0_1))
- val = PSCI_RET_NOT_SUPPORTED;
- vcpu_set_reg(vcpu, 0, val);
- goto again;
- }
- }
-
if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
exit_code == ARM_EXCEPTION_TRAP) {
bool valid;
--
2.14.2


2018-02-01 11:51:38

by Marc Zyngier

Subject: [PATCH v3 12/18] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling

We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible.
So let's intercept it as early as we can by testing for the
function call number as soon as we've identified a HVC call
coming from the guest.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/kvm/hyp/hyp-entry.S | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index e4f37b9dd47c..f36464bd57c5 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -15,6 +15,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

+#include <linux/arm-smccc.h>
#include <linux/linkage.h>

#include <asm/alternative.h>
@@ -64,10 +65,11 @@ alternative_endif
lsr x0, x1, #ESR_ELx_EC_SHIFT

cmp x0, #ESR_ELx_EC_HVC64
+ ccmp x0, #ESR_ELx_EC_HVC32, #4, ne
b.ne el1_trap

- mrs x1, vttbr_el2 // If vttbr is valid, the 64bit guest
- cbnz x1, el1_trap // called HVC
+ mrs x1, vttbr_el2 // If vttbr is valid, the guest
+ cbnz x1, el1_hvc_guest // called HVC

/* Here, we're pretty sure the host called HVC. */
ldp x0, x1, [sp], #16
@@ -100,6 +102,20 @@ alternative_endif

eret

+el1_hvc_guest:
+ /*
+ * Fastest possible path for ARM_SMCCC_ARCH_WORKAROUND_1.
+ * The workaround has already been applied on the host,
+ * so let's quickly get back to the guest. We don't bother
+ * restoring x1, as it can be clobbered anyway.
+ */
+ ldr x1, [sp] // Guest's x0
+ eor w1, w1, #ARM_SMCCC_ARCH_WORKAROUND_1
+ cbnz w1, el1_trap
+ mov x0, x1
+ add sp, sp, #16
+ eret
+
el1_trap:
/*
* x0: ESR_EC
--
2.14.2
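
For readers who don't speak cmp/ccmp fluently, the decision taken by the el1_hvc_guest fast path above can be modelled in plain C. This is only a sketch: hvc_fast_path() and its parameters are hypothetical names, the real thing is the assembly in hyp-entry.S, and the constants are the architectural ESR exception classes and the SMCCC function ID for ARCH_WORKAROUND_1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural/SMCCC values, repeated here to keep the sketch standalone */
#define ESR_ELx_EC_HVC32		0x12
#define ESR_ELx_EC_HVC64		0x16
#define ARM_SMCCC_ARCH_WORKAROUND_1	0x80008000U

/*
 * Hypothetical model of the fast path. Returns true when the HVC is
 * completed straight from the vector code, in which case *ret_x0 is
 * the value the guest gets back in x0. Returns false when the
 * exception takes one of the other paths (host HVC or el1_trap).
 */
bool hvc_fast_path(uint32_t esr_ec, uint64_t vttbr, uint64_t guest_x0,
		   uint64_t *ret_x0)
{
	uint32_t diff;

	/* cmp + ccmp: only an HVC from AArch64 or AArch32 qualifies */
	if (esr_ec != ESR_ELx_EC_HVC64 && esr_ec != ESR_ELx_EC_HVC32)
		return false;

	/* vttbr_el2 == 0 means the host called HVC, not a guest */
	if (vttbr == 0)
		return false;

	/* eor w1, w1, #id: only the low 32 bits of x0 are compared */
	diff = (uint32_t)guest_x0 ^ ARM_SMCCC_ARCH_WORKAROUND_1;
	if (diff != 0)
		return false;

	/* mov x0, x1: x1 is zero at this point, so the guest sees 0 */
	*ret_x0 = diff;
	return true;
}
```

The EOR result doubles as the return value, which is why the assembly gets away without loading a separate zero into x0.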


2018-02-01 11:51:59

by Marc Zyngier

Subject: [PATCH v3 09/18] arm/arm64: KVM: Advertise SMCCC v1.1

The new SMC Calling Convention (v1.1) allows for a reduced overhead
when calling into the firmware, and provides a new feature discovery
mechanism.

Make it visible to KVM guests.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 2 +-
include/kvm/arm_psci.h | 2 +-
include/linux/arm-smccc.h | 13 +++++++++++++
virt/kvm/arm/psci.c | 24 +++++++++++++++++++++++-
5 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index 230ae4079108..910bd8dabb3c 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -36,7 +36,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_vcpu_hvc_get_imm(vcpu));
vcpu->stat.hvc_exit_stat++;

- ret = kvm_psci_call(vcpu);
+ ret = kvm_hvc_call_handler(vcpu);
if (ret < 0) {
vcpu_set_reg(vcpu, 0, ~0UL);
return 1;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 588f910632a7..e5e741bfffe1 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -52,7 +52,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_vcpu_hvc_get_imm(vcpu));
vcpu->stat.hvc_exit_stat++;

- ret = kvm_psci_call(vcpu);
+ ret = kvm_hvc_call_handler(vcpu);
if (ret < 0) {
vcpu_set_reg(vcpu, 0, ~0UL);
return 1;
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 4ee098c39e01..7b2e12697d4f 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -27,7 +27,7 @@
#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0

int kvm_psci_version(struct kvm_vcpu *vcpu);
-int kvm_psci_call(struct kvm_vcpu *vcpu);
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);

struct kvm_one_reg;

diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 4c5bca38c653..dc68aa5a7261 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -60,6 +60,19 @@
#define ARM_SMCCC_QUIRK_NONE 0
#define ARM_SMCCC_QUIRK_QCOM_A6 1 /* Save/restore register a6 */

+#define ARM_SMCCC_VERSION_1_0 0x10000
+#define ARM_SMCCC_VERSION_1_1 0x10001
+
+#define ARM_SMCCC_VERSION_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_32, \
+ 0, 0)
+
+#define ARM_SMCCC_ARCH_FEATURES_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_32, \
+ 0, 1)
+
#ifndef __ASSEMBLY__

#include <linux/linkage.h>
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 5c8366b71639..53272e8e0d37 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -15,6 +15,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

+#include <linux/arm-smccc.h>
#include <linux/preempt.h>
#include <linux/kvm_host.h>
#include <linux/uaccess.h>
@@ -351,6 +352,7 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
case PSCI_0_2_FN_SYSTEM_OFF:
case PSCI_0_2_FN_SYSTEM_RESET:
case PSCI_1_0_FN_PSCI_FEATURES:
+ case ARM_SMCCC_VERSION_FUNC_ID:
val = 0;
break;
default:
@@ -405,7 +407,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
* Errors:
* -EINVAL: Unrecognized PSCI function
*/
-int kvm_psci_call(struct kvm_vcpu *vcpu)
+static int kvm_psci_call(struct kvm_vcpu *vcpu)
{
switch (kvm_psci_version(vcpu)) {
case KVM_ARM_PSCI_1_0:
@@ -419,6 +421,26 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
};
}

+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+{
+ u32 func_id = smccc_get_function(vcpu);
+ u32 val = PSCI_RET_NOT_SUPPORTED;
+
+ switch (func_id) {
+ case ARM_SMCCC_VERSION_FUNC_ID:
+ val = ARM_SMCCC_VERSION_1_1;
+ break;
+ case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
+ /* Nothing supported yet */
+ break;
+ default:
+ return kvm_psci_call(vcpu);
+ }
+
+ smccc_set_retval(vcpu, val, 0, 0, 0);
+ return 1;
+}
+
int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
{
return 1; /* PSCI version */
--
2.14.2
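
To make the two new function IDs above a bit more concrete, here is a standalone sketch of how they are encoded and of what the guest gets back. smccc_call_val() is a hypothetical function standing in for the ARM_SMCCC_CALL_VAL() macro, and smccc_arch_call() only models the two cases added by this patch; the real handler falls back to kvm_psci_call() for everything else, which the model leaves out.

```c
#include <stdint.h>

#define ARM_SMCCC_VERSION_1_1	0x10001U
#define RET_NOT_SUPPORTED	((uint32_t)-1)	/* same bits as PSCI_RET_NOT_SUPPORTED */

/* Function ID layout: bit 31 fast call, bit 30 SMC64, bits 29:24 owner, bits 15:0 number */
uint32_t smccc_call_val(uint32_t fast, uint32_t smc64, uint32_t owner,
			uint32_t func)
{
	return (fast << 31) | ((smc64 & 1) << 30) |
	       ((owner & 0x3f) << 24) | (func & 0xffff);
}

/* Hypothetical model of the SMCCC part of kvm_hvc_call_handler() */
uint32_t smccc_arch_call(uint32_t func_id)
{
	/* ARM_SMCCC_VERSION_FUNC_ID: fast call, SMC32, owner 0, function 0 */
	if (func_id == smccc_call_val(1, 0, 0, 0))
		return ARM_SMCCC_VERSION_1_1;

	/*
	 * ARM_SMCCC_ARCH_FEATURES_FUNC_ID advertises nothing at this
	 * point in the series, so it ends up here as well.
	 */
	return RET_NOT_SUPPORTED;
}
```

With this encoding the version call is 0x80000000 and the features call is 0x80000001, both in the owner 0 (architecture) range, while the PSCI v0.2 calls start at 0x84000000 (owner 4).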


2018-02-01 11:52:09

by Marc Zyngier

Subject: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
Since all the new PSCI versions are backward compatible, we decide to
default to the latest version of the PSCI implementation. This is no
different from doing a firmware upgrade on KVM.

But in order to give a chance to hypothetical badly implemented guests
that would have a fit by discovering something other than PSCI 0.2,
let's provide a new API that allows userspace to pick one particular
version of the API.

This is implemented as a new class of "firmware" registers, where
we expose the PSCI version. This allows the PSCI version to be
saved/restored as part of a guest migration, and also set to
any supported version if the guest requires it.

Signed-off-by: Marc Zyngier <[email protected]>
---
Documentation/virtual/kvm/api.txt | 3 +-
Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
arch/arm/include/asm/kvm_host.h | 3 ++
arch/arm/include/uapi/asm/kvm.h | 6 +++
arch/arm/kvm/guest.c | 13 +++++++
arch/arm64/include/asm/kvm_host.h | 3 ++
arch/arm64/include/uapi/asm/kvm.h | 6 +++
arch/arm64/kvm/guest.c | 14 ++++++-
include/kvm/arm_psci.h | 9 +++++
virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
10 files changed, 151 insertions(+), 4 deletions(-)
create mode 100644 Documentation/virtual/kvm/arm/psci.txt

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 57d3ee9e4bde..334905202141 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2493,7 +2493,8 @@ Possible features:
and execute guest code when KVM_RUN is called.
- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
- - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
+ - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
+ backward compatible with v0.2) for the CPU.
Depends on KVM_CAP_ARM_PSCI_0_2.
- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
Depends on KVM_CAP_ARM_PMU_V3.
diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
new file mode 100644
index 000000000000..aafdab887b04
--- /dev/null
+++ b/Documentation/virtual/kvm/arm/psci.txt
@@ -0,0 +1,30 @@
+KVM implements the PSCI (Power State Coordination Interface)
+specification in order to provide services such as CPU on/off, reset
+and power-off to the guest.
+
+The PSCI specification is regularly updated to provide new features,
+and KVM implements these updates if they make sense from a virtualization
+point of view.
+
+This means that a guest booted on two different versions of KVM can
+observe two different "firmware" revisions. This could cause issues if
+a given guest is tied to a particular PSCI revision (unlikely), or if
+a migration causes a different PSCI version to be exposed out of the
+blue to an unsuspecting guest.
+
+In order to remedy this situation, KVM exposes a set of "firmware
+pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
+interface. These registers can be saved/restored by userspace, and set
+to a convenient value if required.
+
+The following register is defined:
+
+* KVM_REG_ARM_PSCI_VERSION:
+
+ - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
+ (and thus has already been initialized)
+ - Returns the current PSCI version on GET_ONE_REG (defaulting to the
+ highest PSCI version implemented by KVM and compatible with v0.2)
+ - Allows any PSCI version implemented by KVM and compatible with
+ v0.2 to be set with SET_ONE_REG
+ - Affects the whole VM (even if the register view is per-vcpu)
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index acbf9ec7b396..e9d57060d88c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -75,6 +75,9 @@ struct kvm_arch {
/* Interrupt controller */
struct vgic_dist vgic;
int max_vcpus;
+
+ /* Mandated version of PSCI */
+ u32 psci_version;
};

#define KVM_NR_MEM_OBJS 40
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 6edd177bb1c7..47dfc99f5cd0 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -186,6 +186,12 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_VFP_FPINST 0x1009
#define KVM_REG_ARM_VFP_FPINST2 0x100A

+/* KVM-as-firmware specific pseudo-registers */
+#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM | KVM_REG_SIZE_U64 | \
+ KVM_REG_ARM_FW | ((r) & 0xffff))
+#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
+
/* Device Control API: ARM VGIC */
#define KVM_DEV_ARM_VGIC_GRP_ADDR 0
#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 1e0784ebbfd6..a18f33edc471 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -22,6 +22,7 @@
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
+#include <kvm/arm_psci.h>
#include <asm/cputype.h>
#include <linux/uaccess.h>
#include <asm/kvm.h>
@@ -176,6 +177,7 @@ static unsigned long num_core_regs(void)
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
{
return num_core_regs() + kvm_arm_num_coproc_regs(vcpu)
+ + kvm_arm_get_fw_num_regs(vcpu)
+ NUM_TIMER_REGS;
}

@@ -196,6 +198,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
uindices++;
}

+ ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
+ if (ret)
+ return ret;
+ uindices += kvm_arm_get_fw_num_regs(vcpu);
+
ret = copy_timer_indices(vcpu, uindices);
if (ret)
return ret;
@@ -214,6 +221,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
return get_core_reg(vcpu, reg);

+ if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
+ return kvm_arm_get_fw_reg(vcpu, reg);
+
if (is_timer_reg(reg->id))
return get_timer_reg(vcpu, reg);

@@ -230,6 +240,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
return set_core_reg(vcpu, reg);

+ if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
+ return kvm_arm_set_fw_reg(vcpu, reg);
+
if (is_timer_reg(reg->id))
return set_timer_reg(vcpu, reg);

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4485ae8e98de..10af386642c6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -73,6 +73,9 @@ struct kvm_arch {

/* Interrupt controller */
struct vgic_dist vgic;
+
+ /* Mandated version of PSCI */
+ u32 psci_version;
};

#define KVM_NR_MEM_OBJS 40
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9abbf3044654..04b3256f8e6d 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -206,6 +206,12 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_TIMER_CNT ARM64_SYS_REG(3, 3, 14, 3, 2)
#define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, 2)

+/* KVM-as-firmware specific pseudo-registers */
+#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
+#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
+ KVM_REG_ARM_FW | ((r) & 0xffff))
+#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
+
/* Device Control API: ARM VGIC */
#define KVM_DEV_ARM_VGIC_GRP_ADDR 0
#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5c7f657dd207..811f04c5760e 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -25,6 +25,7 @@
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
+#include <kvm/arm_psci.h>
#include <asm/cputype.h>
#include <linux/uaccess.h>
#include <asm/kvm.h>
@@ -205,7 +206,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
{
return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
- + NUM_TIMER_REGS;
+ + kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
}

/**
@@ -225,6 +226,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
uindices++;
}

+ ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
+ if (ret)
+ return ret;
+ uindices += kvm_arm_get_fw_num_regs(vcpu);
+
ret = copy_timer_indices(vcpu, uindices);
if (ret)
return ret;
@@ -243,6 +249,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
return get_core_reg(vcpu, reg);

+ if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
+ return kvm_arm_get_fw_reg(vcpu, reg);
+
if (is_timer_reg(reg->id))
return get_timer_reg(vcpu, reg);

@@ -259,6 +268,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
return set_core_reg(vcpu, reg);

+ if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
+ return kvm_arm_set_fw_reg(vcpu, reg);
+
if (is_timer_reg(reg->id))
return set_timer_reg(vcpu, reg);

diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 5446435457c2..4ee098c39e01 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -24,7 +24,16 @@
#define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
#define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)

+#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
+
int kvm_psci_version(struct kvm_vcpu *vcpu);
int kvm_psci_call(struct kvm_vcpu *vcpu);

+struct kvm_one_reg;
+
+int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
+int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
+int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+
#endif /* __KVM_ARM_PSCI_H__ */
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 291874cff85e..5c8366b71639 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -17,6 +17,7 @@

#include <linux/preempt.h>
#include <linux/kvm_host.h>
+#include <linux/uaccess.h>
#include <linux/wait.h>

#include <asm/cputype.h>
@@ -233,8 +234,19 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)

int kvm_psci_version(struct kvm_vcpu *vcpu)
{
- if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
- return KVM_ARM_PSCI_0_2;
+ /*
+ * Our PSCI implementation stays the same across versions from
+ * v0.2 onward, only adding the few mandatory functions (such
+ * as FEATURES with 1.0) that are required by newer
+ * revisions. It is thus safe to return the latest, unless
+ * userspace has instructed us otherwise.
+ */
+ if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
+ if (vcpu->kvm->arch.psci_version)
+ return vcpu->kvm->arch.psci_version;
+
+ return KVM_ARM_PSCI_LATEST;
+ }

return KVM_ARM_PSCI_0_1;
}
@@ -406,3 +418,55 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
return -EINVAL;
};
}
+
+int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
+{
+ return 1; /* PSCI version */
+}
+
+int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
+{
+ if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
+ return -EFAULT;
+
+ return 0;
+}
+
+int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+ if (reg->id == KVM_REG_ARM_PSCI_VERSION) {
+ void __user *uaddr = (void __user *)(long)reg->addr;
+ u64 val;
+
+ val = kvm_psci_version(vcpu);
+ if (val == KVM_ARM_PSCI_0_1)
+ return -EINVAL;
+ if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+ if (reg->id == KVM_REG_ARM_PSCI_VERSION &&
+ test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
+ void __user *uaddr = (void __user *)(long)reg->addr;
+ u64 val;
+
+ if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ switch (val) {
+ case KVM_ARM_PSCI_0_2:
+ case KVM_ARM_PSCI_1_0:
+ vcpu->kvm->arch.psci_version = val;
+ return 0;
+ }
+ }
+
+ return -EINVAL;
+}
--
2.14.2
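
The selection rules implemented by kvm_psci_version() and kvm_arm_set_fw_reg() above can be exercised outside the kernel with a small model. This is a sketch under assumptions: struct vm_model and both model_*() functions are made up for illustration, and the user-copy and register ID checks of the real code are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PSCI_VERSION(maj, min) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) & 0xffff))
#define KVM_ARM_PSCI_0_1	PSCI_VERSION(0, 1)
#define KVM_ARM_PSCI_0_2	PSCI_VERSION(0, 2)
#define KVM_ARM_PSCI_1_0	PSCI_VERSION(1, 0)
#define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_0

/* Hypothetical stand-in for the bits of struct kvm/kvm_vcpu used here */
struct vm_model {
	bool psci_0_2_feature;	/* KVM_ARM_VCPU_PSCI_0_2 requested at init */
	uint32_t psci_version;	/* 0: userspace never picked a version */
};

/* Version reported to the guest and to GET_ONE_REG */
uint32_t model_psci_version(const struct vm_model *vm)
{
	if (!vm->psci_0_2_feature)
		return KVM_ARM_PSCI_0_1;

	return vm->psci_version ? vm->psci_version : KVM_ARM_PSCI_LATEST;
}

/* SET_ONE_REG: only versions compatible with v0.2 are accepted */
int model_set_psci_version(struct vm_model *vm, uint64_t val)
{
	if (!vm->psci_0_2_feature)
		return -EINVAL;

	switch (val) {
	case KVM_ARM_PSCI_0_2:
	case KVM_ARM_PSCI_1_0:
		vm->psci_version = (uint32_t)val;
		return 0;
	}

	return -EINVAL;
}
```

A VMM that wants to keep presenting v0.2 to an already running guest would write KVM_ARM_PSCI_0_2 (that is, 2) to KVM_REG_ARM_PSCI_VERSION on the destination before resuming; left alone, the guest sees 1.0.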


2018-02-01 11:52:50

by Marc Zyngier

Subject: [PATCH v3 02/18] arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls

KVM doesn't follow the SMCCC when it comes to unimplemented calls,
and injects an UNDEF instead of returning an error. Since firmware
calls are now used for security mitigation, they are becoming more
common, and the UNDEF is counterproductive.

Instead, let's follow the SMCCC which states that -1 must be returned
to the caller when getting an unknown function number.

Cc: <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/kvm/handle_exit.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index cf8bf6bf87c4..a4bf0f6f024a 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -38,7 +38,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

ret = kvm_psci_call(vcpu);
if (ret < 0) {
- kvm_inject_undefined(vcpu);
+ vcpu_set_reg(vcpu, 0, ~0UL);
return 1;
}

@@ -47,7 +47,16 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
- kvm_inject_undefined(vcpu);
+ /*
+ * "If an SMC instruction executed at Non-secure EL1 is
+ * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
+ * Trap exception, not a Secure Monitor Call exception [...]"
+ *
+ * We need to advance the PC after the trap, as it would
+ * otherwise return to the same address...
+ */
+ vcpu_set_reg(vcpu, 0, ~0UL);
+ kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
return 1;
}

--
2.14.2


2018-02-01 11:53:14

by Marc Zyngier

Subject: [PATCH v3 06/18] arm/arm64: KVM: Add smccc accessors to PSCI code

Instead of open coding the accesses to the various registers,
let's add explicit SMCCC accessors.

Signed-off-by: Marc Zyngier <[email protected]>
---
virt/kvm/arm/psci.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 999f94d6bb98..c41553d35110 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -32,6 +32,38 @@

#define AFFINITY_MASK(level) ~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)

+static u32 smccc_get_function(struct kvm_vcpu *vcpu)
+{
+ return vcpu_get_reg(vcpu, 0);
+}
+
+static unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
+{
+ return vcpu_get_reg(vcpu, 1);
+}
+
+static unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
+{
+ return vcpu_get_reg(vcpu, 2);
+}
+
+static unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
+{
+ return vcpu_get_reg(vcpu, 3);
+}
+
+static void smccc_set_retval(struct kvm_vcpu *vcpu,
+ unsigned long a0,
+ unsigned long a1,
+ unsigned long a2,
+ unsigned long a3)
+{
+ vcpu_set_reg(vcpu, 0, a0);
+ vcpu_set_reg(vcpu, 1, a1);
+ vcpu_set_reg(vcpu, 2, a2);
+ vcpu_set_reg(vcpu, 3, a3);
+}
+
static unsigned long psci_affinity_mask(unsigned long affinity_level)
{
if (affinity_level <= 3)
@@ -77,7 +109,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
unsigned long context_id;
phys_addr_t target_pc;

- cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+ cpu_id = smccc_get_arg1(source_vcpu) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);

@@ -96,8 +128,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return PSCI_RET_INVALID_PARAMS;
}

- target_pc = vcpu_get_reg(source_vcpu, 2);
- context_id = vcpu_get_reg(source_vcpu, 3);
+ target_pc = smccc_get_arg2(source_vcpu);
+ context_id = smccc_get_arg3(source_vcpu);

kvm_reset_vcpu(vcpu);

@@ -116,7 +148,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
* NOTE: We always update r0 (or x0) because for PSCI v0.1
* the general puspose registers are undefined upon CPU_ON.
*/
- vcpu_set_reg(vcpu, 0, context_id);
+ smccc_set_retval(vcpu, context_id, 0, 0, 0);
vcpu->arch.power_off = false;
smp_mb(); /* Make sure the above is visible */

@@ -136,8 +168,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;

- target_affinity = vcpu_get_reg(vcpu, 1);
- lowest_affinity_level = vcpu_get_reg(vcpu, 2);
+ target_affinity = smccc_get_arg1(vcpu);
+ lowest_affinity_level = smccc_get_arg2(vcpu);

/* Determine target affinity mask */
target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -210,7 +242,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
- unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
+ u32 psci_fn = smccc_get_function(vcpu);
unsigned long val;
int ret = 1;

@@ -277,14 +309,14 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
break;
}

- vcpu_set_reg(vcpu, 0, val);
+ smccc_set_retval(vcpu, val, 0, 0, 0);
return ret;
}

static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
- unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
+ u32 psci_fn = smccc_get_function(vcpu);
unsigned long val;

switch (psci_fn) {
@@ -302,7 +334,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
break;
}

- vcpu_set_reg(vcpu, 0, val);
+ smccc_set_retval(vcpu, val, 0, 0, 0);
return 1;
}

--
2.14.2


2018-02-01 11:53:43

by Marc Zyngier

Subject: [PATCH v3 03/18] arm64: KVM: Increment PC after handling an SMC trap

When handling an SMC trap, the "preferred return address" is set
to that of the SMC, and not the next PC (which is a departure from
the behaviour of an SMC that isn't trapped).

Increment PC in the handler, as the guest is otherwise forever
stuck...

Cc: [email protected]
Fixes: acfb3b883f6d ("arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls")
Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/kvm/handle_exit.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 520b0dad3c62..5493bbefbd0d 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -62,7 +62,16 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
+ /*
+ * "If an SMC instruction executed at Non-secure EL1 is
+ * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
+ * Trap exception, not a Secure Monitor Call exception [...]"
+ *
+ * We need to advance the PC after the trap, as it would
+ * otherwise return to the same address...
+ */
vcpu_set_reg(vcpu, 0, ~0UL);
+ kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
return 1;
}

--
2.14.2
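
The effect of the fix above on guest state can be shown with a toy model. Everything here is an assumption made for illustration: struct vcpu_model and model_handle_smc() are hypothetical, the PC adjustment mirrors what a generic "skip one instruction" helper has to do (2 bytes for a 16-bit Thumb encoding, 4 bytes otherwise), and the IT-state handling needed for AArch32 guests is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, minimal view of the vcpu state touched by handle_smc() */
struct vcpu_model {
	uint64_t pc;		/* preferred return address: the SMC itself */
	uint64_t x0;
	bool aarch32_thumb;	/* guest was executing Thumb code */
};

void model_handle_smc(struct vcpu_model *v, bool il_is_32bit)
{
	/* Unknown function: -1 goes back in x0, as the SMCCC requires */
	v->x0 = ~(uint64_t)0;

	/*
	 * Without this step the guest would return to the SMC, trap
	 * again, and never make progress.
	 */
	if (v->aarch32_thumb && !il_is_32bit)
		v->pc += 2;
	else
		v->pc += 4;
}
```

An AArch64 guest that traps at 0x1000 resumes at 0x1004 with x0 reading as -1, which is what it would have seen from an untrapped SMC to firmware that doesn't know the function.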


2018-02-01 11:53:52

by Marc Zyngier

Subject: [PATCH v3 05/18] arm/arm64: KVM: Add PSCI_VERSION helper

As we're about to trigger a PSCI version explosion, it doesn't
hurt to introduce a PSCI_VERSION helper that is going to be
used everywhere.

Signed-off-by: Marc Zyngier <[email protected]>
---
include/kvm/arm_psci.h | 6 ++++--
include/uapi/linux/psci.h | 3 +++
virt/kvm/arm/psci.c | 4 +---
3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 2042bb909474..5659343580a3 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -18,8 +18,10 @@
#ifndef __KVM_ARM_PSCI_H__
#define __KVM_ARM_PSCI_H__

-#define KVM_ARM_PSCI_0_1 1
-#define KVM_ARM_PSCI_0_2 2
+#include <uapi/linux/psci.h>
+
+#define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
+#define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)

int kvm_psci_version(struct kvm_vcpu *vcpu);
int kvm_psci_call(struct kvm_vcpu *vcpu);
diff --git a/include/uapi/linux/psci.h b/include/uapi/linux/psci.h
index 760e52a9640f..b3bcabe380da 100644
--- a/include/uapi/linux/psci.h
+++ b/include/uapi/linux/psci.h
@@ -88,6 +88,9 @@
(((ver) & PSCI_VERSION_MAJOR_MASK) >> PSCI_VERSION_MAJOR_SHIFT)
#define PSCI_VERSION_MINOR(ver) \
((ver) & PSCI_VERSION_MINOR_MASK)
+#define PSCI_VERSION(maj, min) \
+ ((((maj) << PSCI_VERSION_MAJOR_SHIFT) & PSCI_VERSION_MAJOR_MASK) | \
+ ((min) & PSCI_VERSION_MINOR_MASK))

/* PSCI features decoding (>=1.0) */
#define PSCI_1_0_FEATURES_CPU_SUSPEND_PF_SHIFT 1
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index b322e46fd142..999f94d6bb98 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -25,8 +25,6 @@

#include <kvm/arm_psci.h>

-#include <uapi/linux/psci.h>
-
/*
* This is an implementation of the Power State Coordination Interface
* as described in ARM document number ARM DEN 0022A.
@@ -222,7 +220,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
* Bits[31:16] = Major Version = 0
* Bits[15:0] = Minor Version = 2
*/
- val = 2;
+ val = KVM_ARM_PSCI_0_2;
break;
case PSCI_0_2_FN_CPU_SUSPEND:
case PSCI_0_2_FN64_CPU_SUSPEND:
--
2.14.2
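
As a quick sanity check of the helper introduced above, the packing can be exercised in a standalone file. The `PSCI_VERSION()` macro and the two decoders are copied from the diff; the shift and mask definitions are reproduced on the assumption that they match include/uapi/linux/psci.h (major in bits [31:16], minor in bits [15:0], as the comment in psci.c states).

```c
#include <stdint.h>

/* Assumed to match include/uapi/linux/psci.h. */
#define PSCI_VERSION_MAJOR_SHIFT	16
#define PSCI_VERSION_MINOR_MASK		\
	((1U << PSCI_VERSION_MAJOR_SHIFT) - 1)
#define PSCI_VERSION_MAJOR_MASK		~PSCI_VERSION_MINOR_MASK

/* Decoders, as already present in the header. */
#define PSCI_VERSION_MAJOR(ver)		\
	(((ver) & PSCI_VERSION_MAJOR_MASK) >> PSCI_VERSION_MAJOR_SHIFT)
#define PSCI_VERSION_MINOR(ver)		\
	((ver) & PSCI_VERSION_MINOR_MASK)

/* Encoder added by this patch. */
#define PSCI_VERSION(maj, min)						\
	((((maj) << PSCI_VERSION_MAJOR_SHIFT) & PSCI_VERSION_MAJOR_MASK) | \
	 ((min) & PSCI_VERSION_MINOR_MASK))

/* The KVM constants, now expressed through the encoder. */
#define KVM_ARM_PSCI_0_1	PSCI_VERSION(0, 1)
#define KVM_ARM_PSCI_0_2	PSCI_VERSION(0, 2)
```

Because the major number is 0 for both existing revisions, `KVM_ARM_PSCI_0_1` and `KVM_ARM_PSCI_0_2` keep their previous values of 1 and 2; only a revision with a non-zero major number (for example `PSCI_VERSION(1, 0)`, which is 0x10000) touches the upper half.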


2018-02-01 12:02:36

by Marc Zyngier

Subject: [PATCH v3 04/18] arm/arm64: KVM: Consolidate the PSCI include files

As we're about to update the PSCI support, and because I'm lazy,
let's move the PSCI include file to include/kvm so that both
ARM architectures can find it.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/kvm_psci.h | 27 ----------------------
arch/arm/kvm/handle_exit.c | 2 +-
arch/arm64/kvm/handle_exit.c | 3 ++-
.../asm/kvm_psci.h => include/kvm/arm_psci.h | 6 ++---
virt/kvm/arm/arm.c | 2 +-
virt/kvm/arm/psci.c | 3 ++-
6 files changed, 9 insertions(+), 34 deletions(-)
delete mode 100644 arch/arm/include/asm/kvm_psci.h
rename arch/arm64/include/asm/kvm_psci.h => include/kvm/arm_psci.h (89%)

diff --git a/arch/arm/include/asm/kvm_psci.h b/arch/arm/include/asm/kvm_psci.h
deleted file mode 100644
index 6bda945d31fa..000000000000
--- a/arch/arm/include/asm/kvm_psci.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Copyright (C) 2012 - ARM Ltd
- * Author: Marc Zyngier <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __ARM_KVM_PSCI_H__
-#define __ARM_KVM_PSCI_H__
-
-#define KVM_ARM_PSCI_0_1 1
-#define KVM_ARM_PSCI_0_2 2
-
-int kvm_psci_version(struct kvm_vcpu *vcpu);
-int kvm_psci_call(struct kvm_vcpu *vcpu);
-
-#endif /* __ARM_KVM_PSCI_H__ */
diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
index a4bf0f6f024a..230ae4079108 100644
--- a/arch/arm/kvm/handle_exit.c
+++ b/arch/arm/kvm/handle_exit.c
@@ -21,7 +21,7 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_coproc.h>
#include <asm/kvm_mmu.h>
-#include <asm/kvm_psci.h>
+#include <kvm/arm_psci.h>
#include <trace/events/kvm.h>

#include "trace.h"
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 5493bbefbd0d..588f910632a7 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -22,13 +22,14 @@
#include <linux/kvm.h>
#include <linux/kvm_host.h>

+#include <kvm/arm_psci.h>
+
#include <asm/esr.h>
#include <asm/exception.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_coproc.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_mmu.h>
-#include <asm/kvm_psci.h>
#include <asm/debug-monitors.h>
#include <asm/traps.h>

diff --git a/arch/arm64/include/asm/kvm_psci.h b/include/kvm/arm_psci.h
similarity index 89%
rename from arch/arm64/include/asm/kvm_psci.h
rename to include/kvm/arm_psci.h
index bc39e557c56c..2042bb909474 100644
--- a/arch/arm64/include/asm/kvm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -15,8 +15,8 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

-#ifndef __ARM64_KVM_PSCI_H__
-#define __ARM64_KVM_PSCI_H__
+#ifndef __KVM_ARM_PSCI_H__
+#define __KVM_ARM_PSCI_H__

#define KVM_ARM_PSCI_0_1 1
#define KVM_ARM_PSCI_0_2 2
@@ -24,4 +24,4 @@
int kvm_psci_version(struct kvm_vcpu *vcpu);
int kvm_psci_call(struct kvm_vcpu *vcpu);

-#endif /* __ARM64_KVM_PSCI_H__ */
+#endif /* __KVM_ARM_PSCI_H__ */
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 15bf026eb182..af3e98fc377e 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -31,6 +31,7 @@
#include <linux/irqbypass.h>
#include <trace/events/kvm.h>
#include <kvm/arm_pmu.h>
+#include <kvm/arm_psci.h>

#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -46,7 +47,6 @@
#include <asm/kvm_mmu.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_coproc.h>
-#include <asm/kvm_psci.h>
#include <asm/sections.h>

#ifdef REQUIRES_VIRT
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index f1e363bab5e8..b322e46fd142 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -21,9 +21,10 @@

#include <asm/cputype.h>
#include <asm/kvm_emulate.h>
-#include <asm/kvm_psci.h>
#include <asm/kvm_host.h>

+#include <kvm/arm_psci.h>
+
#include <uapi/linux/psci.h>

/*
--
2.14.2


2018-02-01 12:26:45

by Robin Murphy

Subject: Re: [PATCH v3 13/18] firmware/psci: Expose PSCI conduit

On 01/02/18 11:46, Marc Zyngier wrote:
> In order to call into the firmware to apply workarounds, it is
> useful to find out whether we're using HVC or SMC. Let's expose
> this through the psci_ops.

Reviewed-by: Robin Murphy <[email protected]>

> Acked-by: Lorenzo Pieralisi <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/firmware/psci.c | 28 +++++++++++++++++++++++-----
> include/linux/psci.h | 7 +++++++
> 2 files changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
> index 8b25d31e8401..e9493da2b111 100644
> --- a/drivers/firmware/psci.c
> +++ b/drivers/firmware/psci.c
> @@ -59,7 +59,9 @@ bool psci_tos_resident_on(int cpu)
> return cpu == resident_cpu;
> }
>
> -struct psci_operations psci_ops;
> +struct psci_operations psci_ops = {
> + .conduit = PSCI_CONDUIT_NONE,
> +};
>
> typedef unsigned long (psci_fn)(unsigned long, unsigned long,
> unsigned long, unsigned long);
> @@ -210,6 +212,22 @@ static unsigned long psci_migrate_info_up_cpu(void)
> 0, 0, 0);
> }
>
> +static void set_conduit(enum psci_conduit conduit)
> +{
> + switch (conduit) {
> + case PSCI_CONDUIT_HVC:
> + invoke_psci_fn = __invoke_psci_fn_hvc;
> + break;
> + case PSCI_CONDUIT_SMC:
> + invoke_psci_fn = __invoke_psci_fn_smc;
> + break;
> + default:
> + WARN(1, "Unexpected PSCI conduit %d\n", conduit);
> + }
> +
> + psci_ops.conduit = conduit;
> +}
> +
> static int get_set_conduit_method(struct device_node *np)
> {
> const char *method;
> @@ -222,9 +240,9 @@ static int get_set_conduit_method(struct device_node *np)
> }
>
> if (!strcmp("hvc", method)) {
> - invoke_psci_fn = __invoke_psci_fn_hvc;
> + set_conduit(PSCI_CONDUIT_HVC);
> } else if (!strcmp("smc", method)) {
> - invoke_psci_fn = __invoke_psci_fn_smc;
> + set_conduit(PSCI_CONDUIT_SMC);
> } else {
> pr_warn("invalid \"method\" property: %s\n", method);
> return -EINVAL;
> @@ -654,9 +672,9 @@ int __init psci_acpi_init(void)
> pr_info("probing for conduit method from ACPI.\n");
>
> if (acpi_psci_use_hvc())
> - invoke_psci_fn = __invoke_psci_fn_hvc;
> + set_conduit(PSCI_CONDUIT_HVC);
> else
> - invoke_psci_fn = __invoke_psci_fn_smc;
> + set_conduit(PSCI_CONDUIT_SMC);
>
> return psci_probe();
> }
> diff --git a/include/linux/psci.h b/include/linux/psci.h
> index f724fd8c78e8..f2679e5faa4f 100644
> --- a/include/linux/psci.h
> +++ b/include/linux/psci.h
> @@ -25,6 +25,12 @@ bool psci_tos_resident_on(int cpu);
> int psci_cpu_init_idle(unsigned int cpu);
> int psci_cpu_suspend_enter(unsigned long index);
>
> +enum psci_conduit {
> + PSCI_CONDUIT_NONE,
> + PSCI_CONDUIT_SMC,
> + PSCI_CONDUIT_HVC,
> +};
> +
> struct psci_operations {
> u32 (*get_version)(void);
> int (*cpu_suspend)(u32 state, unsigned long entry_point);
> @@ -34,6 +40,7 @@ struct psci_operations {
> int (*affinity_info)(unsigned long target_affinity,
> unsigned long lowest_affinity_level);
> int (*migrate_info_type)(void);
> + enum psci_conduit conduit;
> };
>
> extern struct psci_operations psci_ops;
>

2018-02-01 12:33:55

by Robin Murphy

Subject: Re: [PATCH v3 14/18] firmware/psci: Expose SMCCC version through psci_ops

On 01/02/18 11:46, Marc Zyngier wrote:
> Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
> let's do that at boot time, and expose the version of the calling
> convention as part of the psci_ops structure.
>
> Acked-by: Lorenzo Pieralisi <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/firmware/psci.c | 19 +++++++++++++++++++
> include/linux/psci.h | 6 ++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
> index e9493da2b111..8631906c414c 100644
> --- a/drivers/firmware/psci.c
> +++ b/drivers/firmware/psci.c
> @@ -61,6 +61,7 @@ bool psci_tos_resident_on(int cpu)
>
> struct psci_operations psci_ops = {
> .conduit = PSCI_CONDUIT_NONE,
> + .smccc_version = SMCCC_VERSION_1_0,
> };
>
> typedef unsigned long (psci_fn)(unsigned long, unsigned long,
> @@ -511,6 +512,23 @@ static void __init psci_init_migrate(void)
> pr_info("Trusted OS resident on physical CPU 0x%lx\n", cpuid);
> }
>
> +static void __init psci_init_smccc(u32 ver)
> +{
> + int feature;
> +
> + feature = psci_features(ARM_SMCCC_VERSION_FUNC_ID);
> +
> + if (feature != PSCI_RET_NOT_SUPPORTED) {
> + ver = invoke_psci_fn(ARM_SMCCC_VERSION_FUNC_ID, 0, 0, 0);
> + if (ver != ARM_SMCCC_VERSION_1_1)
> + psci_ops.smccc_version = SMCCC_VERSION_1_0;

AFAICS, unless you somehow run psci_probe() twice *and* have
schizophrenic firmware, this assignment now does precisely nothing.

With the condition flipped and the redundant else case removed (or an
explanation of why I'm wrong...)

Reviewed-by: Robin Murphy <[email protected]>

> + else
> + psci_ops.smccc_version = SMCCC_VERSION_1_1;
> + }
> +
> + pr_info("SMC Calling Convention v1.%d\n", psci_ops.smccc_version);
> +}
> +
> static void __init psci_0_2_set_functions(void)
> {
> pr_info("Using standard PSCI v0.2 function IDs\n");
> @@ -559,6 +577,7 @@ static int __init psci_probe(void)
> psci_init_migrate();
>
> if (PSCI_VERSION_MAJOR(ver) >= 1) {
> + psci_init_smccc(ver);
> psci_init_cpu_suspend();
> psci_init_system_suspend();
> }
> diff --git a/include/linux/psci.h b/include/linux/psci.h
> index f2679e5faa4f..8b1b3b5935ab 100644
> --- a/include/linux/psci.h
> +++ b/include/linux/psci.h
> @@ -31,6 +31,11 @@ enum psci_conduit {
> PSCI_CONDUIT_HVC,
> };
>
> +enum smccc_version {
> + SMCCC_VERSION_1_0,
> + SMCCC_VERSION_1_1,
> +};
> +
> struct psci_operations {
> u32 (*get_version)(void);
> int (*cpu_suspend)(u32 state, unsigned long entry_point);
> @@ -41,6 +46,7 @@ struct psci_operations {
> unsigned long lowest_affinity_level);
> int (*migrate_info_type)(void);
> enum psci_conduit conduit;
> + enum smccc_version smccc_version;
> };
>
> extern struct psci_operations psci_ops;
>
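
One possible shape for the simplification requested in the review above (flip the condition, drop the branch that re-assigns the default) is sketched here as a standalone model. It is not the kernel function: `struct fake_firmware` and `probe_smccc()` are hypothetical, the firmware calls are replaced by canned values, and the two numeric constants are assumed to match the kernel headers.

```c
#include <stdint.h>

enum smccc_version {
	SMCCC_VERSION_1_0,
	SMCCC_VERSION_1_1,
};

/* Assumed values, as found in the PSCI and SMCCC headers. */
#define PSCI_RET_NOT_SUPPORTED	(-1)
#define ARM_SMCCC_VERSION_1_1	0x10001

/* Hypothetical stand-in for what the firmware would answer. */
struct fake_firmware {
	int features_ret;	/* reply to PSCI_FEATURES(SMCCC_VERSION) */
	uint32_t smccc_ver;	/* reply to SMCCC_VERSION itself */
};

static enum smccc_version probe_smccc(const struct fake_firmware *fw)
{
	/* Same default as the static initialiser of psci_ops. */
	enum smccc_version version = SMCCC_VERSION_1_0;

	/* Only query the version if the firmware advertises the call. */
	if (fw->features_ret != PSCI_RET_NOT_SUPPORTED) {
		if (fw->smccc_ver == ARM_SMCCC_VERSION_1_1)
			version = SMCCC_VERSION_1_1;
	}

	return version;
}
```

The default is written exactly once, so there is no path that stores SMCCC_VERSION_1_0 over an identical value.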

2018-02-01 12:40:59

by Robin Murphy

Subject: Re: [PATCH v3 15/18] arm/arm64: smccc: Make function identifiers an unsigned quantity

On 01/02/18 11:46, Marc Zyngier wrote:
> Function identifiers are a 32bit, unsigned quantity. But we never
> tell the compiler so, resulting in the following:
>
> 4ac: b26187e0 mov x0, #0xffffffff80000001
>
> We thus rely on the firmware narrowing it for us, which is not
> always a reasonable expectation.

I think technically it might be OK, since SMCCC states "A Function
Identifier is passed in register W0.", which implies that a conforming
implementation should also read w0, not x0, but it's certainly far
easier to be completely right than to justify being possibly wrong.

Reviewed-by: Robin Murphy <[email protected]>
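
For illustration, the widening behaviour behind the quoted `mov x0, #0xffffffff80000001` can be reproduced in plain C. This is a minimal sketch with hypothetical helper names (`widen_signed()`, `widen_unsigned()`); it only shows how a 32-bit value with bit 31 set is extended to 64 bits depending on its signedness, not how the kernel macros are built.

```c
#include <stdint.h>

/*
 * A function ID held in a signed 32-bit type: converting it to a
 * 64-bit register value sign-extends, so bit 31 is copied into
 * bits [63:32].
 */
static inline uint64_t widen_signed(int32_t id)
{
	return (uint64_t)(int64_t)id;
}

/*
 * The same bit pattern held in an unsigned 32-bit type: conversion
 * zero-extends and the upper half stays clear.
 */
static inline uint64_t widen_unsigned(uint32_t id)
{
	return (uint64_t)id;
}
```

With the fast-call bit (bit 31) and function number 1, the signed variant yields 0xffffffff80000001 while the unsigned one yields 0x0000000080000001, which is why marking the constants as unsigned removes the dependency on the firmware looking only at w0.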

> Cc: [email protected]
> Reported-by: Ard Biesheuvel <[email protected]>
> Acked-by: Ard Biesheuvel <[email protected]>
> Tested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> include/linux/arm-smccc.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
> index e1ef944ef1da..dd44d8458c04 100644
> --- a/include/linux/arm-smccc.h
> +++ b/include/linux/arm-smccc.h
> @@ -14,14 +14,16 @@
> #ifndef __LINUX_ARM_SMCCC_H
> #define __LINUX_ARM_SMCCC_H
>
> +#include <uapi/linux/const.h>
> +
> /*
> * This file provides common defines for ARM SMC Calling Convention as
> * specified in
> * http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
> */
>
> -#define ARM_SMCCC_STD_CALL 0
> -#define ARM_SMCCC_FAST_CALL 1
> +#define ARM_SMCCC_STD_CALL _AC(0,U)
> +#define ARM_SMCCC_FAST_CALL _AC(1,U)
> #define ARM_SMCCC_TYPE_SHIFT 31
>
> #define ARM_SMCCC_SMC_32 0
>

2018-02-01 12:45:58

by Ard Biesheuvel

Subject: Re: [PATCH v3 15/18] arm/arm64: smccc: Make function identifiers an unsigned quantity

On 1 February 2018 at 12:40, Robin Murphy <[email protected]> wrote:
> On 01/02/18 11:46, Marc Zyngier wrote:
>>
>> Function identifiers are a 32bit, unsigned quantity. But we never
>> tell so to the compiler, resulting in the following:
>>
>> 4ac: b26187e0 mov x0, #0xffffffff80000001
>>
>> We thus rely on the firmware narrowing it for us, which is not
>> always a reasonable expectation.
>
>
> I think technically it might be OK, since SMCCC states "A Function
> Identifier is passed in register W0.", which implies that a conforming
> implementation should also read w0, not x0, but it's certainly far easier to
> be completely right than to justify being possibly wrong.
>
> Reviewed-by: Robin Murphy <[email protected]>
>

In my case, the issue wasn't the function identifier but the
argument, which for SMCCC_ARCH_FEATURES is also defined as uint32_t,
yet ended up being interpreted incorrectly by the SMCCC v1.1
implementation that is now upstream in ARM-TF.

>
>> Cc: [email protected]
>> Reported-by: Ard Biesheuvel <[email protected]>
>> Acked-by: Ard Biesheuvel <[email protected]>
>> Tested-by: Ard Biesheuvel <[email protected]>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> include/linux/arm-smccc.h | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
>> index e1ef944ef1da..dd44d8458c04 100644
>> --- a/include/linux/arm-smccc.h
>> +++ b/include/linux/arm-smccc.h
>> @@ -14,14 +14,16 @@
>> #ifndef __LINUX_ARM_SMCCC_H
>> #define __LINUX_ARM_SMCCC_H
>> +#include <uapi/linux/const.h>
>> +
>> /*
>> * This file provides common defines for ARM SMC Calling Convention as
>> * specified in
>> * http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html
>> */
>> -#define ARM_SMCCC_STD_CALL 0
>> -#define ARM_SMCCC_FAST_CALL 1
>> +#define ARM_SMCCC_STD_CALL _AC(0,U)
>> +#define ARM_SMCCC_FAST_CALL _AC(1,U)
>> #define ARM_SMCCC_TYPE_SHIFT 31
>> #define ARM_SMCCC_SMC_32 0
>>
>

2018-02-01 12:50:45

by Marc Zyngier

Subject: Re: [PATCH v3 14/18] firmware/psci: Expose SMCCC version through psci_ops

On 01/02/18 12:32, Robin Murphy wrote:
> On 01/02/18 11:46, Marc Zyngier wrote:
>> Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
>> let's do that at boot time, and expose the version of the calling
>> convention as part of the psci_ops structure.
>>
>> Acked-by: Lorenzo Pieralisi <[email protected]>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> drivers/firmware/psci.c | 19 +++++++++++++++++++
>> include/linux/psci.h | 6 ++++++
>> 2 files changed, 25 insertions(+)
>>
>> diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
>> index e9493da2b111..8631906c414c 100644
>> --- a/drivers/firmware/psci.c
>> +++ b/drivers/firmware/psci.c
>> @@ -61,6 +61,7 @@ bool psci_tos_resident_on(int cpu)
>>
>> struct psci_operations psci_ops = {
>> .conduit = PSCI_CONDUIT_NONE,
>> + .smccc_version = SMCCC_VERSION_1_0,
>> };
>>
>> typedef unsigned long (psci_fn)(unsigned long, unsigned long,
>> @@ -511,6 +512,23 @@ static void __init psci_init_migrate(void)
>> pr_info("Trusted OS resident on physical CPU 0x%lx\n", cpuid);
>> }
>>
>> +static void __init psci_init_smccc(u32 ver)
>> +{
>> + int feature;
>> +
>> + feature = psci_features(ARM_SMCCC_VERSION_FUNC_ID);
>> +
>> + if (feature != PSCI_RET_NOT_SUPPORTED) {
>> + ver = invoke_psci_fn(ARM_SMCCC_VERSION_FUNC_ID, 0, 0, 0);
>> + if (ver != ARM_SMCCC_VERSION_1_1)
>> + psci_ops.smccc_version = SMCCC_VERSION_1_0;
>
> AFAICS, unless you somehow run psci_probe() twice *and* have
> schizophrenic firmware, this assignment now does precisely nothing.

That's a leftover of a previous tracing hack I had... Embarrassing.

> With the condition flipped and the redundant else case removed (or an
> explanation of why I'm wrong...)
>
> Reviewed-by: Robin Murphy <[email protected]>

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-02-01 13:36:06

by Robin Murphy

Subject: Re: [PATCH v3 16/18] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive

On 01/02/18 11:46, Marc Zyngier wrote:
> One of the major improvements of SMCCC v1.1 is that it only clobbers
> the first 4 registers, both on 32 and 64bit. This means that it
> becomes very easy to provide an inline version of the SMC call
> primitive, and avoid performing a function call to stash the
> registers that would otherwise be clobbered by SMCCC v1.0.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> include/linux/arm-smccc.h | 143 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 143 insertions(+)
>
> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
> index dd44d8458c04..575aabe85905 100644
> --- a/include/linux/arm-smccc.h
> +++ b/include/linux/arm-smccc.h
> @@ -150,5 +150,148 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
>
> #define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)
>
> +/* SMCCC v1.1 implementation madness follows */
> +#ifdef CONFIG_ARM64
> +
> +#define SMCCC_SMC_INST "smc #0"
> +#define SMCCC_HVC_INST "hvc #0"

Nit: Maybe the argument can go in the template and we just define the
instruction mnemonics here?

> +
> +#endif
> +
> +#ifdef CONFIG_ARM

#elif ?

> +#include <asm/opcodes-sec.h>
> +#include <asm/opcodes-virt.h>
> +
> +#define SMCCC_SMC_INST __SMC(0)
> +#define SMCCC_HVC_INST __HVC(0)

Oh, I see, it was to line up with this :(

I do wonder if we could just embed an asm(".arch armv7-a+virt\n") (if
even necessary) for ARM, then take advantage of the common mnemonics for
all 3 instruction sets instead of needing manual encoding tricks? I
don't think we should ever be pulling this file in for non-v7 builds.

I suppose that, strictly speaking, this appears to need binutils 2.21
rather than the official supported minimum of 2.20, but are people going
to be throwing SMCCC configs at antique toolchains in practice?

> +
> +#endif
> +
> +#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
> +
> +#define __count_args(...) \
> + ___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0)
> +
> +#define __constraint_write_0 \
> + "+r" (r0), "=&r" (r1), "=&r" (r2), "=&r" (r3)
> +#define __constraint_write_1 \
> + "+r" (r0), "+r" (r1), "=&r" (r2), "=&r" (r3)
> +#define __constraint_write_2 \
> + "+r" (r0), "+r" (r1), "+r" (r2), "=&r" (r3)
> +#define __constraint_write_3 \
> + "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
> +#define __constraint_write_4 __constraint_write_3
> +#define __constraint_write_5 __constraint_write_4
> +#define __constraint_write_6 __constraint_write_5
> +#define __constraint_write_7 __constraint_write_6
> +
> +#define __constraint_read_0
> +#define __constraint_read_1
> +#define __constraint_read_2
> +#define __constraint_read_3
> +#define __constraint_read_4 "r" (r4)
> +#define __constraint_read_5 __constraint_read_4, "r" (r5)
> +#define __constraint_read_6 __constraint_read_5, "r" (r6)
> +#define __constraint_read_7 __constraint_read_6, "r" (r7)
> +
> +#define __declare_arg_0(a0, res) \
> + struct arm_smccc_res *___res = res; \

Looks like the declaration of ___res could simply be factored out to the
template...

> + register u32 r0 asm("r0") = a0; \
> + register unsigned long r1 asm("r1"); \
> + register unsigned long r2 asm("r2"); \
> + register unsigned long r3 asm("r3")
> +
> +#define __declare_arg_1(a0, a1, res) \
> + struct arm_smccc_res *___res = res; \
> + register u32 r0 asm("r0") = a0; \
> + register typeof(a1) r1 asm("r1") = a1; \
> + register unsigned long r2 asm("r2"); \
> + register unsigned long r3 asm("r3")
> +
> +#define __declare_arg_2(a0, a1, a2, res) \
> + struct arm_smccc_res *___res = res; \
> + register u32 r0 asm("r0") = a0; \
> + register typeof(a1) r1 asm("r1") = a1; \
> + register typeof(a2) r2 asm("r2") = a2; \
> + register unsigned long r3 asm("r3")
> +
> +#define __declare_arg_3(a0, a1, a2, a3, res) \
> + struct arm_smccc_res *___res = res; \
> + register u32 r0 asm("r0") = a0; \
> + register typeof(a1) r1 asm("r1") = a1; \
> + register typeof(a2) r2 asm("r2") = a2; \
> + register typeof(a3) r3 asm("r3") = a3
> +
> +#define __declare_arg_4(a0, a1, a2, a3, a4, res) \
> + __declare_arg_3(a0, a1, a2, a3, res); \
> + register typeof(a4) r4 asm("r4") = a4
> +
> +#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res) \
> + __declare_arg_4(a0, a1, a2, a3, a4, res); \
> + register typeof(a5) r5 asm("r5") = a5
> +
> +#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res) \
> + __declare_arg_5(a0, a1, a2, a3, a4, a5, res); \
> + register typeof(a6) r6 asm("r6") = a6
> +
> +#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \
> + __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res); \
> + register typeof(a7) r7 asm("r7") = a7
> +
> +#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
> +#define __declare_args(count, ...) ___declare_args(count, __VA_ARGS__)
> +
> +#define ___constraints(count) \
> + : __constraint_write_ ## count \
> + : __constraint_read_ ## count \
> + : "memory"
> +#define __constraints(count) ___constraints(count)
> +
> +/*
> + * We have an output list that is not necessarily used, and GCC feels
> + * entitled to optimise the whole sequence away. "volatile" is what
> + * makes it stick.
> + */
> +#define __arm_smccc_1_1(inst, ...) \
> + do { \
> + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
> + asm volatile(inst "\n" \
> + __constraints(__count_args(__VA_ARGS__))); \
> + if (___res) \
> + *___res = (typeof(*___res)){r0, r1, r2, r3}; \

...especially since there's no obvious indication of where it comes from
when you're looking here.

Otherwise, though, this has already turned out pretty sleek;

Reviewed-by: Robin Murphy <[email protected]>

> + } while (0)
> +
> +/*
> + * arm_smccc_1_1_smc() - make an SMCCC v1.1 compliant SMC call
> + *
> + * This is a variadic macro taking one to eight source arguments, and
> + * an optional return structure.
> + *
> + * @a0-a7: arguments passed in registers 0 to 7
> + * @res: result values from registers 0 to 3
> + *
> + * This macro is used to make SMC calls following SMC Calling Convention v1.1.
> + * The content of the supplied param are copied to registers 0 to 7 prior
> + * to the SMC instruction. The return values are updated with the content
> + * from register 0 to 3 on return from the SMC instruction if not NULL.
> + */
> +#define arm_smccc_1_1_smc(...) __arm_smccc_1_1(SMCCC_SMC_INST, __VA_ARGS__)
> +
> +/*
> + * arm_smccc_1_1_hvc() - make an SMCCC v1.1 compliant HVC call
> + *
> + * This is a variadic macro taking one to eight source arguments, and
> + * an optional return structure.
> + *
> + * @a0-a7: arguments passed in registers 0 to 7
> + * @res: result values from registers 0 to 3
> + *
> + * This macro is used to make HVC calls following SMC Calling Convention v1.1.
> + * The content of the supplied param are copied to registers 0 to 7 prior
> + * to the HVC instruction. The return values are updated with the content
> + * from register 0 to 3 on return from the HVC instruction if not NULL.
> + */
> +#define arm_smccc_1_1_hvc(...) __arm_smccc_1_1(SMCCC_HVC_INST, __VA_ARGS__)
> +
> #endif /*__ASSEMBLY__*/
> #endif /*__LINUX_ARM_SMCCC_H*/
>

2018-02-01 13:56:57

by Marc Zyngier

Subject: Re: [PATCH v3 16/18] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive

On 01/02/18 13:34, Robin Murphy wrote:
> On 01/02/18 11:46, Marc Zyngier wrote:
>> One of the major improvements of SMCCC v1.1 is that it only clobbers
>> the first 4 registers, both on 32 and 64bit. This means that it
>> becomes very easy to provide an inline version of the SMC call
>> primitive, and avoid performing a function call to stash the
>> registers that would otherwise be clobbered by SMCCC v1.0.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> include/linux/arm-smccc.h | 143 ++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 143 insertions(+)
>>
>> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
>> index dd44d8458c04..575aabe85905 100644
>> --- a/include/linux/arm-smccc.h
>> +++ b/include/linux/arm-smccc.h
>> @@ -150,5 +150,148 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
>>
>> #define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)
>>
>> +/* SMCCC v1.1 implementation madness follows */
>> +#ifdef CONFIG_ARM64
>> +
>> +#define SMCCC_SMC_INST "smc #0"
>> +#define SMCCC_HVC_INST "hvc #0"
>
> Nit: Maybe the argument can go in the template and we just define the
> instruction mnemonics here?
>
>> +
>> +#endif
>> +
>> +#ifdef CONFIG_ARM
>
> #elif ?

Sure, why not.

>
>> +#include <asm/opcodes-sec.h>
>> +#include <asm/opcodes-virt.h>
>> +
>> +#define SMCCC_SMC_INST __SMC(0)
>> +#define SMCCC_HVC_INST __HVC(0)
>
> Oh, I see, it was to line up with this :(
>
> I do wonder if we could just embed an asm(".arch armv7-a+virt\n") (if
> even necessary) for ARM, then take advantage of the common mnemonics for
> all 3 instruction sets instead of needing manual encoding tricks? I
> don't think we should ever be pulling this file in for non-v7 builds.
>
> I suppose that, strictly speaking, this appears to need binutils 2.21
> rather than the official supported minimum of 2.20, but are people going
> to be throwing SMCCC configs at antique toolchains in practice?

It has been an issue in the past, back when we merged KVM. We settled on
a hybrid solution where code outside of KVM would not rely on a newer
toolchain, hence the macros that Dave introduced. Maybe we've moved on
and we can take that bold step?

>
>> +
>> +#endif
>> +
>> +#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
>> +
>> +#define __count_args(...) \
>> + ___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0)
>> +
>> +#define __constraint_write_0 \
>> + "+r" (r0), "=&r" (r1), "=&r" (r2), "=&r" (r3)
>> +#define __constraint_write_1 \
>> + "+r" (r0), "+r" (r1), "=&r" (r2), "=&r" (r3)
>> +#define __constraint_write_2 \
>> + "+r" (r0), "+r" (r1), "+r" (r2), "=&r" (r3)
>> +#define __constraint_write_3 \
>> + "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
>> +#define __constraint_write_4 __constraint_write_3
>> +#define __constraint_write_5 __constraint_write_4
>> +#define __constraint_write_6 __constraint_write_5
>> +#define __constraint_write_7 __constraint_write_6
>> +
>> +#define __constraint_read_0
>> +#define __constraint_read_1
>> +#define __constraint_read_2
>> +#define __constraint_read_3
>> +#define __constraint_read_4 "r" (r4)
>> +#define __constraint_read_5 __constraint_read_4, "r" (r5)
>> +#define __constraint_read_6 __constraint_read_5, "r" (r6)
>> +#define __constraint_read_7 __constraint_read_6, "r" (r7)
>> +
>> +#define __declare_arg_0(a0, res) \
>> + struct arm_smccc_res *___res = res; \
>
> Looks like the declaration of ___res could simply be factored out to the
> template...

Tried that. But...

>
>> + register u32 r0 asm("r0") = a0; \
>> + register unsigned long r1 asm("r1"); \
>> + register unsigned long r2 asm("r2"); \
>> + register unsigned long r3 asm("r3")
>> +
>> +#define __declare_arg_1(a0, a1, res) \
>> + struct arm_smccc_res *___res = res; \
>> + register u32 r0 asm("r0") = a0; \
>> + register typeof(a1) r1 asm("r1") = a1; \
>> + register unsigned long r2 asm("r2"); \
>> + register unsigned long r3 asm("r3")
>> +
>> +#define __declare_arg_2(a0, a1, a2, res) \
>> + struct arm_smccc_res *___res = res; \
>> + register u32 r0 asm("r0") = a0; \
>> + register typeof(a1) r1 asm("r1") = a1; \
>> + register typeof(a2) r2 asm("r2") = a2; \
>> + register unsigned long r3 asm("r3")
>> +
>> +#define __declare_arg_3(a0, a1, a2, a3, res) \
>> + struct arm_smccc_res *___res = res; \
>> + register u32 r0 asm("r0") = a0; \
>> + register typeof(a1) r1 asm("r1") = a1; \
>> + register typeof(a2) r2 asm("r2") = a2; \
>> + register typeof(a3) r3 asm("r3") = a3
>> +
>> +#define __declare_arg_4(a0, a1, a2, a3, a4, res) \
>> + __declare_arg_3(a0, a1, a2, a3, res); \
>> + register typeof(a4) r4 asm("r4") = a4
>> +
>> +#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res) \
>> + __declare_arg_4(a0, a1, a2, a3, a4, res); \
>> + register typeof(a5) r5 asm("r5") = a5
>> +
>> +#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res) \
>> + __declare_arg_5(a0, a1, a2, a3, a4, a5, res); \
>> + register typeof(a6) r6 asm("r6") = a6
>> +
>> +#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \
>> + __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res); \
>> + register typeof(a7) r7 asm("r7") = a7
>> +
>> +#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
>> +#define __declare_args(count, ...) ___declare_args(count, __VA_ARGS__)
>> +
>> +#define ___constraints(count) \
>> + : __constraint_write_ ## count \
>> + : __constraint_read_ ## count \
>> + : "memory"
>> +#define __constraints(count) ___constraints(count)
>> +
>> +/*
>> + * We have an output list that is not necessarily used, and GCC feels
>> + * entitled to optimise the whole sequence away. "volatile" is what
>> + * makes it stick.
>> + */
>> +#define __arm_smccc_1_1(inst, ...) \
>> + do { \
>> + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
>> + asm volatile(inst "\n" \
>> + __constraints(__count_args(__VA_ARGS__))); \
>> + if (___res) \
>> + *___res = (typeof(*___res)){r0, r1, r2, r3}; \
>
> ...especially since there's no obvious indication of where it comes from
> when you're looking here.

... we don't have the variable name at all here (it is the last
parameter, and that doesn't quite work with the idea of variadic macros...).

The alternative would be to add a set of macros that return the result
parameter, based on the number of inputs. Not sure that's an improvement.

Thoughts?
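
To make the alternative mentioned above concrete, here is one possible reading of "a set of macros that return the result parameter, based on the number of inputs". It is a sketch only: the `COUNT_ARGS` and `PICK_RES` names are hypothetical (renamed from the patch's double-underscore helpers so the file is valid outside the kernel), and the extra trailing 0 in `COUNT_ARGS()` is an addition so that the variadic part is never empty under strict ISO C.

```c
/*
 * Count the arguments that follow a0, not counting the trailing
 * result pointer: (a0, res) gives 0, (a0, ..., a7, res) gives 7.
 * The tenth argument of the padded list is the answer.
 */
#define COUNT_ARGS_(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
#define COUNT_ARGS(...) \
	COUNT_ARGS_(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0, 0)

/* One selector per arity; each simply yields its last parameter. */
#define PICK_RES_0(a0, res) (res)
#define PICK_RES_1(a0, a1, res) (res)
#define PICK_RES_2(a0, a1, a2, res) (res)
#define PICK_RES_3(a0, a1, a2, a3, res) (res)
#define PICK_RES_4(a0, a1, a2, a3, a4, res) (res)
#define PICK_RES_5(a0, a1, a2, a3, a4, a5, res) (res)
#define PICK_RES_6(a0, a1, a2, a3, a4, a5, a6, res) (res)
#define PICK_RES_7(a0, a1, a2, a3, a4, a5, a6, a7, res) (res)

/* Two-level expansion so the count is evaluated before pasting. */
#define PICK_RES__(count, ...) PICK_RES_ ## count(__VA_ARGS__)
#define PICK_RES_(count, ...) PICK_RES__(count, __VA_ARGS__)
#define PICK_RES(...) PICK_RES_(COUNT_ARGS(__VA_ARGS__), __VA_ARGS__)
```

With something like this, the template could declare `___res` itself from `PICK_RES(__VA_ARGS__)` instead of each `__declare_arg_N` doing so, at the cost of eight more one-line macros. Whether that trade is an improvement is exactly the open question.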

>
> Otherwise, though, this has already turned out pretty sleek;
>
> Reviewed-by: Robin Murphy <[email protected]>

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-02-01 14:00:28

by Ard Biesheuvel

Subject: Re: [PATCH v3 00/18] arm64: Add SMCCC v1.1 support and CVE-2017-5715 (Spectre variant 2) mitigation

On 1 February 2018 at 11:46, Marc Zyngier <[email protected]> wrote:
> ARM has recently published a SMC Calling Convention (SMCCC)
> specification update[1] that provides an optimised calling convention
> and optional, discoverable support for mitigating CVE-2017-5715. ARM
> Trusted Firmware (ATF) has already gained such an implementation[2].
>
> This series addresses a few things:
>
> - It provides a KVM implementation of PSCI v1.0, which is a
> prerequisite for being able to discover SMCCC v1.1, together with a
> new userspace API to control the PSCI revision number that the guest
> sees.
>
> - It allows KVM to advertise SMCCC v1.1, which is de-facto supported
> already (it never corrupts any of the guest registers).
>
> - It implements KVM support for the ARCH_WORKAROUND_1 function that is
> used to mitigate CVE-2017-5715 in a guest (if such mitigation is
> available on the host).
>
> - It implements SMCCC v1.1 and ARCH_WORKAROUND_1 discovery support in
> the kernel itself.
>
> - It finally provides firmware callbacks for CVE-2017-5715 for both
> kernel and KVM and drop the initial PSCI_GET_VERSION based
> mitigation.
>
> Patch 1 is already merged, and included here for reference. Patches on
> top of arm64/for-next/core. Tested on Seattle and Juno, the latter
> with ATF implementing SMCCC v1.1.
>
> [1]: https://developer.arm.com/support/security-update/downloads/
>
> [2]: https://github.com/ARM-software/arm-trusted-firmware/pull/1240
>
> * From v2:
> - Fixed SMC handling in KVM
> - PSCI fixes and tidying up
> - SMCCC primitive rework for better code generation (both efficiency
> and correctness)
> - Remove PSCI_GET_VERSION as a mitigation vector
>
> * From v1:
> - Fixed 32bit build
> - Fix function number sign extension (Ard)
> - Inline SMCCC v1.1 primitives (cpp soup)
> - Prevent SMCCC spamming on feature probing
> - Random fixes and tidying up
>
> Marc Zyngier (18):
> arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
> arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
> arm64: KVM: Increment PC after handling an SMC trap
> arm/arm64: KVM: Consolidate the PSCI include files
> arm/arm64: KVM: Add PSCI_VERSION helper
> arm/arm64: KVM: Add smccc accessors to PSCI code
> arm/arm64: KVM: Implement PSCI 1.0 support
> arm/arm64: KVM: Add PSCI version selection API
> arm/arm64: KVM: Advertise SMCCC v1.1
> arm/arm64: KVM: Turn kvm_psci_version into a static inline
> arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
> arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
> firmware/psci: Expose PSCI conduit
> firmware/psci: Expose SMCCC version through psci_ops
> arm/arm64: smccc: Make function identifiers an unsigned quantity
> arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
> arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
> arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
>

I have given this a spin on my Overdrive, and everything seems to work
as expected, both in the host and in the guest (I single-stepped
through the guest to ensure that it gets the expected answer from the
SMCCC feature info call).

Tested-by: Ard Biesheuvel <[email protected]>

2018-02-01 14:19:13

by Robin Murphy

Subject: Re: [PATCH v3 16/18] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive

On 01/02/18 13:54, Marc Zyngier wrote:
> On 01/02/18 13:34, Robin Murphy wrote:
>> On 01/02/18 11:46, Marc Zyngier wrote:
>>> One of the major improvements of SMCCC v1.1 is that it only clobbers
>>> the first 4 registers, both on 32 and 64bit. This means that it
>>> becomes very easy to provide an inline version of the SMC call
>>> primitive, and avoid performing a function call to stash the
>>> registers that would otherwise be clobbered by SMCCC v1.0.
>>>
>>> Signed-off-by: Marc Zyngier <[email protected]>
>>> ---
>>> include/linux/arm-smccc.h | 143 ++++++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 143 insertions(+)
>>>
>>> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
>>> index dd44d8458c04..575aabe85905 100644
>>> --- a/include/linux/arm-smccc.h
>>> +++ b/include/linux/arm-smccc.h
>>> @@ -150,5 +150,148 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
>>>
>>> #define arm_smccc_hvc_quirk(...) __arm_smccc_hvc(__VA_ARGS__)
>>>
>>> +/* SMCCC v1.1 implementation madness follows */
>>> +#ifdef CONFIG_ARM64
>>> +
>>> +#define SMCCC_SMC_INST "smc #0"
>>> +#define SMCCC_HVC_INST "hvc #0"
>>
>> Nit: Maybe the argument can go in the template and we just define the
>> instruction mnemonics here?
>>
>>> +
>>> +#endif
>>> +
>>> +#ifdef CONFIG_ARM
>>
>> #elif ?
>
> Sure, why not.
>
>>
>>> +#include <asm/opcodes-sec.h>
>>> +#include <asm/opcodes-virt.h>
>>> +
>>> +#define SMCCC_SMC_INST __SMC(0)
>>> +#define SMCCC_HVC_INST __HVC(0)
>>
>> Oh, I see, it was to line up with this :(
>>
>> I do wonder if we could just embed an asm(".arch armv7-a+virt\n") (if
>> even necessary) for ARM, then take advantage of the common mnemonics for
>> all 3 instruction sets instead of needing manual encoding tricks? I
>> don't think we should ever be pulling this file in for non-v7 builds.
>>
>> I suppose that strictly that appears to need binutils 2.21 rather than
>> the officially supported minimum of 2.20, but are people going to be
>> throwing SMCCC configs at antique toolchains in practice?
>
> It has been an issue in the past, back when we merged KVM. We settled on
> a hybrid solution where code outside of KVM would not rely on a newer
> toolchain, hence the macros that Dave introduced. Maybe we've moved on
> and we can take that bold step?

Either way I think we can happily throw that on the "future cleanup"
pile right now as it's not directly relevant to the purpose of the
patch; I'm sure we don't want to make potential backporting even more
difficult.

>>
>>> +
>>> +#endif
>>> +
>>> +#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
>>> +
>>> +#define __count_args(...) \
>>> + ___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0)
>>> +
>>> +#define __constraint_write_0 \
>>> + "+r" (r0), "=&r" (r1), "=&r" (r2), "=&r" (r3)
>>> +#define __constraint_write_1 \
>>> + "+r" (r0), "+r" (r1), "=&r" (r2), "=&r" (r3)
>>> +#define __constraint_write_2 \
>>> + "+r" (r0), "+r" (r1), "+r" (r2), "=&r" (r3)
>>> +#define __constraint_write_3 \
>>> + "+r" (r0), "+r" (r1), "+r" (r2), "+r" (r3)
>>> +#define __constraint_write_4 __constraint_write_3
>>> +#define __constraint_write_5 __constraint_write_4
>>> +#define __constraint_write_6 __constraint_write_5
>>> +#define __constraint_write_7 __constraint_write_6
>>> +
>>> +#define __constraint_read_0
>>> +#define __constraint_read_1
>>> +#define __constraint_read_2
>>> +#define __constraint_read_3
>>> +#define __constraint_read_4 "r" (r4)
>>> +#define __constraint_read_5 __constraint_read_4, "r" (r5)
>>> +#define __constraint_read_6 __constraint_read_5, "r" (r6)
>>> +#define __constraint_read_7 __constraint_read_6, "r" (r7)
>>> +
>>> +#define __declare_arg_0(a0, res) \
>>> + struct arm_smccc_res *___res = res; \
>>
>> Looks like the declaration of ___res could simply be factored out to the
>> template...
>
> Tried that. But...
>
>>
>>> + register u32 r0 asm("r0") = a0; \
>>> + register unsigned long r1 asm("r1"); \
>>> + register unsigned long r2 asm("r2"); \
>>> + register unsigned long r3 asm("r3")
>>> +
>>> +#define __declare_arg_1(a0, a1, res) \
>>> + struct arm_smccc_res *___res = res; \
>>> + register u32 r0 asm("r0") = a0; \
>>> + register typeof(a1) r1 asm("r1") = a1; \
>>> + register unsigned long r2 asm("r2"); \
>>> + register unsigned long r3 asm("r3")
>>> +
>>> +#define __declare_arg_2(a0, a1, a2, res) \
>>> + struct arm_smccc_res *___res = res; \
>>> + register u32 r0 asm("r0") = a0; \
>>> + register typeof(a1) r1 asm("r1") = a1; \
>>> + register typeof(a2) r2 asm("r2") = a2; \
>>> + register unsigned long r3 asm("r3")
>>> +
>>> +#define __declare_arg_3(a0, a1, a2, a3, res) \
>>> + struct arm_smccc_res *___res = res; \
>>> + register u32 r0 asm("r0") = a0; \
>>> + register typeof(a1) r1 asm("r1") = a1; \
>>> + register typeof(a2) r2 asm("r2") = a2; \
>>> + register typeof(a3) r3 asm("r3") = a3
>>> +
>>> +#define __declare_arg_4(a0, a1, a2, a3, a4, res) \
>>> + __declare_arg_3(a0, a1, a2, a3, res); \
>>> + register typeof(a4) r4 asm("r4") = a4
>>> +
>>> +#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res) \
>>> + __declare_arg_4(a0, a1, a2, a3, a4, res); \
>>> + register typeof(a5) r5 asm("r5") = a5
>>> +
>>> +#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res) \
>>> + __declare_arg_5(a0, a1, a2, a3, a4, a5, res); \
>>> + register typeof(a6) r6 asm("r6") = a6
>>> +
>>> +#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \
>>> + __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res); \
>>> + register typeof(a7) r7 asm("r7") = a7
>>> +
>>> +#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
>>> +#define __declare_args(count, ...) ___declare_args(count, __VA_ARGS__)
>>> +
>>> +#define ___constraints(count) \
>>> + : __constraint_write_ ## count \
>>> + : __constraint_read_ ## count \
>>> + : "memory"
>>> +#define __constraints(count) ___constraints(count)
>>> +
>>> +/*
>>> + * We have an output list that is not necessarily used, and GCC feels
>>> + * entitled to optimise the whole sequence away. "volatile" is what
>>> + * makes it stick.
>>> + */
>>> +#define __arm_smccc_1_1(inst, ...) \
>>> + do { \
>>> + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
>>> + asm volatile(inst "\n" \
>>> + __constraints(__count_args(__VA_ARGS__))); \
>>> + if (___res) \
>>> + *___res = (typeof(*___res)){r0, r1, r2, r3}; \
>>
>> ...especially since there's no obvious indication of where it comes from
>> when you're looking here.
>
> ... we don't have the variable name at all here (it is the last
> parameter, and that doesn't quite work with the idea of variadic macros...).
>
> The alternative would be to add a set of macros that return the result
> parameter, based on the number of inputs. Not sure that's an improvement.

Ah, right, the significance of it being the *last* argument hadn't
clicked indeed. A whole barrage of extra macros just to extract res on
its own would be rather clunky, so let's just keep the nice streamlined
(if ever-so-slightly non-obvious) implementation as it is and ignore my
ramblings.
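
For anyone reading along, a minimal standalone sketch of the
argument-counting trick may make the "res is the last argument" point
easier to see. The names COUNT_ARGS/COUNT_ARGS_ and the helper functions
are made up for illustration and are not the ones in the patch; the
trailing `~` sentinel is an addition of this sketch so that the variadic
part is never empty.

```c
#include <assert.h>

/*
 * Illustrative only: same shape as the patch's __count_args(), under
 * hypothetical names. The caller's arguments push the descending list
 * 7..0 to the right, so the tenth slot ("x") ends up holding the number
 * of arguments that sit between a0 and the trailing res pointer.
 */
#define COUNT_ARGS_(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x
#define COUNT_ARGS(...) \
	COUNT_ARGS_(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0, ~)

/* (a0, res): no extra register arguments */
int extra_args_none(void)
{
	return COUNT_ARGS(0x80000000u, 0);
}

/* (a0, a1, res): one extra register argument */
int extra_args_one(void)
{
	return COUNT_ARGS(0x80000000u, 1, 0);
}

/* (a0, a1..a7, res): the maximum the macros cater for */
int extra_args_seven(void)
{
	return COUNT_ARGS(0x80000000u, 1, 2, 3, 4, 5, 6, 7, 0);
}

/* Run-time sanity check of the three call shapes above. */
void check_count_args(void)
{
	assert(extra_args_none() == 0);
	assert(extra_args_one() == 1);
	assert(extra_args_seven() == 7);
}
```

Because the count is only known after expansion, the position of the
last argument differs for every call shape, which is why pulling res out
on its own would need one extra macro per argument count.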

Robin.

2018-02-01 14:22:45

by Marc Zyngier

Subject: Re: [PATCH v3 00/18] arm64: Add SMCCC v1.1 support and CVE-2017-5715 (Spectre variant 2) mitigation

On Thu, 01 Feb 2018 13:59:45 +0000,
Ard Biesheuvel wrote:
>
> On 1 February 2018 at 11:46, Marc Zyngier <[email protected]> wrote:
> > ARM has recently published a SMC Calling Convention (SMCCC)
> > specification update[1] that provides an optimised calling convention
> > and optional, discoverable support for mitigating CVE-2017-5715. ARM
> > Trusted Firmware (ATF) has already gained such an implementation[2].
> >
> > This series addresses a few things:
> >
> > - It provides a KVM implementation of PSCI v1.0, which is a
> > prerequisite for being able to discover SMCCC v1.1, together with a
> > new userspace API to control the PSCI revision number that the guest
> > sees.
> >
> > - It allows KVM to advertise SMCCC v1.1, which is de-facto supported
> > already (it never corrupts any of the guest registers).
> >
> > - It implements KVM support for the ARCH_WORKAROUND_1 function that is
> > used to mitigate CVE-2017-5715 in a guest (if such mitigation is
> > available on the host).
> >
> > - It implements SMCCC v1.1 and ARCH_WORKAROUND_1 discovery support in
> > the kernel itself.
> >
> > - It finally provides firmware callbacks for CVE-2017-5715 for both
> > kernel and KVM and drop the initial PSCI_GET_VERSION based
> > mitigation.
> >
> > Patch 1 is already merged, and included here for reference. Patches on
> > top of arm64/for-next/core. Tested on Seattle and Juno, the latter
> > with ATF implementing SMCCC v1.1.
> >
> > [1]: https://developer.arm.com/support/security-update/downloads/
> >
> > [2]: https://github.com/ARM-software/arm-trusted-firmware/pull/1240
> >
> > * From v2:
> > - Fixed SMC handling in KVM
> > - PSCI fixes and tidying up
> > - SMCCC primitive rework for better code generation (both efficiency
> > and correctness)
> > - Remove PSCI_GET_VERSION as a mitigation vector
> >
> > * From v1:
> > - Fixed 32bit build
> > - Fix function number sign extension (Ard)
> > - Inline SMCCC v1.1 primitives (cpp soup)
> > - Prevent SMCCC spamming on feature probing
> > - Random fixes and tidying up
> >
> > Marc Zyngier (18):
> > arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
> > arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
> > arm64: KVM: Increment PC after handling an SMC trap
> > arm/arm64: KVM: Consolidate the PSCI include files
> > arm/arm64: KVM: Add PSCI_VERSION helper
> > arm/arm64: KVM: Add smccc accessors to PSCI code
> > arm/arm64: KVM: Implement PSCI 1.0 support
> > arm/arm64: KVM: Add PSCI version selection API
> > arm/arm64: KVM: Advertise SMCCC v1.1
> > arm/arm64: KVM: Turn kvm_psci_version into a static inline
> > arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
> > arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
> > firmware/psci: Expose PSCI conduit
> > firmware/psci: Expose SMCCC version through psci_ops
> > arm/arm64: smccc: Make function identifiers an unsigned quantity
> > arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
> > arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
> > arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
> >
>
> I have given this a spin on my Overdrive, and everything seems to work
> as expected, both in the host and in the guest (I single stepped
> through the guest to ensure that it gets the expected answer from the
> SMCCC feature info call)
>
> Tested-by: Ard Biesheuvel <[email protected]>

Awesome, thanks Ard.

M.

--
Jazz is not dead, it just smell funny.

2018-02-01 21:19:43

by Ard Biesheuvel

Subject: Re: [PATCH v3 14/18] firmware/psci: Expose SMCCC version through psci_ops

On 1 February 2018 at 11:46, Marc Zyngier <[email protected]> wrote:
> Since PSCI 1.0 allows the SMCCC version to be (indirectly) probed,
> let's do that at boot time, and expose the version of the calling
> convention as part of the psci_ops structure.
>
> Acked-by: Lorenzo Pieralisi <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/firmware/psci.c | 19 +++++++++++++++++++
> include/linux/psci.h | 6 ++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
> index e9493da2b111..8631906c414c 100644
> --- a/drivers/firmware/psci.c
> +++ b/drivers/firmware/psci.c
> @@ -61,6 +61,7 @@ bool psci_tos_resident_on(int cpu)
>
> struct psci_operations psci_ops = {
> .conduit = PSCI_CONDUIT_NONE,
> + .smccc_version = SMCCC_VERSION_1_0,
> };
>
> typedef unsigned long (psci_fn)(unsigned long, unsigned long,
> @@ -511,6 +512,23 @@ static void __init psci_init_migrate(void)
> pr_info("Trusted OS resident on physical CPU 0x%lx\n", cpuid);
> }
>
> +static void __init psci_init_smccc(u32 ver)
> +{
> + int feature;
> +
> + feature = psci_features(ARM_SMCCC_VERSION_FUNC_ID);
> +
> + if (feature != PSCI_RET_NOT_SUPPORTED) {
> + ver = invoke_psci_fn(ARM_SMCCC_VERSION_FUNC_ID, 0, 0, 0);
> + if (ver != ARM_SMCCC_VERSION_1_1)
> + psci_ops.smccc_version = SMCCC_VERSION_1_0;
> + else
> + psci_ops.smccc_version = SMCCC_VERSION_1_1;
> + }
> +
> + pr_info("SMC Calling Convention v1.%d\n", psci_ops.smccc_version);

This is a bit nasty: you are printing the numeric value of the enum
as the minor number, and hardcoding the major version number as 1,
while the return value of ARM_SMCCC_VERSION_FUNC_ID gives you the
exact numbers. I assume nobody is expecting SMCCC v2.3 anytime soon,
but it would still be a lot nicer to simply decode the value of 'ver'
(and make it default to ARM_SMCCC_VERSION_1_0 if the PSCI feature call
fails).
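
A rough userspace sketch of what decoding 'ver' could look like follows.
It is not taken from the patch: the helper names are hypothetical, and
the field layout (major in bits [30:16], minor in bits [15:0], a negative
value meaning the call is not supported) plus the 0x10000 encoding of
v1.0 are assumptions based on a reading of the SMCCC specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed encoding of SMCCC v1.0 (major 1, minor 0). */
#define SMCCC_VER_1_0	0x10000u

/* Hypothetical helpers: split the 32-bit version word into its fields. */
unsigned int smccc_ver_major(uint32_t ver)
{
	return (ver >> 16) & 0x7fffu;
}

unsigned int smccc_ver_minor(uint32_t ver)
{
	return ver & 0xffffu;
}

/*
 * Format the probed version as "vM.m". A negative return value (for
 * example NOT_SUPPORTED) falls back to v1.0, which is the default
 * suggested in the review comment above.
 */
int smccc_ver_format(int32_t ret, char *buf, size_t len)
{
	uint32_t ver = ret < 0 ? SMCCC_VER_1_0 : (uint32_t)ret;

	assert(buf != NULL);
	return snprintf(buf, len, "v%u.%u",
			smccc_ver_major(ver), smccc_ver_minor(ver));
}
```

With something along these lines the pr_info() would no longer need a
hardcoded major number, and an enum could still be kept internally if
that is more convenient for the callers.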


> +}
> +
> static void __init psci_0_2_set_functions(void)
> {
> pr_info("Using standard PSCI v0.2 function IDs\n");
> @@ -559,6 +577,7 @@ static int __init psci_probe(void)
> psci_init_migrate();
>
> if (PSCI_VERSION_MAJOR(ver) >= 1) {
> + psci_init_smccc(ver);
> psci_init_cpu_suspend();
> psci_init_system_suspend();
> }
> diff --git a/include/linux/psci.h b/include/linux/psci.h
> index f2679e5faa4f..8b1b3b5935ab 100644
> --- a/include/linux/psci.h
> +++ b/include/linux/psci.h
> @@ -31,6 +31,11 @@ enum psci_conduit {
> PSCI_CONDUIT_HVC,
> };
>
> +enum smccc_version {
> + SMCCC_VERSION_1_0,
> + SMCCC_VERSION_1_1,
> +};
> +
> struct psci_operations {
> u32 (*get_version)(void);
> int (*cpu_suspend)(u32 state, unsigned long entry_point);
> @@ -41,6 +46,7 @@ struct psci_operations {
> unsigned long lowest_affinity_level);
> int (*migrate_info_type)(void);
> enum psci_conduit conduit;
> + enum smccc_version smccc_version;
> };
>
> extern struct psci_operations psci_ops;
> --
> 2.14.2
>

2018-02-02 04:09:04

by Hanjun Guo

Subject: Re: [PATCH v3 18/18] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround

Hi Marc,

Thank you for keeping me in the loop, just minor comments below.

On 2018/2/1 19:46, Marc Zyngier wrote:
> Now that we've standardised on SMCCC v1.1 to perform the branch
> prediction invalidation, let's drop the previous band-aid.
> If vendors haven't updated their firmware to do SMCCC 1.1, they
> haven't updated PSCI either, so we don't lose anything.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/kernel/bpi.S | 24 -----------------------
> arch/arm64/kernel/cpu_errata.c | 43 ++++++++++++------------------------------
> arch/arm64/kvm/hyp/switch.c | 14 --------------
> 3 files changed, 12 insertions(+), 69 deletions(-)
>
> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
> index fdeed629f2c6..e5de33513b5d 100644
> --- a/arch/arm64/kernel/bpi.S
> +++ b/arch/arm64/kernel/bpi.S
> @@ -54,30 +54,6 @@ ENTRY(__bp_harden_hyp_vecs_start)
> vectors __kvm_hyp_vector
> .endr
> ENTRY(__bp_harden_hyp_vecs_end)
> -ENTRY(__psci_hyp_bp_inval_start)
> - sub sp, sp, #(8 * 18)
> - stp x16, x17, [sp, #(16 * 0)]
> - stp x14, x15, [sp, #(16 * 1)]
> - stp x12, x13, [sp, #(16 * 2)]
> - stp x10, x11, [sp, #(16 * 3)]
> - stp x8, x9, [sp, #(16 * 4)]
> - stp x6, x7, [sp, #(16 * 5)]
> - stp x4, x5, [sp, #(16 * 6)]
> - stp x2, x3, [sp, #(16 * 7)]
> - stp x0, x1, [sp, #(16 * 8)]
> - mov x0, #0x84000000
> - smc #0
> - ldp x16, x17, [sp, #(16 * 0)]
> - ldp x14, x15, [sp, #(16 * 1)]
> - ldp x12, x13, [sp, #(16 * 2)]
> - ldp x10, x11, [sp, #(16 * 3)]
> - ldp x8, x9, [sp, #(16 * 4)]
> - ldp x6, x7, [sp, #(16 * 5)]
> - ldp x4, x5, [sp, #(16 * 6)]
> - ldp x2, x3, [sp, #(16 * 7)]
> - ldp x0, x1, [sp, #(16 * 8)]
> - add sp, sp, #(8 * 18)
> -ENTRY(__psci_hyp_bp_inval_end)
>
> ENTRY(__qcom_hyp_sanitize_link_stack_start)
> stp x29, x30, [sp, #-16]!
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 9e77809a3b23..b8279a11f57b 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -67,7 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused)
> DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
>
> #ifdef CONFIG_KVM
> -extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
> extern char __qcom_hyp_sanitize_link_stack_start[];
> extern char __qcom_hyp_sanitize_link_stack_end[];
> extern char __smccc_workaround_1_smc_start[];
> @@ -116,8 +115,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
> spin_unlock(&bp_lock);
> }
> #else
> -#define __psci_hyp_bp_inval_start NULL
> -#define __psci_hyp_bp_inval_end NULL
> #define __qcom_hyp_sanitize_link_stack_start NULL
> #define __qcom_hyp_sanitize_link_stack_end NULL
> #define __smccc_workaround_1_smc_start NULL
> @@ -164,14 +161,15 @@ static void call_hvc_arch_workaround_1(void)
> arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
> }
>
> -static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
> +static int smccc_arch_workaround_1(void *data)
> {
> + const struct arm64_cpu_capabilities *entry = data;
> bp_hardening_cb_t cb;
> void *smccc_start, *smccc_end;
> struct arm_smccc_res res;
>
> if (!entry->matches(entry, SCOPE_LOCAL_CPU))

entry->matches() ends up being called twice here: once above, and once
in install_bp_hardening_cb() below. Since install_bp_hardening_cb() is
also called from qcom_enable_link_stack_sanitization(), and all of this
is in the init path, I think it's fine to keep it as it is now.

> - return false;
> + return 0;
>
> if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
> return false;

This one should be "return 0;" as well.

> @@ -181,7 +179,7 @@ static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *e
> arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_WORKAROUND_1, &res);
> if (res.a0)
> - return false;
> + return 0;
> cb = call_hvc_arch_workaround_1;
> smccc_start = __smccc_workaround_1_hvc_start;
> smccc_end = __smccc_workaround_1_hvc_end;
> @@ -191,35 +189,18 @@ static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *e
> arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_WORKAROUND_1, &res);
> if (res.a0)
> - return false;
> + return 0;
> cb = call_smc_arch_workaround_1;
> smccc_start = __smccc_workaround_1_smc_start;
> smccc_end = __smccc_workaround_1_smc_end;
> break;
>
> default:
> - return false;
> + return 0;
> }
>
> install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);

Thanks
Hanjun


2018-02-02 12:34:31

by Christoffer Dall

Subject: Re: [PATCH v3 03/18] arm64: KVM: Increment PC after handling an SMC trap

On Thu, Feb 01, 2018 at 11:46:42AM +0000, Marc Zyngier wrote:
> When handling an SMC trap, the "preferred return address" is set
> to that of the SMC, and not the next PC (which is a departure from
> the behaviour of an SMC that isn't trapped).
>
> Increment PC in the handler, as the guest is otherwise forever
> stuck...
>

Reviewed-by: Christoffer Dall <[email protected]>

> Cc: [email protected]
> Fixes: acfb3b883f6d ("arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls")
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/kvm/handle_exit.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 520b0dad3c62..5493bbefbd0d 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -62,7 +62,16 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
>
> static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> {
> + /*
> + * "If an SMC instruction executed at Non-secure EL1 is
> + * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
> + * Trap exception, not a Secure Monitor Call exception [...]"
> + *
> + * We need to advance the PC after the trap, as it would
> + * otherwise return to the same address...
> + */
> vcpu_set_reg(vcpu, 0, ~0UL);
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> return 1;
> }
>
> --
> 2.14.2
>

2018-02-02 12:35:16

by Christoffer Dall

Subject: Re: [PATCH v3 05/18] arm/arm64: KVM: Add PSCI_VERSION helper

On Thu, Feb 01, 2018 at 11:46:44AM +0000, Marc Zyngier wrote:
> As we're about to trigger a PSCI version explosion, it doesn't
> hurt to introduce a PSCI_VERSION helper that is going to be
> used everywhere.
>

Reviewed-by: Christoffer Dall <[email protected]>

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> include/kvm/arm_psci.h | 6 ++++--
> include/uapi/linux/psci.h | 3 +++
> virt/kvm/arm/psci.c | 4 +---
> 3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 2042bb909474..5659343580a3 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -18,8 +18,10 @@
> #ifndef __KVM_ARM_PSCI_H__
> #define __KVM_ARM_PSCI_H__
>
> -#define KVM_ARM_PSCI_0_1 1
> -#define KVM_ARM_PSCI_0_2 2
> +#include <uapi/linux/psci.h>
> +
> +#define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
> +#define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
>
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> int kvm_psci_call(struct kvm_vcpu *vcpu);
> diff --git a/include/uapi/linux/psci.h b/include/uapi/linux/psci.h
> index 760e52a9640f..b3bcabe380da 100644
> --- a/include/uapi/linux/psci.h
> +++ b/include/uapi/linux/psci.h
> @@ -88,6 +88,9 @@
> (((ver) & PSCI_VERSION_MAJOR_MASK) >> PSCI_VERSION_MAJOR_SHIFT)
> #define PSCI_VERSION_MINOR(ver) \
> ((ver) & PSCI_VERSION_MINOR_MASK)
> +#define PSCI_VERSION(maj, min) \
> + ((((maj) << PSCI_VERSION_MAJOR_SHIFT) & PSCI_VERSION_MAJOR_MASK) | \
> + ((min) & PSCI_VERSION_MINOR_MASK))
>
> /* PSCI features decoding (>=1.0) */
> #define PSCI_1_0_FEATURES_CPU_SUSPEND_PF_SHIFT 1
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index b322e46fd142..999f94d6bb98 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -25,8 +25,6 @@
>
> #include <kvm/arm_psci.h>
>
> -#include <uapi/linux/psci.h>
> -
> /*
> * This is an implementation of the Power State Coordination Interface
> * as described in ARM document number ARM DEN 0022A.
> @@ -222,7 +220,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
> * Bits[31:16] = Major Version = 0
> * Bits[15:0] = Minor Version = 2
> */
> - val = 2;
> + val = KVM_ARM_PSCI_0_2;
> break;
> case PSCI_0_2_FN_CPU_SUSPEND:
> case PSCI_0_2_FN64_CPU_SUSPEND:
> --
> 2.14.2
>
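
As a quick illustration of how the new helper composes with the existing
decode macros, here is a small userspace sketch. The macro definitions
are copied in so the snippet builds on its own (the shift and mask values
are those of include/uapi/linux/psci.h as far as I can tell, so treat
them as an assumption); psci_version_encode() is a hypothetical wrapper
that only exists for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies for a standalone build; values assumed from the uapi header. */
#define PSCI_VERSION_MAJOR_SHIFT	16
#define PSCI_VERSION_MINOR_MASK		((1U << PSCI_VERSION_MAJOR_SHIFT) - 1)
#define PSCI_VERSION_MAJOR_MASK		(~PSCI_VERSION_MINOR_MASK)
#define PSCI_VERSION_MAJOR(ver)		\
	(((ver) & PSCI_VERSION_MAJOR_MASK) >> PSCI_VERSION_MAJOR_SHIFT)
#define PSCI_VERSION_MINOR(ver)		\
	((ver) & PSCI_VERSION_MINOR_MASK)
#define PSCI_VERSION(maj, min)		\
	((((maj) << PSCI_VERSION_MAJOR_SHIFT) & PSCI_VERSION_MAJOR_MASK) | \
	 ((min) & PSCI_VERSION_MINOR_MASK))

/* Hypothetical wrapper: pack a major/minor pair into the PSCI version word. */
uint32_t psci_version_encode(uint32_t maj, uint32_t min)
{
	/* Both fields are 16 bits wide in this layout. */
	assert(maj <= 0xffffu && min <= 0xffffu);
	return PSCI_VERSION(maj, min);
}
```

PSCI_VERSION(0, 2) evaluates to 2, which matches the literal "val = 2"
that the patch replaces, so KVM_ARM_PSCI_0_2 keeps its old numeric value
while KVM_ARM_PSCI_0_1 does as well (1).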

2018-02-02 12:35:21

by Christoffer Dall

Subject: Re: [PATCH v3 06/18] arm/arm64: KVM: Add smccc accessors to PSCI code

On Thu, Feb 01, 2018 at 11:46:45AM +0000, Marc Zyngier wrote:
> Instead of open coding the accesses to the various registers,
> let's add explicit SMCCC accessors.
>

Reviewed-by: Christoffer Dall <[email protected]>

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> virt/kvm/arm/psci.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 42 insertions(+), 10 deletions(-)
>
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 999f94d6bb98..c41553d35110 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -32,6 +32,38 @@
>
> #define AFFINITY_MASK(level) ~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
>
> +static u32 smccc_get_function(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_get_reg(vcpu, 0);
> +}
> +
> +static unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_get_reg(vcpu, 1);
> +}
> +
> +static unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_get_reg(vcpu, 2);
> +}
> +
> +static unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_get_reg(vcpu, 3);
> +}
> +
> +static void smccc_set_retval(struct kvm_vcpu *vcpu,
> + unsigned long a0,
> + unsigned long a1,
> + unsigned long a2,
> + unsigned long a3)
> +{
> + vcpu_set_reg(vcpu, 0, a0);
> + vcpu_set_reg(vcpu, 1, a1);
> + vcpu_set_reg(vcpu, 2, a2);
> + vcpu_set_reg(vcpu, 3, a3);
> +}
> +
> static unsigned long psci_affinity_mask(unsigned long affinity_level)
> {
> if (affinity_level <= 3)
> @@ -77,7 +109,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> unsigned long context_id;
> phys_addr_t target_pc;
>
> - cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
> + cpu_id = smccc_get_arg1(source_vcpu) & MPIDR_HWID_BITMASK;
> if (vcpu_mode_is_32bit(source_vcpu))
> cpu_id &= ~((u32) 0);
>
> @@ -96,8 +128,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> return PSCI_RET_INVALID_PARAMS;
> }
>
> - target_pc = vcpu_get_reg(source_vcpu, 2);
> - context_id = vcpu_get_reg(source_vcpu, 3);
> + target_pc = smccc_get_arg2(source_vcpu);
> + context_id = smccc_get_arg3(source_vcpu);
>
> kvm_reset_vcpu(vcpu);
>
> @@ -116,7 +148,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> * NOTE: We always update r0 (or x0) because for PSCI v0.1
> * the general puspose registers are undefined upon CPU_ON.
> */
> - vcpu_set_reg(vcpu, 0, context_id);
> + smccc_set_retval(vcpu, context_id, 0, 0, 0);
> vcpu->arch.power_off = false;
> smp_mb(); /* Make sure the above is visible */
>
> @@ -136,8 +168,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
> struct kvm *kvm = vcpu->kvm;
> struct kvm_vcpu *tmp;
>
> - target_affinity = vcpu_get_reg(vcpu, 1);
> - lowest_affinity_level = vcpu_get_reg(vcpu, 2);
> + target_affinity = smccc_get_arg1(vcpu);
> + lowest_affinity_level = smccc_get_arg2(vcpu);
>
> /* Determine target affinity mask */
> target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
> @@ -210,7 +242,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
> static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> - unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
> + u32 psci_fn = smccc_get_function(vcpu);
> unsigned long val;
> int ret = 1;
>
> @@ -277,14 +309,14 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
> break;
> }
>
> - vcpu_set_reg(vcpu, 0, val);
> + smccc_set_retval(vcpu, val, 0, 0, 0);
> return ret;
> }
>
> static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> - unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
> + u32 psci_fn = smccc_get_function(vcpu);
> unsigned long val;
>
> switch (psci_fn) {
> @@ -302,7 +334,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> break;
> }
>
> - vcpu_set_reg(vcpu, 0, val);
> + smccc_set_retval(vcpu, val, 0, 0, 0);
> return 1;
> }
>
> --
> 2.14.2
>

2018-02-02 12:35:22

by Christoffer Dall

Subject: Re: [PATCH v3 07/18] arm/arm64: KVM: Implement PSCI 1.0 support

On Thu, Feb 01, 2018 at 11:46:46AM +0000, Marc Zyngier wrote:
> PSCI 1.0 can be trivially implemented by having PSCI 0.2 and
> the FEATURES call. Of, and returning 1.0 as the PSCI version.

Of? (Oh ?)

>
> We happily ignore everything else, as it is optional.

nit: Might be worth mentioning that there are other changes between v0.2
and v1.0, but they are clarifications or relaxations and therefore don't
require additional changes.

Reviewed-by: Christoffer Dall <[email protected]>

>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> include/kvm/arm_psci.h | 1 +
> virt/kvm/arm/psci.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 5659343580a3..5446435457c2 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -22,6 +22,7 @@
>
> #define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
> #define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
> +#define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)
>
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> int kvm_psci_call(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index c41553d35110..291874cff85e 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -313,6 +313,47 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
> +{
> + u32 psci_fn = smccc_get_function(vcpu);
> + u32 feature;
> + unsigned long val;
> + int ret = 1;
> +
> + switch(psci_fn) {
> + case PSCI_0_2_FN_PSCI_VERSION:
> + val = KVM_ARM_PSCI_1_0;
> + break;
> + case PSCI_1_0_FN_PSCI_FEATURES:
> + feature = smccc_get_arg1(vcpu);
> + switch(feature) {
> + case PSCI_0_2_FN_PSCI_VERSION:
> + case PSCI_0_2_FN_CPU_SUSPEND:
> + case PSCI_0_2_FN64_CPU_SUSPEND:
> + case PSCI_0_2_FN_CPU_OFF:
> + case PSCI_0_2_FN_CPU_ON:
> + case PSCI_0_2_FN64_CPU_ON:
> + case PSCI_0_2_FN_AFFINITY_INFO:
> + case PSCI_0_2_FN64_AFFINITY_INFO:
> + case PSCI_0_2_FN_MIGRATE_INFO_TYPE:
> + case PSCI_0_2_FN_SYSTEM_OFF:
> + case PSCI_0_2_FN_SYSTEM_RESET:
> + case PSCI_1_0_FN_PSCI_FEATURES:
> + val = 0;
> + break;
> + default:
> + val = PSCI_RET_NOT_SUPPORTED;
> + break;
> + }
> + break;
> + default:
> + return kvm_psci_0_2_call(vcpu);
> + }
> +
> + smccc_set_retval(vcpu, val, 0, 0, 0);
> + return ret;
> +}
> +
> static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> @@ -355,6 +396,8 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> int kvm_psci_call(struct kvm_vcpu *vcpu)
> {
> switch (kvm_psci_version(vcpu)) {
> + case KVM_ARM_PSCI_1_0:
> + return kvm_psci_1_0_call(vcpu);
> case KVM_ARM_PSCI_0_2:
> return kvm_psci_0_2_call(vcpu);
> case KVM_ARM_PSCI_0_1:
> --
> 2.14.2
>
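
For readers following the control flow of the quoted kvm_psci_1_0_call(), here is a minimal standalone sketch of the same dispatch. It is not the kernel code: psci_features() and psci_1_0_dispatch() are hypothetical names, the vcpu plumbing is replaced by plain arguments, and the function IDs are written out by hand on the assumption that they match include/uapi/linux/psci.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Function IDs, assumed to match include/uapi/linux/psci.h. */
#define PSCI_0_2_FN(n)          (0x84000000u + (n))  /* SMC32 calls */
#define PSCI_0_2_FN64(n)        (0xc4000000u + (n))  /* SMC64 calls */
#define PSCI_FN_VERSION         PSCI_0_2_FN(0)
#define PSCI_FN_FEATURES        PSCI_0_2_FN(10)
#define PSCI_VERSION_1_0        0x10000L
#define PSCI_RET_NOT_SUPPORTED  (-1L)

/* Hypothetical helper: 0 if the queried function is implemented. */
long psci_features(uint32_t fn)
{
	switch (fn) {
	case PSCI_0_2_FN(0):	/* PSCI_VERSION */
	case PSCI_0_2_FN(1):	/* CPU_SUSPEND */
	case PSCI_0_2_FN64(1):
	case PSCI_0_2_FN(2):	/* CPU_OFF */
	case PSCI_0_2_FN(3):	/* CPU_ON */
	case PSCI_0_2_FN64(3):
	case PSCI_0_2_FN(4):	/* AFFINITY_INFO */
	case PSCI_0_2_FN64(4):
	case PSCI_0_2_FN(6):	/* MIGRATE_INFO_TYPE */
	case PSCI_0_2_FN(8):	/* SYSTEM_OFF */
	case PSCI_0_2_FN(9):	/* SYSTEM_RESET */
	case PSCI_0_2_FN(10):	/* PSCI_FEATURES */
		return 0;
	default:
		return PSCI_RET_NOT_SUPPORTED;
	}
}

/*
 * Hypothetical dispatcher: returns 1 and fills *val when the call is
 * handled at the 1.0 level, 0 when it should fall through to the
 * v0.2 handler (as the default case in the patch does).
 */
int psci_1_0_dispatch(uint32_t fn, uint32_t arg1, long *val)
{
	assert(val != NULL);

	switch (fn) {
	case PSCI_FN_VERSION:
		*val = PSCI_VERSION_1_0;
		return 1;
	case PSCI_FN_FEATURES:
		*val = psci_features(arg1);
		return 1;
	default:
		return 0;
	}
}
```

The point of the split is that only PSCI_VERSION and PSCI_FEATURES differ from v0.2; everything else is delegated unchanged.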

2018-02-02 12:35:58

by Christoffer Dall

Subject: Re: [PATCH v3 04/18] arm/arm64: KVM: Consolidate the PSCI include files

On Thu, Feb 01, 2018 at 11:46:43AM +0000, Marc Zyngier wrote:
> As we're about to update the PSCI support, and because I'm lazy,
> let's move the PSCI include file to include/kvm so that both
> ARM architectures can find it.
>
Acked-by: Christoffer Dall <[email protected]>

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm/include/asm/kvm_psci.h | 27 ----------------------
> arch/arm/kvm/handle_exit.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 3 ++-
> .../asm/kvm_psci.h => include/kvm/arm_psci.h | 6 ++---
> virt/kvm/arm/arm.c | 2 +-
> virt/kvm/arm/psci.c | 3 ++-
> 6 files changed, 9 insertions(+), 34 deletions(-)
> delete mode 100644 arch/arm/include/asm/kvm_psci.h
> rename arch/arm64/include/asm/kvm_psci.h => include/kvm/arm_psci.h (89%)
>
> diff --git a/arch/arm/include/asm/kvm_psci.h b/arch/arm/include/asm/kvm_psci.h
> deleted file mode 100644
> index 6bda945d31fa..000000000000
> --- a/arch/arm/include/asm/kvm_psci.h
> +++ /dev/null
> @@ -1,27 +0,0 @@
> -/*
> - * Copyright (C) 2012 - ARM Ltd
> - * Author: Marc Zyngier <[email protected]>
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program. If not, see <http://www.gnu.org/licenses/>.
> - */
> -
> -#ifndef __ARM_KVM_PSCI_H__
> -#define __ARM_KVM_PSCI_H__
> -
> -#define KVM_ARM_PSCI_0_1 1
> -#define KVM_ARM_PSCI_0_2 2
> -
> -int kvm_psci_version(struct kvm_vcpu *vcpu);
> -int kvm_psci_call(struct kvm_vcpu *vcpu);
> -
> -#endif /* __ARM_KVM_PSCI_H__ */
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index a4bf0f6f024a..230ae4079108 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -21,7 +21,7 @@
> #include <asm/kvm_emulate.h>
> #include <asm/kvm_coproc.h>
> #include <asm/kvm_mmu.h>
> -#include <asm/kvm_psci.h>
> +#include <kvm/arm_psci.h>
> #include <trace/events/kvm.h>
>
> #include "trace.h"
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 5493bbefbd0d..588f910632a7 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -22,13 +22,14 @@
> #include <linux/kvm.h>
> #include <linux/kvm_host.h>
>
> +#include <kvm/arm_psci.h>
> +
> #include <asm/esr.h>
> #include <asm/exception.h>
> #include <asm/kvm_asm.h>
> #include <asm/kvm_coproc.h>
> #include <asm/kvm_emulate.h>
> #include <asm/kvm_mmu.h>
> -#include <asm/kvm_psci.h>
> #include <asm/debug-monitors.h>
> #include <asm/traps.h>
>
> diff --git a/arch/arm64/include/asm/kvm_psci.h b/include/kvm/arm_psci.h
> similarity index 89%
> rename from arch/arm64/include/asm/kvm_psci.h
> rename to include/kvm/arm_psci.h
> index bc39e557c56c..2042bb909474 100644
> --- a/arch/arm64/include/asm/kvm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -15,8 +15,8 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> -#ifndef __ARM64_KVM_PSCI_H__
> -#define __ARM64_KVM_PSCI_H__
> +#ifndef __KVM_ARM_PSCI_H__
> +#define __KVM_ARM_PSCI_H__
>
> #define KVM_ARM_PSCI_0_1 1
> #define KVM_ARM_PSCI_0_2 2
> @@ -24,4 +24,4 @@
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> int kvm_psci_call(struct kvm_vcpu *vcpu);
>
> -#endif /* __ARM64_KVM_PSCI_H__ */
> +#endif /* __KVM_ARM_PSCI_H__ */
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 15bf026eb182..af3e98fc377e 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -31,6 +31,7 @@
> #include <linux/irqbypass.h>
> #include <trace/events/kvm.h>
> #include <kvm/arm_pmu.h>
> +#include <kvm/arm_psci.h>
>
> #define CREATE_TRACE_POINTS
> #include "trace.h"
> @@ -46,7 +47,6 @@
> #include <asm/kvm_mmu.h>
> #include <asm/kvm_emulate.h>
> #include <asm/kvm_coproc.h>
> -#include <asm/kvm_psci.h>
> #include <asm/sections.h>
>
> #ifdef REQUIRES_VIRT
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index f1e363bab5e8..b322e46fd142 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -21,9 +21,10 @@
>
> #include <asm/cputype.h>
> #include <asm/kvm_emulate.h>
> -#include <asm/kvm_psci.h>
> #include <asm/kvm_host.h>
>
> +#include <kvm/arm_psci.h>
> +
> #include <uapi/linux/psci.h>
>
> /*
> --
> 2.14.2
>

2018-02-02 13:18:48

by Marc Zyngier

Subject: Re: [PATCH v3 18/18] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround

On 02/02/18 04:05, Hanjun Guo wrote:
> Hi Marc,
>
> Thank you for keeping me in the loop, just minor comments below.
>
> On 2018/2/1 19:46, Marc Zyngier wrote:
>> Now that we've standardised on SMCCC v1.1 to perform the branch
>> prediction invalidation, let's drop the previous band-aid.
>> If vendors haven't updated their firmware to do SMCCC 1.1, they
>> haven't updated PSCI either, so we don't lose anything.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> arch/arm64/kernel/bpi.S | 24 -----------------------
>> arch/arm64/kernel/cpu_errata.c | 43 ++++++++++++------------------------------
>> arch/arm64/kvm/hyp/switch.c | 14 --------------
>> 3 files changed, 12 insertions(+), 69 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
>> index fdeed629f2c6..e5de33513b5d 100644
>> --- a/arch/arm64/kernel/bpi.S
>> +++ b/arch/arm64/kernel/bpi.S
>> @@ -54,30 +54,6 @@ ENTRY(__bp_harden_hyp_vecs_start)
>> vectors __kvm_hyp_vector
>> .endr
>> ENTRY(__bp_harden_hyp_vecs_end)
>> -ENTRY(__psci_hyp_bp_inval_start)
>> - sub sp, sp, #(8 * 18)
>> - stp x16, x17, [sp, #(16 * 0)]
>> - stp x14, x15, [sp, #(16 * 1)]
>> - stp x12, x13, [sp, #(16 * 2)]
>> - stp x10, x11, [sp, #(16 * 3)]
>> - stp x8, x9, [sp, #(16 * 4)]
>> - stp x6, x7, [sp, #(16 * 5)]
>> - stp x4, x5, [sp, #(16 * 6)]
>> - stp x2, x3, [sp, #(16 * 7)]
>> - stp x0, x1, [sp, #(16 * 8)]
>> - mov x0, #0x84000000
>> - smc #0
>> - ldp x16, x17, [sp, #(16 * 0)]
>> - ldp x14, x15, [sp, #(16 * 1)]
>> - ldp x12, x13, [sp, #(16 * 2)]
>> - ldp x10, x11, [sp, #(16 * 3)]
>> - ldp x8, x9, [sp, #(16 * 4)]
>> - ldp x6, x7, [sp, #(16 * 5)]
>> - ldp x4, x5, [sp, #(16 * 6)]
>> - ldp x2, x3, [sp, #(16 * 7)]
>> - ldp x0, x1, [sp, #(16 * 8)]
>> - add sp, sp, #(8 * 18)
>> -ENTRY(__psci_hyp_bp_inval_end)
>>
>> ENTRY(__qcom_hyp_sanitize_link_stack_start)
>> stp x29, x30, [sp, #-16]!
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index 9e77809a3b23..b8279a11f57b 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -67,7 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused)
>> DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
>>
>> #ifdef CONFIG_KVM
>> -extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
>> extern char __qcom_hyp_sanitize_link_stack_start[];
>> extern char __qcom_hyp_sanitize_link_stack_end[];
>> extern char __smccc_workaround_1_smc_start[];
>> @@ -116,8 +115,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
>> spin_unlock(&bp_lock);
>> }
>> #else
>> -#define __psci_hyp_bp_inval_start NULL
>> -#define __psci_hyp_bp_inval_end NULL
>> #define __qcom_hyp_sanitize_link_stack_start NULL
>> #define __qcom_hyp_sanitize_link_stack_end NULL
>> #define __smccc_workaround_1_smc_start NULL
>> @@ -164,14 +161,15 @@ static void call_hvc_arch_workaround_1(void)
>> arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
>> }
>>
>> -static bool check_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
>> +static int smccc_arch_workaround_1(void *data)
>> {
>> + const struct arm64_cpu_capabilities *entry = data;
>> bp_hardening_cb_t cb;
>> void *smccc_start, *smccc_end;
>> struct arm_smccc_res res;
>>
>> if (!entry->matches(entry, SCOPE_LOCAL_CPU))
>
> entry->matches() ends up being called twice: once in this function and
> once in install_bp_hardening_cb() below. Since install_bp_hardening_cb()
> is also called from qcom_enable_link_stack_sanitization(), and all of
> this is in the init path, I think it's fine to keep it as it is now.

That's on purpose. Otherwise we spam the firmware/hypervisor with probe
calls for each entry that has the same capability. Not a big deal, but
still. This should be addressed when we get Suzuki's rework.

>
>> - return false;
>> + return 0;
>>
>> if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
>> return false;
>
> return 0;

Ah, indeed. Thanks.

M.
--
Jazz is not dead. It just smells funny...

2018-02-02 22:19:16

by Andrew Jones

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> Since all the new PSCI versions are backward compatible, we decide to
> default to the latest version of the PSCI implementation. This is no
> different from doing a firmware upgrade on KVM.
>
> But in order to give a chance to hypothetical badly implemented guests
> that would have a fit by discovering something other than PSCI 0.2,
> let's provide a new API that allows userspace to pick one particular
> version of the API.
>
> This is implemented as a new class of "firmware" registers, where
> we expose the PSCI version. This allows the PSCI version to be
> saved/restored as part of a guest migration, and also set to
> any supported version if the guest requires it.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> Documentation/virtual/kvm/api.txt | 3 +-
> Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> arch/arm/include/asm/kvm_host.h | 3 ++
> arch/arm/include/uapi/asm/kvm.h | 6 +++
> arch/arm/kvm/guest.c | 13 +++++++
> arch/arm64/include/asm/kvm_host.h | 3 ++
> arch/arm64/include/uapi/asm/kvm.h | 6 +++
> arch/arm64/kvm/guest.c | 14 ++++++-
> include/kvm/arm_psci.h | 9 +++++
> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> 10 files changed, 151 insertions(+), 4 deletions(-)
> create mode 100644 Documentation/virtual/kvm/arm/psci.txt
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 57d3ee9e4bde..334905202141 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2493,7 +2493,8 @@ Possible features:
> and execute guest code when KVM_RUN is called.
> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> + backward compatible with v0.2) for the CPU.
> Depends on KVM_CAP_ARM_PSCI_0_2.
> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> Depends on KVM_CAP_ARM_PMU_V3.
> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> new file mode 100644
> index 000000000000..aafdab887b04
> --- /dev/null
> +++ b/Documentation/virtual/kvm/arm/psci.txt
> @@ -0,0 +1,30 @@
> +KVM implements the PSCI (Power State Coordination Interface)
> +specification in order to provide services such as CPU on/off, reset
> +and power-off to the guest.
> +
> +The PSCI specification is regularly updated to provide new features,
> +and KVM implements these updates if they make sense from a virtualization
> +point of view.
> +
> +This means that a guest booted on two different versions of KVM can
> +observe two different "firmware" revisions. This could cause issues if
> +a given guest is tied to a particular PSCI revision (unlikely), or if
> +a migration causes a different PSCI version to be exposed out of the
> +blue to an unsuspecting guest.
> +
> +In order to remedy this situation, KVM exposes a set of "firmware
> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> +interface. These registers can be saved/restored by userspace, and set
> +to a convenient value if required.
> +
> +The following register is defined:
> +
> +* KVM_REG_ARM_PSCI_VERSION:
> +
> + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> + (and thus has already been initialized)
> + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> + highest PSCI version implemented by KVM and compatible with v0.2)
> + - Allows any PSCI version implemented by KVM and compatible with
> + v0.2 to be set with SET_ONE_REG
> + - Affects the whole VM (even if the register view is per-vcpu)

Hi Marc,

I've put some more thought and experimentation into this. I think we
should change to a vcpu feature bit. The feature bit would be used to
force compat mode, v0.2, so KVM would still enable the new PSCI
version by default. Below are two tables describing why I think we
should switch to something other than a new sysreg, and below those
tables are notes as to why I think we should use a vcpu feature. The
asterisks in the tables point out behaviors that aren't what we want.
While both tables have an asterisk, the sysreg approach's issue is
a bug. The vcpu feature approach's issue is risk incurred from an
unsupported migration, albeit one that is hard to detect without a
new machine type.

+-----------------------------------------------------------------------+
| sysreg approach |
+------------------+-----------+-------+--------------------------------+
| migration | userspace | works | notes |
| | change | | |
+------------------+-----------+-------+--------------------------------+
| new -> new | NO | YES | Expected |
+------------------+-----------+-------+--------------------------------+
| old -> new | NO | YES | PSCI 1.0 is backward compatible|
+------------------+-----------+-------+--------------------------------+
| new -> old | NO | NO | Migration fails due to the new |
| | | | sysreg. Migration shouldn't |
| | | | have been attempted, but no |
| | | | way to know without a new |
| | | | machine type. |
+------------------+-----------+-------+--------------------------------+
| compat -> old | YES | NO* | Even when setting PSCI version |
| | | | to 0.2, we add a new sysreg, |
| | | | so migration will still fail. |
+------------------+-----------+-------+--------------------------------+
| old -> compat | YES | YES | It's OK for the destination to |
| | | | support more sysregs than the |
| | | | source sends. |
+------------------+-----------+-------+--------------------------------+


+-----------------------------------------------------------------------+
| vcpu feature approach |
+------------------+-----------+-------+--------------------------------+
| migration | userspace | works | notes |
| | change | | |
+------------------+-----------+-------+--------------------------------+
| new -> new | NO | YES | Expected |
+------------------+-----------+-------+--------------------------------+
| old -> new | NO | YES | PSCI 1.0 is backward compatible|
+------------------+-----------+-------+--------------------------------+
| new -> old | NO | YES* | Migrates, but it's not safe |
| | | | for the guest kernel, and no |
| | | | way to know without a new |
| | | | machine type. |
+------------------+-----------+-------+--------------------------------+
| compat -> old | YES | YES | Expected |
+------------------+-----------+-------+--------------------------------+
| old -> compat | YES | YES | Expected |
+------------------+-----------+-------+--------------------------------+


Notes as to why the vcpu feature approach was selected:

1) While this is VM state, and thus a VM control would be a more natural
fit, a new vcpu feature bit would be much less new code. We also
already have a PSCI vcpu feature bit, so a new one will actually fit
quite well.

2) No new state needs to be migrated, as we already migrate the feature
bitmap. Unlike KVM, QEMU doesn't track the max number of features,
so bumping it one more in KVM doesn't require a QEMU change.


If we switch to a vcpu feature bit, then I think this patch can be
replaced with something like this:


diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4485ae8e98de..cde330119fd3 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -42,7 +42,7 @@

#define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS

-#define KVM_VCPU_MAX_FEATURES 4
+#define KVM_VCPU_MAX_FEATURES 5

#define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9abbf3044654..53ac5a633331 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -100,6 +100,7 @@ struct kvm_regs {
#define KVM_ARM_VCPU_EL1_32BIT 1 /* CPU running a 32bit VM */
#define KVM_ARM_VCPU_PSCI_0_2 2 /* CPU uses PSCI v0.2 */
#define KVM_ARM_VCPU_PMU_V3 3 /* Support guest PMUv3 */
+#define KVM_ARM_VCPU_FORCE_PSCI_0_2 4 /* PSCI v0.2 only, nothing later */

struct kvm_vcpu_init {
__u32 target;
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 291874cff85e..946f74539727 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -233,9 +233,11 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)

int kvm_psci_version(struct kvm_vcpu *vcpu)
{
- if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+ if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
+ if (!test_bit(KVM_ARM_VCPU_FORCE_PSCI_0_2, vcpu->arch.features))
+ return KVM_ARM_PSCI_LATEST;
return KVM_ARM_PSCI_0_2;
-
+ }
return KVM_ARM_PSCI_0_1;
}


Thanks,
drew
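
As an aside on the feature-bit proposal above, the selection rule in the suggested kvm_psci_version() can be modelled outside the kernel as a pure function of the feature bitmap. This is only a sketch under assumptions: psci_version_for() is a hypothetical name, the bitmap is a plain integer rather than vcpu->arch.features, and bit 4 (FORCE_PSCI_0_2) is the bit proposed in this mail, not an existing UAPI definition.

```c
#include <assert.h>
#include <stdint.h>

#define PSCI_VERSION(maj, min) \
	((((uint32_t)(maj)) << 16) | ((uint32_t)(min) & 0xffffu))

#define VCPU_PSCI_0_2        2  /* existing KVM_ARM_VCPU_PSCI_0_2 bit */
#define VCPU_FORCE_PSCI_0_2  4  /* bit proposed in this mail (assumption) */
#define VCPU_MAX_FEATURES    5  /* value the proposal bumps the limit to */

/* Hypothetical model of the proposed version selection. */
uint32_t psci_version_for(uint32_t features)
{
	/* Only bits below VCPU_MAX_FEATURES are meaningful here. */
	assert((features >> VCPU_MAX_FEATURES) == 0);

	if (features & (1u << VCPU_PSCI_0_2)) {
		if (!(features & (1u << VCPU_FORCE_PSCI_0_2)))
			return PSCI_VERSION(1, 0);	/* latest by default */
		return PSCI_VERSION(0, 2);		/* forced compat mode */
	}
	return PSCI_VERSION(0, 1);
}
```

Note that the force bit has no effect on its own: without PSCI_0_2 set, the guest still sees v0.1, which matches the nesting of the two test_bit() calls in the diff.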


> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index acbf9ec7b396..e9d57060d88c 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -75,6 +75,9 @@ struct kvm_arch {
> /* Interrupt controller */
> struct vgic_dist vgic;
> int max_vcpus;
> +
> + /* Mandated version of PSCI */
> + u32 psci_version;
> };
>
> #define KVM_NR_MEM_OBJS 40
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6edd177bb1c7..47dfc99f5cd0 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -186,6 +186,12 @@ struct kvm_arch_memory_slot {
> #define KVM_REG_ARM_VFP_FPINST 0x1009
> #define KVM_REG_ARM_VFP_FPINST2 0x100A
>
> +/* KVM-as-firmware specific pseudo-registers */
> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM | KVM_REG_SIZE_U64 | \
> + KVM_REG_ARM_FW | ((r) & 0xffff))
> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
> +
> /* Device Control API: ARM VGIC */
> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> index 1e0784ebbfd6..a18f33edc471 100644
> --- a/arch/arm/kvm/guest.c
> +++ b/arch/arm/kvm/guest.c
> @@ -22,6 +22,7 @@
> #include <linux/module.h>
> #include <linux/vmalloc.h>
> #include <linux/fs.h>
> +#include <kvm/arm_psci.h>
> #include <asm/cputype.h>
> #include <linux/uaccess.h>
> #include <asm/kvm.h>
> @@ -176,6 +177,7 @@ static unsigned long num_core_regs(void)
> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> {
> return num_core_regs() + kvm_arm_num_coproc_regs(vcpu)
> + + kvm_arm_get_fw_num_regs(vcpu)
> + NUM_TIMER_REGS;
> }
>
> @@ -196,6 +198,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> uindices++;
> }
>
> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
> + if (ret)
> + return ret;
> + uindices += kvm_arm_get_fw_num_regs(vcpu);
> +
> ret = copy_timer_indices(vcpu, uindices);
> if (ret)
> return ret;
> @@ -214,6 +221,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return get_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_get_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return get_timer_reg(vcpu, reg);
>
> @@ -230,6 +240,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return set_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_set_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return set_timer_reg(vcpu, reg);
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4485ae8e98de..10af386642c6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -73,6 +73,9 @@ struct kvm_arch {
>
> /* Interrupt controller */
> struct vgic_dist vgic;
> +
> + /* Mandated version of PSCI */
> + u32 psci_version;
> };
>
> #define KVM_NR_MEM_OBJS 40
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 9abbf3044654..04b3256f8e6d 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -206,6 +206,12 @@ struct kvm_arch_memory_slot {
> #define KVM_REG_ARM_TIMER_CNT ARM64_SYS_REG(3, 3, 14, 3, 2)
> #define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, 2)
>
> +/* KVM-as-firmware specific pseudo-registers */
> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
> + KVM_REG_ARM_FW | ((r) & 0xffff))
> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
> +
> /* Device Control API: ARM VGIC */
> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5c7f657dd207..811f04c5760e 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -25,6 +25,7 @@
> #include <linux/module.h>
> #include <linux/vmalloc.h>
> #include <linux/fs.h>
> +#include <kvm/arm_psci.h>
> #include <asm/cputype.h>
> #include <linux/uaccess.h>
> #include <asm/kvm.h>
> @@ -205,7 +206,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> {
> return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
> - + NUM_TIMER_REGS;
> + + kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
> }
>
> /**
> @@ -225,6 +226,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> uindices++;
> }
>
> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
> + if (ret)
> + return ret;
> + uindices += kvm_arm_get_fw_num_regs(vcpu);
> +
> ret = copy_timer_indices(vcpu, uindices);
> if (ret)
> return ret;
> @@ -243,6 +249,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return get_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_get_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return get_timer_reg(vcpu, reg);
>
> @@ -259,6 +268,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return set_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_set_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return set_timer_reg(vcpu, reg);
>
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 5446435457c2..4ee098c39e01 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -24,7 +24,16 @@
> #define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
> #define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)
>
> +#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
> +
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> int kvm_psci_call(struct kvm_vcpu *vcpu);
>
> +struct kvm_one_reg;
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +
> #endif /* __KVM_ARM_PSCI_H__ */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 291874cff85e..5c8366b71639 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -17,6 +17,7 @@
>
> #include <linux/preempt.h>
> #include <linux/kvm_host.h>
> +#include <linux/uaccess.h>
> #include <linux/wait.h>
>
> #include <asm/cputype.h>
> @@ -233,8 +234,19 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
>
> int kvm_psci_version(struct kvm_vcpu *vcpu)
> {
> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
> - return KVM_ARM_PSCI_0_2;
> + /*
> + * Our PSCI implementation stays the same across versions from
> + * v0.2 onward, only adding the few mandatory functions (such
> + * as FEATURES with 1.0) that are required by newer
> + * revisions. It is thus safe to return the latest, unless
> + * userspace has instructed us otherwise.
> + */
> + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> + if (vcpu->kvm->arch.psci_version)
> + return vcpu->kvm->arch.psci_version;
> +
> + return KVM_ARM_PSCI_LATEST;
> + }
>
> return KVM_ARM_PSCI_0_1;
> }
> @@ -406,3 +418,55 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
> return -EINVAL;
> };
> }
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> +{
> + return 1; /* PSCI version */
> +}
> +
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> +{
> + if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + if (reg->id == KVM_REG_ARM_PSCI_VERSION) {
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> +
> + val = kvm_psci_version(vcpu);
> + if (val == KVM_ARM_PSCI_0_1)
> + return -EINVAL;
> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + return 0;
> + }
> +
> + return -EINVAL;
> +}
> +
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + if (reg->id == KVM_REG_ARM_PSCI_VERSION &&
> + test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> +
> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + switch (val) {
> + case KVM_ARM_PSCI_0_2:
> + case KVM_ARM_PSCI_1_0:
> + vcpu->kvm->arch.psci_version = val;
> + return 0;
> + }
> + }
> +
> + return -EINVAL;
> +}
> --
> 2.14.2
>
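
To make the register encoding in the quoted patch concrete, the following standalone sketch computes the arm64 ID that KVM_REG_ARM_PSCI_VERSION would expand to and mirrors the value check done in kvm_arm_set_fw_reg(). The constants are copied by hand on the assumption that they match the uapi headers; fw_reg_id() and psci_version_settable() are hypothetical helper names, not kernel or userspace API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/kvm.h and the arm64 asm/kvm.h. */
#define KVM_REG_ARM64             0x6000000000000000ULL
#define KVM_REG_SIZE_U64          0x0030000000000000ULL
#define KVM_REG_ARM_COPROC_SHIFT  16
#define KVM_REG_ARM_FW            (0x0014ULL << KVM_REG_ARM_COPROC_SHIFT)

#define PSCI_VERSION(maj, min) \
	((((uint64_t)(maj)) << 16) | ((uint64_t)(min) & 0xffffu))

/* Hypothetical equivalent of the KVM_REG_ARM_FW_REG(r) macro. */
uint64_t fw_reg_id(uint32_t r)
{
	/* The kernel macro silently masks; this sketch also rejects
	 * out-of-range indices so mistakes show up early. */
	assert(r <= 0xffffu);
	return KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_FW |
	       (r & 0xffffu);
}

/* Mirrors the switch in kvm_arm_set_fw_reg(): only v0.2 and v1.0
 * are accepted; anything else would get -EINVAL. */
int psci_version_settable(uint64_t val)
{
	return val == PSCI_VERSION(0, 2) || val == PSCI_VERSION(1, 0);
}
```

Userspace would pass such an ID in struct kvm_one_reg to the KVM_GET_ONE_REG / KVM_SET_ONE_REG ioctls on a vcpu fd, with addr pointing at a 64-bit value.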

2018-02-03 12:13:53

by Marc Zyngier

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Fri, 2 Feb 2018 21:17:06 +0100
Andrew Jones <[email protected]> wrote:

> On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> > Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> > Since all the new PSCI versions are backward compatible, we decide to
> > default to the latest version of the PSCI implementation. This is no
> > different from doing a firmware upgrade on KVM.
> >
> > But in order to give a chance to hypothetical badly implemented guests
> > that would have a fit by discovering something other than PSCI 0.2,
> > let's provide a new API that allows userspace to pick one particular
> > version of the API.
> >
> > This is implemented as a new class of "firmware" registers, where
> > we expose the PSCI version. This allows the PSCI version to be
> > saved/restored as part of a guest migration, and also set to
> > any supported version if the guest requires it.
> >
> > Signed-off-by: Marc Zyngier <[email protected]>
> > ---
> > Documentation/virtual/kvm/api.txt | 3 +-
> > Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> > arch/arm/include/asm/kvm_host.h | 3 ++
> > arch/arm/include/uapi/asm/kvm.h | 6 +++
> > arch/arm/kvm/guest.c | 13 +++++++
> > arch/arm64/include/asm/kvm_host.h | 3 ++
> > arch/arm64/include/uapi/asm/kvm.h | 6 +++
> > arch/arm64/kvm/guest.c | 14 ++++++-
> > include/kvm/arm_psci.h | 9 +++++
> > virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> > 10 files changed, 151 insertions(+), 4 deletions(-)
> > create mode 100644 Documentation/virtual/kvm/arm/psci.txt
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > index 57d3ee9e4bde..334905202141 100644
> > --- a/Documentation/virtual/kvm/api.txt
> > +++ b/Documentation/virtual/kvm/api.txt
> > @@ -2493,7 +2493,8 @@ Possible features:
> > and execute guest code when KVM_RUN is called.
> > - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> > Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> > - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> > + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> > + backward compatible with v0.2) for the CPU.
> > Depends on KVM_CAP_ARM_PSCI_0_2.
> > - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> > Depends on KVM_CAP_ARM_PMU_V3.
> > diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> > new file mode 100644
> > index 000000000000..aafdab887b04
> > --- /dev/null
> > +++ b/Documentation/virtual/kvm/arm/psci.txt
> > @@ -0,0 +1,30 @@
> > +KVM implements the PSCI (Power State Coordination Interface)
> > +specification in order to provide services such as CPU on/off, reset
> > +and power-off to the guest.
> > +
> > +The PSCI specification is regularly updated to provide new features,
> > +and KVM implements these updates if they make sense from a virtualization
> > +point of view.
> > +
> > +This means that a guest booted on two different versions of KVM can
> > +observe two different "firmware" revisions. This could cause issues if
> > +a given guest is tied to a particular PSCI revision (unlikely), or if
> > +a migration causes a different PSCI version to be exposed out of the
> > +blue to an unsuspecting guest.
> > +
> > +In order to remedy this situation, KVM exposes a set of "firmware
> > +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> > +interface. These registers can be saved/restored by userspace, and set
> > +to a convenient value if required.
> > +
> > +The following register is defined:
> > +
> > +* KVM_REG_ARM_PSCI_VERSION:
> > +
> > + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> > + (and thus has already been initialized)
> > + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> > + highest PSCI version implemented by KVM and compatible with v0.2)
> > + - Allows any PSCI version implemented by KVM and compatible with
> > + v0.2 to be set with SET_ONE_REG
> > + - Affects the whole VM (even if the register view is per-vcpu)
>

Hi Drew,

Thanks for looking into this, and for the exhaustive data.

>
> I've put some more thought and experimentation into this. I think we
> should change to a vcpu feature bit. The feature bit would be used to
> force compat mode, v0.2, so KVM would still enable the new PSCI
> version by default. Below are two tables describing why I think we
> should switch to something other than a new sysreg, and below those
> tables are notes as to why I think we should use a vcpu feature. The
> asterisks in the tables point out behaviors that aren't what we want.
> While both tables have an asterisk, the sysreg approach's issue is a
> bug. The vcpu feature approach's issue is risk incurred from an
> unsupported migration, albeit one that is hard to detect without a
> new machine type.
>
> +-----------------------------------------------------------------------+
> | sysreg approach |
> +------------------+-----------+-------+--------------------------------+
> | migration | userspace | works | notes |
> | | change | | |
> +------------------+-----------+-------+--------------------------------+
> | new -> new | NO | YES | Expected |
> +------------------+-----------+-------+--------------------------------+
> | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> +------------------+-----------+-------+--------------------------------+
> | new -> old | NO | NO | Migration fails due to the new |
> | | | | sysreg. Migration shouldn't |
> | | | | have been attempted, but no |
> | | | | way to know without a new |
> | | | | machine type. |
> +------------------+-----------+-------+--------------------------------+
> | compat -> old | YES | NO* | Even when setting PSCI version |
> | | | | to 0.2, we add a new sysreg, |
> | | | | so migration will still fail. |
> +------------------+-----------+-------+--------------------------------+
> | old -> compat | YES | YES | It's OK for the destination to |
> | | | | support more sysregs than the |
> | | | | source sends. |
> +------------------+-----------+-------+--------------------------------+
>
>
> +-----------------------------------------------------------------------+
> | vcpu feature approach |
> +------------------+-----------+-------+--------------------------------+
> | migration | userspace | works | notes |
> | | change | | |
> +------------------+-----------+-------+--------------------------------+
> | new -> new | NO | YES | Expected |
> +------------------+-----------+-------+--------------------------------+
> | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> +------------------+-----------+-------+--------------------------------+
> | new -> old | NO | YES* | Migrates, but it's not safe |
> | | | | for the guest kernel, and no |
> | | | | way to know without a new |
> | | | | machine type. |
> +------------------+-----------+-------+--------------------------------+
> | compat -> old | YES | YES | Expected |
> +------------------+-----------+-------+--------------------------------+
> | old -> compat | YES | YES | Expected |
> +------------------+-----------+-------+--------------------------------+
>
>
> Notes as to why the vcpu feature approach was selected:
>
> 1) While this is VM state, and thus a VM control would be a more natural
> fit, a new vcpu feature bit would be much less new code. We also
> already have a PSCI vcpu feature bit, so a new one will actually fit
> quite well.
>
> 2) No new state needs to be migrated, as we already migrate the feature
> bitmap. Unlike KVM, QEMU doesn't track the max number of features,
> so bumping it one more in KVM doesn't require a QEMU change.
>
>
> If we switch to a vcpu feature bit, then I think this patch can be
> replaced with something like this

A couple of remarks:

- My worry with this feature bit is that it is a point fix, and it
doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
another feature bit that says "force to 1.0"? I'd really like
something we can live with in the long run, and "KVM as firmware"
needs to be able to evolve without requiring a new userspace
interface each time we rev it.

- The "compat->old" entry in your sysreg table is not quite fair. In
the feature table, you teach userspace about the new feature bit. You
could just as well teach userspace about the new sysreg. Yes, things
may be different in QEMU, but that's not what we're talking about
here.

- Allowing a guest to migrate in an unsafe way seems worse than failing
a migration unexpectedly. Or at least not any better.

To be clear: I'm not dismissing the idea at all, but I want to make sure
we're not cornering ourselves into an uncomfortable place.

Christoffer, Peter, what are your thoughts on this?

Thanks,

M.
--
Without deviation from the norm, progress is not possible.

2018-02-04 12:39:17

by Christoffer Dall

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Sat, Feb 03, 2018 at 11:59:32AM +0000, Marc Zyngier wrote:
> On Fri, 2 Feb 2018 21:17:06 +0100
> Andrew Jones <[email protected]> wrote:
>
> > On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> > > Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> > > Since all the new PSCI versions are backward compatible, we decide to
> > > default to the latest version of the PSCI implementation. This is no
> > > different from doing a firmware upgrade on KVM.
> > >
> > > But in order to give a chance to hypothetical badly implemented guests
> > > that would have a fit by discovering something other than PSCI 0.2,
> > > let's provide a new API that allows userspace to pick one particular
> > > version of the API.
> > >
> > > This is implemented as a new class of "firmware" registers, where
> > > we expose the PSCI version. This allows the PSCI version to be
> > > saved/restored as part of a guest migration, and also set to
> > > any supported version if the guest requires it.
> > >
> > > Signed-off-by: Marc Zyngier <[email protected]>
> > > ---
> > > Documentation/virtual/kvm/api.txt | 3 +-
> > > Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> > > arch/arm/include/asm/kvm_host.h | 3 ++
> > > arch/arm/include/uapi/asm/kvm.h | 6 +++
> > > arch/arm/kvm/guest.c | 13 +++++++
> > > arch/arm64/include/asm/kvm_host.h | 3 ++
> > > arch/arm64/include/uapi/asm/kvm.h | 6 +++
> > > arch/arm64/kvm/guest.c | 14 ++++++-
> > > include/kvm/arm_psci.h | 9 +++++
> > > virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> > > 10 files changed, 151 insertions(+), 4 deletions(-)
> > > create mode 100644 Documentation/virtual/kvm/arm/psci.txt
> > >
> > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > > index 57d3ee9e4bde..334905202141 100644
> > > --- a/Documentation/virtual/kvm/api.txt
> > > +++ b/Documentation/virtual/kvm/api.txt
> > > @@ -2493,7 +2493,8 @@ Possible features:
> > > and execute guest code when KVM_RUN is called.
> > > - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> > > Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> > > - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> > > + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> > > + backward compatible with v0.2) for the CPU.
> > > Depends on KVM_CAP_ARM_PSCI_0_2.
> > > - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> > > Depends on KVM_CAP_ARM_PMU_V3.
> > > diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> > > new file mode 100644
> > > index 000000000000..aafdab887b04
> > > --- /dev/null
> > > +++ b/Documentation/virtual/kvm/arm/psci.txt
> > > @@ -0,0 +1,30 @@
> > > +KVM implements the PSCI (Power State Coordination Interface)
> > > +specification in order to provide services such as CPU on/off, reset
> > > +and power-off to the guest.
> > > +
> > > +The PSCI specification is regularly updated to provide new features,
> > > +and KVM implements these updates if they make sense from a virtualization
> > > +point of view.
> > > +
> > > +This means that a guest booted on two different versions of KVM can
> > > +observe two different "firmware" revisions. This could cause issues if
> > > +a given guest is tied to a particular PSCI revision (unlikely), or if
> > > +a migration causes a different PSCI version to be exposed out of the
> > > +blue to an unsuspecting guest.
> > > +
> > > +In order to remedy this situation, KVM exposes a set of "firmware
> > > +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> > > +interface. These registers can be saved/restored by userspace, and set
> > > +to a convenient value if required.
> > > +
> > > +The following register is defined:
> > > +
> > > +* KVM_REG_ARM_PSCI_VERSION:
> > > +
> > > + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> > > + (and thus has already been initialized)
> > > + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> > > + highest PSCI version implemented by KVM and compatible with v0.2)
> > > + - Allows any PSCI version implemented by KVM and compatible with
> > > + v0.2 to be set with SET_ONE_REG
> > > + - Affects the whole VM (even if the register view is per-vcpu)
> >
>
> Hi Drew,
>
> Thanks for looking into this, and for the exhaustive data.
>
> >
> > I've put some more thought and experimentation into this. I think we
> > should change to a vcpu feature bit. The feature bit would be used to
> > force compat mode, v0.2, so KVM would still enable the new PSCI
> > version by default. Below are two tables describing why I think we
> > should switch to something other than a new sysreg, and below those
> > tables are notes as to why I think we should use a vcpu feature. The
> > asterisks in the tables point out behaviors that aren't what we want.
> > While both tables have an asterisk, the sysreg approach's issue is a
> > bug. The vcpu feature approach's issue is risk incurred from an
> > unsupported migration, albeit one that is hard to detect without a
> > new machine type.
> >
> > +-----------------------------------------------------------------------+
> > | sysreg approach |
> > +------------------+-----------+-------+--------------------------------+
> > | migration | userspace | works | notes |
> > | | change | | |
> > +------------------+-----------+-------+--------------------------------+
> > | new -> new | NO | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > +------------------+-----------+-------+--------------------------------+
> > | new -> old | NO | NO | Migration fails due to the new |
> > | | | | sysreg. Migration shouldn't |
> > | | | | have been attempted, but no |
> > | | | | way to know without a new |
> > | | | | machine type. |
> > +------------------+-----------+-------+--------------------------------+
> > | compat -> old | YES | NO* | Even when setting PSCI version |
> > | | | | to 0.2, we add a new sysreg, |
> > | | | | so migration will still fail. |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> compat | YES | YES | It's OK for the destination to |
> > | | | | support more sysregs than the |
> > | | | | source sends. |
> > +------------------+-----------+-------+--------------------------------+
> >
> >
> > +-----------------------------------------------------------------------+
> > | vcpu feature approach |
> > +------------------+-----------+-------+--------------------------------+
> > | migration | userspace | works | notes |
> > | | change | | |
> > +------------------+-----------+-------+--------------------------------+
> > | new -> new | NO | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > +------------------+-----------+-------+--------------------------------+
> > | new -> old | NO | YES* | Migrates, but it's not safe |
> > | | | | for the guest kernel, and no |
> > | | | | way to know without a new |
> > | | | | machine type. |
> > +------------------+-----------+-------+--------------------------------+
> > | compat -> old | YES | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> compat | YES | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> >
> >
> > Notes as to why the vcpu feature approach was selected:
> >
> > 1) While this is VM state, and thus a VM control would be a more natural
> > fit, a new vcpu feature bit would be much less new code. We also
> > already have a PSCI vcpu feature bit, so a new one will actually fit
> > quite well.
> >
> > 2) No new state needs to be migrated, as we already migrate the feature
> > bitmap. Unlike KVM, QEMU doesn't track the max number of features,
> > so bumping it one more in KVM doesn't require a QEMU change.
> >
> >
> > If we switch to a vcpu feature bit, then I think this patch can be
> > replaced with something like this
>
> A couple of remarks:
>
> - My worry with this feature bit is that it is a point fix, and it
> doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
> another feature bit that says "force to 1.0"? I'd really like
> something we can live with in the long run, and "KVM as firmware"
> needs to be able to evolve without requiring a new userspace
> interface each time we rev it.
>
> - The "compat->old" entry in your sysreg table is not quite fair. In
> the feature table, you teach userspace about the new feature bit. You
> could just as well teach userspace about the new sysreg. Yes, things
> may be different in QEMU, but that's not what we're talking about
> here.
>
> - Allowing a guest to migrate in an unsafe way seems worse than failing
> a migration unexpectedly. Or at least not any better.
>
> To be clear: I'm not dismissing the idea at all, but I want to make sure
> we're not cornering ourselves into an uncomfortable place.
>
> Christoffer, Peter, what are your thoughts on this?
>

Taking a step back, the only reasons why this patch isn't simply
enabling PSCI v1.0 by default (without any selection method) are that we
want to (1) support guests that complain about PSCI_VERSION != 0.2
(which isn't completely outside the realm of a reasonable implementation
if you read the description of PSCI_VERSION in the 0.2 spec) and (2)
provide migration support for guests that call
PSCI_1_0_FN_PSCI_FEATURES.

If we ignore (1) because we don't know of any guests where this is an
issue, then it's all about (2), migration from "new -> old".

As far as I can tell, the use case we are worried about here is updating
the kernel (and not QEMU) on half of your data center and then trying to
migrate from the upgraded kernel machine to a legacy (and potentially
variant 2 vulnerable) machine. For this specific move from PSCI 0.2 to
1.0 with the included mitigation, I don't really think this is an
important use case to support.

In terms of the more general approach to "KVM firmware upgrades" and
migration, I think something like the proposed FW register interface
here should work, but I'm concerned about the lack of opportunity for
userspace to predict a migration failure. But I don't understand why
this requires a new machine type. Why can't we simply provide a KVM
capability that libvirt etc. can query for?

Also, is it generally true that we can't expose any additional system
registers from KVM without breaking migration and we don't have any
method to deal with that in userspace and upper layers? If that's true,
that's a bigger problem in general and something we should work on
trying to solve. If it's not true, then there should be some method to
deal with the FW register already (like capabilities).
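
To make the failure mode concrete, here is a rough sketch of the rule
described in Drew's sysreg table: a migration can only work if the
destination knows every register the source saved, while extra registers
on the destination are harmless. The helper name and the idea of comparing
plain ID arrays are assumptions for illustration only; this is not how
QEMU or KVM actually implement the check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing KVM or QEMU function): returns
 * true if every register ID saved on the source is also known on the
 * destination.
 */
static bool dst_accepts_all_regs(const uint64_t *src, size_t n_src,
				 const uint64_t *dst, size_t n_dst)
{
	for (size_t i = 0; i < n_src; i++) {
		bool found = false;

		for (size_t j = 0; j < n_dst; j++) {
			if (src[i] == dst[j]) {
				found = true;
				break;
			}
		}
		/* An unknown register is the "new -> old" failure case. */
		if (!found)
			return false;
	}
	/* Registers known only to the destination do not matter. */
	return true;
}
```

A capability, as suggested above, would let userspace run this kind of
comparison before starting the migration rather than finding out from a
failed SET_ONE_REG on the destination.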

Given the urgency of adding mitigation for variant 2, which is the
driver for this work, I think we should drop the compat functionality in
this series and work this out later on if needed. I think we can just
tweak the previous patch to enable PSCI 1.0 by default and drop this
patch for the current merge window.

Thanks,
-Christoffer

2018-02-04 12:40:55

by Christoffer Dall

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

Hi Marc,

[ I know we're discussing the overall approach in parallel, but here are
some comments on the specifics of this patch, should it end up being
used in some capacity ]

On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> Since all the new PSCI versions are backward compatible, we decide to
> default to the latest version of the PSCI implementation. This is no
> different from doing a firmware upgrade on KVM.
>
> But in order to give a chance to hypothetical badly implemented guests
> that would have a fit by discovering something other than PSCI 0.2,
> let's provide a new API that allows userspace to pick one particular
> version of the API.
>
> This is implemented as a new class of "firmware" registers, where
> we expose the PSCI version. This allows the PSCI version to be
> saved/restored as part of a guest migration, and also set to
> any supported version if the guest requires it.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> Documentation/virtual/kvm/api.txt | 3 +-
> Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> arch/arm/include/asm/kvm_host.h | 3 ++
> arch/arm/include/uapi/asm/kvm.h | 6 +++
> arch/arm/kvm/guest.c | 13 +++++++
> arch/arm64/include/asm/kvm_host.h | 3 ++
> arch/arm64/include/uapi/asm/kvm.h | 6 +++
> arch/arm64/kvm/guest.c | 14 ++++++-
> include/kvm/arm_psci.h | 9 +++++
> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> 10 files changed, 151 insertions(+), 4 deletions(-)
> create mode 100644 Documentation/virtual/kvm/arm/psci.txt
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 57d3ee9e4bde..334905202141 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2493,7 +2493,8 @@ Possible features:
> and execute guest code when KVM_RUN is called.
> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> + backward compatible with v0.2) for the CPU.
> Depends on KVM_CAP_ARM_PSCI_0_2.
> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> Depends on KVM_CAP_ARM_PMU_V3.

Can we add this to api.txt as well?

--------8><----------

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index fc3ae951bc07..c88aa04bcbe8 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1959,6 +1959,8 @@ arm64 CCSIDR registers are demultiplexed by CSSELR value:
arm64 system registers have the following id bit patterns:
0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>

+ARM/arm64 firmware pseudo-registers have the following bit pattern:
+ 0x4030 0000 0014 <regno:16>

MIPS registers are mapped using the lower 32 bits. The upper 16 of that is
the register group type:

--------8><----------
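
For reference, a small sketch of how the register ID is composed from the
macros in this patch. The constants are the values from the uapi KVM
headers; fw_reg_id() is only an illustrative helper, not something the
patch defines.

```c
#include <stdint.h>

/* Values as defined in the uapi KVM headers. */
#define KVM_REG_ARM			0x4000000000000000ULL
#define KVM_REG_ARM64			0x6000000000000000ULL
#define KVM_REG_SIZE_U64		0x0030000000000000ULL
#define KVM_REG_ARM_COPROC_SHIFT	16

/* From this patch: the firmware pseudo-register class. */
#define KVM_REG_ARM_FW	(0x0014 << KVM_REG_ARM_COPROC_SHIFT)

/* Illustrative helper mirroring KVM_REG_ARM_FW_REG(r) for a given arch. */
static uint64_t fw_reg_id(uint64_t arch, unsigned int regno)
{
	return arch | KVM_REG_SIZE_U64 | KVM_REG_ARM_FW | (regno & 0xffff);
}
```

With these values, fw_reg_id(KVM_REG_ARM, 0) is 0x4030000000140000 and
fw_reg_id(KVM_REG_ARM64, 0) is 0x6030000000140000, so the 0x4030 pattern
above would describe the 32-bit ARM case, with arm64 using 0x6030 like its
other system registers.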

> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> new file mode 100644
> index 000000000000..aafdab887b04
> --- /dev/null
> +++ b/Documentation/virtual/kvm/arm/psci.txt
> @@ -0,0 +1,30 @@
> +KVM implements the PSCI (Power State Coordination Interface)
> +specification in order to provide services such as CPU on/off, reset
> +and power-off to the guest.
> +
> +The PSCI specification is regularly updated to provide new features,
> +and KVM implements these updates if they make sense from a virtualization
> +point of view.
> +
> +This means that a guest booted on two different versions of KVM can
> +observe two different "firmware" revisions. This could cause issues if
> +a given guest is tied to a particular PSCI revision (unlikely), or if
> +a migration causes a different PSCI version to be exposed out of the
> +blue to an unsuspecting guest.
> +
> +In order to remedy this situation, KVM exposes a set of "firmware
> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> +interface. These registers can be saved/restored by userspace, and set
> +to a convenient value if required.
> +
> +The following register is defined:
> +
> +* KVM_REG_ARM_PSCI_VERSION:
> +
> + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> + (and thus has already been initialized)
> + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> + highest PSCI version implemented by KVM and compatible with v0.2)
> + - Allows any PSCI version implemented by KVM and compatible with
> + v0.2 to be set with SET_ONE_REG
> + - Affects the whole VM (even if the register view is per-vcpu)
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index acbf9ec7b396..e9d57060d88c 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -75,6 +75,9 @@ struct kvm_arch {
> /* Interrupt controller */
> struct vgic_dist vgic;
> int max_vcpus;
> +
> + /* Mandated version of PSCI */
> + u32 psci_version;
> };
>
> #define KVM_NR_MEM_OBJS 40
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6edd177bb1c7..47dfc99f5cd0 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -186,6 +186,12 @@ struct kvm_arch_memory_slot {
> #define KVM_REG_ARM_VFP_FPINST 0x1009
> #define KVM_REG_ARM_VFP_FPINST2 0x100A
>
> +/* KVM-as-firmware specific pseudo-registers */
> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM | KVM_REG_SIZE_U64 | \
> + KVM_REG_ARM_FW | ((r) & 0xffff))
> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
> +
> /* Device Control API: ARM VGIC */
> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> index 1e0784ebbfd6..a18f33edc471 100644
> --- a/arch/arm/kvm/guest.c
> +++ b/arch/arm/kvm/guest.c
> @@ -22,6 +22,7 @@
> #include <linux/module.h>
> #include <linux/vmalloc.h>
> #include <linux/fs.h>
> +#include <kvm/arm_psci.h>
> #include <asm/cputype.h>
> #include <linux/uaccess.h>
> #include <asm/kvm.h>
> @@ -176,6 +177,7 @@ static unsigned long num_core_regs(void)
> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> {
> return num_core_regs() + kvm_arm_num_coproc_regs(vcpu)
> + + kvm_arm_get_fw_num_regs(vcpu)
> + NUM_TIMER_REGS;
> }
>
> @@ -196,6 +198,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> uindices++;
> }
>
> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
> + if (ret)
> + return ret;
> + uindices += kvm_arm_get_fw_num_regs(vcpu);
> +
> ret = copy_timer_indices(vcpu, uindices);
> if (ret)
> return ret;
> @@ -214,6 +221,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return get_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_get_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return get_timer_reg(vcpu, reg);
>
> @@ -230,6 +240,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return set_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_set_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return set_timer_reg(vcpu, reg);
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4485ae8e98de..10af386642c6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -73,6 +73,9 @@ struct kvm_arch {
>
> /* Interrupt controller */
> struct vgic_dist vgic;
> +
> + /* Mandated version of PSCI */
> + u32 psci_version;
> };
>
> #define KVM_NR_MEM_OBJS 40
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 9abbf3044654..04b3256f8e6d 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -206,6 +206,12 @@ struct kvm_arch_memory_slot {
> #define KVM_REG_ARM_TIMER_CNT ARM64_SYS_REG(3, 3, 14, 3, 2)
> #define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, 2)
>
> +/* KVM-as-firmware specific pseudo-registers */
> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
> + KVM_REG_ARM_FW | ((r) & 0xffff))
> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
> +
> /* Device Control API: ARM VGIC */
> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5c7f657dd207..811f04c5760e 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -25,6 +25,7 @@
> #include <linux/module.h>
> #include <linux/vmalloc.h>
> #include <linux/fs.h>
> +#include <kvm/arm_psci.h>
> #include <asm/cputype.h>
> #include <linux/uaccess.h>
> #include <asm/kvm.h>
> @@ -205,7 +206,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
> {
> return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
> - + NUM_TIMER_REGS;
> + + kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
> }
>
> /**
> @@ -225,6 +226,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> uindices++;
> }
>
> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
> + if (ret)
> + return ret;
> + uindices += kvm_arm_get_fw_num_regs(vcpu);
> +
> ret = copy_timer_indices(vcpu, uindices);
> if (ret)
> return ret;
> @@ -243,6 +249,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return get_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_get_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return get_timer_reg(vcpu, reg);
>
> @@ -259,6 +268,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> return set_core_reg(vcpu, reg);
>
> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> + return kvm_arm_set_fw_reg(vcpu, reg);
> +
> if (is_timer_reg(reg->id))
> return set_timer_reg(vcpu, reg);
>
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 5446435457c2..4ee098c39e01 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -24,7 +24,16 @@
> #define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
> #define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)
>
> +#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
> +
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> int kvm_psci_call(struct kvm_vcpu *vcpu);
>
> +struct kvm_one_reg;
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
> +
> #endif /* __KVM_ARM_PSCI_H__ */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 291874cff85e..5c8366b71639 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -17,6 +17,7 @@
>
> #include <linux/preempt.h>
> #include <linux/kvm_host.h>
> +#include <linux/uaccess.h>
> #include <linux/wait.h>
>
> #include <asm/cputype.h>
> @@ -233,8 +234,19 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
>
> int kvm_psci_version(struct kvm_vcpu *vcpu)
> {
> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
> - return KVM_ARM_PSCI_0_2;
> + /*
> + * Our PSCI implementation stays the same across versions from
> + * v0.2 onward, only adding the few mandatory functions (such
> + * as FEATURES with 1.0) that are required by newer
> + * revisions. It is thus safe to return the latest, unless
> + * userspace has instructed us otherwise.
> + */
> + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> + if (vcpu->kvm->arch.psci_version)
> + return vcpu->kvm->arch.psci_version;
> +
> + return KVM_ARM_PSCI_LATEST;
> + }
>
> return KVM_ARM_PSCI_0_1;
> }
> @@ -406,3 +418,55 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
> return -EINVAL;
> };
> }
> +
> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> +{
> + return 1; /* PSCI version */
> +}
> +
> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> +{
> + if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + if (reg->id == KVM_REG_ARM_PSCI_VERSION) {
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> +
> + val = kvm_psci_version(vcpu);
> + if (val == KVM_ARM_PSCI_0_1)
> + return -EINVAL;

Doesn't this potentially break userspace that doesn't set
KVM_ARM_VCPU_PSCI_0_2 but does KVM_GET_REG_LIST followed by
KVM_GET_ONE_REG? The register is always listed, yet reading it would
fail with -EINVAL.

What is the rationale for not simply allowing PSCI v0.1 (terribly
outdated as it may be) to be exposed via this firmware reg?

> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + return 0;
> + }
> +
> + return -EINVAL;
> +}
> +
> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + if (reg->id == KVM_REG_ARM_PSCI_VERSION &&
> + test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> + void __user *uaddr = (void __user *)(long)reg->addr;
> + u64 val;
> +
> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> + return -EFAULT;
> +
> + switch (val) {
> + case KVM_ARM_PSCI_0_2:
> + case KVM_ARM_PSCI_1_0:
> + vcpu->kvm->arch.psci_version = val;
> + return 0;
> + }

Then here we could change the check so that setting KVM_ARM_PSCI_0_1
when also having set KVM_ARM_VCPU_PSCI_0_2 returns an error, and
similarly trying to set KVM_ARM_PSCI_0_2+ without having set the feature
bit returns an error.
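
Something along these lines, as a standalone model rather than a drop-in
replacement. The function name is made up, has_psci_0_2 stands in for
test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features), and the version
encoding assumes PSCI_VERSION() packs (major << 16) | minor:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: (major << 16) | minor, as PSCI_VERSION() does. */
#define PSCI_VER(maj, min)	(((uint64_t)(maj) << 16) | (min))
#define PSCI_0_1		PSCI_VER(0, 1)
#define PSCI_0_2		PSCI_VER(0, 2)
#define PSCI_1_0		PSCI_VER(1, 0)

/*
 * Hypothetical check: v0.1 is only acceptable without the PSCI_0_2
 * feature bit, and v0.2 or later only with it.
 */
static int check_psci_version(bool has_psci_0_2, uint64_t val)
{
	switch (val) {
	case PSCI_0_1:
		return has_psci_0_2 ? -EINVAL : 0;
	case PSCI_0_2:
	case PSCI_1_0:
		return has_psci_0_2 ? 0 : -EINVAL;
	default:
		/* Not a version implemented by KVM. */
		return -EINVAL;
	}
}
```

That would keep GET and SET symmetric: whatever a vcpu reports on
GET_ONE_REG could be written back unchanged on the destination.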

> + }
> +
> + return -EINVAL;
> +}
> --
> 2.14.2
>

Thanks,
-Christoffer

2018-02-04 18:39:45

by Christoffer Dall

Subject: Re: [PATCH v3 09/18] arm/arm64: KVM: Advertise SMCCC v1.1

On Thu, Feb 01, 2018 at 11:46:48AM +0000, Marc Zyngier wrote:
> The new SMC Calling Convention (v1.1) allows for a reduced overhead
> when calling into the firmware, and provides a new feature discovery
> mechanism.
>
> Make it visible to KVM guests.
>

Reviewed-by: Christoffer Dall <[email protected]>
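
For anyone checking the two new function IDs by hand, a minimal sketch of
how they are encoded. The field layout follows include/linux/arm-smccc.h;
the SMCCC_* names and smccc_call_val() are illustrative stand-ins for the
ARM_SMCCC_* macros, not new kernel definitions.

```c
#include <stdint.h>

/* Field layout as in include/linux/arm-smccc.h. */
#define SMCCC_FAST_CALL		1u
#define SMCCC_SMC_32		0u
#define SMCCC_TYPE_SHIFT	31
#define SMCCC_CALL_CONV_SHIFT	30
#define SMCCC_OWNER_MASK	0x3Fu
#define SMCCC_OWNER_SHIFT	24
#define SMCCC_FUNC_MASK		0xFFFFu

/* Illustrative stand-in for the ARM_SMCCC_CALL_VAL() macro. */
static uint32_t smccc_call_val(uint32_t type, uint32_t conv,
			       uint32_t owner, uint32_t func)
{
	return (type << SMCCC_TYPE_SHIFT) |
	       (conv << SMCCC_CALL_CONV_SHIFT) |
	       ((owner & SMCCC_OWNER_MASK) << SMCCC_OWNER_SHIFT) |
	       (func & SMCCC_FUNC_MASK);
}
```

With owner 0 and function numbers 0 and 1, this gives 0x80000000 for
ARM_SMCCC_VERSION_FUNC_ID and 0x80000001 for
ARM_SMCCC_ARCH_FEATURES_FUNC_ID.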

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm/kvm/handle_exit.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 2 +-
> include/kvm/arm_psci.h | 2 +-
> include/linux/arm-smccc.h | 13 +++++++++++++
> virt/kvm/arm/psci.c | 24 +++++++++++++++++++++++-
> 5 files changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c
> index 230ae4079108..910bd8dabb3c 100644
> --- a/arch/arm/kvm/handle_exit.c
> +++ b/arch/arm/kvm/handle_exit.c
> @@ -36,7 +36,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> kvm_vcpu_hvc_get_imm(vcpu));
> vcpu->stat.hvc_exit_stat++;
>
> - ret = kvm_psci_call(vcpu);
> + ret = kvm_hvc_call_handler(vcpu);
> if (ret < 0) {
> vcpu_set_reg(vcpu, 0, ~0UL);
> return 1;
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 588f910632a7..e5e741bfffe1 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -52,7 +52,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
> kvm_vcpu_hvc_get_imm(vcpu));
> vcpu->stat.hvc_exit_stat++;
>
> - ret = kvm_psci_call(vcpu);
> + ret = kvm_hvc_call_handler(vcpu);
> if (ret < 0) {
> vcpu_set_reg(vcpu, 0, ~0UL);
> return 1;
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 4ee098c39e01..7b2e12697d4f 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -27,7 +27,7 @@
> #define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
>
> int kvm_psci_version(struct kvm_vcpu *vcpu);
> -int kvm_psci_call(struct kvm_vcpu *vcpu);
> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>
> struct kvm_one_reg;
>
> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
> index 4c5bca38c653..dc68aa5a7261 100644
> --- a/include/linux/arm-smccc.h
> +++ b/include/linux/arm-smccc.h
> @@ -60,6 +60,19 @@
> #define ARM_SMCCC_QUIRK_NONE 0
> #define ARM_SMCCC_QUIRK_QCOM_A6 1 /* Save/restore register a6 */
>
> +#define ARM_SMCCC_VERSION_1_0 0x10000
> +#define ARM_SMCCC_VERSION_1_1 0x10001
> +
> +#define ARM_SMCCC_VERSION_FUNC_ID \
> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
> + ARM_SMCCC_SMC_32, \
> + 0, 0)
> +
> +#define ARM_SMCCC_ARCH_FEATURES_FUNC_ID \
> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
> + ARM_SMCCC_SMC_32, \
> + 0, 1)
> +
> #ifndef __ASSEMBLY__
>
> #include <linux/linkage.h>
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 5c8366b71639..53272e8e0d37 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -15,6 +15,7 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <linux/arm-smccc.h>
> #include <linux/preempt.h>
> #include <linux/kvm_host.h>
> #include <linux/uaccess.h>
> @@ -351,6 +352,7 @@ static int kvm_psci_1_0_call(struct kvm_vcpu *vcpu)
> case PSCI_0_2_FN_SYSTEM_OFF:
> case PSCI_0_2_FN_SYSTEM_RESET:
> case PSCI_1_0_FN_PSCI_FEATURES:
> + case ARM_SMCCC_VERSION_FUNC_ID:
> val = 0;
> break;
> default:
> @@ -405,7 +407,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> * Errors:
> * -EINVAL: Unrecognized PSCI function
> */
> -int kvm_psci_call(struct kvm_vcpu *vcpu)
> +static int kvm_psci_call(struct kvm_vcpu *vcpu)
> {
> switch (kvm_psci_version(vcpu)) {
> case KVM_ARM_PSCI_1_0:
> @@ -419,6 +421,26 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
> };
> }
>
> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> +{
> + u32 func_id = smccc_get_function(vcpu);
> + u32 val = PSCI_RET_NOT_SUPPORTED;
> +
> + switch (func_id) {
> + case ARM_SMCCC_VERSION_FUNC_ID:
> + val = ARM_SMCCC_VERSION_1_1;
> + break;
> + case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
> + /* Nothing supported yet */
> + break;
> + default:
> + return kvm_psci_call(vcpu);
> + }
> +
> + smccc_set_retval(vcpu, val, 0, 0, 0);
> + return 1;
> +}
> +
> int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> {
> return 1; /* PSCI version */
> --
> 2.14.2
>
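For readers following the new arm-smccc.h definitions, here is a minimal standalone sketch of how the function identifiers and version words quoted above come out numerically. The helper names (smccc_call_val, smccc_version) are made up for illustration and are not kernel APIs; the field layout is assumed to match what ARM_SMCCC_CALL_VAL() encodes.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of a 32-bit SMCCC function identifier. */
#define SMCCC_TYPE_SHIFT       31      /* 1 = fast call, 0 = yielding call */
#define SMCCC_CALL_CONV_SHIFT  30      /* 0 = SMC32, 1 = SMC64 */
#define SMCCC_OWNER_SHIFT      24
#define SMCCC_OWNER_MASK       0x3fu
#define SMCCC_FUNC_MASK        0xffffu

/* Hypothetical helper mirroring what ARM_SMCCC_CALL_VAL() computes. */
static uint32_t smccc_call_val(uint32_t fast, uint32_t conv,
			       uint32_t owner, uint32_t func)
{
	assert(fast <= 1 && conv <= 1);
	return (fast << SMCCC_TYPE_SHIFT) |
	       (conv << SMCCC_CALL_CONV_SHIFT) |
	       ((owner & SMCCC_OWNER_MASK) << SMCCC_OWNER_SHIFT) |
	       (func & SMCCC_FUNC_MASK);
}

/* Version words carry the major number in the upper half, minor in the lower. */
static uint32_t smccc_version(uint32_t major, uint32_t minor)
{
	return (major << 16) | (minor & 0xffffu);
}
```

With fast call, SMC32 and owner 0, function numbers 0 and 1 give the identifiers for ARM_SMCCC_VERSION_FUNC_ID and ARM_SMCCC_ARCH_FEATURES_FUNC_ID, and smccc_version(1, 1) gives the value of ARM_SMCCC_VERSION_1_1.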

2018-02-04 18:39:50

by Christoffer Dall

Subject: Re: [PATCH v3 10/18] arm/arm64: KVM: Turn kvm_psci_version into a static inline

On Thu, Feb 01, 2018 at 11:46:49AM +0000, Marc Zyngier wrote:
> We're about to need kvm_psci_version in HYP too. So let's turn it
> into a static inline, and pass the kvm structure as a second
> parameter (so that HYP can do a kern_hyp_va on it).
>

Reviewed-by: Christoffer Dall <[email protected]>

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/kvm/hyp/switch.c | 20 ++++++++++++--------
> include/kvm/arm_psci.h | 26 +++++++++++++++++++++++++-
> virt/kvm/arm/psci.c | 25 +++----------------------
> 3 files changed, 40 insertions(+), 31 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 036e1f3d77a6..408c04d789a5 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -19,6 +19,8 @@
> #include <linux/jump_label.h>
> #include <uapi/linux/psci.h>
>
> +#include <kvm/arm_psci.h>
> +
> #include <asm/kvm_asm.h>
> #include <asm/kvm_emulate.h>
> #include <asm/kvm_hyp.h>
> @@ -350,14 +352,16 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>
> if (exit_code == ARM_EXCEPTION_TRAP &&
> (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
> - kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
> - vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
> - u64 val = PSCI_RET_NOT_SUPPORTED;
> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
> - val = 2;
> -
> - vcpu_set_reg(vcpu, 0, val);
> - goto again;
> + kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32)) {
> + u32 val = vcpu_get_reg(vcpu, 0);
> +
> + if (val == PSCI_0_2_FN_PSCI_VERSION) {
> + val = kvm_psci_version(vcpu, kern_hyp_va(vcpu->kvm));
> + if (unlikely(val == KVM_ARM_PSCI_0_1))
> + val = PSCI_RET_NOT_SUPPORTED;
> + vcpu_set_reg(vcpu, 0, val);
> + goto again;
> + }
> }
>
> if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
> index 7b2e12697d4f..9b699f91171f 100644
> --- a/include/kvm/arm_psci.h
> +++ b/include/kvm/arm_psci.h
> @@ -18,6 +18,7 @@
> #ifndef __KVM_ARM_PSCI_H__
> #define __KVM_ARM_PSCI_H__
>
> +#include <linux/kvm_host.h>
> #include <uapi/linux/psci.h>
>
> #define KVM_ARM_PSCI_0_1 PSCI_VERSION(0, 1)
> @@ -26,7 +27,30 @@
>
> #define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
>
> -int kvm_psci_version(struct kvm_vcpu *vcpu);
> +/*
> + * We need the KVM pointer independently from the vcpu as we can call
> + * this from HYP, and need to apply kern_hyp_va on it...
> + */
> +static inline int kvm_psci_version(struct kvm_vcpu *vcpu, struct kvm *kvm)
> +{
> + /*
> + * Our PSCI implementation stays the same across versions from
> + * v0.2 onward, only adding the few mandatory functions (such
> + * as FEATURES with 1.0) that are required by newer
> + * revisions. It is thus safe to return the latest, unless
> + * userspace has instructed us otherwise.
> + */
> + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> + if (kvm->arch.psci_version)
> + return kvm->arch.psci_version;
> +
> + return KVM_ARM_PSCI_LATEST;
> + }
> +
> + return KVM_ARM_PSCI_0_1;
> +}
> +
> +
> int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>
> struct kvm_one_reg;
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 53272e8e0d37..2efacbe7b1a2 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -124,7 +124,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
> if (!vcpu)
> return PSCI_RET_INVALID_PARAMS;
> if (!vcpu->arch.power_off) {
> - if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
> + if (kvm_psci_version(source_vcpu, kvm) != KVM_ARM_PSCI_0_1)
> return PSCI_RET_ALREADY_ON;
> else
> return PSCI_RET_INVALID_PARAMS;
> @@ -233,25 +233,6 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
> kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_RESET);
> }
>
> -int kvm_psci_version(struct kvm_vcpu *vcpu)
> -{
> - /*
> - * Our PSCI implementation stays the same across versions from
> - * v0.2 onward, only adding the few mandatory functions (such
> - * as FEATURES with 1.0) that are required by newer
> - * revisions. It is thus safe to return the latest, unless
> - * userspace has instructed us otherwise.
> - */
> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
> - if (vcpu->kvm->arch.psci_version)
> - return vcpu->kvm->arch.psci_version;
> -
> - return KVM_ARM_PSCI_LATEST;
> - }
> -
> - return KVM_ARM_PSCI_0_1;
> -}
> -
> static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> @@ -409,7 +390,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
> */
> static int kvm_psci_call(struct kvm_vcpu *vcpu)
> {
> - switch (kvm_psci_version(vcpu)) {
> + switch (kvm_psci_version(vcpu, vcpu->kvm)) {
> case KVM_ARM_PSCI_1_0:
> return kvm_psci_1_0_call(vcpu);
> case KVM_ARM_PSCI_0_2:
> @@ -460,7 +441,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> void __user *uaddr = (void __user *)(long)reg->addr;
> u64 val;
>
> - val = kvm_psci_version(vcpu);
> + val = kvm_psci_version(vcpu, vcpu->kvm);
> if (val == KVM_ARM_PSCI_0_1)
> return -EINVAL;
> if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> --
> 2.14.2
>
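The selection logic that this patch moves into the header can be modelled outside the kernel. The sketch below is only an illustration: struct vm_model, struct vcpu_model and both functions are hypothetical stand-ins for the kvm/vcpu structures, and the NOT_SUPPORTED result is modelled as a signed -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSCI_VER(maj, min) \
	((((uint32_t)(maj)) << 16) | ((uint32_t)(min) & 0xffffu))
#define PSCI_VER_0_1        PSCI_VER(0, 1)
#define PSCI_VER_0_2        PSCI_VER(0, 2)
#define PSCI_VER_1_0        PSCI_VER(1, 0)
#define PSCI_VER_LATEST     PSCI_VER_1_0
#define PSCI_NOT_SUPPORTED  (-1)

/* Hypothetical stand-ins for the bits of state the real function reads. */
struct vm_model   { uint32_t psci_version; /* 0: userspace made no choice */ };
struct vcpu_model { bool psci_0_2_feature; /* KVM_ARM_VCPU_PSCI_0_2 set?  */ };

/* Same decision tree as the static inline kvm_psci_version() above. */
static uint32_t psci_version_model(const struct vcpu_model *vcpu,
				   const struct vm_model *vm)
{
	assert(vcpu && vm);
	if (!vcpu->psci_0_2_feature)
		return PSCI_VER_0_1;
	return vm->psci_version ? vm->psci_version : PSCI_VER_LATEST;
}

/* What the HYP shortcut answers to a PSCI_VERSION call in this model. */
static int32_t psci_version_reply(const struct vcpu_model *vcpu,
				  const struct vm_model *vm)
{
	uint32_t v = psci_version_model(vcpu, vm);

	return v == PSCI_VER_0_1 ? PSCI_NOT_SUPPORTED : (int32_t)v;
}
```

Passing the VM as a separate argument is what lets the HYP caller hand in an already translated pointer, which is the point of the second parameter in the patch.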

2018-02-04 18:40:31

by Christoffer Dall

Subject: Re: [PATCH v3 11/18] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support

On Thu, Feb 01, 2018 at 11:46:50AM +0000, Marc Zyngier wrote:
> A new feature of SMCCC 1.1 is that it offers firmware-based CPU
> workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
> BP hardening for CVE-2017-5715.
>
> If the host has some mitigation for this issue, report that
> we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
> host workaround on every guest exit.

Reviewed-by: Christoffer Dall <[email protected]>

>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm/include/asm/kvm_host.h | 7 +++++++
> arch/arm64/include/asm/kvm_host.h | 6 ++++++
> include/linux/arm-smccc.h | 5 +++++
> virt/kvm/arm/psci.c | 9 ++++++++-
> 4 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index e9d57060d88c..6c05e3b13081 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -309,4 +309,11 @@ static inline void kvm_fpsimd_flush_cpu_state(void) {}
>
> static inline void kvm_arm_vhe_guest_enter(void) {}
> static inline void kvm_arm_vhe_guest_exit(void) {}
> +
> +static inline bool kvm_arm_harden_branch_predictor(void)
> +{
> + /* No way to detect it yet, pretend it is not there. */
> + return false;
> +}
> +
> #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 10af386642c6..448d3b9a58cb 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -418,4 +418,10 @@ static inline void kvm_arm_vhe_guest_exit(void)
> {
> local_daif_restore(DAIF_PROCCTX_NOIRQ);
> }
> +
> +static inline bool kvm_arm_harden_branch_predictor(void)
> +{
> + return cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR);
> +}
> +
> #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
> index dc68aa5a7261..e1ef944ef1da 100644
> --- a/include/linux/arm-smccc.h
> +++ b/include/linux/arm-smccc.h
> @@ -73,6 +73,11 @@
> ARM_SMCCC_SMC_32, \
> 0, 1)
>
> +#define ARM_SMCCC_ARCH_WORKAROUND_1 \
> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
> + ARM_SMCCC_SMC_32, \
> + 0, 0x8000)
> +
> #ifndef __ASSEMBLY__
>
> #include <linux/linkage.h>
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 2efacbe7b1a2..22c24561d07d 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -406,13 +406,20 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> {
> u32 func_id = smccc_get_function(vcpu);
> u32 val = PSCI_RET_NOT_SUPPORTED;
> + u32 feature;
>
> switch (func_id) {
> case ARM_SMCCC_VERSION_FUNC_ID:
> val = ARM_SMCCC_VERSION_1_1;
> break;
> case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
> - /* Nothing supported yet */
> + feature = smccc_get_arg1(vcpu);
> + switch(feature) {
> + case ARM_SMCCC_ARCH_WORKAROUND_1:
> + if (kvm_arm_harden_branch_predictor())
> + val = 0;
> + break;
> + }
> break;
> default:
> return kvm_psci_call(vcpu);
> --
> 2.14.2
>
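As a rough model of the discovery flow after this patch, the sketch below mimics what a guest would see from kvm_hvc_call_handler(). It is not the kernel code: hvc_model and the FN_* names are hypothetical, the host_hardened flag stands in for kvm_arm_harden_branch_predictor(), and return values are modelled as signed 32-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FN_SMCCC_VERSION        0x80000000u
#define FN_SMCCC_ARCH_FEATURES  0x80000001u
#define FN_ARCH_WORKAROUND_1    0x80008000u
#define SMCCC_VERSION_1_1       0x10001
#define RET_NOT_SUPPORTED       (-1)

/*
 * Returns true when the call is answered here and stores the value the
 * guest would read back in *ret; false means "hand over to the PSCI
 * handler", as the default case of the real switch does.
 */
static bool hvc_model(uint32_t func_id, uint32_t arg1, bool host_hardened,
		      int32_t *ret)
{
	assert(ret);
	*ret = RET_NOT_SUPPORTED;

	switch (func_id) {
	case FN_SMCCC_VERSION:
		*ret = SMCCC_VERSION_1_1;
		return true;
	case FN_SMCCC_ARCH_FEATURES:
		/* Only advertise the workaround if the host applies one. */
		if (arg1 == FN_ARCH_WORKAROUND_1 && host_hardened)
			*ret = 0;
		return true;
	default:
		return false;
	}
}
```

A guest would first query the version, then ask ARCH_FEATURES about ARCH_WORKAROUND_1, and only call the workaround if the answer is 0.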

2018-02-04 18:41:24

by Christoffer Dall

Subject: Re: [PATCH v3 12/18] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling

On Thu, Feb 01, 2018 at 11:46:51AM +0000, Marc Zyngier wrote:
> We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible.
> So let's intercept it as early as we can by testing for the
> function call number as soon as we've identified a HVC call
> coming from the guest.

Hmmm. How often is this expected to happen and what is the expected
extra cost of doing the early-exit handling in the C code vs. here?

I think we'd be better off if we only had a single early-exit path (and
we should move the FP/SIMD trap to that path as well), but if there's a
measurable benefit of having this logic in assembly as opposed to in the
C code, then I'm ok with this as well.

The code in this patch looks fine otherwise.

Thanks,
-Christoffer

>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/kvm/hyp/hyp-entry.S | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> index e4f37b9dd47c..f36464bd57c5 100644
> --- a/arch/arm64/kvm/hyp/hyp-entry.S
> +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> @@ -15,6 +15,7 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <linux/arm-smccc.h>
> #include <linux/linkage.h>
>
> #include <asm/alternative.h>
> @@ -64,10 +65,11 @@ alternative_endif
> lsr x0, x1, #ESR_ELx_EC_SHIFT
>
> cmp x0, #ESR_ELx_EC_HVC64
> + ccmp x0, #ESR_ELx_EC_HVC32, #4, ne
> b.ne el1_trap
>
> - mrs x1, vttbr_el2 // If vttbr is valid, the 64bit guest
> - cbnz x1, el1_trap // called HVC
> + mrs x1, vttbr_el2 // If vttbr is valid, the guest
> + cbnz x1, el1_hvc_guest // called HVC
>
> /* Here, we're pretty sure the host called HVC. */
> ldp x0, x1, [sp], #16
> @@ -100,6 +102,20 @@ alternative_endif
>
> eret
>
> +el1_hvc_guest:
> + /*
> + * Fastest possible path for ARM_SMCCC_ARCH_WORKAROUND_1.
> + * The workaround has already been applied on the host,
> + * so let's quickly get back to the guest. We don't bother
> + * restoring x1, as it can be clobbered anyway.
> + */
> + ldr x1, [sp] // Guest's x0
> + eor w1, w1, #ARM_SMCCC_ARCH_WORKAROUND_1
> + cbnz w1, el1_trap
> + mov x0, x1
> + add sp, sp, #16
> + eret
> +
> el1_trap:
> /*
> * x0: ESR_EC
> --
> 2.14.2
>
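The assembly above leans on two small tricks that are easier to see in C. This is a sketch under assumptions, not the kernel's code: the function names are made up, and the exception class values (0x12 for HVC32, 0x16 for HVC64) are taken to be those of ESR_ELx_EC_HVC32 and ESR_ELx_EC_HVC64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EC_HVC32              0x12u
#define EC_HVC64              0x16u
#define FN_ARCH_WORKAROUND_1  0x80008000u

/* cmp + ccmp + b.ne: fall through only if the class is HVC64 or HVC32. */
static bool is_hvc_class(uint32_t ec)
{
	return ec == EC_HVC64 || ec == EC_HVC32;
}

/*
 * Model of el1_hvc_guest. XORing the low 32 bits of the guest's x0 with
 * the function ID gives zero exactly when they match, and that zero is
 * then reused as the "success" return value.
 */
static bool workaround_1_fast_path(uint64_t guest_x0, uint64_t *ret_x0)
{
	uint32_t w1;

	assert(ret_x0);
	w1 = (uint32_t)guest_x0 ^ FN_ARCH_WORKAROUND_1;	/* eor w1, w1, #imm */
	if (w1 != 0)
		return false;				/* cbnz w1, el1_trap */
	*ret_x0 = w1;					/* mov x0, x1 */
	return true;
}
```

Because the comparison is done on w1, only the low 32 bits of the guest's x0 take part in the match, and writing w1 leaves the upper half of x1 zeroed before it is copied into x0.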

2018-02-05 09:09:50

by Marc Zyngier

Subject: Re: [PATCH v3 12/18] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling

On 04/02/18 18:39, Christoffer Dall wrote:
> On Thu, Feb 01, 2018 at 11:46:51AM +0000, Marc Zyngier wrote:
>> We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible.
>> So let's intercept it as early as we can by testing for the
>> function call number as soon as we've identified a HVC call
>> coming from the guest.
>
> Hmmm. How often is this expected to happen and what is the expected
> extra cost of doing the early-exit handling in the C code vs. here?

Pretty often. On each context switch of a Linux guest, for example. It
is almost as bad as if we were trapping all VM ops. Moving it to C is
definitely visible on something like hackbench (I remember something
like a 10-12% degradation on Seattle, but I'd need to rerun the tests to
give you something accurate). It is the whole GPR save/restore dance
that costs us a lot (31 registers for the guest, 12 for the host), plus
some extra SError synchronization that doesn't come for free either.

> I think we'd be better off if we only had a single early-exit path (and
> we should move the FP/SIMD trap to that path as well), but if there's a
> measurable benefit of having this logic in assembly as opposed to in the
> C code, then I'm ok with this as well.

I agree that the multiplication of "earlier than early" paths is
becoming annoying. Moving the FP/SIMD stuff to C would be less
problematic, as we have patches to move some of that to load/put, and
we'd only take the trap once per time slice (as opposed to once per
entry at the moment).

Here, we're trying hard to do exactly nothing, because each instruction
is just an extra overhead (we've already nuked the BP). I even
considered inserting that code as part of the per-CPU-type vectors (and
leave the rest of the KVM code alone), but it felt like a step too far.

> The code in this patch looks fine otherwise.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-02-05 09:25:56

by Marc Zyngier

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On 04/02/18 12:37, Christoffer Dall wrote:
> On Sat, Feb 03, 2018 at 11:59:32AM +0000, Marc Zyngier wrote:
>> On Fri, 2 Feb 2018 21:17:06 +0100
>> Andrew Jones <[email protected]> wrote:
>>
>>> On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
>>>> Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
>>>> Since all the new PSCI versions are backward compatible, we decide to
>>>> default to the latest version of the PSCI implementation. This is no
>>>> different from doing a firmware upgrade on KVM.
>>>>
>>>> But in order to give a chance to hypothetical badly implemented guests
>>>> that would have a fit by discovering something other than PSCI 0.2,
>>>> let's provide a new API that allows userspace to pick one particular
>>>> version of the API.
>>>>
>>>> This is implemented as a new class of "firmware" registers, where
>>>> we expose the PSCI version. This allows the PSCI version to be
>>>> saved/restored as part of a guest migration, and also set to
>>>> any supported version if the guest requires it.
>>>>
>>>> Signed-off-by: Marc Zyngier <[email protected]>
>>>> ---
>>>> Documentation/virtual/kvm/api.txt | 3 +-
>>>> Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
>>>> arch/arm/include/asm/kvm_host.h | 3 ++
>>>> arch/arm/include/uapi/asm/kvm.h | 6 +++
>>>> arch/arm/kvm/guest.c | 13 +++++++
>>>> arch/arm64/include/asm/kvm_host.h | 3 ++
>>>> arch/arm64/include/uapi/asm/kvm.h | 6 +++
>>>> arch/arm64/kvm/guest.c | 14 ++++++-
>>>> include/kvm/arm_psci.h | 9 +++++
>>>> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
>>>> 10 files changed, 151 insertions(+), 4 deletions(-)
>>>> create mode 100644 Documentation/virtual/kvm/arm/psci.txt
>>>>
>>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>>> index 57d3ee9e4bde..334905202141 100644
>>>> --- a/Documentation/virtual/kvm/api.txt
>>>> +++ b/Documentation/virtual/kvm/api.txt
>>>> @@ -2493,7 +2493,8 @@ Possible features:
>>>> and execute guest code when KVM_RUN is called.
>>>> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
>>>> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
>>>> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
>>>> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
>>>> + backward compatible with v0.2) for the CPU.
>>>> Depends on KVM_CAP_ARM_PSCI_0_2.
>>>> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
>>>> Depends on KVM_CAP_ARM_PMU_V3.
>>>> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
>>>> new file mode 100644
>>>> index 000000000000..aafdab887b04
>>>> --- /dev/null
>>>> +++ b/Documentation/virtual/kvm/arm/psci.txt
>>>> @@ -0,0 +1,30 @@
>>>> +KVM implements the PSCI (Power State Coordination Interface)
>>>> +specification in order to provide services such as CPU on/off, reset
>>>> +and power-off to the guest.
>>>> +
>>>> +The PSCI specification is regularly updated to provide new features,
>>>> +and KVM implements these updates if they make sense from a virtualization
>>>> +point of view.
>>>> +
>>>> +This means that a guest booted on two different versions of KVM can
>>>> +observe two different "firmware" revisions. This could cause issues if
>>>> +a given guest is tied to a particular PSCI revision (unlikely), or if
>>>> +a migration causes a different PSCI version to be exposed out of the
>>>> +blue to an unsuspecting guest.
>>>> +
>>>> +In order to remedy this situation, KVM exposes a set of "firmware
>>>> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
>>>> +interface. These registers can be saved/restored by userspace, and set
>>>> +to a convenient value if required.
>>>> +
>>>> +The following register is defined:
>>>> +
>>>> +* KVM_REG_ARM_PSCI_VERSION:
>>>> +
>>>> + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
>>>> + (and thus has already been initialized)
>>>> + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
>>>> + highest PSCI version implemented by KVM and compatible with v0.2)
>>>> + - Allows any PSCI version implemented by KVM and compatible with
>>>> + v0.2 to be set with SET_ONE_REG
>>>> + - Affects the whole VM (even if the register view is per-vcpu)
>>>
>>
>> Hi Drew,
>>
>> Thanks for looking into this, and for the exhaustive data.
>>
>>>
>>> I've put some more thought and experimentation into this. I think we
>>> should change to a vcpu feature bit. The feature bit would be used to
>>> force compat mode, v0.2, so KVM would still enable the new PSCI
>>> version by default. Below are two tables describing why I think we
>>> should switch to something other than a new sysreg, and below those
>>> tables are notes as to why I think we should use a vcpu feature. The
>>> asterisks in the tables point out behaviors that aren't what we want.
>>> While both tables have an asterisk, the sysreg approach's issue is
>>> a bug. The vcpu feature approach's issue is risk incurred from an
>>> unsupported migration, albeit one that is hard to detect without a
>>> new machine type.
>>>
>>> +-----------------------------------------------------------------------+
>>> | sysreg approach |
>>> +------------------+-----------+-------+--------------------------------+
>>> | migration | userspace | works | notes |
>>> | | change | | |
>>> +------------------+-----------+-------+--------------------------------+
>>> | new -> new | NO | YES | Expected |
>>> +------------------+-----------+-------+--------------------------------+
>>> | old -> new | NO | YES | PSCI 1.0 is backward compatible|
>>> +------------------+-----------+-------+--------------------------------+
>>> | new -> old | NO | NO | Migration fails due to the new |
>>> | | | | sysreg. Migration shouldn't |
>>> | | | | have been attempted, but no |
>>> | | | | way to know without a new |
>>> | | | | machine type. |
>>> +------------------+-----------+-------+--------------------------------+
>>> | compat -> old | YES | NO* | Even when setting PSCI version |
>>> | | | | to 0.2, we add a new sysreg, |
>>> | | | | so migration will still fail. |
>>> +------------------+-----------+-------+--------------------------------+
>>> | old -> compat | YES | YES | It's OK for the destination to |
>>> | | | | support more sysregs than the |
>>> | | | | source sends. |
>>> +------------------+-----------+-------+--------------------------------+
>>>
>>>
>>> +-----------------------------------------------------------------------+
>>> | vcpu feature approach |
>>> +------------------+-----------+-------+--------------------------------+
>>> | migration | userspace | works | notes |
>>> | | change | | |
>>> +------------------+-----------+-------+--------------------------------+
>>> | new -> new | NO | YES | Expected |
>>> +------------------+-----------+-------+--------------------------------+
>>> | old -> new | NO | YES | PSCI 1.0 is backward compatible|
>>> +------------------+-----------+-------+--------------------------------+
>>> | new -> old | NO | YES* | Migrates, but it's not safe |
>>> | | | | for the guest kernel, and no |
>>> | | | | way to know without a new |
>>> | | | | machine type. |
>>> +------------------+-----------+-------+--------------------------------+
>>> | compat -> old | YES | YES | Expected |
>>> +------------------+-----------+-------+--------------------------------+
>>> | old -> compat | YES | YES | Expected |
>>> +------------------+-----------+-------+--------------------------------+
>>>
>>>
>>> Notes as to why the vcpu feature approach was selected:
>>>
>>> 1) While this is VM state, and thus a VM control would be a more natural
>>> fit, a new vcpu feature bit would be much less new code. We also
>>> already have a PSCI vcpu feature bit, so a new one will actually fit
>>> quite well.
>>>
>>> 2) No new state needs to be migrated, as we already migrate the feature
>>> bitmap. Unlike KVM, QEMU doesn't track the max number of features,
>>> so bumping it one more in KVM doesn't require a QEMU change.
>>>
>>>
>>> If we switch to a vcpu feature bit, then I think this patch can be
>>> replaced with something like this
>>
>> A couple of remarks:
>>
>> - My worry with this feature bit is that it is a point fix, and it
>> doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
>> another feature bit that says "force to 1.0"? I'd really like
>> something we can live with in the long run, and "KVM as firmware"
>> needs to be able to evolve without requiring a new userspace
>> interface each time we rev it.
>>
>> - The "compat->old" entry in your sysreg table is not quite fair. In
>> the feature table, you teach userspace about the new feature bit. You
>> could just as well teach userspace about the new sysreg. Yes, things
>> may be different in QEMU, but that's not what we're talking about
>> here.
>>
>> - Allowing a guest to migrate in an unsafe way seems worse than failing
>> a migration unexpectedly. Or at least not any better.
>>
>> To be clear: I'm not dismissing the idea at all, but I want to make sure
>> we're not cornering ourselves into an uncomfortable place.
>>
>> Christoffer, Peter, what are your thoughts on this?
>>
>
> Taking a step back, the only reasons why this patch isn't simply
> enabling PSCI v1.0 by default (without any selection method) are that we
> (1) want to support guests that complain about PSCI_VERSION != 0.2
> (which isn't completely outside the realm of a reasonable implementation
> if you read the description of PSCI_VERSION in the 0.2 spec) and (2) to
> provide migration support for guests that call
> PSCI_1_0_FN_PSCI_FEATURES.
>
> If we ignore (1) because we don't know of any guests where this is an
> issue, then it's all about (2), migration from "new -> old".
>
> As far as I can tell, the use case we are worried about here is updating
> the kernel (and not QEMU) on half of your data center and then trying to
> migrate from the upgraded kernel machine to a legacy (and potentially
> variant 2 vulnerable) machine. For this specific move from PSCI 0.2 to
> 1.0 with the included mitigation, I don't really think this is an
> important use case to support.

I'm not so sure. Promising mitigation to a guest, and then seeing that
mitigation being silently taken away because we've allowed it to migrate
seems bad to me.

> In terms of the more general approach to "KVM firmware upgrades" and
> migration, I think something like the proposed FW register interface
> here should work, but I'm concerned about the lack of opportunity from
> userspace to predict a migration failure. But I don't understand why

Userspace could predict some of the failure cases, if only by checking
that all registers can be restored in a new guest. I'm not sure how
viable this is in a data centre type of environment.

> this requires a new machine type? Why can't we simply provide a KVM
> capability that libvirt etc. can query for?
>
> Also, is it generally true that we can't expose any additional system
> registers from KVM without breaking migration and we don't have any
> method to deal with that in userspace and upper layers? If that's true,

It is my understanding that each time we add a new sysreg to KVM,
migration in QEMU breaks in the new->old direction.

> that's a bigger problem in general and something we should work on
> trying to solve. If it's not true, then there should be some method to
> deal with the FW register already (like capabilities).
>
> Given the urgency of adding mitigation towards variant 2 which is the
> driver for this work, I think we should drop the compat functionality in
> this series and work this out later on if needed. I think we can just
> tweak the previous patch to enable PSCI 1.0 by default and drop this
> patch for the current merge window.

I'd be fine with that, as long as we have a clear agreement on the
impact of such a move.

Thanks,

M.
--
Jazz is not dead. It just smells funny...
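The idea mentioned above, that userspace could predict some migration failures by checking that every register can be restored on the destination, amounts to a subset test over register IDs. The sketch below is purely illustrative: the function names and the register IDs used with it are made up, and real userspace would obtain the lists from KVM rather than from static arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Linear lookup; register lists are short enough for a sketch. */
static bool reg_known(uint64_t id, const uint64_t *ids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ids[i] == id)
			return true;
	return false;
}

/*
 * A migration can only restore state if every register the source
 * saves is also known to the destination. Extra registers on the
 * destination are harmless.
 */
static bool regs_restorable(const uint64_t *src, size_t nsrc,
			    const uint64_t *dst, size_t ndst)
{
	assert((src || nsrc == 0) && (dst || ndst == 0));
	for (size_t i = 0; i < nsrc; i++)
		if (!reg_known(src[i], dst, ndst))
			return false;
	return true;
}
```

With a source that exposes one register more than the destination, the check fails, which corresponds to the "new -> old" row of the sysreg table; the reverse direction passes, as in the "old -> new" row.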

2018-02-05 09:26:41

by Andrew Jones

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Sat, Feb 03, 2018 at 11:59:32AM +0000, Marc Zyngier wrote:
> On Fri, 2 Feb 2018 21:17:06 +0100
> Andrew Jones <[email protected]> wrote:
>
> > On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> > > Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> > > Since all the new PSCI versions are backward compatible, we decide to
> > > default to the latest version of the PSCI implementation. This is no
> > > different from doing a firmware upgrade on KVM.
> > >
> > > But in order to give a chance to hypothetical badly implemented guests
> > > that would have a fit by discovering something other than PSCI 0.2,
> > > let's provide a new API that allows userspace to pick one particular
> > > version of the API.
> > >
> > > This is implemented as a new class of "firmware" registers, where
> > > we expose the PSCI version. This allows the PSCI version to be
> > > saved/restored as part of a guest migration, and also set to
> > > any supported version if the guest requires it.
> > >
> > > Signed-off-by: Marc Zyngier <[email protected]>
> > > ---
> > > Documentation/virtual/kvm/api.txt | 3 +-
> > > Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> > > arch/arm/include/asm/kvm_host.h | 3 ++
> > > arch/arm/include/uapi/asm/kvm.h | 6 +++
> > > arch/arm/kvm/guest.c | 13 +++++++
> > > arch/arm64/include/asm/kvm_host.h | 3 ++
> > > arch/arm64/include/uapi/asm/kvm.h | 6 +++
> > > arch/arm64/kvm/guest.c | 14 ++++++-
> > > include/kvm/arm_psci.h | 9 +++++
> > > virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> > > 10 files changed, 151 insertions(+), 4 deletions(-)
> > > create mode 100644 Documentation/virtual/kvm/arm/psci.txt
> > >
> > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > > index 57d3ee9e4bde..334905202141 100644
> > > --- a/Documentation/virtual/kvm/api.txt
> > > +++ b/Documentation/virtual/kvm/api.txt
> > > @@ -2493,7 +2493,8 @@ Possible features:
> > > and execute guest code when KVM_RUN is called.
> > > - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> > > Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> > > - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> > > + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> > > + backward compatible with v0.2) for the CPU.
> > > Depends on KVM_CAP_ARM_PSCI_0_2.
> > > - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> > > Depends on KVM_CAP_ARM_PMU_V3.
> > > diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> > > new file mode 100644
> > > index 000000000000..aafdab887b04
> > > --- /dev/null
> > > +++ b/Documentation/virtual/kvm/arm/psci.txt
> > > @@ -0,0 +1,30 @@
> > > +KVM implements the PSCI (Power State Coordination Interface)
> > > +specification in order to provide services such as CPU on/off, reset
> > > +and power-off to the guest.
> > > +
> > > +The PSCI specification is regularly updated to provide new features,
> > > +and KVM implements these updates if they make sense from a virtualization
> > > +point of view.
> > > +
> > > +This means that a guest booted on two different versions of KVM can
> > > +observe two different "firmware" revisions. This could cause issues if
> > > +a given guest is tied to a particular PSCI revision (unlikely), or if
> > > +a migration causes a different PSCI version to be exposed out of the
> > > +blue to an unsuspecting guest.
> > > +
> > > +In order to remedy this situation, KVM exposes a set of "firmware
> > > +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> > > +interface. These registers can be saved/restored by userspace, and set
> > > +to a convenient value if required.
> > > +
> > > +The following register is defined:
> > > +
> > > +* KVM_REG_ARM_PSCI_VERSION:
> > > +
> > > + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> > > + (and thus has already been initialized)
> > > + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> > > + highest PSCI version implemented by KVM and compatible with v0.2)
> > > + - Allows any PSCI version implemented by KVM and compatible with
> > > + v0.2 to be set with SET_ONE_REG
> > > + - Affects the whole VM (even if the register view is per-vcpu)
> >
>
> Hi Drew,
>
> Thanks for looking into this, and for the exhaustive data.
>
> >
> > I've put some more thought and experimentation into this. I think we
> > should change to a vcpu feature bit. The feature bit would be used to
> > force compat mode, v0.2, so KVM would still enable the new PSCI
> > version by default. Below are two tables describing why I think we
> > should switch to something other than a new sysreg, and below those
> > tables are notes as to why I think we should use a vcpu feature. The
> > asterisks in the tables point out behaviors that aren't what we want.
> > While both tables have an asterisk, the sysreg approach's issue is
> > > a bug. The vcpu feature approach's issue is risk incurred from an
> > unsupported migration, albeit one that is hard to detect without a
> > new machine type.
> >
> > +-----------------------------------------------------------------------+
> > | sysreg approach |
> > +------------------+-----------+-------+--------------------------------+
> > | migration | userspace | works | notes |
> > | | change | | |
> > +------------------+-----------+-------+--------------------------------+
> > | new -> new | NO | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > +------------------+-----------+-------+--------------------------------+
> > | new -> old | NO | NO | Migration fails due to the new |
> > | | | | sysreg. Migration shouldn't |
> > | | | | have been attempted, but no |
> > | | | | way to know without a new |
> > | | | | machine type. |
> > +------------------+-----------+-------+--------------------------------+
> > | compat -> old | YES | NO* | Even when setting PSCI version |
> > | | | | to 0.2, we add a new sysreg, |
> > | | | | so migration will still fail. |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> compat | YES | YES | It's OK for the destination to |
> > | | | | support more sysregs than the |
> > | | | | source sends. |
> > +------------------+-----------+-------+--------------------------------+
> >
> >
> > +-----------------------------------------------------------------------+
> > | vcpu feature approach |
> > +------------------+-----------+-------+--------------------------------+
> > | migration | userspace | works | notes |
> > | | change | | |
> > +------------------+-----------+-------+--------------------------------+
> > | new -> new | NO | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > +------------------+-----------+-------+--------------------------------+
> > | new -> old | NO | YES* | Migrates, but it's not safe |
> > | | | | for the guest kernel, and no |
> > | | | | way to know without a new |
> > | | | | machine type. |
> > +------------------+-----------+-------+--------------------------------+
> > | compat -> old | YES | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> > | old -> compat | YES | YES | Expected |
> > +------------------+-----------+-------+--------------------------------+
> >
> >
> > Notes as to why the vcpu feature approach was selected:
> >
> > 1) While this is VM state, and thus a VM control would be a more natural
> > fit, a new vcpu feature bit would be much less new code. We also
> > already have a PSCI vcpu feature bit, so a new one will actually fit
> > quite well.
> >
> > 2) No new state needs to be migrated, as we already migrate the feature
> > > bitmap. Unlike KVM, QEMU doesn't track the max number of features,
> > so bumping it one more in KVM doesn't require a QEMU change.
> >
> >
> > If we switch to a vcpu feature bit, then I think this patch can be
> > replaced with something like this
>
> A couple of remarks:
>
> - My worry with this feature bit is that it is a point fix, and it
> doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
> another feature bit that says "force to 1.0"? I'd really like
> something we can live with in the long run, and "KVM as firmware"
> needs to be able to evolve without requiring a new userspace
> interface each time we rev it.

You're right. The flag wouldn't be a good pattern for the long term.
I was thinking typically we wouldn't enable new features by default
in KVM, so this choice was geared towards getting mitigations and
compat support done quickly. Christoffer is probably right that we
could just put the compat stuff on the back burner for now though.

>
> - The "compat->old" entry in your sysreg table is not quite fair. In
> the feature table, you teach userspace about the new feature bit. You
> could just as well teach userspace about the new sysreg. Yes, things
> may be different in QEMU, but that's not what we're talking about
> here.

Indeed, I should have elaborated on the fact that this is a "how QEMU
does it now" type of thing. While it would be possible to filter new
registers from the migration stream for the compat version, I guess
that would require much more work, and I was thinking of getting a
userspace solution out quickly after KVM gets these patches merged.
But, again, maybe that's not necessary.

>
> - Allowing a guest to migrate in an unsafe way seems worse than failing
> a migration unexpectedly. Or at least not any better.

We could protect the guest by adding kernel support to handle the
exception that old KVM would inject, I think. But, that would be
quite nasty.

>
> To be clear: I'm not dismissing the idea at all, but I want to make sure
> we're not cornering ourselves into an uncomfortable place.
>
> Christoffer, Peter, what are your thoughts on this?
>

Thanks,
drew

2018-02-05 09:31:56

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On 04/02/18 12:38, Christoffer Dall wrote:
> Hi Marc,
>
> [ I know we're discussing the overall approach in parallel, but here are
> some comments on the specifics of this patch, should it end up being
> used in some capacity ]
>
> On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
>> Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
>> Since all the new PSCI versions are backward compatible, we decide to
>> default to the latest version of the PSCI implementation. This is no
>> different from doing a firmware upgrade on KVM.
>>
>> But in order to give a chance to hypothetical badly implemented guests
>> that would have a fit by discovering something other than PSCI 0.2,
>> let's provide a new API that allows userspace to pick one particular
>> version of the API.
>>
>> This is implemented as a new class of "firmware" registers, where
>> we expose the PSCI version. This allows the PSCI version to be
>> save/restored as part of a guest migration, and also set to
>> any supported version if the guest requires it.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> ---
>> Documentation/virtual/kvm/api.txt | 3 +-
>> Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
>> arch/arm/include/asm/kvm_host.h | 3 ++
>> arch/arm/include/uapi/asm/kvm.h | 6 +++
>> arch/arm/kvm/guest.c | 13 +++++++
>> arch/arm64/include/asm/kvm_host.h | 3 ++
>> arch/arm64/include/uapi/asm/kvm.h | 6 +++
>> arch/arm64/kvm/guest.c | 14 ++++++-
>> include/kvm/arm_psci.h | 9 +++++
>> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
>> 10 files changed, 151 insertions(+), 4 deletions(-)
>> create mode 100644 Documentation/virtual/kvm/arm/psci.txt
>>
>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>> index 57d3ee9e4bde..334905202141 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2493,7 +2493,8 @@ Possible features:
>> and execute guest code when KVM_RUN is called.
>> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
>> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
>> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
>> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
>> + backward compatible with v0.2) for the CPU.
>> Depends on KVM_CAP_ARM_PSCI_0_2.
>> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
>> Depends on KVM_CAP_ARM_PMU_V3.
>
> Can we add this to api.txt as well ?:
>
> --------8><----------
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index fc3ae951bc07..c88aa04bcbe8 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1959,6 +1959,8 @@ arm64 CCSIDR registers are demultiplexed by CSSELR value:
> arm64 system registers have the following id bit patterns:
> 0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
>
> +ARM/arm64 firmware pseudo-registers have the following bit pattern:
> + 0x4030 0000 0014 <regno:16>
>
> MIPS registers are mapped using the lower 32 bits. The upper 16 of that is
> the register group type:
>
> --------8><----------

Ah, I never realised we actually documented this. Neat!
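
For reference, here is a minimal sketch of how the firmware pseudo-register
ID is composed from the macros in the patch. The numeric values are my
reading of the KVM uapi headers, redefined locally with a hypothetical EX_
prefix so the snippet builds on any host; the helper names are illustrative
only. Note that, going by the arm64 macro in the patch, the arm64 ID starts
with 0x6030 rather than 0x4030.

```c
#include <stdint.h>

/* Values mirrored from the KVM uapi headers (assumption: unchanged there). */
#define EX_KVM_REG_ARM               0x4000000000000000ULL
#define EX_KVM_REG_ARM64             0x6000000000000000ULL
#define EX_KVM_REG_SIZE_U64          0x0030000000000000ULL
#define EX_KVM_REG_ARM_COPROC_MASK   0x000000000FFF0000ULL
#define EX_KVM_REG_ARM_COPROC_SHIFT  16
#define EX_KVM_REG_ARM_CORE          (0x0010ULL << EX_KVM_REG_ARM_COPROC_SHIFT)
#define EX_KVM_REG_ARM_FW            (0x0014ULL << EX_KVM_REG_ARM_COPROC_SHIFT)

/* Build a firmware pseudo-register ID for the given architecture prefix. */
static uint64_t ex_fw_reg_id(uint64_t arch, unsigned int regno)
{
	return arch | EX_KVM_REG_SIZE_U64 | EX_KVM_REG_ARM_FW |
	       ((uint64_t)regno & 0xffff);
}

/* Same test the patch uses to route GET/SET_ONE_REG to the FW handlers. */
static int ex_is_fw_reg(uint64_t id)
{
	return (id & EX_KVM_REG_ARM_COPROC_MASK) == EX_KVM_REG_ARM_FW;
}
```

With these definitions, register 0 (the PSCI version) comes out as
0x6030000000140000 on arm64 and 0x4030000000140000 on 32bit ARM.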

>
>> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
>> new file mode 100644
>> index 000000000000..aafdab887b04
>> --- /dev/null
>> +++ b/Documentation/virtual/kvm/arm/psci.txt
>> @@ -0,0 +1,30 @@
>> +KVM implements the PSCI (Power State Coordination Interface)
>> +specification in order to provide services such as CPU on/off, reset
>> +and power-off to the guest.
>> +
>> +The PSCI specification is regularly updated to provide new features,
>> +and KVM implements these updates if they make sense from a virtualization
>> +point of view.
>> +
>> +This means that a guest booted on two different versions of KVM can
>> +observe two different "firmware" revisions. This could cause issues if
>> +a given guest is tied to a particular PSCI revision (unlikely), or if
>> +a migration causes a different PSCI version to be exposed out of the
>> +blue to an unsuspecting guest.
>> +
>> +In order to remedy this situation, KVM exposes a set of "firmware
>> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
>> +interface. These registers can be saved/restored by userspace, and set
>> +to a convenient value if required.
>> +
>> +The following register is defined:
>> +
>> +* KVM_REG_ARM_PSCI_VERSION:
>> +
>> + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
>> + (and thus has already been initialized)
>> + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
>> + highest PSCI version implemented by KVM and compatible with v0.2)
>> + - Allows any PSCI version implemented by KVM and compatible with
>> + v0.2 to be set with SET_ONE_REG
>> + - Affects the whole VM (even if the register view is per-vcpu)
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index acbf9ec7b396..e9d57060d88c 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -75,6 +75,9 @@ struct kvm_arch {
>> /* Interrupt controller */
>> struct vgic_dist vgic;
>> int max_vcpus;
>> +
>> + /* Mandated version of PSCI */
>> + u32 psci_version;
>> };
>>
>> #define KVM_NR_MEM_OBJS 40
>> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
>> index 6edd177bb1c7..47dfc99f5cd0 100644
>> --- a/arch/arm/include/uapi/asm/kvm.h
>> +++ b/arch/arm/include/uapi/asm/kvm.h
>> @@ -186,6 +186,12 @@ struct kvm_arch_memory_slot {
>> #define KVM_REG_ARM_VFP_FPINST 0x1009
>> #define KVM_REG_ARM_VFP_FPINST2 0x100A
>>
>> +/* KVM-as-firmware specific pseudo-registers */
>> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
>> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>> + KVM_REG_ARM_FW | ((r) & 0xffff))
>> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
>> +
>> /* Device Control API: ARM VGIC */
>> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
>> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
>> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
>> index 1e0784ebbfd6..a18f33edc471 100644
>> --- a/arch/arm/kvm/guest.c
>> +++ b/arch/arm/kvm/guest.c
>> @@ -22,6 +22,7 @@
>> #include <linux/module.h>
>> #include <linux/vmalloc.h>
>> #include <linux/fs.h>
>> +#include <kvm/arm_psci.h>
>> #include <asm/cputype.h>
>> #include <linux/uaccess.h>
>> #include <asm/kvm.h>
>> @@ -176,6 +177,7 @@ static unsigned long num_core_regs(void)
>> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>> {
>> return num_core_regs() + kvm_arm_num_coproc_regs(vcpu)
>> + + kvm_arm_get_fw_num_regs(vcpu)
>> + NUM_TIMER_REGS;
>> }
>>
>> @@ -196,6 +198,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>> uindices++;
>> }
>>
>> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>> + if (ret)
>> + return ret;
>> + uindices += kvm_arm_get_fw_num_regs(vcpu);
>> +
>> ret = copy_timer_indices(vcpu, uindices);
>> if (ret)
>> return ret;
>> @@ -214,6 +221,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>> return get_core_reg(vcpu, reg);
>>
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_get_fw_reg(vcpu, reg);
>> +
>> if (is_timer_reg(reg->id))
>> return get_timer_reg(vcpu, reg);
>>
>> @@ -230,6 +240,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>> return set_core_reg(vcpu, reg);
>>
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_set_fw_reg(vcpu, reg);
>> +
>> if (is_timer_reg(reg->id))
>> return set_timer_reg(vcpu, reg);
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 4485ae8e98de..10af386642c6 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -73,6 +73,9 @@ struct kvm_arch {
>>
>> /* Interrupt controller */
>> struct vgic_dist vgic;
>> +
>> + /* Mandated version of PSCI */
>> + u32 psci_version;
>> };
>>
>> #define KVM_NR_MEM_OBJS 40
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 9abbf3044654..04b3256f8e6d 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -206,6 +206,12 @@ struct kvm_arch_memory_slot {
>> #define KVM_REG_ARM_TIMER_CNT ARM64_SYS_REG(3, 3, 14, 3, 2)
>> #define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, 2)
>>
>> +/* KVM-as-firmware specific pseudo-registers */
>> +#define KVM_REG_ARM_FW (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
>> +#define KVM_REG_ARM_FW_REG(r) (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>> + KVM_REG_ARM_FW | ((r) & 0xffff))
>> +#define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
>> +
>> /* Device Control API: ARM VGIC */
>> #define KVM_DEV_ARM_VGIC_GRP_ADDR 0
>> #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 5c7f657dd207..811f04c5760e 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -25,6 +25,7 @@
>> #include <linux/module.h>
>> #include <linux/vmalloc.h>
>> #include <linux/fs.h>
>> +#include <kvm/arm_psci.h>
>> #include <asm/cputype.h>
>> #include <linux/uaccess.h>
>> #include <asm/kvm.h>
>> @@ -205,7 +206,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>> {
>> return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
>> - + NUM_TIMER_REGS;
>> + + kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
>> }
>>
>> /**
>> @@ -225,6 +226,11 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>> uindices++;
>> }
>>
>> + ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>> + if (ret)
>> + return ret;
>> + uindices += kvm_arm_get_fw_num_regs(vcpu);
>> +
>> ret = copy_timer_indices(vcpu, uindices);
>> if (ret)
>> return ret;
>> @@ -243,6 +249,9 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>> return get_core_reg(vcpu, reg);
>>
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_get_fw_reg(vcpu, reg);
>> +
>> if (is_timer_reg(reg->id))
>> return get_timer_reg(vcpu, reg);
>>
>> @@ -259,6 +268,9 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
>> return set_core_reg(vcpu, reg);
>>
>> + if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
>> + return kvm_arm_set_fw_reg(vcpu, reg);
>> +
>> if (is_timer_reg(reg->id))
>> return set_timer_reg(vcpu, reg);
>>
>> diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
>> index 5446435457c2..4ee098c39e01 100644
>> --- a/include/kvm/arm_psci.h
>> +++ b/include/kvm/arm_psci.h
>> @@ -24,7 +24,16 @@
>> #define KVM_ARM_PSCI_0_2 PSCI_VERSION(0, 2)
>> #define KVM_ARM_PSCI_1_0 PSCI_VERSION(1, 0)
>>
>> +#define KVM_ARM_PSCI_LATEST KVM_ARM_PSCI_1_0
>> +
>> int kvm_psci_version(struct kvm_vcpu *vcpu);
>> int kvm_psci_call(struct kvm_vcpu *vcpu);
>>
>> +struct kvm_one_reg;
>> +
>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu);
>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
>> +
>> #endif /* __KVM_ARM_PSCI_H__ */
>> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
>> index 291874cff85e..5c8366b71639 100644
>> --- a/virt/kvm/arm/psci.c
>> +++ b/virt/kvm/arm/psci.c
>> @@ -17,6 +17,7 @@
>>
>> #include <linux/preempt.h>
>> #include <linux/kvm_host.h>
>> +#include <linux/uaccess.h>
>> #include <linux/wait.h>
>>
>> #include <asm/cputype.h>
>> @@ -233,8 +234,19 @@ static void kvm_psci_system_reset(struct kvm_vcpu *vcpu)
>>
>> int kvm_psci_version(struct kvm_vcpu *vcpu)
>> {
>> - if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
>> - return KVM_ARM_PSCI_0_2;
>> + /*
>> + * Our PSCI implementation stays the same across versions from
>> + * v0.2 onward, only adding the few mandatory functions (such
>> + * as FEATURES with 1.0) that are required by newer
>> + * revisions. It is thus safe to return the latest, unless
>> + * userspace has instructed us otherwise.
>> + */
>> + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
>> + if (vcpu->kvm->arch.psci_version)
>> + return vcpu->kvm->arch.psci_version;
>> +
>> + return KVM_ARM_PSCI_LATEST;
>> + }
>>
>> return KVM_ARM_PSCI_0_1;
>> }
>> @@ -406,3 +418,55 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
>> return -EINVAL;
>> };
>> }
>> +
>> +int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>> +{
>> + return 1; /* PSCI version */
>> +}
>> +
>> +int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>> +{
>> + if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
>> +int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> +{
>> + if (reg->id == KVM_REG_ARM_PSCI_VERSION) {
>> + void __user *uaddr = (void __user *)(long)reg->addr;
>> + u64 val;
>> +
>> + val = kvm_psci_version(vcpu);
>> + if (val == KVM_ARM_PSCI_0_1)
>> + return -EINVAL;
>
> Doesn't this potentially break userspace that doesn't set
> KVM_ARM_VCPU_PSCI_0_2 which does KVM_GET_REG_LIST followed by
> KVM_GET_ONE_REG ?

Ouch. Yes, well spotted.

> What is the rationale for not simply allowing PSCI v0.1 (terribly
> outdated as it may be) be exposed via this firmware reg?

The rationale was that PSCI_GET_VERSION didn't exist in 0.1, hence there
was no reason to expose a version number if you didn't want at least
0.2. But as you made it obvious above, this is broken. That register
needs to be either exposed all the time, or only be exposed if
KVM_ARM_VCPU_PSCI_0_2 is set. Given that the latter feels pretty
fragile, I'll choose the former.
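
A rough user-space model of the "exposed all the time" option, just to make
the intent concrete. This is a sketch, not the next revision of the patch:
struct ex_vm and the ex_ helpers are hypothetical stand-ins for the vcpu/kvm
structures, and the version encoding assumes the usual (major << 16 | minor)
layout.

```c
#include <stdint.h>

#define EX_PSCI_VERSION(maj, min) \
	((((uint32_t)(maj)) << 16) | ((uint32_t)(min) & 0xffff))
#define EX_PSCI_0_1	EX_PSCI_VERSION(0, 1)
#define EX_PSCI_0_2	EX_PSCI_VERSION(0, 2)
#define EX_PSCI_1_0	EX_PSCI_VERSION(1, 0)
#define EX_PSCI_LATEST	EX_PSCI_1_0

/* Hypothetical stand-in for the per-VM state touched by the patch. */
struct ex_vm {
	int has_psci_0_2;	/* models the KVM_ARM_VCPU_PSCI_0_2 feature */
	uint32_t psci_version;	/* 0 means "not set by userspace" */
};

/* Mirrors the version selection logic of kvm_psci_version() in the patch. */
static uint32_t ex_psci_version(const struct ex_vm *vm)
{
	if (vm->has_psci_0_2)
		return vm->psci_version ? vm->psci_version : EX_PSCI_LATEST;

	return EX_PSCI_0_1;
}

/*
 * The read side never fails: a register that is always listed can
 * always be read back, including when the guest only gets PSCI 0.1.
 */
static int ex_get_fw_reg(const struct ex_vm *vm, uint64_t *val)
{
	*val = ex_psci_version(vm);
	return 0;
}
```

The point is that anything returned by KVM_GET_REG_LIST can then be read
with KVM_GET_ONE_REG, whatever features the vcpu was created with.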

>
>> + if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
>> + return -EFAULT;
>> +
>> + return 0;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +
>> +int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> +{
>> + if (reg->id == KVM_REG_ARM_PSCI_VERSION &&
>> + test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) {
>> + void __user *uaddr = (void __user *)(long)reg->addr;
>> + u64 val;
>> +
>> + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
>> + return -EFAULT;
>> +
>> + switch (val) {
>> + case KVM_ARM_PSCI_0_2:
>> + case KVM_ARM_PSCI_1_0:
>> + vcpu->kvm->arch.psci_version = val;
>> + return 0;
>> + }
>
> Then here we could change the check so that setting KVM_ARM_PSCI_0_1
> when also having set KVM_ARM_VCPU_PSCI_0_2 returns an error, and
> similarly trying to set KVM_ARM_PSCI_0_2+ without having set the feature
> bit, returns an error.

Yes, that's a good suggestion.
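
Something along these lines is what I understand the suggestion to be,
written as a stand-alone model rather than kernel code. struct ex_guest and
ex_set_fw_reg() are hypothetical names, and storing the value in the 0.1 case
is my assumption rather than something settled in this thread.

```c
#include <errno.h>
#include <stdint.h>

#define EX_PSCI_VERSION(maj, min) \
	((((uint32_t)(maj)) << 16) | ((uint32_t)(min) & 0xffff))
#define EX_PSCI_0_1	EX_PSCI_VERSION(0, 1)
#define EX_PSCI_0_2	EX_PSCI_VERSION(0, 2)
#define EX_PSCI_1_0	EX_PSCI_VERSION(1, 0)

/* Hypothetical stand-in for the per-VM state touched by the patch. */
struct ex_guest {
	int has_psci_0_2;	/* models the KVM_ARM_VCPU_PSCI_0_2 feature */
	uint32_t psci_version;
};

/* Accept only versions consistent with the feature bit. */
static int ex_set_fw_reg(struct ex_guest *g, uint64_t val)
{
	switch (val) {
	case EX_PSCI_0_1:
		/* 0.1 makes no sense once the 0.2 feature is requested */
		if (g->has_psci_0_2)
			return -EINVAL;
		break;
	case EX_PSCI_0_2:
	case EX_PSCI_1_0:
		/* 0.2 and later require the feature bit */
		if (!g->has_psci_0_2)
			return -EINVAL;
		break;
	default:
		return -EINVAL;
	}

	g->psci_version = (uint32_t)val;
	return 0;
}
```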

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-02-05 09:50:05

by Andrew Jones

[permalink] [raw]
Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Sun, Feb 04, 2018 at 01:37:01PM +0100, Christoffer Dall wrote:
> On Sat, Feb 03, 2018 at 11:59:32AM +0000, Marc Zyngier wrote:
> > On Fri, 2 Feb 2018 21:17:06 +0100
> > Andrew Jones <[email protected]> wrote:
> >
> > > On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> > > > Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> > > > Since all the new PSCI versions are backward compatible, we decide to
> > > > default to the latest version of the PSCI implementation. This is no
> > > > different from doing a firmware upgrade on KVM.
> > > >
> > > > But in order to give a chance to hypothetical badly implemented guests
> > > > that would have a fit by discovering something other than PSCI 0.2,
> > > > let's provide a new API that allows userspace to pick one particular
> > > > version of the API.
> > > >
> > > > This is implemented as a new class of "firmware" registers, where
> > > > we expose the PSCI version. This allows the PSCI version to be
> > > > save/restored as part of a guest migration, and also set to
> > > > any supported version if the guest requires it.
> > > >
> > > > Signed-off-by: Marc Zyngier <[email protected]>
> > > > ---
> > > > Documentation/virtual/kvm/api.txt | 3 +-
> > > > Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> > > > arch/arm/include/asm/kvm_host.h | 3 ++
> > > > arch/arm/include/uapi/asm/kvm.h | 6 +++
> > > > arch/arm/kvm/guest.c | 13 +++++++
> > > > arch/arm64/include/asm/kvm_host.h | 3 ++
> > > > arch/arm64/include/uapi/asm/kvm.h | 6 +++
> > > > arch/arm64/kvm/guest.c | 14 ++++++-
> > > > include/kvm/arm_psci.h | 9 +++++
> > > > virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> > > > 10 files changed, 151 insertions(+), 4 deletions(-)
> > > > create mode 100644 Documentation/virtual/kvm/arm/psci.txt
> > > >
> > > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > > > index 57d3ee9e4bde..334905202141 100644
> > > > --- a/Documentation/virtual/kvm/api.txt
> > > > +++ b/Documentation/virtual/kvm/api.txt
> > > > @@ -2493,7 +2493,8 @@ Possible features:
> > > > and execute guest code when KVM_RUN is called.
> > > > - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> > > > Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> > > > - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> > > > + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> > > > + backward compatible with v0.2) for the CPU.
> > > > Depends on KVM_CAP_ARM_PSCI_0_2.
> > > > - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> > > > Depends on KVM_CAP_ARM_PMU_V3.
> > > > diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> > > > new file mode 100644
> > > > index 000000000000..aafdab887b04
> > > > --- /dev/null
> > > > +++ b/Documentation/virtual/kvm/arm/psci.txt
> > > > @@ -0,0 +1,30 @@
> > > > +KVM implements the PSCI (Power State Coordination Interface)
> > > > +specification in order to provide services such as CPU on/off, reset
> > > > +and power-off to the guest.
> > > > +
> > > > +The PSCI specification is regularly updated to provide new features,
> > > > +and KVM implements these updates if they make sense from a virtualization
> > > > +point of view.
> > > > +
> > > > +This means that a guest booted on two different versions of KVM can
> > > > +observe two different "firmware" revisions. This could cause issues if
> > > > +a given guest is tied to a particular PSCI revision (unlikely), or if
> > > > +a migration causes a different PSCI version to be exposed out of the
> > > > +blue to an unsuspecting guest.
> > > > +
> > > > +In order to remedy this situation, KVM exposes a set of "firmware
> > > > +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> > > > +interface. These registers can be saved/restored by userspace, and set
> > > > +to a convenient value if required.
> > > > +
> > > > +The following register is defined:
> > > > +
> > > > +* KVM_REG_ARM_PSCI_VERSION:
> > > > +
> > > > + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> > > > + (and thus has already been initialized)
> > > > + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> > > > + highest PSCI version implemented by KVM and compatible with v0.2)
> > > > + - Allows any PSCI version implemented by KVM and compatible with
> > > > + v0.2 to be set with SET_ONE_REG
> > > > + - Affects the whole VM (even if the register view is per-vcpu)
> > >
> >
> > Hi Drew,
> >
> > Thanks for looking into this, and for the exhaustive data.
> >
> > >
> > > I've put some more thought and experimentation into this. I think we
> > > should change to a vcpu feature bit. The feature bit would be used to
> > > force compat mode, v0.2, so KVM would still enable the new PSCI
> > > version by default. Below are two tables describing why I think we
> > > should switch to something other than a new sysreg, and below those
> > > tables are notes as to why I think we should use a vcpu feature. The
> > > asterisks in the tables point out behaviors that aren't what we want.
> > > While both tables have an asterisk, the sysreg approach's issue is
> > > > a bug. The vcpu feature approach's issue is risk incurred from an
> > > unsupported migration, albeit one that is hard to detect without a
> > > new machine type.
> > >
> > > +-----------------------------------------------------------------------+
> > > | sysreg approach |
> > > +------------------+-----------+-------+--------------------------------+
> > > | migration | userspace | works | notes |
> > > | | change | | |
> > > +------------------+-----------+-------+--------------------------------+
> > > | new -> new | NO | YES | Expected |
> > > +------------------+-----------+-------+--------------------------------+
> > > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > > +------------------+-----------+-------+--------------------------------+
> > > | new -> old | NO | NO | Migration fails due to the new |
> > > | | | | sysreg. Migration shouldn't |
> > > | | | | have been attempted, but no |
> > > | | | | way to know without a new |
> > > | | | | machine type. |
> > > +------------------+-----------+-------+--------------------------------+
> > > | compat -> old | YES | NO* | Even when setting PSCI version |
> > > | | | | to 0.2, we add a new sysreg, |
> > > | | | | so migration will still fail. |
> > > +------------------+-----------+-------+--------------------------------+
> > > | old -> compat | YES | YES | It's OK for the destination to |
> > > | | | | support more sysregs than the |
> > > | | | | source sends. |
> > > +------------------+-----------+-------+--------------------------------+
> > >
> > >
> > > +-----------------------------------------------------------------------+
> > > | vcpu feature approach |
> > > +------------------+-----------+-------+--------------------------------+
> > > | migration | userspace | works | notes |
> > > | | change | | |
> > > +------------------+-----------+-------+--------------------------------+
> > > | new -> new | NO | YES | Expected |
> > > +------------------+-----------+-------+--------------------------------+
> > > | old -> new | NO | YES | PSCI 1.0 is backward compatible|
> > > +------------------+-----------+-------+--------------------------------+
> > > | new -> old | NO | YES* | Migrates, but it's not safe |
> > > | | | | for the guest kernel, and no |
> > > | | | | way to know without a new |
> > > | | | | machine type. |
> > > +------------------+-----------+-------+--------------------------------+
> > > | compat -> old | YES | YES | Expected |
> > > +------------------+-----------+-------+--------------------------------+
> > > | old -> compat | YES | YES | Expected |
> > > +------------------+-----------+-------+--------------------------------+
> > >
> > >
> > > Notes as to why the vcpu feature approach was selected:
> > >
> > > 1) While this is VM state, and thus a VM control would be a more natural
> > > fit, a new vcpu feature bit would be much less new code. We also
> > > already have a PSCI vcpu feature bit, so a new one will actually fit
> > > quite well.
> > >
> > > 2) No new state needs to be migrated, as we already migrate the feature
> > > > bitmap. Unlike KVM, QEMU doesn't track the max number of features,
> > > so bumping it one more in KVM doesn't require a QEMU change.
> > >
> > >
> > > If we switch to a vcpu feature bit, then I think this patch can be
> > > replaced with something like this
> >
> > A couple of remarks:
> >
> > - My worry with this feature bit is that it is a point fix, and it
> > doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
> > another feature bit that says "force to 1.0"? I'd really like
> > something we can live with in the long run, and "KVM as firmware"
> > needs to be able to evolve without requiring a new userspace
> > interface each time we rev it.
> >
> > - The "compat->old" entry in your sysreg table is not quite fair. In
> > the feature table, you teach userspace about the new feature bit. You
> > could just as well teach userspace about the new sysreg. Yes, things
> > may be different in QEMU, but that's not what we're talking about
> > here.
> >
> > - Allowing a guest to migrate in an unsafe way seems worse than failing
> > a migration unexpectedly. Or at least not any better.
> >
> > To be clear: I'm not dismissing the idea at all, but I want to make sure
> > we're not cornering ourselves into an uncomfortable place.
> >
> > Christoffer, Peter, what are your thoughts on this?
> >
>
> Taking a step back, the only reasons why this patch isn't simply
> enabling PSCI v1.0 by default (without any selection method) are that we
> (1) want to support guests that complain about PSCI_VERSION != 0.2
> (which isn't completely outside the realm of a reasonable implementation
> if you read the description of PSCI_VERSION in the 0.2 spec) and (2) to
> provide migration support for guests that call
> PSCI_1_0_FN_PSCI_FEATURES.
>
> If we ignore (1) because we don't know of any guests where this is an
> issue, then it's all about (2), migration from "new -> old".
>
> As far as I can tell, the use case we are worried about here is updating
> the kernel (and not QEMU) on half of your data center and then trying to
> migrate from the upgraded kernel machine to a legacy (and potentially
> variant 2 vulnerable) machine. For this specific move from PSCI 0.2 to
> 1.0 with the included mitigation, I don't really think this is an
> important use case to support.
>
> In terms of the more general approach to "KVM firmware upgrades" and
> migration, I think something like the proposed FW register interface
> here should work, but I'm concerned about the lack of opportunity from
> userspace to predict a migration failure. But I don't understand why
> this requires a new machine type? Why can't we simply provide a KVM
> capability that libvirt etc. can query for?

Right, just exposing (or not) a property (which will become a libvirt
capability) should work for the management layers to determine if a
migration will fail, and then not attempt it. We just tend to avoid
allowing new properties to appear in old machine types, and thus
new machine types would be the only ones exposing it. I'm not sure if
we need to be so strict though.
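
To illustrate the kind of check a management layer could make, here is a
small sketch. It assumes the layer can somehow obtain the register ID lists
of both hosts (for instance what KVM_GET_REG_LIST reports on each side);
how it gets them, and the ex_ function names, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Linear search; register lists are small enough for this sketch. */
static int ex_reg_known(const uint64_t *list, size_t n, uint64_t id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (list[i] == id)
			return 1;

	return 0;
}

/*
 * Returns 1 when every register the source would send is known on the
 * destination. Extra registers on the destination are fine.
 */
static int ex_migration_can_succeed(const uint64_t *src, size_t nsrc,
				    const uint64_t *dst, size_t ndst)
{
	size_t i;

	for (i = 0; i < nsrc; i++)
		if (!ex_reg_known(dst, ndst, src[i]))
			return 0;

	return 1;
}
```

This matches the "new -> old" and "old -> compat" rows of the sysreg table:
a source with the extra firmware register cannot go to an old destination,
while the reverse direction works.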

>
> Also, is it generally true that we can't expose any additional system
> registers from KVM without breaking migration and we don't have any
> method to deal with that in userspace and upper layers? If that's true,
> that's a bigger problem in general and something we should work on
> trying to solve. If it's not true, then there should be some method to
> deal with the FW register already (like capabilities).

Capabilities can certainly be added for anything that needs them, including
a new sysreg. QEMU just currently manages the sysregs as an array,
without concern for which ones are old and which ones are new. Of course
that can be changed. It may not even be that difficult to do; we just
need to filter registers before adding the array to the migration
stream.

>
> Given the urgency of adding mitigation towards variant 2 which is the
> driver for this work, I think we should drop the compat functionality in
> this series and work this out later on if needed. I think we can just
> tweak the previous patch to enable PSCI 1.0 by default and drop this
> patch for the current merge window.

I agree. If we aren't planning to wait for the userspace changes, then I
guess we don't need to wait for a decision on how to do them yet either.

Thanks,
drew
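
The register filtering mentioned above ("filter registers before adding
the array to the migration stream") could look roughly like the C sketch
below. This is not QEMU code. It assumes the register list is a plain array
of 64-bit IDs and that the set of IDs the destination understands is known
in advance; `filter_regs` and `allowed` are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU or KVM function): copy into 'out' only
 * those register IDs from 'regs' that also appear in 'allowed', preserving
 * their order. 'out' must have room for 'nregs' entries.
 * Returns the number of IDs kept.
 */
size_t filter_regs(const uint64_t *regs, size_t nregs,
                   const uint64_t *allowed, size_t nallowed,
                   uint64_t *out)
{
    size_t kept = 0;

    assert(out != NULL);
    for (size_t i = 0; i < nregs; i++) {
        for (size_t j = 0; j < nallowed; j++) {
            if (regs[i] == allowed[j]) {
                out[kept++] = regs[i];
                break;      /* each source ID is emitted at most once */
            }
        }
    }
    return kept;
}
```

The quadratic scan keeps the sketch short; a real implementation with many
registers would more likely sort the allowed set or use a hash table.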

2018-02-05 09:59:24

by Andrew Jones

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Mon, Feb 05, 2018 at 09:24:33AM +0000, Marc Zyngier wrote:
> On 04/02/18 12:37, Christoffer Dall wrote:
> > On Sat, Feb 03, 2018 at 11:59:32AM +0000, Marc Zyngier wrote:
> >> On Fri, 2 Feb 2018 21:17:06 +0100
> >> Andrew Jones <[email protected]> wrote:
> >>
> >>> On Thu, Feb 01, 2018 at 11:46:47AM +0000, Marc Zyngier wrote:
> >>>> Although we've implemented PSCI 1.0 and 1.1, nothing can select them.
> >>>> Since all the new PSCI versions are backward compatible, we decide to
> >>>> default to the latest version of the PSCI implementation. This is no
> >>>> different from doing a firmware upgrade on KVM.
> >>>>
> >>>> But in order to give a chance to hypothetical badly implemented guests
> >>>> that would have a fit by discovering something other than PSCI 0.2,
> >>>> let's provide a new API that allows userspace to pick one particular
> >>>> version of the API.
> >>>>
> >>>> This is implemented as a new class of "firmware" registers, where
> >>>> we expose the PSCI version. This allows the PSCI version to be
> >>>> save/restored as part of a guest migration, and also set to
> >>>> any supported version if the guest requires it.
> >>>>
> >>>> Signed-off-by: Marc Zyngier <[email protected]>
> >>>> ---
> >>>> Documentation/virtual/kvm/api.txt | 3 +-
> >>>> Documentation/virtual/kvm/arm/psci.txt | 30 +++++++++++++++
> >>>> arch/arm/include/asm/kvm_host.h | 3 ++
> >>>> arch/arm/include/uapi/asm/kvm.h | 6 +++
> >>>> arch/arm/kvm/guest.c | 13 +++++++
> >>>> arch/arm64/include/asm/kvm_host.h | 3 ++
> >>>> arch/arm64/include/uapi/asm/kvm.h | 6 +++
> >>>> arch/arm64/kvm/guest.c | 14 ++++++-
> >>>> include/kvm/arm_psci.h | 9 +++++
> >>>> virt/kvm/arm/psci.c | 68 +++++++++++++++++++++++++++++++++-
> >>>> 10 files changed, 151 insertions(+), 4 deletions(-)
> >>>> create mode 100644 Documentation/virtual/kvm/arm/psci.txt
> >>>>
> >>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >>>> index 57d3ee9e4bde..334905202141 100644
> >>>> --- a/Documentation/virtual/kvm/api.txt
> >>>> +++ b/Documentation/virtual/kvm/api.txt
> >>>> @@ -2493,7 +2493,8 @@ Possible features:
> >>>> and execute guest code when KVM_RUN is called.
> >>>> - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
> >>>> Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
> >>>> - - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
> >>>> + - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
> >>>> + backward compatible with v0.2) for the CPU.
> >>>> Depends on KVM_CAP_ARM_PSCI_0_2.
> >>>> - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
> >>>> Depends on KVM_CAP_ARM_PMU_V3.
> >>>> diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
> >>>> new file mode 100644
> >>>> index 000000000000..aafdab887b04
> >>>> --- /dev/null
> >>>> +++ b/Documentation/virtual/kvm/arm/psci.txt
> >>>> @@ -0,0 +1,30 @@
> >>>> +KVM implements the PSCI (Power State Coordination Interface)
> >>>> +specification in order to provide services such as CPU on/off, reset
> >>>> +and power-off to the guest.
> >>>> +
> >>>> +The PSCI specification is regularly updated to provide new features,
> >>>> +and KVM implements these updates if they make sense from a virtualization
> >>>> +point of view.
> >>>> +
> >>>> +This means that a guest booted on two different versions of KVM can
> >>>> +observe two different "firmware" revisions. This could cause issues if
> >>>> +a given guest is tied to a particular PSCI revision (unlikely), or if
> >>>> +a migration causes a different PSCI version to be exposed out of the
> >>>> +blue to an unsuspecting guest.
> >>>> +
> >>>> +In order to remedy this situation, KVM exposes a set of "firmware
> >>>> +pseudo-registers" that can be manipulated using the GET/SET_ONE_REG
> >>>> +interface. These registers can be saved/restored by userspace, and set
> >>>> +to a convenient value if required.
> >>>> +
> >>>> +The following register is defined:
> >>>> +
> >>>> +* KVM_REG_ARM_PSCI_VERSION:
> >>>> +
> >>>> + - Only valid if the vcpu has the KVM_ARM_VCPU_PSCI_0_2 feature set
> >>>> + (and thus has already been initialized)
> >>>> + - Returns the current PSCI version on GET_ONE_REG (defaulting to the
> >>>> + highest PSCI version implemented by KVM and compatible with v0.2)
> >>>> + - Allows any PSCI version implemented by KVM and compatible with
> >>>> + v0.2 to be set with SET_ONE_REG
> >>>> + - Affects the whole VM (even if the register view is per-vcpu)
> >>>
> >>
> >> Hi Drew,
> >>
> >> Thanks for looking into this, and for the exhaustive data.
> >>
> >>>
> >>> I've put some more thought and experimentation into this. I think we
> >>> should change to a vcpu feature bit. The feature bit would be used to
> >>> force compat mode, v0.2, so KVM would still enable the new PSCI
> >>> version by default. Below are two tables describing why I think we
> >>> should switch to something other than a new sysreg, and below those
> >>> tables are notes as to why I think we should use a vcpu feature. The
> >>> asterisks in the tables point out behaviors that aren't what we want.
> >>> While both tables have an asterisk, the sysreg approach's issue is a
> >>> bug. The vcpu feature approach's issue is a risk incurred from an
> >>> unsupported migration, albeit one that is hard to detect without a
> >>> new machine type.
> >>>
> >>> +-----------------------------------------------------------------------+
> >>> |                            sysreg approach                            |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | migration        | userspace | works | notes                          |
> >>> |                  | change    |       |                                |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | new -> new       | NO        | YES   | Expected                       |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | old -> new       | NO        | YES   | PSCI 1.0 is backward compatible|
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | new -> old       | NO        | NO    | Migration fails due to the new |
> >>> |                  |           |       | sysreg. Migration shouldn't    |
> >>> |                  |           |       | have been attempted, but no    |
> >>> |                  |           |       | way to know without a new      |
> >>> |                  |           |       | machine type.                  |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | compat -> old    | YES       | NO*   | Even when setting PSCI version |
> >>> |                  |           |       | to 0.2, we add a new sysreg,   |
> >>> |                  |           |       | so migration will still fail.  |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | old -> compat    | YES       | YES   | It's OK for the destination to |
> >>> |                  |           |       | support more sysregs than the  |
> >>> |                  |           |       | source sends.                  |
> >>> +------------------+-----------+-------+--------------------------------+
> >>>
> >>>
> >>> +-----------------------------------------------------------------------+
> >>> |                         vcpu feature approach                         |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | migration        | userspace | works | notes                          |
> >>> |                  | change    |       |                                |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | new -> new       | NO        | YES   | Expected                       |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | old -> new       | NO        | YES   | PSCI 1.0 is backward compatible|
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | new -> old       | NO        | YES*  | Migrates, but it's not safe    |
> >>> |                  |           |       | for the guest kernel, and no   |
> >>> |                  |           |       | way to know without a new      |
> >>> |                  |           |       | machine type.                  |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | compat -> old    | YES       | YES   | Expected                       |
> >>> +------------------+-----------+-------+--------------------------------+
> >>> | old -> compat    | YES       | YES   | Expected                       |
> >>> +------------------+-----------+-------+--------------------------------+
> >>>
> >>>
> >>> Notes as to why the vcpu feature approach was selected:
> >>>
> >>> 1) While this is VM state, and thus a VM control would be a more natural
> >>> fit, a new vcpu feature bit would be much less new code. We also
> >>> already have a PSCI vcpu feature bit, so a new one will actually fit
> >>> quite well.
> >>>
> >>> 2) No new state needs to be migrated, as we already migrate the feature
> >>> bitmap. Unlike KVM, QEMU doesn't track the max number of features,
> >>> so bumping it one more in KVM doesn't require a QEMU change.
> >>>
> >>>
> >>> If we switch to a vcpu feature bit, then I think this patch can be
> >>> replaced with something like this
> >>
> >> A couple of remarks:
> >>
> >> - My worry with this feature bit is that it is a point fix, and it
> >> doesn't scale. Come PSCI 1.2 and WORKAROUND_2, what do we do? Add
> >> another feature bit that says "force to 1.0"? I'd really like
> >> something we can live with in the long run, and "KVM as firmware"
> >> needs to be able to evolve without requiring a new userspace
> >> interface each time we rev it.
> >>
> >> - The "compat->old" entry in your sysreg table is not quite fair. In
> >> the feature table, you teach userspace about the new feature bit. You
> >> could just as well teach userspace about the new sysreg. Yes, things
> >> may be different in QEMU, but that's not what we're talking about
> >> here.
> >>
> >> - Allowing a guest to migrate in an unsafe way seems worse than failing
> >> a migration unexpectedly. Or at least not any better.
> >>
> >> To be clear: I'm not dismissing the idea at all, but I want to make sure
> >> we're not cornering ourselves into an uncomfortable place.
> >>
> >> Christoffer, Peter, what are your thoughts on this?
> >>
> >
> > Taking a step back, the only reasons why this patch isn't simply
> > enabling PSCI v1.0 by default (without any selection method) are that we
> > (1) want to support guests that complain about PSCI_VERSION != 0.2
> > (which isn't completely outside the realm of a reasonable implementation
> > if you read the description of PSCI_VERSION in the 0.2 spec) and (2) to
> > provide migration support for guests that call
> > PSCI_1_0_FN_PSCI_FEATURES.
> >
> > If we ignore (1) because we don't know of any guests where this is an
> > issue, then it's all about (2), migration from "new -> old".
> >
> > As far as I can tell, the use case we are worried about here is updating
> > the kernel (and not QEMU) on half of your data center and then trying to
> > migrate from the upgraded kernel machine to a legacy (and potentially
> > variant 2 vulnerable) machine. For this specific move from PSCI 0.2 to
> > 1.0 with the included mitigation, I don't really think this is an
> > important use case to support.
>
> I'm not so sure. Promising mitigation to a guest, and then seeing that
> mitigation being silently taken away because we've allowed it to migrate
> seems bad to me.
>
> > In terms of the more general approach to "KVM firmware upgrades" and
> > migration, I think something like the proposed FW register interface
> > here should work, but I'm concerned about the lack of opportunity from
> > userspace to predict a migration failure. But I don't understand why
>
> Userspace could predict some of the failure cases, if only by checking
> that all registers can be restored in a new guest. I'm not sure how
> viable this is in a data centre type of environment.
>
> > this requires a new machine type? Why can't we simply provide a KVM
> > capability that libvirt etc. can query for?
> >
> > Also, is it generally true that we can't expose any additional system
> > registers from KVM without breaking migration and we don't have any
> > method to deal with that in userspace and upper layers? If that's true,
>
> It is my understanding that each time we add a new sysreg to KVM,
> migration in QEMU breaks in the new->old direction.
>
> > that's a bigger problem in general and something we should work on
> > trying to solve. If it's not true, then there should be some method to
> > deal with the FW register already (like capabilities).
> >
> > Given the urgency of adding mitigation towards variant 2 which is the
> > driver for this work, I think we should drop the compat functionality in
> > this series and work this out later on if needed. I think we can just
> > tweak the previous patch to enable PSCI 1.0 by default and drop this
> > patch for the current merge window.
>
> I'd be fine with that, as long as we have a clear agreement on the
> impact of such a move.

Yeah, that's what I was trying to figure out with my fancy tables. I might
be coming around more to your approach now, though. Ensuring the new->old
migration fails is a nice feature of this series. It would be good if
we could preserve that behavior without committing to a new userspace
interface, but I'm not sure how. Maybe I should just apologize for the
noise, and this patch be left as is...

Thanks,
drew
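
The "new -> old" and "old -> compat" rows of the sysreg table, and the
suggestion that userspace could predict failures by "checking that all
registers can be restored in a new guest", both come down to a subset
check: every register the source sends must exist on the destination,
while extra registers on the destination are harmless. A minimal C sketch
of that check follows. `migration_can_restore` and the register IDs a
caller passes in are hypothetical; how a management layer would actually
obtain the two lists is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical predictor (not a QEMU or libvirt function): returns true
 * when every register ID the source would send is also known to the
 * destination. Registers that exist only on the destination are ignored.
 */
bool migration_can_restore(const uint64_t *src, size_t nsrc,
                           const uint64_t *dst, size_t ndst)
{
    assert(src != NULL || nsrc == 0);
    assert(dst != NULL || ndst == 0);

    for (size_t i = 0; i < nsrc; i++) {
        bool found = false;

        for (size_t j = 0; j < ndst; j++) {
            if (src[i] == dst[j]) {
                found = true;
                break;
            }
        }
        if (!found)
            return false;   /* destination cannot restore this register */
    }
    return true;
}
```

With a "new" host that exposes one extra firmware register, the check
fails in the new -> old direction and passes in the old -> new direction,
matching the table.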

2018-02-05 10:20:42

by Christoffer Dall

Subject: Re: [PATCH v3 12/18] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling

On Mon, Feb 05, 2018 at 09:08:31AM +0000, Marc Zyngier wrote:
> On 04/02/18 18:39, Christoffer Dall wrote:
> > On Thu, Feb 01, 2018 at 11:46:51AM +0000, Marc Zyngier wrote:
> >> We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible.
> >> So let's intercept it as early as we can by testing for the
> >> function call number as soon as we've identified a HVC call
> >> coming from the guest.
> >
> > Hmmm. How often is this expected to happen and what is the expected
> > extra cost of doing the early-exit handling in the C code vs. here?
>
> Pretty often. On each context switch of a Linux guest, for example. It
> is almost as bad as if we were trapping all VM ops. Moving it to C is
> definitely visible on something like hackbench (I remember something
> like a 10-12% degradation on Seattle, but I'd need to rerun the tests to
> give you something accurate).

If it's that easily visible (although hackbench is clearly the
pathological case here), then we should try to optimize it. Let's hope
we don't have to add too many of these workarounds in the future.

> It is the whole GPR save/restore dance
> that costs us a lot (31 registers for the guest, 12 for the host), plus
> some of the extra SError synchronization, which doesn't come for free either.
>

Fair enough.

> > I think we'd be better off if we only had a single early-exit path (and
> > we should move the FP/SIMD trap to that path as well), but if there's a
> > measurable benefit of having this logic in assembly as opposed to in the
> > C code, then I'm ok with this as well.
>
> I agree that the multiplication of "earlier than early" paths is
> becoming annoying. Moving the FP/SIMD stuff to C would be less
> problematic, as we have patches to move some of that to load/put, and
> we'd only take the trap once per time slice (as opposed to once per
> entry at the moment).

Yes, and we can even improve on that (see separate discussions around
KVM support for SVE with Dave).

>
> Here, we're trying hard to do exactly nothing, because each instruction
> is just an extra overhead (we've already nuked the BP). I even
> considered inserting that code as part of the per-CPU-type vectors (and
> leave the rest of the KVM code alone), but it felt like a step too far.
>

We can always look at adjusting this more in the future if we want.

Reviewed-by: Christoffer Dall <[email protected]>

2018-02-05 10:45:40

by Marc Zyngier

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On 05/02/18 09:58, Andrew Jones wrote:
> On Mon, Feb 05, 2018 at 09:24:33AM +0000, Marc Zyngier wrote:
>> On 04/02/18 12:37, Christoffer Dall wrote:

[...]

>>> Given the urgency of adding mitigation towards variant 2 which is the
>>> driver for this work, I think we should drop the compat functionality in
>>> this series and work this out later on if needed. I think we can just
>>> tweak the previous patch to enable PSCI 1.0 by default and drop this
>>> patch for the current merge window.
>>
>> I'd be fine with that, as long as we have a clear agreement on the
>> impact of such a move.
>
> Yeah, that's what I was trying to figure out with my fancy tables. I might
> be coming around more to your approach now, though. Ensuring the new->old
> migration fails is a nice feature of this series. It would be good if
> we could preserve that behavior without committing to a new userspace
> interface, but I'm not sure how. Maybe I should just apologize for the
> noise, and this patch be left as is...

How about we don't decide now?

I can remove this patch from the series so that the core stuff can make
it into the arm64 tree ASAP (I think Catalin wants to queue something
early this week so that it can hit Linus' tree before the end of the
merge window), and then repost this single patch on its own (with fixes
for the things that Christoffer found in his review) after -rc1.

It leaves us time to haggle over the userspace ABI (which is
realistically not going to affect anyone), and we get the core stuff in
place for SoC vendors to start updating their firmware.

Thoughts?

M.
--
Jazz is not dead. It just smells funny...

2018-02-05 10:52:19

by Christoffer Dall

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On Mon, Feb 05, 2018 at 10:42:44AM +0000, Marc Zyngier wrote:
> On 05/02/18 09:58, Andrew Jones wrote:
> > On Mon, Feb 05, 2018 at 09:24:33AM +0000, Marc Zyngier wrote:
> >> On 04/02/18 12:37, Christoffer Dall wrote:
>
> [...]
>
> >>> Given the urgency of adding mitigation towards variant 2 which is the
> >>> driver for this work, I think we should drop the compat functionality in
> >>> this series and work this out later on if needed. I think we can just
> >>> tweak the previous patch to enable PSCI 1.0 by default and drop this
> >>> patch for the current merge window.
> >>
> >> I'd be fine with that, as long as we have a clear agreement on the
> >> impact of such a move.
> >
> > Yeah, that's what I was trying to figure out with my fancy tables. I might
> > be coming around more to your approach now, though. Ensuring the new->old
> > migration fails is a nice feature of this series. It would be good if
> > we could preserve that behavior without committing to a new userspace
> > interface, but I'm not sure how. Maybe I should just apologize for the
> > noise, and this patch be left as is...
>
> How about we don't decide now?
>
> I can remove this patch from the series so that the core stuff can make
> it into the arm64 tree ASAP (I think Catalin wants to queue something
> early this week so that it can hit Linus' tree before the end of the
> merge window), and then repost this single patch on its own (with fixes
> for the things that Christoffer found in his review) after -rc1.
>
> It leaves us time to haggle over the userspace ABI (which is
> realistically not going to affect anyone), and we get the core stuff in
> place for SoC vendors to start updating their firmware.
>
I agree, that's what I tried to suggest in my e-mail as well. Just
remember to tweak the previous patch to actually enable PSCI 1.0 by
default.

Thanks,
-Christoffer

2018-02-05 11:11:52

by Marc Zyngier

Subject: Re: [PATCH v3 08/18] arm/arm64: KVM: Add PSCI version selection API

On 05/02/18 10:50, Christoffer Dall wrote:
> On Mon, Feb 05, 2018 at 10:42:44AM +0000, Marc Zyngier wrote:
>> On 05/02/18 09:58, Andrew Jones wrote:
>>> On Mon, Feb 05, 2018 at 09:24:33AM +0000, Marc Zyngier wrote:
>>>> On 04/02/18 12:37, Christoffer Dall wrote:
>>
>> [...]
>>
>>>>> Given the urgency of adding mitigation towards variant 2 which is the
>>>>> driver for this work, I think we should drop the compat functionality in
>>>>> this series and work this out later on if needed. I think we can just
>>>>> tweak the previous patch to enable PSCI 1.0 by default and drop this
>>>>> patch for the current merge window.
>>>>
>>>> I'd be fine with that, as long as we have a clear agreement on the
>>>> impact of such a move.
>>>
>>> Yeah, that's what I was trying to figure out with my fancy tables. I might
>>> be coming around more to your approach now, though. Ensuring the new->old
>>> migration fails is a nice feature of this series. It would be good if
>>> we could preserve that behavior without committing to a new userspace
>>> interface, but I'm not sure how. Maybe I should just apologize for the
>>> noise, and this patch be left as is...
>>
>> How about we don't decide now?
>>
>> I can remove this patch from the series so that the core stuff can make
>> it into the arm64 tree ASAP (I think Catalin wants to queue something
>> early this week so that it can hit Linus' tree before the end of the
>> merge window), and then repost this single patch on its own (with fixes
>> for the things that Christoffer found in his review) after -rc1.
>>
>> It leaves us time to haggle over the userspace ABI (which is
>> realistically not going to affect anyone), and we get the core stuff in
>> place for SoC vendors to start updating their firmware.
>>
> I agree, that's what I tried to suggest in my e-mail as well. Just
> remember to tweak the previous patch to actually enable PSCI 1.0 by
> default.

Yup. I'll move the KVM_ARM_PSCI_LATEST hunk to that patch, and return it
unconditionally from kvm_psci_version.

M.
--
Jazz is not dead. It just smells funny...
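
A minimal sketch of what "return it unconditionally" could amount to once
the selection API is dropped: with no userspace-chosen version left to
consult, a vcpu that has the PSCI v0.2 feature bit simply reports the
latest version. Only the names KVM_ARM_PSCI_LATEST and kvm_psci_version
come from the thread; everything below, including the `SKETCH_` macros, the
numeric values (which follow the PSCI_VERSION return encoding, major in
bits [31:16] and minor in bits [15:0]) and the assumption that the
feature-bit check stays in place, is illustrative and may not match the
values KVM uses internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PSCI_VERSION return encoding: major in [31:16], minor in [15:0]. */
#define SKETCH_PSCI_VER(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))

#define SKETCH_PSCI_0_1     SKETCH_PSCI_VER(0, 1)
#define SKETCH_PSCI_1_0     SKETCH_PSCI_VER(1, 0)
#define SKETCH_PSCI_LATEST  SKETCH_PSCI_1_0

static_assert(SKETCH_PSCI_1_0 == 0x10000u, "PSCI 1.0 encodes as 0x10000");

/*
 * Hypothetical stand-in for kvm_psci_version(): no per-VM version is
 * stored, so the answer depends only on whether the vcpu was created
 * with the PSCI v0.2 feature bit (assumed here to remain the gate).
 */
uint32_t psci_version_sketch(bool vcpu_has_psci_0_2_feature)
{
    if (vcpu_has_psci_0_2_feature)
        return SKETCH_PSCI_LATEST;

    return SKETCH_PSCI_0_1;
}
```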