2018-01-06 07:58:11

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 0/7] Handle guest RAS Error in KVM and kernel

This series patches mainly do below things:

1. Trap guest RAS ERR* registers accesses to EL2 from Non-secure EL1,
KVM will will do a minimum simulation, these registers are simulated
to RAZ/WI in KVM.
2. Route guest synchronous External Abort to EL2. If it is also routed
to EL3 by firmware at the same time, system will trap to EL3 firmware instead
of EL2 KVM, then firmware judges whether EL2 routing is enabled, if enabled,
jump back to EL2 KVM, otherwise jump back to EL1 host kernel.
3. Enable APEI ARv8 SEI notification to parse the CPER records for SError
in the ACPI GHES driver, KVM will call handle_guest_sei() to let ACPI
driver to parse the CPER recorded for SError which happened in the guest
4. If ACPI driver parsed the CPER record failed, KVM will classify the Error
through Exception Syndrome Register and do different approaches according
to Asynchronous Error Type
5. If the guest RAS SError is not propagated and not consumed, this exception
is precise, we temporarily shut down the VM to isolate the error. For other
Asynchronous Error Type, KVM directly injects virtual SError with IMPLEMENTATION
DEFINED ESR or KVM panic if the error is fatal. For the RAS extension, guest
virtual ESR must be set, because all-zero means 'RAS error: Uncategorized' instead
of 'no valid ISS', so set this ESR to IMPLEMENTATION DEFINED by default if user space
does not specify it.

change since v8:
1. update the patch [1/7] and [2/7] to align this serie.
https://www.spinics.net/lists/arm-kernel/msg623513.html
https://www.spinics.net/lists/arm-kernel/msg623520.html
2. In kvm ,check handle_guest_sei()'s return value. If this function return true, stop
classifying errors.
3. Temporarily shut down the VM to isolate the error for recoverable error (UER)
4. update some patch's commit messages and clean some patches

Dongjiu Geng (5):
acpi: apei: Add SEI notification type support for ARMv8
KVM: arm64: Trap RAS error registers and set HCR_EL2's TERR & TEA
arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl
arm64: kvm: Set Virtual SError Exception Syndrome for guest
arm64: kvm: handle guest SError Interrupt by categorization

James Morse (1):
KVM: arm64: Save ESR_EL2 on guest SError

Xie XiuQi (1):
arm64: cpufeature: Detect CPU RAS Extentions

Documentation/virtual/kvm/api.txt | 11 ++++++
arch/arm/include/asm/kvm_host.h | 1 +
arch/arm/kvm/guest.c | 9 +++++
arch/arm64/Kconfig | 16 +++++++++
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/esr.h | 11 ++++++
arch/arm64/include/asm/kvm_arm.h | 2 ++
arch/arm64/include/asm/kvm_emulate.h | 17 +++++++++
arch/arm64/include/asm/kvm_host.h | 2 ++
arch/arm64/include/asm/sysreg.h | 15 ++++++++
arch/arm64/include/asm/system_misc.h | 1 +
arch/arm64/kernel/cpufeature.c | 13 +++++++
arch/arm64/kvm/guest.c | 14 ++++++++
arch/arm64/kvm/handle_exit.c | 68 +++++++++++++++++++++++++++++++++---
arch/arm64/kvm/hyp/switch.c | 25 +++++++++++--
arch/arm64/kvm/inject_fault.c | 13 ++++++-
arch/arm64/kvm/reset.c | 3 ++
arch/arm64/kvm/sys_regs.c | 10 ++++++
arch/arm64/mm/fault.c | 16 +++++++++
drivers/acpi/apei/Kconfig | 15 ++++++++
drivers/acpi/apei/ghes.c | 53 ++++++++++++++++++++++++++++
include/acpi/ghes.h | 1 +
include/uapi/linux/kvm.h | 3 ++
virt/kvm/arm/arm.c | 7 ++++
24 files changed, 320 insertions(+), 9 deletions(-)

--
1.9.1


2018-01-06 07:58:13

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 1/7] arm64: cpufeature: Detect CPU RAS Extentions

From: Xie XiuQi <[email protected]>

ARM's v8.2 Extentions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.

Signed-off-by: Xie XiuQi <[email protected]>
[Rebased added config option, reworded commit message]
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Dongjiu Geng <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
arch/arm64/Kconfig | 16 ++++++++++++++++
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/sysreg.h | 2 ++
arch/arm64/kernel/cpufeature.c | 13 +++++++++++++
4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..cc00d10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -973,6 +973,22 @@ config ARM64_PMEM
operations if DC CVAP is not supported (following the behaviour of
DC CVAP itself if the system does not define a point of persistence).

+config ARM64_RAS_EXTN
+ bool "Enable support for RAS CPU Extensions"
+ default y
+ help
+ CPUs that support the Reliability, Availability and Serviceability
+ (RAS) Extensions, part of ARMv8.2 are able to track faults and
+ errors, classify them and report them to software.
+
+ On CPUs with these extensions system software can use additional
+ barriers to determine if faults are pending and read the
+ classification from a new set of registers.
+
+ Selecting this feature will allow the kernel to use these barriers
+ and access the new registers if the system supports the extension.
+ Platform RAS features may additionally depend on firmware support.
+
endmenu

config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da6216..4820d44 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
#define ARM64_WORKAROUND_858921 19
#define ARM64_WORKAROUND_CAVIUM_30115 20
#define ARM64_HAS_DCPOP 21
+#define ARM64_HAS_RAS_EXTN 22

-#define ARM64_NCAPS 22
+#define ARM64_NCAPS 23

#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index f707fed..64e2a80 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -332,6 +332,7 @@
#define ID_AA64ISAR1_DPB_SHIFT 0

/* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT 28
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_ASIMD_SHIFT 20
#define ID_AA64PFR0_FP_SHIFT 16
@@ -340,6 +341,7 @@
#define ID_AA64PFR0_EL1_SHIFT 4
#define ID_AA64PFR0_EL0_SHIFT 0

+#define ID_AA64PFR0_RAS_V1 0x1
#define ID_AA64PFR0_FP_NI 0xf
#define ID_AA64PFR0_FP_SUPPORTED 0x0
#define ID_AA64PFR0_ASIMD_NI 0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 21e2c95..4846974 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -125,6 +125,7 @@ static int __init register_cpu_hwcaps_dumper(void)
};

static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -900,6 +901,18 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
.min_field_value = 1,
},
#endif
+#ifdef CONFIG_ARM64_RAS_EXTN
+ {
+ .desc = "RAS Extension Support",
+ .capability = ARM64_HAS_RAS_EXTN,
+ .def_scope = SCOPE_SYSTEM,
+ .matches = has_cpuid_feature,
+ .sys_reg = SYS_ID_AA64PFR0_EL1,
+ .sign = FTR_UNSIGNED,
+ .field_pos = ID_AA64PFR0_RAS_SHIFT,
+ .min_field_value = ID_AA64PFR0_RAS_V1,
+ },
+#endif /* CONFIG_ARM64_RAS_EXTN */
{},
};

--
1.9.1

2018-01-06 07:58:16

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 4/7] KVM: arm64: Trap RAS error registers and set HCR_EL2's TERR & TEA

ARMv8.2 adds a new bit HCR_EL2.TEA which routes synchronous external
aborts to EL2, and adds a trap control bit HCR_EL2.TERR which traps
all Non-secure EL1&0 error record accesses to EL2.

This patch enables the two bits for the guest OS, guaranteeing that
KVM takes external aborts and traps attempts to access the physical
error registers.

ERRIDR_EL1 advertises the number of error records, we return
zero meaning we can treat all the other registers as RAZ/WI too.

Signed-off-by: Dongjiu Geng <[email protected]>
[removed specific emulation, use trap_raz_wi() directly for everything,
rephrased parts of the commit message]
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/include/asm/kvm_arm.h | 2 ++
arch/arm64/include/asm/kvm_emulate.h | 7 +++++++
arch/arm64/include/asm/sysreg.h | 10 ++++++++++
arch/arm64/kvm/sys_regs.c | 10 ++++++++++
4 files changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 61d694c..1188272 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,8 @@
#include <asm/types.h>

/* Hyp Configuration Register (HCR) bits */
+#define HCR_TEA (UL(1) << 37)
+#define HCR_TERR (UL(1) << 36)
#define HCR_E2H (UL(1) << 34)
#define HCR_ID (UL(1) << 33)
#define HCR_CD (UL(1) << 32)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index e5df3fc..555b28b 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -47,6 +47,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
if (is_kernel_in_hyp_mode())
vcpu->arch.hcr_el2 |= HCR_E2H;
+ if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+ /* route synchronous external abort exceptions to EL2 */
+ vcpu->arch.hcr_el2 |= HCR_TEA;
+ /* trap error record accesses */
+ vcpu->arch.hcr_el2 |= HCR_TERR;
+ }
+
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
vcpu->arch.hcr_el2 &= ~HCR_RW;
}
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 64e2a80..47b967d 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -169,6 +169,16 @@
#define SYS_AFSR0_EL1 sys_reg(3, 0, 5, 1, 0)
#define SYS_AFSR1_EL1 sys_reg(3, 0, 5, 1, 1)
#define SYS_ESR_EL1 sys_reg(3, 0, 5, 2, 0)
+
+#define SYS_ERRIDR_EL1 sys_reg(3, 0, 5, 3, 0)
+#define SYS_ERRSELR_EL1 sys_reg(3, 0, 5, 3, 1)
+#define SYS_ERXFR_EL1 sys_reg(3, 0, 5, 4, 0)
+#define SYS_ERXCTLR_EL1 sys_reg(3, 0, 5, 4, 1)
+#define SYS_ERXSTATUS_EL1 sys_reg(3, 0, 5, 4, 2)
+#define SYS_ERXADDR_EL1 sys_reg(3, 0, 5, 4, 3)
+#define SYS_ERXMISC0_EL1 sys_reg(3, 0, 5, 5, 0)
+#define SYS_ERXMISC1_EL1 sys_reg(3, 0, 5, 5, 1)
+
#define SYS_FAR_EL1 sys_reg(3, 0, 6, 0, 0)
#define SYS_PAR_EL1 sys_reg(3, 0, 7, 4, 0)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2e070d3..2b1fafa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -953,6 +953,16 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
+
+ { SYS_DESC(SYS_ERRIDR_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERRSELR_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXFR_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXCTLR_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXSTATUS_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXADDR_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
+ { SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
+
{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },

--
1.9.1

2018-01-06 07:58:17

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 2/7] KVM: arm64: Save ESR_EL2 on guest SError

From: James Morse <[email protected]>

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Signed-off-by: James Morse <[email protected]>
---
arch/arm64/kvm/hyp/switch.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c..fb5a538 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -228,11 +228,12 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)

static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
{
- u64 esr = read_sysreg_el2(esr);
- u8 ec = ESR_ELx_EC(esr);
+ u8 ec;
+ u64 esr;
u64 hpfar, far;

- vcpu->arch.fault.esr_el2 = esr;
+ esr = vcpu->arch.fault.esr_el2;
+ ec = ESR_ELx_EC(esr);

if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
return true;
@@ -313,6 +314,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
exit_code = __guest_enter(vcpu, host_ctxt);
/* And we're baaack! */

+ if (ARM_EXCEPTION_CODE(exit_code) != ARM_EXCEPTION_IRQ)
+ vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
/*
* We're using the raw exception code in order to only process
* the trap if no SError is pending. We will come back to the
--
1.9.1

2018-01-06 07:58:15

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

ARMv8.2 requires implementation of the RAS extension. In
this extension, it adds SEI(SError Interrupt) notification
type, this patch adds new GHES error source SEI handling
functions. This error source parsing and handling method
is similar with the SEA.

Expose API ghes_notify_sei() to external users. External
modules can call this exposed API to parse APEI table and
handle the SEI notification.

Signed-off-by: Dongjiu Geng <[email protected]>
---
drivers/acpi/apei/Kconfig | 15 ++++++++++++++
drivers/acpi/apei/ghes.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++
include/acpi/ghes.h | 1 +
3 files changed, 69 insertions(+)

diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 52ae543..ff4afc3 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -55,6 +55,21 @@ config ACPI_APEI_SEA
option allows the OS to look for such hardware error record, and
take appropriate action.

+config ACPI_APEI_SEI
+ bool "APEI SError(System Error) Interrupt logging/recovering support"
+ depends on ARM64 && ACPI_APEI_GHES
+ default y
+ help
+ This option should be enabled if the system supports
+ firmware first handling of SEI (SError interrupt).
+
+ SEI happens with asynchronous external abort for errors on device
+ memory reads on ARMv8 systems. If a system supports firmware first
+ handling of SEI, the platform analyzes and handles hardware error
+ notifications from SEI, and it may then form a hardware error record for
+ the OS to parse and handle. This option allows the OS to look for
+ such hardware error record, and take appropriate action.
+
config ACPI_APEI_MEMORY_FAILURE
bool "APEI memory error recovering support"
depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6a3f824..67cd3a7 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -855,6 +855,46 @@ static inline void ghes_sea_add(struct ghes *ghes) { }
static inline void ghes_sea_remove(struct ghes *ghes) { }
#endif /* CONFIG_ACPI_APEI_SEA */

+#ifdef CONFIG_ACPI_APEI_SEI
+static LIST_HEAD(ghes_sei);
+
+/*
+ * Return 0 only if one of the SEI error sources successfully reported an error
+ * record sent from the firmware.
+ */
+int ghes_notify_sei(void)
+{
+ struct ghes *ghes;
+ int ret = -ENOENT;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(ghes, &ghes_sei, list) {
+ if (!ghes_proc(ghes))
+ ret = 0;
+ }
+ rcu_read_unlock();
+ return ret;
+}
+
+static void ghes_sei_add(struct ghes *ghes)
+{
+ mutex_lock(&ghes_list_mutex);
+ list_add_rcu(&ghes->list, &ghes_sei);
+ mutex_unlock(&ghes_list_mutex);
+}
+
+static void ghes_sei_remove(struct ghes *ghes)
+{
+ mutex_lock(&ghes_list_mutex);
+ list_del_rcu(&ghes->list);
+ mutex_unlock(&ghes_list_mutex);
+ synchronize_rcu();
+}
+#else /* CONFIG_ACPI_APEI_SEI */
+static inline void ghes_sei_add(struct ghes *ghes) { }
+static inline void ghes_sei_remove(struct ghes *ghes) { }
+#endif /* CONFIG_ACPI_APEI_SEI */
+
#ifdef CONFIG_HAVE_ACPI_APEI_NMI
/*
* printk is not safe in NMI context. So in NMI handler, we allocate
@@ -1086,6 +1126,13 @@ static int ghes_probe(struct platform_device *ghes_dev)
goto err;
}
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ if (!IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
+ pr_warn(GHES_PFX "Generic hardware error source: %d notified via SEI is not supported!\n",
+ generic->header.source_id);
+ goto err;
+ }
+ break;
case ACPI_HEST_NOTIFY_NMI:
if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
pr_warn(GHES_PFX "Generic hardware error source: %d notified via NMI interrupt is not supported!\n",
@@ -1158,6 +1205,9 @@ static int ghes_probe(struct platform_device *ghes_dev)
case ACPI_HEST_NOTIFY_SEA:
ghes_sea_add(ghes);
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ ghes_sei_add(ghes);
+ break;
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_add(ghes);
break;
@@ -1211,6 +1261,9 @@ static int ghes_remove(struct platform_device *ghes_dev)
case ACPI_HEST_NOTIFY_SEA:
ghes_sea_remove(ghes);
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ ghes_sei_remove(ghes);
+ break;
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_remove(ghes);
break;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 8feb0c8..9ba59e2 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -120,5 +120,6 @@ static inline void *acpi_hest_get_next(struct acpi_hest_generic_data *gdata)
section = acpi_hest_get_next(section))

int ghes_notify_sea(void);
+int ghes_notify_sei(void);

#endif /* GHES_H */
--
1.9.1

2018-01-06 07:59:05

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
guest and user space needs a way to tell KVM this value. So we add a
new ioctl. Before user space specifies the Exception Syndrome Register
ESR(ESR), it firstly checks that whether KVM has the capability to
set the guest ESR, If has, will set it. Otherwise, nothing to do.

For this ESR specifying, Only support for AArch64, not support AArch32.

Signed-off-by: Dongjiu Geng <[email protected]>
---
change the name to KVM_CAP_ARM_INJECT_SERROR_ESR instead of
XXXXX_ARM_RAS_EXTENSION, suggested here

https://patchwork.kernel.org/patch/9925203/
---
Documentation/virtual/kvm/api.txt | 11 +++++++++++
arch/arm/include/asm/kvm_host.h | 1 +
arch/arm/kvm/guest.c | 9 +++++++++
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/kvm/guest.c | 5 +++++
arch/arm64/kvm/reset.c | 3 +++
include/uapi/linux/kvm.h | 3 +++
virt/kvm/arm/arm.c | 7 +++++++
8 files changed, 40 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e63a35f..6dfb9fc 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4347,3 +4347,14 @@ This capability indicates that userspace can load HV_X64_MSR_VP_INDEX msr. Its
value is used to denote the target vcpu for a SynIC interrupt. For
compatibilty, KVM initializes this msr to KVM's internal vcpu index. When this
capability is absent, userspace can still query this msr's value.
+
+8.13 KVM_CAP_ARM_SET_SERROR_ESR
+
+Architectures: arm, arm64
+
+This capability indicates that userspace can specify syndrome value reported to
+guest OS when guest takes a virtual SError interrupt exception.
+If KVM has this capability, userspace can only specify the ISS field for the ESR
+syndrome, can not specify the EC field which is not under control by KVM.
+If this virtual SError is taken to EL1 using AArch64, this value will be reported
+into ISS filed of ESR_EL1
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6..6cf5c7b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -211,6 +211,7 @@ struct kvm_vcpu_stat {
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome);
unsigned long kvm_call_hyp(void *hypfn, ...);
void force_vm_exit(const cpumask_t *mask);

diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 1e0784e..1e15fa2 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -248,6 +248,15 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -EINVAL;
}

+/*
+ * we only support guest SError syndrome specifying
+ * for ARM64, not support it for ARM32.
+ */
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
+{
+ return -EINVAL;
+}
+
int __attribute_const__ kvm_target_cpu(void)
{
switch (read_cpuid_part()) {
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e923b58..769cc58 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -317,6 +317,7 @@ struct kvm_vcpu_stat {
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome);

#define KVM_ARCH_WANT_MMU_NOTIFIER
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5c7f657..738ae90 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -EINVAL;
}

+int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
+{
+ return -EINVAL;
+}
+
int __attribute_const__ kvm_target_cpu(void)
{
unsigned long implementor = read_cpuid_implementor();
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3256b92..38c8a64 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -77,6 +77,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PMU_V3:
r = kvm_arm_support_pmu_v3();
break;
+ case KVM_CAP_ARM_INJECT_SERROR_ESR:
+ r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
+ break;
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_VCPU_ATTRIBUTES:
r = 1;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7e99999..0c861c4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -931,6 +931,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_PPC_SMT_POSSIBLE 147
#define KVM_CAP_HYPERV_SYNIC2 148
#define KVM_CAP_HYPERV_VP_INDEX 149
+#define KVM_CAP_ARM_INJECT_SERROR_ESR 150

#ifdef KVM_CAP_IRQ_ROUTING

@@ -1357,6 +1358,8 @@ struct kvm_s390_ucas_mapping {
/* Available with KVM_CAP_S390_CMMA_MIGRATION */
#define KVM_S390_GET_CMMA_BITS _IOWR(KVMIO, 0xb8, struct kvm_s390_cmma_log)
#define KVM_S390_SET_CMMA_BITS _IOW(KVMIO, 0xb9, struct kvm_s390_cmma_log)
+/* Available with KVM_CAP_ARM_INJECT_SERROR_ESR */
+#define KVM_ARM_INJECT_SERROR_ESR _IOW(KVMIO, 0xba, __u32)

#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 95cba07..60dea5f 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1027,6 +1027,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
return -EFAULT;
return kvm_arm_vcpu_has_attr(vcpu, &attr);
}
+ case KVM_ARM_INJECT_SERROR_ESR: {
+ u32 syndrome;
+
+ if (copy_from_user(&syndrome, argp, sizeof(syndrome)))
+ return -EFAULT;
+ return kvm_arm_set_sei_esr(vcpu, &syndrome);
+ }
default:
return -EINVAL;
}
--
1.9.1

2018-01-06 07:59:04

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 7/7] arm64: kvm: handle guest SError Interrupt by categorization

If it is not RAS SError, directly inject virtual SError,
which will keep the old way, otherwise firstly let host
ACPI kernel driver to handle it. If the ACPI handling is
failed, KVM continues categorizing errors by the ESR_ELx.

For the recoverable error (UER), it has not been silently
propagated and has not (yet) been architecturally consumed
by the PE, the exception is precise. In order to make it
simple, we temporarily shut down the VM to isolate the error.

Signed-off-by: Dongjiu Geng <[email protected]>
---
change since v8:
1. Check handle_guest_sei()'s return value
2. Temporarily shut down the VM to isolate the error for the
recoverable error (UER)
3. Remove some unused macro definitions
---
arch/arm64/include/asm/esr.h | 11 ++++++
arch/arm64/include/asm/system_misc.h | 1 +
arch/arm64/kvm/handle_exit.c | 68 +++++++++++++++++++++++++++++++++---
arch/arm64/mm/fault.c | 16 +++++++++
4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 66ed8b6..a751e86 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -102,6 +102,7 @@
#define ESR_ELx_FSC_ACCESS (0x08)
#define ESR_ELx_FSC_FAULT (0x04)
#define ESR_ELx_FSC_PERM (0x0C)
+#define ESR_ELx_FSC_SERROR (0x11)

/* ISS field definitions for Data Aborts */
#define ESR_ELx_ISV_SHIFT (24)
@@ -119,6 +120,16 @@
#define ESR_ELx_CM_SHIFT (8)
#define ESR_ELx_CM (UL(1) << ESR_ELx_CM_SHIFT)

+/* ISS field definitions for SError interrupt */
+#define ESR_ELx_AET_SHIFT (10)
+#define ESR_ELx_AET (UL(0x7) << ESR_ELx_AET_SHIFT)
+/* Restartable error */
+#define ESR_ELx_AET_UEO (UL(2) << ESR_ELx_AET_SHIFT)
+/* Recoverable error */
+#define ESR_ELx_AET_UER (UL(3) << ESR_ELx_AET_SHIFT)
+/* Corrected error */
+#define ESR_ELx_AET_CE (UL(6) << ESR_ELx_AET_SHIFT)
+
/* ISS field definitions for exceptions taken in to Hyp */
#define ESR_ELx_CV (UL(1) << 24)
#define ESR_ELx_COND_SHIFT (20)
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 07aa8e3..9ee13ad 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -57,6 +57,7 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
})

int handle_guest_sea(phys_addr_t addr, unsigned int esr);
+int handle_guest_sei(void);

#endif /* __ASSEMBLY__ */

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7debb74..5b806d4 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -28,6 +28,7 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_psci.h>
+#include <asm/system_misc.h>

#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -178,6 +179,67 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
return arm_exit_handlers[hsr_ec];
}

+/**
+ * kvm_handle_guest_sei - handles SError interrupt or asynchronous aborts
+ * @vcpu: the VCPU pointer
+ * @run: access to the kvm_run structure for results
+ *
+ * For RAS SError interrupt, firstly let host kernel handle it. If handling
+ * failed, then categorize the error by the ESR
+ */
+static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+ unsigned int esr = kvm_vcpu_get_hsr(vcpu);
+ bool impdef_syndrome = esr & ESR_ELx_ISV; /* aka IDS */
+ unsigned int aet = esr & ESR_ELx_AET;
+
+ /*
+ * This is not RAS SError
+ */
+ if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+ kvm_inject_vabt(vcpu);
+ return 1;
+ }
+
+ /* For RAS the host kernel may handle this abort. */
+ if (!handle_guest_sei())
+ return 1;
+
+ /*
+ * In below two conditions, it will directly inject the
+ * virtual SError:
+ * 1. The Syndrome is IMPLEMENTATION DEFINED
+ * 2. It is Uncategorized SEI
+ */
+ if (impdef_syndrome ||
+ ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)) {
+ kvm_inject_vabt(vcpu);
+ return 1;
+ }
+
+ switch (aet) {
+ case ESR_ELx_AET_CE: /* corrected error */
+ case ESR_ELx_AET_UEO: /* restartable error, not yet consumed */
+ return 1; /* continue processing the guest exit */
+ case ESR_ELx_AET_UER: /* recoverable error */
+ /*
+ * the exception is precise, not been silently propagated
+ * and not been consumed by the CPU, temporarily shut down
+ * the VM to isolated the error, hope not touch it again.
+ */
+ run->exit_reason = KVM_EXIT_EXCEPTION;
+ return 0;
+ default:
+ /*
+ * Until now, the CPU supports RAS, SError interrupt is fatal
+ * and host does not successfully handle it.
+ */
+ panic("This Asynchronous SError interrupt is dangerous, panic");
+ }
+
+ return 0;
+}
+
/*
* Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
* proper exit to userspace.
@@ -201,8 +263,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
*vcpu_pc(vcpu) -= adj;
}

- kvm_inject_vabt(vcpu);
- return 1;
+ return kvm_handle_guest_sei(vcpu, run);
}

exception_index = ARM_EXCEPTION_CODE(exception_index);
@@ -211,8 +272,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
case ARM_EXCEPTION_IRQ:
return 1;
case ARM_EXCEPTION_EL1_SERROR:
- kvm_inject_vabt(vcpu);
- return 1;
+ return kvm_handle_guest_sei(vcpu, run);
case ARM_EXCEPTION_TRAP:
/*
* See ARM ARM B1.14.1: "Hyp traps on instructions
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index b64958b..8560672 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -728,6 +728,22 @@ int handle_guest_sea(phys_addr_t addr, unsigned int esr)
}

/*
+ * Handle SError interrupt that occurred in guest OS.
+ *
+ * The return value will be zero if the SEI was successfully handled
+ * and non-zero if handling is failed.
+ */
+int handle_guest_sei(void)
+{
+ int ret = -ENOENT;
+
+ if (IS_ENABLED(CONFIG_ACPI_APEI_SEI))
+ ret = ghes_notify_sei();
+
+ return ret;
+}
+
+/*
* Dispatch a data abort to the relevant handler.
*/
asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
--
1.9.1

2018-01-06 07:59:33

by Dongjiu Geng

[permalink] [raw]
Subject: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest

RAS Extension add a VSESR_EL2 register which can provide
the syndrome value reported to software on taking a virtual
SError interrupt exception. This patch supports to specify
this Syndrome.

In the RAS Extensions we can not set all-zero syndrome value
for SError, which means 'RAS error: Uncategorized' instead of
'no valid ISS'. So set it to IMPLEMENTATION DEFINED syndrome
by default.

We also need to support userspace to specify a valid syndrome
value, Because in some case, the recovery is driven by userspace.
This patch can support that userspace specify it.

In the guest/host world switch, restore this value to VSESR_EL2
only when HCR_EL2.VSE is set. This value no need to be saved
because it is stale vale when guest exit.

Signed-off-by: Dongjiu Geng <[email protected]>
[Set an impdef ESR for Virtual-SError]
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/include/asm/sysreg.h | 3 +++
arch/arm64/kvm/guest.c | 11 ++++++++++-
arch/arm64/kvm/hyp/switch.c | 16 ++++++++++++++++
arch/arm64/kvm/inject_fault.c | 13 ++++++++++++-
6 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 555b28b..73c84d0 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -155,6 +155,16 @@ static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
return vcpu->arch.fault.esr_el2;
}

+static inline u32 kvm_vcpu_get_vsesr(const struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.fault.vsesr_el2;
+}
+
+static inline void kvm_vcpu_set_vsesr(struct kvm_vcpu *vcpu, unsigned long val)
+{
+ vcpu->arch.fault.vsesr_el2 = val;
+}
+
static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
{
u32 esr = kvm_vcpu_get_hsr(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 769cc58..53d1d81 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -88,6 +88,7 @@ struct kvm_vcpu_fault_info {
u32 esr_el2; /* Hyp Syndrom Register */
u64 far_el2; /* Hyp Fault Address Register */
u64 hpfar_el2; /* Hyp IPA Fault Address Register */
+ u32 vsesr_el2; /* Virtual SError Exception Syndrome Register */
};

/*
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 47b967d..3b035cc 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -86,6 +86,9 @@
#define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4)
#define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3)

+/* virtual SError exception syndrome register */
+#define REG_VSESR_EL2 sys_reg(3, 4, 5, 2, 3)
+
#define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM | \
(!!x)<<8 | 0x1f)
#define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 738ae90..ffad42b 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -279,7 +279,16 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,

int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
{
- return -EINVAL;
+ u64 reg = *syndrome;
+
+ /* inject virtual system Error or asynchronous abort */
+ kvm_inject_vabt(vcpu);
+
+ if (reg)
+ /* set vsesr_el2[24:0] with value that user space specified */
+ kvm_vcpu_set_vsesr(vcpu, reg & ESR_ELx_ISS_MASK);
+
+ return 0;
}

int __attribute_const__ kvm_target_cpu(void)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index fb5a538..7f08a5d 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -67,6 +67,14 @@ static hyp_alternate_select(__activate_traps_arch,
__activate_traps_nvhe, __activate_traps_vhe,
ARM64_HAS_VIRT_HOST_EXTN);

+static void __hyp_text __sysreg_set_vsesr(struct kvm_vcpu *vcpu, u64 value)
+{
+ if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) &&
+ (value & HCR_VSE))
+ write_sysreg_s(kvm_vcpu_get_vsesr(vcpu), REG_VSESR_EL2);
+}
+
+
static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
@@ -86,6 +94,14 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
isb();
}
write_sysreg(val, hcr_el2);
+
+ /*
+ * If the virtual SError interrupt is taken to EL1 using AArch64,
+ * then VSESR_EL2 provides the syndrome value reported in ISS field
+ * of ESR_EL1.
+ */
+ __sysreg_set_vsesr(vcpu, val);
+
/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
write_sysreg(1 << 15, hstr_el2);
/*
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index 3556715..fb94b5e 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -246,14 +246,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
inject_undef64(vcpu);
}

+static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+{
+ kvm_vcpu_set_vsesr(vcpu, esr);
+ vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+}
+
/**
* kvm_inject_vabt - inject an async abort / SError into the guest
* @vcpu: The VCPU to receive the exception
*
* It is assumed that this code is called from the VCPU thread and that the
* VCPU therefore is not currently executing guest code.
+ *
+ * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
+ * the remaining ISS all-zeros so that this error is not interpreted as an
+ * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR
+ * value, so the CPU generates an imp-def value.
*/
void kvm_inject_vabt(struct kvm_vcpu *vcpu)
{
- vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+ pend_guest_serror(vcpu, ESR_ELx_ISV);
}
--
1.9.1

2018-01-22 19:42:27

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Hi Dongjiu Geng,

(versions of patches 1,2 and 4 have been queued by Catalin)

(Nit 'ACPI / APEI:' is the normal subject prefix for ghes.c, this helps the
maintainers know which patches they need to pay attention to when you are
touching multiple trees)

On 06/01/18 16:02, Dongjiu Geng wrote:
> ARMv8.2 requires implementation of the RAS extension.

> In
> this extension, it adds SEI(SError Interrupt) notification
> type, this patch adds new GHES error source SEI handling
> functions.

This reads as if this patch is handling SError RAS notifications generated by a
CPU with the RAS extensions. These are about CPU->Software notifications. APEI
and GHES are a firmware first mechanism which is Software->Software.
Reading the v8.2 documents won't help anyone with the APEI/GHES code.

Please describe this from the ACPI view, "ACPI 6.x adds support for NOTIFY_SEI
as a GHES notification mechanism... ", its up to the arch code to spot a v8.2
RAS Error based on the cpu caps.


> This error source parsing and handling method
> is similar with the SEA.

There are problems with doing this:

Oct. 18, 2017, 10:26 a.m. James Morse wrote:
| How do SEA and SEI interact?
|
| As far as I can see they can both interrupt each other, which isn't something
| the single in_nmi() path in APEI can handle. I thinks we should fix this
| first.

[..]

| SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
| XiuQi pointed to the memory_failure_queue() code. We can use this directly
| from SEA, but not SEI. (what happens if an SError arrives while we are
| queueing memory_failure work from an IRQ).
|
| The one that scares me is the trace-point reporting stuff. What happens if an
| SError arrives while we are enabling a trace point? (these are static-keys
| right?)
|
| I don't think we can just plumb SEI in like this and be done with it.
| (I'm looking at teasing out the estatus cache code from being x86:NMI only.
| This way we solve the same 'cant do this from NMI context' with the same
| code'.)


I will post what I've got for this estatus-cache thing as an RFC, its not ready
to be considered yet.


> Expose API ghes_notify_sei() to external users. External
> modules can call this exposed API to parse APEI table and
> handle the SEI notification.

external modules? You mean called by the arch code when it gets this NOTIFY_SEI?


Thanks,

James

2018-01-23 09:25:21

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Hi James,

On 2018/1/23 3:39, James Morse wrote:
> Hi Dongjiu Geng,
>
> (versions of patches 1,2 and 4 have been queued by Catalin)
>
> (Nit 'ACPI / APEI:' is the normal subject prefix for ghes.c, this helps the
> maintainers know which patches they need to pay attention to when you are
> touching multiple trees)
>
> On 06/01/18 16:02, Dongjiu Geng wrote:
>> ARMv8.2 requires implementation of the RAS extension.
>
>> In
>> this extension, it adds SEI(SError Interrupt) notification
>> type, this patch adds new GHES error source SEI handling
>> functions.
>
> This reads as if this patch is handling SError RAS notifications generated by a
> CPU with the RAS extensions. These are about CPU->Software notifications. APEI
> and GHES are a firmware first mechanism which is Software->Software.
> Reading the v8.2 documents won't help anyone with the APEI/GHES code.
>
> Please describe this from the ACPI view, "ACPI 6.x adds support for NOTIFY_SEI
> as a GHES notification mechanism... ", its up to the arch code to spot a v8.2
> RAS Error based on the cpu caps.Ok, I will modify it.

>
>
>> This error source parsing and handling method
>> is similar with the SEA.
>
> There are problems with doing this:
>
> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
> | How do SEA and SEI interact?
> |
> | As far as I can see they can both interrupt each other, which isn't something
> | the single in_nmi() path in APEI can handle. I thinks we should fix this
> | first.
>
> [..]
>
> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
> | from SEA, but not SEI. (what happens if an SError arrives while we are
> | queueing memory_failure work from an IRQ).
> |
> | The one that scares me is the trace-point reporting stuff. What happens if an
> | SError arrives while we are enabling a trace point? (these are static-keys
> | right?)
> |
> | I don't think we can just plumb SEI in like this and be done with it.
> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
> | This way we solve the same 'cant do this from NMI context' with the same
> | code'.)
>
>
> I will post what I've got for this estatus-cache thing as an RFC, its not ready
> to be considered yet.Yes, I know you are dong that. Your serial's patch will consider all above things, right?
If your patch can be consider that, this patch can based on your patchset. thanks.

>
>
>> Expose API ghes_notify_sei() to external users. External
>> modules can call this exposed API to parse APEI table and
>> handle the SEI notification.
>
> external modules? You mean called by the arch code when it gets this NOTIFY_SEI?
yes, called by kernel ARCH code, such as below, I remember I have discussed with you.

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
nmi_enter();


if (!ghes_notify_sei())
return;



/* non-RAS errors are not containable */
if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
arm64_serror_panic(regs, esr);

nmi_exit();
}

>
>
> Thanks,
>
> James
>
> .
>


2018-01-23 10:10:37

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

sorry fix a typo.

On 2018/1/23 17:23, gengdongjiu wrote:
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I thinks we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> | I don't think we can just plumb SEI in like this and be done with it.
>> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> | This way we solve the same 'cant do this from NMI context' with the same
>> | code'.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, its not ready
>> to be considered yet.

Yes, I know you are dong that. Your serial's patch will consider all above things, right?
If your patch can be consider that, this patch can based on your patchset. thanks.

>
>>


2018-01-23 19:09:37

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi Dongjiu Geng,

On 06/01/18 16:02, Dongjiu Geng wrote:
> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
> guest and user space needs a way to tell KVM this value. So we add a
> new ioctl. Before user space specifies the Exception Syndrome Register
> ESR(ESR), it firstly checks that whether KVM has the capability to
> set the guest ESR, If has, will set it. Otherwise, nothing to do.
>
> For this ESR specifying, Only support for AArch64, not support AArch32.

After this patch user-space can trigger an SError in the guest. If it wants to
migrate the guest, how does the pending SError get migrated?

I think we need to fix migration first. Andrew Jones suggested using
KVM_GET/SET_VCPU_EVENTS:
https://www.spinics.net/lists/arm-kernel/msg616846.html

Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems
without the v8.2 RAS Extensions with the same API. I think this means a bit to
read/write whether SError is pending, and another to indicate the ESR should be
set/read.
CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.

user-space can then use the 'for migration' calls to make a 'new' SError pending.

Now that the cpufeature bits are queued, I think this can be split up into two
separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated
plumbing. The second for the KVM 'make SError pending' API.


> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5c7f657..738ae90 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
> return -EINVAL;
> }
>
> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
> +{
> + return -EINVAL;
> +}

Does nothing in the patch that adds the support? This is a bit odd.
(oh, its hiding in patch 6...)


Thanks,

James


2018-01-23 19:11:20

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest

Hi Dongjiu Geng,

On 06/01/18 16:02, Dongjiu Geng wrote:
> RAS Extension add a VSESR_EL2 register which can provide
> the syndrome value reported to software on taking a virtual
> SError interrupt exception. This patch supports to specify
> this Syndrome.
>
> In the RAS Extensions we can not set all-zero syndrome value
> for SError, which means 'RAS error: Uncategorized' instead of
> 'no valid ISS'. So set it to IMPLEMENTATION DEFINED syndrome
> by default.
>
> We also need to support userspace to specify a valid syndrome
> value, Because in some case, the recovery is driven by userspace.
> This patch can support that userspace specify it.
>
> In the guest/host world switch, restore this value to VSESR_EL2
> only when HCR_EL2.VSE is set. This value no need to be saved
> because it is stale vale when guest exit.

A version of this patch has been queued by Catalin.

Now that the cpufeature bits are queued, I think this can be split up into two
separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated
plumbing. The second for the KVM 'make SError pending' API.


> Signed-off-by: Dongjiu Geng <[email protected]>
> [Set an impdef ESR for Virtual-SError]
> Signed-off-by: James Morse <[email protected]>

I didn't sign-off this patch. If you pick some bits from another version and
want to credit someone else you can 'CC:' them or just mention it in the
commit-message.


> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 47b967d..3b035cc 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -86,6 +86,9 @@
> #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4)
> #define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3)
>
> +/* virtual SError exception syndrome register */
> +#define REG_VSESR_EL2 sys_reg(3, 4, 5, 2, 3)

Irrelevant-Nit: sys-regs usually have a 'SYS_' prefix, and are in instruction
encoding order lower down the file.

(These PSTATE PAN things are a bit odd as they were used to generate and
instruction before the fancy {read,write}_sysreg() helpers were added).


> #define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM | \
> (!!x)<<8 | 0x1f)
> #define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 738ae90..ffad42b 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -279,7 +279,16 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>
> int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)

Bits of this are spread between patches 5 and 6. If you put them in the other
order this wouldn't happen.

(but after a rebase most of this patch should disappear)

> {
> - return -EINVAL;
> + u64 reg = *syndrome;
> +
> + /* inject virtual system Error or asynchronous abort */
> + kvm_inject_vabt(vcpu);

So this writes an impdef ESR, because its the existing code-path in KVM.


> + if (reg)
> + /* set vsesr_el2[24:0] with value that user space specified */
> + kvm_vcpu_set_vsesr(vcpu, reg & ESR_ELx_ISS_MASK);

And then you overwrite it. Which is a bit odd as there is a helper to do both in
one go:


> +
> + return 0;
> }

> int __attribute_const__ kvm_target_cpu(void)

> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index 3556715..fb94b5e 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -246,14 +246,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
> inject_undef64(vcpu);
> }
>
> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +{
> + kvm_vcpu_set_vsesr(vcpu, esr);
> + vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +}

How come you don't use this in kvm_arm_set_sei_esr()?



Thanks,

James

2018-01-24 20:07:33

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi James,
Thanks a lot for your review and comments.

>
> Hi Dongjiu Geng,
>
> On 06/01/18 16:02, Dongjiu Geng wrote:
> > The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
> > guest and user space needs a way to tell KVM this value. So we add a
> > new ioctl. Before user space specifies the Exception Syndrome Register
> > ESR(ESR), it firstly checks that whether KVM has the capability to set
> > the guest ESR, If has, will set it. Otherwise, nothing to do.
> >
> > For this ESR specifying, Only support for AArch64, not support AArch32.
>
> After this patch user-space can trigger an SError in the guest. If it wants to migrate the guest, how does the pending SError get migrated?
>
> I think we need to fix migration first. Andrew Jones suggested using
> KVM_GET/SET_VCPU_EVENTS:
> https://www.spinics.net/lists/arm-kernel/msg616846.html
>
> Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems without the v8.2 RAS Extensions with the same API. I
> think this means a bit to read/write whether SError is pending, and another to indicate the ESR should be set/read.
> CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.

For the CPUs without the v8.2 RAS Extensions, its ESR is always 0, we only can inject a SError with ESR 0 to guest, cannot set its ESR.
About how about to use the KVM_GET/SET_VCPU_EVENTS, I will check the code, and consider your suggestion at the same time.
The IOCTL KVM_GET/SET_VCPU_EVENTS has been used by X86.

>
> user-space can then use the 'for migration' calls to make a 'new' SError pending.
>
> Now that the cpufeature bits are queued, I think this can be split up into two separate series for v4.16-rc1, one to tackle NOTIFY_SEI and
> the associated plumbing. The second for the KVM 'make SError pending' API.

Ok, thanks for your suggestion, will split it.

>
>
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index
> > 5c7f657..738ae90 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
> > return -EINVAL;
> > }
> >
> > +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome) {
> > + return -EINVAL;
> > +}
>
> Does nothing in the patch that adds the support? This is a bit odd.
> (oh, its hiding in patch 6...)

To make this patch simple and small, I add it in patch 6.

>
>
> Thanks,
>
> James


2018-01-25 08:23:36

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest

Hi James,
thanks for the review.

On 2018/1/24 3:07, James Morse wrote:
> Hi Dongjiu Geng,
>
> On 06/01/18 16:02, Dongjiu Geng wrote:
>> RAS Extension add a VSESR_EL2 register which can provide
>> the syndrome value reported to software on taking a virtual
>> SError interrupt exception. This patch supports to specify
>> this Syndrome.
>>
>> In the RAS Extensions we can not set all-zero syndrome value
>> for SError, which means 'RAS error: Uncategorized' instead of
>> 'no valid ISS'. So set it to IMPLEMENTATION DEFINED syndrome
>> by default.
>>
>> We also need to support userspace to specify a valid syndrome
>> value, Because in some case, the recovery is driven by userspace.
>> This patch can support that userspace specify it.
>>
>> In the guest/host world switch, restore this value to VSESR_EL2
>> only when HCR_EL2.VSE is set. This value no need to be saved
>> because it is stale vale when guest exit.
>
> A version of this patch has been queued by Catalin.
yeah, I have seen that.

>
> Now that the cpufeature bits are queued, I think this can be split up into two
> separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated
> plumbing. The second for the KVM 'make SError pending' API.
>
>
>> Signed-off-by: Dongjiu Geng <[email protected]>
>> [Set an impdef ESR for Virtual-SError]
>> Signed-off-by: James Morse <[email protected]>
>
> I didn't sign-off this patch. If you pick some bits from another version and
> want to credit someone else you can 'CC:' them or just mention it in the
> commit-message.

I pick your modification of setting an impdef ESR for Virtual-SError, so add your name,
I change it to 'CC'

>
>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 47b967d..3b035cc 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -86,6 +86,9 @@
>> #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4)
>> #define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3)
>>
>> +/* virtual SError exception syndrome register */
>> +#define REG_VSESR_EL2 sys_reg(3, 4, 5, 2, 3)
>
> Irrelevant-Nit: sys-regs usually have a 'SYS_' prefix, and are in instruction
> encoding order lower down the file.
will follow that.

>
> (These PSTATE PAN things are a bit odd as they were used to generate and
> instruction before the fancy {read,write}_sysreg() helpers were added).>
>
>> #define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM | \
>> (!!x)<<8 | 0x1f)
>> #define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 738ae90..ffad42b 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -279,7 +279,16 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>
>> int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
>
> Bits of this are spread between patches 5 and 6. If you put them in the other
> order this wouldn't happen.

Ok, I will adjust that.

>
> (but after a rebase most of this patch should disappear)
>
>> {
>> - return -EINVAL;
>> + u64 reg = *syndrome;
>> +
>> + /* inject virtual system Error or asynchronous abort */
>> + kvm_inject_vabt(vcpu);
>
> So this writes an impdef ESR, because its the existing code-path in KVM.
>
>
>> + if (reg)
>> + /* set vsesr_el2[24:0] with value that user space specified */
>> + kvm_vcpu_set_vsesr(vcpu, reg & ESR_ELx_ISS_MASK);
>
> And then you overwrite it. Which is a bit odd as there is a helper to do both in
> one go:
thanks, I will directly call pend_guest_serror() in this function.

>
>
>> +
>> + return 0;
>> }
>
>> int __attribute_const__ kvm_target_cpu(void)
>
>> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
>> index 3556715..fb94b5e 100644
>> --- a/arch/arm64/kvm/inject_fault.c
>> +++ b/arch/arm64/kvm/inject_fault.c
>> @@ -246,14 +246,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>> inject_undef64(vcpu);
>> }
>>
>> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
>> +{
>> + kvm_vcpu_set_vsesr(vcpu, esr);
>> + vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
>> +}
>
> How come you don't use this in kvm_arm_set_sei_esr()?
thanks, I will call pend_guest_serror() in the kvm_arm_set_sei_esr().

>
>
>
> Thanks,
>
> James
>
> .
>


2018-01-30 19:56:53

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi gengdongjiu,

On 24/01/18 20:06, gengdongjiu wrote:
>> On 06/01/18 16:02, Dongjiu Geng wrote:
>>> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
>>> guest and user space needs a way to tell KVM this value. So we add a
>>> new ioctl. Before user space specifies the Exception Syndrome Register
>>> ESR(ESR), it firstly checks that whether KVM has the capability to set
>>> the guest ESR, If has, will set it. Otherwise, nothing to do.
>>>
>>> For this ESR specifying, Only support for AArch64, not support AArch32.
>>
>> After this patch user-space can trigger an SError in the guest. If it wants to migrate the guest, how does the pending SError get migrated?
>>
>> I think we need to fix migration first. Andrew Jones suggested using
>> KVM_GET/SET_VCPU_EVENTS:
>> https://www.spinics.net/lists/arm-kernel/msg616846.html
>>
>> Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems without the v8.2 RAS Extensions with the same API. I
>> think this means a bit to read/write whether SError is pending, and another to indicate the ESR should be set/read.
>> CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.
>
> For the CPUs without the v8.2 RAS Extensions, its ESR is always 0,
> we only can inject a SError with ESR 0 to guest, cannot set its ESR.

0? It's always implementation-defined. On Juno it seems to be always-0, but
other systems may behave differently. (Juno may generate another ESR value when
I'm not watching it...)

Just because we can't control the ESR doesn't mean injecting an SError isn't
something user-space may want to do.
If we tackle migration of pending-SError first, I think that will give us the
API to create a new pending SError with/without an ESR as appropriate.


> About how about to use the KVM_GET/SET_VCPU_EVENTS, I will check the code, and
> consider your suggestion at the same time.

(Not my suggestion, It was Andrew Jones idea.)

> The IOCTL KVM_GET/SET_VCPU_EVENTS has been used by X86.

We would be re-using the struct to have values with slightly different meanings.
But for migration the upshot is the same, call KVM_GET_VCPU_EVENTS on one
system, and pass the struct to KVM_SET_VCPU_EVENTS on the new system. If we're
lucky Qemu may be able to do this in shared x86/arm64 code.


>>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index
>>> 5c7f657..738ae90 100644
>>> --- a/arch/arm64/kvm/guest.c
>>> +++ b/arch/arm64/kvm/guest.c
>>> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>> return -EINVAL;
>>> }
>>>
>>> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome) {
>>> + return -EINVAL;
>>> +}
>>
>> Does nothing in the patch that adds the support? This is a bit odd.
>> (oh, its hiding in patch 6...)
>
> To make this patch simple and small, I add it in patch 6.

But that made the functionality of this patch: A new API to return -EINVAL from
the kernel.

Swapping the patches round would have avoided this.
Regardless, I think this will fold out in a rebase.


Thanks,

James

2018-01-30 21:00:23

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Hi gengdongjiu,

On 23/01/18 09:23, gengdongjiu wrote:
> On 2018/1/23 3:39, James Morse wrote:
>> gengdongjiu wrote:
>>> This error source parsing and handling method
>>> is similar with the SEA.
>>
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I thinks we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> | I don't think we can just plumb SEI in like this and be done with it.
>> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> | This way we solve the same 'cant do this from NMI context' with the same
>> | code'.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, its not ready
>> to be considered yet.

> Yes, I know you are dong that. Your serial's patch will consider all above things, right?

Assuming I got it right, yes. It currently makes the race Xie XiuQi spotted
worse, which I want to fix too. (details on the cover letter)


> If your patch can be consider that, this patch can based on your patchset. thanks.

I'd like to pick these patches onto the end of that series, but first I want to
know what NOTIFY_SEI means for any OS. The ACPI spec doesn't say, and because
its asynchronous, route-able and mask-able, there are many more corners than
NOTFIY_SEA.

This thing is a notification using an emulated SError exception. (emulated
because physical-SError must be routed to EL3 for firmware-first, and
virtual-SError belongs to EL2).

Does your firmware emulate SError exactly as the TakeException() pseudo code in
the Arm-Arm?
Is the emulated SError routed following the routing rules for HCR_EL2.{AMO, TGE}?
What does your firmware do when it wants to emulate SError but its masked?
(e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had PSTATE.A
set.
e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the emulated
SError should go to EL1. This effectively masks SError.)


Answers to these let us determine whether a bug is in the firmware or the
kernel. If firmware is expecting the OS to do something special, I'd like to
know about it from the beginning!


>>> Expose API ghes_notify_sei() to external users. External
>>> modules can call this exposed API to parse APEI table and
>>> handle the SEI notification.
>>
>> external modules? You mean called by the arch code when it gets this NOTIFY_SEI?

> yes, called by kernel ARCH code, such as below, I remember I have discussed with you.

Sure. The phrase 'external modules' usually means the '.ko' files that live in
/lib/modules, nothing outside the kernel tree should be doing this stuff.


Thanks,

James


2018-02-05 06:21:56

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

James,
Thank you for your time to reply me.

On 2018/1/31 3:21, James Morse wrote:
> Hi gengdongjiu,
>
> On 24/01/18 20:06, gengdongjiu wrote:
>>> On 06/01/18 16:02, Dongjiu Geng wrote:
>>>> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
>>>> guest and user space needs a way to tell KVM this value. So we add a
>>>> new ioctl. Before user space specifies the Exception Syndrome Register
>>>> ESR(ESR), it firstly checks that whether KVM has the capability to set
>>>> the guest ESR, If has, will set it. Otherwise, nothing to do.
>>>>
>>>> For this ESR specifying, Only support for AArch64, not support AArch32.
>>>
>>> After this patch user-space can trigger an SError in the guest. If it wants to migrate the guest, how does the pending SError get migrated?
>>>
>>> I think we need to fix migration first. Andrew Jones suggested using
>>> KVM_GET/SET_VCPU_EVENTS:
>>> https://www.spinics.net/lists/arm-kernel/msg616846.html
>>>
>>> Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems without the v8.2 RAS Extensions with the same API. I
>>> think this means a bit to read/write whether SError is pending, and another to indicate the ESR should be set/read.
>>> CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.
>>
>> For the CPUs without the v8.2 RAS Extensions, its ESR is always 0,
>> we only can inject a SError with ESR 0 to guest, cannot set its ESR.
>
> 0? It's always implementation-defined. On Juno it seems to be always-0, but
> other systems may behave differently. (Juno may generate another ESR value when
> I'm not watching it...)
For the armv8.0 cpu without RAS Extensions, it does not have vsesr_el2, so when guest take a virtual SError,
its ESR is 0, can not control the virtual SError's syndrom value, it does not have such registers to control that.

Does Juno not have RAS Extension? if yes, I think we can only inject an SError, but can not change its ESR value, because it does not have vsesr_el2.

>
> Just because we can't control the ESR doesn't mean injecting an SError isn't
> something user-space may want to do.
yes, we may need to support user-space injects an SError even through CPU does not have RAS Extension.

> If we tackle migration of pending-SError first, I think that will give us the
> API to create a new pending SError with/without an ESR as appropriate.

sure, we should. Last week, I checked the KVM_GET/SET_VCPU_EVENTS IOCTL, it should meet our migration requirements

>
>
>> About how about to use the KVM_GET/SET_VCPU_EVENTS, I will check the code, and
>> consider your suggestion at the same time.
>
> (Not my suggestion, It was Andrew Jones idea.)
Got it.

>
>> The IOCTL KVM_GET/SET_VCPU_EVENTS has been used by X86.
>
> We would be re-using the struct to have values with slightly different meanings.
> But for migration the upshot is the same, call KVM_GET_VCPU_EVENTS on one
> system, and pass the struct to KVM_SET_VCPU_EVENTS on the new system. If we're
> lucky Qemu may be able to do this in shared x86/arm64 code.
>
Thanks for the reminder, I know your meaning.
In the x86, the kvm_vcpu_events includes exception/interrupt/nmi/smi. For the RAS, we
only involves the exception, so Qemu handling logic is different. Anyway, I will try to
share code for the two platform in Qemu.


/* for KVM_GET/SET_VCPU_EVENTS */
struct kvm_vcpu_events {
struct {
__u8 injected;
__u8 nr;
__u8 has_error_code;
__u8 pad;
__u32 error_code;
} exception;
struct {
__u8 injected;
__u8 nr;
__u8 soft;
__u8 shadow;
} interrupt;
struct {
__u8 injected;
__u8 pending;
__u8 masked;
__u8 pad;
} nmi;
__u32 sipi_vector;
__u32 flags;
struct {
__u8 smm;
__u8 pending;
__u8 smm_inside_nmi;
__u8 latched_init;
} smi;
__u32 reserved[9];
};

>
>>>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index
>>>> 5c7f657..738ae90 100644
>>>> --- a/arch/arm64/kvm/guest.c
>>>> +++ b/arch/arm64/kvm/guest.c
>>>> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>>> return -EINVAL;
>>>> }
>>>>
>>>> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome) {
>>>> + return -EINVAL;
>>>> +}
>>>
>>> Does nothing in the patch that adds the support? This is a bit odd.
>>> (oh, its hiding in patch 6...)
>>
>> To make this patch simple and small, I add it in patch 6.
>
> But that made the functionality of this patch: A new API to return -EINVAL from
> the kernel.
>
> Swapping the patches round would have avoided this.
> Regardless, I think this will fold out in a rebase.
yes, I will, thanks for your kind suggestion.

>
>
> Thanks,
>
> James
>
> .
>


2018-02-05 11:26:40

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

[...]
>
> > Yes, I know you are dong that. Your serial's patch will consider all above
> things, right?
>
> Assuming I got it right, yes. It currently makes the race Xie XiuQi spotted worse,
> which I want to fix too. (details on the cover letter)

Ok.

>
>
> > If your patch can be consider that, this patch can based on your patchset.
> thanks.
>
> I'd like to pick these patches onto the end of that series, but first I want to
> know what NOTIFY_SEI means for any OS. The ACPI spec doesn't say, and
> because its asynchronous, route-able and mask-able, there are many more
> corners than NOTFIY_SEA.
>
> This thing is a notification using an emulated SError exception. (emulated
> because physical-SError must be routed to EL3 for firmware-first, and
> virtual-SError belongs to EL2).
>
> Does your firmware emulate SError exactly as the TakeException() pseudo code
> in the Arm-Arm?

Yes, it is.

> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
> TGE}?

Yes, it is.

> What does your firmware do when it wants to emulate SError but its masked?
> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
> PSTATE.A set.
> e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
> emulated SError should go to EL1. This effectively masks SError.)

Currently we does not consider much about the mask status(SPSR).
I remember that you ever suggested firmware should reboot if the mask status is set(SPSR), right?
I ever suggest our firmware team to evaluate that, but there is no response.

I CC "liu jun" <[email protected]> who is our UEFI firmware Architect, if you have firmware requirements, you can
raise again.

>
> Answers to these let us determine whether a bug is in the firmware or the
> kernel. If firmware is expecting the OS to do something special, I'd like to know
> about it from the beginning!

I know your meaning, thanks for raising it again.

>
>
> >>> Expose API ghes_notify_sei() to external users. External modules can
> >>> call this exposed API to parse APEI table and handle the SEI
> >>> notification.
> >>
> >> external modules? You mean called by the arch code when it gets this
> NOTIFY_SEI?
>
> > yes, called by kernel ARCH code, such as below, I remember I have discussed
> with you.
>
> Sure. The phrase 'external modules' usually means the '.ko' files that live in
> /lib/modules, nothing outside the kernel tree should be doing this stuff.

I will rename 'external modules' to other name. Thanks.

>
>
> Thanks,
>
> James


2018-02-09 17:47:36

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi gengdongjiu,

On 05/02/18 06:19, gengdongjiu wrote:
> On 2018/1/31 3:21, James Morse wrote:
>> On 24/01/18 20:06, gengdongjiu wrote:
>>>> On 06/01/18 16:02, Dongjiu Geng wrote:
>>>>> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
>>>>> guest and user space needs a way to tell KVM this value. So we add a
>>>>> new ioctl. Before user space specifies the Exception Syndrome Register
>>>>> ESR(ESR), it firstly checks that whether KVM has the capability to set
>>>>> the guest ESR, If has, will set it. Otherwise, nothing to do.
>>>>>
>>>>> For this ESR specifying, Only support for AArch64, not support AArch32.
>>>>
>>>> After this patch user-space can trigger an SError in the guest. If it wants to migrate the guest, how does the pending SError get migrated?
>>>>
>>>> I think we need to fix migration first. Andrew Jones suggested using
>>>> KVM_GET/SET_VCPU_EVENTS:
>>>> https://www.spinics.net/lists/arm-kernel/msg616846.html
>>>>
>>>> Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems without the v8.2 RAS Extensions with the same API. I
>>>> think this means a bit to read/write whether SError is pending, and another to indicate the ESR should be set/read.
>>>> CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.
>>>
>>> For the CPUs without the v8.2 RAS Extensions, its ESR is always 0,
>>> we only can inject a SError with ESR 0 to guest, cannot set its ESR.
>>
>> 0? It's always implementation-defined. On Juno it seems to be always-0, but
>> other systems may behave differently. (Juno may generate another ESR value when
>> I'm not watching it...)

> For the armv8.0 cpu without RAS Extensions, it does not have vsesr_el2, so when
> guest take a virtual SError,

> its ESR is 0, can not control the virtual SError's syndrom value, it does not have
> such registers to control that.

My point was its more nuanced than this: the ARM-ARM's
TakeVirtualSErrorException() pseudo-code lets virtual-SError have an
implementation defined syndrome. I've never seen Juno generate anything other
than '0', but it might do something different on a thursday.

The point? We can't know what a CPU without the RAS extensions puts in there.

Why Does this matter? When migrating a pending SError we have to know the
difference between 'use this 64bit value', and 'the CPU will generate it'.
If I make an SError pending with ESR=0 on a CPU with VSESR, I can't migrated to
a system that generates an impdef SError-ESR, because I can't know it will be 0.


> Does Juno not have RAS Extension?

It's two types of v8.0 CPU, no RAS extensions.


> if yes, I think we can only inject an SError, but can not change its ESR value,
> because it does not have vsesr_el2.

I agree, this means we need to be able to tell the difference between 'pending'
and 'pending with this ESR'.


>> Just because we can't control the ESR doesn't mean injecting an SError isn't
>> something user-space may want to do.

> yes, we may need to support user-space injects an SError even through CPU does not have RAS Extension.
>
>> If we tackle migration of pending-SError first, I think that will give us the
>> API to create a new pending SError with/without an ESR as appropriate.
>
> sure, we should. Last week, I checked the KVM_GET/SET_VCPU_EVENTS IOCTL, it should meet our
> migration requirements

Great!


Thanks,

James



>>> The IOCTL KVM_GET/SET_VCPU_EVENTS has been used by X86.
>>
>> We would be re-using the struct to have values with slightly different meanings.
>> But for migration the upshot is the same, call KVM_GET_VCPU_EVENTS on one
>> system, and pass the struct to KVM_SET_VCPU_EVENTS on the new system. If we're
>> lucky Qemu may be able to do this in shared x86/arm64 code.
>>
> Thanks for the reminder, I know your meaning.
> In the x86, the kvm_vcpu_events includes exception/interrupt/nmi/smi. For the RAS, we
> only involves the exception, so Qemu handling logic is different. Anyway, I will try to
> share code for the two platform in Qemu.
>
>
> /* for KVM_GET/SET_VCPU_EVENTS */
> struct kvm_vcpu_events {
> struct {
> __u8 injected;
> __u8 nr;
> __u8 has_error_code;
> __u8 pad;
> __u32 error_code;
> } exception;
> struct {
> __u8 injected;
> __u8 nr;
> __u8 soft;
> __u8 shadow;
> } interrupt;
> struct {
> __u8 injected;
> __u8 pending;
> __u8 masked;
> __u8 pad;
> } nmi;
> __u32 sipi_vector;
> __u32 flags;
> struct {
> __u8 smm;
> __u8 pending;
> __u8 smm_inside_nmi;
> __u8 latched_init;
> } smi;
> __u32 reserved[9];
> };
>

2018-02-12 12:55:45

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi James,
Thanks for the mail.

On 2018/2/10 1:44, James Morse wrote:
[...]
>
>> its ESR is 0, can not control the virtual SError's syndrom value, it does not have
>> such registers to control that.
>
> My point was its more nuanced than this: the ARM-ARM's
> TakeVirtualSErrorException() pseudo-code lets virtual-SError have an
> implementation defined syndrome. I've never seen Juno generate anything other
> than '0', but it might do something different on a thursday.

I checked the "aarch64/exceptions/asynch/AArch64.TakeVirtualSErrorException",
you are right, the virtual SError's syndrome value can be 0 or implementation defined value, not always 0,
which is decided by the "exception.syndrome<24>".
thanks for the clarification.

>
> The point? We can't know what a CPU without the RAS extensions puts in there.
>
> Why Does this matter? When migrating a pending SError we have to know the
> difference between 'use this 64bit value', and 'the CPU will generate it'.
> If I make an SError pending with ESR=0 on a CPU with VSESR, I can't migrated to
> a system that generates an impdef SError-ESR, because I can't know it will be 0.
Yes,
But it will have a issue,
For the target system, before taking the SError, no one can know whether its syndrome value
is IMPLEMENTATION DEFINED or architecturally defined.

when the virtual SError is taken, the ESR_ELx.IDS will be updated, then we can know
whether the ESR value is impdef or architecturally defined.

It seems migration is only allowed only when target system and source system all support RAS extension, because we do not know
whether its syndrome is IMPLEMENTATION DEFINED or architecturally defined.

>
>
>> Does Juno not have RAS Extension?
>
> It's two types of v8.0 CPU, no RAS extensions.
>
>
>> if yes, I think we can only inject an SError, but can not change its ESR value,
>> because it does not have vsesr_el2.
>
> I agree, this means we need to be able to tell the difference between 'pending'
> and 'pending with this ESR'.
>
>
>>> Just because we can't control the ESR doesn't mean injecting an SError isn't
>>> something user-space may want to do.
>
>> yes, we may need to support user-space injects an SError even through CPU does not have RAS Extension.
>>
>>> If we tackle migration of pending-SError first, I think that will give us the
>>> API to create a new pending SError with/without an ESR as appropriate.
>>
>> sure, we should. Last week, I checked the KVM_GET/SET_VCPU_EVENTS IOCTL, it should meet our
>> migration requirements
>
> Great!
>
>
> Thanks,
>
> James
>
>


2018-02-15 18:02:00

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi gengdongjiu,

On 12/02/18 10:19, gengdongjiu wrote:
> On 2018/2/10 1:44, James Morse wrote:
>> The point? We can't know what a CPU without the RAS extensions puts in there.
>>
>> Why Does this matter? When migrating a pending SError we have to know the
>> difference between 'use this 64bit value', and 'the CPU will generate it'.
>> If I make an SError pending with ESR=0 on a CPU with VSESR, I can't migrated to
>> a system that generates an impdef SError-ESR, because I can't know it will be 0.

> For the target system, before taking the SError, no one can know whether its syndrome value
> is IMPLEMENTATION DEFINED or architecturally defined.

For a virtual-SError, the hypervisor knows what it generated. (do I have
VSESR_EL2? What did I put in there?).


> when the virtual SError is taken, the ESR_ELx.IDS will be updated, then we can know
> whether the ESR value is impdef or architecturally defined.

True, the guest can't know anything about a pending virtual SError until it
takes it. Why is this a problem?


> It seems migration is only allowed only when target system and source system all support
> RAS extension, because we do not know whether its syndrome is IMPLEMENTATION DEFINED or
> architecturally defined.

I don't think Qemu allows migration between hosts with differing guest-ID
registers. But we shouldn't depend on this, and we may want to hide the v8.2 RAS
features from the guest's ID register, but still use them from the host.

The way I imagined it working was we would pack the following information into
that events struct:
{
bool serror_pending;
bool serror_has_esr;
u64 serror_esr;
}

The problem I was trying to describe is because there is no value of serror_esr
we can use to mean 'Ignore this, I'm a v8.0 CPU'. VSESR_EL2 is a 64bit register,
any bits we abuse may get a meaning we want to use in the future.

When it comes to migration, v8.{0,1} systems can only GET/SET events where
serror_has_esr == false, they can't use the serror_esr. On v8.2 systems we
should require serror_has_esr to be true.

If we need to support migration from v8.{0,1} to v8.2, we can make up an impdef
serror_esr.

We will need to decide what KVM does when SET is called but an SError was
already pending. 2.5.3 "Multiple SError interrupts" of [0] has something to say.


Happy new year,

James


[0]
https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

2018-02-16 12:29:31

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Hi gengdongjiu, liu jun

On 05/02/18 11:24, gengdongjiu wrote:
> James Morse wrote:
>> I'd like to pick these patches onto the end of that series, but first I want to
>> know what NOTIFY_SEI means for any OS. The ACPI spec doesn't say, and
>> because its asynchronous, route-able and mask-able, there are many more
>> corners than NOTFIY_SEA.
>>
>> This thing is a notification using an emulated SError exception. (emulated
>> because physical-SError must be routed to EL3 for firmware-first, and
>> virtual-SError belongs to EL2).
>>
>> Does your firmware emulate SError exactly as the TakeException() pseudo code
>> in the Arm-Arm?
>
> Yes, it is.
>
>> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
>> TGE}?
>
> Yes, it is.

... and yet ...


>> What does your firmware do when it wants to emulate SError but its masked?
>> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
>> PSTATE.A set.
>> e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
>> emulated SError should go to EL1. This effectively masks SError.)
>
> Currently we does not consider much about the mask status(SPSR).

.. this is a problem.

If you ignore SPSR_EL3 you may deliver an SError to EL1 when the exception
interrupted EL2. Even if you setup the EL1 register correctly, EL1 can't eret to
EL2. This should never happen, SError is effectively masked if you are running
at an EL higher than the one its routed to.

More obviously: if the exception came from the EL that SError should be routed
to, but PSTATE.A was set, you can't deliver SError. Masking SError is the only
way the OS has to indicate it can't take an exception right now. VBAR_EL1 may be
'wrong' if we're doing some power-management, the FAR/ELR/ESR/SPSR registers may
contain live values that the OS would lose if you deliver another exception over
the top.

If you deliver an emulated-SError as the OS eret's, your new ELR will point at
the eret instruction and the CPU will spin on this instruction forever.

You have to honour the masking and routing rules for SError, otherwise no OS can
run safely with this firmware.


> I remember that you ever suggested firmware should reboot if the mask status
> is set(SPSR), right?

Yes, this is my suggestion of what to do if you can't deliver an SError: store
the RAS error in the BERT and 'reboot'.


> I CC "liu jun" <[email protected]> who is our UEFI firmware Architect,
> if you have firmware requirements, you can raise again.

(UEFI? I didn't think there was any of that at EL3, but I'm not familiar with
all the 'PI' bits).

The requirement is your emulated-SError from EL3 looks exactly like a
physical-SError as if EL3 wasn't implemented.
Your CPU has to handle cases where it can't deliver an SError, your emulation
has to do the same.

This is not something any OS can work around.


>> Answers to these let us determine whether a bug is in the firmware or the
>> kernel. If firmware is expecting the OS to do something special, I'd like to know
>> about it from the beginning!
>
> I know your meaning, thanks for raising it again.


Happy new year,

James

2018-03-08 06:19:15

by gengdongjiu

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi James,
sorry for my late response due to chines new year.

2018-02-16 1:55 GMT+08:00 James Morse <[email protected]>:
> Hi gengdongjiu,
>
> On 12/02/18 10:19, gengdongjiu wrote:
>> On 2018/2/10 1:44, James Morse wrote:
>>> The point? We can't know what a CPU without the RAS extensions puts in there.
>>>
>>> Why Does this matter? When migrating a pending SError we have to know the
>>> difference between 'use this 64bit value', and 'the CPU will generate it'.
>>> If I make an SError pending with ESR=0 on a CPU with VSESR, I can't migrated to
>>> a system that generates an impdef SError-ESR, because I can't know it will be 0.
>
>> For the target system, before taking the SError, no one can know whether its syndrome value
>> is IMPLEMENTATION DEFINED or architecturally defined.
>
> For a virtual-SError, the hypervisor knows what it generated. (do I have
> VSESR_EL2? What did I put in there?).
>
>
>> when the virtual SError is taken, the ESR_ELx.IDS will be updated, then we can know
>> whether the ESR value is impdef or architecturally defined.
>
> True, the guest can't know anything about a pending virtual SError until it
> takes it. Why is this a problem?
>
>
>> It seems migration is only allowed only when target system and source system all support
>> RAS extension, because we do not know whether its syndrome is IMPLEMENTATION DEFINED or
>> architecturally defined.
>
> I don't think Qemu allows migration between hosts with differing guest-ID
> registers. But we shouldn't depend on this, and we may want to hide the v8.2 RAS
> features from the guest's ID register, but still use them from the host.
>
> The way I imagined it working was we would pack the following information into
> that events struct:
> {
> bool serror_pending;
> bool serror_has_esr;
> u64 serror_esr;
> }

I have used your suggestion struct

>
> The problem I was trying to describe is because there is no value of serror_esr
> we can use to mean 'Ignore this, I'm a v8.0 CPU'. VSESR_EL2 is a 64bit register,
> any bits we abuse may get a meaning we want to use in the future.
>
> When it comes to migration, v8.{0,1} systems can only GET/SET events where
> serror_has_esr == false, they can't use the serror_esr. On v8.2 systems we
> should require serror_has_esr to be true.
yes, I agreed.

>
> If we need to support migration from v8.{0,1} to v8.2, we can make up an impdef
> serror_esr.

For the Qemu migration, I need to check more the QEMU code.


Hi Andrew,
I use KVM_GET/SET_VCPU_EVENTS IOCTL to migrate the Serror
exception status of VM,
The even struct is shown below:

{
bool serror_pending;
bool serror_has_esr;
u64 serror_esr;
}

Only when the target machine is armv8.2, it needs to set the
serror_esr(SError Exception Syndrome Register).
for the armv8.0, software can not set the serror_esr(SError Exception
Syndrome Register).
so when migration from v8.{0,1} to v8.2, QEMU should make up an impdef
serror_esr for the v8.2 target.
can you give me some suggestion how to set that register in the QEMU?
I do not familar with the QEMU migration.
Thanks very much.

>
> We will need to decide what KVM does when SET is called but an SError was
> already pending. 2.5.3 "Multiple SError interrupts" of [0] has something to say.

how about KVM set again to the same VCPU?

>
>
> Happy new year,
thanks!

>
> James
>
>
> [0]
> https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

2018-03-15 20:50:42

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

Hi gengdongjiu,

On 08/03/18 06:18, gengdongjiu wrote:
> Hi James,
> sorry for my late response due to chines new year.

Happy new year,


> 2018-02-16 1:55 GMT+08:00 James Morse <[email protected]>:
>> On 12/02/18 10:19, gengdongjiu wrote:
>>> On 2018/2/10 1:44, James Morse wrote:
>>>> The point? We can't know what a CPU without the RAS extensions puts in there.
>>>>
>>>> Why Does this matter? When migrating a pending SError we have to know the
>>>> difference between 'use this 64bit value', and 'the CPU will generate it'.
>>>> If I make an SError pending with ESR=0 on a CPU with VSESR, I can't migrated to
>>>> a system that generates an impdef SError-ESR, because I can't know it will be 0.
>>
>>> For the target system, before taking the SError, no one can know whether its syndrome value
>>> is IMPLEMENTATION DEFINED or architecturally defined.
>>
>> For a virtual-SError, the hypervisor knows what it generated. (do I have
>> VSESR_EL2? What did I put in there?).
>>
>>
>>> when the virtual SError is taken, the ESR_ELx.IDS will be updated, then we can know
>>> whether the ESR value is impdef or architecturally defined.
>>
>> True, the guest can't know anything about a pending virtual SError until it
>> takes it. Why is this a problem?
>>
>>
>>> It seems migration is only allowed only when target system and source system all support
>>> RAS extension, because we do not know whether its syndrome is IMPLEMENTATION DEFINED or
>>> architecturally defined.
>>
>> I don't think Qemu allows migration between hosts with differing guest-ID
>> registers. But we shouldn't depend on this, and we may want to hide the v8.2 RAS
>> features from the guest's ID register, but still use them from the host.
>>
>> The way I imagined it working was we would pack the following information into
>> that events struct:
>> {
>> bool serror_pending;
>> bool serror_has_esr;
>> u64 serror_esr;
>> }
>
> I have used your suggestion struct

Ah! This is where it came from. Sorry, this was just to illustrate the
information/sizes we wanted to transfer.... I didn't mean it literally.

I should have said "64 bits of ESR, so that we can transfer anything that is
added to VSESR_EL2 in the future, a flag somewhere to indicate an serror is
pending, and another flag to indicate the ESR has a value we should use".


Thanks/Sorry!

James


>> The problem I was trying to describe is because there is no value of serror_esr
>> we can use to mean 'Ignore this, I'm a v8.0 CPU'. VSESR_EL2 is a 64bit register,
>> any bits we abuse may get a meaning we want to use in the future.
>>
>> When it comes to migration, v8.{0,1} systems can only GET/SET events where
>> serror_has_esr == false, they can't use the serror_esr. On v8.2 systems we
>> should require serror_has_esr to be true.
> yes, I agreed.
>
>>
>> If we need to support migration from v8.{0,1} to v8.2, we can make up an impdef
>> serror_esr.
>
> For the Qemu migration, I need to check more the QEMU code.


> Hi Andrew,
> I use KVM_GET/SET_VCPU_EVENTS IOCTL to migrate the Serror
> exception status of VM,
> The even struct is shown below:
>
> {
> bool serror_pending;
> bool serror_has_esr;
> u64 serror_esr;
> }
>
> Only when the target machine is armv8.2, it needs to set the
> serror_esr(SError Exception Syndrome Register).
> for the armv8.0, software can not set the serror_esr(SError Exception
> Syndrome Register).
> so when migration from v8.{0,1} to v8.2, QEMU should make up an impdef
> serror_esr for the v8.2 target.
> can you give me some suggestion how to set that register in the QEMU?
> I do not familar with the QEMU migration.
> Thanks very much.


2018-04-12 05:04:28

by gengdongjiu

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Dear James,
Thanks for this mail and sorry for my late response.


2018-02-16 1:55 GMT+08:00 James Morse <[email protected]>:
> Hi gengdongjiu, liu jun
>
> On 05/02/18 11:24, gengdongjiu wrote:
[....]
>>
>>> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
>>> TGE}?
>>
>> Yes, it is.
>
> ... and yet ...
>
>
>>> What does your firmware do when it wants to emulate SError but its masked?
>>> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
>>> PSTATE.A set.
>>> e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
>>> emulated SError should go to EL1. This effectively masks SError.)
>>
>> Currently we does not consider much about the mask status(SPSR).
>
> .. this is a problem.
>
> If you ignore SPSR_EL3 you may deliver an SError to EL1 when the exception
> interrupted EL2. Even if you setup the EL1 register correctly, EL1 can't eret to
> EL2. This should never happen, SError is effectively masked if you are running
> at an EL higher than the one its routed to.
>
> More obviously: if the exception came from the EL that SError should be routed
> to, but PSTATE.A was set, you can't deliver SError. Masking SError is the only

James, I summarized the masking and routing rules for SError to
confirm with you for the firmware first solution,

1. If the HCR_EL2.{AMO,TGE} is set, which means the SError should route to EL2,
When system happens SError and trap to EL3, If EL3 find
HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both set,
and find this SError come from EL2, it will not deliver an SError:
store the RAS error in the BERT and 'reboot'; but if
it find that this SError come from EL1 or EL0, it also need to deliver
an SError, right?


2. If the HCR_EL2.{AMO,TGE} is not set, which means the SError should
route to EL1,
When system happens SError and trap to EL3, If EL3 find
HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both not set,
and find this SError come from EL1, it will not deliver an SError:
store the RAS error in the BERT and 'reboot'; but if
it find that this SError come from EL0, it also need to deliver an
SError, right?


> way the OS has to indicate it can't take an exception right now. VBAR_EL1 may be
> 'wrong' if we're doing some power-management, the FAR/ELR/ESR/SPSR registers may
> contain live values that the OS would lose if you deliver another exception over
> the top.
>
> If you deliver an emulated-SError as the OS eret's, your new ELR will point at
> the eret instruction and the CPU will spin on this instruction forever.
>
> You have to honour the masking and routing rules for SError, otherwise no OS can
> run safely with this firmware.
>
>
>> I remember that you ever suggested firmware should reboot if the mask status
>> is set(SPSR), right?
>
> Yes, this is my suggestion of what to do if you can't deliver an SError: store
> the RAS error in the BERT and 'reboot'.
>
>
>> I CC "liu jun" <[email protected]> who is our UEFI firmware Architect,
>> if you have firmware requirements, you can raise again.
>
> (UEFI? I didn't think there was any of that at EL3, but I'm not familiar with
> all the 'PI' bits).
>
> The requirement is your emulated-SError from EL3 looks exactly like a
> physical-SError as if EL3 wasn't implemented.
> Your CPU has to handle cases where it can't deliver an SError, your emulation
> has to do the same.
>
> This is not something any OS can work around.
>
>
>>> Answers to these let us determine whether a bug is in the firmware or the
>>> kernel. If firmware is expecting the OS to do something special, I'd like to know
>>> about it from the beginning!
>>
>> I know your meaning, thanks for raising it again.
>
>
> Happy new year,
>
> James
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

2018-04-12 16:20:49

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

Hi gengdongjiu,

On 12/04/18 06:00, gengdongjiu wrote:
> 2018-02-16 1:55 GMT+08:00 James Morse <[email protected]>:
>> On 05/02/18 11:24, gengdongjiu wrote:
>>>> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
>>>> TGE}?
>>>
>>> Yes, it is.
>>
>> ... and yet ...
>>
>>
>>>> What does your firmware do when it wants to emulate SError but its masked?
>>>> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
>>>> PSTATE.A set.
>>>> e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
>>>> emulated SError should go to EL1. This effectively masks SError.)
>>>
>>> Currently we does not consider much about the mask status(SPSR).
>>
>> .. this is a problem.
>>
>> If you ignore SPSR_EL3 you may deliver an SError to EL1 when the exception
>> interrupted EL2. Even if you setup the EL1 register correctly, EL1 can't eret to
>> EL2. This should never happen, SError is effectively masked if you are running
>> at an EL higher than the one its routed to.
>>
>> More obviously: if the exception came from the EL that SError should be routed
>> to, but PSTATE.A was set, you can't deliver SError. Masking SError is the only

> James, I summarized the masking and routing rules for SError to
> confirm with you for the firmware first solution,

You also said "Currently we does not consider much about the mask status(SPSR)."


> 1. If the HCR_EL2.{AMO,TGE} is set,

If one or the other of these bits is set: (AMO==1 || TGE==1)

> which means the SError should route to EL2,
> When system happens SError and trap to EL3, If EL3 find
> HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both set,
> and find this SError come from EL2, it will not deliver an SError:
> store the RAS error in the BERT and 'reboot'; but if
> it find that this SError come from EL1 or EL0, it also need to deliver
> an SError, right?

Yes.


> 2. If the HCR_EL2.{AMO,TGE} is not set,

If neither of these bits is set: (AMO==0 && TGE == 0)

> which means the SError should route to EL1,
> When system happens SError and trap to EL3, If EL3 find
> HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both not set,

(I'm reading this as all three of these bits are clear)

> and find this SError come from EL1, it will not deliver an SError:
> store the RAS error in the BERT and 'reboot';

No, (AMO==0 && TGE == 0) means SError is routed to EL1, this exception
interrupted EL1 and the A bit was clear, so EL1 can take an SError.

The two cases here are:
AMO==0,TGE==0 means SError should be routed to EL1. If SPSR_EL3 says the
exception interrupted EL1 and the A bit was set, you need to do the BERT trick.

If SPSR_EL3 says the exception interrupted EL2, you need to do the BERT trick
regardless of the A bit, as SError is implicitly masked by running at a higher
exception level than it was routed to.


From your v11 reply:
> 2. The exception came from the EL that SError should not be routed
> to(according to hcr_EL2.{AMO, TGE}),even though the PSTATE.A was set,EL3
> firmware still deliver SError

(this is re-iterating the two-cases above:)
'not be routed to' is one of two things: Route-to-EL2+interruted-EL1, or
Route-to-EL1+interrupted-EL2.

Route-to-EL2+interrupted-EL1 is fine, regardless of SPSR_EL3.A the emulated
SError can be delivered to EL2, as EL2 can't mask SError when executing at a
lower EL.

Route-to-EL1+interrupted-EL2 is the problem. SError is implicitly masked by
running at a higher EL. Regardless of SPSR_EL3.A, the emulated SError can not be
delivered.
KVM does this on the way out of a guest, if an SError occurs during this time
the CPU will wait until execution returns to EL1 before delivering the SError.
Your firmware has to do the same.

Table D1-15 in "D1.14.2 Asynchronous exception masking" has a table with all the
combinations. The ARM-ARM is what we need to match with this behaviour.


> but if it find that this SError come from EL0, it also need to deliver an
> SError, right?

I thought interrupted-EL0 could always be delivered: but re-reading the
ARM-ARM's "D1.14.2 Asynchronous exception masking", if asynchronous exceptions
are routed to EL1 then EL0&EL1 are treated the same.
So if SError is routed to EL1, the exception interrupted EL0, and SPSR_EL3.A was
set, you still can't deliver the emulated-SError you have to do the BERT-trick.
Linux doesn't do this today, but another OS might (e.g. UEFI), and we might do
this in the future.

This is really tricky for firmware to get right. Another alternative would be to
put the CPER records in a Polled buffer, unless something needs doing right now,
in which case a BERT-reboot is probably best.


Thanks,

James

2018-04-13 15:00:36

by Dongjiu Geng

[permalink] [raw]
Subject: Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

James,
Thanks for this mail.

On 2018/4/13 0:14, James Morse wrote:
> Hi gengdongjiu,
>
> On 12/04/18 06:00, gengdongjiu wrote:
>> 2018-02-16 1:55 GMT+08:00 James Morse <[email protected]>:
>>> On 05/02/18 11:24, gengdongjiu wrote:
>>>>> Is the emulated SError routed following the routing rules for HCR_EL2.{AMO,
>>>>> TGE}?
>>>>
>>>> Yes, it is.
>>>
>>> ... and yet ...
>>>
>>>
>>>>> What does your firmware do when it wants to emulate SError but its masked?
>>>>> (e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had
>>>>> PSTATE.A set.
>>>>> e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the
>>>>> emulated SError should go to EL1. This effectively masks SError.)
>>>>
>>>> Currently we does not consider much about the mask status(SPSR).
>>>
>>> .. this is a problem.
>>>
>>> If you ignore SPSR_EL3 you may deliver an SError to EL1 when the exception
>>> interrupted EL2. Even if you setup the EL1 register correctly, EL1 can't eret to
>>> EL2. This should never happen, SError is effectively masked if you are running
>>> at an EL higher than the one its routed to.
>>>
>>> More obviously: if the exception came from the EL that SError should be routed
>>> to, but PSTATE.A was set, you can't deliver SError. Masking SError is the only
>
>> James, I summarized the masking and routing rules for SError to
>> confirm with you for the firmware first solution,
>
> You also said "Currently we does not consider much about the mask status(SPSR)."
Yes, we currently do not consider much it. After clarification with you, we want to modify the EL3 firmware to follow this rule.

>
>
>> 1. If the HCR_EL2.{AMO,TGE} is set,
>
> If one or the other of these bits is set: (AMO==1 || TGE==1)
>
>> which means the SError should route to EL2,
>> When system happens SError and trap to EL3, If EL3 find
>> HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both set,
>> and find this SError come from EL2, it will not deliver an SError:
>> store the RAS error in the BERT and 'reboot'; but if
>> it find that this SError come from EL1 or EL0, it also need to deliver
>> an SError, right?
>
> Yes.
>
>
>> 2. If the HCR_EL2.{AMO,TGE} is not set,
>
> If neither of these bits is set: (AMO==0 && TGE == 0)
>
>> which means the SError should route to EL1,
>> When system happens SError and trap to EL3, If EL3 find
>> HCR_EL2.{AMO,TGE} and SPSR_EL3.A are both not set,
>
> (I'm reading this as all three of these bits are clear)
sorry, it is a typo issue.
it should be HCR_EL2.AMO and HCR_EL2.TGE are both clear, but SPSR_EL3.A is set.

>
>> and find this SError come from EL1, it will not deliver an SError:
>> store the RAS error in the BERT and 'reboot';
>
> No, (AMO==0 && TGE == 0) means SError is routed to EL1, this exception
> interrupted EL1 and the A bit was clear, so EL1 can take an SError.

Agree.

>
> The two cases here are:
> AMO==0,TGE==0 means SError should be routed to EL1. If SPSR_EL3 says the
> exception interrupted EL1 and the A bit was set, you need to do the BERT trick.

>
> If SPSR_EL3 says the exception interrupted EL2, you need to do the BERT trick
"BERT trick" is storing the RAS error in the BERT and 'reboot, right?

> regardless of the A bit, as SError is implicitly masked by running at a higher
> exception level than it was routed to.


>
>
>>From your v11 reply:
>> 2. The exception came from the EL that SError should not be routed
>> to(according to hcr_EL2.{AMO, TGE}),even though the PSTATE.A was set,EL3
>> firmware still deliver SError
>
> (this is re-iterating the two-cases above:)
> 'not be routed to' is one of two things: Route-to-EL2+interruted-EL1, or
> Route-to-EL1+interrupted-EL2.
>
> Route-to-EL2+interrupted-EL1 is fine, regardless of SPSR_EL3.A the emulated
> SError can be delivered to EL2, as EL2 can't mask SError when executing at a
> lower EL.
Agree.

>
> Route-to-EL1+interrupted-EL2 is the problem. SError is implicitly masked by
> running at a higher EL. Regardless of SPSR_EL3.A, the emulated SError can not be
> delivered.
"can not be delivered" means storing the RAS error in the BERT and 'reboot, right?
In the Table D1-15 in "D1.14.2 Asynchronous exception masking", for the case, it is "C"
"C"means SError is not taken regardless of the value of the Process state interrupt mask.
for this case, whether it will be unsafe if BIOS directly reboot?


> KVM does this on the way out of a guest, if an SError occurs during this time
> the CPU will wait until execution returns to EL1 before delivering the SError.
> Your firmware has to do the same.
>
> Table D1-15 in "D1.14.2 Asynchronous exception masking" has a table with all the
> combinations. The ARM-ARM is what we need to match with this behaviour.
>
>
>> but if it find that this SError come from EL0, it also need to deliver an
>> SError, right?
>
> I thought interrupted-EL0 could always be delivered: but re-reading the
> ARM-ARM's "D1.14.2 Asynchronous exception masking", if asynchronous exceptions
> are routed to EL1 then EL0&EL1 are treated the same.
> So if SError is routed to EL1, the exception interrupted EL0, and SPSR_EL3.A was
> set, you still can't deliver the emulated-SError you have to do the BERT-trick.
> Linux doesn't do this today, but another OS might (e.g. UEFI), and we might do
> this in the future.
For this case, whether it will be unsafe if BIOS directly reboot?
For example, for some test purpose, EL0 set PSTATE.A, just right happen SError, then BIOS will reboot system.
I am afraid that system will become unsafe because BIOS will reboot system.

>
> This is really tricky for firmware to get right. Another alternative would be to
> put the CPER records in a Polled buffer, unless something needs doing right now,
> in which case a BERT-reboot is probably best.

In summary:

[1]:
Route-to-EL1 + interrupted-EL1, if SPSR_EL3.A is set, EL3 firmware can't deliver the emulated-SError, store the RAS error in the BERT and 'reboot.
Route-to-EL2 + interrupted-EL2, if SPSR_EL3.A is set, EL3 firmware can't deliver the emulated-SError, store the RAS error in the BERT and 'reboot.

I agree above two cases, but maybe we need to ensure that only in EL2 SError handler and EL1 SError exception handler the PSTATE.A is set, for other places, the PSTATE.A is not set.
then BIOS can know this is nested-SError when find the SPSR_EL3.A is set, can we ensure that in the Linux kernel code and KVM code?

[2]:
Route-to-EL2 + interrupted-EL1, regardless of SPSR_EL3.A the emulated SError can be delivered to EL2.
Route-to-EL2 + interrupted-EL0, regardless of SPSR_EL3.A the emulated SError can be delivered to EL2.

I agree above two cases.

[3]:
Route-to-EL1+interrupted-EL0, if SPSR_EL3.A is set, EL3 firmware can't deliver the emulated-SError, store the RAS error in the BERT and 'reboot
Route-to-EL1+interrupted-EL2, EL3 firmware store the RAS error in the BERT and 'reboot regardless of SPSR_EL3.A.

For above two cases, I am worried system will become unsafe because BIOS will reboot system.


>
>
> Thanks,
>
> James
>
> .
>