LinuxLists.cc - [PATCH v2 0/3] arm64: acpi: apei: handle SEI notification type for ARMv8

2017-03-08 04:15:18

Subject: [PATCH v2 0/3] arm64: acpi: apei: handle SEI notification type for ARMv8

2017-03-08 04:14:20

Subject: [PATCH v2 2/3] acpi: apei: handle SEI notification type for ARMv8

ARM APEI extension proposal added SEI (asynchronous SError interrupt)
notification type for ARMv8.

Add a new GHES error source handling function for SEI. In firmware
first mode, if an error source's notification type is SEI. Then GHES
could parse and report the detail error information.

Signed-off-by: Xie XiuQi <[email protected]>
---
arch/arm64/kernel/traps.c | 13 +++++++++++
drivers/acpi/apei/Kconfig | 14 ++++++++++++
drivers/acpi/apei/ghes.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-
include/acpi/ghes.h | 1 +
4 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e52be6a..65dbfa9 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -47,6 +47,8 @@
#include <asm/system_misc.h>
#include <asm/sysreg.h>

+#include <acpi/ghes.h>
+
static const char *handler[]= {
"Synchronous Abort",
"IRQ",
@@ -625,6 +627,17 @@ asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr)
handler[reason], smp_processor_id(), esr,
esr_get_class_string(esr));

+ /*
+ * Synchronous aborts may interrupt code which had interrupts masked.
+ * Before calling out into the wider kernel tell the interested
+ * subsystems.
+ */
+ if (IS_ENABLED(CONFIG_ACPI_APEI_SEI) && ESR_ELx_EC(esr) == ESR_ELx_EC_SERROR) {
+ nmi_enter();
+ ghes_notify_sei();
+ nmi_exit();
+ }
+
die("Oops - bad mode", regs, 0);
local_irq_disable();
panic("bad mode");
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 8ff0518..277deb8 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -55,6 +55,20 @@ config ACPI_APEI_SEA
option allows the OS to look for such hardware error record, and
take appropriate action.

+config ACPI_APEI_SEI
+ bool "APEI Asynchronous SError Interrupt logging/recovering support"
+ depends on ARM64 && ACPI_APEI_GHES
+ help
+ This option should be enabled if the system supports
+ firmware first handling of SEI (asynchronous SError interrupt).
+
+ SEI happens with invalid instruction access or asynchronous exceptions
+ on ARMv8 systems. If a system supports firmware first handling of SEI,
+ the platform analyzes and handles hardware error notifications from
+ SEI, and it may then form a HW error record for the OS to parse and
+ handle. This option allows the OS to look for such hardware error
+ record, and take appropriate action.
+
config ACPI_APEI_MEMORY_FAILURE
bool "APEI memory error recovering support"
depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ea5b118..251d7e0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -851,6 +851,50 @@ static inline void ghes_sea_remove(struct ghes *ghes)
}
#endif /* CONFIG_ACPI_APEI_SEA */

+#ifdef CONFIG_ACPI_APEI_SEI
+static LIST_HEAD(ghes_sei);
+
+void ghes_notify_sei(void)
+{
+ struct ghes *ghes;
+
+ /*
+ * synchronize_rcu() will wait for nmi_exit(), so no need to
+ * rcu_read_lock().
+ */
+ list_for_each_entry_rcu(ghes, &ghes_sei, list) {
+ ghes_proc(ghes);
+ }
+}
+
+static void ghes_sei_add(struct ghes *ghes)
+{
+ mutex_lock(&ghes_list_mutex);
+ list_add_rcu(&ghes->list, &ghes_sei);
+ mutex_unlock(&ghes_list_mutex);
+}
+
+static void ghes_sei_remove(struct ghes *ghes)
+{
+ mutex_lock(&ghes_list_mutex);
+ list_del_rcu(&ghes->list);
+ mutex_unlock(&ghes_list_mutex);
+ synchronize_rcu();
+}
+#else /* CONFIG_ACPI_APEI_SEI */
+static inline void ghes_sei_add(struct ghes *ghes)
+{
+ pr_err(GHES_PFX "ID: %d, trying to add SEI notification which is not supported\n",
+ ghes->generic->header.source_id);
+}
+
+static inline void ghes_sei_remove(struct ghes *ghes)
+{
+ pr_err(GHES_PFX "ID: %d, trying to remove SEI notification which is not supported\n",
+ ghes->generic->header.source_id);
+}
+#endif /* CONFIG_HAVE_ACPI_APEI_SEI */
+
#ifdef CONFIG_HAVE_ACPI_APEI_NMI
/*
* printk is not safe in NMI context. So in NMI handler, we allocate
@@ -1100,6 +1144,13 @@ static int ghes_probe(struct platform_device *ghes_dev)
goto err;
}
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_SEI)) {
+ pr_warn(GHES_PFX "Generic hardware error source: %d notified via SEI is not supported!\n",
+ generic->header.source_id);
+ goto err;
+ }
+ break;
case ACPI_HEST_NOTIFY_NMI:
if (!IS_ENABLED(CONFIG_HAVE_ACPI_APEI_NMI)) {
pr_warn(GHES_PFX "Generic hardware error source: %d notified via NMI interrupt is not supported!\n",
@@ -1112,7 +1163,6 @@ static int ghes_probe(struct platform_device *ghes_dev)
generic->header.source_id);
goto err;
case ACPI_HEST_NOTIFY_GPIO:
- case ACPI_HEST_NOTIFY_SEI:
case ACPI_HEST_NOTIFY_GSIV:
pr_warn(GHES_PFX "Generic hardware error source: %d notified via notification type %u is not supported\n",
generic->header.source_id, generic->header.source_id);
@@ -1175,6 +1225,9 @@ static int ghes_probe(struct platform_device *ghes_dev)
case ACPI_HEST_NOTIFY_SEA:
ghes_sea_add(ghes);
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ ghes_sei_add(ghes);
+ break;
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_add(ghes);
break;
@@ -1220,6 +1273,9 @@ static int ghes_remove(struct platform_device *ghes_dev)
case ACPI_HEST_NOTIFY_SEA:
ghes_sea_remove(ghes);
break;
+ case ACPI_HEST_NOTIFY_SEI:
+ ghes_sei_remove(ghes);
+ break;
case ACPI_HEST_NOTIFY_NMI:
ghes_nmi_remove(ghes);
break;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 18bc935..7554658 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -100,5 +100,6 @@ static inline void *acpi_hest_generic_data_payload(struct acpi_hest_generic_data
}

void ghes_notify_sea(void);
+void ghes_notify_sei(void);

#endif /* GHES_H */
--
1.8.3.1

2017-03-08 04:15:15

by Xie XiuQi

[permalink] [raw]

Subject: [PATCH v2 1/3] arm64: RAS: add ras extension runtime detection

According to <<RAS Extension PRD03>> document, we add RAS extension
feature runtime detection, which would be used for error recovery
in the future.

Signed-off-by: Xie XiuQi <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/sysreg.h | 2 ++
arch/arm64/kernel/cpufeature.c | 11 +++++++++++
3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index fb78a5d..3847cf8 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -37,7 +37,8 @@
#define ARM64_HAS_NO_FPSIMD 16
#define ARM64_WORKAROUND_REPEAT_TLBI 17
#define ARM64_WORKAROUND_QCOM_FALKOR_E1003 18
+#define ARM64_HAS_RAS_EXTN 19

-#define ARM64_NCAPS 19
+#define ARM64_NCAPS 20

#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index ac24b6e..32964c7 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -157,6 +157,7 @@
#define ID_AA64ISAR0_AES_SHIFT 4

/* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT 28
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_ASIMD_SHIFT 20
#define ID_AA64PFR0_FP_SHIFT 16
@@ -165,6 +166,7 @@
#define ID_AA64PFR0_EL1_SHIFT 4
#define ID_AA64PFR0_EL0_SHIFT 0

+#define ID_AA64PFR0_RAS_V1 0x1
#define ID_AA64PFR0_FP_NI 0xf
#define ID_AA64PFR0_FP_SUPPORTED 0x0
#define ID_AA64PFR0_ASIMD_NI 0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index abda8e8..b0fb81e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -98,6 +98,7 @@
};

static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -860,6 +861,16 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
.min_field_value = 0,
.matches = has_no_fpsimd,
},
+ {
+ .desc = "ARM64 RAS Extension Support",
+ .capability = ARM64_HAS_RAS_EXTN,
+ .def_scope = SCOPE_SYSTEM,
+ .matches = has_cpuid_feature,
+ .sys_reg = SYS_ID_AA64PFR0_EL1,
+ .sign = FTR_UNSIGNED,
+ .field_pos = ID_AA64PFR0_RAS_SHIFT,
+ .min_field_value = ID_AA64PFR0_RAS_V1,
+ },
{},
};

--
1.8.3.1

2017-03-08 04:16:52

by Xie XiuQi

[permalink] [raw]

Subject: [PATCH v2 3/3] arm64: KVM: add guest SEI support

Add ghes handling for SEI so that the host kernel could parse and
report detailed error information for SEI which occur in the guest
kernel.

Signed-off-by: Xie XiuQi <[email protected]>
---
arch/arm64/include/asm/system_misc.h | 1 +
arch/arm64/kernel/traps.c | 18 ++++++++++++++++++
arch/arm64/kvm/handle_exit.c | 22 ++++++++++++++++++++--
3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 5b2cecd..d68d61f 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -59,5 +59,6 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
#endif /* __ASSEMBLY__ */

int handle_guest_sea(unsigned long addr, unsigned int esr);
+int handle_guest_sei(unsigned long addr, unsigned int esr);

#endif /* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 65dbfa9..cf9f569 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -616,6 +616,24 @@ const char *esr_get_class_string(u32 esr)
}

/*
+ * Handle asynchronous SError interrupt that occur in a guest kernel.
+ */
+int handle_guest_sei(unsigned long addr, unsigned int esr)
+{
+ /*
+ * synchronize_rcu() will wait for nmi_exit(), so no need to
+ * rcu_read_lock().
+ */
+ if(IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
+ rcu_read_lock();
+ ghes_notify_sei();
+ rcu_read_unlock();
+ }
+
+ return 0;
+}
+
+/*
* bad_mode handles the impossible case in the exception vector. This is always
* fatal.
*/
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 1bfe30d..8c7dba0 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -28,6 +28,7 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_mmu.h>
#include <asm/kvm_psci.h>
+#include <asm/system_misc.h>

#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -172,6 +173,23 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
return arm_exit_handlers[hsr_ec];
}

+static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+ unsigned long fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
+
+ if (handle_guest_sei((unsigned long)fault_ipa,
+ kvm_vcpu_get_hsr(vcpu))) {
+ kvm_err("Failed to handle guest SEI, FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
+ kvm_vcpu_trap_get_class(vcpu),
+ (unsigned long)kvm_vcpu_trap_get_fault(vcpu),
+ (unsigned long)kvm_vcpu_get_hsr(vcpu));
+ }
+
+ kvm_inject_vabt(vcpu);
+
+ return 0;
+}
+
/*
* Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
* proper exit to userspace.
@@ -195,7 +213,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
*vcpu_pc(vcpu) -= adj;
}

- kvm_inject_vabt(vcpu);
+ kvm_handle_guest_sei(vcpu, run);
return 1;
}

@@ -205,7 +223,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
case ARM_EXCEPTION_IRQ:
return 1;
case ARM_EXCEPTION_EL1_SERROR:
- kvm_inject_vabt(vcpu);
+ kvm_handle_guest_sei(vcpu, run);
return 1;
case ARM_EXCEPTION_TRAP:
/*
--
1.8.3.1

2017-03-14 09:45:26

by James Morse

[permalink] [raw]

Subject: Re: [PATCH v2 3/3] arm64: KVM: add guest SEI support

Hi Xie XiuQi,

On 08/03/17 04:09, Xie XiuQi wrote:
> Add ghes handling for SEI so that the host kernel could parse and
> report detailed error information for SEI which occur in the guest
> kernel.

How does this interact with Synchronous External Abort as a notify method?
Both of these take the in_nmi() path through APEI.

SError Interrupts are masked during exception processing, so we don't have to
worry about them becoming recursive.
For SEA the firmware has to promise not to invoke another SEA while we are still
processing the first, and SEI will be masked if we took it as an exception.

What happens if we take an SEA while processing another event notified via SEI?
Can this happen on your platform? Can someone else build a platform where this
happens? Does the GHES APEI code need to be able to handle this?

If we need to support both at the same time we will need to change Linux's APEI
code to reserve a page of virtual address space per GHES entry, instead of one
for NMI and one for IRQ.

> diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
> index 5b2cecd..d68d61f 100644
> --- a/arch/arm64/include/asm/system_misc.h
> +++ b/arch/arm64/include/asm/system_misc.h
> @@ -59,5 +59,6 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
> #endif /* __ASSEMBLY__ */
>
> int handle_guest_sea(unsigned long addr, unsigned int esr);
> +int handle_guest_sei(unsigned long addr, unsigned int esr);
>
> #endif /* __ASM_SYSTEM_MISC_H */
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 65dbfa9..cf9f569 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -616,6 +616,24 @@ const char *esr_get_class_string(u32 esr)
> }
>
> /*
> + * Handle asynchronous SError interrupt that occur in a guest kernel.
> + */
> +int handle_guest_sei(unsigned long addr, unsigned int esr)
> +{
> + /*
> + * synchronize_rcu() will wait for nmi_exit(), so no need to
> + * rcu_read_lock().
> + */

This comment was true for patch 4 of Tyler's series, but not-true when we got to
patch 10. Please remove it,

> + if(IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
> + rcu_read_lock();

Please put the rcu calls against the thing using them.

> + ghes_notify_sei();
> + rcu_read_unlock();
> + }
> +
> + return 0;
> +}
> +
> +/*
> * bad_mode handles the impossible case in the exception vector. This is always
> * fatal.
> */

> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 1bfe30d..8c7dba0 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -172,6 +173,23 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
> return arm_exit_handlers[hsr_ec];
> }
>
> +static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + unsigned long fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
> +
> + if (handle_guest_sei((unsigned long)fault_ipa,
> + kvm_vcpu_get_hsr(vcpu))) {
> + kvm_err("Failed to handle guest SEI, FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
> + kvm_vcpu_trap_get_class(vcpu),
> + (unsigned long)kvm_vcpu_trap_get_fault(vcpu),
> + (unsigned long)kvm_vcpu_get_hsr(vcpu));
> + }
> +

> + kvm_inject_vabt(vcpu);

Always inject an SError Interrupt? How should this work when Qemu supports
guest-RAS too?

If we do want to kill the guest for RAS-related reasons we should go via
user-space to allow Qemu to handle the error and potentially notify the guest.
This would let Qemu generate CPER records for the guest, mirroring what just
happened with the firmware-generated records.

As on the other thread: if there were CPER records processed by
handle_guest_sei() we should continue as normal as the fault was handled in some
way.
If there were no CPER records, (or the system doesn't support SEI as a GHES
notification mechanism), then yes we should still call kvm_inject_vabt().

A suggestion of how do this: [0], if you have a better suggestion please chime in!

Thanks,

James

[0] https://www.spinics.net/lists/kvm/msg146131.html

2017-03-20 07:55:45

by Xie XiuQi

[permalink] [raw]

Subject: Re: [PATCH v2 3/3] arm64: KVM: add guest SEI support

Hi James,

Thank you for your comments and detail explanation.

On 2017/3/14 17:45, James Morse wrote:
> Hi Xie XiuQi,
>
> On 08/03/17 04:09, Xie XiuQi wrote:
>> Add ghes handling for SEI so that the host kernel could parse and
>> report detailed error information for SEI which occur in the guest
>> kernel.
>
> How does this interact with Synchronous External Abort as a notify method?
> Both of these take the in_nmi() path through APEI.
>
> SError Interrupts are masked during exception processing, so we don't have to
> worry about them becoming recursive.

If we use firmware first mode, SEI will be routed to EL3 first, in which mode
the interrupt cannot be masked by the PSTATE.{A,I,F}.

> For SEA the firmware has to promise not to invoke another SEA while we are still
> processing the first, and SEI will be masked if we took it as an exception.
>

Yes, for SEI the firmware should also promise not to invoke another SEI while the
first SEI processing.

But I have a question here, how to handle this case: on the same cpu, another SEA
is taken while we are processing the first SEA. Should firmware detect this case and
reset the system directly?

The same question is also for SEI.

> What happens if we take an SEA while processing another event notified via SEI?
> Can this happen on your platform? Can someone else build a platform where this
> happens? Does the GHES APEI code need to be able to handle this?

IMO, the system should be panic if we take an SEA while processing another event
notified via SEI on the same cpu, and it's not necessary to parse the GHES for the
second SEA. However, if on different cpu, it might be taken simultaneously.

>
> If we need to support both at the same time we will need to change Linux's APEI
> code to reserve a page of virtual address space per GHES entry, instead of one
> for NMI and one for IRQ.
>

We cannot assume that firmware could prevent the SEA notify to OS while SEI is
processing on different cpu. Because firmware use two different GHES for SEA and SEI.
Yes, indeed, we could reserve another virtual address space for the second SEA or SEI.

All above, I just analyze the spec and discuss with BIOS team, but I have no platform
to test now. Any comments is welcome.

>
>> diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
>> index 5b2cecd..d68d61f 100644
>> --- a/arch/arm64/include/asm/system_misc.h
>> +++ b/arch/arm64/include/asm/system_misc.h
>> @@ -59,5 +59,6 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
>> #endif /* __ASSEMBLY__ */
>>
>> int handle_guest_sea(unsigned long addr, unsigned int esr);
>> +int handle_guest_sei(unsigned long addr, unsigned int esr);
>>
>> #endif /* __ASM_SYSTEM_MISC_H */
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index 65dbfa9..cf9f569 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -616,6 +616,24 @@ const char *esr_get_class_string(u32 esr)
>> }
>>
>> /*
>> + * Handle asynchronous SError interrupt that occur in a guest kernel.
>> + */
>> +int handle_guest_sei(unsigned long addr, unsigned int esr)
>> +{
>> + /*
>> + * synchronize_rcu() will wait for nmi_exit(), so no need to
>> + * rcu_read_lock().
>> + */
>
> This comment was true for patch 4 of Tyler's series, but not-true when we got to
> patch 10. Please remove it,

OK, thanks.

>
>
>> + if(IS_ENABLED(CONFIG_ACPI_APEI_SEI)) {
>> + rcu_read_lock();
>
> Please put the rcu calls against the thing using them.
>
>
>> + ghes_notify_sei();
>> + rcu_read_unlock();
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> * bad_mode handles the impossible case in the exception vector. This is always
>> * fatal.
>> */
>
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index 1bfe30d..8c7dba0 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -172,6 +173,23 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
>> return arm_exit_handlers[hsr_ec];
>> }
>>
>> +static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> + unsigned long fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
>> +
>> + if (handle_guest_sei((unsigned long)fault_ipa,
>> + kvm_vcpu_get_hsr(vcpu))) {
>> + kvm_err("Failed to handle guest SEI, FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
>> + kvm_vcpu_trap_get_class(vcpu),
>> + (unsigned long)kvm_vcpu_trap_get_fault(vcpu),
>> + (unsigned long)kvm_vcpu_get_hsr(vcpu));
>> + }
>> +
>
>> + kvm_inject_vabt(vcpu);
>
> Always inject an SError Interrupt? How should this work when Qemu supports
> guest-RAS too?
>
> If we do want to kill the guest for RAS-related reasons we should go via
> user-space to allow Qemu to handle the error and potentially notify the guest.
> This would let Qemu generate CPER records for the guest, mirroring what just
> happened with the firmware-generated records.
>
> As on the other thread: if there were CPER records processed by
> handle_guest_sei() we should continue as normal as the fault was handled in some
> way.
> If there were no CPER records, (or the system doesn't support SEI as a GHES
> notification mechanism), then yes we should still call kvm_inject_vabt().
>
> A suggestion of how do this: [0], if you have a better suggestion please chime in!

We need use ESB to isolate the asynchronous error, so that, recovery from SEI is possible then.
I'll do more analyze at spec & code.

--
Thanks,
Xie XiuQi

>
>
> Thanks,
>
> James
>
>
> [0] https://www.spinics.net/lists/kvm/msg146131.html
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>

2017-03-20 13:56:27

by James Morse

[permalink] [raw]

Subject: Re: [PATCH v2 3/3] arm64: KVM: add guest SEI support

Hi Xie XiuQi,

On 20/03/17 07:48, Xie XiuQi wrote:
> On 2017/3/14 17:45, James Morse wrote:
>> On 08/03/17 04:09, Xie XiuQi wrote:
>>> Add ghes handling for SEI so that the host kernel could parse and
>>> report detailed error information for SEI which occur in the guest
>>> kernel.
>>
>> How does this interact with Synchronous External Abort as a notify method?
>> Both of these take the in_nmi() path through APEI.
>>
>> SError Interrupts are masked during exception processing, so we don't have to
>> worry about them becoming recursive.
>
> If we use firmware first mode, SEI will be routed to EL3 first, in which mode
> the interrupt cannot be masked by the PSTATE.{A,I,F}.
>
>> For SEA the firmware has to promise not to invoke another SEA while we are still
>> processing the first, and SEI will be masked if we took it as an exception.
>>
>
> Yes, for SEI the firmware should also promise not to invoke another SEI while the
> first SEI processing.

Because the OS can mask the exception while it does the work this should be easy.

> But I have a question here, how to handle this case: on the same cpu, another SEA
> is taken while we are processing the first SEA. Should firmware detect this case and
> reset the system directly?

For SEA firmware has to only deliver one at a time. Tyler's comment[0] on this was:
Tyler Baicar wrote:
> Firmware that supports the new specs should only generate one of these at a
> time, it will wait for the ack from kernel before sending a second error
> (patch 1 of this series).

I think this is what the read ack register in GHESv2 is for.

What should happen here is up to firmware. System reset sounds sensible, if
possible it would be good if any such firmware could write both sets of error
records somewhere persistent and hand them to the OS via the BERT on the next boot.

> The same question is also for SEI.

I think SEI is different because it can be masked. For KVM we already have
kvm_inject_vabt() which sets the VSE bit in HCR_EL2. The hardware will deliver
an SError Interrupt to the guest when it next runs with SError unmasked.

If the guest was already running the APEI SEI code it should have SError masked
until its finished.

This should be the same for firmware, I don't know enough about how physical
SError is triggered.

>> What happens if we take an SEA while processing another event notified via SEI?
>> Can this happen on your platform? Can someone else build a platform where this
>> happens? Does the GHES APEI code need to be able to handle this?
>
> IMO, the system should be panic if we take an SEA while processing another event
> notified via SEI on the same cpu, and it's not necessary to parse the GHES for the
> second SEA. However, if on different cpu, it might be taken simultaneously.

For a different CPU we will spin waiting for the APEI locks, this should all
work properly today.

How can we know that SEA interrupted a CPU that was running the APEI SEI code?

The CPU masks SError when we take an exception so we can't use PSTATE.A to tell.
Judging from the range of PC values or setting some per-cpu variable is likely
to get messy.

I think the cleanest thing is to initially make SEI and SEA mutually exclusive
using Kconfig, then refactor the APEI GHES code to allow interactions like this:

>> If we need to support both at the same time we will need to change Linux's APEI
>> code to reserve a page of virtual address space per GHES entry, instead of one
>> for NMI and one for IRQ.

This way it doesn't matter if SEA interrupts SEI. I will have a go at writing this.

> We cannot assume that firmware could prevent the SEA notify to OS while SEI is
> processing on different cpu. Because firmware use two different GHES for SEA and SEI.

I agree. We should handle any sequence of APEI notify methods that the hardware
allows to happen.

Thanks,

James

[0] https://www.spinics.net/lists/arm-kernel/msg567837.html