Currently the kexec kernel can panic or hang due to 2 causes:
1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
old VP Assist Pages when the kexec kernel runs. The same issue is fixed
for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
VP assist page for hibernation"). Now fix it for kexec.
2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
can still try to access the hypercall page and cause panic. The workaround
"hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
hyperv_cleanup() to a better place.
Signed-off-by: Dexuan Cui <[email protected]>
---
Changes in v2:
Improved the commit log as Michael Kelley suggested.
No change to v1 otherwise.
arch/x86/hyperv/hv_init.c | 4 ++++
arch/x86/include/asm/mshyperv.h | 2 ++
arch/x86/kernel/cpu/mshyperv.c | 18 ++++++++++++++++++
drivers/hv/vmbus_drv.c | 2 --
4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e04d90af4c27..4638a52d8eae 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -16,6 +16,7 @@
#include <asm/hyperv-tlfs.h>
#include <asm/mshyperv.h>
#include <asm/idtentry.h>
+#include <linux/kexec.h>
#include <linux/version.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
@@ -26,6 +27,8 @@
#include <linux/syscore_ops.h>
#include <clocksource/hyperv_timer.h>
+int hyperv_init_cpuhp;
+
void *hv_hypercall_pg;
EXPORT_SYMBOL_GPL(hv_hypercall_pg);
@@ -401,6 +404,7 @@ void __init hyperv_init(void)
register_syscore_ops(&hv_syscore_ops);
+ hyperv_init_cpuhp = cpuhp;
return;
remove_cpuhp_state:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ffc289992d1b..30f76b966857 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
#if IS_ENABLED(CONFIG_HYPERV)
+extern int hyperv_init_cpuhp;
+
extern void *hv_hypercall_pg;
extern void __percpu **hyperv_pcpu_input_arg;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index f628e3dc150f..43b54bef5448 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
{
if (kexec_in_progress && hv_kexec_handler)
hv_kexec_handler();
+
+ /*
+ * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
+ * corrupts the old VP Assist Pages and can crash the kexec kernel.
+ */
+ if (kexec_in_progress && hyperv_init_cpuhp > 0)
+ cpuhp_remove_state(hyperv_init_cpuhp);
+
+ /* The function calls stop_other_cpus(). */
native_machine_shutdown();
+
+ /* Disable the hypercall page when there is only 1 active CPU. */
+ if (kexec_in_progress)
+ hyperv_cleanup();
}
static void hv_machine_crash_shutdown(struct pt_regs *regs)
{
if (hv_crash_handler)
hv_crash_handler(regs);
+
+ /* The function calls crash_smp_send_stop(). */
native_machine_crash_shutdown(regs);
+
+ /* Disable the hypercall page when there is only 1 active CPU. */
+ hyperv_cleanup();
}
#endif /* CONFIG_KEXEC_CORE */
#endif /* CONFIG_HYPERV */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 502f8cd95f6d..d491fdcee61f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2550,7 +2550,6 @@ static void hv_kexec_handler(void)
/* Make sure conn_state is set as hv_synic_cleanup checks for it */
mb();
cpuhp_remove_state(hyperv_cpuhp_online);
- hyperv_cleanup();
};
static void hv_crash_handler(struct pt_regs *regs)
@@ -2566,7 +2565,6 @@ static void hv_crash_handler(struct pt_regs *regs)
cpu = smp_processor_id();
hv_stimer_cleanup(cpu);
hv_synic_disable_regs(cpu);
- hyperv_cleanup();
};
static int hv_synic_suspend(void)
--
2.19.1
From: Dexuan Cui <[email protected]> Sent: Monday, December 21, 2020 10:56 PM
>
> Currently the kexec kernel can panic or hang due to 2 causes:
>
> 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> old VP Assist Pages when the kexec kernel runs. The same issue is fixed
> for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
> VP assist page for hibernation"). Now fix it for kexec.
>
> 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> can still try to access the hypercall page and cause panic. The workaround
> "hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
> hyperv_cleanup() to a better place.
>
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
>
> Changes in v2:
> Improved the commit log as Michael Kelley suggested.
> No change to v1 otherwise.
>
> arch/x86/hyperv/hv_init.c | 4 ++++
> arch/x86/include/asm/mshyperv.h | 2 ++
> arch/x86/kernel/cpu/mshyperv.c | 18 ++++++++++++++++++
> drivers/hv/vmbus_drv.c | 2 --
> 4 files changed, 24 insertions(+), 2 deletions(-)
>
Reviewed-by: Michael Kelley <[email protected]>
On Mon, Dec 21, 2020 at 10:55:41PM -0800, Dexuan Cui wrote:
> Currently the kexec kernel can panic or hang due to 2 causes:
>
> 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> old VP Assist Pages when the kexec kernel runs. The same issue is fixed
> for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
> VP assist page for hibernation"). Now fix it for kexec.
>
> 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> can still try to access the hypercall page and cause panic. The workaround
> "hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
> hyperv_cleanup() to a better place.
>
> Signed-off-by: Dexuan Cui <[email protected]>
The code looks a bit intrusive. On the other hand, this does sound like
something needs backporting for older stable kernels.
On a more practical note, I need to decide whether to take it via
hyperv-fixes or hyperv-next. What do you think?
Wei.
From: Wei Liu <[email protected]> Sent: Tuesday, January 5, 2021 5:04 AM
>
> On Mon, Dec 21, 2020 at 10:55:41PM -0800, Dexuan Cui wrote:
> > Currently the kexec kernel can panic or hang due to 2 causes:
> >
> > 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> > old VP Assist Pages when the kexec kernel runs. The same issue is fixed
> > for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
> > VP assist page for hibernation"). Now fix it for kexec.
> >
> > 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> > are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> > between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> > can still try to access the hypercall page and cause panic. The workaround
> > "hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
> > hyperv_cleanup() to a better place.
> >
> > Signed-off-by: Dexuan Cui <[email protected]>
>
> The code looks a bit intrusive. On the other hand, this does sound like
> something needs backporting for older stable kernels.
>
> On a more practical note, I need to decide whether to take it via
> hyperv-fixes or hyperv-next. What do you think?
>
I'd like to see this in hyperv-fixes and backported to older stable kernels.
In its current form, the kexec path in a Hyper-V guest has multiple problems
that make it unreliable, so the downside risk of taking these fixes is minimal
while the upside benefit is considerable.
Michael
On Tue, Jan 05, 2021 at 04:39:38PM +0000, Michael Kelley wrote:
> From: Wei Liu <[email protected]> Sent: Tuesday, January 5, 2021 5:04 AM
> >
> > On Mon, Dec 21, 2020 at 10:55:41PM -0800, Dexuan Cui wrote:
> > > Currently the kexec kernel can panic or hang due to 2 causes:
> > >
> > > 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> > > old VP Assist Pages when the kexec kernel runs. The same issue is fixed
> > > for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
> > > VP assist page for hibernation"). Now fix it for kexec.
> > >
> > > 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> > > are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> > > between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> > > can still try to access the hypercall page and cause panic. The workaround
> > > "hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
> > > hyperv_cleanup() to a better place.
> > >
> > > Signed-off-by: Dexuan Cui <[email protected]>
> >
> > The code looks a bit intrusive. On the other hand, this does sound like
> > something needs backporting for older stable kernels.
> >
> > On a more practical note, I need to decide whether to take it via
> > hyperv-fixes or hyperv-next. What do you think?
> >
>
> I'd like to see this in hyperv-fixes and backported to older stable kernels.
> In its current form, the kexec path in a Hyper-V guest has multiple problems
> that make it unreliable, so the downside risk of taking these fixes is minimal
> while the upside benefit is considerable.
Applied to hyperv-fixes.
Wei.