From: Chris Co <[email protected]>
When invoking kexec() on a Linux guest running on a Hyper-V host, the
kernel panics.

    RIP: 0010:cpuhp_issue_call+0x137/0x140
    Call Trace:
     __cpuhp_remove_state_cpuslocked+0x99/0x100
     __cpuhp_remove_state+0x1c/0x30
     hv_kexec_handler+0x23/0x30 [hv_vmbus]
     hv_machine_shutdown+0x1e/0x30
     machine_shutdown+0x10/0x20
     kernel_kexec+0x6d/0x96
     __do_sys_reboot+0x1ef/0x230
     __x64_sys_reboot+0x1d/0x20
     do_syscall_64+0x6b/0x3d8
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens because the hv_synic_cleanup() callback returns -EBUSY to
cpuhp_issue_call() when tearing down the VMBUS_CONNECT_CPU, even when
vmbus_connection.conn_state is DISCONNECTED. hv_synic_cleanup() should
succeed in that case.

The fix adds a check for vmbus_connection.conn_state == CONNECTED on
the VMBUS_CONNECT_CPU and returns -EBUSY only when both conditions
hold. This way the kexec() path can still shut everything down while
preserving the existing behavior of preventing CPU offlining on the
VMBUS_CONNECT_CPU while the VM is running.

Fixes: 8a857c55420f29 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
Signed-off-by: Chris Co <[email protected]>
Cc: [email protected]
---
drivers/hv/hv.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 0cde10fe0e71..f202ac7f4b3d 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -244,9 +244,13 @@ int hv_synic_cleanup(unsigned int cpu)
 
 	/*
 	 * Hyper-V does not provide a way to change the connect CPU once
-	 * it is set; we must prevent the connect CPU from going offline.
+	 * it is set; we must prevent the connect CPU from going offline
+	 * while the VM is running normally. But in the panic or kexec()
+	 * path where the vmbus is already disconnected, the CPU must be
+	 * allowed to shut down.
 	 */
-	if (cpu == VMBUS_CONNECT_CPU)
+	if (cpu == VMBUS_CONNECT_CPU &&
+	    vmbus_connection.conn_state == CONNECTED)
 		return -EBUSY;
 
 	/*
--
2.17.1
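
For readers following the thread, the condition change in the hunk above
can be modelled in plain userspace C. This is only a sketch and not kernel
code: the enum, the VMBUS_CONNECT_CPU value and the cleanup_check_old() /
cleanup_check_new() helpers are hypothetical stand-ins written for this
illustration, and the real hv_synic_cleanup() does more than this one check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel definitions (assumption). */
#define VMBUS_CONNECT_CPU 0u

enum conn_state { DISCONNECTED, CONNECTING, CONNECTED, DISCONNECTING };

/* Pre-patch check: teardown of the connect CPU is always refused. */
int cleanup_check_old(unsigned int cpu, enum conn_state state)
{
	(void)state;	/* connection state was not consulted */
	if (cpu == VMBUS_CONNECT_CPU)
		return -EBUSY;
	return 0;
}

/* Post-patch check: refuse only while the VMBus is still connected. */
int cleanup_check_new(unsigned int cpu, enum conn_state state)
{
	if (cpu == VMBUS_CONNECT_CPU && state == CONNECTED)
		return -EBUSY;
	return 0;
}

/* Walks through the cases described in the commit message. */
void sketch_self_check(void)
{
	/* kexec path, already disconnected: old check fails, new one passes. */
	assert(cleanup_check_old(VMBUS_CONNECT_CPU, DISCONNECTED) == -EBUSY);
	assert(cleanup_check_new(VMBUS_CONNECT_CPU, DISCONNECTED) == 0);

	/* Normal runtime: offlining the connect CPU is still refused. */
	assert(cleanup_check_new(VMBUS_CONNECT_CPU, CONNECTED) == -EBUSY);

	/* Other CPUs are unaffected by this check. */
	assert(cleanup_check_new(1, CONNECTED) == 0);
}
```

The first pair of assertions corresponds to the kexec() case in the trace:
with the old check the teardown callback reports -EBUSY even though the
connection is already DISCONNECTED, while the new check lets it succeed.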
On Tue, Nov 10, 2020 at 07:01:18PM +0000, Chris Co wrote:
> Fixes: 8a857c55420f29 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
> Signed-off-by: Chris Co <[email protected]>
> Cc: [email protected]
Reviewed-by: Andrea Parri (Microsoft) <[email protected]>
Thanks,
Andrea
On Tue, Nov 10, 2020 at 09:18:33PM +0100, Andrea Parri wrote:
> On Tue, Nov 10, 2020 at 07:01:18PM +0000, Chris Co wrote:
> > Fixes: 8a857c55420f29 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
> > Signed-off-by: Chris Co <[email protected]>
> > Cc: [email protected]
>
> Reviewed-by: Andrea Parri (Microsoft) <[email protected]>
Applied to hyperv-fixes. Thanks Chris and Andrea.
Wei.