2023-05-19 14:42:42

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

These commits

a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg")
2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs")

update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg
because that memory will be correctly marked as decrypted or encrypted
for all VM types (CoCo or normal). But problems ensue when CPUs in the
VM go online or offline after virtual PCI devices have been configured.

When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is
initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN.
But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which
may call the virtual PCI driver and fault trying to use the as yet
uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo
VM if the MMIO read and write hypercalls are used from state
CPUHP_AP_IRQ_AFFINITY_ONLINE.

When a CPU is taken offline, IRQs may be reassigned in state
CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to
use the hyperv_pcpu_input_arg that has already been freed by a
higher state.

Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE
immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE)
and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for
Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
early enough.

Fix the offlining problem by not freeing hyperv_pcpu_input_arg when
a CPU goes offline. Retain the allocated memory, and reuse it if
the CPU comes back online later.

Signed-off-by: Michael Kelley <[email protected]>
---
arch/x86/hyperv/hv_init.c | 2 +-
drivers/hv/hv_common.c | 48 +++++++++++++++++++++++-----------------------
include/linux/cpuhotplug.h | 1 +
3 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index a5f9474..6c04b52 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -416,7 +416,7 @@ void __init hyperv_init(void)
goto free_vp_assist_page;
}

- cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
+ cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online",
hv_cpu_init, hv_cpu_die);
if (cpuhp < 0)
goto free_ghcb_page;
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 64f9cec..4c5cee4 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu)
flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;

inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
- *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
- if (!(*inputarg))
- return -ENOMEM;

- if (hv_root_partition) {
- outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
- *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
+ /*
+ * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
+ * allocated if this CPU was previously online and then taken offline
+ */
+ if (!*inputarg) {
+ *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
+ if (!(*inputarg))
+ return -ENOMEM;
+
+ if (hv_root_partition) {
+ outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
+ *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
+ }
}

msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);
@@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu)

int hv_common_cpu_die(unsigned int cpu)
{
- unsigned long flags;
- void **inputarg, **outputarg;
- void *mem;
-
- local_irq_save(flags);
-
- inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
- mem = *inputarg;
- *inputarg = NULL;
-
- if (hv_root_partition) {
- outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
- *outputarg = NULL;
- }
-
- local_irq_restore(flags);
-
- kfree(mem);
+ /*
+ * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory
+ * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg
+ * may be used by the Hyper-V vPCI driver in reassigning interrupts
+ * as part of the offlining process. The interrupt reassignment
+ * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and
+ * called this function.
+ *
+ * If a previously offlined CPU is brought back online again, the
+ * originally allocated memory is reused in hv_common_cpu_init().
+ */

return 0;
}
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 0f1001d..696004a 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -201,6 +201,7 @@ enum cpuhp_state {
/* Online section invoked on the hotplugged CPU from the hotplug thread */
CPUHP_AP_ONLINE_IDLE,
CPUHP_AP_KVM_ONLINE,
+ CPUHP_AP_HYPERV_ONLINE,
CPUHP_AP_SCHED_WAIT_EMPTY,
CPUHP_AP_SMPBOOT_THREADS,
CPUHP_AP_X86_VDSO_VMA_ONLINE,
--
1.8.3.1



2023-05-19 14:43:05

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: [PATCH 2/2] arm64/hyperv: Use CPUHP_AP_HYPERV_ONLINE state to fix CPU online sequencing

State CPUHP_AP_HYPERV_ONLINE has been introduced to correctly sequence
the initialization of hyperv_pcpu_input_arg. Use this use new state
for Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
early enough.

Signed-off-by: Michael Kelley <[email protected]>
---
arch/arm64/hyperv/mshyperv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c
index a406454..f1b8a04 100644
--- a/arch/arm64/hyperv/mshyperv.c
+++ b/arch/arm64/hyperv/mshyperv.c
@@ -67,7 +67,7 @@ static int __init hyperv_init(void)
if (ret)
return ret;

- ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/hyperv_init:online",
+ ret = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "arm64/hyperv_init:online",
hv_common_cpu_init, hv_common_cpu_die);
if (ret < 0) {
hv_common_free();
--
1.8.3.1


2023-05-22 09:21:54

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

Michael Kelley <[email protected]> writes:

> These commits
>
> a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg")
> 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs")
>
> update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg
> because that memory will be correctly marked as decrypted or encrypted
> for all VM types (CoCo or normal). But problems ensue when CPUs in the
> VM go online or offline after virtual PCI devices have been configured.
>
> When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is
> initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN.
> But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which
> may call the virtual PCI driver and fault trying to use the as yet
> uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo
> VM if the MMIO read and write hypercalls are used from state
> CPUHP_AP_IRQ_AFFINITY_ONLINE.
>
> When a CPU is taken offline, IRQs may be reassigned in state
> CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to
> use the hyperv_pcpu_input_arg that has already been freed by a
> higher state.
>
> Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE
> immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE)
> and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for
> Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
> early enough.
>
> Fix the offlining problem by not freeing hyperv_pcpu_input_arg when
> a CPU goes offline. Retain the allocated memory, and reuse it if
> the CPU comes back online later.
>
> Signed-off-by: Michael Kelley <[email protected]>
> ---
> arch/x86/hyperv/hv_init.c | 2 +-
> drivers/hv/hv_common.c | 48 +++++++++++++++++++++++-----------------------
> include/linux/cpuhotplug.h | 1 +
> 3 files changed, 26 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index a5f9474..6c04b52 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -416,7 +416,7 @@ void __init hyperv_init(void)
> goto free_vp_assist_page;
> }
>
> - cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
> + cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online",
> hv_cpu_init, hv_cpu_die);
> if (cpuhp < 0)
> goto free_ghcb_page;
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 64f9cec..4c5cee4 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu)
> flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;
>
> inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> - *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
> - if (!(*inputarg))
> - return -ENOMEM;
>
> - if (hv_root_partition) {
> - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> - *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
> + /*
> + * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
> + * allocated if this CPU was previously online and then taken offline
> + */
> + if (!*inputarg) {
> + *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
> + if (!(*inputarg))
> + return -ENOMEM;
> +
> + if (hv_root_partition) {
> + outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> + *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
> + }
> }
>
> msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);
> @@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu)
>
> int hv_common_cpu_die(unsigned int cpu)
> {
> - unsigned long flags;
> - void **inputarg, **outputarg;
> - void *mem;
> -
> - local_irq_save(flags);
> -
> - inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> - mem = *inputarg;
> - *inputarg = NULL;
> -
> - if (hv_root_partition) {
> - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> - *outputarg = NULL;
> - }
> -
> - local_irq_restore(flags);
> -
> - kfree(mem);
> + /*
> + * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory
> + * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg
> + * may be used by the Hyper-V vPCI driver in reassigning interrupts
> + * as part of the offlining process. The interrupt reassignment
> + * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and
> + * called this function.
> + *
> + * If a previously offlined CPU is brought back online again, the
> + * originally allocated memory is reused in hv_common_cpu_init().
> + */
>
> return 0;
> }
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 0f1001d..696004a 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -201,6 +201,7 @@ enum cpuhp_state {
> /* Online section invoked on the hotplugged CPU from the hotplug thread */
> CPUHP_AP_ONLINE_IDLE,
> CPUHP_AP_KVM_ONLINE,
> + CPUHP_AP_HYPERV_ONLINE,

(Cc: KVM)

I would very much prefer we swap the order with KVM here: hv_cpu_init()
allocates and sets vCPU's VP assist page which is used by KVM on
CPUHP_AP_KVM_ONLINE:

kvm_online_cpu() -> __hardware_enable_nolock() ->
kvm_arch_hardware_enable() -> vmx_hardware_enable():

/*
* This can happen if we hot-added a CPU but failed to allocate
* VP assist page for it.
*/
if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
return -EFAULT;

With the change, this is likely broken.

FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
this happens upon KVM module load so CPU hotplug ordering should not
matter as this always happens on a CPU which is already online.

Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
init before KVM's (and vice versa on teardown) makes sense.

> CPUHP_AP_SCHED_WAIT_EMPTY,
> CPUHP_AP_SMPBOOT_THREADS,
> CPUHP_AP_X86_VDSO_VMA_ONLINE,

--
Vitaly


2023-05-22 14:28:23

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

From: Vitaly Kuznetsov <[email protected]> Sent: Monday, May 22, 2023 1:56 AM
>
> Michael Kelley <[email protected]> writes:
>

[snip]

> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> > index 0f1001d..696004a 100644
> > --- a/include/linux/cpuhotplug.h
> > +++ b/include/linux/cpuhotplug.h
> > @@ -201,6 +201,7 @@ enum cpuhp_state {
> > /* Online section invoked on the hotplugged CPU from the hotplug thread */
> > CPUHP_AP_ONLINE_IDLE,
> > CPUHP_AP_KVM_ONLINE,
> > + CPUHP_AP_HYPERV_ONLINE,
>
> (Cc: KVM)
>
> I would very much prefer we swap the order with KVM here: hv_cpu_init()
> allocates and sets vCPU's VP assist page which is used by KVM on
> CPUHP_AP_KVM_ONLINE:
>
> kvm_online_cpu() -> __hardware_enable_nolock() ->
> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>
> /*
> * This can happen if we hot-added a CPU but failed to allocate
> * VP assist page for it.
> */
> if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> With the change, this is likely broken.
>
> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
> this happens upon KVM module load so CPU hotplug ordering should not
> matter as this always happens on a CPU which is already online.
>
> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
> init before KVM's (and vice versa on teardown) makes sense.
>
> > CPUHP_AP_SCHED_WAIT_EMPTY,
> > CPUHP_AP_SMPBOOT_THREADS,
> > CPUHP_AP_X86_VDSO_VMA_ONLINE,

I have no objection to putting CPUHP_AP_HYPERV_ONLINE first. I did
not give any consideration to a possible dependency between the two. :-(
But note that in current code, hv_cpu_init() is running on the
CPUHP_AP_ONLINE_DYN state, which is also after KVM. So this patch
doesn't change the order w.r.t. KVM and the VP assist page. Are things
already broken for KVM, or is something else happening that makes it
work anyway?

Michael

2023-05-23 04:35:10

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH 2/2] arm64/hyperv: Use CPUHP_AP_HYPERV_ONLINE state to fix CPU online sequencing

> From: Michael Kelley (LINUX) <[email protected]>
> Sent: Friday, May 19, 2023 7:34 AM
> To: KY Srinivasan <[email protected]>; Haiyang Zhang
> ...
> State CPUHP_AP_HYPERV_ONLINE has been introduced to correctly sequence
> the initialization of hyperv_pcpu_input_arg. Use this use new state

s/this use/ this

> for Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
> early enough.


2023-05-23 08:24:08

by Vitaly Kuznetsov

[permalink] [raw]
Subject: RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

"Michael Kelley (LINUX)" <[email protected]> writes:

> From: Vitaly Kuznetsov <[email protected]> Sent: Monday, May 22, 2023 1:56 AM
>>
>> Michael Kelley <[email protected]> writes:
>>
>
> [snip]
>
>> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> > index 0f1001d..696004a 100644
>> > --- a/include/linux/cpuhotplug.h
>> > +++ b/include/linux/cpuhotplug.h
>> > @@ -201,6 +201,7 @@ enum cpuhp_state {
>> > /* Online section invoked on the hotplugged CPU from the hotplug thread */
>> > CPUHP_AP_ONLINE_IDLE,
>> > CPUHP_AP_KVM_ONLINE,
>> > + CPUHP_AP_HYPERV_ONLINE,
>>
>> (Cc: KVM)
>>
>> I would very much prefer we swap the order with KVM here: hv_cpu_init()
>> allocates and sets vCPU's VP assist page which is used by KVM on
>> CPUHP_AP_KVM_ONLINE:
>>
>> kvm_online_cpu() -> __hardware_enable_nolock() ->
>> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>>
>> /*
>> * This can happen if we hot-added a CPU but failed to allocate
>> * VP assist page for it.
>> */
>> if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
>> return -EFAULT;
>>
>> With the change, this is likely broken.
>>
>> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
>> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
>> this happens upon KVM module load so CPU hotplug ordering should not
>> matter as this always happens on a CPU which is already online.
>>
>> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
>> init before KVM's (and vice versa on teardown) makes sense.
>>
>> > CPUHP_AP_SCHED_WAIT_EMPTY,
>> > CPUHP_AP_SMPBOOT_THREADS,
>> > CPUHP_AP_X86_VDSO_VMA_ONLINE,
>
> I have no objection to putting CPUHP_AP_HYPERV_ONLINE first. I did
> not give any consideration to a possible dependency between the two. :-(
> But note that in current code, hv_cpu_init() is running on the
> CPUHP_AP_ONLINE_DYN state, which is also after KVM. So this patch
> doesn't change the order w.r.t. KVM and the VP assist page. Are things
> already broken for KVM, or is something else happening that makes it
> work anyway?

This looks like a currently present bug indeed so I had to refresh my
memory.

KVM's CPUHP_AP_KVM_STARTIN is registered with
cpuhp_setup_state_nocalls() which means that kvm_starting_cpu() is not
called for all currently present CPUs. Moreover, kvm_init() is called
when KVM vendor module (e.g. kvm_intel) is loaded and as KVM is normally
built as module, this happens much later than Hyper-V's
hyperv_init(). vmx_hardware_enable() is actually called from
hardware_enable_all() which happens when the first KVM VM is created.

This all changes when a CPU is hotplugged. The order CPUHP_AP_* from
cpuhp_state is respected and KVM's kvm_starting_cpu() is called _before_
Hyper-V's hv_cpu_init() even before your patch. We don't see the bug
just because Hyper-V doesn't(?) support CPU hotplug. Just sending a CPU
offline with e.g. "echo 0 > /sys/devices/system/cpu/cpuX/online" is not
the same as once allocated, VP assist page persists for all non-root
Hyper-V partitions. I don't know if KVM is supported for Hyper-V root
partitions but in case it is, we may have a problem.

Let's put CPUHP_AP_HYPERV_ONLINE before KVM's CPUHP_AP_KVM_ONLINE
explicitly so CPU hotplug scenario is handled correctly, even if current
Hyper-V versions don't support it.

--
Vitaly


2023-05-23 14:55:37

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

From: Vitaly Kuznetsov <[email protected]> Sent: Tuesday, May 23, 2023 1:05 AM
>
> "Michael Kelley (LINUX)" <[email protected]> writes:
>
> > From: Vitaly Kuznetsov <[email protected]> Sent: Monday, May 22, 2023 1:56 AM
> >>
> >> Michael Kelley <[email protected]> writes:
> >>
> >
> > [snip]
> >
> >> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> >> > index 0f1001d..696004a 100644
> >> > --- a/include/linux/cpuhotplug.h
> >> > +++ b/include/linux/cpuhotplug.h
> >> > @@ -201,6 +201,7 @@ enum cpuhp_state {
> >> > /* Online section invoked on the hotplugged CPU from the hotplug thread */
> >> > CPUHP_AP_ONLINE_IDLE,
> >> > CPUHP_AP_KVM_ONLINE,
> >> > + CPUHP_AP_HYPERV_ONLINE,
> >>
> >> (Cc: KVM)
> >>
> >> I would very much prefer we swap the order with KVM here: hv_cpu_init()
> >> allocates and sets vCPU's VP assist page which is used by KVM on
> >> CPUHP_AP_KVM_ONLINE:
> >>
> >> kvm_online_cpu() -> __hardware_enable_nolock() ->
> >> kvm_arch_hardware_enable() -> vmx_hardware_enable():
> >>
> >> /*
> >> * This can happen if we hot-added a CPU but failed to allocate
> >> * VP assist page for it.
> >> */
> >> if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
> >> return -EFAULT;
> >>
> >> With the change, this is likely broken.
> >>
> >> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
> >> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
> >> this happens upon KVM module load so CPU hotplug ordering should not
> >> matter as this always happens on a CPU which is already online.
> >>
> >> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
> >> init before KVM's (and vice versa on teardown) makes sense.
> >>
> >> > CPUHP_AP_SCHED_WAIT_EMPTY,
> >> > CPUHP_AP_SMPBOOT_THREADS,
> >> > CPUHP_AP_X86_VDSO_VMA_ONLINE,
> >
> > I have no objection to putting CPUHP_AP_HYPERV_ONLINE first. I did
> > not give any consideration to a possible dependency between the two. :-(
> > But note that in current code, hv_cpu_init() is running on the
> > CPUHP_AP_ONLINE_DYN state, which is also after KVM. So this patch
> > doesn't change the order w.r.t. KVM and the VP assist page. Are things
> > already broken for KVM, or is something else happening that makes it
> > work anyway?
>
> This looks like a currently present bug indeed so I had to refresh my
> memory.
>
> KVM's CPUHP_AP_KVM_STARTIN is registered with
> cpuhp_setup_state_nocalls() which means that kvm_starting_cpu() is not
> called for all currently present CPUs. Moreover, kvm_init() is called
> when KVM vendor module (e.g. kvm_intel) is loaded and as KVM is normally
> built as module, this happens much later than Hyper-V's
> hyperv_init(). vmx_hardware_enable() is actually called from
> hardware_enable_all() which happens when the first KVM VM is created.
>
> This all changes when a CPU is hotplugged. The order CPUHP_AP_* from
> cpuhp_state is respected and KVM's kvm_starting_cpu() is called _before_
> Hyper-V's hv_cpu_init() even before your patch. We don't see the bug
> just because Hyper-V doesn't(?) support CPU hotplug. Just sending a CPU
> offline with e.g. "echo 0 > /sys/devices/system/cpu/cpuX/online" is not
> the same as once allocated, VP assist page persists for all non-root
> Hyper-V partitions. I don't know if KVM is supported for Hyper-V root
> partitions but in case it is, we may have a problem.
>
> Let's put CPUHP_AP_HYPERV_ONLINE before KVM's CPUHP_AP_KVM_ONLINE
> explicitly so CPU hotplug scenario is handled correctly, even if current
> Hyper-V versions don't support it.
>

Will do. I'll send a v2 with the change.

Michael