From: "Longpeng(Mike)" <[email protected]>
A CPU will not show up in virtualized environment which includes an
Enclave. The VM splits its resources into a primary VM and a Enclave
VM. While the Enclave is active, the hypervisor will ignore all requests
to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
The kernel will wait up to ten seconds for CPU to show up
(do_boot_cpu()) and then rollback the hotplug state back to
CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
After the Enclave VM terminates, the primary VM can bring up the CPU
again.
Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
[bigeasy: Rewrite commit description.]
Signed-off-by: Longpeng(Mike) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
For XEN: this changes the behaviour as it allows to invoke
cpu_initialize_context() again should it have have earlier. I *think*
this is okay and would to bring up the CPU again should the memory
allocation in cpu_initialize_context() fail.
kernel/smpboot.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index f6bc0bc8a2aab..34958d7fe2c1c 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
*/
return -EAGAIN;
+ case CPU_UP_PREPARE:
+ /*
+ * Timeout while waiting for the CPU to show up. Allow to try
+ * again later.
+ */
+ return 0;
+
default:
/* Should not happen. Famous last words. */
--
2.33.1
On 22/11/21 16:47, Sebastian Andrzej Siewior wrote:
> From: "Longpeng(Mike)" <[email protected]>
>
> A CPU will not show up in virtualized environment which includes an
> Enclave. The VM splits its resources into a primary VM and a Enclave
> VM. While the Enclave is active, the hypervisor will ignore all requests
> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> The kernel will wait up to ten seconds for CPU to show up
> (do_boot_cpu()) and then rollback the hotplug state back to
> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>
For my own education, this is talking about *host* CPU hotplug, right?
> After the Enclave VM terminates, the primary VM can bring up the CPU
> again.
>
> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>
> [bigeasy: Rewrite commit description.]
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
>
> For XEN: this changes the behaviour as it allows to invoke
> cpu_initialize_context() again should it have have earlier. I *think*
> this is okay and would to bring up the CPU again should the memory
> allocation in cpu_initialize_context() fail.
Virt stuff notwithstanding, that looks OK to me.
Reviewed-by: Valentin Schneider <[email protected]>
That said, AFAICT only powerpc makes actual use of the state being set to
CPU_UP_PREPARE, it looks to be needless bookkeeping for everyone else (and
there's archs that seem happy using only CPU_DEAD / CPU_POST_DEAD).
>
> kernel/smpboot.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index f6bc0bc8a2aab..34958d7fe2c1c 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
> */
> return -EAGAIN;
>
> + case CPU_UP_PREPARE:
> + /*
> + * Timeout while waiting for the CPU to show up. Allow to try
> + * again later.
> + */
> + return 0;
> +
> default:
>
> /* Should not happen. Famous last words. */
> --
> 2.33.1
Tested-by: Dongli Zhang <[email protected]>
The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a
result, to online this CPU again (even after removal) is always failed.
I have tested that this patch works well to workaround the issue, by introducing
either a mdeley(11000) or while(1); to start_secondary(). That is, to online the
same CPU again is successful even after initial do_boot_cpu() failure.
1. add mdelay(11000) or while(1); to the start_secondary().
2. to online CPU is failed at do_boot_cpu().
3. to online CPU again is failed without this patch.
# echo 1 > /sys/devices/system/cpu/cpu4/online
-su: echo: write error: Input/output error
4. to online CPU again is successful with this patch.
Thank you very much!
Dongli Zhang
On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote:
> From: "Longpeng(Mike)" <[email protected]>
>
> A CPU will not show up in virtualized environment which includes an
> Enclave. The VM splits its resources into a primary VM and a Enclave
> VM. While the Enclave is active, the hypervisor will ignore all requests
> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> The kernel will wait up to ten seconds for CPU to show up
> (do_boot_cpu()) and then rollback the hotplug state back to
> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>
> After the Enclave VM terminates, the primary VM can bring up the CPU
> again.
>
> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>
> [bigeasy: Rewrite commit description.]
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> Link: https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-gE8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$
> ---
>
> For XEN: this changes the behaviour as it allows to invoke
> cpu_initialize_context() again should it have have earlier. I *think*
> this is okay and would to bring up the CPU again should the memory
> allocation in cpu_initialize_context() fail.
>
> kernel/smpboot.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index f6bc0bc8a2aab..34958d7fe2c1c 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
> */
> return -EAGAIN;
>
> + case CPU_UP_PREPARE:
> + /*
> + * Timeout while waiting for the CPU to show up. Allow to try
> + * again later.
> + */
> + return 0;
> +
> default:
>
> /* Should not happen. Famous last words. */
>
> -----Original Message-----
> From: Dongli Zhang [mailto:[email protected]]
> Sent: Wednesday, November 24, 2021 5:22 AM
> To: Sebastian Andrzej Siewior <[email protected]>; Longpeng (Mike, Cloud
> Infrastructure Service Product Dept.) <[email protected]>
> Cc: [email protected]; Gonglei (Arei) <[email protected]>;
> [email protected]; [email protected]; Peter Zijlstra
> <[email protected]>; Ingo Molnar <[email protected]>; Valentin Schneider
> <[email protected]>; Boris Ostrovsky <[email protected]>;
> Juergen Gross <[email protected]>; Stefano Stabellini <[email protected]>;
> Thomas Gleixner <[email protected]>; Ingo Molnar <[email protected]>; Borislav
> Petkov <[email protected]>; Dave Hansen <[email protected]>; H. Peter
> Anvin <[email protected]>
> Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be
> brought up again.
>
> Tested-by: Dongli Zhang <[email protected]>
>
>
> The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
> monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a
> result, to online this CPU again (even after removal) is always failed.
>
> I have tested that this patch works well to workaround the issue, by introducing
> either a mdeley(11000) or while(1); to start_secondary(). That is, to online
> the
> same CPU again is successful even after initial do_boot_cpu() failure.
>
> 1. add mdelay(11000) or while(1); to the start_secondary().
>
> 2. to online CPU is failed at do_boot_cpu().
>
Thanks for your testing :)
Does the cpu4 spin in wait_for_master_cpu() in your case ?
> 3. to online CPU again is failed without this patch.
>
> # echo 1 > /sys/devices/system/cpu/cpu4/online
> -su: echo: write error: Input/output error
>
> 4. to online CPU again is successful with this patch.
>
> Thank you very much!
>
> Dongli Zhang
>
> On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote:
> > From: "Longpeng(Mike)" <[email protected]>
> >
> > A CPU will not show up in virtualized environment which includes an
> > Enclave. The VM splits its resources into a primary VM and a Enclave
> > VM. While the Enclave is active, the hypervisor will ignore all requests
> > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> > The kernel will wait up to ten seconds for CPU to show up
> > (do_boot_cpu()) and then rollback the hotplug state back to
> > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
> >
> > After the Enclave VM terminates, the primary VM can bring up the CPU
> > again.
> >
> > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
> >
> > [bigeasy: Rewrite commit description.]
> >
> > Signed-off-by: Longpeng(Mike) <[email protected]>
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > Link:
> https://urldefense.com/v3/__https://lore.kernel.org/r/20210901051143.2752-1
> [email protected]__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-g
> E8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$
> > ---
> >
> > For XEN: this changes the behaviour as it allows to invoke
> > cpu_initialize_context() again should it have have earlier. I *think*
> > this is okay and would to bring up the CPU again should the memory
> > allocation in cpu_initialize_context() fail.
> >
> > kernel/smpboot.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> > index f6bc0bc8a2aab..34958d7fe2c1c 100644
> > --- a/kernel/smpboot.c
> > +++ b/kernel/smpboot.c
> > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
> > */
> > return -EAGAIN;
> >
> > + case CPU_UP_PREPARE:
> > + /*
> > + * Timeout while waiting for the CPU to show up. Allow to try
> > + * again later.
> > + */
> > + return 0;
> > +
> > default:
> >
> > /* Should not happen. Famous last words. */
> >
> -----Original Message-----
> From: Valentin Schneider [mailto:[email protected]]
> Sent: Wednesday, November 24, 2021 2:14 AM
> To: Sebastian Andrzej Siewior <[email protected]>; Longpeng (Mike, Cloud
> Infrastructure Service Product Dept.) <[email protected]>
> Cc: [email protected]; Gonglei (Arei) <[email protected]>;
> [email protected]; [email protected]; Peter Zijlstra
> <[email protected]>; Ingo Molnar <[email protected]>; Boris Ostrovsky
> <[email protected]>; Juergen Gross <[email protected]>; Stefano
> Stabellini <[email protected]>; Thomas Gleixner <[email protected]>;
> Ingo Molnar <[email protected]>; Borislav Petkov <[email protected]>; Dave Hansen
> <[email protected]>; H. Peter Anvin <[email protected]>
> Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be
> brought up again.
>
> On 22/11/21 16:47, Sebastian Andrzej Siewior wrote:
> > From: "Longpeng(Mike)" <[email protected]>
> >
> > A CPU will not show up in virtualized environment which includes an
> > Enclave. The VM splits its resources into a primary VM and a Enclave
> > VM. While the Enclave is active, the hypervisor will ignore all requests
> > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> > The kernel will wait up to ten seconds for CPU to show up
> > (do_boot_cpu()) and then rollback the hotplug state back to
> > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
> >
>
> For my own education, this is talking about *host* CPU hotplug, right?
>
It's about the *guest* CPU hotplug.
1. Users in Primary VM:
Split out vcpuX (offline from Primary VM) for Enclave VM
2. Hypervisor:
Set a flag for vcpuX, all requests from Primary VM to bring up vcpuX
will be ignore.
3. Users in Primary VM:
echo 1 > .../vcpuX/online would fail and leave the CPU state of vcpuX
in CPU_UP_PREPARE.
4. Users in Primary VM terminate the Enclave VM:
Hypervisor should clear the flag (set in step 2) of vcpuX, so the vcpuX
can continue to receive requests.
5. Users in Primary VM:
Try to online the vcpuX again (expect success), but it's always failed.
> > After the Enclave VM terminates, the primary VM can bring up the CPU
> > again.
> >
> > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
> >
> > [bigeasy: Rewrite commit description.]
> >
> > Signed-off-by: Longpeng(Mike) <[email protected]>
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> > ---
> >
> > For XEN: this changes the behaviour as it allows to invoke
> > cpu_initialize_context() again should it have have earlier. I *think*
> > this is okay and would to bring up the CPU again should the memory
> > allocation in cpu_initialize_context() fail.
>
> Virt stuff notwithstanding, that looks OK to me.
> Reviewed-by: Valentin Schneider <[email protected]>
>
> That said, AFAICT only powerpc makes actual use of the state being set to
> CPU_UP_PREPARE, it looks to be needless bookkeeping for everyone else (and
> there's archs that seem happy using only CPU_DEAD / CPU_POST_DEAD).
>
> >
> > kernel/smpboot.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> > index f6bc0bc8a2aab..34958d7fe2c1c 100644
> > --- a/kernel/smpboot.c
> > +++ b/kernel/smpboot.c
> > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
> > */
> > return -EAGAIN;
> >
> > + case CPU_UP_PREPARE:
> > + /*
> > + * Timeout while waiting for the CPU to show up. Allow to try
> > + * again later.
> > + */
> > + return 0;
> > +
> > default:
> >
> > /* Should not happen. Famous last words. */
> > --
> > 2.33.1
On 11/23/21 3:50 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
wrote:
>
>
>> -----Original Message-----
>> From: Dongli Zhang [mailto:[email protected]]
>> Sent: Wednesday, November 24, 2021 5:22 AM
>> To: Sebastian Andrzej Siewior <[email protected]>; Longpeng (Mike, Cloud
>> Infrastructure Service Product Dept.) <[email protected]>
>> Cc: [email protected]; Gonglei (Arei) <[email protected]>;
>> [email protected]; [email protected]; Peter Zijlstra
>> <[email protected]>; Ingo Molnar <[email protected]>; Valentin Schneider
>> <[email protected]>; Boris Ostrovsky <[email protected]>;
>> Juergen Gross <[email protected]>; Stefano Stabellini <[email protected]>;
>> Thomas Gleixner <[email protected]>; Ingo Molnar <[email protected]>; Borislav
>> Petkov <[email protected]>; Dave Hansen <[email protected]>; H. Peter
>> Anvin <[email protected]>
>> Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be
>> brought up again.
>>
>> Tested-by: Dongli Zhang <[email protected]>
>>
>>
>> The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
>> monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a
>> result, to online this CPU again (even after removal) is always failed.
>>
>> I have tested that this patch works well to workaround the issue, by introducing
>> either a mdeley(11000) or while(1); to start_secondary(). That is, to online
>> the
>> same CPU again is successful even after initial do_boot_cpu() failure.
>>
>> 1. add mdelay(11000) or while(1); to the start_secondary().
>>
>> 2. to online CPU is failed at do_boot_cpu().
>>
>
> Thanks for your testing :)
>
> Does the cpu4 spin in wait_for_master_cpu() in your case ?
I did two tests.
TEST 1.
I added "mdelay(11000);" as the first line in start_secondary(). Once the issue
was encountered, the RIP of CPU=4 was ffffffff8c242021 (from QEMU's "info
registers -a") which was in the range of wait_for_master_cpu().
# cat /proc/kallsyms | grep ffffffff8c2420
ffffffff8c242010 t wait_for_master_cpu
ffffffff8c242030 T load_fixmap_gdt
ffffffff8c242060 T native_write_cr4
ffffffff8c2420c0 T cr4_init
TEST 2.
I added "while(true);" as the first line in start_secondary(). Once the issue
was encountered, the RIP of CPU=4 was ffffffff91654c0a (from QEMU's "info
registers -a") which was in the range of start_secondary().
# cat /proc/kallsyms | grep ffffffff91654c0
ffffffff91654c00 t start_secondary
Dongli Zhang
>
>> 3. to online CPU again is failed without this patch.
>>
>> # echo 1 > /sys/devices/system/cpu/cpu4/online
>> -su: echo: write error: Input/output error
>>
>> 4. to online CPU again is successful with this patch.
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>> On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote:
>>> From: "Longpeng(Mike)" <[email protected]>
>>>
>>> A CPU will not show up in virtualized environment which includes an
>>> Enclave. The VM splits its resources into a primary VM and a Enclave
>>> VM. While the Enclave is active, the hypervisor will ignore all requests
>>> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
>>> The kernel will wait up to ten seconds for CPU to show up
>>> (do_boot_cpu()) and then rollback the hotplug state back to
>>> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
>>> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>>>
>>> After the Enclave VM terminates, the primary VM can bring up the CPU
>>> again.
>>>
>>> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>>>
>>> [bigeasy: Rewrite commit description.]
>>>
>>> Signed-off-by: Longpeng(Mike) <[email protected]>
>>> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
>>> Link:
>> https://urldefense.com/v3/__https://lore.kernel.org/r/20210901051143.2752-1
>> [email protected]__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-g
>> E8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$
>>> ---
>>>
>>> For XEN: this changes the behaviour as it allows to invoke
>>> cpu_initialize_context() again should it have have earlier. I *think*
>>> this is okay and would to bring up the CPU again should the memory
>>> allocation in cpu_initialize_context() fail.
>>>
>>> kernel/smpboot.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
>>> index f6bc0bc8a2aab..34958d7fe2c1c 100644
>>> --- a/kernel/smpboot.c
>>> +++ b/kernel/smpboot.c
>>> @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
>>> */
>>> return -EAGAIN;
>>>
>>> + case CPU_UP_PREPARE:
>>> + /*
>>> + * Timeout while waiting for the CPU to show up. Allow to try
>>> + * again later.
>>> + */
>>> + return 0;
>>> +
>>> default:
>>>
>>> /* Should not happen. Famous last words. */
>>>
Hi,
> -----Original Message-----
> From: Xen-devel <[email protected]> On Behalf Of
> Sebastian Andrzej Siewior
> Sent: Monday, November 22, 2021 11:47 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <[email protected]>
> Cc: [email protected]; Gonglei (Arei) <[email protected]>;
> [email protected]; [email protected]; Peter Zijlstra
> <[email protected]>; Ingo Molnar <[email protected]>; Valentin
> Schneider <[email protected]>; Boris Ostrovsky
> <[email protected]>; Juergen Gross <[email protected]>; Stefano
> Stabellini <[email protected]>; Thomas Gleixner <[email protected]>;
> Ingo Molnar <[email protected]>; Borislav Petkov <[email protected]>; Dave
> Hansen <[email protected]>; H. Peter Anvin <[email protected]>
> Subject: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be
> brought up again.
>
> From: "Longpeng(Mike)" <[email protected]>
>
> A CPU will not show up in virtualized environment which includes an
> Enclave. The VM splits its resources into a primary VM and a Enclave
> VM. While the Enclave is active, the hypervisor will ignore all requests
> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> The kernel will wait up to ten seconds for CPU to show up
> (do_boot_cpu()) and then rollback the hotplug state back to
> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>
> After the Enclave VM terminates, the primary VM can bring up the CPU
> again.
>
> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>
> [bigeasy: Rewrite commit description.]
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> Link: https://lore.kernel.org/r/20210901051143.2752-1-
> [email protected]
> ---
>
> For XEN: this changes the behaviour as it allows to invoke
> cpu_initialize_context() again should it have have earlier. I *think*
> this is okay and would to bring up the CPU again should the memory
> allocation in cpu_initialize_context() fail.
>
> kernel/smpboot.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index f6bc0bc8a2aab..34958d7fe2c1c 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu)
> */
> return -EAGAIN;
>
> + case CPU_UP_PREPARE:
> + /*
> + * Timeout while waiting for the CPU to show up. Allow to try
> + * again later.
> + */
> + return 0;
> +
> default:
>
> /* Should not happen. Famous last words. */
> --
> 2.33.1
>
Reviewed-by: Henry Wang <[email protected]>
Kind regards,
Henry
On 24/11/21 00:19, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
>> -----Original Message-----
>> From: Valentin Schneider [mailto:[email protected]]
>> For my own education, this is talking about *host* CPU hotplug, right?
>>
>
> It's about the *guest* CPU hotplug.
>
> 1. Users in Primary VM:
> Split out vcpuX (offline from Primary VM) for Enclave VM
>
> 2. Hypervisor:
> Set a flag for vcpuX, all requests from Primary VM to bring up vcpuX
> will be ignore.
>
> 3. Users in Primary VM:
> echo 1 > .../vcpuX/online would fail and leave the CPU state of vcpuX
> in CPU_UP_PREPARE.
>
> 4. Users in Primary VM terminate the Enclave VM:
> Hypervisor should clear the flag (set in step 2) of vcpuX, so the vcpuX
> can continue to receive requests.
>
> 5. Users in Primary VM:
> Try to online the vcpuX again (expect success), but it's always failed.
>
If I followed the rabbit hole in the right direction, this is about:
ff8a4d3e3a99 ("nitro_enclaves: Add logic for setting an enclave vCPU")
So there's a 1:1 CPU:vCPU mapping and an Enclave carves a chunk out of the
Primary VM...
Thanks for the detailed explanation!
On Mon, Nov 22 2021 at 16:47, Sebastian Andrzej Siewior wrote:
> From: "Longpeng(Mike)" <[email protected]>
>
> A CPU will not show up in virtualized environment which includes an
> Enclave. The VM splits its resources into a primary VM and a Enclave
> VM. While the Enclave is active, the hypervisor will ignore all requests
> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
> The kernel will wait up to ten seconds for CPU to show up
> (do_boot_cpu()) and then rollback the hotplug state back to
> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>
> After the Enclave VM terminates, the primary VM can bring up the CPU
> again.
>
> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>
> [bigeasy: Rewrite commit description.]
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
>
> For XEN: this changes the behaviour as it allows to invoke
> cpu_initialize_context() again should it have have earlier. I *think*
> this is okay and would to bring up the CPU again should the memory
> allocation in cpu_initialize_context() fail.
Any comment from XEN folks?
Thanks,
tglx
On 11/24/21 5:54 PM, Thomas Gleixner wrote:
> On Mon, Nov 22 2021 at 16:47, Sebastian Andrzej Siewior wrote:
>> From: "Longpeng(Mike)" <[email protected]>
>>
>> A CPU will not show up in virtualized environment which includes an
>> Enclave. The VM splits its resources into a primary VM and a Enclave
>> VM. While the Enclave is active, the hypervisor will ignore all requests
>> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state.
>> The kernel will wait up to ten seconds for CPU to show up
>> (do_boot_cpu()) and then rollback the hotplug state back to
>> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is
>> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage.
>>
>> After the Enclave VM terminates, the primary VM can bring up the CPU
>> again.
>>
>> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state.
>>
>> [bigeasy: Rewrite commit description.]
>>
>> Signed-off-by: Longpeng(Mike) <[email protected]>
>> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>>
>> For XEN: this changes the behaviour as it allows to invoke
>> cpu_initialize_context() again should it have have earlier. I *think*
>> this is okay and would to bring up the CPU again should the memory
>> allocation in cpu_initialize_context() fail.
> Any comment from XEN folks?
If memory allocation in cpu_initialize_context() fails we will not be able to bring up the VCPU because xen_cpu_initialized_map bit at the top of that routine will already have been set. We will BUG in xen_pv_cpu_up() on second (presumably successful) attempt because nothing for that VCPU would be initialized. This can in principle be fixed by moving allocation to the top of the routine and freeing context if the bit in the bitmap is already set.
Having said that, allocation really should not fail: for PV guests we first bring max number of VCPUs up and then offline them down to however many need to run. I think if we fail allocation during boot we are going to have a really bad day anyway.
-boris
On 2021-11-24 21:17:34 [-0500], Boris Ostrovsky wrote:
>
> On 11/24/21 5:54 PM, Thomas Gleixner wrote:
> > Any comment from XEN folks?
>
>
> If memory allocation in cpu_initialize_context() fails we will not be
> able to bring up the VCPU because xen_cpu_initialized_map bit at the
> top of that routine will already have been set. We will BUG in
> xen_pv_cpu_up() on second (presumably successful) attempt because
> nothing for that VCPU would be initialized. This can in principle be
> fixed by moving allocation to the top of the routine and freeing
> context if the bit in the bitmap is already set.
>
>
> Having said that, allocation really should not fail: for PV guests we
> first bring max number of VCPUs up and then offline them down to
> however many need to run. I think if we fail allocation during boot we
> are going to have a really bad day anyway.
>
So can we keep the patch as-is or are some changes needed?
> -boris
Sebastian
On 12/6/21 6:25 AM, Sebastian Andrzej Siewior wrote:
> On 2021-11-24 21:17:34 [-0500], Boris Ostrovsky wrote:
>> On 11/24/21 5:54 PM, Thomas Gleixner wrote:
>>> Any comment from XEN folks?
>>
>> If memory allocation in cpu_initialize_context() fails we will not be
>> able to bring up the VCPU because xen_cpu_initialized_map bit at the
>> top of that routine will already have been set. We will BUG in
>> xen_pv_cpu_up() on second (presumably successful) attempt because
>> nothing for that VCPU would be initialized. This can in principle be
>> fixed by moving allocation to the top of the routine and freeing
>> context if the bit in the bitmap is already set.
>>
>>
>> Having said that, allocation really should not fail: for PV guests we
>> first bring max number of VCPUs up and then offline them down to
>> however many need to run. I think if we fail allocation during boot we
>> are going to have a really bad day anyway.
>>
> So can we keep the patch as-is or are some changes needed?
I think for the sake of completeness we could add
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 6a8f3b53ab83..86368fcef466 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -277,8 +277,11 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
return 0;
ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
- if (ctxt == NULL)
+ if (ctxt == NULL) {
+ cpumask_clear_cpu(cpu, xen_cpu_initialized_map);
+ cpumask_clear_cpu(cpu, cpu_callout_mask);
return -ENOMEM;
+ }
gdt = get_cpu_gdt_rw(cpu);
-boris