2020-09-23 18:59:48

by Sean Christopherson

Subject: [PATCH] KVM: Enable hardware before doing arch VM initialization

Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
fully enabled during VM init in order to make SEAMCALLs.

This also provides consistent ordering between kvm_create_vm() and
kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
hardware_disable_all().

Cc: Marc Zyngier <[email protected]>
Cc: James Morse <[email protected]>
Cc: Julien Thierry <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Aleksandar Markovic <[email protected]>
Cc: [email protected]
Cc: Paul Mackerras <[email protected]>
Cc: [email protected]
Cc: Christian Borntraeger <[email protected]>
Cc: Janosch Frank <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Joerg Roedel <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---

Obviously not required until the TDX series comes along, but IMO KVM
should be consistent with respect to enabling and disabling virt support
in hardware.

Tested only on Intel hardware. Unless I missed something, this only
affects x86, Arm and MIPS as hardware enabling is a nop for s390 and PPC.
Arm looks safe (based on my mostly clueless reading of the code), but I
have no idea if this will cause problems for MIPS, which is doing all kinds
of things in hardware_enable() that I don't pretend to fully understand.

virt/kvm/kvm_main.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cf88233b819a..58fa19bcfc90 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -766,7 +766,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		struct kvm_memslots *slots = kvm_alloc_memslots();

 		if (!slots)
-			goto out_err_no_arch_destroy_vm;
+			goto out_err_no_disable;
 		/* Generations must be different for each address space. */
 		slots->generation = i;
 		rcu_assign_pointer(kvm->memslots[i], slots);
@@ -776,19 +776,19 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		rcu_assign_pointer(kvm->buses[i],
 			kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
 		if (!kvm->buses[i])
-			goto out_err_no_arch_destroy_vm;
+			goto out_err_no_disable;
 	}

 	kvm->max_halt_poll_ns = halt_poll_ns;

-	r = kvm_arch_init_vm(kvm, type);
-	if (r)
-		goto out_err_no_arch_destroy_vm;
-
 	r = hardware_enable_all();
 	if (r)
 		goto out_err_no_disable;

+	r = kvm_arch_init_vm(kvm, type);
+	if (r)
+		goto out_err_no_arch_destroy_vm;
+
 #ifdef CONFIG_HAVE_KVM_IRQFD
 	INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
 #endif
@@ -815,10 +815,10 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
 #endif
 out_err_no_mmu_notifier:
-	hardware_disable_all();
-out_err_no_disable:
 	kvm_arch_destroy_vm(kvm);
 out_err_no_arch_destroy_vm:
+	hardware_disable_all();
+out_err_no_disable:
 	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
 	for (i = 0; i < KVM_NR_BUSES; i++)
 		kfree(kvm_get_bus(kvm, i));
--
2.28.0
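
For readers following the label shuffle in the diff, here is a minimal
userspace sketch of the post-patch ordering. It is not the kernel code:
the stub functions, the trace buffer and the fail_* parameters are all
made up for illustration, and only the label names mirror the patch.
The point it tries to show is that each error label undoes exactly the
steps that succeeded, in reverse order of setup.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: records which stub ran, in order. */
static char trace[128];

static void record(const char *step)
{
	/* Room for the step, a ';' separator and the terminating NUL. */
	assert(strlen(trace) + strlen(step) + 2 <= sizeof(trace));
	strcat(trace, step);
	strcat(trace, ";");
}

/* Stand-ins for the real functions; a failing stub records nothing. */
static int stub_hardware_enable_all(int fail)
{
	if (fail)
		return -1;
	record("hw_enable");
	return 0;
}

static int stub_arch_init_vm(int fail)
{
	if (fail)
		return -1;
	record("arch_init");
	return 0;
}

/* Same shape as the patched kvm_create_vm(): enable, then arch init. */
static int sketch_create_vm(int fail_enable, int fail_arch_init, int fail_late)
{
	int r;

	trace[0] = '\0';

	r = stub_hardware_enable_all(fail_enable);
	if (r)
		goto out_err_no_disable;

	r = stub_arch_init_vm(fail_arch_init);
	if (r)
		goto out_err_no_arch_destroy_vm;

	if (fail_late) {	/* stands in for any later setup step failing */
		r = -1;
		goto out_err_no_mmu_notifier;
	}
	return 0;

out_err_no_mmu_notifier:
	record("arch_destroy");
out_err_no_arch_destroy_vm:
	record("hw_disable");
out_err_no_disable:
	return r;
}
```

With these stubs, sketch_create_vm(0, 1, 0) leaves the trace as
"hw_enable;hw_disable;": a failed arch init still disables hardware but
never calls the arch destroy step.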


2020-09-24 05:35:05

by Christian Borntraeger

Subject: Re: [PATCH] KVM: Enable hardware before doing arch VM initialization



On 23.09.20 20:57, Sean Christopherson wrote:
> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
> fully enabled during VM init in order to make SEAMCALLs.
>
> This also provides consistent ordering between kvm_create_vm() and
> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
> hardware_disable_all().
>
> Cc: Marc Zyngier <[email protected]>
> Cc: James Morse <[email protected]>
> Cc: Julien Thierry <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Cc: [email protected]
> Cc: Huacai Chen <[email protected]>
> Cc: Aleksandar Markovic <[email protected]>
> Cc: [email protected]
> Cc: Paul Mackerras <[email protected]>
> Cc: [email protected]
> Cc: Christian Borntraeger <[email protected]>
> Cc: Janosch Frank <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: Vitaly Kuznetsov <[email protected]>
> Cc: Wanpeng Li <[email protected]>
> Cc: Jim Mattson <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
>
> Obviously not required until the TDX series comes along, but IMO KVM
> should be consistent with respect to enabling and disabling virt support
> in hardware.
>
> Tested only on Intel hardware. Unless I missed something, this only
> affects x86, Arm and MIPS as hardware enabling is a nop for s390 and PPC.

Yes, looks fine from an s390 perspective.

Reviewed-by: Christian Borntraeger <[email protected]>

2020-09-24 06:33:59

by Huacai Chen

Subject: Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

Hi, Sean,

On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
<[email protected]> wrote:
>
> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
> fully enabled during VM init in order to make SEAMCALLs.
>
> This also provides consistent ordering between kvm_create_vm() and
> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
> hardware_disable_all().
Do you mean that hardware_enable_all() enables VMX, kvm_arch_init_vm()
enables TDX, and TDX depends on VMX being enabled first? If so, can TDX
also be enabled in hardware_enable_all()?

The swap does not seem to affect MIPS, but I noticed one thing:
kvm_arch_hardware_enable() is called not only from
hardware_enable_all() but also from kvm_starting_cpu(). Even if you
swap the order, newly started CPUs will not have VMX enabled before
kvm_arch_init_vm(). (Maybe I am wrong, because I'm not familiar with
VMX/TDX.)

Huacai
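
A rough model of the hotplug interaction raised above, assuming (as the
code in virt/kvm/kvm_main.c appears to do) that kvm_starting_cpu() only
enables hardware on the incoming CPU when the usage count is non-zero.
Everything below is a simplified userspace sketch: the sketch_* names,
the fixed four-CPU array and the lack of any locking are assumptions,
not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

static bool cpu_online[SKETCH_NR_CPUS] = { true, true, false, false };
static bool hw_enabled[SKETCH_NR_CPUS];
static int usage_count;	/* modelled on kvm_usage_count */

/* The first user turns virtualization on for every online CPU. */
static void sketch_hardware_enable_all(void)
{
	if (++usage_count == 1)
		for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
			if (cpu_online[cpu])
				hw_enabled[cpu] = true;
}

/* The last user turns it off again. */
static void sketch_hardware_disable_all(void)
{
	assert(usage_count > 0);
	if (--usage_count == 0)
		for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
			hw_enabled[cpu] = false;
}

/* Hotplug callback: enable the new CPU only if a user already exists. */
static void sketch_starting_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	cpu_online[cpu] = true;
	if (usage_count)
		hw_enabled[cpu] = true;
}

/* True when every online CPU has virtualization enabled. */
static bool sketch_all_online_enabled(void)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (cpu_online[cpu] && !hw_enabled[cpu])
			return false;
	return true;
}
```

In this model a CPU that comes online after sketch_hardware_enable_all()
is enabled by the hotplug callback, so with the swapped order the arch
init step would still see every online CPU enabled. Whether that holds
for the real code paths and their locking is for the maintainers to
confirm.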

2020-09-24 06:52:13

by Paolo Bonzini

Subject: Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

On 24/09/20 08:31, Huacai Chen wrote:
> Hi, Sean,
>
> On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
> <[email protected]> wrote:
>>
>> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
>> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
>> fully enabled during VM init in order to make SEAMCALLs.
>>
>> This also provides consistent ordering between kvm_create_vm() and
>> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
>> hardware_disable_all().
> Do you mean that hardware_enable_all() enables VMX, kvm_arch_init_vm()
> enables TDX, and TDX depends on VMX being enabled first? If so, can TDX
> also be enabled in hardware_enable_all()?

kvm_arch_init_vm() enables TDX *for the VM*, and to do that it needs VMX
instructions (specifically SEAMCALL, which is a hypervisor->"ultravisor"
call). Because that action is VM-specific it cannot be done in
hardware_enable_all().

Paolo
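
To make the global versus per-VM distinction concrete, here is a small
sketch under stated assumptions: vmx_on stands for the global state
that hardware enabling controls, struct sketch_vm and its tdx_ready
flag stand for per-VM state, and demo_seamcall_init() is a hypothetical
placeholder for whatever VM-specific call arch init would make. None of
these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

static bool vmx_on;	/* global: toggled by hardware enable/disable */

struct sketch_vm {
	int id;
	bool tdx_ready;	/* per-VM: set up during arch init */
};

/* Global steps take no VM argument, so they cannot do per-VM setup. */
static void demo_hardware_enable_all(void)
{
	vmx_on = true;
}

static void demo_hardware_disable_all(void)
{
	vmx_on = false;
}

/* Hypothetical per-VM call that only works while VMX is on. */
static int demo_seamcall_init(struct sketch_vm *vm)
{
	if (!vmx_on)
		return -1;
	vm->tdx_ready = true;
	return 0;
}

/* Per-VM step: needs both the VM and the already-enabled hardware. */
static int demo_arch_init_vm(struct sketch_vm *vm)
{
	return demo_seamcall_init(vm);
}
```

Calling demo_arch_init_vm() before demo_hardware_enable_all() fails in
this model, which is the old ordering; calling it afterwards succeeds,
which is the ordering the patch establishes.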

> The swap does not seem to affect MIPS, but I noticed one thing:
> kvm_arch_hardware_enable() is called not only from
> hardware_enable_all() but also from kvm_starting_cpu(). Even if you
> swap the order, newly started CPUs will not have VMX enabled before
> kvm_arch_init_vm(). (Maybe I am wrong, because I'm not familiar with
> VMX/TDX.)
>
> Huacai

2020-09-24 07:49:40

by Huacai Chen

Subject: Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

Hi, Paolo,

On Thu, Sep 24, 2020 at 2:50 PM Paolo Bonzini <[email protected]> wrote:
>
> On 24/09/20 08:31, Huacai Chen wrote:
> > Hi, Sean,
> >
> > On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
> > <[email protected]> wrote:
> >>
> >> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
> >> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
> >> fully enabled during VM init in order to make SEAMCALLs.
> >>
> >> This also provides consistent ordering between kvm_create_vm() and
> >> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
> >> hardware_disable_all().
> > Do you mean that hardware_enable_all() enables VMX, kvm_arch_init_vm()
> > enables TDX, and TDX depends on VMX being enabled first? If so, can TDX
> > also be enabled in hardware_enable_all()?
>
> kvm_arch_init_vm() enables TDX *for the VM*, and to do that it needs VMX
> instructions (specifically SEAMCALL, which is a hypervisor->"ultravisor"
> call). Because that action is VM-specific it cannot be done in
> hardware_enable_all().
>
> Paolo
OK, I see.

Reviewed-by: Huacai Chen <[email protected]>

>
> > The swap does not seem to affect MIPS, but I noticed one thing:
> > kvm_arch_hardware_enable() is called not only from
> > hardware_enable_all() but also from kvm_starting_cpu(). Even if you
> > swap the order, newly started CPUs will not have VMX enabled before
> > kvm_arch_init_vm(). (Maybe I am wrong, because I'm not familiar with
> > VMX/TDX.)
> >
> > Huacai