Syzbot reported a GPF in kvm_mmu_uninit_tdp_mmu(), which is caused by
passing a NULL pointer to flush_workqueue().

tdp_mmu_zap_wq is allocated via alloc_workqueue(), which may fail. There
is no error handling and kvm_mmu_init_tdp_mmu()'s return value is simply
ignored. Moreover, all kvm_*_init_vm() functions are void, so the easiest
solution is to check that tdp_mmu_zap_wq is a valid pointer before
passing it anywhere.
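
For reference, the allocation in kvm_mmu_init_tdp_mmu() looks roughly
like this (paraphrased):

	kvm->arch.tdp_mmu_zap_wq =
		alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0);
	/* the result is never checked, so a failed allocation leaves it NULL */
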
Fixes: 22b94c4b63eb ("KVM: x86/mmu: Zap invalidated roots via asynchronous worker")
Reported-and-tested-by: [email protected]
Signed-off-by: Pavel Skripkin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e7e7876251b3..b3e8ff7ac5b0 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -48,8 +48,10 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
if (!kvm->arch.tdp_mmu_enabled)
return;
- flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
- destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ if (kvm->arch.tdp_mmu_zap_wq) {
+ flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
+ }
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
@@ -119,9 +121,11 @@ static void tdp_mmu_zap_root_work(struct work_struct *work)
static void tdp_mmu_schedule_zap_root(struct kvm *kvm, struct kvm_mmu_page *root)
{
- root->tdp_mmu_async_data = kvm;
- INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
- queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
+ if (kvm->arch.tdp_mmu_zap_wq) {
+ root->tdp_mmu_async_data = kvm;
+ INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
+ queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
+ }
}
static inline bool kvm_tdp_root_mark_invalid(struct kvm_mmu_page *page)
--
2.35.1
On 25/03/2022 17:38, Pavel Skripkin wrote:
> Syzbot reported a GPF in kvm_mmu_uninit_tdp_mmu(), which is caused by
> passing a NULL pointer to flush_workqueue().
>
> tdp_mmu_zap_wq is allocated via alloc_workqueue(), which may fail. There
> is no error handling and kvm_mmu_init_tdp_mmu()'s return value is simply
> ignored. Moreover, all kvm_*_init_vm() functions are void, so the easiest
> solution is to check that tdp_mmu_zap_wq is a valid pointer before
> passing it anywhere.
>
> Fixes: 22b94c4b63eb ("KVM: x86/mmu: Zap invalidated roots via asynchronous worker")
> Reported-and-tested-by: [email protected]
> Signed-off-by: Pavel Skripkin <[email protected]>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index e7e7876251b3..b3e8ff7ac5b0 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -48,8 +48,10 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
> if (!kvm->arch.tdp_mmu_enabled)
> return;
>
> - flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
> - destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
> + if (kvm->arch.tdp_mmu_zap_wq) {
> + flush_workqueue(kvm->arch.tdp_mmu_zap_wq);
> + destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
Hi,
Unrelated to the patch, but flush_workqueue() is redundant and could be
removed: destroy_workqueue() already drains the queue.
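
I.e. (untested) the NULL check in kvm_mmu_uninit_tdp_mmu() could simply be:

	if (kvm->arch.tdp_mmu_zap_wq)
		destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);	/* drains pending work itself */
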
Just my 2c,
CJ
> + }
>
> WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
> WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
> @@ -119,9 +121,11 @@ static void tdp_mmu_zap_root_work(struct work_struct *work)
>
> static void tdp_mmu_schedule_zap_root(struct kvm *kvm, struct kvm_mmu_page *root)
> {
> - root->tdp_mmu_async_data = kvm;
> - INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
> - queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
> + if (kvm->arch.tdp_mmu_zap_wq) {
> + root->tdp_mmu_async_data = kvm;
> + INIT_WORK(&root->tdp_mmu_async_work, tdp_mmu_zap_root_work);
> + queue_work(kvm->arch.tdp_mmu_zap_wq, &root->tdp_mmu_async_work);
> + }
> }
>
> static inline bool kvm_tdp_root_mark_invalid(struct kvm_mmu_page *page)
On 3/25/22 17:38, Pavel Skripkin wrote:
> Syzbot reported a GPF in kvm_mmu_uninit_tdp_mmu(), which is caused by
> passing a NULL pointer to flush_workqueue().
>
> tdp_mmu_zap_wq is allocated via alloc_workqueue(), which may fail. There
> is no error handling and kvm_mmu_init_tdp_mmu()'s return value is simply
> ignored. Moreover, all kvm_*_init_vm() functions are void, so the easiest
> solution is to check that tdp_mmu_zap_wq is a valid pointer before
> passing it anywhere.
Thanks for the analysis, but not scheduling the work item in
tdp_mmu_schedule_zap_root is broken; you can't just let the roots
survive (KVM uses its own workqueue because it needs the work item to
complete: it has to flush the queue before kvm_mmu_zap_all_fast returns).
I'll fix it properly by propagating the error up to kvm_mmu_init_vm and
kvm_arch_init_vm.
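Something along these lines (completely untested, just to sketch the
direction):

	/* in kvm_mmu_init_tdp_mmu(): report the allocation failure instead of ignoring it */
	struct workqueue_struct *wq;

	wq = alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0);
	if (!wq)
		return -ENOMEM;
	kvm->arch.tdp_mmu_zap_wq = wq;

with kvm_mmu_init_vm()/kvm_arch_init_vm() then returning that error, so
VM creation fails cleanly instead of continuing with a NULL workqueue.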
Thanks,
Paolo
Hi Paolo,
On 3/25/22 19:50, Paolo Bonzini wrote:
> On 3/25/22 17:38, Pavel Skripkin wrote:
>> Syzbot reported a GPF in kvm_mmu_uninit_tdp_mmu(), which is caused by
>> passing a NULL pointer to flush_workqueue().
>>
>> tdp_mmu_zap_wq is allocated via alloc_workqueue(), which may fail. There
>> is no error handling and kvm_mmu_init_tdp_mmu()'s return value is simply
>> ignored. Moreover, all kvm_*_init_vm() functions are void, so the easiest
>> solution is to check that tdp_mmu_zap_wq is a valid pointer before
>> passing it anywhere.
>
> Thanks for the analysis, but not scheduling the work item in
> tdp_mmu_schedule_zap_root is broken; you can't just let the roots
> survive (KVM uses its own workqueue because it needs the work item to
> complete: it has to flush the queue before kvm_mmu_zap_all_fast returns).
>
Ah, I see, thanks for the explanation.

I thought about propagating an error up to the callers, but
kvm_mmu_init_tdp_mmu() returns false when the config is disabled, so I
decided to go with the easiest fix without digging into the details.
Sorry about that.
With regards,
Pavel Skripkin