2024-06-03 17:44:43

by Moger, Babu

Subject: [PATCH] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error

The system throws the following UBSAN invalid-load error when the very
first VM is powered up on a freshly booted host machine. It happens only
on 2P or 4P (multi-socket) systems.

[ 688.429145] ------------[ cut here ]------------
[ 688.429156] UBSAN: invalid-load in arch/x86/kvm/../../../virt/kvm/kvm_main.c:655:10
[ 688.437739] load of value 160 is not a valid value for type '_Bool'
[ 688.444760] CPU: 370 PID: 8246 Comm: CPU 0/KVM Not tainted 6.8.2-amdsos-build58-ubuntu-22.04+ #1
[ 688.444767] Hardware name: AMD Corporation Sh54p/Sh54p, BIOS WPC4429N 04/25/2024
[ 688.444770] Call Trace:
[ 688.444777] <TASK>
[ 688.444787] dump_stack_lvl+0x48/0x60
[ 688.444810] ubsan_epilogue+0x5/0x30
[ 688.444823] __ubsan_handle_load_invalid_value+0x79/0x80
[ 688.444827] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.444836] ? flush_tlb_func+0xe9/0x2e0
[ 688.444845] kvm_mmu_notifier_invalidate_range_end.cold+0x18/0x4f [kvm]
[ 688.444906] __mmu_notifier_invalidate_range_end+0x63/0xe0
[ 688.444917] __split_huge_pmd+0x367/0xfc0
[ 688.444928] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.444931] ? alloc_pages_mpol+0x97/0x210
[ 688.444941] do_huge_pmd_wp_page+0x1cc/0x380
[ 688.444946] __handle_mm_fault+0x8ee/0xe50
[ 688.444958] handle_mm_fault+0xe4/0x4a0
[ 688.444962] __get_user_pages+0x190/0x840
[ 688.444972] get_user_pages_unlocked+0xe0/0x590
[ 688.444977] hva_to_pfn+0x114/0x550 [kvm]
[ 688.445007] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445011] ? __gfn_to_pfn_memslot+0x3b/0xd0 [kvm]
[ 688.445037] kvm_faultin_pfn+0xed/0x5b0 [kvm]
[ 688.445079] kvm_tdp_page_fault+0x123/0x170 [kvm]
[ 688.445109] kvm_mmu_page_fault+0x244/0xaa0 [kvm]
[ 688.445136] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445138] ? kvm_io_bus_get_first_dev+0x56/0xf0 [kvm]
[ 688.445165] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445171] ? svm_vcpu_run+0x329/0x7c0 [kvm_amd]
[ 688.445186] vcpu_enter_guest+0x592/0x1070 [kvm]
[ 688.445223] kvm_arch_vcpu_ioctl_run+0x145/0x8a0 [kvm]
[ 688.445254] kvm_vcpu_ioctl+0x288/0x6d0 [kvm]
[ 688.445279] ? vcpu_put+0x22/0x50 [kvm]
[ 688.445305] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445307] ? kvm_arch_vcpu_ioctl_run+0x346/0x8a0 [kvm]
[ 688.445335] __x64_sys_ioctl+0x8f/0xd0
[ 688.445343] do_syscall_64+0x77/0x120
[ 688.445353] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445355] ? fire_user_return_notifiers+0x42/0x70
[ 688.445363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445365] ? syscall_exit_to_user_mode+0x82/0x1b0
[ 688.445372] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445377] ? do_syscall_64+0x86/0x120
[ 688.445380] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445383] ? do_syscall_64+0x86/0x120
[ 688.445388] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445392] ? do_syscall_64+0x86/0x120
[ 688.445396] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445400] ? do_syscall_64+0x86/0x120
[ 688.445404] ? do_syscall_64+0x86/0x120
[ 688.445407] ? do_syscall_64+0x86/0x120
[ 688.445410] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 688.445416] RIP: 0033:0x7fdf2ed1a94f
[ 688.445421] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89
44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41>
89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ 688.445424] RSP: 002b:00007fc127bff460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 688.445429] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fdf2ed1a94f
[ 688.445432] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000016
[ 688.445434] RBP: 00005586f80dc350 R08: 00005586f6a0af10 R09: 00000000ffffffff
[ 688.445436] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 688.445438] R13: 0000000000000001 R14: 0000000000000cf8 R15: 0000000000000000
[ 688.445443] </TASK>
[ 688.445444] ---[ end trace ]---

However, the VM boots up fine without any issues and is operational.

The error is due to an invalid assignment in the KVM invalidate range end
path. There is no arch-specific handler for this case, so the handler is
set to kvm_null_fn(). This is an empty function that returns void, yet its
return value is assigned to a boolean variable. UBSAN complains about this
incompatible assignment when the kernel is compiled with CONFIG_UBSAN.

Fix the issue by adding a check for the null handler.

Signed-off-by: Babu Moger <[email protected]>
---
Seems like a straightforward fix to me. Let me know if you think otherwise;
I am new to this area of the code. First of all, it is not clear to me why
the handler needs to be called when no memory slot is found in the hva range.
---
virt/kvm/kvm_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 14841acb8b95..ee8be1835214 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -653,7 +653,8 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 				if (IS_KVM_NULL_FN(range->handler))
 					break;
 			}
-			r.ret |= range->handler(kvm, &gfn_range);
+			if (!IS_KVM_NULL_FN(range->handler))
+				r.ret |= range->handler(kvm, &gfn_range);
 		}
 	}

--
2.34.1
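
For readers unfamiliar with the pattern, the "null handler" check described in the commit message can be sketched in standalone C. This is only an illustration under stated assumptions: `range_handler_t`, `null_fn`, `NULL_HANDLER`, `IS_NULL_FN`, `flush_even_slot` and `handle_slots` are hypothetical names, not the kernel's definitions. The sentinel stub returns void, so it may be compared against but must never be called through the bool-returning handler type.

```c
#include <stdbool.h>

/* Hypothetical handler type: returns true if a flush would be needed. */
typedef bool (*range_handler_t)(int slot);

/*
 * Sentinel stub meaning "no handler". It returns void, so calling it
 * through range_handler_t would read a bool that was never produced.
 */
static void null_fn(void)
{
}

/* Assumed helpers for this sketch: compare against the stub, never call it. */
#define NULL_HANDLER   ((range_handler_t)null_fn)
#define IS_NULL_FN(fn) ((fn) == NULL_HANDLER)

/* Example handler: pretend even-numbered slots need a flush. */
static bool flush_even_slot(int slot)
{
	return slot % 2 == 0;
}

/* OR together the handler results, skipping the call for the sentinel. */
static bool handle_slots(range_handler_t handler, int nr_slots)
{
	bool ret = false;

	for (int slot = 0; slot < nr_slots; slot++) {
		if (IS_NULL_FN(handler))
			break;
		ret |= handler(slot);
	}
	return ret;
}
```

With these definitions, `handle_slots(NULL_HANDLER, 4)` never invokes the stub and returns false, while `handle_slots(flush_even_slot, 4)` returns true.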



2024-06-03 17:54:26

by Sean Christopherson

Subject: Re: [PATCH] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error

On Mon, Jun 03, 2024, Babu Moger wrote:
> System throws this following UBSAN: invalid-load error when the very first
> VM is powered up on a freshly booted host machine. Happens only with 2P or
> 4P (multiple sockets) systems.

...

> However, VM boots up fine without any issues and operational.
>
> The error is due to invalid assignment in kvm invalidate range end path.
> There is no arch specific handler for this case and handler is assigned
> to kvm_null_fn(). This is an empty function and returns void. Return value
> of this function is assigned to boolean variable. UBSAN complains about
> this incompatible assignment when kernel is compiled with CONFIG_UBSAN.
>
> Fix the issue by adding a check for the null handler.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> Seems straight forward fix to me. Point me if you think otherwise. New
> to this area of the code. First of all not clear to me why handler need
> to be called when memory slot is not found in the hva range.
> ---
> virt/kvm/kvm_main.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 14841acb8b95..ee8be1835214 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -653,7 +653,8 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  				if (IS_KVM_NULL_FN(range->handler))
>  					break;
>  			}
> -			r.ret |= range->handler(kvm, &gfn_range);
> +			if (!IS_KVM_NULL_FN(range->handler))
> +				r.ret |= range->handler(kvm, &gfn_range);

Hrm, this should be unreachable; the IS_KVM_NULL_FN() check just above is
supposed to bail after locking.

Ah, the "break" will only break out of the memslot loop, it won't break out of
the address space loop. Stupid SMM.

I think this is what we want.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b312d0cbe60b..70f5a39f8302 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -651,7 +651,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 					range->on_lock(kvm);

 				if (IS_KVM_NULL_FN(range->handler))
-					break;
+					goto mmu_unlock;
 			}
 			r.ret |= range->handler(kvm, &gfn_range);
 		}
@@ -660,6 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 	if (range->flush_on_ret && r.ret)
 		kvm_flush_remote_tlbs(kvm);

+mmu_unlock:
 	if (r.found_memslot)
 		KVM_MMU_UNLOCK(kvm);
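
The control-flow problem described above (a `break` that leaves only the inner memslot loop, so the outer address space loop keeps going) can be reproduced in a small standalone sketch. Everything here is an assumption made for illustration: the loop bounds, the function name `null_handler_call_sites_reached`, and the counter that stands in for the `range->handler()` call. It models the case where no handler is set.

```c
#include <stdbool.h>

#define NR_ADDRESS_SPACES 2	/* assumed for the sketch, e.g. normal + SMM */
#define NR_SLOTS          3	/* assumed number of matching memslots */

/*
 * Count how often the handler call site is reached when no handler is set.
 * use_goto == false models the old "break"; true models "goto mmu_unlock".
 */
static int null_handler_call_sites_reached(bool use_goto)
{
	bool found_memslot = false;
	int reached = 0;

	for (int as = 0; as < NR_ADDRESS_SPACES; as++) {
		for (int slot = 0; slot < NR_SLOTS; slot++) {
			if (!found_memslot) {
				found_memslot = true;	/* lock would be taken here */
				if (use_goto)
					goto unlock;
				break;	/* leaves only the slot loop */
			}
			reached++;	/* stands in for range->handler() */
		}
	}
unlock:
	return reached;
}
```

In this model the `break` variant bails on the first slot of the first address space, but on the second address space `found_memslot` is already set, so the bail-out check is skipped and the call site is reached for every slot (3 times here). The `goto` variant reaches it 0 times.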

2024-06-03 17:57:12

by Paolo Bonzini

Subject: Re: [PATCH] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error

On Mon, Jun 3, 2024 at 7:54 PM Sean Christopherson <[email protected]> wrote:
> > However, VM boots up fine without any issues and operational.

Yes, the caller uses kvm_handle_hva_range() as if it returned void.

> Ah, the "break" will only break out of the memslot loop, it won't break out of
> the address space loop. Stupid SMM.
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b312d0cbe60b..70f5a39f8302 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -651,7 +651,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  					range->on_lock(kvm);
>
>  				if (IS_KVM_NULL_FN(range->handler))
> -					break;
> +					goto mmu_unlock;
>  			}
>  			r.ret |= range->handler(kvm, &gfn_range);
>  		}
> @@ -660,6 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  	if (range->flush_on_ret && r.ret)
>  		kvm_flush_remote_tlbs(kvm);
>
> +mmu_unlock:
>  	if (r.found_memslot)
>  		KVM_MMU_UNLOCK(kvm);

Yep. If you want to just reply with Signed-off-by I'll mix the
original commit message and your patch.

Paolo


2024-06-03 20:38:53

by Sean Christopherson

Subject: Re: [PATCH] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error

On Mon, Jun 03, 2024, Paolo Bonzini wrote:
> On Mon, Jun 3, 2024 at 7:54 PM Sean Christopherson <[email protected]> wrote:
> > > However, VM boots up fine without any issues and operational.
>
> Yes, the caller uses kvm_handle_hva_range() as if it returned void.
>
> > Ah, the "break" will only break out of the memslot loop, it won't break out of
> > the address space loop. Stupid SMM.
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index b312d0cbe60b..70f5a39f8302 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -651,7 +651,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
> >  					range->on_lock(kvm);
> >
> >  				if (IS_KVM_NULL_FN(range->handler))
> > -					break;
> > +					goto mmu_unlock;
> >  			}
> >  			r.ret |= range->handler(kvm, &gfn_range);
> >  		}
> > @@ -660,6 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
> >  	if (range->flush_on_ret && r.ret)
> >  		kvm_flush_remote_tlbs(kvm);
> >
> > +mmu_unlock:
> >  	if (r.found_memslot)
> >  		KVM_MMU_UNLOCK(kvm);
>
> Yep. If you want to just reply with Signed-off-by I'll mix the
> original commit message and your patch.

Signed-off-by: Sean Christopherson <[email protected]>

2024-06-06 15:13:04

by Moger, Babu

Subject: Re: [PATCH] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error



On 6/3/24 15:38, Sean Christopherson wrote:
> On Mon, Jun 03, 2024, Paolo Bonzini wrote:
>> On Mon, Jun 3, 2024 at 7:54 PM Sean Christopherson <[email protected]> wrote:
>>>> However, VM boots up fine without any issues and operational.
>>
>> Yes, the caller uses kvm_handle_hva_range() as if it returned void.
>>
>>> Ah, the "break" will only break out of the memslot loop, it won't break out of
>>> the address space loop. Stupid SMM.
>>>
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index b312d0cbe60b..70f5a39f8302 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -651,7 +651,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>>>  					range->on_lock(kvm);
>>>
>>>  				if (IS_KVM_NULL_FN(range->handler))
>>> -					break;
>>> +					goto mmu_unlock;
>>>  			}
>>>  			r.ret |= range->handler(kvm, &gfn_range);
>>>  		}
>>> @@ -660,6 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>>>  	if (range->flush_on_ret && r.ret)
>>>  		kvm_flush_remote_tlbs(kvm);
>>>
>>> +mmu_unlock:
>>>  	if (r.found_memslot)
>>>  		KVM_MMU_UNLOCK(kvm);
>>
>> Yep. If you want to just reply with Signed-off-by I'll mix the
>> original commit message and your patch.
>
> Signed-off-by: Sean Christopherson <[email protected]>

Thanks Sean, Paolo.

I will send v2 with Sean's patch.
--
Thanks
Babu Moger

2024-06-12 14:42:20

by Moger, Babu

Subject: [PATCH v2] KVM: Fix Undefined Behavior Sanitizer(UBSAN) error

The system throws the following UBSAN invalid-load error when the very
first VM is powered up on a freshly booted host machine. It happens only
on 2P or 4P (multi-socket) systems.

[ 688.429145] ------------[ cut here ]------------
[ 688.429156] UBSAN: invalid-load in arch/x86/kvm/../../../virt/kvm/kvm_main.c:655:10
[ 688.437739] load of value 160 is not a valid value for type '_Bool'
[ 688.444760] CPU: 370 PID: 8246 Comm: CPU 0/KVM Not tainted 6.8.2-amdsos-build58-ubuntu-22.04+ #1
[ 688.444767] Hardware name: AMD Corporation Sh54p/Sh54p, BIOS WPC4429N 04/25/2024
[ 688.444770] Call Trace:
[ 688.444777] <TASK>
[ 688.444787] dump_stack_lvl+0x48/0x60
[ 688.444810] ubsan_epilogue+0x5/0x30
[ 688.444823] __ubsan_handle_load_invalid_value+0x79/0x80
[ 688.444827] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.444836] ? flush_tlb_func+0xe9/0x2e0
[ 688.444845] kvm_mmu_notifier_invalidate_range_end.cold+0x18/0x4f [kvm]
[ 688.444906] __mmu_notifier_invalidate_range_end+0x63/0xe0
[ 688.444917] __split_huge_pmd+0x367/0xfc0
[ 688.444928] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.444931] ? alloc_pages_mpol+0x97/0x210
[ 688.444941] do_huge_pmd_wp_page+0x1cc/0x380
[ 688.444946] __handle_mm_fault+0x8ee/0xe50
[ 688.444958] handle_mm_fault+0xe4/0x4a0
[ 688.444962] __get_user_pages+0x190/0x840
[ 688.444972] get_user_pages_unlocked+0xe0/0x590
[ 688.444977] hva_to_pfn+0x114/0x550 [kvm]
[ 688.445007] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445011] ? __gfn_to_pfn_memslot+0x3b/0xd0 [kvm]
[ 688.445037] kvm_faultin_pfn+0xed/0x5b0 [kvm]
[ 688.445079] kvm_tdp_page_fault+0x123/0x170 [kvm]
[ 688.445109] kvm_mmu_page_fault+0x244/0xaa0 [kvm]
[ 688.445136] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445138] ? kvm_io_bus_get_first_dev+0x56/0xf0 [kvm]
[ 688.445165] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445171] ? svm_vcpu_run+0x329/0x7c0 [kvm_amd]
[ 688.445186] vcpu_enter_guest+0x592/0x1070 [kvm]
[ 688.445223] kvm_arch_vcpu_ioctl_run+0x145/0x8a0 [kvm]
[ 688.445254] kvm_vcpu_ioctl+0x288/0x6d0 [kvm]
[ 688.445279] ? vcpu_put+0x22/0x50 [kvm]
[ 688.445305] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445307] ? kvm_arch_vcpu_ioctl_run+0x346/0x8a0 [kvm]
[ 688.445335] __x64_sys_ioctl+0x8f/0xd0
[ 688.445343] do_syscall_64+0x77/0x120
[ 688.445353] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445355] ? fire_user_return_notifiers+0x42/0x70
[ 688.445363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445365] ? syscall_exit_to_user_mode+0x82/0x1b0
[ 688.445372] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445377] ? do_syscall_64+0x86/0x120
[ 688.445380] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445383] ? do_syscall_64+0x86/0x120
[ 688.445388] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445392] ? do_syscall_64+0x86/0x120
[ 688.445396] ? srso_alias_return_thunk+0x5/0xfbef5
[ 688.445400] ? do_syscall_64+0x86/0x120
[ 688.445404] ? do_syscall_64+0x86/0x120
[ 688.445407] ? do_syscall_64+0x86/0x120
[ 688.445410] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 688.445416] RIP: 0033:0x7fdf2ed1a94f
[ 688.445421] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89
44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41>
89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ 688.445424] RSP: 002b:00007fc127bff460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 688.445429] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fdf2ed1a94f
[ 688.445432] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000016
[ 688.445434] RBP: 00005586f80dc350 R08: 00005586f6a0af10 R09: 00000000ffffffff
[ 688.445436] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 688.445438] R13: 0000000000000001 R14: 0000000000000cf8 R15: 0000000000000000
[ 688.445443] </TASK>
[ 688.445444] ---[ end trace ]---

However, the VM boots up fine without any issues and is operational.

The error is due to an invalid assignment in the KVM invalidate range end
path. There is no arch-specific handler for this case, so the handler is
set to kvm_null_fn(). This is an empty function that returns void, yet its
return value is assigned to a boolean variable. UBSAN complains about this
incompatible assignment when the kernel is compiled with CONFIG_UBSAN.

Fix the issue by jumping straight to the unlock path, out of both the
memslot loop and the address space loop, when the handler is null.

Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
v2: Updated with Sean's patch. Added his Signed-off-by.
---
virt/kvm/kvm_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 14841acb8b95..d65d3aa99650 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -651,7 +651,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 					range->on_lock(kvm);

 				if (IS_KVM_NULL_FN(range->handler))
-					break;
+					goto mmu_unlock;
 			}
 			r.ret |= range->handler(kvm, &gfn_range);
 		}
@@ -660,6 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 	if (range->flush_on_ret && r.ret)
 		kvm_flush_remote_tlbs(kvm);

+mmu_unlock:
 	if (r.found_memslot)
 		KVM_MMU_UNLOCK(kvm);

--
2.34.1