From: Colin Ian King <[email protected]>
The allocation for *gfn_track should be for a slot->npages lot of
short integers, however the current allocation is using sizeof(*gfn_track)
and that is the size of a pointer, which is too large. Fix this by
using sizeof(**gfn_track) instead.
Addresses-Coverity: ("Wrong sizeof argument")
Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
Signed-off-by: Colin Ian King <[email protected]>
---
arch/x86/kvm/mmu/page_track.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index bb5d60bd4dbf..5b785a5f7dc9 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -92,7 +92,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
slots = __kvm_memslots(kvm, i);
kvm_for_each_memslot(slot, slots) {
gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
- *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
+ *gfn_track = kvcalloc(slot->npages, sizeof(**gfn_track),
GFP_KERNEL_ACCOUNT);
if (*gfn_track == NULL) {
mutex_unlock(&kvm->slots_arch_lock);
--
2.32.0
On Fri, Oct 01, 2021, Colin King wrote:
> From: Colin Ian King <[email protected]>
>
> The allocation for *gfn_track should be for a slot->npages lot of
> short integers, however the current allocation is using sizeof(*gfn_track)
> and that is the size of a pointer, which is too large. Fix this by
> using sizeof(**gfn_track) instead.
>
> Addresses-Coverity: ("Wrong sizeof argument")
> Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
> Signed-off-by: Colin Ian King <[email protected]>
> ---
> arch/x86/kvm/mmu/page_track.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> index bb5d60bd4dbf..5b785a5f7dc9 100644
> --- a/arch/x86/kvm/mmu/page_track.c
> +++ b/arch/x86/kvm/mmu/page_track.c
> @@ -92,7 +92,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
> slots = __kvm_memslots(kvm, i);
> kvm_for_each_memslot(slot, slots) {
> gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> - *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> + *gfn_track = kvcalloc(slot->npages, sizeof(**gfn_track),
> GFP_KERNEL_ACCOUNT);
Eww (not your patch, the original code). IMO the double indirection is completely
unnecessary, e.g. I find this far easier to follow
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index bb5d60bd4dbf..8cae41b831dd 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -75,7 +75,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *slot;
- unsigned short **gfn_track;
+ unsigned short *gfn_track;
int i;
if (write_tracking_enabled(kvm))
@@ -91,13 +91,13 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
slots = __kvm_memslots(kvm, i);
kvm_for_each_memslot(slot, slots) {
- gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
- *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
- GFP_KERNEL_ACCOUNT);
- if (*gfn_track == NULL) {
+ gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
+ GFP_KERNEL_ACCOUNT);
+ if (gfn_track == NULL) {
mutex_unlock(&kvm->slots_arch_lock);
return -ENOMEM;
}
+ slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE] = gfn_track;
}
}
> if (*gfn_track == NULL) {
> mutex_unlock(&kvm->slots_arch_lock);
Hrm, this fails to free the gfn_track allocations for previous memslots. The
on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
in the _current_ slot, but does not free previous slots).
And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
and means we have to maintain two near-identical copies of non-obvious code.
Paolo, is it too late to just drop the original deae4a10f166 ("KVM: x86: only
allocate gfn_track when necessary")?
> --
> 2.32.0
>
On 05/10/21 17:41, Sean Christopherson wrote:
>> if (*gfn_track == NULL) {
>> mutex_unlock(&kvm->slots_arch_lock);
> Hrm, this fails to free the gfn_track allocations for previous memslots. The
> on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> in the_current_ slot, but does not free previous slots).
That's not a huge deal because the syscall is failing. So as long as
it's not leaked forever, it's okay. The problem is the
WARN_ON(slot->arch.rmap[i]), or the missing check in
kvm_page_track_enable_mmu_write_tracking, but that's easily fixed. I'd
even remove the call to memslot_rmaps_free.
> And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> and means we have to maintain two near-identical copies of non-obvious code.
I was thinking the separate flow (not so much the flag) is needed
because, if KVMGT is enabled, gfn_track is allocated unconditionally.
rmaps are added on top of that if shadow paging is enabled; but
kvm_page_track_create_memslot will have already created the counter,
including the one for KVM_PAGE_TRACK_WRITE.
But looking at the code again, I guess you could call
kvm_page_track_enable_mmu_write_tracking inside alloc_all_memslots_rmaps
(with a little bit of renaming), and with that the flag would go away.
I'll take a look tomorrow, but I'd rather avoid reverting the patch.
Thanks,
Paolo
> Paolo, is it too late to just drop the original deae4a10f166 ("KVM: x86: only
> allocate gfn_track when necessary")?
>
On Tue, Oct 05, 2021, Paolo Bonzini wrote:
> On 05/10/21 17:41, Sean Christopherson wrote:
> > > if (*gfn_track == NULL) {
> > > mutex_unlock(&kvm->slots_arch_lock);
> > Hrm, this fails to free the gfn_track allocations for previous memslots. The
> > on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> > in the_current_ slot, but does not free previous slots).
>
> That's not a huge deal because the syscall is failing. So as long as it's
> not leaked forever, it's okay. The problem is the
> WARN_ON(slot->arch.rmap[i]), or the missing check in
> kvm_page_track_enable_mmu_write_tracking, but that's easily fixed. I'd even
> remove the call to memslot_rmaps_free.
It can be leaked forever though, e.g. if userspace invokes KVM_RUN over and over
on -ENOMEM. That would trigger the WARN_ON(slot->arch.rmap[i]) and leak the
previous allocation. I think it would be safe to change that WARN_ON to a
check-and-continue, i.e. to preserve the previous allocation
> > And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> > and means we have to maintain two near-identical copies of non-obvious code.
>
> I was thinking the separate flow (not so much the flag) is needed because,
> if KVMGT is enabled, gfn_track is allocated unconditionally. rmaps are added
> on top of that if shadow paging is enabled; but
> kvm_page_track_create_memslot will have already created the counter,
> including the one for KVM_PAGE_TRACK_WRITE.
>
> But looking at the code again, I guess you could call
> kvm_page_track_enable_mmu_write_tracking inside alloc_all_memslots_rmaps
> (with a little bit of renaming), and with that the flag would go away.
Yes, and reuse the control flow, which is what I really care about since that's
the part that both features get wrong.
> I'll take a look tomorrow, but I'd rather avoid reverting the patch.
I can poke at it too if you don't have time. I wasn't suggesting a full revert,
rather a "drop and pretend it never got applied", with a plan to apply a new
version instead of fixing up the current code.
On 05/10/21 19:55, Sean Christopherson wrote:
> I wasn't suggesting a full revert,
> rather a "drop and pretend it never got applied", with a plan to apply a new
> version instead of fixing up the current code.
Considering that there are issues in the rmaps as well, I'd rather fix
both the right way.
Paolo
On Wed, Oct 6, 2021 at 12:41 AM Sean Christopherson <[email protected]> wrote:
>
> On Fri, Oct 01, 2021, Colin King wrote:
> > From: Colin Ian King <[email protected]>
> >
> > The allocation for *gfn_track should be for a slot->npages lot of
> > short integers, however the current allocation is using sizeof(*gfn_track)
> > and that is the size of a pointer, which is too large. Fix this by
> > using sizeof(**gfn_track) instead.
> >
> > Addresses-Coverity: ("Wrong sizeof argument")
> > Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
> > Signed-off-by: Colin Ian King <[email protected]>
> > ---
> > arch/x86/kvm/mmu/page_track.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> > index bb5d60bd4dbf..5b785a5f7dc9 100644
> > --- a/arch/x86/kvm/mmu/page_track.c
> > +++ b/arch/x86/kvm/mmu/page_track.c
> > @@ -92,7 +92,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
> > slots = __kvm_memslots(kvm, i);
> > kvm_for_each_memslot(slot, slots) {
> > gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> > - *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> > + *gfn_track = kvcalloc(slot->npages, sizeof(**gfn_track),
> > GFP_KERNEL_ACCOUNT);
>
> Eww (not your patch, the original code). IMO the double indirection is completely
> unnecessary, e.g. I find this far easier to follow
>
> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> index bb5d60bd4dbf..8cae41b831dd 100644
> --- a/arch/x86/kvm/mmu/page_track.c
> +++ b/arch/x86/kvm/mmu/page_track.c
> @@ -75,7 +75,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
> {
> struct kvm_memslots *slots;
> struct kvm_memory_slot *slot;
> - unsigned short **gfn_track;
> + unsigned short *gfn_track;
> int i;
>
> if (write_tracking_enabled(kvm))
> @@ -91,13 +91,13 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
> for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> slots = __kvm_memslots(kvm, i);
> kvm_for_each_memslot(slot, slots) {
> - gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> - *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> - GFP_KERNEL_ACCOUNT);
> - if (*gfn_track == NULL) {
> + gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> + GFP_KERNEL_ACCOUNT);
> + if (gfn_track == NULL) {
> mutex_unlock(&kvm->slots_arch_lock);
> return -ENOMEM;
> }
> + slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE] = gfn_track;
> }
> }
>
>
>
> > if (*gfn_track == NULL) {
> > mutex_unlock(&kvm->slots_arch_lock);
>
> Hrm, this fails to free the gfn_track allocations for previous memslots. The
> on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> in the _current_ slot, but does not free previous slots).
>
> And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> and means we have to maintain two near-identical copies of non-obvious code.
I agree that's better than my patch. I can put together a new patch
once it's decided whether or not my patch should be dropped.
-David
On Wed, Oct 06, 2021, David Stevens wrote:
> On Wed, Oct 6, 2021 at 12:41 AM Sean Christopherson <[email protected]> wrote:
> > Hrm, this fails to free the gfn_track allocations for previous memslots. The
> > on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> > in the _current_ slot, but does not free previous slots).
> >
> > And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> > and means we have to maintain two near-identical copies of non-obvious code.
>
> I agree that's better than my patch. I can put together a new patch
> once it's decided whether or not my patch should be dropped.
All yours, unless Paolo wants to fight you for it :-) I'm totally ok doing
cleanup/fixes on top.