2022-12-29 12:47:03

by Wei Wang

[permalink] [raw]
Subject: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

Current usage of kvm_io_device requires users to destruct it with an extra
call of kvm_iodevice_destructor after the device gets unregistered from
the kvm_io_bus. This is not necessary and can cause errors if a user
forgot to make the extra call.

Simplify the usage by combining kvm_iodevice_destructor into
kvm_io_bus_unregister_dev. This reduces LOCs a bit for users and can
avoid the leakage of destructing the device explicitly.

The fix was originally provided by Sean Christopherson.
Link: https://lore.kernel.org/lkml/DS0PR11MB6373F27D0EE6CD28C784478BDCEC9@DS0PR11MB6373.namprd11.prod.outlook.com/T/
Fixes: 5d3c4c79384a ("KVM: Stop looking for coalesced MMIO zones if the bus is destroyed")
Cc: [email protected]
Reported-by: 柳菁峰 <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
---
include/kvm/iodev.h | 6 ------
virt/kvm/coalesced_mmio.c | 1 -
virt/kvm/eventfd.c | 1 -
virt/kvm/kvm_main.c | 24 +++++++++++++++---------
4 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
index d75fc4365746..56619e33251e 100644
--- a/include/kvm/iodev.h
+++ b/include/kvm/iodev.h
@@ -55,10 +55,4 @@ static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu,
: -EOPNOTSUPP;
}

-static inline void kvm_iodevice_destructor(struct kvm_io_device *dev)
-{
- if (dev->ops->destructor)
- dev->ops->destructor(dev);
-}
-
#endif /* __KVM_IODEV_H__ */
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 0be80c213f7f..d7135a5e76f8 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -195,7 +195,6 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
*/
if (r)
break;
- kvm_iodevice_destructor(&dev->dev);
}
}

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 2a3ed401ce46..1b277afb545b 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
bus = kvm_get_bus(kvm, bus_idx);
if (bus)
bus->ioeventfd_count--;
- ioeventfd_release(p);
ret = 0;
break;
}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 13e88297f999..582757ebdce6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5200,6 +5200,12 @@ static struct notifier_block kvm_reboot_notifier = {
.priority = 0,
};

+static void kvm_iodevice_destructor(struct kvm_io_device *dev)
+{
+ if (dev->ops->destructor)
+ dev->ops->destructor(dev);
+}
+
static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
{
int i;
@@ -5423,7 +5429,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
struct kvm_io_device *dev)
{
- int i, j;
+ int i;
struct kvm_io_bus *new_bus, *bus;

lockdep_assert_held(&kvm->slots_lock);
@@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
synchronize_srcu_expedited(&kvm->srcu);

- /* Destroy the old bus _after_ installing the (null) bus. */
+ /*
+ * If (null) bus is installed, destroy the old bus, including all the
+ * attached devices. Otherwise, destroy the caller's device only.
+ */
if (!new_bus) {
pr_err("kvm: failed to shrink bus, removing it completely\n");
- for (j = 0; j < bus->dev_count; j++) {
- if (j == i)
- continue;
- kvm_iodevice_destructor(bus->range[j].dev);
- }
+ kvm_io_bus_destroy(bus);
+ return -ENOMEM;
}

- kfree(bus);
- return new_bus ? 0 : -ENOMEM;
+ kvm_iodevice_destructor(dev);
+ return 0;
}

struct kvm_io_device *kvm_io_bus_get_dev(struct kvm *kvm, enum kvm_bus bus_idx,
--
2.27.0


2023-01-23 23:25:30

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

On Thu, Dec 29, 2022, Wei Wang wrote:
> Current usage of kvm_io_device requires users to destruct it with an extra
> call of kvm_iodevice_destructor after the device gets unregistered from
> the kvm_io_bus. This is not necessary and can cause errors if a user
> forgot to make the extra call.
>
> Simplify the usage by combining kvm_iodevice_destructor into
> kvm_io_bus_unregister_dev. This reduces LOCs a bit for users and can
> avoid the leakage of destructing the device explicitly.
>
> The fix was originally provided by Sean Christopherson.
> Link: https://lore.kernel.org/lkml/DS0PR11MB6373F27D0EE6CD28C784478BDCEC9@DS0PR11MB6373.namprd11.prod.outlook.com/T/
> Fixes: 5d3c4c79384a ("KVM: Stop looking for coalesced MMIO zones if the bus is destroyed")
> Cc: [email protected]
> Reported-by: 柳菁峰 <[email protected]>
> Signed-off-by: Wei Wang <[email protected]>
> ---
> include/kvm/iodev.h | 6 ------
> virt/kvm/coalesced_mmio.c | 1 -
> virt/kvm/eventfd.c | 1 -
> virt/kvm/kvm_main.c | 24 +++++++++++++++---------
> 4 files changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
> index d75fc4365746..56619e33251e 100644
> --- a/include/kvm/iodev.h
> +++ b/include/kvm/iodev.h
> @@ -55,10 +55,4 @@ static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu,
> : -EOPNOTSUPP;
> }
>
> -static inline void kvm_iodevice_destructor(struct kvm_io_device *dev)
> -{
> - if (dev->ops->destructor)
> - dev->ops->destructor(dev);
> -}
> -
> #endif /* __KVM_IODEV_H__ */
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> index 0be80c213f7f..d7135a5e76f8 100644
> --- a/virt/kvm/coalesced_mmio.c
> +++ b/virt/kvm/coalesced_mmio.c
> @@ -195,7 +195,6 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
> */
> if (r)
> break;
> - kvm_iodevice_destructor(&dev->dev);

The comment lurking above this is now stale.

> }
> }
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 2a3ed401ce46..1b277afb545b 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
> bus = kvm_get_bus(kvm, bus_idx);
> if (bus)
> bus->ioeventfd_count--;
> - ioeventfd_release(p);
> ret = 0;
> break;
> }
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 13e88297f999..582757ebdce6 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5200,6 +5200,12 @@ static struct notifier_block kvm_reboot_notifier = {
> .priority = 0,
> };
>
> +static void kvm_iodevice_destructor(struct kvm_io_device *dev)
> +{
> + if (dev->ops->destructor)
> + dev->ops->destructor(dev);
> +}
> +
> static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
> {
> int i;
> @@ -5423,7 +5429,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
> int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
> struct kvm_io_device *dev)
> {
> - int i, j;
> + int i;
> struct kvm_io_bus *new_bus, *bus;
>
> lockdep_assert_held(&kvm->slots_lock);
> @@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
> rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> synchronize_srcu_expedited(&kvm->srcu);
>
> - /* Destroy the old bus _after_ installing the (null) bus. */
> + /*
> + * If (null) bus is installed, destroy the old bus, including all the
> + * attached devices. Otherwise, destroy the caller's device only.
> + */
> if (!new_bus) {
> pr_err("kvm: failed to shrink bus, removing it completely\n");
> - for (j = 0; j < bus->dev_count; j++) {
> - if (j == i)
> - continue;
> - kvm_iodevice_destructor(bus->range[j].dev);
> - }
> + kvm_io_bus_destroy(bus);
> + return -ENOMEM;

Returning an error code is unnecessary if unregister_dev() destroys the bus.
Nothing ultimately consumes the result, e.g. kvm_vm_ioctl_unregister_coalesced_mmio()
intentionally ignores the result other than to bail from the loop, and destroying
the bus means it will immediately bail from the loop anyways.

> }
>
> - kfree(bus);
> - return new_bus ? 0 : -ENOMEM;
> + kvm_iodevice_destructor(dev);
> + return 0;

Unless I'm misreading things, this path leaks "bus".

Given that that intent is to send the fix for stable, that this is as much a
cleanup as it is a bug fix, and that it's not super trivial, I'm inclined to queue
my patch and then land this on top as cleanup.

2023-01-24 20:22:44

by Michal Luczaj

[permalink] [raw]
Subject: Re: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

On 1/24/23 00:25, Sean Christopherson wrote:
> On Thu, Dec 29, 2022, Wei Wang wrote:
>> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>> index 2a3ed401ce46..1b277afb545b 100644
>> --- a/virt/kvm/eventfd.c
>> +++ b/virt/kvm/eventfd.c
>> @@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
>> bus = kvm_get_bus(kvm, bus_idx);
>> if (bus)
>> bus->ioeventfd_count--;
>> - ioeventfd_release(p);
>> ret = 0;
>> break;
>> }

I was wondering: would it make sense to simplify from
list_for_each_entry_safe() to list_for_each_entry() in this loop?

>> @@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
>> rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
>> synchronize_srcu_expedited(&kvm->srcu);
>>
>> - /* Destroy the old bus _after_ installing the (null) bus. */
>> + /*
>> + * If (null) bus is installed, destroy the old bus, including all the
>> + * attached devices. Otherwise, destroy the caller's device only.
>> + */
>> if (!new_bus) {
>> pr_err("kvm: failed to shrink bus, removing it completely\n");
>> - for (j = 0; j < bus->dev_count; j++) {
>> - if (j == i)
>> - continue;
>> - kvm_iodevice_destructor(bus->range[j].dev);
>> - }
>> + kvm_io_bus_destroy(bus);
>> + return -ENOMEM;
>
> Returning an error code is unnecessary if unregister_dev() destroys the bus.
> Nothing ultimately consumes the result, e.g. kvm_vm_ioctl_unregister_coalesced_mmio()
> intentionally ignores the result other than to bail from the loop, and destroying
> the bus means it will immediately bail from the loop anyways.

But it is important to know _if_ the bus was destroyed, right?
IOW, doesn't your comment from commit 5d3c4c79384a still hold?

(...) But, it doesn't tell the caller that it obliterated the
bus and invoked the destructor for all devices that were on the bus. In
the coalesced MMIO case, this can result in a deleted list entry
dereference due to attempting to continue iterating on coalesced_zones
after future entries (in the walk) have been deleted.

Michal


2023-01-24 20:55:13

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

On Tue, Jan 24, 2023, Michal Luczaj wrote:
> On 1/24/23 00:25, Sean Christopherson wrote:
> > On Thu, Dec 29, 2022, Wei Wang wrote:
> >> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> >> index 2a3ed401ce46..1b277afb545b 100644
> >> --- a/virt/kvm/eventfd.c
> >> +++ b/virt/kvm/eventfd.c
> >> @@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
> >> bus = kvm_get_bus(kvm, bus_idx);
> >> if (bus)
> >> bus->ioeventfd_count--;
> >> - ioeventfd_release(p);
> >> ret = 0;
> >> break;
> >> }
>
> I was wondering: would it make sense to simplify from
> list_for_each_entry_safe() to list_for_each_entry() in this loop?

Ooh, yeah, that's super confusing, at least to me, because the "safe" part implies
that the loop processes entries after kvm_io_bus_unregister_dev(), i.e. needs to
guard against failure same as the coalesced MMIO case.

Wei, want to tack on a patch in v2?

> >> @@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
> >> rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> >> synchronize_srcu_expedited(&kvm->srcu);
> >>
> >> - /* Destroy the old bus _after_ installing the (null) bus. */
> >> + /*
> >> + * If (null) bus is installed, destroy the old bus, including all the
> >> + * attached devices. Otherwise, destroy the caller's device only.
> >> + */
> >> if (!new_bus) {
> >> pr_err("kvm: failed to shrink bus, removing it completely\n");
> >> - for (j = 0; j < bus->dev_count; j++) {
> >> - if (j == i)
> >> - continue;
> >> - kvm_iodevice_destructor(bus->range[j].dev);
> >> - }
> >> + kvm_io_bus_destroy(bus);
> >> + return -ENOMEM;
> >
> > Returning an error code is unnecessary if unregister_dev() destroys the bus.
> > Nothing ultimately consumes the result, e.g. kvm_vm_ioctl_unregister_coalesced_mmio()
> > intentionally ignores the result other than to bail from the loop, and destroying
> > the bus means it will immediately bail from the loop anyways.
>
> But it is important to know _if_ the bus was destroyed, right?
> IOW, doesn't your comment from commit 5d3c4c79384a still hold?

/facepalm

Yes, it matters. I somehow got on the train of thought that list_for_each_entry_safe()
magically bails if the list is purged, but the safe variant only plays nice with
the _current_ entry being deleted.

So yeah, the return code needs to stay.

> (...) But, it doesn't tell the caller that it obliterated the
> bus and invoked the destructor for all devices that were on the bus. In
> the coalesced MMIO case, this can result in a deleted list entry
> dereference due to attempting to continue iterating on coalesced_zones
> after future entries (in the walk) have been deleted.
>
> Michal
>

2023-01-25 02:38:28

by Wei Wang

[permalink] [raw]
Subject: RE: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

On Tuesday, January 24, 2023 7:25 AM, Sean Christopherson wrote:
> > - kfree(bus);
> > - return new_bus ? 0 : -ENOMEM;
> > + kvm_iodevice_destructor(dev);
> > + return 0;
>
> Unless I'm misreading things, this path leaks "bus".

Right, should keep the kfree above.

> Given that that intent is to send the fix for stable, that this is as much a
> cleanup as it is a bug fix, and that it's not super trivial, I'm inclined to queue
> my patch and then land this on top as cleanup.

Sounds good to me.

2023-01-25 02:42:40

by Wei Wang

[permalink] [raw]
Subject: RE: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus

On Wednesday, January 25, 2023 4:55 AM, Sean Christopherson wrote:
> >
> > I was wondering: would it make sense to simplify from
> > list_for_each_entry_safe() to list_for_each_entry() in this loop?
>
> Ooh, yeah, that's super confusing, at least to me, because the "safe" part
> implies that the loop processes entries after kvm_io_bus_unregister_dev(),
> i.e. needs to guard against failure same as the coalesced MMIO case.
>
> Wei, want to tack on a patch in v2?

Yes. I will include it in v2.