2021-07-26 14:40:03

by Christoph Hellwig

Subject: two small mdev fixups

Hi all,

two small mdev fixes - one to fix mdev for built-in drivers, and the other
one to remove a pointless warning.


2021-07-26 14:40:15

by Christoph Hellwig

Subject: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

Only a single driver actually sets the ->request method, so don't print
a scary warning if it isn't.

Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/vfio/mdev/mdev_core.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b16606ebafa1..b314101237fe 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -138,10 +138,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	if (!dev)
 		return -EINVAL;

-	/* Not mandatory, but its absence could be a problem */
-	if (!ops->request)
-		dev_info(dev, "Driver cannot be asked to release device\n");
-
 	mutex_lock(&parent_list_lock);

 	/* Check for duplicate */
--
2.30.2

2021-07-26 17:13:06

by Cornelia Huck

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Mon, Jul 26 2021, Christoph Hellwig <[email protected]> wrote:

> Only a single driver actually sets the ->request method, so don't print
> a scary warning if it isn't.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/vfio/mdev/mdev_core.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b16606ebafa1..b314101237fe 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -138,10 +138,6 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>  	if (!dev)
>  		return -EINVAL;
>
> -	/* Not mandatory, but its absence could be a problem */
> -	if (!ops->request)
> -		dev_info(dev, "Driver cannot be asked to release device\n");
> -
>  	mutex_lock(&parent_list_lock);
>
>  	/* Check for duplicate */

We also log a warning if we would like to call ->request() but none was
provided, so I think that's fine.

Reviewed-by: Cornelia Huck <[email protected]>

But I wonder why nobody else implements this? Lack of surprise removal?
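
The point about logging only when ->request() would actually be called
can be illustrated with a minimal userspace sketch. Everything named
here (struct parent_ops, ask_release, log_buf, count_requests) is
hypothetical and is not the kernel's definition; it only models an
optional callback with a notice emitted at call time rather than at
registration time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parent ops table; not the kernel struct. */
struct parent_ops {
	void (*request)(void *dev, unsigned int count); /* optional */
};

/* Last notice emitted, kept in a buffer so callers can inspect it. */
static char log_buf[128];

/*
 * Ask the driver to release the device. If no callback was provided,
 * emit a notice on the first attempt only (count == 0) instead of
 * complaining when the ops table is registered.
 * Returns 1 if the callback was invoked, 0 otherwise.
 */
static int ask_release(const struct parent_ops *ops, void *dev,
		       unsigned int count)
{
	assert(ops != NULL);

	if (ops->request) {
		ops->request(dev, count);
		return 1;
	}
	if (count == 0)
		snprintf(log_buf, sizeof(log_buf),
			 "no request callback, blocked until released by user");
	return 0;
}

/* Example callback: counts how many times release was requested. */
static void count_requests(void *dev, unsigned int count)
{
	(void)count;
	++*(unsigned int *)dev;
}
```

A caller that leaves .request NULL gets the notice once, at the moment
it matters; a caller that sets it (here to count_requests) never sees
one.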

2021-07-26 23:08:50

by Jason Gunthorpe

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Mon, Jul 26, 2021 at 04:35:24PM +0200, Christoph Hellwig wrote:
> Only a single driver actually sets the ->request method, so don't print
> a scary warning if it isn't.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/vfio/mdev/mdev_core.c | 4 ----
> 1 file changed, 4 deletions(-)

Reviewed-by: Jason Gunthorpe <[email protected]>

2021-07-26 23:10:29

by Jason Gunthorpe

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:

> But I wonder why nobody else implements this? Lack of surprise removal?

The only implementation triggers an eventfd that seems to be the same
eventfd as the interrupt..

Do you know how this works in userspace? I'm surprised that the
interrupt eventfd can trigger an observation that the kernel driver
wants to be unplugged?

Jason

2021-07-26 23:30:01

by Alex Williamson

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Mon, 26 Jul 2021 20:09:06 -0300
Jason Gunthorpe <[email protected]> wrote:

> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
>
> > But I wonder why nobody else implements this? Lack of surprise removal?
>
> The only implementation triggers an eventfd that seems to be the same
> eventfd as the interrupt..
>
> Do you know how this works in userspace? I'm surprised that the
> interrupt eventfd can trigger an observation that the kernel driver
> wants to be unplugged?

I think we're talking about ccw, but I see QEMU registering separate
eventfds for each of the 3 IRQ indexes and the mdev driver specifically
triggering the req_trigger...? Thanks,

Alex

2021-07-27 06:05:30

by Cornelia Huck

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Mon, Jul 26 2021, Alex Williamson <[email protected]> wrote:

> On Mon, 26 Jul 2021 20:09:06 -0300
> Jason Gunthorpe <[email protected]> wrote:
>
>> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
>>
>> > But I wonder why nobody else implements this? Lack of surprise removal?
>>
>> The only implementation triggers an eventfd that seems to be the same
>> eventfd as the interrupt..
>>
>> Do you know how this works in userspace? I'm surprised that the
>> interrupt eventfd can trigger an observation that the kernel driver
>> wants to be unplugged?
>
> I think we're talking about ccw, but I see QEMU registering separate
> eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> triggering the req_trigger...? Thanks,
>
> Alex

Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
checks), and this one.
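
As a rough illustration of "one eventfd per IRQ index", the following
Linux userspace sketch creates three independent eventfds and shows
that signalling one of them is observable separately from the others.
The index names and helper functions are made up for the example; this
is not the QEMU or vfio-ccw code, only the underlying eventfd(2) and
poll(2) mechanism.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative index names, modelled on the three triggers named above. */
enum { IO_IRQ, CRW_IRQ, REQ_IRQ, NUM_IRQS };

/* Create one non-blocking eventfd per index; returns 0 on success. */
static int setup_triggers(int fds[NUM_IRQS])
{
	for (int i = 0; i < NUM_IRQS; i++) {
		fds[i] = eventfd(0, EFD_NONBLOCK);
		if (fds[i] < 0) {
			while (i-- > 0)
				close(fds[i]);
			return -1;
		}
	}
	return 0;
}

/* "Driver" side: signal one trigger by adding 1 to its counter. */
static int fire(int fd)
{
	uint64_t one = 1;

	assert(fd >= 0);
	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * "Userspace" side: return the first index whose eventfd is readable
 * and consume its counter, or -1 if nothing is pending.
 */
static int next_event(int fds[NUM_IRQS])
{
	struct pollfd p[NUM_IRQS];
	uint64_t val;

	for (int i = 0; i < NUM_IRQS; i++) {
		p[i].fd = fds[i];
		p[i].events = POLLIN;
		p[i].revents = 0;
	}
	if (poll(p, NUM_IRQS, 0) <= 0)
		return -1;
	for (int i = 0; i < NUM_IRQS; i++) {
		if ((p[i].revents & POLLIN) &&
		    read(fds[i], &val, sizeof(val)) == (ssize_t)sizeof(val))
			return i;
	}
	return -1;
}
```

Calling fire(fds[REQ_IRQ]) makes only the request index readable, so
the consumer can tell a release request apart from an I/O or CRW
notification.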

2021-07-27 17:34:12

by Jason Gunthorpe

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> On Mon, Jul 26 2021, Alex Williamson <[email protected]> wrote:
>
> > On Mon, 26 Jul 2021 20:09:06 -0300
> > Jason Gunthorpe <[email protected]> wrote:
> >
> >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> >>
> >> > But I wonder why nobody else implements this? Lack of surprise removal?
> >>
> >> The only implementation triggers an eventfd that seems to be the same
> >> eventfd as the interrupt..
> >>
> >> Do you know how this works in userspace? I'm surprised that the
> >> interrupt eventfd can trigger an observation that the kernel driver
> >> wants to be unplugged?
> >
> > I think we're talking about ccw, but I see QEMU registering separate
> > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > triggering the req_trigger...? Thanks,
> >
> > Alex
>
> Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> checks), and this one.

If it is a dedicated eventfd for 'device being removed' why is it in
the CCW implementation and not core code?

Is PCI doing the same?

Jason

2021-07-27 18:57:36

by Alex Williamson

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Tue, 27 Jul 2021 14:32:09 -0300
Jason Gunthorpe <[email protected]> wrote:

> On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > On Mon, Jul 26 2021, Alex Williamson <[email protected]> wrote:
> >
> > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > Jason Gunthorpe <[email protected]> wrote:
> > >
> > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > >>
> > >> > But I wonder why nobody else implements this? Lack of surprise removal?
> > >>
> > >> The only implementation triggers an eventfd that seems to be the same
> > >> eventfd as the interrupt..
> > >>
> > >> Do you know how this works in userspace? I'm surprised that the
> > >> interrupt eventfd can trigger an observation that the kernel driver
> > >> wants to be unplugged?
> > >
> > > I think we're talking about ccw, but I see QEMU registering separate
> > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > triggering the req_trigger...? Thanks,
> > >
> > > Alex
> >
> > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > checks), and this one.
>
> If it is a dedicated eventfd for 'device being removed' why is it in
> the CCW implementation and not core code?

The CCW implementation (likewise the vfio-pci implementation) owns the
IRQ index address space and the decision to make this a signal to
userspace rather than perhaps some handling a device might be able to
do internally. For instance an alternate vfio-pci implementation might
zap all mmaps, block all r/w access, and turn this into a surprise
removal. Another implementation might be more aggressive to sending
SIGKILL to the user process. This was the thought behind why vfio-core
triggers the driver request callback with a counter, leaving the policy
to the driver.

> Is PCI doing the same?

Yes, that's where this handling originated. Thanks,

Alex


2021-07-27 19:05:30

by Jason Gunthorpe

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> On Tue, 27 Jul 2021 14:32:09 -0300
> Jason Gunthorpe <[email protected]> wrote:
>
> > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > > On Mon, Jul 26 2021, Alex Williamson <[email protected]> wrote:
> > >
> > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > Jason Gunthorpe <[email protected]> wrote:
> > > >
> > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > >>
> > > >> > But I wonder why nobody else implements this? Lack of surprise removal?
> > > >>
> > > >> The only implementation triggers an eventfd that seems to be the same
> > > >> eventfd as the interrupt..
> > > >>
> > > >> Do you know how this works in userspace? I'm surprised that the
> > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > >> wants to be unplugged?
> > > >
> > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > triggering the req_trigger...? Thanks,
> > > >
> > > > Alex
> > >
> > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > checks), and this one.
> >
> > If it is a dedicated eventfd for 'device being removed' why is it in
> > the CCW implementation and not core code?
>
> The CCW implementation (likewise the vfio-pci implementation) owns
> the IRQ index address space and the decision to make this a signal
> to userspace rather than perhaps some handling a device might be
> able to do internally.

The core code holds the vfio_device_get() so long as the FD is
open. There is no way to pass the wait_for_completion without
userspace closing the FD, so there isn't really much choice for the
drivers to do beyond signal to userspace to close the FD??

> For instance an alternate vfio-pci implementation might zap all
> mmaps, block all r/w access, and turn this into a surprise removal.

This is nice, but wouldn't close the FD, so needs core changes
anyhow..

> Another implementation might be more aggressive to sending SIGKILL
> to the user process.

We don't try to revoke FDs from the kernel, it is racy, dangerous and
unreliable.

> This was the thought behind why vfio-core triggers the driver
> request callback with a counter, leaving the policy to the driver.

IMHO subsystem policy does not belong in drivers. Down that road lies
a mess for userspace.

Jason

2021-07-27 19:28:35

by Alex Williamson

Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Tue, 27 Jul 2021 16:03:17 -0300
Jason Gunthorpe <[email protected]> wrote:

> On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> > On Tue, 27 Jul 2021 14:32:09 -0300
> > Jason Gunthorpe <[email protected]> wrote:
> >
> > > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > > > On Mon, Jul 26 2021, Alex Williamson <[email protected]> wrote:
> > > >
> > > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > > Jason Gunthorpe <[email protected]> wrote:
> > > > >
> > > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > > >>
> > > > >> > But I wonder why nobody else implements this? Lack of surprise removal?
> > > > >>
> > > > >> The only implementation triggers an eventfd that seems to be the same
> > > > >> eventfd as the interrupt..
> > > > >>
> > > > >> Do you know how this works in userspace? I'm surprised that the
> > > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > > >> wants to be unplugged?
> > > > >
> > > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > > triggering the req_trigger...? Thanks,
> > > > >
> > > > > Alex
> > > >
> > > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > > checks), and this one.
> > >
> > > If it is a dedicated eventfd for 'device being removed' why is it in
> > > the CCW implementation and not core code?
> >
> > The CCW implementation (likewise the vfio-pci implementation) owns
> > the IRQ index address space and the decision to make this a signal
> > to userspace rather than perhaps some handling a device might be
> > able to do internally.
>
> The core code holds the vfio_device_get() so long as the FD is
> open. There is no way to pass the wait_for_completion without
> userspace closing the FD, so there isn't really much choice for the
> drivers to do beyond signal to userspace to close the FD??
>
> > For instance an alternate vfio-pci implementation might zap all
> > mmaps, block all r/w access, and turn this into a surprise removal.
>
> This is nice, but wouldn't close the FD, so needs core changes
> anyhow..

Right, the core would need to be able to handle an FD disconnected from
the device, obviously some core changes would be required.

> > Another implementation might be more aggressive to sending SIGKILL
> > to the user process.
>
> We don't try to revoke FDs from the kernel, it is racy, dangerous and
> unreliable.

I'm not sure how trying to kill the process using an open file becomes
a revoke... In fact, the surprise hotplug might just be able to zap
mmaps and wait for userspace to generate a SIGBUS.

> > This was the thought behind why vfio-core triggers the driver
> > request callback with a counter, leaving the policy to the driver.
>
> IMHO subsystem policy does not belong in drivers. Down that road lies
> a mess for userspace.

I think my argument was that to this point it's been driver policy, not
subsystem policy. The subsystem policy is to block until the device is
released, it's the driver policy whether it has a means to implement
something to expedite that. Thanks,

Alex
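
The split described in this message (the subsystem blocks until the
device is released, the driver decides whether it can expedite that)
can be modelled in a few lines of plain C. This is a single-threaded
simulation under stated assumptions, not the vfio-core implementation:
struct model_dev, notify_user, wait_for_release and the max_intervals
cap are all hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device held open by a user; not kernel code. */
struct model_dev {
	bool fd_open;               /* user still holds the device FD */
	unsigned int close_after;   /* user closes after this many requests */
	unsigned int requests_seen; /* requests delivered to the user so far */
	void (*request)(struct model_dev *dev, unsigned int count); /* optional */
};

/*
 * One possible driver policy: forward each request to the user. The
 * modelled user closes the FD once it has seen close_after requests.
 */
static void notify_user(struct model_dev *dev, unsigned int count)
{
	(void)count;
	if (++dev->requests_seen >= dev->close_after)
		dev->fd_open = false;
}

/*
 * Core policy in this model: wait until the FD is closed. Every wait
 * interval passes the current counter to the driver callback, if one
 * exists, and then increments it. max_intervals only bounds the
 * simulation; it is not meant to suggest the real wait gives up.
 * Returns the number of intervals waited, or -1 if the FD is still open.
 */
static int wait_for_release(struct model_dev *dev, unsigned int max_intervals)
{
	unsigned int count = 0;

	assert(dev != NULL);
	while (dev->fd_open) {
		if (count == max_intervals)
			return -1;
		if (dev->request)
			dev->request(dev, count);
		count++;
	}
	return (int)count;
}
```

With notify_user installed the wait ends as soon as the modelled user
reacts; with .request left NULL nothing prompts the user, and the wait
runs until the simulation cap is hit.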


2021-08-03 20:16:01

by Alex Williamson

Subject: Re: two small mdev fixups

On Mon, 26 Jul 2021 16:35:22 +0200
Christoph Hellwig <[email protected]> wrote:

> Hi all,
>
> two small mdev fixes - one to fix mdev for built-in drivers, and the other
> one to remove a pointless warning.

Applied to vfio next branch for v5.15 with Connie and Jason's R-b.
Thanks,

Alex