2022-05-27 12:03:28

by Eugenio Perez Martin

[permalink] [raw]
Subject: [PATCH v4 0/4] Implement vdpasim stop operation

Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer

that backend feature and userspace can effectively stop the device.



This is a must before get virtqueue indexes (base) for live migration,

since the device could modify them after userland gets them. There are

individual ways to perform that action for some devices

(VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no

way to perform it for any vhost device (and, in particular, vhost-vdpa).



After the return of ioctl with stop != 0, the device MUST finish any

pending operations like in flight requests. It must also preserve all

the necessary state (the virtqueue vring base plus the possible device

specific states) that is required for restoring in the future. The

device must not change its configuration after that point.



After the return of ioctl with stop == 0, the device can continue

processing buffers as long as typical conditions are met (vq is enabled,

DRIVER_OK status bit is enabled, etc).



In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD

so the device can save pending operations.



Comments are welcome.



v4:

* Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.



v3:

* s/VHOST_STOP/VHOST_VDPA_STOP/

* Add documentation and requirements of the ioctl above its definition.



v2:

* Replace raw _F_STOP with BIT_ULL(_F_STOP).

* Fix obtaining of stop ioctl arg (it was not obtained but written).

* Add stop to vdpa_sim_blk.



Eugenio Pérez (4):

vdpa: Add stop operation

vhost-vdpa: introduce STOP backend feature bit

vhost-vdpa: uAPI to stop the device

vdpa_sim: Implement stop vdpa op



drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +++++++++++++++++

drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +

drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 3 +++

drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++

drivers/vhost/vdpa.c | 34 +++++++++++++++++++++++++++-

include/linux/vdpa.h | 6 +++++

include/uapi/linux/vhost.h | 14 ++++++++++++

include/uapi/linux/vhost_types.h | 2 ++

8 files changed, 83 insertions(+), 1 deletion(-)



--

2.31.1






2022-05-28 19:58:08

by Eugenio Perez Martin

[permalink] [raw]
Subject: [PATCH v4 3/4] vhost-vdpa: uAPI to stop the device

The ioctl adds support for stop the device from userspace.

Signed-off-by: Eugenio Pérez <[email protected]>
---
drivers/vhost/vdpa.c | 18 ++++++++++++++++++
include/uapi/linux/vhost.h | 14 ++++++++++++++
2 files changed, 32 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 32713db5831d..d1d19555c4b7 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -478,6 +478,21 @@ static long vhost_vdpa_get_vqs_count(struct vhost_vdpa *v, u32 __user *argp)
return 0;
}

+static long vhost_vdpa_stop(struct vhost_vdpa *v, u32 __user *argp)
+{
+ struct vdpa_device *vdpa = v->vdpa;
+ const struct vdpa_config_ops *ops = vdpa->config;
+ int stop;
+
+ if (!ops->stop)
+ return -EOPNOTSUPP;
+
+ if (copy_from_user(&stop, argp, sizeof(stop)))
+ return -EFAULT;
+
+ return ops->stop(vdpa, stop);
+}
+
static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
void __user *argp)
{
@@ -650,6 +665,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_GET_VQS_COUNT:
r = vhost_vdpa_get_vqs_count(v, argp);
break;
+ case VHOST_VDPA_STOP:
+ r = vhost_vdpa_stop(v, argp);
+ break;
default:
r = vhost_dev_ioctl(&v->vdev, cmd, argp);
if (r == -ENOIOCTLCMD)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index cab645d4a645..c7e47b29bf61 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -171,4 +171,18 @@
#define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7C, \
struct vhost_vring_state)

+/* Stop or resume a device so it does not process virtqueue requests anymore
+ *
+ * After the return of ioctl with stop != 0, the device must finish any
+ * pending operations like in flight requests. It must also preserve all
+ * the necessary state (the virtqueue vring base plus the possible device
+ * specific states) that is required for restoring in the future. The
+ * device must not change its configuration after that point.
+ *
+ * After the return of ioctl with stop == 0, the device can continue
+ * processing buffers as long as typical conditions are met (vq is enabled,
+ * DRIVER_OK status bit is enabled, etc).
+ */
+#define VHOST_VDPA_STOP _IOW(VHOST_VIRTIO, 0x7D, int)
+
#endif
--
2.31.1


2022-05-31 08:17:48

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Tue, May 31, 2022 at 1:40 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Mon, May 30, 2022 at 11:39:21AM +0800, Jason Wang wrote:
> > On Fri, May 27, 2022 at 6:56 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Thu, May 26, 2022 at 12:54:32PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Eugenio Pérez <[email protected]>
> > > > > Sent: Thursday, May 26, 2022 8:44 AM
> > > >
> > > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> > > > >
> > > > > that backend feature and userspace can effectively stop the device.
> > > > >
> > > > >
> > > > >
> > > > > This is a must before get virtqueue indexes (base) for live migration,
> > > > >
> > > > > since the device could modify them after userland gets them. There are
> > > > >
> > > > > individual ways to perform that action for some devices
> > > > >
> > > > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there
> > > > > was no
> > > > >
> > > > > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop != 0, the device MUST finish any
> > > > >
> > > > > pending operations like in flight requests. It must also preserve all
> > > > >
> > > > > the necessary state (the virtqueue vring base plus the possible device
> > > > >
> > > > > specific states) that is required for restoring in the future. The
> > > > >
> > > > > device must not change its configuration after that point.
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop == 0, the device can continue
> > > > >
> > > > > processing buffers as long as typical conditions are met (vq is enabled,
> > > > >
> > > > > DRIVER_OK status bit is enabled, etc).
> > > >
> > > > Just to be clear, we are adding vdpa level new ioctl() that doesn’t map to any mechanism in the virtio spec.
> > > >
> > > > Why can't we use this ioctl() to indicate driver to start/stop the device instead of driving it through the driver_ok?
> > > > This is in the context of other discussion we had in the LM series.
> > >
> > > If there's something in the spec that does this then let's use that.
> >
> > Actually, we try to propose a independent feature here:
> >
> > https://lists.oasis-open.org/archives/virtio-dev/202111/msg00020.html
> >
> > Does it make sense to you?
> >
> > Thanks
>
> But I thought the LM patches are trying to replace all that?

I'm not sure, and actually I think they are orthogonal. We need a new
state and the command to set the state could be transport specific or
a virtqueue.

As far as I know, most of the vendors have implemented this semantic.

Thanks

>
>
> > > Unfortunately the LM series seems to be stuck on moving
> > > bits around with the admin virtqueue ...
> > >
> > > --
> > > MST
> > >
>


2022-06-01 14:03:50

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Tue, May 31, 2022 at 10:26 PM Parav Pandit <[email protected]> wrote:
>
>
>
> > From: Eugenio Perez Martin <[email protected]>
> > Sent: Friday, May 27, 2022 3:55 AM
> >
> > On Fri, May 27, 2022 at 4:26 AM Jason Wang <[email protected]> wrote:
> > >
> > > On Thu, May 26, 2022 at 8:54 PM Parav Pandit <[email protected]> wrote:
> > > >
> > > >
> > > >
> > > > > From: Eugenio Pérez <[email protected]>
> > > > > Sent: Thursday, May 26, 2022 8:44 AM
> > > >
> > > > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will
> > > > > offer
> > > > >
> > > > > that backend feature and userspace can effectively stop the device.
> > > > >
> > > > >
> > > > >
> > > > > This is a must before get virtqueue indexes (base) for live
> > > > > migration,
> > > > >
> > > > > since the device could modify them after userland gets them. There
> > > > > are
> > > > >
> > > > > individual ways to perform that action for some devices
> > > > >
> > > > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but
> > there
> > > > > was no
> > > > >
> > > > > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop != 0, the device MUST finish
> > > > > any
> > > > >
> > > > > pending operations like in flight requests. It must also preserve
> > > > > all
> > > > >
> > > > > the necessary state (the virtqueue vring base plus the possible
> > > > > device
> > > > >
> > > > > specific states) that is required for restoring in the future. The
> > > > >
> > > > > device must not change its configuration after that point.
> > > > >
> > > > >
> > > > >
> > > > > After the return of ioctl with stop == 0, the device can continue
> > > > >
> > > > > processing buffers as long as typical conditions are met (vq is
> > > > > enabled,
> > > > >
> > > > > DRIVER_OK status bit is enabled, etc).
> > > >
> > > > Just to be clear, we are adding vdpa level new ioctl() that doesn’t map to
> > any mechanism in the virtio spec.
> > >
> > > We try to provide forward compatibility to VIRTIO_CONFIG_S_STOP. That
> > > means it is expected to implement at least a subset of
> > > VIRTIO_CONFIG_S_STOP.
> > >
> >
> > Appending a link to the proposal, just for reference [1].
> >
> > > >
> > > > Why can't we use this ioctl() to indicate driver to start/stop the device
> > instead of driving it through the driver_ok?
> > >
> >
> > Parav, I'm not sure I follow you here.
> >
> > By the proposal, the resume of the device is (From qemu POV):
> > 1. To configure all data vqs and cvq (addr, num, ...) 2. To enable only CVQ, not
> > data vqs 3. To send DRIVER_OK 4. Wait for all buffers of CVQ to be used 5. To
> > enable all others data vqs (individual ioctl at the moment)
> >
> > Where can we fit the resume (as "stop(false)") here? If the device is stopped
> > (as if we send stop(true) before DRIVER_OK), we don't read CVQ first. If we
> > send it right after (or instead) DRIVER_OK, data buffers can reach data vqs
> > before configuring RSS.
> >
> It doesn’t make sense with currently proposed way of using cvq to replay the config.

The stop/resume part is not intended to restore the config through the
CVQ. The stop call is issued to be able to retrieve the vq status
(base, in vhost terminology). The symmetric operation (resume) was
added on demand, it was never intended to be part neither of the
config restore or the virtqueue state restore workflow.

The configuration restore workflow was modelled after the device
initialization, so each part needed to add the less things the better,
and only qemu needed to be changed. From the device POV, there is no
need to learn new tricks for this. The support of .set_vq_ready and
.get_vq_ready is already in the kernel in every vdpa backend driver.

> Need to continue with currently proposed temporary method that subsequently to be replaced with optimized flow as we discussed.

Back then, it was noted by you that enabling each data vq individually
after DRIVER_OK is slow on mlx5 devices. The solution was to batch
these enable calls accounting in the kernel, achieving no growth in
the vdpa uAPI layer. The proposed solution did not involve the resume
operation.

After that, you proposed in this thread "Why can't we use this ioctl()
to indicate driver to start/stop the device instead of driving it
through the driver_ok?". As I understand, that is a mistake, since it
requires the device, the vdpa layer, etc... to learn new tricks. It
requires qemu to duplicate the initialization layer (it's now common
for start and restore config). But I might have not seen the whole
picture, missing advantages of using the resume call for this
workflow. Can you describe the workflow you have in mind? How does
that new workflow affect this proposal?

I'm ok to change the proposal as long as we find we obtain a net gain.

Thanks!


2022-06-01 19:19:52

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Tue, May 31, 2022 at 09:13:38AM +0200, Eugenio Perez Martin wrote:
> On Tue, May 31, 2022 at 7:42 AM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Thu, May 26, 2022 at 02:43:34PM +0200, Eugenio P?rez wrote:
> > > Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> > > that backend feature and userspace can effectively stop the device.
> > >
> > > This is a must before get virtqueue indexes (base) for live migration,
> > > since the device could modify them after userland gets them. There are
> > > individual ways to perform that action for some devices
> > > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> > > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> > >
> > > After the return of ioctl with stop != 0, the device MUST finish any
> > > pending operations like in flight requests. It must also preserve all
> > > the necessary state (the virtqueue vring base plus the possible device
> > > specific states) that is required for restoring in the future. The
> > > device must not change its configuration after that point.
> > >
> > > After the return of ioctl with stop == 0, the device can continue
> > > processing buffers as long as typical conditions are met (vq is enabled,
> > > DRIVER_OK status bit is enabled, etc).
> > >
> > > In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> > > so the device can save pending operations.
> > >
> > > Comments are welcome.
> >
> >
> > So given this is just for simulator and affects UAPI I think it's fine
> > to make it wait for the next merge window, until there's a consensus.
> > Right?
> >
>
> While the change is only implemented in the simulator at this moment,
> it's just the very last missing piece in the kernel to implement
> complete live migration for net devices with cvq :). All vendor
> drivers can implement this call with current code, just a little bit
> of plumbing is needed. And it was accepted in previous meetings.
>
> If it proves it works for every configuration (nested, etc), the
> implementation can forward the call to the admin vq for example. At
> the moment, it follows the proposed stop status bit sematic to stop
> the device, which POC has been tested in these circumstances.
>
> Thanks!

Oh absolutely, but I am guessing this plumbing won't
be ready for this merge window.

> > > v4:
> > > * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
> > >
> > > v3:
> > > * s/VHOST_STOP/VHOST_VDPA_STOP/
> > > * Add documentation and requirements of the ioctl above its definition.
> > >
> > > v2:
> > > * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> > > * Fix obtaining of stop ioctl arg (it was not obtained but written).
> > > * Add stop to vdpa_sim_blk.
> > >
> > > Eugenio P?rez (4):
> > > vdpa: Add stop operation
> > > vhost-vdpa: introduce STOP backend feature bit
> > > vhost-vdpa: uAPI to stop the device
> > > vdpa_sim: Implement stop vdpa op
> > >
> > > drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +++++++++++++++++
> > > drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
> > > drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 3 +++
> > > drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++
> > > drivers/vhost/vdpa.c | 34 +++++++++++++++++++++++++++-
> > > include/linux/vdpa.h | 6 +++++
> > > include/uapi/linux/vhost.h | 14 ++++++++++++
> > > include/uapi/linux/vhost_types.h | 2 ++
> > > 8 files changed, 83 insertions(+), 1 deletion(-)
> > >
> > > --
> > > 2.31.1
> > >
> >


2022-06-01 19:26:51

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Thu, May 26, 2022 at 02:43:34PM +0200, Eugenio P?rez wrote:
> Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> that backend feature and userspace can effectively stop the device.
>
> This is a must before get virtqueue indexes (base) for live migration,
> since the device could modify them after userland gets them. There are
> individual ways to perform that action for some devices
> (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> way to perform it for any vhost device (and, in particular, vhost-vdpa).
>
> After the return of ioctl with stop != 0, the device MUST finish any
> pending operations like in flight requests. It must also preserve all
> the necessary state (the virtqueue vring base plus the possible device
> specific states) that is required for restoring in the future. The
> device must not change its configuration after that point.
>
> After the return of ioctl with stop == 0, the device can continue
> processing buffers as long as typical conditions are met (vq is enabled,
> DRIVER_OK status bit is enabled, etc).
>
> In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> so the device can save pending operations.
>
> Comments are welcome.


So given this is just for simulator and affects UAPI I think it's fine
to make it wait for the next merge window, until there's a consensus.
Right?

> v4:
> * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
>
> v3:
> * s/VHOST_STOP/VHOST_VDPA_STOP/
> * Add documentation and requirements of the ioctl above its definition.
>
> v2:
> * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> * Fix obtaining of stop ioctl arg (it was not obtained but written).
> * Add stop to vdpa_sim_blk.
>
> Eugenio P?rez (4):
> vdpa: Add stop operation
> vhost-vdpa: introduce STOP backend feature bit
> vhost-vdpa: uAPI to stop the device
> vdpa_sim: Implement stop vdpa op
>
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +++++++++++++++++
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 3 +++
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++
> drivers/vhost/vdpa.c | 34 +++++++++++++++++++++++++++-
> include/linux/vdpa.h | 6 +++++
> include/uapi/linux/vhost.h | 14 ++++++++++++
> include/uapi/linux/vhost_types.h | 2 ++
> 8 files changed, 83 insertions(+), 1 deletion(-)
>
> --
> 2.31.1
>


2022-06-01 19:27:11

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] vhost-vdpa: uAPI to stop the device

On Wed, Jun 1, 2022 at 1:03 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, May 26, 2022 at 02:43:37PM +0200, Eugenio Pérez wrote:
> > The ioctl adds support for stop the device from userspace.
> >
> > Signed-off-by: Eugenio Pérez <[email protected]>
> > ---
> > drivers/vhost/vdpa.c | 18 ++++++++++++++++++
> > include/uapi/linux/vhost.h | 14 ++++++++++++++
> > 2 files changed, 32 insertions(+)
> >
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index 32713db5831d..d1d19555c4b7 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -478,6 +478,21 @@ static long vhost_vdpa_get_vqs_count(struct vhost_vdpa *v, u32 __user *argp)
> > return 0;
> > }
> >
> > +static long vhost_vdpa_stop(struct vhost_vdpa *v, u32 __user *argp)
> > +{
> > + struct vdpa_device *vdpa = v->vdpa;
> > + const struct vdpa_config_ops *ops = vdpa->config;
> > + int stop;
> > +
> > + if (!ops->stop)
> > + return -EOPNOTSUPP;
> > +
> > + if (copy_from_user(&stop, argp, sizeof(stop)))
> > + return -EFAULT;
> > +
> > + return ops->stop(vdpa, stop);
> > +}
> > +
> > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > void __user *argp)
> > {
> > @@ -650,6 +665,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > case VHOST_VDPA_GET_VQS_COUNT:
> > r = vhost_vdpa_get_vqs_count(v, argp);
> > break;
> > + case VHOST_VDPA_STOP:
> > + r = vhost_vdpa_stop(v, argp);
> > + break;
> > default:
> > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > if (r == -ENOIOCTLCMD)
> > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > index cab645d4a645..c7e47b29bf61 100644
> > --- a/include/uapi/linux/vhost.h
> > +++ b/include/uapi/linux/vhost.h
> > @@ -171,4 +171,18 @@
> > #define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7C, \
> > struct vhost_vring_state)
> >
> > +/* Stop or resume a device so it does not process virtqueue requests anymore
> > + *
> > + * After the return of ioctl with stop != 0, the device must finish any
> > + * pending operations like in flight requests. It must also preserve all
> > + * the necessary state (the virtqueue vring base plus the possible device
> > + * specific states) that is required for restoring in the future. The
> > + * device must not change its configuration after that point.
> > + *
> > + * After the return of ioctl with stop == 0, the device can continue
> > + * processing buffers as long as typical conditions are met (vq is enabled,
> > + * DRIVER_OK status bit is enabled, etc).
> > + */
> > +#define VHOST_VDPA_STOP _IOW(VHOST_VIRTIO, 0x7D, int)
> > +
> > #endif
>
> I wonder how does this interact with the admin vq idea.
> I.e. if we stop all VQs then apparently admin vq can't
> work either ...
> Thoughts?
>

Copying here the answer to Parav, feel free to answer to any thread or
highlight if I missed something :). Using the admin vq proposal
terminology of "device group".

--
This would stop a device of a device
group, but not the whole virtqueue group. If the admin VQ is offered
by the PF (since it's not exposed to the guest), it will continue
accepting requests as normal. If it's exposed in the VF, I think the
best bet is to shadow it, since guest and host requests could
conflict.

Since this is offered through vdpa, the device backend driver can
route it to whatever method works better for the hardware. For
example, to send an admin vq command to the PF. That's why it's
important to keep the feature as self-contained and orthogonal to
others as possible.
--

> > --
> > 2.31.1
>


2022-06-01 20:05:55

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Tue, May 31, 2022 at 7:42 AM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, May 26, 2022 at 02:43:34PM +0200, Eugenio Pérez wrote:
> > Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> > that backend feature and userspace can effectively stop the device.
> >
> > This is a must before get virtqueue indexes (base) for live migration,
> > since the device could modify them after userland gets them. There are
> > individual ways to perform that action for some devices
> > (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> > way to perform it for any vhost device (and, in particular, vhost-vdpa).
> >
> > After the return of ioctl with stop != 0, the device MUST finish any
> > pending operations like in flight requests. It must also preserve all
> > the necessary state (the virtqueue vring base plus the possible device
> > specific states) that is required for restoring in the future. The
> > device must not change its configuration after that point.
> >
> > After the return of ioctl with stop == 0, the device can continue
> > processing buffers as long as typical conditions are met (vq is enabled,
> > DRIVER_OK status bit is enabled, etc).
> >
> > In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> > so the device can save pending operations.
> >
> > Comments are welcome.
>
>
> So given this is just for simulator and affects UAPI I think it's fine
> to make it wait for the next merge window, until there's a consensus.
> Right?
>

While the change is only implemented in the simulator at this moment,
it's just the very last missing piece in the kernel to implement
complete live migration for net devices with cvq :). All vendor
drivers can implement this call with current code, just a little bit
of plumbing is needed. And it was accepted in previous meetings.

If it proves it works for every configuration (nested, etc), the
implementation can forward the call to the admin vq for example. At
the moment, it follows the proposed stop status bit sematic to stop
the device, which POC has been tested in these circumstances.

Thanks!

> > v4:
> > * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
> >
> > v3:
> > * s/VHOST_STOP/VHOST_VDPA_STOP/
> > * Add documentation and requirements of the ioctl above its definition.
> >
> > v2:
> > * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> > * Fix obtaining of stop ioctl arg (it was not obtained but written).
> > * Add stop to vdpa_sim_blk.
> >
> > Eugenio Pérez (4):
> > vdpa: Add stop operation
> > vhost-vdpa: introduce STOP backend feature bit
> > vhost-vdpa: uAPI to stop the device
> > vdpa_sim: Implement stop vdpa op
> >
> > drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +++++++++++++++++
> > drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
> > drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 3 +++
> > drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++
> > drivers/vhost/vdpa.c | 34 +++++++++++++++++++++++++++-
> > include/linux/vdpa.h | 6 +++++
> > include/uapi/linux/vhost.h | 14 ++++++++++++
> > include/uapi/linux/vhost_types.h | 2 ++
> > 8 files changed, 83 insertions(+), 1 deletion(-)
> >
> > --
> > 2.31.1
> >
>


2022-06-01 20:50:41

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] vhost-vdpa: uAPI to stop the device

On Thu, May 26, 2022 at 02:43:37PM +0200, Eugenio P?rez wrote:
> The ioctl adds support for stop the device from userspace.
>
> Signed-off-by: Eugenio P?rez <[email protected]>
> ---
> drivers/vhost/vdpa.c | 18 ++++++++++++++++++
> include/uapi/linux/vhost.h | 14 ++++++++++++++
> 2 files changed, 32 insertions(+)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 32713db5831d..d1d19555c4b7 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -478,6 +478,21 @@ static long vhost_vdpa_get_vqs_count(struct vhost_vdpa *v, u32 __user *argp)
> return 0;
> }
>
> +static long vhost_vdpa_stop(struct vhost_vdpa *v, u32 __user *argp)
> +{
> + struct vdpa_device *vdpa = v->vdpa;
> + const struct vdpa_config_ops *ops = vdpa->config;
> + int stop;
> +
> + if (!ops->stop)
> + return -EOPNOTSUPP;
> +
> + if (copy_from_user(&stop, argp, sizeof(stop)))
> + return -EFAULT;
> +
> + return ops->stop(vdpa, stop);
> +}
> +
> static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> void __user *argp)
> {
> @@ -650,6 +665,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> case VHOST_VDPA_GET_VQS_COUNT:
> r = vhost_vdpa_get_vqs_count(v, argp);
> break;
> + case VHOST_VDPA_STOP:
> + r = vhost_vdpa_stop(v, argp);
> + break;
> default:
> r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> if (r == -ENOIOCTLCMD)
> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> index cab645d4a645..c7e47b29bf61 100644
> --- a/include/uapi/linux/vhost.h
> +++ b/include/uapi/linux/vhost.h
> @@ -171,4 +171,18 @@
> #define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7C, \
> struct vhost_vring_state)
>
> +/* Stop or resume a device so it does not process virtqueue requests anymore
> + *
> + * After the return of ioctl with stop != 0, the device must finish any
> + * pending operations like in flight requests. It must also preserve all
> + * the necessary state (the virtqueue vring base plus the possible device
> + * specific states) that is required for restoring in the future. The
> + * device must not change its configuration after that point.
> + *
> + * After the return of ioctl with stop == 0, the device can continue
> + * processing buffers as long as typical conditions are met (vq is enabled,
> + * DRIVER_OK status bit is enabled, etc).
> + */
> +#define VHOST_VDPA_STOP _IOW(VHOST_VIRTIO, 0x7D, int)
> +
> #endif

I wonder how does this interact with the admin vq idea.
I.e. if we stop all VQs then apparently admin vq can't
work either ...
Thoughts?

> --
> 2.31.1


2022-06-01 21:34:43

by Parav Pandit

[permalink] [raw]
Subject: RE: [PATCH v4 3/4] vhost-vdpa: uAPI to stop the device



> From: Eugenio Perez Martin <[email protected]>
> Sent: Wednesday, June 1, 2022 7:15 AM
>
> On Wed, Jun 1, 2022 at 1:03 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Thu, May 26, 2022 at 02:43:37PM +0200, Eugenio Pérez wrote:
> > > The ioctl adds support for stop the device from userspace.
> > >
> > > Signed-off-by: Eugenio Pérez <[email protected]>
> > > ---
> > > drivers/vhost/vdpa.c | 18 ++++++++++++++++++
> > > include/uapi/linux/vhost.h | 14 ++++++++++++++
> > > 2 files changed, 32 insertions(+)
> > >
> > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index
> > > 32713db5831d..d1d19555c4b7 100644
> > > --- a/drivers/vhost/vdpa.c
> > > +++ b/drivers/vhost/vdpa.c
> > > @@ -478,6 +478,21 @@ static long vhost_vdpa_get_vqs_count(struct
> vhost_vdpa *v, u32 __user *argp)
> > > return 0;
> > > }
> > >
> > > +static long vhost_vdpa_stop(struct vhost_vdpa *v, u32 __user *argp)
> > > +{
> > > + struct vdpa_device *vdpa = v->vdpa;
> > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > + int stop;
> > > +
> > > + if (!ops->stop)
> > > + return -EOPNOTSUPP;
> > > +
> > > + if (copy_from_user(&stop, argp, sizeof(stop)))
> > > + return -EFAULT;
> > > +
> > > + return ops->stop(vdpa, stop);
> > > +}
> > > +
> > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int
> cmd,
> > > void __user *argp) { @@ -650,6
> > > +665,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > case VHOST_VDPA_GET_VQS_COUNT:
> > > r = vhost_vdpa_get_vqs_count(v, argp);
> > > break;
> > > + case VHOST_VDPA_STOP:
> > > + r = vhost_vdpa_stop(v, argp);
> > > + break;
> > > default:
> > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > if (r == -ENOIOCTLCMD) diff --git
> > > a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index
> > > cab645d4a645..c7e47b29bf61 100644
> > > --- a/include/uapi/linux/vhost.h
> > > +++ b/include/uapi/linux/vhost.h
> > > @@ -171,4 +171,18 @@
> > > #define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7C,
> \
> > > struct vhost_vring_state)
> > >
> > > +/* Stop or resume a device so it does not process virtqueue
> > > +requests anymore
> > > + *
> > > + * After the return of ioctl with stop != 0, the device must finish
> > > +any
> > > + * pending operations like in flight requests. It must also
> > > +preserve all
> > > + * the necessary state (the virtqueue vring base plus the possible
> > > +device
> > > + * specific states) that is required for restoring in the future.
> > > +The
> > > + * device must not change its configuration after that point.
> > > + *
> > > + * After the return of ioctl with stop == 0, the device can
> > > +continue
> > > + * processing buffers as long as typical conditions are met (vq is
> > > +enabled,
> > > + * DRIVER_OK status bit is enabled, etc).
> > > + */
> > > +#define VHOST_VDPA_STOP _IOW(VHOST_VIRTIO, 0x7D, int)
> > > +
A better name is VHOST_VDPA_SET_STATE
State = stop/suspend
State = start/resume

Suspend/resume seems more logical, as opposed start/stop, because it more clearly indicates that the resume (start) is from some programmed beginning (and not first boot).

> > > #endif
> >
> > I wonder how does this interact with the admin vq idea.
> > I.e. if we stop all VQs then apparently admin vq can't work either ...
> > Thoughts?
> >
>
> Copying here the answer to Parav, feel free to answer to any thread or
> highlight if I missed something :). Using the admin vq proposal terminology of
> "device group".
>
> --
> This would stop a device of a device
> group, but not the whole virtqueue group. If the admin VQ is offered by the
> PF (since it's not exposed to the guest), it will continue accepting requests as
> normal. If it's exposed in the VF, I think the best bet is to shadow it, since
> guest and host requests could conflict.
>

vhost-vdpa device is exposed for a VF through vp-vdpa driver to user land.
Now vp-vdpa driver will have to choose between using config register vs using AQ to suspend/resume the device.

Why not always begin with more superior interface of AQ that address multiple of these needs for LM case?

For LM case, more you explore, we realize that either VF relying on PF's AQ for query/config/setup/restore makes more sense or have its own dedicated AQ.

VM's suspend/resume operation can be handled through the shadow Q.

> Since this is offered through vdpa, the device backend driver can route it to
> whatever method works better for the hardware. For example, to send an
> admin vq command to the PF. That's why it's important to keep the feature
> as self-contained and orthogonal to others as possible.
> --
>
> > > --
> > > 2.31.1
> >

2022-06-02 02:11:23

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation


在 2022/5/26 20:43, Eugenio Pérez 写道:
> Implement stop operation for vdpa_sim devices, so vhost-vdpa will offer
> that backend feature and userspace can effectively stop the device.
>
> This is a must before get virtqueue indexes (base) for live migration,
> since the device could modify them after userland gets them. There are
> individual ways to perform that action for some devices
> (VHOST_NET_SET_BACKEND, VHOST_VSOCK_SET_RUNNING, ...) but there was no
> way to perform it for any vhost device (and, in particular, vhost-vdpa).
>
> After the return of ioctl with stop != 0, the device MUST finish any
> pending operations like in flight requests. It must also preserve all
> the necessary state (the virtqueue vring base plus the possible device
> specific states) that is required for restoring in the future. The
> device must not change its configuration after that point.


I think we probably need more accurate definition on the state as Parav
suggested.

Besides this, we should also clarify when stop is allowed. E.g should we
allow setting stop without DRIVER_OK?

Thanks


>
> After the return of ioctl with stop == 0, the device can continue
> processing buffers as long as typical conditions are met (vq is enabled,
> DRIVER_OK status bit is enabled, etc).
>
> In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> so the device can save pending operations.
>
> Comments are welcome.
>
> v4:
> * Replace VHOST_STOP to VHOST_VDPA_STOP in vhost ioctl switch case too.
>
> v3:
> * s/VHOST_STOP/VHOST_VDPA_STOP/
> * Add documentation and requirements of the ioctl above its definition.
>
> v2:
> * Replace raw _F_STOP with BIT_ULL(_F_STOP).
> * Fix obtaining of stop ioctl arg (it was not obtained but written).
> * Add stop to vdpa_sim_blk.
>
> Eugenio Pérez (4):
> vdpa: Add stop operation
> vhost-vdpa: introduce STOP backend feature bit
> vhost-vdpa: uAPI to stop the device
> vdpa_sim: Implement stop vdpa op
>
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 21 +++++++++++++++++
> drivers/vdpa/vdpa_sim/vdpa_sim.h | 1 +
> drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 3 +++
> drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 3 +++
> drivers/vhost/vdpa.c | 34 +++++++++++++++++++++++++++-
> include/linux/vdpa.h | 6 +++++
> include/uapi/linux/vhost.h | 14 ++++++++++++
> include/uapi/linux/vhost_types.h | 2 ++
> 8 files changed, 83 insertions(+), 1 deletion(-)
>


2022-06-03 07:56:00

by Eugenio Perez Martin

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] vhost-vdpa: uAPI to stop the device

On Wed, Jun 1, 2022 at 9:13 PM Parav Pandit <[email protected]> wrote:
>
>
>
> > From: Eugenio Perez Martin <[email protected]>
> > Sent: Wednesday, June 1, 2022 7:15 AM
> >
> > On Wed, Jun 1, 2022 at 1:03 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Thu, May 26, 2022 at 02:43:37PM +0200, Eugenio Pérez wrote:
> > > > The ioctl adds support for stop the device from userspace.
> > > >
> > > > Signed-off-by: Eugenio Pérez <[email protected]>
> > > > ---
> > > > drivers/vhost/vdpa.c | 18 ++++++++++++++++++
> > > > include/uapi/linux/vhost.h | 14 ++++++++++++++
> > > > 2 files changed, 32 insertions(+)
> > > >
> > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index
> > > > 32713db5831d..d1d19555c4b7 100644
> > > > --- a/drivers/vhost/vdpa.c
> > > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -478,6 +478,21 @@ static long vhost_vdpa_get_vqs_count(struct
> > vhost_vdpa *v, u32 __user *argp)
> > > > return 0;
> > > > }
> > > >
> > > > +static long vhost_vdpa_stop(struct vhost_vdpa *v, u32 __user *argp)
> > > > +{
> > > > + struct vdpa_device *vdpa = v->vdpa;
> > > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > > + int stop;
> > > > +
> > > > + if (!ops->stop)
> > > > + return -EOPNOTSUPP;
> > > > +
> > > > + if (copy_from_user(&stop, argp, sizeof(stop)))
> > > > + return -EFAULT;
> > > > +
> > > > + return ops->stop(vdpa, stop);
> > > > +}
> > > > +
> > > > static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int
> > cmd,
> > > > void __user *argp) { @@ -650,6
> > > > +665,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > case VHOST_VDPA_GET_VQS_COUNT:
> > > > r = vhost_vdpa_get_vqs_count(v, argp);
> > > > break;
> > > > + case VHOST_VDPA_STOP:
> > > > + r = vhost_vdpa_stop(v, argp);
> > > > + break;
> > > > default:
> > > > r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > > > if (r == -ENOIOCTLCMD) diff --git
> > > > a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index
> > > > cab645d4a645..c7e47b29bf61 100644
> > > > --- a/include/uapi/linux/vhost.h
> > > > +++ b/include/uapi/linux/vhost.h
> > > > @@ -171,4 +171,18 @@
> > > > #define VHOST_VDPA_SET_GROUP_ASID _IOW(VHOST_VIRTIO, 0x7C,
> > \
> > > > struct vhost_vring_state)
> > > >
> > > > +/* Stop or resume a device so it does not process virtqueue
> > > > +requests anymore
> > > > + *
> > > > + * After the return of ioctl with stop != 0, the device must finish
> > > > +any
> > > > + * pending operations like in flight requests. It must also
> > > > +preserve all
> > > > + * the necessary state (the virtqueue vring base plus the possible
> > > > +device
> > > > + * specific states) that is required for restoring in the future.
> > > > +The
> > > > + * device must not change its configuration after that point.
> > > > + *
> > > > + * After the return of ioctl with stop == 0, the device can
> > > > +continue
> > > > + * processing buffers as long as typical conditions are met (vq is
> > > > +enabled,
> > > > + * DRIVER_OK status bit is enabled, etc).
> > > > + */
> > > > +#define VHOST_VDPA_STOP _IOW(VHOST_VIRTIO, 0x7D, int)
> > > > +
> A better name is VHOST_VDPA_SET_STATE
> State = stop/suspend
> State = start/resume
>
> Suspend/resume seems more logical, as opposed start/stop, because it more clearly indicates that the resume (start) is from some programmed beginning (and not first boot).
>

It's fine to move to that nomenclature in my opinion.

> > > > #endif
> > >
> > > I wonder how does this interact with the admin vq idea.
> > > I.e. if we stop all VQs then apparently admin vq can't work either ...
> > > Thoughts?
> > >
> >
> > Copying here the answer to Parav, feel free to answer to any thread or
> > highlight if I missed something :). Using the admin vq proposal terminology of
> > "device group".
> >
> > --
> > This would stop a device of a device
> > group, but not the whole virtqueue group. If the admin VQ is offered by the
> > PF (since it's not exposed to the guest), it will continue accepting requests as
> > normal. If it's exposed in the VF, I think the best bet is to shadow it, since
> > guest and host requests could conflict.
> >
>
> vhost-vdpa device is exposed for a VF through vp-vdpa driver to user land.
> Now vp-vdpa driver will have to choose between using config register vs using AQ to suspend/resume the device.
>

vp_vdpa cannot choose if the virtio device has an admin vq or any
other feature, it just wraps the virtio device. If that virtio device
does not expose AQ, vp_vdpa cannot expose it.

> Why not always begin with more superior interface of AQ that address multiple of these needs for LM case?
>

Because it doesn't address valid use cases like vp_vdpa with no AQ,
devices that are not VF, or nested virtualization.

VHOST_VDPA_STOP / VHOST_VDPA_SET_STATE does not replace AQ commands:
It's just the way vhost-vdpa exposes that capability to qemu. vdpa
backend is free to choose whatever methods it finds better to
implement it.

> For LM case, more you explore, we realize that either VF relying on PF's AQ for query/config/setup/restore makes more sense or have its own dedicated AQ.
>

This ioctl does not mandate that the device cannot implement it
through AQ, or that the device has to be a VF.

Thanks!

> VM's suspend/resume operation can be handled through the shadow Q.
>
> > Since this is offered through vdpa, the device backend driver can route it to
> > whatever method works better for the hardware. For example, to send an
> > admin vq command to the PF. That's why it's important to keep the feature
> > as self-contained and orthogonal to others as possible.
> > --
> >
> > > > --
> > > > 2.31.1
> > >
>