2017-12-20 19:02:29

by Greg Kurz

[permalink] [raw]
Subject: [PATCH 0/2] 9p/trans_virtio: handle request cancellation

The 9p protocol mostly relies on a request/reply dialog between the
client and the server. A notable exception to this rule is request
cancellation (ie, flush in 9p wording): when the client requests an
in-flight request to be flushed, the server should only reply to the
flush request and not to the flushed in-flight request (otherwise it
considers the in-flight request to have completed just like it has
never been flushed).

It is up to the client to inform the transport that the in-flight request
has been flushed and won't receive a reply. This is achieved with the
'cancelled' operation of the struct p9_trans_module.

This operation isn't currently implemented with the virtio transport.
As a consequence, flushed requests leave buffers in the used list
forever and the virtqueue ends up in being able to process only one
request at a time.

This issue never popped up because the 9p server in QEMU had a bug
and would always reply to flushed requests. But this will be fixed
soon, so it is time to implement the 'cancelled' operation in the
9p virtio transport.

--
Greg

---

Greg Kurz (2):
virtio: allow to detach a buffer from the virtqueue
9p/trans_virtio: implement cancelled callback


drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
include/linux/virtio.h | 1 +
net/9p/trans_virtio.c | 18 ++++++++++++++++++
3 files changed, 47 insertions(+)


2017-12-20 18:22:22

by Greg Kurz

[permalink] [raw]
Subject: [PATCH 2/2] 9p/trans_virtio: implement cancelled callback

When 9p requests are successfully flushed, we must manually move the
associated buffers to the virtqueue freelist, since the server doesn't
send a reply.

Signed-off-by: Greg Kurz <[email protected]>
---
net/9p/trans_virtio.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index f3a4efcf1456..d3216af124c4 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -206,6 +206,23 @@ static int p9_virtio_cancel(struct p9_client *client, struct p9_req_t *req)
return 1;
}

+static int p9_virtio_cancelled(struct p9_client *client, struct p9_req_t *req)
+{
+ struct virtio_chan *chan = client->trans;
+ unsigned long flags;
+
+ spin_lock_irqsave(&chan->lock, flags);
+ if (virtqueue_detach_buf(chan->vq, req) != req) {
+ spin_unlock_irqrestore(&chan->lock, flags);
+ return 0;
+ }
+ chan->ring_bufs_avail = 1;
+ spin_unlock_irqrestore(&chan->lock, flags);
+ /* Wakeup if anyone waiting for VirtIO ring space. */
+ wake_up(chan->vc_wq);
+ return 0;
+}
+
/**
* pack_sg_list_p - Just like pack_sg_list. Instead of taking a buffer,
* this takes a list of pages.
@@ -737,6 +754,7 @@ static struct p9_trans_module p9_virtio_trans = {
.request = p9_virtio_request,
.zc_request = p9_virtio_zc_request,
.cancel = p9_virtio_cancel,
+ .cancelled = p9_virtio_cancelled,
/*
* We leave one entry for input and one entry for response
* headers. We also skip one more entry to accomodate, address

2017-12-20 20:12:13

by Greg Kurz

[permalink] [raw]
Subject: [PATCH 1/2] virtio: allow to detach a buffer from the virtqueue

It is possible for a device to stop using buffers without pushing them
back to the driver. This is the case for example with the 9p virtio
device: if the driver flushes an in-flight request, the 9p specification
specification [*] mandates the server to "to purge the pending response".
The reply to the flush request indicates that the 9p server has stopped
using the buffers of the flushed in-flight request. But since the server
doesn't push back the associated buffers, they don't go back to the free
list. This leads the virtqueue to end up with a single slot to handle all
the dialog with the device, ie, serialize all I/Os.

This patch hence gives the possibility for device specific code to
explicitly detach a given buffer from the used ring and put it back
to the free list.

[*] http://man.cat-v.org/plan_9/5/flush

Signed-off-by: Greg Kurz <[email protected]>
---
drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
include/linux/virtio.h | 1 +
2 files changed, 29 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index eb30f3e09a47..886e9d054de3 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -936,6 +936,34 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
}
EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);

+/**
+ * virtqueue_detach_buf - detach specific buffer
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * Returns NULL or the "data" token handed to virtqueue_add_*().
+ * This should only be used if the driver really knows the buffer
+ * isn't needed anymore by the device.
+ */
+void *virtqueue_detach_buf(struct virtqueue *_vq, void *buf)
+{
+ struct vring_virtqueue *vq = to_vvq(_vq);
+ unsigned int i;
+
+ START_USE(vq);
+
+ for (i = 0; i < vq->vring.num; i++) {
+ if (vq->desc_state[i].data != buf)
+ continue;
+ detach_buf(vq, i, NULL);
+ END_USE(vq);
+ return buf;
+ }
+
+ END_USE(vq);
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(virtqueue_detach_buf);
+
irqreturn_t vring_interrupt(int irq, void *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 988c7355bc22..850158518ce5 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -80,6 +80,7 @@ bool virtqueue_poll(struct virtqueue *vq, unsigned);
bool virtqueue_enable_cb_delayed(struct virtqueue *vq);

void *virtqueue_detach_unused_buf(struct virtqueue *vq);
+void *virtqueue_detach_buf(struct virtqueue *vq, void *buf);

unsigned int virtqueue_get_vring_size(struct virtqueue *vq);


2018-01-08 15:15:11

by Greg Kurz

[permalink] [raw]
Subject: Re: [PATCH 0/2] 9p/trans_virtio: handle request cancellation

Hi Michael,

I wish you a happy new year, and I really need some feedback on this
series because it holds down some patches on the QEMU side.

Thanks,

--
Greg

On Wed, 20 Dec 2017 18:46:27 +0100
Greg Kurz <[email protected]> wrote:

> The 9p protocol mostly relies on a request/reply dialog between the
> client and the server. A notable exception to this rule is request
> cancellation (ie, flush in 9p wording): when the client requests an
> in-flight request to be flushed, the server should only reply to the
> flush request and not to the flushed in-flight request (otherwise it
> considers the in-flight request to have completed just like it has
> never been flushed).
>
> It is up to the client to inform the transport that the in-flight request
> has been flushed and won't receive a reply. This is achieved with the
> 'cancelled' operation of the struct p9_trans_module.
>
> This operation isn't currently implemented with the virtio transport.
> As a consequence, flushed requests leave buffers in the used list
> forever and the virtqueue ends up in being able to process only one
> request at a time.
>
> This issue never popped up because the 9p server in QEMU had a bug
> and would always reply to flushed requests. But this will be fixed
> soon, so it is time to implement the 'cancelled' operation in the
> 9p virtio transport.
>
> --
> Greg
>
> ---
>
> Greg Kurz (2):
> virtio: allow to detach a buffer from the virtqueue
> 9p/trans_virtio: implement cancelled callback
>
>
> drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
> include/linux/virtio.h | 1 +
> net/9p/trans_virtio.c | 18 ++++++++++++++++++
> 3 files changed, 47 insertions(+)
>
>

2018-01-19 19:50:45

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH 1/2] virtio: allow to detach a buffer from the virtqueue

On Wed, Dec 20, 2017 at 06:46:42PM +0100, Greg Kurz wrote:
> It is possible for a device to stop using buffers without pushing them
> back to the driver. This is the case for example with the 9p virtio
> device: if the driver flushes an in-flight request, the 9p specification
> specification [*] mandates the server to "to purge the pending response".
> The reply to the flush request indicates that the 9p server has stopped
> using the buffers of the flushed in-flight request. But since the server
> doesn't push back the associated buffers, they don't go back to the free
> list. This leads the virtqueue to end up with a single slot to handle all
> the dialog with the device, ie, serialize all I/Os.
>
> This patch hence gives the possibility for device specific code to
> explicitly detach a given buffer from the used ring and put it back
> to the free list.
>
> [*] http://man.cat-v.org/plan_9/5/flush
>
> Signed-off-by: Greg Kurz <[email protected]>

It would be better to just change the server to mark all flushed
requests as used. Why isn't that an option?

> ---
> drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
> include/linux/virtio.h | 1 +
> 2 files changed, 29 insertions(+)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index eb30f3e09a47..886e9d054de3 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -936,6 +936,34 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
> }
> EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);
>
> +/**
> + * virtqueue_detach_buf - detach specific buffer
> + * @vq: the struct virtqueue we're talking about.
> + *
> + * Returns NULL or the "data" token handed to virtqueue_add_*().
> + * This should only be used if the driver really knows the buffer
> + * isn't needed anymore by the device.
> + */
> +void *virtqueue_detach_buf(struct virtqueue *_vq, void *buf)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + unsigned int i;
> +
> + START_USE(vq);
> +
> + for (i = 0; i < vq->vring.num; i++) {
> + if (vq->desc_state[i].data != buf)
> + continue;
> + detach_buf(vq, i, NULL);
> + END_USE(vq);
> + return buf;
> + }
> +
> + END_USE(vq);
> + return NULL;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_detach_buf);
> +
> irqreturn_t vring_interrupt(int irq, void *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);

Unfortunately the used index gets out of sync with the available index
then.

E.g. you are breaking the invariant that used == avail means ring empty.

Any chance to preserve this invariant?



> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 988c7355bc22..850158518ce5 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -80,6 +80,7 @@ bool virtqueue_poll(struct virtqueue *vq, unsigned);
> bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
>
> void *virtqueue_detach_unused_buf(struct virtqueue *vq);
> +void *virtqueue_detach_buf(struct virtqueue *vq, void *buf);
>
> unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
>

2018-01-20 10:00:46

by Greg Kurz

[permalink] [raw]
Subject: Re: [PATCH 1/2] virtio: allow to detach a buffer from the virtqueue

On Fri, 19 Jan 2018 21:49:38 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Wed, Dec 20, 2017 at 06:46:42PM +0100, Greg Kurz wrote:
> > It is possible for a device to stop using buffers without pushing them
> > back to the driver. This is the case for example with the 9p virtio
> > device: if the driver flushes an in-flight request, the 9p specification
> > specification [*] mandates the server to "to purge the pending response".
> > The reply to the flush request indicates that the 9p server has stopped
> > using the buffers of the flushed in-flight request. But since the server
> > doesn't push back the associated buffers, they don't go back to the free
> > list. This leads the virtqueue to end up with a single slot to handle all
> > the dialog with the device, ie, serialize all I/Os.
> >
> > This patch hence gives the possibility for device specific code to
> > explicitly detach a given buffer from the used ring and put it back
> > to the free list.
> >
> > [*] http://man.cat-v.org/plan_9/5/flush
> >
> > Signed-off-by: Greg Kurz <[email protected]>
>
> It would be better to just change the server to mark all flushed
> requests as used. Why isn't that an option?
>

It is just because I started to look at this from a 9p client code
perspective. It supports several other transports than virtio, and
has a set of transport hooks. One of this hook is called when the
server is supposed to have cancelled a request. My first thought
was to rely on this 'cancelled' hook, like it is done for the RDMA
and fd-based transport.

But now, I realize I should be able to come up with something simpler
in the virtio case, if I do like you suggest. It would require a change
in the client though, as the current code assumes anything pushed by the
other end contains a 9p message, which would be wrong if QEMU does
something like virtqueue_push(vq, elem, 0).

I'll give a try.

Thanks!

> > ---
> > drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++++++++
> > include/linux/virtio.h | 1 +
> > 2 files changed, 29 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index eb30f3e09a47..886e9d054de3 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -936,6 +936,34 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
> > }
> > EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);
> >
> > +/**
> > + * virtqueue_detach_buf - detach specific buffer
> > + * @vq: the struct virtqueue we're talking about.
> > + *
> > + * Returns NULL or the "data" token handed to virtqueue_add_*().
> > + * This should only be used if the driver really knows the buffer
> > + * isn't needed anymore by the device.
> > + */
> > +void *virtqueue_detach_buf(struct virtqueue *_vq, void *buf)
> > +{
> > + struct vring_virtqueue *vq = to_vvq(_vq);
> > + unsigned int i;
> > +
> > + START_USE(vq);
> > +
> > + for (i = 0; i < vq->vring.num; i++) {
> > + if (vq->desc_state[i].data != buf)
> > + continue;
> > + detach_buf(vq, i, NULL);
> > + END_USE(vq);
> > + return buf;
> > + }
> > +
> > + END_USE(vq);
> > + return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(virtqueue_detach_buf);
> > +
> > irqreturn_t vring_interrupt(int irq, void *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
>
> Unfortunately the used index gets out of sync with the available index
> then.
>
> E.g. you are breaking the invariant that used == avail means ring empty.
>
> Any chance to preserve this invariant?
>
>
>
> > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > index 988c7355bc22..850158518ce5 100644
> > --- a/include/linux/virtio.h
> > +++ b/include/linux/virtio.h
> > @@ -80,6 +80,7 @@ bool virtqueue_poll(struct virtqueue *vq, unsigned);
> > bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
> >
> > void *virtqueue_detach_unused_buf(struct virtqueue *vq);
> > +void *virtqueue_detach_buf(struct virtqueue *vq, void *buf);
> >
> > unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
> >