2023-03-25 11:08:00

by 黄杰

[permalink] [raw]
Subject: [PATCH v2] virtio_ring: don't update event idx on get_buf

in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
the vq->event_triggered will be set to true. It will no longer be set to
false. Unless we explicitly call virtqueue_enable_cb_delayed or
virtqueue_enable_cb_prepare.

If we disable the napi_tx, it will only be called when the tx ring
buffer is relatively small.

Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
every time we call virtqueue_get_buf_ctx. This will bring more interruptions.

To summarize:
1) event_triggered was set to true in vring_interrupt()
2) after this nothing will happen for virtqueue_disable_cb() so
VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
then it tries to publish new event

To fix, if event_triggered is set to true, do not update
vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap

Tested with iperf:
iperf3 tcp stream:
vm1 -----------------> vm2
vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
there are many tx interrupts in vm2.
but without event_triggered there are just a few tx interrupts.

Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
Signed-off-by: Albert Huang <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/virtio/virtio_ring.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cbeeea1b0439..1c36fa477966 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
/* If we expect an interrupt for the next entry, tell host
* by writing event index and flush out the write before
* the read in the next get_buf call. */
- if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
+ if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
+ !vq->event_triggered))
virtio_store_mb(vq->weak_barriers,
&vring_used_event(&vq->split.vring),
cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
@@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
* by writing event index and flush out the write before
* the read in the next get_buf call.
*/
- if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
+ if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
+ !vq->event_triggered))
virtio_store_mb(vq->weak_barriers,
&vq->packed.vring.driver->off_wrap,
cpu_to_le16(vq->last_used_idx));
--
2.37.0 (Apple Git-136)


2023-03-27 04:37:06

by Xuan Zhuo

[permalink] [raw]
Subject: Re: [PATCH v2] virtio_ring: don't update event idx on get_buf

On Sat, 25 Mar 2023 18:56:33 +0800, Albert Huang <[email protected]> wrote:
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix, if event_triggered is set to true, do not update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: Albert Huang <[email protected]>
> Message-Id: <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>

Reviewed-by: Xuan Zhuo <[email protected]>

> ---
> drivers/virtio/virtio_ring.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cbeeea1b0439..1c36fa477966 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> /* If we expect an interrupt for the next entry, tell host
> * by writing event index and flush out the write before
> * the read in the next get_buf call. */
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vring_used_event(&vq->split.vring),
> cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> * by writing event index and flush out the write before
> * the read in the next get_buf call.
> */
> - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vq->packed.vring.driver->off_wrap,
> cpu_to_le16(vq->last_used_idx));
> --
> 2.37.0 (Apple Git-136)
>

2023-03-27 12:50:23

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2] virtio_ring: don't update event idx on get_buf

is the below same as what I posted or different? how?

On Sat, Mar 25, 2023 at 06:56:33PM +0800, Albert Huang wrote:
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix, if event_triggered is set to true, do not update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: Albert Huang <[email protected]>
> Message-Id: <[email protected]>

what is this exactly?

> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---
> drivers/virtio/virtio_ring.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cbeeea1b0439..1c36fa477966 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> /* If we expect an interrupt for the next entry, tell host
> * by writing event index and flush out the write before
> * the read in the next get_buf call. */
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vring_used_event(&vq->split.vring),
> cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> * by writing event index and flush out the write before
> * the read in the next get_buf call.
> */
> - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vq->packed.vring.driver->off_wrap,
> cpu_to_le16(vq->last_used_idx));
> --
> 2.37.0 (Apple Git-136)

2023-03-28 02:19:52

by Dominique Martinet

[permalink] [raw]
Subject: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

Hi Michael, Albert,

Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.

This patch (commited as 35395770f803 ("virtio_ring: don't update event
idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
Luis in https://lkml.kernel.org/r/[email protected]

I've just hit had a look at recent patches[1] and reverted this to test
and I can mount again, so I'm pretty sure this is the culprit, but I
didn't look at the content at all yet so cannot advise further.
It might very well be that we need some extra handling for 9p
specifically that can be added separately if required.

[1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/


This can be reproduced with a simple mount, run qemu with some -virtfs
argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
these messages:
9pnet: -- p9_virtio_request (83): 9p debug: virtio request
9pnet: -- p9_virtio_request (83): virtio request kicked

So I suspect we're just not getting a callback.


I'll have a closer look after work, but any advice meanwhile will be
appreciated!
(I'm sure Luis would also like a temporary drop from -next until
this is figured out, but I'll leave this up to you)


>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix, if event_triggered is set to true, do not update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: Albert Huang <[email protected]>
> Message-Id: <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---
> drivers/virtio/virtio_ring.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cbeeea1b0439..1c36fa477966 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> /* If we expect an interrupt for the next entry, tell host
> * by writing event index and flush out the write before
> * the read in the next get_buf call. */
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vring_used_event(&vq->split.vring),
> cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> * by writing event index and flush out the write before
> * the read in the next get_buf call.
> */
> - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vq->packed.vring.driver->off_wrap,
> cpu_to_le16(vq->last_used_idx));

2023-03-28 03:14:16

by Jason Wang

[permalink] [raw]
Subject: Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
<[email protected]> wrote:
>
> Hi Michael, Albert,
>
> Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > the vq->event_triggered will be set to true. It will no longer be set to
> > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > virtqueue_enable_cb_prepare.
>
> This patch (commited as 35395770f803 ("virtio_ring: don't update event
> idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> Luis in https://lkml.kernel.org/r/[email protected]
>
> I've just hit had a look at recent patches[1] and reverted this to test
> and I can mount again, so I'm pretty sure this is the culprit, but I
> didn't look at the content at all yet so cannot advise further.
> It might very well be that we need some extra handling for 9p
> specifically that can be added separately if required.
>
> [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
>
>
> This can be reproduced with a simple mount, run qemu with some -virtfs
> argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> these messages:
> 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> 9pnet: -- p9_virtio_request (83): virtio request kicked
>
> So I suspect we're just not getting a callback.

I think so. The patch assumes the driver will call
virtqueue_disable/enable_cb() which is not the case of the 9p driver.

So after the first interrupt, event_triggered will be set to true forever.

Thanks

>
>
> I'll have a closer look after work, but any advice meanwhile will be
> appreciated!
> (I'm sure Luis would also like a temporary drop from -next until
> this is figured out, but I'll leave this up to you)
>
>
> >
> > If we disable the napi_tx, it will only be called when the tx ring
> > buffer is relatively small.
> >
> > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> >
> > To summarize:
> > 1) event_triggered was set to true in vring_interrupt()
> > 2) after this nothing will happen for virtqueue_disable_cb() so
> > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > then it tries to publish new event
> >
> > To fix, if event_triggered is set to true, do not update
> > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> >
> > Tested with iperf:
> > iperf3 tcp stream:
> > vm1 -----------------> vm2
> > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > there are many tx interrupts in vm2.
> > but without event_triggered there are just a few tx interrupts.
> >
> > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > Signed-off-by: Albert Huang <[email protected]>
> > Message-Id: <[email protected]>
> > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > ---
> > drivers/virtio/virtio_ring.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cbeeea1b0439..1c36fa477966 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > /* If we expect an interrupt for the next entry, tell host
> > * by writing event index and flush out the write before
> > * the read in the next get_buf call. */
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > + !vq->event_triggered))
> > virtio_store_mb(vq->weak_barriers,
> > &vring_used_event(&vq->split.vring),
> > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > * by writing event index and flush out the write before
> > * the read in the next get_buf call.
> > */
> > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > + !vq->event_triggered))
> > virtio_store_mb(vq->weak_barriers,
> > &vq->packed.vring.driver->off_wrap,
> > cpu_to_le16(vq->last_used_idx));
>

2023-03-28 03:30:09

by 黄杰

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
>
> On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> <[email protected]> wrote:
> >
> > Hi Michael, Albert,
> >
> > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > the vq->event_triggered will be set to true. It will no longer be set to
> > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > virtqueue_enable_cb_prepare.
> >
> > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > Luis in https://lkml.kernel.org/r/[email protected]
> >
> > I've just hit had a look at recent patches[1] and reverted this to test
> > and I can mount again, so I'm pretty sure this is the culprit, but I
> > didn't look at the content at all yet so cannot advise further.
> > It might very well be that we need some extra handling for 9p
> > specifically that can be added separately if required.
> >
> > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> >
> >
> > This can be reproduced with a simple mount, run qemu with some -virtfs
> > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > these messages:
> > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > 9pnet: -- p9_virtio_request (83): virtio request kicked
> >
> > So I suspect we're just not getting a callback.
>
> I think so. The patch assumes the driver will call
> virtqueue_disable/enable_cb() which is not the case of the 9p driver.
>
> So after the first interrupt, event_triggered will be set to true forever.
>
> Thanks
>

Hi: Wang

Yes, This patch assumes that all virtio-related drivers will call
virtqueue_disable/enable_cb().
Thank you for raising this issue.

It seems that napi_tx is only related to virtue_net. I'm thinking if
we need to refactor
napi_tx instead of implementing it inside virtio_ring.

Thanks

> >
> >
> > I'll have a closer look after work, but any advice meanwhile will be
> > appreciated!
> > (I'm sure Luis would also like a temporary drop from -next until
> > this is figured out, but I'll leave this up to you)
> >
> >
> > >
> > > If we disable the napi_tx, it will only be called when the tx ring
> > > buffer is relatively small.
> > >
> > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > >
> > > To summarize:
> > > 1) event_triggered was set to true in vring_interrupt()
> > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > then it tries to publish new event
> > >
> > > To fix, if event_triggered is set to true, do not update
> > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > >
> > > Tested with iperf:
> > > iperf3 tcp stream:
> > > vm1 -----------------> vm2
> > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > there are many tx interrupts in vm2.
> > > but without event_triggered there are just a few tx interrupts.
> > >
> > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > Signed-off-by: Albert Huang <[email protected]>
> > > Message-Id: <[email protected]>
> > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > ---
> > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index cbeeea1b0439..1c36fa477966 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > /* If we expect an interrupt for the next entry, tell host
> > > * by writing event index and flush out the write before
> > > * the read in the next get_buf call. */
> > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > + !vq->event_triggered))
> > > virtio_store_mb(vq->weak_barriers,
> > > &vring_used_event(&vq->split.vring),
> > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > * by writing event index and flush out the write before
> > > * the read in the next get_buf call.
> > > */
> > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > + !vq->event_triggered))
> > > virtio_store_mb(vq->weak_barriers,
> > > &vq->packed.vring.driver->off_wrap,
> > > cpu_to_le16(vq->last_used_idx));
> >
>

2023-03-28 03:49:00

by Jason Wang

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
>
> Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> >
> > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > <[email protected]> wrote:
> > >
> > > Hi Michael, Albert,
> > >
> > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > virtqueue_enable_cb_prepare.
> > >
> > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > Luis in https://lkml.kernel.org/r/[email protected]
> > >
> > > I've just hit had a look at recent patches[1] and reverted this to test
> > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > didn't look at the content at all yet so cannot advise further.
> > > It might very well be that we need some extra handling for 9p
> > > specifically that can be added separately if required.
> > >
> > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > >
> > >
> > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > these messages:
> > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > >
> > > So I suspect we're just not getting a callback.
> >
> > I think so. The patch assumes the driver will call
> > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> >
> > So after the first interrupt, event_triggered will be set to true forever.
> >
> > Thanks
> >
>
> Hi: Wang
>
> Yes, This patch assumes that all virtio-related drivers will call
> virtqueue_disable/enable_cb().
> Thank you for raising this issue.
>
> It seems that napi_tx is only related to virtue_net. I'm thinking if
> we need to refactor
> napi_tx instead of implementing it inside virtio_ring.

We can hear from others.

I think it's better not to workaround virtio_ring issues in a specific
driver. It might just add more hacks. We should correctly set
VRING_AVAIL_F_NO_INTERRUPT,

Do you think the following might work (not even a compile test)?

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 41144b5246a8..12f4efb6dc54 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
- vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
- if (vq->event)
- /* TODO: this is a hack. Figure out a cleaner
value to write. */
- vring_used_event(&vq->split.vring) = 0x0;
- else
- vq->split.vring.avail->flags =
- cpu_to_virtio16(_vq->vdev,
- vq->split.avail_flags_shadow);
- }
+ if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
+ vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+
+ if (vq->event && !vq->event_triggered)
+ /* TODO: this is a hack. Figure out a cleaner value to write. */
+ vring_used_event(&vq->split.vring) = 0x0;
+ else
+ vq->split.vring.avail->flags =
+ cpu_to_virtio16(_vq->vdev,
+ vq->split.avail_flags_shadow);
}

static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
@@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
+ if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+
+ if (vq->event_triggered)
vq->packed.vring.driver->flags =
cpu_to_le16(vq->packed.event_flags_shadow);
}
@@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- /* If device triggered an event already it won't trigger one again:
- * no need to disable.
- */
- if (vq->event_triggered)
- return;
-
if (vq->packed_ring)
virtqueue_disable_cb_packed(_vq);
else

Thanks

>
> Thanks
>
> > >
> > >
> > > I'll have a closer look after work, but any advice meanwhile will be
> > > appreciated!
> > > (I'm sure Luis would also like a temporary drop from -next until
> > > this is figured out, but I'll leave this up to you)
> > >
> > >
> > > >
> > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > buffer is relatively small.
> > > >
> > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > >
> > > > To summarize:
> > > > 1) event_triggered was set to true in vring_interrupt()
> > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > then it tries to publish new event
> > > >
> > > > To fix, if event_triggered is set to true, do not update
> > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > >
> > > > Tested with iperf:
> > > > iperf3 tcp stream:
> > > > vm1 -----------------> vm2
> > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > there are many tx interrupts in vm2.
> > > > but without event_triggered there are just a few tx interrupts.
> > > >
> > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > Signed-off-by: Albert Huang <[email protected]>
> > > > Message-Id: <[email protected]>
> > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > ---
> > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index cbeeea1b0439..1c36fa477966 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > /* If we expect an interrupt for the next entry, tell host
> > > > * by writing event index and flush out the write before
> > > > * the read in the next get_buf call. */
> > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > + !vq->event_triggered))
> > > > virtio_store_mb(vq->weak_barriers,
> > > > &vring_used_event(&vq->split.vring),
> > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > * by writing event index and flush out the write before
> > > > * the read in the next get_buf call.
> > > > */
> > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > + !vq->event_triggered))
> > > > virtio_store_mb(vq->weak_barriers,
> > > > &vq->packed.vring.driver->off_wrap,
> > > > cpu_to_le16(vq->last_used_idx));
> > >
> >
>

2023-03-28 04:55:13

by Luis Chamberlain

[permalink] [raw]
Subject: Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 11:13:32AM +0900, Dominique Martinet wrote:
> Hi Michael, Albert,
>
> Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > the vq->event_triggered will be set to true. It will no longer be set to
> > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > virtqueue_enable_cb_prepare.
>
> This patch (commited as 35395770f803 ("virtio_ring: don't update event
> idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> Luis in https://lkml.kernel.org/r/[email protected]

Spot on, reverting that commit fixes it.

Luis

2023-03-28 09:13:30

by 黄杰

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

Jason Wang <[email protected]> 于2023年3月28日周二 11:40写道:
>
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> >
> > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <[email protected]> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > >
> > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, This patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,
>
> Do you think the following might work (not even a compile test)?
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 41144b5246a8..12f4efb6dc54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (vq->event)
> - /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> - vring_used_event(&vq->split.vring) = 0x0;
> - else
> - vq->split.vring.avail->flags =
> - cpu_to_virtio16(_vq->vdev,
> - vq->split.avail_flags_shadow);
> - }
> + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event && !vq->event_triggered)
> + /* TODO: this is a hack. Figure out a cleaner value to write. */
> + vring_used_event(&vq->split.vring) = 0x0;
> + else
> + vq->split.vring.avail->flags =
> + cpu_to_virtio16(_vq->vdev,
> + vq->split.avail_flags_shadow);
> }
>
> static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Thanks
>

Hi, This patch seems to address the issue I initially raised and also
avoids the problem with virtio-9P.

but maybe this is a better choice:

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 307e139cb11d..6784d155c781 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
virtqueue *_vq)

if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+
+ if (vq->event_triggered)
+ return;
+
if (vq->event)
/* TODO: this is a hack. Figure out a cleaner
value to write. */
vring_used_event(&vq->split.vring) = 0x0;
@@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
virtqueue *_vq)

if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+
+ if (vq->event_triggered)
+ return;
+
vq->packed.vring.driver->flags =
cpu_to_le16(vq->packed.event_flags_shadow);
}
@@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- /* If device triggered an event already it won't trigger one again:
- * no need to disable.
- */
- if (vq->event_triggered)
- return;
-
if (vq->packed_ring)
virtqueue_disable_cb_packed(_vq);
else

Does Michael have any other suggestions?

Thanks.

> >
> > Thanks
> >
> > > >
> > > >
> > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > appreciated!
> > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > this is figured out, but I'll leave this up to you)
> > > >
> > > >
> > > > >
> > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > buffer is relatively small.
> > > > >
> > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > >
> > > > > To summarize:
> > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > then it tries to publish new event
> > > > >
> > > > > To fix, if event_triggered is set to true, do not update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > >
> > > > > Tested with iperf:
> > > > > iperf3 tcp stream:
> > > > > vm1 -----------------> vm2
> > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > there are many tx interrupts in vm2.
> > > > > but without event_triggered there are just a few tx interrupts.
> > > > >
> > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > Signed-off-by: Albert Huang <[email protected]>
> > > > > Message-Id: <[email protected]>
> > > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call. */
> > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vring_used_event(&vq->split.vring),
> > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call.
> > > > > */
> > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vq->packed.vring.driver->off_wrap,
> > > > > cpu_to_le16(vq->last_used_idx));
> > > >
> > >
> >
>

2023-03-28 14:46:00

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 05:09:19PM +0800, 黄杰 wrote:
> Jason Wang <[email protected]> 于2023年3月28日周二 11:40写道:
> >
> > On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> > >
> > > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > > >
> > > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi Michael, Albert,
> > > > >
> > > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > > virtqueue_enable_cb_prepare.
> > > > >
> > > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > > >
> > > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > > didn't look at the content at all yet so cannot advise further.
> > > > > It might very well be that we need some extra handling for 9p
> > > > > specifically that can be added separately if required.
> > > > >
> > > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > > >
> > > > >
> > > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > > these messages:
> > > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > > >
> > > > > So I suspect we're just not getting a callback.
> > > >
> > > > I think so. The patch assumes the driver will call
> > > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > > >
> > > > So after the first interrupt, event_triggered will be set to true forever.
> > > >
> > > > Thanks
> > > >
> > >
> > > Hi: Wang
> > >
> > > Yes, This patch assumes that all virtio-related drivers will call
> > > virtqueue_disable/enable_cb().
> > > Thank you for raising this issue.
> > >
> > > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > > we need to refactor
> > > napi_tx instead of implementing it inside virtio_ring.
> >
> > We can hear from others.
> >
> > I think it's better not to workaround virtio_ring issues in a specific
> > driver. It might just add more hacks. We should correctly set
> > VRING_AVAIL_F_NO_INTERRUPT,
> >
> > Do you think the following might work (not even a compile test)?
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 41144b5246a8..12f4efb6dc54 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> > - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > - if (vq->event)
> > - /* TODO: this is a hack. Figure out a cleaner
> > value to write. */
> > - vring_used_event(&vq->split.vring) = 0x0;
> > - else
> > - vq->split.vring.avail->flags =
> > - cpu_to_virtio16(_vq->vdev,
> > - vq->split.avail_flags_shadow);
> > - }
> > + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > +
> > + if (vq->event && !vq->event_triggered)
> > + /* TODO: this is a hack. Figure out a cleaner value to write. */
> > + vring_used_event(&vq->split.vring) = 0x0;
> > + else
> > + vq->split.vring.avail->flags =
> > + cpu_to_virtio16(_vq->vdev,
> > + vq->split.avail_flags_shadow);
> > }
> >
> > static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> > @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> > + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> > vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> > +
> > + if (vq->event_triggered)
> > vq->packed.vring.driver->flags =
> > cpu_to_le16(vq->packed.event_flags_shadow);
> > }
> > @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - /* If device triggered an event already it won't trigger one again:
> > - * no need to disable.
> > - */
> > - if (vq->event_triggered)
> > - return;
> > -
> > if (vq->packed_ring)
> > virtqueue_disable_cb_packed(_vq);
> > else
> >
> > Thanks
> >
>
> Hi, This patch seems to address the issue I initially raised and also
> avoids the problem with virtio-9P.
>
> but maybe this is a better choice:
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..6784d155c781 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
>
> if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Does Michael have any other suggestions?
>
> Thanks.

Hmm what bothers me is this breaks the underlying assumption that
shadow is an exact match of the shared memory. Need to check this does
not cascade. But I am still trying to get my head around what
the issue is.




> > >
> > > Thanks
> > >
> > > > >
> > > > >
> > > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > > appreciated!
> > > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > > this is figured out, but I'll leave this up to you)
> > > > >
> > > > >
> > > > > >
> > > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > > buffer is relatively small.
> > > > > >
> > > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > > >
> > > > > > To summarize:
> > > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > > then it tries to publish new event
> > > > > >
> > > > > > To fix, if event_triggered is set to true, do not update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > >
> > > > > > Tested with iperf:
> > > > > > iperf3 tcp stream:
> > > > > > vm1 -----------------> vm2
> > > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > > there are many tx interrupts in vm2.
> > > > > > but without event_triggered there are just a few tx interrupts.
> > > > > >
> > > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > > Signed-off-by: Albert Huang <[email protected]>
> > > > > > Message-Id: <[email protected]>
> > > > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > > > ---
> > > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call. */
> > > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vring_used_event(&vq->split.vring),
> > > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call.
> > > > > > */
> > > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vq->packed.vring.driver->off_wrap,
> > > > > > cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
> > >
> >

2023-03-29 05:31:02

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> >
> > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <[email protected]> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > >
> > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, This patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,

I am still stuck trying to understand why we don't set it.
How does event_triggered end up getting set without
event index support?

Thanks!

2023-03-29 05:38:01

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 05:09:19PM +0800, 黄杰 wrote:
> Jason Wang <[email protected]> 于2023年3月28日周二 11:40写道:
> >
> > On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> > >
> > > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > > >
> > > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi Michael, Albert,
> > > > >
> > > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > > virtqueue_enable_cb_prepare.
> > > > >
> > > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > > >
> > > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > > didn't look at the content at all yet so cannot advise further.
> > > > > It might very well be that we need some extra handling for 9p
> > > > > specifically that can be added separately if required.
> > > > >
> > > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > > >
> > > > >
> > > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > > these messages:
> > > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > > >
> > > > > So I suspect we're just not getting a callback.
> > > >
> > > > I think so. The patch assumes the driver will call
> > > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > > >
> > > > So after the first interrupt, event_triggered will be set to true forever.
> > > >
> > > > Thanks
> > > >
> > >
> > > Hi: Wang
> > >
> > > Yes, This patch assumes that all virtio-related drivers will call
> > > virtqueue_disable/enable_cb().
> > > Thank you for raising this issue.
> > >
> > > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > > we need to refactor
> > > napi_tx instead of implementing it inside virtio_ring.
> >
> > We can hear from others.
> >
> > I think it's better not to workaround virtio_ring issues in a specific
> > driver. It might just add more hacks. We should correctly set
> > VRING_AVAIL_F_NO_INTERRUPT,
> >
> > Do you think the following might work (not even a compile test)?
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 41144b5246a8..12f4efb6dc54 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> > - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > - if (vq->event)
> > - /* TODO: this is a hack. Figure out a cleaner
> > value to write. */
> > - vring_used_event(&vq->split.vring) = 0x0;
> > - else
> > - vq->split.vring.avail->flags =
> > - cpu_to_virtio16(_vq->vdev,
> > - vq->split.avail_flags_shadow);
> > - }
> > + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > +
> > + if (vq->event && !vq->event_triggered)
> > + /* TODO: this is a hack. Figure out a cleaner value to write. */
> > + vring_used_event(&vq->split.vring) = 0x0;
> > + else
> > + vq->split.vring.avail->flags =
> > + cpu_to_virtio16(_vq->vdev,
> > + vq->split.avail_flags_shadow);
> > }
> >
> > static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> > @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> > + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> > vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> > +
> > + if (vq->event_triggered)
> > vq->packed.vring.driver->flags =
> > cpu_to_le16(vq->packed.event_flags_shadow);
> > }
> > @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - /* If device triggered an event already it won't trigger one again:
> > - * no need to disable.
> > - */
> > - if (vq->event_triggered)
> > - return;
> > -
> > if (vq->packed_ring)
> > virtqueue_disable_cb_packed(_vq);
> > else
> >
> > Thanks
> >
>
> Hi, This patch seems to address the issue I initially raised and also
> avoids the problem with virtio-9P.
>
> but maybe this is a better choice:
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..6784d155c781 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
>
> if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Does Michael have any other suggestions?
>
> Thanks.

Oh I finally understand I think. The issue is with event index
enabled. interrupt fires once, we set event index to 0x0.
since shadow is not set then we never update it again.

I agree, your patch is a good fix, just please copy the comment
you are removing to the two places where we check event_triggered now.
I dropped patch v2 from my tree and please post v3
with this fixup squashed.

Thanks a lot!



Apropos if we fixed all drivers to call disable/enable_cb
explicitly we could remove branch on data path, but it's a lot of work
and it's easy to miss some drivers.



> > >
> > > Thanks
> > >
> > > > >
> > > > >
> > > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > > appreciated!
> > > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > > this is figured out, but I'll leave this up to you)
> > > > >
> > > > >
> > > > > >
> > > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > > buffer is relatively small.
> > > > > >
> > > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > > >
> > > > > > To summarize:
> > > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > > then it tries to publish new event
> > > > > >
> > > > > > To fix, if event_triggered is set to true, do not update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > >
> > > > > > Tested with iperf:
> > > > > > iperf3 tcp stream:
> > > > > > vm1 -----------------> vm2
> > > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > > there are many tx interrupts in vm2.
> > > > > > but without event_triggered there are just a few tx interrupts.
> > > > > >
> > > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > > Signed-off-by: Albert Huang <[email protected]>
> > > > > > Message-Id: <[email protected]>
> > > > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > > > ---
> > > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call. */
> > > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vring_used_event(&vq->split.vring),
> > > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call.
> > > > > > */
> > > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vq->packed.vring.driver->off_wrap,
> > > > > > cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
> > >
> >

2023-03-29 05:38:36

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2] virtio_ring: don't update event idx on get_buf

On Sat, Mar 25, 2023 at 06:56:33PM +0800, Albert Huang wrote:
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix, if event_triggered is set to true, do not update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: Albert Huang <[email protected]>
> Message-Id: <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>

dropped for now due to breakage of 9p.
Pls post v3 with the fix you suggested (moving the check
of vq->event_triggered).

> ---
> drivers/virtio/virtio_ring.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cbeeea1b0439..1c36fa477966 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> /* If we expect an interrupt for the next entry, tell host
> * by writing event index and flush out the write before
> * the read in the next get_buf call. */
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vring_used_event(&vq->split.vring),
> cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> * by writing event index and flush out the write before
> * the read in the next get_buf call.
> */
> - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> + !vq->event_triggered))
> virtio_store_mb(vq->weak_barriers,
> &vq->packed.vring.driver->off_wrap,
> cpu_to_le16(vq->last_used_idx));
> --
> 2.37.0 (Apple Git-136)

2023-03-29 05:55:13

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> >
> > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <[email protected]> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > >
> > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, This patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,
>
> Do you think the following might work (not even a compile test)?


ok but:

> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 41144b5246a8..12f4efb6dc54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (vq->event)
> - /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> - vring_used_event(&vq->split.vring) = 0x0;
> - else
> - vq->split.vring.avail->flags =
> - cpu_to_virtio16(_vq->vdev,
> - vq->split.avail_flags_shadow);
> - }
> + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event && !vq->event_triggered)
> + /* TODO: this is a hack. Figure out a cleaner value to write. */
> + vring_used_event(&vq->split.vring) = 0x0;
> + else
> + vq->split.vring.avail->flags =
> + cpu_to_virtio16(_vq->vdev,
> + vq->split.avail_flags_shadow);
> }
>
> static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)

I don't get this one. if event_triggered why do you still want to
write into driver flags? it won't trigger again anytime soon.

> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Thanks

I think I prefer Huang Albert's other patch - are you ok with it?

> >
> > Thanks
> >
> > > >
> > > >
> > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > appreciated!
> > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > this is figured out, but I'll leave this up to you)
> > > >
> > > >
> > > > >
> > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > buffer is relatively small.
> > > > >
> > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > >
> > > > > To summarize:
> > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > then it tries to publish new event
> > > > >
> > > > > To fix, if event_triggered is set to true, do not update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > >
> > > > > Tested with iperf:
> > > > > iperf3 tcp stream:
> > > > > vm1 -----------------> vm2
> > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > there are many tx interrupts in vm2.
> > > > > but without event_triggered there are just a few tx interrupts.
> > > > >
> > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > Signed-off-by: Albert Huang <[email protected]>
> > > > > Message-Id: <[email protected]>
> > > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call. */
> > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vring_used_event(&vq->split.vring),
> > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call.
> > > > > */
> > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vq->packed.vring.driver->off_wrap,
> > > > > cpu_to_le16(vq->last_used_idx));
> > > >
> > >
> >

2023-03-29 07:25:25

by 黄杰

[permalink] [raw]
Subject: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set

From: "huangjie.albert" <[email protected]>

in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
the vq->event_triggered will be set to true. It will no longer be set to
false. Unless we explicitly call virtqueue_enable_cb_delayed or
virtqueue_enable_cb_prepare.

If we disable the napi_tx, it will only be called when the tx ring
buffer is relatively small.

Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
every time we call virtqueue_get_buf_ctx.This bring more interruptions.

To summarize:
1) event_triggered was set to true in vring_interrupt()
2) after this nothing will happen for virtqueue_disable_cb() so
VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
then it tries to publish new event

To fix:
update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE to vq
when we call virtqueue_disable_cb even the event_triggered is set to true.

Tested with iperf:
iperf3 tcp stream:
vm1 -----------------> vm2
vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
there are many tx interrupts in vm2.
but without event_triggered there are just a few tx interrupts.

v2->v3:
-update the interrupt disable flag even with the event_triggered is set,
-instead of checking whether event_triggered is set in
-virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have
-not called virtqueue_{enable/disable}_cb to miss notifications.

Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
Signed-off-by: huangjie.albert <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
---
drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 307e139cb11d..ad74463a48ee 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)

if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+
+ /*
+ * If device triggered an event already it won't trigger one again:
+ * no need to disable.
+ */
+ if (vq->event_triggered)
+ return;
+
if (vq->event)
/* TODO: this is a hack. Figure out a cleaner value to write. */
vring_used_event(&vq->split.vring) = 0x0;
@@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
+ if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+
+ /*
+ * If device triggered an event already it won't trigger one again:
+ * no need to disable.
+ */
+ if (vq->event_triggered)
+ return;
+
vq->packed.vring.driver->flags =
cpu_to_le16(vq->packed.event_flags_shadow);
}
@@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- /* If device triggered an event already it won't trigger one again:
- * no need to disable.
- */
- if (vq->event_triggered)
- return;
-
if (vq->packed_ring)
virtqueue_disable_cb_packed(_vq);
else
--
2.31.1

2023-03-29 07:47:36

by Xuan Zhuo

[permalink] [raw]
Subject: Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set

Maybe one new thread is better.

Thanks.

On Wed, 29 Mar 2023 15:21:35 +0800, Albert Huang <[email protected]> wrote:
> From: "huangjie.albert" <[email protected]>
>
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx.This bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix:
> update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE to vq
> when we call virtqueue_disable_cb even the event_triggered is set to true.
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> v2->v3:
> -update the interrupt disable flag even with the event_triggered is set,
> -instead of checking whether event_triggered is set in
> -virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have
> -not called virtqueue_{enable/disable}_cb to miss notifications.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: huangjie.albert <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..ad74463a48ee 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
> --
> 2.31.1
>

2023-03-29 16:50:28

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set

Ican't parse the subject at all. I think the subject from v2
was fine - we are skipping event index updates on get buf.
Or maybe go higher level and describe the effect of the patch:

virtio_ring: reduce interrupt rate with event idx enabled


On Wed, Mar 29, 2023 at 03:21:35PM +0800, Albert Huang wrote:
> From: "huangjie.albert" <[email protected]>
>
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false.

this last sentence is redundant

> Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx.This bring more interruptions.

This will bring more interrupts. And pls add space after ".".

>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix:
> update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE to vq
> when we call virtqueue_disable_cb even the event_triggered is set to true.

bad grammar here and way too much detail. better:

make disable_cb set VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE in flags shadow to make get_buf
correctly detect that callbacks are disabled.

>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>

put changelog after --- please.

> v2->v3:
> -update the interrupt disable flag even with the event_triggered is set,
> -instead of checking whether event_triggered is set in
> -virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have
> -not called virtqueue_{enable/disable}_cb to miss notifications.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: huangjie.albert <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..ad74463a48ee 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {


Why are you making this change? It's not great because it
only works because VRING_PACKED_EVENT_FLAG_DISABLE happens
to equal 0x1. does not the patch work with the original
if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE)
?

Besides, if you are making unrelated changes commit log should
describe them.

> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
> --
> 2.31.1

2023-03-29 16:56:32

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set

On Wed, Mar 29, 2023 at 03:27:03PM +0800, Xuan Zhuo wrote:
> Maybe one new thread is better.
>
> Thanks.

I don't know but do not post same message twice please
without explanation. if you repost put "PATCH repost" in
the subject.

2023-03-30 02:55:43

by Jason Wang

[permalink] [raw]
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)

On Wed, Mar 29, 2023 at 1:42 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <[email protected]> wrote:
> > >
> > > Jason Wang <[email protected]> 于2023年3月28日周二 10:59写道:
> > > >
> > > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi Michael, Albert,
> > > > >
> > > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > > virtqueue_enable_cb_prepare.
> > > > >
> > > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > > Luis in https://lkml.kernel.org/r/[email protected]
> > > > >
> > > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > > didn't look at the content at all yet so cannot advise further.
> > > > > It might very well be that we need some extra handling for 9p
> > > > > specifically that can be added separately if required.
> > > > >
> > > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > > >
> > > > >
> > > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > > these messages:
> > > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > > >
> > > > > So I suspect we're just not getting a callback.
> > > >
> > > > I think so. The patch assumes the driver will call
> > > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > > >
> > > > So after the first interrupt, event_triggered will be set to true forever.
> > > >
> > > > Thanks
> > > >
> > >
> > > Hi: Wang
> > >
> > > Yes, This patch assumes that all virtio-related drivers will call
> > > virtqueue_disable/enable_cb().
> > > Thank you for raising this issue.
> > >
> > > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > > we need to refactor
> > > napi_tx instead of implementing it inside virtio_ring.
> >
> > We can hear from others.
> >
> > I think it's better not to workaround virtio_ring issues in a specific
> > driver. It might just add more hacks. We should correctly set
> > VRING_AVAIL_F_NO_INTERRUPT,
> >
> > Do you think the following might work (not even a compile test)?
>
>
> ok but:
>
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 41144b5246a8..12f4efb6dc54 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> > - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > - if (vq->event)
> > - /* TODO: this is a hack. Figure out a cleaner
> > value to write. */
> > - vring_used_event(&vq->split.vring) = 0x0;
> > - else
> > - vq->split.vring.avail->flags =
> > - cpu_to_virtio16(_vq->vdev,
> > - vq->split.avail_flags_shadow);
> > - }
> > + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > +
> > + if (vq->event && !vq->event_triggered)
> > + /* TODO: this is a hack. Figure out a cleaner value to write. */
> > + vring_used_event(&vq->split.vring) = 0x0;
> > + else
> > + vq->split.vring.avail->flags =
> > + cpu_to_virtio16(_vq->vdev,
> > + vq->split.avail_flags_shadow);
> > }
> >
> > static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> > @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> > + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> > vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> > +
> > + if (vq->event_triggered)
>
> I don't get this one. if event_triggered why do you still want to
> write into driver flags? it won't trigger again anytime soon.

Should be a typo.

>
> > vq->packed.vring.driver->flags =
> > cpu_to_le16(vq->packed.event_flags_shadow);
> > }
> > @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - /* If device triggered an event already it won't trigger one again:
> > - * no need to disable.
> > - */
> > - if (vq->event_triggered)
> > - return;
> > -
> > if (vq->packed_ring)
> > virtqueue_disable_cb_packed(_vq);
> > else
> >
> > Thanks
>
> I think I prefer Huang Albert's other patch - are you ok with it?

Yes.

Thanks

>
> > >
> > > Thanks
> > >
> > > > >
> > > > >
> > > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > > appreciated!
> > > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > > this is figured out, but I'll leave this up to you)
> > > > >
> > > > >
> > > > > >
> > > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > > buffer is relatively small.
> > > > > >
> > > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > > >
> > > > > > To summarize:
> > > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > > then it tries to publish new event
> > > > > >
> > > > > > To fix, if event_triggered is set to true, do not update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > >
> > > > > > Tested with iperf:
> > > > > > iperf3 tcp stream:
> > > > > > vm1 -----------------> vm2
> > > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > > there are many tx interrupts in vm2.
> > > > > > but without event_triggered there are just a few tx interrupts.
> > > > > >
> > > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > > Signed-off-by: Albert Huang <[email protected]>
> > > > > > Message-Id: <[email protected]>
> > > > > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > > > > > ---
> > > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call. */
> > > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vring_used_event(&vq->split.vring),
> > > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call.
> > > > > > */
> > > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vq->packed.vring.driver->off_wrap,
> > > > > > cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
> > >
>