2013-07-08 09:04:17

by Michael S. Tsirkin

Subject: [PATCH 0/2] virtio_net: fix race in RX VQ processing

Jason Wang reported a race in RX VQ processing:
virtqueue_enable_cb is called outside napi lock,
violating virtio serialization rules.
The race has been there from day 1, but it got especially nasty in 3.0
when commit a5c262c5fd83ece01bd649fb08416c501d4c59d7
"virtio_ring: support event idx feature"
added more dependency on vq state.

Please review, and consider for 3.11 and for stable.

Jason, could you please report whether this fixes the issues for you?


2013-07-08 09:03:27

by Michael S. Tsirkin

Subject: [PATCH 1/2] virtio: support unlocked queue poll

This adds a way to check ring empty state after enable_cb outside any
locks. Will be used by virtio_net.

Note: there's room for more optimization: caller is likely to have a
memory barrier already, which means we might be able to get rid of a
barrier here. Deferring this optimization until we do some
benchmarking.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/virtio/virtio_ring.c | 56 ++++++++++++++++++++++++++++++++++----------
include/linux/virtio.h | 4 ++++
2 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5217baf..37d58f8 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -607,19 +607,21 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
EXPORT_SYMBOL_GPL(virtqueue_disable_cb);

/**
- * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * virtqueue_enable_cb_prepare - restart callbacks after disable_cb
* @vq: the struct virtqueue we're talking about.
*
- * This re-enables callbacks; it returns "false" if there are pending
- * buffers in the queue, to detect a possible race between the driver
- * checking for more work, and enabling callbacks.
+ * This re-enables callbacks; it returns current queue state
+ * in an opaque unsigned value. This value should be later tested by
+ * virtqueue_poll, to detect a possible race between the driver checking for
+ * more work, and enabling callbacks.
*
* Caller must ensure we don't call this with other virtqueue
* operations at the same time (except where noted).
*/
-bool virtqueue_enable_cb(struct virtqueue *_vq)
+unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
+ u16 last_used_idx;

START_USE(vq);

@@ -629,15 +631,45 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
* either clear the flags bit or point the event index at the next
* entry. Always do both to keep code simple. */
vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
- vring_used_event(&vq->vring) = vq->last_used_idx;
+ vring_used_event(&vq->vring) = last_used_idx = vq->last_used_idx;
+ END_USE(vq);
+ return last_used_idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
+
+/**
+ * virtqueue_poll - query pending used buffers
+ * @vq: the struct virtqueue we're talking about.
+ * @last_used_idx: virtqueue state (from call to virtqueue_enable_cb_prepare).
+ *
+ * Returns "true" if there are pending used buffers in the queue.
+ *
+ * This does not need to be serialized.
+ */
+bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
+{
+ struct vring_virtqueue *vq = to_vvq(_vq);
+
virtio_mb(vq->weak_barriers);
- if (unlikely(more_used(vq))) {
- END_USE(vq);
- return false;
- }
+ return (u16)last_used_idx != vq->vring.used->idx;
+}
+EXPORT_SYMBOL_GPL(virtqueue_poll);

- END_USE(vq);
- return true;
+/**
+ * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns "false" if there are pending
+ * buffers in the queue, to detect a possible race between the driver
+ * checking for more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
+bool virtqueue_enable_cb(struct virtqueue *_vq)
+{
+ unsigned last_used_idx = virtqueue_enable_cb_prepare(_vq);
+ return !virtqueue_poll(_vq, last_used_idx);
}
EXPORT_SYMBOL_GPL(virtqueue_enable_cb);

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 9ff8645..72398ee 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -70,6 +70,10 @@ void virtqueue_disable_cb(struct virtqueue *vq);

bool virtqueue_enable_cb(struct virtqueue *vq);

+unsigned virtqueue_enable_cb_prepare(struct virtqueue *vq);
+
+bool virtqueue_poll(struct virtqueue *vq, unsigned);
+
bool virtqueue_enable_cb_delayed(struct virtqueue *vq);

void *virtqueue_detach_unused_buf(struct virtqueue *vq);
--
MST

2013-07-08 09:03:32

by Michael S. Tsirkin

Subject: [PATCH 2/2] virtio_net: fix race in RX VQ processing

virtio net called virtqueue_enable_cb on RX path after napi_complete, so
with NAPI_STATE_SCHED clear - outside the implicit napi lock.
This violates the requirement to synchronize virtqueue_enable_cb wrt
virtqueue_add_buf. In particular, used event can move backwards,
causing us to lose interrupts.
In a debug build, this can trigger panic within START_USE.

Jason Wang reports that he can trigger the race artificially,
by adding udelay() in virtqueue_enable_cb() after virtio_mb().

However, we must call napi_complete to clear NAPI_STATE_SCHED before
polling the virtqueue for used buffers, otherwise napi_schedule_prep in
a callback will fail, causing us to lose RX events.

To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
set (under napi lock), later call virtqueue_poll with
NAPI_STATE_SCHED clear (outside the lock).

Reported-by: Jason Wang <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5305bd1..fbdd79a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -622,8 +622,9 @@ again:

/* Out of packets? */
if (received < budget) {
+ unsigned r = virtqueue_enable_cb_prepare(rq->vq);
napi_complete(napi);
- if (unlikely(!virtqueue_enable_cb(rq->vq)) &&
+ if (unlikely(virtqueue_poll(rq->vq, r)) &&
napi_schedule_prep(napi)) {
virtqueue_disable_cb(rq->vq);
__napi_schedule(napi);
--
MST

2013-07-08 12:52:31

by Sergei Shtylyov

Subject: Re: [PATCH 2/2] virtio_net: fix race in RX VQ processing

Hello.

On 08-07-2013 13:04, Michael S. Tsirkin wrote:

> [...]

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5305bd1..fbdd79a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -622,8 +622,9 @@ again:
>
> /* Out of packets? */
> if (received < budget) {
> + unsigned r = virtqueue_enable_cb_prepare(rq->vq);

Empty line wouldn't hurt here, after declaration.

WBR, Sergei

2013-07-08 13:07:47

by Michael S. Tsirkin

Subject: Re: [PATCH 2/2] virtio_net: fix race in RX VQ processing

On Mon, Jul 08, 2013 at 04:52:26PM +0400, Sergei Shtylyov wrote:
> Hello.
>
> On 08-07-2013 13:04, Michael S. Tsirkin wrote:
>
> >[...]
> >@@ -622,8 +622,9 @@ again:
> >
> > /* Out of packets? */
> > if (received < budget) {
> >+ unsigned r = virtqueue_enable_cb_prepare(rq->vq);
>
> Empty line wouldn't hurt here, after declaration.
>
> WBR, Sergei

I don't like an empty line here - it breaks _prepare
away from _poll which is in the same logical code block.

Is there some rule that says we must have empty
lines after declarations? If yes I'd rather split
initialization away from declaration, though that's
more verbose than it needs to be.

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index fbdd79a..edcffc6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -622,7 +622,9 @@ again:

/* Out of packets? */
if (received < budget) {
- unsigned r = virtqueue_enable_cb_prepare(rq->vq);
+ unsigned r;
+
+ r = virtqueue_enable_cb_prepare(rq->vq);
napi_complete(napi);
if (unlikely(virtqueue_poll(rq->vq, r)) &&
napi_schedule_prep(napi)) {

2013-07-09 03:17:23

by Jason Wang

Subject: Re: [PATCH 0/2] virtio_net: fix race in RX VQ processing

On 07/08/2013 05:05 PM, Michael S. Tsirkin wrote:
> Jason Wang reported a race in RX VQ processing:
> virtqueue_enable_cb is called outside napi lock,
> violating virtio serialization rules.
> The race has been there from day 1, but it got especially nasty in 3.0
> when commit a5c262c5fd83ece01bd649fb08416c501d4c59d7
> "virtio_ring: support event idx feature"
> added more dependency on vq state.
>
> Please review, and consider for 3.11 and for stable.
>
> Jason, could you please report whether this fixes the issues for you?

Yes, I confirm this fixes the issue.

Since I can only reproduce the issue by adding udelay() in
virtqueue_enable_cb() after virtio_mb(), I also validated the fix
with that change applied.

Thanks

2013-07-09 03:27:05

by Jason Wang

Subject: Re: [PATCH 1/2] virtio: support unlocked queue poll

On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> This adds a way to check ring empty state after enable_cb outside any
> locks. Will be used by virtio_net.
>
> Note: there's room for more optimization: caller is likely to have a
> memory barrier already, which means we might be able to get rid of a
> barrier here. Deferring this optimization until we do some
> benchmarking.
>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---

Tested-by: Jason Wang <[email protected]>
Acked-by: Jason Wang <[email protected]>

2013-07-09 03:28:47

by Jason Wang

Subject: Re: [PATCH 2/2] virtio_net: fix race in RX VQ processing

On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> This violates the requirement to synchronize virtqueue_enable_cb wrt
> virtqueue_add_buf. In particular, used event can move backwards,
> causing us to lose interrupts.
> In a debug build, this can trigger panic within START_USE.
>
> Jason Wang reports that he can trigger the races artificially,
> by adding udelay() in virtqueue_enable_cb() after virtio_mb().
>
> However, we must call napi_complete to clear NAPI_STATE_SCHED before
> polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> a callback will fail, causing us to lose RX events.
>
> To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> set (under napi lock), later call virtqueue_poll with
> NAPI_STATE_SCHED clear (outside the lock).
>
> Reported-by: Jason Wang <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---

Tested-by: Jason Wang <[email protected]>
Acked-by: Jason Wang <[email protected]>

2013-07-09 03:54:57

by David Miller

Subject: Re: [PATCH 0/2] virtio_net: fix race in RX VQ processing

From: "Michael S. Tsirkin" <[email protected]>
Date: Mon, 8 Jul 2013 12:05:26 +0300

> Jason Wang reported a race in RX VQ processing:
> virtqueue_enable_cb is called outside napi lock,
> violating virtio serialization rules.
> The race has been there from day 1, but it got especially nasty in 3.0
> when commit a5c262c5fd83ece01bd649fb08416c501d4c59d7
> "virtio_ring: support event idx feature"
> added more dependency on vq state.
>
> Please review, and consider for 3.11 and for stable.
>
> Jason, could you please report whether this fixes the issues for you?

Please resubmit with the minor coding style fix and Jason's Acked-by/Tested-by.

Thanks.

2013-07-10 03:59:13

by Rusty Russell

Subject: Re: [PATCH 1/2] virtio: support unlocked queue poll

"Michael S. Tsirkin" <[email protected]> writes:
> This adds a way to check ring empty state after enable_cb outside any
> locks. Will be used by virtio_net.
>
> Note: there's room for more optimization: caller is likely to have a
> memory barrier already, which means we might be able to get rid of a
> barrier here. Deferring this optimization until we do some
> benchmarking.
>
> Signed-off-by: Michael S. Tsirkin <[email protected]>

Acked-by: Rusty Russell <[email protected]>

Thanks,
Rusty.

2013-07-10 04:38:47

by Asias He

Subject: Re: [PATCH 1/2] virtio: support unlocked queue poll

On Mon, Jul 08, 2013 at 12:04:36PM +0300, Michael S. Tsirkin wrote:
> This adds a way to check ring empty state after enable_cb outside any
> locks. Will be used by virtio_net.
>
> Note: there's room for more optimization: caller is likely to have a
> memory barrier already, which means we might be able to get rid of a
> barrier here. Deferring this optimization until we do some
> benchmarking.
>
> Signed-off-by: Michael S. Tsirkin <[email protected]>

Acked-by: Asias He <[email protected]>


--
Asias

2013-07-10 04:39:58

by Asias He

Subject: Re: [PATCH 2/2] virtio_net: fix race in RX VQ processing

On Tue, Jul 09, 2013 at 11:28:34AM +0800, Jason Wang wrote:
> On 07/08/2013 05:04 PM, Michael S. Tsirkin wrote:
> > virtio net called virtqueue_enable_cb on RX path after napi_complete, so
> > with NAPI_STATE_SCHED clear - outside the implicit napi lock.
> > This violates the requirement to synchronize virtqueue_enable_cb wrt
> > virtqueue_add_buf. In particular, used event can move backwards,
> > causing us to lose interrupts.
> > In a debug build, this can trigger panic within START_USE.
> >
> > Jason Wang reports that he can trigger the races artificially,
> > by adding udelay() in virtqueue_enable_cb() after virtio_mb().
> >
> > However, we must call napi_complete to clear NAPI_STATE_SCHED before
> > polling the virtqueue for used buffers, otherwise napi_schedule_prep in
> > a callback will fail, causing us to lose RX events.
> >
> > To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
> > set (under napi lock), later call virtqueue_poll with
> > NAPI_STATE_SCHED clear (outside the lock).
> >
> > Reported-by: Jason Wang <[email protected]>
> > Signed-off-by: Michael S. Tsirkin <[email protected]>

Acked-by: Asias He <[email protected]>

> > ---
>
> Tested-by: Jason Wang <[email protected]>
> Acked-by: Jason Wang <[email protected]>

--
Asias