2022-09-05 05:26:46

by Jason Wang

[permalink] [raw]
Subject: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed.

What's more important. This is a must for some vDPA parent to work
since control virtqueue is emulated via a workqueue for those parents.

Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Signed-off-by: Jason Wang <[email protected]>
---
drivers/net/virtio_net.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ece00b84e3a7..169368365d6a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
* into the hypervisor, so the request should be handled immediately.
*/
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
- !virtqueue_is_broken(vi->cvq))
+ !virtqueue_is_broken(vi->cvq)) {
+ cond_resched();
cpu_relax();
+ }

return vi->ctrl->status == VIRTIO_NET_OK;
}
--
2.25.1


2022-09-05 07:50:15

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> Adding cond_resched() to the command waiting loop for a better
> co-operation with the scheduler. This allows to give CPU a breath to
> run other task(workqueue) instead of busy looping when preemption is
> not allowed.
>
> What's more important. This is a must for some vDPA parent to work
> since control virtqueue is emulated via a workqueue for those parents.
>
> Fixes: bda324fd037a ("vdpasim: control virtqueue support")

That's a weird commit to fix. so it fixes the simulator?

> Signed-off-by: Jason Wang <[email protected]>
> ---
> drivers/net/virtio_net.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ece00b84e3a7..169368365d6a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> * into the hypervisor, so the request should be handled immediately.
> */
> while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> - !virtqueue_is_broken(vi->cvq))
> + !virtqueue_is_broken(vi->cvq)) {
> + cond_resched();
> cpu_relax();
> + }

with cond_resched do we still need cpu_relax?

> return vi->ctrl->status == VIRTIO_NET_OK;
> }
> --
> 2.25.1

2022-09-05 09:21:48

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > Adding cond_resched() to the command waiting loop for a better
> > co-operation with the scheduler. This allows to give CPU a breath to
> > run other task(workqueue) instead of busy looping when preemption is
> > not allowed.
> >
> > What's more important. This is a must for some vDPA parent to work
> > since control virtqueue is emulated via a workqueue for those parents.
> >
> > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>
> That's a weird commit to fix. so it fixes the simulator?

Yes, since the simulator is using a workqueue to handle control virtueue.

>
> > Signed-off-by: Jason Wang <[email protected]>
> > ---
> > drivers/net/virtio_net.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index ece00b84e3a7..169368365d6a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> > * into the hypervisor, so the request should be handled immediately.
> > */
> > while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > - !virtqueue_is_broken(vi->cvq))
> > + !virtqueue_is_broken(vi->cvq)) {
> > + cond_resched();
> > cpu_relax();
> > + }
>
> with cond_resched do we still need cpu_relax?

I think so, it's possible that cond_sched() just doesn't trigger scheduling.

Thanks

>
> > return vi->ctrl->status == VIRTIO_NET_OK;
> > }
> > --
> > 2.25.1
>

2022-09-06 11:20:04

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > Adding cond_resched() to the command waiting loop for a better
> > > co-operation with the scheduler. This allows to give CPU a breath to
> > > run other task(workqueue) instead of busy looping when preemption is
> > > not allowed.
> > >
> > > What's more important. This is a must for some vDPA parent to work
> > > since control virtqueue is emulated via a workqueue for those parents.
> > >
> > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> >
> > That's a weird commit to fix. so it fixes the simulator?
>
> Yes, since the simulator is using a workqueue to handle control virtueue.

Uhmm... touching a driver for a simulator's sake looks a little weird.

Additionally, if the bug is vdpasim, I think it's better to try to
solve it there, if possible.

Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
neither needs a process context, so perhaps you could rework it to run
the work_fn() directly from vdpasim_kick_vq(), at least for the control
virtqueue?

Thanks!

Paolo

2022-09-07 03:02:58

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > Adding cond_resched() to the command waiting loop for a better
> > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > run other task(workqueue) instead of busy looping when preemption is
> > > > not allowed.
> > > >
> > > > What's more important. This is a must for some vDPA parent to work
> > > > since control virtqueue is emulated via a workqueue for those parents.
> > > >
> > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > >
> > > That's a weird commit to fix. so it fixes the simulator?
> >
> > Yes, since the simulator is using a workqueue to handle control virtueue.
>
> Uhmm... touching a driver for a simulator's sake looks a little weird.

Simulator is not the only one that is using a workqueue (but should be
the first).

I can see that the mlx5 vDPA driver is using a workqueue as well (see
mlx5_vdpa_kick_vq()).

And in the case of VDUSE, it needs to wait for the response from the
userspace, this means cond_resched() is probably a must for the case
like UP.

>
> Additionally, if the bug is vdpasim, I think it's better to try to
> solve it there, if possible.
>
> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> neither needs a process context, so perhaps you could rework it to run
> the work_fn() directly from vdpasim_kick_vq(), at least for the control
> virtqueue?

It's possible (but require some rework on the simulator core). But
considering we have other similar use cases, it looks better to solve
it in the virtio-net driver.

Additionally, this may have better behaviour when using for the buggy
hardware (e.g the control virtqueue takes too long to respond). We may
consider switching to use interrupt/sleep in the future (but not
suitable for -net).

Thanks

>
> Thanks!
>
> Paolo
>

2022-09-07 07:47:47

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > >
> > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > not allowed.
> > > > >
> > > > > What's more important. This is a must for some vDPA parent to work
> > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > >
> > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > >
> > > > That's a weird commit to fix. so it fixes the simulator?
> > >
> > > Yes, since the simulator is using a workqueue to handle control virtueue.
> >
> > Uhmm... touching a driver for a simulator's sake looks a little weird.
>
> Simulator is not the only one that is using a workqueue (but should be
> the first).
>
> I can see that the mlx5 vDPA driver is using a workqueue as well (see
> mlx5_vdpa_kick_vq()).
>
> And in the case of VDUSE, it needs to wait for the response from the
> userspace, this means cond_resched() is probably a must for the case
> like UP.
>
> >
> > Additionally, if the bug is vdpasim, I think it's better to try to
> > solve it there, if possible.
> >
> > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > neither needs a process context, so perhaps you could rework it to run
> > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > virtqueue?
>
> It's possible (but require some rework on the simulator core). But
> considering we have other similar use cases, it looks better to solve
> it in the virtio-net driver.

I see.

> Additionally, this may have better behaviour when using for the buggy
> hardware (e.g the control virtqueue takes too long to respond). We may
> consider switching to use interrupt/sleep in the future (but not
> suitable for -net).

Agreed. Possibly a timeout could be useful, too.

Cheers,

Paolo

2022-09-07 07:58:08

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > >
> > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > not allowed.
> > > > > >
> > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > >
> > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > >
> > > > > That's a weird commit to fix. so it fixes the simulator?
> > > >
> > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > >
> > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> >
> > Simulator is not the only one that is using a workqueue (but should be
> > the first).
> >
> > I can see that the mlx5 vDPA driver is using a workqueue as well (see
> > mlx5_vdpa_kick_vq()).
> >
> > And in the case of VDUSE, it needs to wait for the response from the
> > userspace, this means cond_resched() is probably a must for the case
> > like UP.
> >
> > >
> > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > solve it there, if possible.
> > >
> > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > neither needs a process context, so perhaps you could rework it to run
> > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > virtqueue?
> >
> > It's possible (but require some rework on the simulator core). But
> > considering we have other similar use cases, it looks better to solve
> > it in the virtio-net driver.
>
> I see.
>
> > Additionally, this may have better behaviour when using for the buggy
> > hardware (e.g the control virtqueue takes too long to respond). We may
> > consider switching to use interrupt/sleep in the future (but not
> > suitable for -net).
>
> Agreed. Possibly a timeout could be useful, too.
>
> Cheers,
>
> Paolo


Hmm timeouts are kind of arbitrary.
regular drivers basically derive them from hardware
behaviour but with a generic driver like virtio it's harder.
I guess we could add timeout as a config field, have
device make a promise to the driver.

Making the wait interruptible seems more reasonable.

--
MST

2022-09-08 02:55:52

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop


在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
>> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
>>> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
>>>> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
>>>>> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
>>>>>> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
>>>>>>> Adding cond_resched() to the command waiting loop for a better
>>>>>>> co-operation with the scheduler. This allows to give CPU a breath to
>>>>>>> run other task(workqueue) instead of busy looping when preemption is
>>>>>>> not allowed.
>>>>>>>
>>>>>>> What's more important. This is a must for some vDPA parent to work
>>>>>>> since control virtqueue is emulated via a workqueue for those parents.
>>>>>>>
>>>>>>> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>>>>>> That's a weird commit to fix. so it fixes the simulator?
>>>>> Yes, since the simulator is using a workqueue to handle control virtueue.
>>>> Uhmm... touching a driver for a simulator's sake looks a little weird.
>>> Simulator is not the only one that is using a workqueue (but should be
>>> the first).
>>>
>>> I can see that the mlx5 vDPA driver is using a workqueue as well (see
>>> mlx5_vdpa_kick_vq()).
>>>
>>> And in the case of VDUSE, it needs to wait for the response from the
>>> userspace, this means cond_resched() is probably a must for the case
>>> like UP.
>>>
>>>> Additionally, if the bug is vdpasim, I think it's better to try to
>>>> solve it there, if possible.
>>>>
>>>> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
>>>> neither needs a process context, so perhaps you could rework it to run
>>>> the work_fn() directly from vdpasim_kick_vq(), at least for the control
>>>> virtqueue?
>>> It's possible (but require some rework on the simulator core). But
>>> considering we have other similar use cases, it looks better to solve
>>> it in the virtio-net driver.
>> I see.
>>
>>> Additionally, this may have better behaviour when using for the buggy
>>> hardware (e.g the control virtqueue takes too long to respond). We may
>>> consider switching to use interrupt/sleep in the future (but not
>>> suitable for -net).
>> Agreed. Possibly a timeout could be useful, too.
>>
>> Cheers,
>>
>> Paolo
>
> Hmm timeouts are kind of arbitrary.
> regular drivers basically derive them from hardware
> behaviour but with a generic driver like virtio it's harder.
> I guess we could add timeout as a config field, have
> device make a promise to the driver.
>
> Making the wait interruptible seems more reasonable.


Yes, but I think we still need this patch for -net and -stable.

Thanks


>

2022-09-08 05:43:54

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
>
> 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > not allowed.
> > > > > > > >
> > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > >
> > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > Simulator is not the only one that is using a workqueue (but should be
> > > > the first).
> > > >
> > > > I can see that the mlx5 vDPA driver is using a workqueue as well (see
> > > > mlx5_vdpa_kick_vq()).
> > > >
> > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > userspace, this means cond_resched() is probably a must for the case
> > > > like UP.
> > > >
> > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > solve it there, if possible.
> > > > >
> > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > virtqueue?
> > > > It's possible (but require some rework on the simulator core). But
> > > > considering we have other similar use cases, it looks better to solve
> > > > it in the virtio-net driver.
> > > I see.
> > >
> > > > Additionally, this may have better behaviour when using for the buggy
> > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > consider switching to use interrupt/sleep in the future (but not
> > > > suitable for -net).
> > > Agreed. Possibly a timeout could be useful, too.
> > >
> > > Cheers,
> > >
> > > Paolo
> >
> > Hmm timeouts are kind of arbitrary.
> > regular drivers basically derive them from hardware
> > behaviour but with a generic driver like virtio it's harder.
> > I guess we could add timeout as a config field, have
> > device make a promise to the driver.
> >
> > Making the wait interruptible seems more reasonable.
>
>
> Yes, but I think we still need this patch for -net and -stable.
>
> Thanks

I was referring to Paolo's idea of having a timeout.

--
MST

2022-10-09 07:12:18

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop


在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
>> 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
>>> On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
>>>> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
>>>>> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
>>>>>> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
>>>>>>> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
>>>>>>>> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
>>>>>>>>> Adding cond_resched() to the command waiting loop for a better
>>>>>>>>> co-operation with the scheduler. This allows to give CPU a breath to
>>>>>>>>> run other task(workqueue) instead of busy looping when preemption is
>>>>>>>>> not allowed.
>>>>>>>>>
>>>>>>>>> What's more important. This is a must for some vDPA parent to work
>>>>>>>>> since control virtqueue is emulated via a workqueue for those parents.
>>>>>>>>>
>>>>>>>>> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>>>>>>>> That's a weird commit to fix. so it fixes the simulator?
>>>>>>> Yes, since the simulator is using a workqueue to handle control virtueue.
>>>>>> Uhmm... touching a driver for a simulator's sake looks a little weird.
>>>>> Simulator is not the only one that is using a workqueue (but should be
>>>>> the first).
>>>>>
>>>>> I can see that the mlx5 vDPA driver is using a workqueue as well (see
>>>>> mlx5_vdpa_kick_vq()).
>>>>>
>>>>> And in the case of VDUSE, it needs to wait for the response from the
>>>>> userspace, this means cond_resched() is probably a must for the case
>>>>> like UP.
>>>>>
>>>>>> Additionally, if the bug is vdpasim, I think it's better to try to
>>>>>> solve it there, if possible.
>>>>>>
>>>>>> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
>>>>>> neither needs a process context, so perhaps you could rework it to run
>>>>>> the work_fn() directly from vdpasim_kick_vq(), at least for the control
>>>>>> virtqueue?
>>>>> It's possible (but require some rework on the simulator core). But
>>>>> considering we have other similar use cases, it looks better to solve
>>>>> it in the virtio-net driver.
>>>> I see.
>>>>
>>>>> Additionally, this may have better behaviour when using for the buggy
>>>>> hardware (e.g the control virtqueue takes too long to respond). We may
>>>>> consider switching to use interrupt/sleep in the future (but not
>>>>> suitable for -net).
>>>> Agreed. Possibly a timeout could be useful, too.
>>>>
>>>> Cheers,
>>>>
>>>> Paolo
>>> Hmm timeouts are kind of arbitrary.
>>> regular drivers basically derive them from hardware
>>> behaviour but with a generic driver like virtio it's harder.
>>> I guess we could add timeout as a config field, have
>>> device make a promise to the driver.
>>>
>>> Making the wait interruptible seems more reasonable.
>>
>> Yes, but I think we still need this patch for -net and -stable.
>>
>> Thanks
> I was referring to Paolo's idea of having a timeout.


Ok, I think we're fine with this patch. Any chance to merge this or do I
need to resend?

Thanks


>

2022-10-10 18:05:47

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
>
> 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > not allowed.
> > > > > > > > > >
> > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > >
> > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > the first).
> > > > > >
> > > > > > I can see that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > mlx5_vdpa_kick_vq()).
> > > > > >
> > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > like UP.
> > > > > >
> > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > solve it there, if possible.
> > > > > > >
> > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > virtqueue?
> > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > it in the virtio-net driver.
> > > > > I see.
> > > > >
> > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > suitable for -net).
> > > > > Agreed. Possibly a timeout could be useful, too.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Paolo
> > > > Hmm timeouts are kind of arbitrary.
> > > > regular drivers basically derive them from hardware
> > > > behaviour but with a generic driver like virtio it's harder.
> > > > I guess we could add timeout as a config field, have
> > > > device make a promise to the driver.
> > > >
> > > > Making the wait interruptible seems more reasonable.
> > >
> > > Yes, but I think we still need this patch for -net and -stable.
> > >
> > > Thanks
> > I was referring to Paolo's idea of having a timeout.
>
>
> Ok, I think we're fine with this patch. Any chance to merge this or do I
> need to resend?
>
> Thanks

Last question: do we want cpu_relax here now? Or is cond_resched
sufficient?

>
> >

2022-10-12 03:48:33

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Tue, Oct 11, 2022 at 1:11 AM Michael S. Tsirkin <[email protected]> wrote:
>
> On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
> >
> > 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > > not allowed.
> > > > > > > > > > >
> > > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > > >
> > > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > > the first).
> > > > > > >
> > > > > > > I can see that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > > mlx5_vdpa_kick_vq()).
> > > > > > >
> > > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > > like UP.
> > > > > > >
> > > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > > solve it there, if possible.
> > > > > > > >
> > > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > > virtqueue?
> > > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > > it in the virtio-net driver.
> > > > > > I see.
> > > > > >
> > > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > > suitable for -net).
> > > > > > Agreed. Possibly a timeout could be useful, too.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Paolo
> > > > > Hmm timeouts are kind of arbitrary.
> > > > > regular drivers basically derive them from hardware
> > > > > behaviour but with a generic driver like virtio it's harder.
> > > > > I guess we could add timeout as a config field, have
> > > > > device make a promise to the driver.
> > > > >
> > > > > Making the wait interruptible seems more reasonable.
> > > >
> > > > Yes, but I think we still need this patch for -net and -stable.
> > > >
> > > > Thanks
> > > I was referring to Paolo's idea of having a timeout.
> >
> >
> > Ok, I think we're fine with this patch. Any chance to merge this or do I
> > need to resend?
> >
> > Thanks
>
> Last question: do we want cpu_relax here now? Or is cond_resched
> sufficient?

(Have answered in another thread)

I think we need cpu_relax() since there could be no high priority task
in the current cpu so we still need to relax.

Thanks

>
> >
> > >
>

2022-10-17 07:39:46

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Wed, Oct 12, 2022 at 11:19 AM Jason Wang <[email protected]> wrote:
>
> On Tue, Oct 11, 2022 at 1:11 AM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
> > >
> > > 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > > > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <[email protected]> wrote:
> > > > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > > > not allowed.
> > > > > > > > > > > >
> > > > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > > > the first).
> > > > > > > >
> > > > > > > > I can see that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > > > mlx5_vdpa_kick_vq()).
> > > > > > > >
> > > > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > > > like UP.
> > > > > > > >
> > > > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > > > solve it there, if possible.
> > > > > > > > >
> > > > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > > > virtqueue?
> > > > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > > > it in the virtio-net driver.
> > > > > > > I see.
> > > > > > >
> > > > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > > > suitable for -net).
> > > > > > > Agreed. Possibly a timeout could be useful, too.
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Paolo
> > > > > > Hmm timeouts are kind of arbitrary.
> > > > > > regular drivers basically derive them from hardware
> > > > > > behaviour but with a generic driver like virtio it's harder.
> > > > > > I guess we could add timeout as a config field, have
> > > > > > device make a promise to the driver.
> > > > > >
> > > > > > Making the wait interruptible seems more reasonable.
> > > > >
> > > > > Yes, but I think we still need this patch for -net and -stable.
> > > > >
> > > > > Thanks
> > > > I was referring to Paolo's idea of having a timeout.
> > >
> > >
> > > Ok, I think we're fine with this patch. Any chance to merge this or do I
> > > need to resend?
> > >
> > > Thanks
> >
> > Last question: do we want cpu_relax here now? Or is cond_resched
> > sufficient?
>
> (Have answered in another thread)
>
> I think we need cpu_relax() since there could be no high priority task
> in the current cpu so we still need to relax.
>
> Thanks

Michael, does this answer make sense? If yes, would you like to ack the patch?

Thanks

>
> >
> > >
> > > >
> >

2022-10-17 20:26:24

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop

On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> Adding cond_resched() to the command waiting loop for a better
> co-operation with the scheduler. This allows to give CPU a breath to
> run other task(workqueue) instead of busy looping when preemption is
> not allowed.
>
> What's more important. This is a must for some vDPA parent to work
> since control virtqueue is emulated via a workqueue for those parents.
>
> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> Signed-off-by: Jason Wang <[email protected]>

Acked-by: Michael S. Tsirkin <[email protected]>

> ---
> drivers/net/virtio_net.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ece00b84e3a7..169368365d6a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> * into the hypervisor, so the request should be handled immediately.
> */
> while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> - !virtqueue_is_broken(vi->cvq))
> + !virtqueue_is_broken(vi->cvq)) {
> + cond_resched();
> cpu_relax();
> + }
>
> return vi->ctrl->status == VIRTIO_NET_OK;
> }
> --
> 2.25.1