2023-04-27 05:01:53

by Wenliang Wang

Subject: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

With many queues and a large rx ring size, the following error
occurs in free_unused_bufs():
rcu: INFO: rcu_sched self-detected stall on CPU.

Signed-off-by: Wenliang Wang <[email protected]>
---
drivers/net/virtio_net.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ea1bd4bb326d..21d8382fd2c7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
 		struct virtqueue *vq = vi->rq[i].vq;
 		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
 			virtnet_rq_free_unused_buf(vq, buf);
+		schedule();
 	}
 }

--
2.20.1


2023-04-27 06:28:36

by Xuan Zhuo

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> For multi-queue and large rx-ring-size use case, the following error

Could you give us a concrete number as an example?

> occurred when free_unused_bufs:
> rcu: INFO: rcu_sched self-detected stall on CPU.
>
> Signed-off-by: Wenliang Wang <[email protected]>
> ---
> drivers/net/virtio_net.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ea1bd4bb326d..21d8382fd2c7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> struct virtqueue *vq = vi->rq[i].vq;
> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> virtnet_rq_free_unused_buf(vq, buf);
> + schedule();

Just for rq?

Do we need to do the same thing for sq?

Thanks.


> }
> }
>
> --
> 2.20.1
>

2023-04-27 07:08:03

by Wenliang Wang

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs



On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
>> For multi-queue and large rx-ring-size use case, the following error
>
> Cound you give we one number for example?

128 queues and 16K queue_size is typical.

>
>> occurred when free_unused_bufs:
>> rcu: INFO: rcu_sched self-detected stall on CPU.
>>
>> Signed-off-by: Wenliang Wang <[email protected]>
>> ---
>> drivers/net/virtio_net.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index ea1bd4bb326d..21d8382fd2c7 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
>> struct virtqueue *vq = vi->rq[i].vq;
>> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
>> virtnet_rq_free_unused_buf(vq, buf);
>> + schedule();
>
> Just for rq?
>
> Do we need to do the same thing for sq?
Rq buffers are pre-allocated, so freeing the unused rq buffers takes seconds.

Sq has far fewer unused buffers, so doing the same for sq is optional.
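
For a sense of scale, here is a rough userspace sketch of the buffer count involved. It assumes "16K" means 16384 descriptors per ring and that every rx ring is fully populated; the helper name is made up for illustration and is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on rx buffers to free, assuming
 * every ring is completely filled with pre-allocated buffers. */
static size_t total_rx_buffers(size_t queues, size_t ring_size)
{
	/* Guard against overflow in the multiplication. */
	assert(ring_size == 0 || queues <= SIZE_MAX / ring_size);
	return queues * ring_size;
}
```

Under those assumptions, total_rx_buffers(128, 16384) comes to 2097152, i.e. about two million buffers freed by a loop that has no yield point in the unpatched code.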

>
> Thanks.
>
>
>> }
>> }
>>
>> --
>> 2.20.1
>>

2023-04-27 07:25:24

by Xuan Zhuo

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
>
>
> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> > On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> >> For multi-queue and large rx-ring-size use case, the following error
> >
> > Cound you give we one number for example?
>
> 128 queues and 16K queue_size is typical.
>
> >
> >> occurred when free_unused_bufs:
> >> rcu: INFO: rcu_sched self-detected stall on CPU.
> >>
> >> Signed-off-by: Wenliang Wang <[email protected]>
> >> ---
> >> drivers/net/virtio_net.c | 1 +
> >> 1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index ea1bd4bb326d..21d8382fd2c7 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> >> struct virtqueue *vq = vi->rq[i].vq;
> >> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> >> virtnet_rq_free_unused_buf(vq, buf);
> >> + schedule();
> >
> > Just for rq?
> >
> > Do we need to do the same thing for sq?
> Rq buffers are pre-allocated, take seconds to free rq unused buffers.
>
> Sq unused buffers are much less, so do the same for sq is optional.

Got it.

I think we should look for an approach that also works well with fewer queues
or smaller rings. Calling schedule() directly may not be a good way.

Thanks.


>
> >
> > Thanks.
> >
> >
> >> }
> >> }
> >>
> >> --
> >> 2.20.1
> >>

2023-04-27 08:16:53

by Michael S. Tsirkin

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
> >
> >
> > On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> > > On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> > >> For multi-queue and large rx-ring-size use case, the following error
> > >
> > > Cound you give we one number for example?
> >
> > 128 queues and 16K queue_size is typical.
> >
> > >
> > >> occurred when free_unused_bufs:
> > >> rcu: INFO: rcu_sched self-detected stall on CPU.
> > >>
> > >> Signed-off-by: Wenliang Wang <[email protected]>
> > >> ---
> > >> drivers/net/virtio_net.c | 1 +
> > >> 1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > >> index ea1bd4bb326d..21d8382fd2c7 100644
> > >> --- a/drivers/net/virtio_net.c
> > >> +++ b/drivers/net/virtio_net.c
> > >> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> > >> struct virtqueue *vq = vi->rq[i].vq;
> > >> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > >> virtnet_rq_free_unused_buf(vq, buf);
> > >> + schedule();
> > >
> > > Just for rq?
> > >
> > > Do we need to do the same thing for sq?
> > Rq buffers are pre-allocated, take seconds to free rq unused buffers.
> >
> > Sq unused buffers are much less, so do the same for sq is optional.
>
> I got.
>
> I think we should look for a way, compatible with the less queues or the smaller
> rings. Calling schedule() directly may be not a good way.
>
> Thanks.

Why isn't it a good way?

>
> >
> > >
> > > Thanks.
> > >
> > >
> > >> }
> > >> }
> > >>
> > >> --
> > >> 2.20.1
> > >>

2023-04-27 08:30:13

by Michael S. Tsirkin

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
> > On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> > > On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
> > > >
> > > >
> > > > On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> > > > > On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> > > > >> For multi-queue and large rx-ring-size use case, the following error
> > > > >
> > > > > Cound you give we one number for example?
> > > >
> > > > 128 queues and 16K queue_size is typical.
> > > >
> > > > >
> > > > >> occurred when free_unused_bufs:
> > > > >> rcu: INFO: rcu_sched self-detected stall on CPU.
> > > > >>
> > > > >> Signed-off-by: Wenliang Wang <[email protected]>
> > > > >> ---
> > > > >> drivers/net/virtio_net.c | 1 +
> > > > >> 1 file changed, 1 insertion(+)
> > > > >>
> > > > >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > >> index ea1bd4bb326d..21d8382fd2c7 100644
> > > > >> --- a/drivers/net/virtio_net.c
> > > > >> +++ b/drivers/net/virtio_net.c
> > > > >> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> > > > >> struct virtqueue *vq = vi->rq[i].vq;
> > > > >> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > > > >> virtnet_rq_free_unused_buf(vq, buf);
> > > > >> + schedule();
> > > > >
> > > > > Just for rq?
> > > > >
> > > > > Do we need to do the same thing for sq?
> > > > Rq buffers are pre-allocated, take seconds to free rq unused buffers.
> > > >
> > > > Sq unused buffers are much less, so do the same for sq is optional.
> > >
> > > I got.
> > >
> > > I think we should look for a way, compatible with the less queues or the smaller
> > > rings. Calling schedule() directly may be not a good way.
> > >
> > > Thanks.
> >
> > Why isn't it a good way?
>
> For the small ring, I don't think it is a good way, maybe we only deal with one
> buf, then call schedule().
>
> We can call the schedule() after processing a certain number of buffers,
> or check need_resched () first.
>
> Thanks.


Wenliang, does
	if (need_resched())
		schedule();
fix the issue for you?


>
>
> >
> > >
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > >> }
> > > > >> }
> > > > >>
> > > > >> --
> > > > >> 2.20.1
> > > > >>
> >

2023-04-27 08:30:28

by Xuan Zhuo

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> > On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
> > >
> > >
> > > On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> > > > On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> > > >> For multi-queue and large rx-ring-size use case, the following error
> > > >
> > > > Cound you give we one number for example?
> > >
> > > 128 queues and 16K queue_size is typical.
> > >
> > > >
> > > >> occurred when free_unused_bufs:
> > > >> rcu: INFO: rcu_sched self-detected stall on CPU.
> > > >>
> > > >> Signed-off-by: Wenliang Wang <[email protected]>
> > > >> ---
> > > >> drivers/net/virtio_net.c | 1 +
> > > >> 1 file changed, 1 insertion(+)
> > > >>
> > > >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > >> index ea1bd4bb326d..21d8382fd2c7 100644
> > > >> --- a/drivers/net/virtio_net.c
> > > >> +++ b/drivers/net/virtio_net.c
> > > >> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> > > >> struct virtqueue *vq = vi->rq[i].vq;
> > > >> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > > >> virtnet_rq_free_unused_buf(vq, buf);
> > > >> + schedule();
> > > >
> > > > Just for rq?
> > > >
> > > > Do we need to do the same thing for sq?
> > > Rq buffers are pre-allocated, take seconds to free rq unused buffers.
> > >
> > > Sq unused buffers are much less, so do the same for sq is optional.
> >
> > I got.
> >
> > I think we should look for a way, compatible with the less queues or the smaller
> > rings. Calling schedule() directly may be not a good way.
> >
> > Thanks.
>
> Why isn't it a good way?

For a small ring, I don't think it is a good way: we might process only one
buf and then call schedule().

We can call schedule() after processing a certain number of buffers,
or check need_resched() first.

Thanks.



>
> >
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > >> }
> > > >> }
> > > >>
> > > >> --
> > > >> 2.20.1
> > > >>
>

2023-04-27 08:54:58

by Wenliang Wang

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On 4/27/23 4:23 PM, Michael S. Tsirkin wrote:
> On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
>> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
>>> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
>>>> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
>>>>>> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
>>>>>>> For multi-queue and large rx-ring-size use case, the following error
>>>>>>
>>>>>> Cound you give we one number for example?
>>>>>
>>>>> 128 queues and 16K queue_size is typical.
>>>>>
>>>>>>
>>>>>>> occurred when free_unused_bufs:
>>>>>>> rcu: INFO: rcu_sched self-detected stall on CPU.
>>>>>>>
>>>>>>> Signed-off-by: Wenliang Wang <[email protected]>
>>>>>>> ---
>>>>>>> drivers/net/virtio_net.c | 1 +
>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index ea1bd4bb326d..21d8382fd2c7 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
>>>>>>> struct virtqueue *vq = vi->rq[i].vq;
>>>>>>> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
>>>>>>> virtnet_rq_free_unused_buf(vq, buf);
>>>>>>> + schedule();
>>>>>>
>>>>>> Just for rq?
>>>>>>
>>>>>> Do we need to do the same thing for sq?
>>>>> Rq buffers are pre-allocated, take seconds to free rq unused buffers.
>>>>>
>>>>> Sq unused buffers are much less, so do the same for sq is optional.
>>>>
>>>> I got.
>>>>
>>>> I think we should look for a way, compatible with the less queues or the smaller
>>>> rings. Calling schedule() directly may be not a good way.
>>>>
>>>> Thanks.
>>>
>>> Why isn't it a good way?
>>
>> For the small ring, I don't think it is a good way, maybe we only deal with one
>> buf, then call schedule().
>>
>> We can call the schedule() after processing a certain number of buffers,
>> or check need_resched () first.
>>
>> Thanks.
>
>
> Wenliang, does
> if (need_resched())
> schedule();
> fix the issue for you?
>
Yeah, it works better.
>
>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> --
>>>>>>> 2.20.1
>>>>>>>
>>>
>

2023-04-27 09:02:42

by Xuan Zhuo

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, 27 Apr 2023 16:49:58 +0800, Wenliang Wang <[email protected]> wrote:
> On 4/27/23 4:23 PM, Michael S. Tsirkin wrote:
> > On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
> >> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
> >>> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> >>>> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
> >>>>>
> >>>>>
> >>>>> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> >>>>>> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> >>>>>>> For multi-queue and large rx-ring-size use case, the following error
> >>>>>>
> >>>>>> Cound you give we one number for example?
> >>>>>
> >>>>> 128 queues and 16K queue_size is typical.
> >>>>>
> >>>>>>
> >>>>>>> occurred when free_unused_bufs:
> >>>>>>> rcu: INFO: rcu_sched self-detected stall on CPU.
> >>>>>>>
> >>>>>>> Signed-off-by: Wenliang Wang <[email protected]>
> >>>>>>> ---
> >>>>>>> drivers/net/virtio_net.c | 1 +
> >>>>>>> 1 file changed, 1 insertion(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>>>>> index ea1bd4bb326d..21d8382fd2c7 100644
> >>>>>>> --- a/drivers/net/virtio_net.c
> >>>>>>> +++ b/drivers/net/virtio_net.c
> >>>>>>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> >>>>>>> struct virtqueue *vq = vi->rq[i].vq;
> >>>>>>> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> >>>>>>> virtnet_rq_free_unused_buf(vq, buf);
> >>>>>>> + schedule();
> >>>>>>
> >>>>>> Just for rq?
> >>>>>>
> >>>>>> Do we need to do the same thing for sq?
> >>>>> Rq buffers are pre-allocated, take seconds to free rq unused buffers.
> >>>>>
> >>>>> Sq unused buffers are much less, so do the same for sq is optional.
> >>>>
> >>>> I got.
> >>>>
> >>>> I think we should look for a way, compatible with the less queues or the smaller
> >>>> rings. Calling schedule() directly may be not a good way.
> >>>>
> >>>> Thanks.
> >>>
> >>> Why isn't it a good way?
> >>
> >> For the small ring, I don't think it is a good way, maybe we only deal with one
> >> buf, then call schedule().
> >>
> >> We can call the schedule() after processing a certain number of buffers,
> >> or check need_resched () first.
> >>
> >> Thanks.
> >
> >
> > Wenliang, does
> > if (need_resched())
> > schedule();
> > fix the issue for you?
> >
> Yeah, it works better.

I prefer to use it in combination with a fixed number (such as 256):
every time 256 buffers are processed, check need_resched().
This accommodates both large rings and small rings.
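
A minimal userspace sketch of this batching idea follows. The fake_* functions and the counter are hypothetical stand-ins for the kernel's need_resched() and schedule(), and the loop body only models detaching and freeing one buffer; it is an illustration of the shape, not the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; in the kernel these would be
 * need_resched() and schedule(). */
static size_t resched_checks;

static int fake_need_resched(void)
{
	resched_checks++;
	return 0; /* pretend no other task is waiting */
}

static void fake_schedule(void)
{
}

/* Drain nbufs buffers, polling for a reschedule once per `batch`
 * buffers. Returns how many times the poll ran. */
static size_t drain_batched(size_t nbufs, size_t batch)
{
	size_t done = 0;

	assert(batch > 0);
	resched_checks = 0;
	while (done < nbufs) {
		done++; /* stands in for detaching and freeing one buffer */
		if (done % batch == 0 && fake_need_resched())
			fake_schedule();
	}
	return resched_checks;
}
```

With a batch of 256, a 16384-entry ring is polled 64 times, while a ring holding fewer than 256 buffers is never polled at all.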

Also, it is necessary to add similar logic to sq. Although the probability is
low, the same problem could occur there.

Thanks.

> >
> >>
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>>
> >>>>>>> }
> >>>>>>> }
> >>>>>>>
> >>>>>>> --
> >>>>>>> 2.20.1
> >>>>>>>
> >>>
> >

2023-04-27 10:52:27

by Qi Zheng

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs



On 2023/4/27 16:23, Michael S. Tsirkin wrote:
> On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
>> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
>>> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
>>>> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
>>>>>
>>>>>
>>>>> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
>>>>>> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
>>>>>>> For multi-queue and large rx-ring-size use case, the following error
>>>>>>
>>>>>> Cound you give we one number for example?
>>>>>
>>>>> 128 queues and 16K queue_size is typical.
>>>>>
>>>>>>
>>>>>>> occurred when free_unused_bufs:
>>>>>>> rcu: INFO: rcu_sched self-detected stall on CPU.
>>>>>>>
>>>>>>> Signed-off-by: Wenliang Wang <[email protected]>
>>>>>>> ---
>>>>>>> drivers/net/virtio_net.c | 1 +
>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index ea1bd4bb326d..21d8382fd2c7 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
>>>>>>> struct virtqueue *vq = vi->rq[i].vq;
>>>>>>> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
>>>>>>> virtnet_rq_free_unused_buf(vq, buf);
>>>>>>> + schedule();
>>>>>>
>>>>>> Just for rq?
>>>>>>
>>>>>> Do we need to do the same thing for sq?
>>>>> Rq buffers are pre-allocated, take seconds to free rq unused buffers.
>>>>>
>>>>> Sq unused buffers are much less, so do the same for sq is optional.
>>>>
>>>> I got.
>>>>
>>>> I think we should look for a way, compatible with the less queues or the smaller
>>>> rings. Calling schedule() directly may be not a good way.
>>>>
>>>> Thanks.
>>>
>>> Why isn't it a good way?
>>
>> For the small ring, I don't think it is a good way, maybe we only deal with one
>> buf, then call schedule().
>>
>> We can call the schedule() after processing a certain number of buffers,
>> or check need_resched () first.
>>
>> Thanks.
>
>
> Wenliang, does
> if (need_resched())
> schedule();

Can we just use cond_resched()?

> fix the issue for you?
>
>
>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> --
>>>>>>> 2.20.1
>>>>>>>
>>>
>

--
Thanks,
Qi

2023-04-27 10:53:40

by Wenliang Wang

Subject: [PATCH v2] virtio_net: suppress cpu stall when free_unused_bufs

With many queues and a large ring size, the following error
occurs in free_unused_bufs():
rcu: INFO: rcu_sched self-detected stall on CPU.

Signed-off-by: Wenliang Wang <[email protected]>
---
v2:
-add need_resched check.
-apply same logic to sq.
---
drivers/net/virtio_net.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ea1bd4bb326d..573558b69a60 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3559,12 +3559,16 @@ static void free_unused_bufs(struct virtnet_info *vi)
 		struct virtqueue *vq = vi->sq[i].vq;
 		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
 			virtnet_sq_free_unused_buf(vq, buf);
+		if (need_resched())
+			schedule();
 	}
 
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		struct virtqueue *vq = vi->rq[i].vq;
 		while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
 			virtnet_rq_free_unused_buf(vq, buf);
+		if (need_resched())
+			schedule();
 	}
 }

--
2.20.1

2023-04-28 01:23:11

by Michael S. Tsirkin

Subject: Re: [PATCH v2] virtio_net: suppress cpu stall when free_unused_bufs

On Thu, Apr 27, 2023 at 06:46:18PM +0800, Wenliang Wang wrote:
> For multi-queue and large ring-size use case, the following error
> occurred when free_unused_bufs:
> rcu: INFO: rcu_sched self-detected stall on CPU.
>
> Signed-off-by: Wenliang Wang <[email protected]>

Please send vN+1 as a new thread, not as a reply in the existing thread of vN.

> ---
> v2:
> -add need_resched check.
> -apply same logic to sq.
> ---
> drivers/net/virtio_net.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ea1bd4bb326d..573558b69a60 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3559,12 +3559,16 @@ static void free_unused_bufs(struct virtnet_info *vi)
> struct virtqueue *vq = vi->sq[i].vq;
> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> virtnet_sq_free_unused_buf(vq, buf);
> + if (need_resched())
> + schedule();
> }
>
> for (i = 0; i < vi->max_queue_pairs; i++) {
> struct virtqueue *vq = vi->rq[i].vq;
> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> virtnet_rq_free_unused_buf(vq, buf);
> + if (need_resched())
> + schedule();
> }
> }
>
> --
> 2.20.1

2023-04-28 14:07:13

by Willem de Bruijn

Subject: Re: [PATCH] virtio_net: suppress cpu stall when free_unused_bufs

Qi Zheng wrote:
>
>
> On 2023/4/27 16:23, Michael S. Tsirkin wrote:
> > On Thu, Apr 27, 2023 at 04:13:45PM +0800, Xuan Zhuo wrote:
> >> On Thu, 27 Apr 2023 04:12:44 -0400, "Michael S. Tsirkin" <[email protected]> wrote:
> >>> On Thu, Apr 27, 2023 at 03:13:44PM +0800, Xuan Zhuo wrote:
> >>>> On Thu, 27 Apr 2023 15:02:26 +0800, Wenliang Wang <[email protected]> wrote:
> >>>>>
> >>>>>
> >>>>> On 4/27/23 2:20 PM, Xuan Zhuo wrote:
> >>>>>> On Thu, 27 Apr 2023 12:34:33 +0800, Wenliang Wang <[email protected]> wrote:
> >>>>>>> For multi-queue and large rx-ring-size use case, the following error
> >>>>>>
> >>>>>> Cound you give we one number for example?
> >>>>>
> >>>>> 128 queues and 16K queue_size is typical.
> >>>>>
> >>>>>>
> >>>>>>> occurred when free_unused_bufs:
> >>>>>>> rcu: INFO: rcu_sched self-detected stall on CPU.
> >>>>>>>
> >>>>>>> Signed-off-by: Wenliang Wang <[email protected]>
> >>>>>>> ---
> >>>>>>> drivers/net/virtio_net.c | 1 +
> >>>>>>> 1 file changed, 1 insertion(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>>>>> index ea1bd4bb326d..21d8382fd2c7 100644
> >>>>>>> --- a/drivers/net/virtio_net.c
> >>>>>>> +++ b/drivers/net/virtio_net.c
> >>>>>>> @@ -3565,6 +3565,7 @@ static void free_unused_bufs(struct virtnet_info *vi)
> >>>>>>> struct virtqueue *vq = vi->rq[i].vq;
> >>>>>>> while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> >>>>>>> virtnet_rq_free_unused_buf(vq, buf);
> >>>>>>> + schedule();
> >>>>>>
> >>>>>> Just for rq?
> >>>>>>
> >>>>>> Do we need to do the same thing for sq?
> >>>>> Rq buffers are pre-allocated, take seconds to free rq unused buffers.
> >>>>>
> >>>>> Sq unused buffers are much less, so do the same for sq is optional.
> >>>>
> >>>> I got.
> >>>>
> >>>> I think we should look for a way, compatible with the less queues or the smaller
> >>>> rings. Calling schedule() directly may be not a good way.
> >>>>
> >>>> Thanks.
> >>>
> >>> Why isn't it a good way?
> >>
> >> For the small ring, I don't think it is a good way, maybe we only deal with one
> >> buf, then call schedule().
> >>
> >> We can call the schedule() after processing a certain number of buffers,
> >> or check need_resched () first.
> >>
> >> Thanks.
> >
> >
> > Wenliang, does
> > if (need_resched())
> > schedule();
>
> Can we just use cond_resched()?

I believe that is preferred. But v2 still calls schedule() directly.
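
For illustration, a rough userspace model of what such a conditional yield point looks like when placed once per queue, as v2 does with its open-coded check. Everything here is a hypothetical stand-in (the flag, the counter, model_cond_resched(), drain_queues()); the real cond_resched() is a kernel helper and is only imitated in shape: yield if a reschedule is pending, and report whether that happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for scheduler state; not kernel APIs. */
static bool resched_pending;
static size_t yields;

/* Models the shape of cond_resched(): yield only if a reschedule is
 * pending, and report whether it yielded. */
static int model_cond_resched(void)
{
	if (!resched_pending)
		return 0;
	resched_pending = false; /* yielding clears the pending request */
	yields++;
	return 1;
}

/* Drain nqueues queues; a reschedule request arrives while queue
 * `busy_queue` is being drained. Returns the number of yields. */
static size_t drain_queues(size_t nqueues, size_t busy_queue)
{
	assert(nqueues > 0);
	yields = 0;
	resched_pending = false;
	for (size_t i = 0; i < nqueues; i++) {
		/* ... detach and free every unused buffer of queue i ... */
		if (i == busy_queue)
			resched_pending = true;
		model_cond_resched(); /* one yield point per queue */
	}
	return yields;
}
```

In this model the single call replaces the two-line need_resched()/schedule() pair: a request raised during one queue produces exactly one yield, and no request produces none.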