LinuxLists.cc - [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

2016-11-30 12:11:56

Subject: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.

Signed-off-by: Yunjian Wang <[email protected]>
---
drivers/vhost/net.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5dc128a..edc470b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
pr_debug("Discarded rx packet: "
" len %d, expected %zd\n", err, sock_len);
vhost_discard_vq_desc(vq, headcount);
+ /* Don't continue to do, when meet errors. */
+ if (err < 0)
+ goto out;
continue;
}
/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
--
1.9.5.msysgit.1

2016-11-30 13:07:29

by Jason Wang

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On 2016年11月30日 20:10, Yunjian Wang wrote:
> When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>
> Signed-off-by: Yunjian Wang <[email protected]>
> ---
> drivers/vhost/net.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 5dc128a..edc470b 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> pr_debug("Discarded rx packet: "
> " len %d, expected %zd\n", err, sock_len);
> vhost_discard_vq_desc(vq, headcount);
> + /* Don't continue to do, when meet errors. */
> + if (err < 0)
> + goto out;
> continue;
> }
> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */

Acked-by: Jason Wang <[email protected]>

We may want to rename vhost_discard_vq_desc() in the future, since it
does not discard the desc in fact.

2016-11-30 13:41:15

by Michael S. Tsirkin

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> When we meet an error(err=-EBADFD) recvmsg,

How do you get EBADFD? Won't vhost_net_rx_peek_head_len
return 0 in this case, breaking the loop?

> the error handling in vhost
> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>
> Signed-off-by: Yunjian Wang <[email protected]>
> ---
> drivers/vhost/net.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 5dc128a..edc470b 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> pr_debug("Discarded rx packet: "
> " len %d, expected %zd\n", err, sock_len);
> vhost_discard_vq_desc(vq, headcount);
> + /* Don't continue to do, when meet errors. */
> + if (err < 0)
> + goto out;

You might get e.g. EAGAIN and I think you need to retry
in this case.

> continue;
> }
> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> --
> 1.9.5.msysgit.1
>

2016-11-30 13:44:02

by Michael S. Tsirkin

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On Wed, Nov 30, 2016 at 09:07:16PM +0800, Jason Wang wrote:
>
>
> On 2016年11月30日 20:10, Yunjian Wang wrote:
> > When we meet an error(err=-EBADFD) recvmsg, the error handling in vhost
> > handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> >
> > Signed-off-by: Yunjian Wang <[email protected]>
> > ---
> > drivers/vhost/net.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index 5dc128a..edc470b 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> > pr_debug("Discarded rx packet: "
> > " len %d, expected %zd\n", err, sock_len);
> > vhost_discard_vq_desc(vq, headcount);
> > + /* Don't continue to do, when meet errors. */
> > + if (err < 0)
> > + goto out;
> > continue;
> > }
> > /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>
> Acked-by: Jason Wang <[email protected]>
>
> We may want to rename vhost_discard_vq_desc() in the future, since it does
> not discard the desc in fact.

To me this looks a bit too aggressive. I would also like commit log
to explain better what is going on, to make sure we are
not just treating the symptoms.
--
MST

2016-12-01 02:49:18

by wangyunjian

[permalink] [raw]

Subject: RE: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

>-----Original Message-----
>From: Michael S. Tsirkin [mailto:[email protected]]
>Sent: Wednesday, November 30, 2016 9:41 PM
>To: wangyunjian
>Cc: [email protected]; [email protected]; [email protected]; caihe
>Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>
>On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>> When we meet an error(err=-EBADFD) recvmsg,
>
>How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>return 0 in this case, breaking the loop?

We started many guest VMs while attaching/detaching some virtio-net nics for loop.
The soft lockup might happened. The err is -EBADFD.

meesage log:
kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
call trace:
[<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
[<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
[<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
[<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
[<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
[<fffffff810a5c7f>]kthread+0xcf/0xe0
[<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
[<fffffff81648898>]ret_from_fork+0x58/0x90
[<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140

>> the error handling in vhost
>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>
>> Signed-off-by: Yunjian Wang <[email protected]>
>> ---
>> drivers/vhost/net.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 5dc128a..edc470b 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>> pr_debug("Discarded rx packet: "
>> " len %d, expected %zd\n", err, sock_len);
>> vhost_discard_vq_desc(vq, headcount);
>> + /* Don't continue to do, when meet errors. */
>> + if (err < 0)
>> + goto out;
>
>You might get e.g. EAGAIN and I think you need to retry
>in this case.
>
>> continue;
>> }
>> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>> --
>> 1.9.5.msysgit.1
>>

2016-12-01 03:13:20

by Jason Wang

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On 2016年12月01日 10:48, wangyunjian wrote:
>> -----Original Message-----
>> From: Michael S. Tsirkin [mailto:[email protected]]
>> Sent: Wednesday, November 30, 2016 9:41 PM
>> To: wangyunjian
>> Cc: [email protected]; [email protected]; [email protected]; caihe
>> Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>
>> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>> When we meet an error(err=-EBADFD) recvmsg,
>> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>> return 0 in this case, breaking the loop?
> We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> The soft lockup might happened. The err is -EBADFD.

How did you do the attaching/detaching? AFAIK, the -EBADFD can only
happens when you deleting tun device during vhost_net transmission.

>
> meesage log:
> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> call trace:
> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> [<fffffff810a5c7f>]kthread+0xcf/0xe0
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> [<fffffff81648898>]ret_from_fork+0x58/0x90
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
>
>>> the error handling in vhost
>>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>>
>>> Signed-off-by: Yunjian Wang <[email protected]>
>>> ---
>>> drivers/vhost/net.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>> index 5dc128a..edc470b 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>> pr_debug("Discarded rx packet: "
>>> " len %d, expected %zd\n", err, sock_len);
>>> vhost_discard_vq_desc(vq, headcount);
>>> + /* Don't continue to do, when meet errors. */
>>> + if (err < 0)
>>> + goto out;
>> You might get e.g. EAGAIN and I think you need to retry
>> in this case.
>>
>>> continue;
>>> }
>>> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>>> --
>>> 1.9.5.msysgit.1
>>>

2016-12-01 03:15:07

by Jason Wang

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On 2016年11月30日 21:40, Michael S. Tsirkin wrote:
> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>> When we meet an error(err=-EBADFD) recvmsg,
> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> return 0 in this case, breaking the loop?
>
>> the error handling in vhost
>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>
>> Signed-off-by: Yunjian Wang <[email protected]>
>> ---
>> drivers/vhost/net.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 5dc128a..edc470b 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>> pr_debug("Discarded rx packet: "
>> " len %d, expected %zd\n", err, sock_len);
>> vhost_discard_vq_desc(vq, headcount);
>> + /* Don't continue to do, when meet errors. */
>> + if (err < 0)
>> + goto out;
> You might get e.g. EAGAIN and I think you need to retry
> in this case.

Yes.

>> continue;
>> }
>> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>> --
>> 1.9.5.msysgit.1
>>

2016-12-01 03:21:07

by Michael S. Tsirkin

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>
> >-----Original Message-----
> >From: Michael S. Tsirkin [mailto:[email protected]]
> >Sent: Wednesday, November 30, 2016 9:41 PM
> >To: wangyunjian
> >Cc: [email protected]; [email protected]; [email protected]; caihe
> >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> >
> >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> >> When we meet an error(err=-EBADFD) recvmsg,
> >
> >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> >return 0 in this case, breaking the loop?
>
> We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> The soft lockup might happened. The err is -EBADFD.
>

OK, I'd like to figure out what happened here. why don't
we get 0 when we peek at the head?

EBADFD is from here:
struct tun_struct *tun = __tun_get(tfile);
...
if (!tun)
return -EBADFD;

but then:
static int tun_peek_len(struct socket *sock)
{

...

struct tun_struct *tun;
...

tun = __tun_get(tfile);
if (!tun)
return 0;

so peek len should return 0.

then while will exit:
while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
...

> meesage log:
> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> call trace:
> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> [<fffffff810a5c7f>]kthread+0xcf/0xe0
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> [<fffffff81648898>]ret_from_fork+0x58/0x90
> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140

So somehow you keep seeing something in tun when we peek.
IMO this is the problem we should try to fix.
Could you try debugging the root cause here?

> >> the error handling in vhost
> >> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> >>
> >> Signed-off-by: Yunjian Wang <[email protected]>
> >> ---
> >> drivers/vhost/net.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> >> index 5dc128a..edc470b 100644
> >> --- a/drivers/vhost/net.c
> >> +++ b/drivers/vhost/net.c
> >> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> >> pr_debug("Discarded rx packet: "
> >> " len %d, expected %zd\n", err, sock_len);
> >> vhost_discard_vq_desc(vq, headcount);
> >> + /* Don't continue to do, when meet errors. */
> >> + if (err < 0)
> >> + goto out;
> >
> >You might get e.g. EAGAIN and I think you need to retry
> >in this case.
> >
> >> continue;
> >> }
> >> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> >> --
> >> 1.9.5.msysgit.1
> >>

2016-12-01 03:26:31

by Jason Wang

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>>> -----Original Message-----
>>> From: Michael S. Tsirkin [mailto:[email protected]]
>>> Sent: Wednesday, November 30, 2016 9:41 PM
>>> To: wangyunjian
>>> Cc: [email protected]; [email protected]; [email protected]; caihe
>>> Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>>
>>> On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>>> When we meet an error(err=-EBADFD) recvmsg,
>>> How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>>> return 0 in this case, breaking the loop?
>> We started many guest VMs while attaching/detaching some virtio-net nics for loop.
>> The soft lockup might happened. The err is -EBADFD.
>>
> OK, I'd like to figure out what happened here. why don't
> we get 0 when we peek at the head?
>
> EBADFD is from here:
> struct tun_struct *tun = __tun_get(tfile);
> ...
> if (!tun)
> return -EBADFD;
>
> but then:
> static int tun_peek_len(struct socket *sock)
> {
>
> ...
>
> struct tun_struct *tun;
> ...
>
> tun = __tun_get(tfile);
> if (!tun)
> return 0;
>
>
> so peek len should return 0.
>
> then while will exit:
> while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
> ...
>

Consider this case: user do ip link del link tap0 before recvmsg() but
after tun_peek_len() ?

>> meesage log:
>> kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
>> call trace:
>> [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
>> [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
>> [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
>> [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
>> [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
>> [<fffffff810a5c7f>]kthread+0xcf/0xe0
>> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
>> [<fffffff81648898>]ret_from_fork+0x58/0x90
>> [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> So somehow you keep seeing something in tun when we peek.
> IMO this is the problem we should try to fix.
> Could you try debugging the root cause here?
>
>>>> the error handling in vhost
>>>> handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
>>>>
>>>> Signed-off-by: Yunjian Wang <[email protected]>
>>>> ---
>>>> drivers/vhost/net.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>> index 5dc128a..edc470b 100644
>>>> --- a/drivers/vhost/net.c
>>>> +++ b/drivers/vhost/net.c
>>>> @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
>>>> pr_debug("Discarded rx packet: "
>>>> " len %d, expected %zd\n", err, sock_len);
>>>> vhost_discard_vq_desc(vq, headcount);
>>>> + /* Don't continue to do, when meet errors. */
>>>> + if (err < 0)
>>>> + goto out;
>>> You might get e.g. EAGAIN and I think you need to retry
>>> in this case.
>>>
>>>> continue;
>>>> }
>>>> /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
>>>> --
>>>> 1.9.5.msysgit.1
>>>>

2016-12-01 03:35:07

by Michael S. Tsirkin

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
>
>
> On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> > On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
> > > > -----Original Message-----
> > > > From: Michael S. Tsirkin [mailto:[email protected]]
> > > > Sent: Wednesday, November 30, 2016 9:41 PM
> > > > To: wangyunjian
> > > > Cc: [email protected]; [email protected]; [email protected]; caihe
> > > > Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> > > >
> > > > On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> > > > > When we meet an error(err=-EBADFD) recvmsg,
> > > > How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> > > > return 0 in this case, breaking the loop?
> > > We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> > > The soft lockup might happened. The err is -EBADFD.
> > >
> > OK, I'd like to figure out what happened here. why don't
> > we get 0 when we peek at the head?
> >
> > EBADFD is from here:
> > struct tun_struct *tun = __tun_get(tfile);
> > ...
> > if (!tun)
> > return -EBADFD;
> >
> > but then:
> > static int tun_peek_len(struct socket *sock)
> > {
> >
> > ...
> >
> > struct tun_struct *tun;
> > ...
> > tun = __tun_get(tfile);
> > if (!tun)
> > return 0;
> >
> >
> > so peek len should return 0.
> >
> > then while will exit:
> > while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
> > ...
> >
>
> Consider this case: user do ip link del link tap0 before recvmsg() but after
> tun_peek_len() ?

Sure, this can happen, but I think we'll just exit on the next loop,
won't we?

> > > meesage log:
> > > kernel:[609608.510180]BUG: soft lockup - CPU#18 stuck for 23s! [vhost-60898:126093]
> > > call trace:
> > > [<fffffffa0132967>]vhost_get_vq_desc+0x1e7/0x984 [vhost]
> > > [<fffffffa02037e6>]handle_rx+0x226/0x810 [vhost_net]
> > > [<fffffffa0203de5>]handle_rx_net+0x15/0x20 [vhost_net]
> > > [<fffffffa013160b>]vhost_worker+0xfb/0x1e0 [vhost]
> > > [<fffffffa0131510>]? vhost_dev_reset_owner+0x50/0x50 [vhost]
> > > [<fffffff810a5c7f>]kthread+0xcf/0xe0
> > > [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> > > [<fffffff81648898>]ret_from_fork+0x58/0x90
> > > [<fffffff810a5bb0>]? kthread_create_on_node+0x140/0x140
> > So somehow you keep seeing something in tun when we peek.
> > IMO this is the problem we should try to fix.
> > Could you try debugging the root cause here?
> >
> > > > > the error handling in vhost
> > > > > handle_rx() will continue. This will cause a soft CPU lockup in vhost thread.
> > > > >
> > > > > Signed-off-by: Yunjian Wang <[email protected]>
> > > > > ---
> > > > > drivers/vhost/net.c | 3 +++
> > > > > 1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > > > index 5dc128a..edc470b 100644
> > > > > --- a/drivers/vhost/net.c
> > > > > +++ b/drivers/vhost/net.c
> > > > > @@ -717,6 +717,9 @@ static void handle_rx(struct vhost_net *net)
> > > > > pr_debug("Discarded rx packet: "
> > > > > " len %d, expected %zd\n", err, sock_len);
> > > > > vhost_discard_vq_desc(vq, headcount);
> > > > > + /* Don't continue to do, when meet errors. */
> > > > > + if (err < 0)
> > > > > + goto out;
> > > > You might get e.g. EAGAIN and I think you need to retry
> > > > in this case.
> > > >
> > > > > continue;
> > > > > }
> > > > > /* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
> > > > > --
> > > > > 1.9.5.msysgit.1
> > > > >

2016-12-01 03:37:08

by Jason Wang

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On 2016年12月01日 11:27, Michael S. Tsirkin wrote:
> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
>> >
>> >
>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>>>>> > > > >-----Original Message-----
>>>>> > > > >From: Michael S. Tsirkin [mailto:[email protected]]
>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM
>>>>> > > > >To: wangyunjian
>>>>> > > > >Cc:[email protected];[email protected];[email protected]; caihe
>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>>>>> > > > >
>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,
>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>>>>> > > > >return 0 in this case, breaking the loop?
>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.
>>>> > > >The soft lockup might happened. The err is -EBADFD.
>>>> > > >
>>> > >OK, I'd like to figure out what happened here. why don't
>>> > >we get 0 when we peek at the head?
>>> > >
>>> > >EBADFD is from here:
>>> > > struct tun_struct *tun = __tun_get(tfile);
>>> > >...
>>> > > if (!tun)
>>> > > return -EBADFD;
>>> > >
>>> > >but then:
>>> > >static int tun_peek_len(struct socket *sock)
>>> > >{
>>> > >
>>> > >...
>>> > >
>>> > > struct tun_struct *tun;
>>> > >...
>>> > > tun = __tun_get(tfile);
>>> > > if (!tun)
>>> > > return 0;
>>> > >
>>> > >
>>> > >so peek len should return 0.
>>> > >
>>> > >then while will exit:
>>> > > while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk)))
>>> > >...
>>> > >
>> >
>> >Consider this case: user do ip link del link tap0 before recvmsg() but after
>> >tun_peek_len() ?
> Sure, this can happen, but I think we'll just exit on the next loop,
> won't we?
>

Right, this is the only case I can image for -EBADFD, let's wait for the
author to the steps.

2016-12-01 04:42:12

by wangyunjian

[permalink] [raw]

Subject: RE: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

>-----Original Message-----
>From: Jason Wang [mailto:[email protected]]
>Sent: Thursday, December 01, 2016 11:37 AM
>To: Michael S. Tsirkin
>Cc: wangyunjian; [email protected]; [email protected]; caihe
>Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
>
>
>
>On 2016年12月01日 11:27, Michael S. Tsirkin wrote:
>> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
>>> >
>>> >
>>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
>>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
>>>>>> > > > >-----Original Message-----
>>>>>> > > > >From: Michael S. Tsirkin [mailto:[email protected]]
>>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM
>>>>>> > > > >To: wangyunjian
>>>>>> > > > >Cc:[email protected];[email protected];linux-kernel@
>>>>>> > > > >vger.kernel.org; caihe
>>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call
>>>>>> > > > >the recvmsg when meet errors
>>>>>> > > > >
>>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
>>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,
>>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
>>>>>> > > > >return 0 in this case, breaking the loop?
>>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.
>>>>> > > >The soft lockup might happened. The err is -EBADFD.
>>>>> > > >
>>>> > >OK, I'd like to figure out what happened here. why don't we get 0
>>>> > >when we peek at the head?
>>>> > >
>>>> > >EBADFD is from here:
>>>> > > struct tun_struct *tun = __tun_get(tfile); ...
>>>> > > if (!tun)
>>>> > > return -EBADFD;
>>>> > >
>>>> > >but then:
>>>> > >static int tun_peek_len(struct socket *sock) {
>>>> > >
>>>> > >...
>>>> > >
>>>> > > struct tun_struct *tun; ...
>>>> > > tun = __tun_get(tfile);
>>>> > > if (!tun)
>>>> > > return 0;
>>>> > >
>>>> > >
>>>> > >so peek len should return 0.
>>>> > >
>>>> > >then while will exit:
>>>> > > while ((sock_len = vhost_net_rx_peek_head_len(net,
>>>> > >sock->sk))) ...
>>>> > >
>>> >
>>> >Consider this case: user do ip link del link tap0 before recvmsg()
>>> >but after
>>> >tun_peek_len() ?
>> Sure, this can happen, but I think we'll just exit on the next loop,
>> won't we?
>>
>
>Right, this is the only case I can image for -EBADFD, let's wait for the author to the steps.
>

Thanks, I understand it don't happen in the latest kernel version.
My problem happened using kernel version 3.10.0-xx

The peek len willn't return 0.

static int peek_head_len(struct sock *sk)
{
struct sk_buff *head;
int len = 0;
unsigned long flags;

spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
head = skb_peek(&sk->sk_receive_queue);
if (likely(head)) {
len = head->len;
if (skb_vlan_tag_present(head))
len += VLAN_HLEN;
}

spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
return len;
}

2016-12-01 04:57:03

by Michael S. Tsirkin

[permalink] [raw]

Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors

On Thu, Dec 01, 2016 at 04:41:40AM +0000, wangyunjian wrote:
> >-----Original Message-----
> >From: Jason Wang [mailto:[email protected]]
> >Sent: Thursday, December 01, 2016 11:37 AM
> >To: Michael S. Tsirkin
> >Cc: wangyunjian; [email protected]; [email protected]; caihe
> >Subject: Re: [PATCH net] vhost_net: don't continue to call the recvmsg when meet errors
> >
> >
> >
> >On 2016年12月01日 11:27, Michael S. Tsirkin wrote:
> >> On Thu, Dec 01, 2016 at 11:26:21AM +0800, Jason Wang wrote:
> >>> >
> >>> >
> >>> >On 2016年12月01日 11:21, Michael S. Tsirkin wrote:
> >>>> > >On Thu, Dec 01, 2016 at 02:48:59AM +0000, wangyunjian wrote:
> >>>>>> > > > >-----Original Message-----
> >>>>>> > > > >From: Michael S. Tsirkin [mailto:[email protected]]
> >>>>>> > > > >Sent: Wednesday, November 30, 2016 9:41 PM
> >>>>>> > > > >To: wangyunjian
> >>>>>> > > > >Cc:[email protected];[email protected];linux-kernel@
> >>>>>> > > > >vger.kernel.org; caihe
> >>>>>> > > > >Subject: Re: [PATCH net] vhost_net: don't continue to call
> >>>>>> > > > >the recvmsg when meet errors
> >>>>>> > > > >
> >>>>>> > > > >On Wed, Nov 30, 2016 at 08:10:57PM +0800, Yunjian Wang wrote:
> >>>>>>> > > > > >When we meet an error(err=-EBADFD) recvmsg,
> >>>>>> > > > >How do you get EBADFD? Won't vhost_net_rx_peek_head_len
> >>>>>> > > > >return 0 in this case, breaking the loop?
> >>>>> > > >We started many guest VMs while attaching/detaching some virtio-net nics for loop.
> >>>>> > > >The soft lockup might happened. The err is -EBADFD.
> >>>>> > > >
> >>>> > >OK, I'd like to figure out what happened here. why don't we get 0
> >>>> > >when we peek at the head?
> >>>> > >
> >>>> > >EBADFD is from here:
> >>>> > > struct tun_struct *tun = __tun_get(tfile); ...
> >>>> > > if (!tun)
> >>>> > > return -EBADFD;
> >>>> > >
> >>>> > >but then:
> >>>> > >static int tun_peek_len(struct socket *sock) {
> >>>> > >
> >>>> > >...
> >>>> > >
> >>>> > > struct tun_struct *tun; ...
> >>>> > > tun = __tun_get(tfile);
> >>>> > > if (!tun)
> >>>> > > return 0;
> >>>> > >
> >>>> > >
> >>>> > >so peek len should return 0.
> >>>> > >
> >>>> > >then while will exit:
> >>>> > > while ((sock_len = vhost_net_rx_peek_head_len(net,
> >>>> > >sock->sk))) ...
> >>>> > >
> >>> >
> >>> >Consider this case: user do ip link del link tap0 before recvmsg()
> >>> >but after
> >>> >tun_peek_len() ?
> >> Sure, this can happen, but I think we'll just exit on the next loop,
> >> won't we?
> >>
> >
> >Right, this is the only case I can image for -EBADFD, let's wait for the author to the steps.
> >
>
> Thanks, I understand it don't happen in the latest kernel version.
> My problem happened using kernel version 3.10.0-xx
> The peek len willn't return 0.
>
> static int peek_head_len(struct sock *sk)
> {
> struct sk_buff *head;
> int len = 0;
> unsigned long flags;
>
> spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
> head = skb_peek(&sk->sk_receive_queue);

Should return NULL, should it not?
Maybe sk_receive_queue was not purged on detach back then.

> if (likely(head)) {
> len = head->len;
> if (skb_vlan_tag_present(head))
> len += VLAN_HLEN;
> }
>
> spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
> return len;
> }