Hi all:
The code used to busy poll for cvq command which turns out to have
several side effects:
1) infinite poll for buggy devices
2) bad interaction with scheduler
So this series tries to use sleep instead of busy polling. In this
version, I take a step back: the hardening part is not implemented and
is left for future investigation. We used to agree to use interruptible
sleep, but it doesn't work for a general workqueue.
Please review.
Thanks
Changes since V1:
- use RTNL to synchronize rx mode worker
- use completion for simplicity
- don't try to harden CVQ command
Changes since RFC:
- switch to use BAD_RING in virtio_break_device()
- check virtqueue_is_broken() after being woken up
- use more_used() instead of virtqueue_get_buf() to allow caller to
get buffers afterwards
- break the virtio-net device on timeout
- get buffer manually since the virtio core checks more_used() instead
Jason Wang (2):
virtio-net: convert rx mode setting to use workqueue
virtio-net: sleep instead of busy waiting for cvq command
drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
1 file changed, 66 insertions(+), 10 deletions(-)
--
2.25.1
Hi Jason,
On 4/13/23 08:40, Jason Wang wrote:
> Hi all:
>
> The code used to busy poll for cvq command which turns out to have
> several side effects:
>
> 1) infinite poll for buggy devices
> 2) bad interaction with scheduler
>
> So this series tries to use sleep instead of busy polling. In this
> version, I take a step back: the hardening part is not implemented and
> is left for future investigation. We used to agree to use interruptible
> sleep, but it doesn't work for a general workqueue.
>
> Please review.
Thanks for working on this.
My DPDK VDUSE RFC failed to set the interrupt; as Xuan Zhou highlighted,
this makes the vdpa dev add/del commands freeze:
[<0>] device_del+0x37/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] unregister_virtio_device+0x11/0x20
[<0>] device_release_driver_internal+0x193/0x200
[<0>] bus_remove_device+0xbf/0x130
[<0>] device_del+0x174/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
[<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
[<0>] genl_rcv_msg+0x151/0x290
[<0>] netlink_rcv_skb+0x54/0x100
[<0>] genl_rcv+0x24/0x40
[<0>] netlink_unicast+0x217/0x340
[<0>] netlink_sendmsg+0x23e/0x4a0
[<0>] sock_sendmsg+0x8f/0xa0
[<0>] __sys_sendto+0xfc/0x170
[<0>] __x64_sys_sendto+0x20/0x30
[<0>] do_syscall_64+0x59/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
testing), it works fine:
Tested-by: Maxime Coquelin <[email protected]>
For the potential missing interrupt with non-compliant devices, I guess
it could be handled by the hardening work, as the same thing could
happen if the VDUSE application crashed, for example.
Regards,
Maxime
[0]:
> Thanks
>
> Changes since V1:
> - use RTNL to synchronize rx mode worker
> - use completion for simplicity
> - don't try to harden CVQ command
>
> Changes since RFC:
>
> - switch to use BAD_RING in virtio_break_device()
> - check virtqueue_is_broken() after being woken up
> - use more_used() instead of virtqueue_get_buf() to allow caller to
> get buffers afterwards
> - break the virtio-net device on timeout
> - get buffer manually since the virtio core checks more_used() instead
>
> Jason Wang (2):
> virtio-net: convert rx mode setting to use workqueue
> virtio-net: sleep instead of busy waiting for cvq command
>
> drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
> 1 file changed, 66 insertions(+), 10 deletions(-)
>
On 4/13/23 15:02, Maxime Coquelin wrote:
> Hi Jason,
>
> On 4/13/23 08:40, Jason Wang wrote:
>> Hi all:
>>
>> The code used to busy poll for cvq command which turns out to have
>> several side effects:
>>
>> 1) infinite poll for buggy devices
>> 2) bad interaction with scheduler
>>
>> So this series tries to use sleep instead of busy polling. In this
>> version, I take a step back: the hardening part is not implemented and
>> is left for future investigation. We used to agree to use interruptible
>> sleep, but it doesn't work for a general workqueue.
>>
>> Please review.
>
> Thanks for working on this.
> My DPDK VDUSE RFC failed to set the interrupt; as Xuan Zhou highlighted,
> this makes the vdpa dev add/del commands freeze:
> [<0>] device_del+0x37/0x3d0
> [<0>] device_unregister+0x13/0x60
> [<0>] unregister_virtio_device+0x11/0x20
> [<0>] device_release_driver_internal+0x193/0x200
> [<0>] bus_remove_device+0xbf/0x130
> [<0>] device_del+0x174/0x3d0
> [<0>] device_unregister+0x13/0x60
> [<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
> [<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
> [<0>] genl_rcv_msg+0x151/0x290
> [<0>] netlink_rcv_skb+0x54/0x100
> [<0>] genl_rcv+0x24/0x40
> [<0>] netlink_unicast+0x217/0x340
> [<0>] netlink_sendmsg+0x23e/0x4a0
> [<0>] sock_sendmsg+0x8f/0xa0
> [<0>] __sys_sendto+0xfc/0x170
> [<0>] __x64_sys_sendto+0x20/0x30
> [<0>] do_syscall_64+0x59/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>
> Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
> testing), it works fine:
>
> Tested-by: Maxime Coquelin <[email protected]>
>
> For the potential missing interrupt with non-compliant devices, I guess
> it could be handled by the hardening work, as the same thing could
> happen if the VDUSE application crashed, for example.
>
> Regards,
> Maxime
>
> [0]:
Better with the link...
[0]: https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commits/vduse_v1/
>> Thanks
>>
>> Changes since V1:
>> - use RTNL to synchronize rx mode worker
>> - use completion for simplicity
>> - don't try to harden CVQ command
>>
>> Changes since RFC:
>>
>> - switch to use BAD_RING in virtio_break_device()
>> - check virtqueue_is_broken() after being woken up
>> - use more_used() instead of virtqueue_get_buf() to allow caller to
>> get buffers afterwards
>> - break the virtio-net device on timeout
>> - get buffer manually since the virtio core checks more_used() instead
>>
>> Jason Wang (2):
>> virtio-net: convert rx mode setting to use workqueue
>> virtio-net: sleep instead of busy waiting for cvq command
>>
>> drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
>> 1 file changed, 66 insertions(+), 10 deletions(-)
>>
On Thu, 13 Apr 2023 14:40:25 +0800 Jason Wang wrote:
> The code used to busy poll for cvq command which turns out to have
> several side effects:
>
> 1) infinite poll for buggy devices
> 2) bad interaction with scheduler
>
> So this series tries to use sleep instead of busy polling. In this
> version, I take a step back: the hardening part is not implemented and
> is left for future investigation. We used to agree to use interruptible
> sleep, but it doesn't work for a general workqueue.
CC: netdev missing?
On Thu, Apr 13, 2023 at 10:04 PM Jakub Kicinski <[email protected]> wrote:
>
> On Thu, 13 Apr 2023 14:40:25 +0800 Jason Wang wrote:
> > The code used to busy poll for cvq command which turns out to have
> > several side effects:
> >
> > 1) infinite poll for buggy devices
> > 2) bad interaction with scheduler
> >
> > So this series tries to use sleep instead of busy polling. In this
> > version, I take a step back: the hardening part is not implemented and
> > is left for future investigation. We used to agree to use interruptible
> > sleep, but it doesn't work for a general workqueue.
>
> CC: netdev missing?
My bad. Will cc netdev for any discussion.
Thanks
>