2022-11-21 12:53:57

by Li Zetao

Subject: [PATCH] virtio_net: Fix probe failed when modprobe virtio_net

When doing the following test steps, an error was found:
step 1: modprobe virtio_net succeeded
# modprobe virtio_net <-- OK

step 2: fault injection in register_netdevice()
# modprobe -r virtio_net <-- OK
# ...
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 3521 Comm: modprobe
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
Call Trace:
<TASK>
...
should_failslab+0xa/0x20
...
dev_set_name+0xc0/0x100
netdev_register_kobject+0xc2/0x340
register_netdevice+0xbb9/0x1320
virtnet_probe+0x1d72/0x2658 [virtio_net]
...
</TASK>
virtio_net: probe of virtio0 failed with error -22

step 3: modprobe virtio_net failed
# modprobe virtio_net <-- failed
virtio_net: probe of virtio0 failed with error -2

The root cause of the problem is that the queues are not
disable on the error handling path when register_netdevice()
fails in virtnet_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive messages, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
are selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by resetting the virtio device on the error handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <[email protected]>
---
drivers/net/virtio_net.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7106932c6f88..86e52454b5b5 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	return 0;

 free_unregister_netdev:
-	virtio_reset_device(vdev);
-
 	unregister_netdev(dev);
 free_failover:
 	net_failover_destroy(vi->failover);
 free_vqs:
+	virtio_reset_device(vdev);
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
 	virtnet_del_vqs(vi);
--
2.25.1
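
To illustrate the failure described in this patch, here is a minimal userspace sketch of the state involved. It is not kernel code: model_dev, model_setup_vq(), model_reset() and model_probe() are hypothetical names invented for the illustration. It assumes only what the commit message states, namely that queue setup refuses an already-enabled queue with -ENOENT, that only a reset clears "queue_enable", and (as the diff implies) that a failed registration jumps past the label that used to hold the reset.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for device-side state; not a kernel structure. */
struct model_dev {
	bool queue_enable;	/* survives across driver probe attempts */
};

/* Models the check in queue setup: an already-enabled queue is refused. */
static int model_setup_vq(struct model_dev *d)
{
	if (d->queue_enable)
		return -ENOENT;
	d->queue_enable = true;
	return 0;
}

/* Models a device reset: the only way to clear queue_enable. */
static void model_reset(struct model_dev *d)
{
	d->queue_enable = false;
}

/*
 * Models a probe attempt. register_fails simulates the injected failure;
 * reset_in_free_vqs selects the patched (true) or original (false)
 * position of the reset in the unwind path.
 */
static int model_probe(struct model_dev *d, bool register_fails,
		       bool reset_in_free_vqs)
{
	int err = model_setup_vq(d);

	if (err)
		return err;

	if (register_fails) {
		err = -EINVAL;
		/* Nothing was registered, so skip the unregister label. */
		goto free_failover;
	}
	return 0;

free_failover:
	/* net_failover_destroy() would run here. */
	if (reset_in_free_vqs)
		model_reset(d);	/* patched position, under free_vqs */
	/* virtnet_del_vqs() would run here. */
	return err;
}
```

Calling model_probe() twice on the same model_dev, first with register_fails set and then without, returns -EINVAL followed by -ENOENT when reset_in_free_vqs is false, and -EINVAL followed by 0 when it is true. On Linux those values are -22, -2 and 0, which matches the log in the steps above.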



2022-11-21 18:43:21

by Michael S. Tsirkin

Subject: Re: [PATCH] virtio_net: Fix probe failed when modprobe virtio_net

On Mon, Nov 21, 2022 at 09:29:35PM +0800, Li Zetao wrote:
> When doing the following test steps, an error was found:
> step 1: modprobe virtio_net succeeded
> # modprobe virtio_net <-- OK
>
> step 2: fault injection in register_netdevice()
> # modprobe -r virtio_net <-- OK
> # ...
> FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> CPU: 0 PID: 3521 Comm: modprobe
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> Call Trace:
> <TASK>
> ...
> should_failslab+0xa/0x20
> ...
> dev_set_name+0xc0/0x100
> netdev_register_kobject+0xc2/0x340
> register_netdevice+0xbb9/0x1320
> virtnet_probe+0x1d72/0x2658 [virtio_net]
> ...
> </TASK>
> virtio_net: probe of virtio0 failed with error -22
>
> step 3: modprobe virtio_net failed
> # modprobe virtio_net <-- failed
> virtio_net: probe of virtio0 failed with error -2
>
> The root cause of the problem is that the queues are not
> disable on the error handling path when register_netdevice()
> fails in virtnet_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
>
> virtio_pci_modern_device uses virtqueues to send or
> receive message, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
>
> Fix it by reset virtio device on error handling path.
>
> Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
> Signed-off-by: Li Zetao <[email protected]>

I would add to this:

------

This makes error handling follow the same order as normal device cleanup
which does:

static void remove_vq_common(struct virtnet_info *vi)
{
	virtio_reset_device(vi->vdev);

	/* Free unused buffers in both send and recv, if any. */
	free_unused_bufs(vi);

	free_receive_bufs(vi);

	free_receive_page_frags(vi);

	virtnet_del_vqs(vi);
}

static void virtnet_remove(struct virtio_device *vdev)
{
	struct virtnet_info *vi = vdev->priv;

	virtnet_cpu_notif_remove(vi);

	/* Make sure no work handler is accessing the device. */
	flush_work(&vi->config_work);

	unregister_netdev(vi->dev);

	net_failover_destroy(vi->failover);

	remove_vq_common(vi);

	free_netdev(vi->dev);
}


So unregister, destroy failover, then reset - and that flow
is better tested than error handling so we can be reasonably
sure it works well.

-----
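
The ordering argument above can be checked with a small userspace sketch. It is only a model: the enum values, struct trace, and the functions model_remove(), model_unwind() and same_trace() are hypothetical names, and the step list is reduced to the four operations named in this thread. It records the order of steps taken by the removal path and by the patched error path, so the two can be compared.

```c
#include <stddef.h>

/* Hypothetical step identifiers for this sketch; not kernel symbols. */
enum step { UNREGISTER, DESTROY_FAILOVER, RESET, DEL_VQS };

struct trace {
	enum step steps[8];
	size_t n;
};

static void record(struct trace *t, enum step s)
{
	if (t->n < sizeof(t->steps) / sizeof(t->steps[0]))
		t->steps[t->n++] = s;
}

/* Order used by the normal removal path quoted above. */
static void model_remove(struct trace *t)
{
	record(t, UNREGISTER);
	record(t, DESTROY_FAILOVER);
	record(t, RESET);	/* first step of remove_vq_common() */
	record(t, DEL_VQS);
}

/* Labels an error path can jump to, outermost first. */
enum entry { FREE_UNREGISTER_NETDEV, FREE_FAILOVER, FREE_VQS };

/* Patched unwind: each label falls through to the ones below it. */
static void model_unwind(struct trace *t, enum entry from)
{
	switch (from) {
	case FREE_UNREGISTER_NETDEV:
		record(t, UNREGISTER);
		/* fall through */
	case FREE_FAILOVER:
		record(t, DESTROY_FAILOVER);
		/* fall through */
	case FREE_VQS:
		record(t, RESET);
		record(t, DEL_VQS);
		break;
	}
}

/* Returns 1 when two traces hold the same steps in the same order. */
static int same_trace(const struct trace *a, const struct trace *b)
{
	size_t i;

	if (a->n != b->n)
		return 0;
	for (i = 0; i < a->n; i++)
		if (a->steps[i] != b->steps[i])
			return 0;
	return 1;
}
```

In this model, unwinding from FREE_UNREGISTER_NETDEV produces the same trace as model_remove(), and every entry label, including FREE_FAILOVER and FREE_VQS, passes through RESET before the queues are deleted.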


I would thus probably also include this tag instead:

Fixes: 0246555550 ("virtio_net: fix use after free on allocation failure")

this is what introduced the difference in cleanup order,
modern driver just added hardware support.


Besides extending the commit log

Acked-by: Michael S. Tsirkin <[email protected]>


> ---
> drivers/net/virtio_net.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7106932c6f88..86e52454b5b5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	return 0;
>
>  free_unregister_netdev:
> -	virtio_reset_device(vdev);
> -
>  	unregister_netdev(dev);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> +	virtio_reset_device(vdev);
>  	cancel_delayed_work_sync(&vi->refill);
>  	free_receive_page_frags(vi);
>  	virtnet_del_vqs(vi);
> --
> 2.25.1


2022-11-22 14:22:22

by Li Zetao

Subject: [PATCH v2] virtio_net: Fix probe failed when modprobe virtio_net

When doing the following test steps, an error was found:
step 1: modprobe virtio_net succeeded
# modprobe virtio_net <-- OK

step 2: fault injection in register_netdevice()
# modprobe -r virtio_net <-- OK
# ...
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 3521 Comm: modprobe
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
Call Trace:
<TASK>
...
should_failslab+0xa/0x20
...
dev_set_name+0xc0/0x100
netdev_register_kobject+0xc2/0x340
register_netdevice+0xbb9/0x1320
virtnet_probe+0x1d72/0x2658 [virtio_net]
...
</TASK>
virtio_net: probe of virtio0 failed with error -22

step 3: modprobe virtio_net failed
# modprobe virtio_net <-- failed
virtio_net: probe of virtio0 failed with error -2

The root cause of the problem is that the queues are not
disable on the error handling path when register_netdevice()
fails in virtnet_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive messages, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
are selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by resetting the virtio device on the error handling path. This
makes error handling follow the same order as normal device
cleanup in virtnet_remove() which does: unregister, destroy
failover, then reset. And that flow is better tested than
error handling so we can be reasonably sure it works well.

Fixes: 024655555021 ("virtio_net: fix use after free on allocation failure")
Signed-off-by: Li Zetao <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
---
v1 was posted at: https://lore.kernel.org/all/[email protected]/
v1 -> v2: modify commit log and fixes tag

drivers/net/virtio_net.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7106932c6f88..86e52454b5b5 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	return 0;

 free_unregister_netdev:
-	virtio_reset_device(vdev);
-
 	unregister_netdev(dev);
 free_failover:
 	net_failover_destroy(vi->failover);
 free_vqs:
+	virtio_reset_device(vdev);
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
 	virtnet_del_vqs(vi);
--
2.25.1

2022-11-22 20:24:40

by Michael S. Tsirkin

Subject: Re: [PATCH v2] virtio_net: Fix probe failed when modprobe virtio_net

On Tue, Nov 22, 2022 at 11:00:46PM +0800, Li Zetao wrote:
> When doing the following test steps, an error was found:
> step 1: modprobe virtio_net succeeded
> # modprobe virtio_net <-- OK
>
> step 2: fault injection in register_netdevice()
> # modprobe -r virtio_net <-- OK
> # ...
> FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> CPU: 0 PID: 3521 Comm: modprobe
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> Call Trace:
> <TASK>
> ...
> should_failslab+0xa/0x20
> ...
> dev_set_name+0xc0/0x100
> netdev_register_kobject+0xc2/0x340
> register_netdevice+0xbb9/0x1320
> virtnet_probe+0x1d72/0x2658 [virtio_net]
> ...
> </TASK>
> virtio_net: probe of virtio0 failed with error -22
>
> step 3: modprobe virtio_net failed
> # modprobe virtio_net <-- failed
> virtio_net: probe of virtio0 failed with error -2
>
> The root cause of the problem is that the queues are not
> disable

if you need to resend it:

not disabled

but that's minor, ok to ignore

> on the error handling path when register_netdevice()
> fails in virtnet_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
>
> virtio_pci_modern_device uses virtqueues to send or
> receive message, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
>
> Fix it by reset virtio device on error handling path. This
> makes error handling follow the same order as normal device
> cleanup in virtnet_remove() which does: unregister, destroy
> failover, then reset. And that flow is better tested than
> error handling so we can be reasonably sure it works well.
>
> Fixes: 024655555021 ("virtio_net: fix use after free on allocation failure")
> Signed-off-by: Li Zetao <[email protected]>
> Acked-by: Michael S. Tsirkin <[email protected]>

Thanks, LGTM, feel free to merge.



> ---
> v1 was posted at: https://lore.kernel.org/all/[email protected]/
> v1 -> v2: modify commit log and fixes tag
>
> drivers/net/virtio_net.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7106932c6f88..86e52454b5b5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	return 0;
>
>  free_unregister_netdev:
> -	virtio_reset_device(vdev);
> -
>  	unregister_netdev(dev);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> +	virtio_reset_device(vdev);
>  	cancel_delayed_work_sync(&vi->refill);
>  	free_receive_page_frags(vi);
>  	virtnet_del_vqs(vi);
> --
> 2.25.1

2022-11-24 09:09:32

by patchwork-bot+netdevbpf

Subject: Re: [PATCH v2] virtio_net: Fix probe failed when modprobe virtio_net

Hello:

This patch was applied to netdev/net.git (master)
by Paolo Abeni <[email protected]>:

On Tue, 22 Nov 2022 23:00:46 +0800 you wrote:
> When doing the following test steps, an error was found:
> step 1: modprobe virtio_net succeeded
> # modprobe virtio_net <-- OK
>
> step 2: fault injection in register_netdevice()
> # modprobe -r virtio_net <-- OK
> # ...
> FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> CPU: 0 PID: 3521 Comm: modprobe
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> Call Trace:
> <TASK>
> ...
> should_failslab+0xa/0x20
> ...
> dev_set_name+0xc0/0x100
> netdev_register_kobject+0xc2/0x340
> register_netdevice+0xbb9/0x1320
> virtnet_probe+0x1d72/0x2658 [virtio_net]
> ...
> </TASK>
> virtio_net: probe of virtio0 failed with error -22
>
> [...]

Here is the summary with links:
- [v2] virtio_net: Fix probe failed when modprobe virtio_net
https://git.kernel.org/netdev/net/c/b06865659463

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html