There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.
Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:
# ethtool -X eth0 hfunc toeplitz
This is how the problem happens:
1) ethtool_set_rxfh() calls virtnet_set_rxfh()
2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
3) virtnet_commit_rss_command() populates 4 entries for the rss
scatter-gather
4) Since the command above does not have a key, then the last
scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;
5) This buffer is passed to qemu, but qemu is not happy with a buffer
with zero length, and do the following in virtqueue_map_desc() (QEMU
function):
if (!sz) {
virtio_error(vdev, "virtio: zero sized buffers are not allowed");
6) virtio_error() (also QEMU function) set the device as broken
vdev->broken = true;
7) Qemu bails out, and do not repond this crazy kernel.
8) The kernel is waiting for the response to come back (function
virtnet_send_command())
9) The kernel is waiting doing the following :
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
!virtqueue_is_broken(vi->cvq))
cpu_relax();
10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.
Fix it by not sending RSS commands if the feature is not available in
the device.
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Breno Leitao <[email protected]>
---
drivers/net/virtio_net.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c640fdf28fc5..e6b0eaf08ac2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3809,6 +3809,9 @@ static int virtnet_set_rxfh(struct net_device *dev,
struct virtnet_info *vi = netdev_priv(dev);
int i;
+ if (!vi->has_rss && !vi->has_rss_hash_report)
+ return -EOPNOTSUPP;
+
if (rxfh->hfunc != ETH_RSS_HASH_NO_CHANGE &&
rxfh->hfunc != ETH_RSS_HASH_TOP)
return -EOPNOTSUPP;
--
2.43.0
在 2024/3/26 下午11:19, Breno Leitao 写道:
> There is a bug when setting the RSS options in virtio_net that can break
> the whole machine, getting the kernel into an infinite loop.
>
> Running the following command in any QEMU virtual machine with virtionet
> will reproduce this problem:
>
> # ethtool -X eth0 hfunc toeplitz
>
> This is how the problem happens:
>
> 1) ethtool_set_rxfh() calls virtnet_set_rxfh()
>
> 2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
>
> 3) virtnet_commit_rss_command() populates 4 entries for the rss
> scatter-gather
>
> 4) Since the command above does not have a key, then the last
> scatter-gatter entry will be zeroed, since rss_key_size == 0.
> sg_buf_size = vi->rss_key_size;
>
> 5) This buffer is passed to qemu, but qemu is not happy with a buffer
> with zero length, and do the following in virtqueue_map_desc() (QEMU
> function):
>
> if (!sz) {
> virtio_error(vdev, "virtio: zero sized buffers are not allowed");
>
> 6) virtio_error() (also QEMU function) set the device as broken
>
> vdev->broken = true;
>
> 7) Qemu bails out, and do not repond this crazy kernel.
>
> 8) The kernel is waiting for the response to come back (function
> virtnet_send_command())
>
> 9) The kernel is waiting doing the following :
>
> while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> !virtqueue_is_broken(vi->cvq))
> cpu_relax();
>
> 10) None of the following functions above is true, thus, the kernel
> loops here forever. Keeping in mind that virtqueue_is_broken() does
> not look at the qemu `vdev->broken`, so, it never realizes that the
> vitio is broken at QEMU side.
>
> Fix it by not sending RSS commands if the feature is not available in
> the device.
>
> Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Breno Leitao <[email protected]>
> ---
> drivers/net/virtio_net.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c640fdf28fc5..e6b0eaf08ac2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3809,6 +3809,9 @@ static int virtnet_set_rxfh(struct net_device *dev,
> struct virtnet_info *vi = netdev_priv(dev);
> int i;
>
> + if (!vi->has_rss && !vi->has_rss_hash_report)
> + return -EOPNOTSUPP;
> +
Why not make the second patch as the first, this seems to work better.
Or squash them into one patch.
Apart from these and Xuan's comments.
For series:
Reviewed-by: Heng Qi <[email protected]>
Regards,
Heng
> if (rxfh->hfunc != ETH_RSS_HASH_NO_CHANGE &&
> rxfh->hfunc != ETH_RSS_HASH_TOP)
> return -EOPNOTSUPP;
On Wed, Mar 27, 2024 at 10:27:58AM +0800, Heng Qi wrote:
>
>
> 在 2024/3/26 下午11:19, Breno Leitao 写道:
> > There is a bug when setting the RSS options in virtio_net that can break
> > the whole machine, getting the kernel into an infinite loop.
> >
> > Running the following command in any QEMU virtual machine with virtionet
> > will reproduce this problem:
> >
> > # ethtool -X eth0 hfunc toeplitz
> >
> > This is how the problem happens:
> >
> > 1) ethtool_set_rxfh() calls virtnet_set_rxfh()
> >
> > 2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
> >
> > 3) virtnet_commit_rss_command() populates 4 entries for the rss
> > scatter-gather
> >
> > 4) Since the command above does not have a key, then the last
> > scatter-gatter entry will be zeroed, since rss_key_size == 0.
> > sg_buf_size = vi->rss_key_size;
> >
> > 5) This buffer is passed to qemu, but qemu is not happy with a buffer
> > with zero length, and do the following in virtqueue_map_desc() (QEMU
> > function):
> >
> > if (!sz) {
> > virtio_error(vdev, "virtio: zero sized buffers are not allowed");
> >
> > 6) virtio_error() (also QEMU function) set the device as broken
> >
> > vdev->broken = true;
> >
> > 7) Qemu bails out, and do not repond this crazy kernel.
> >
> > 8) The kernel is waiting for the response to come back (function
> > virtnet_send_command())
> >
> > 9) The kernel is waiting doing the following :
> >
> > while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > !virtqueue_is_broken(vi->cvq))
> > cpu_relax();
> >
> > 10) None of the following functions above is true, thus, the kernel
> > loops here forever. Keeping in mind that virtqueue_is_broken() does
> > not look at the qemu `vdev->broken`, so, it never realizes that the
> > vitio is broken at QEMU side.
> >
> > Fix it by not sending RSS commands if the feature is not available in
> > the device.
> >
> > Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> > Cc: [email protected]
> > Cc: [email protected]
> > Signed-off-by: Breno Leitao <[email protected]>
> > ---
> > drivers/net/virtio_net.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index c640fdf28fc5..e6b0eaf08ac2 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -3809,6 +3809,9 @@ static int virtnet_set_rxfh(struct net_device *dev,
> > struct virtnet_info *vi = netdev_priv(dev);
> > int i;
> > + if (!vi->has_rss && !vi->has_rss_hash_report)
> > + return -EOPNOTSUPP;
> > +
>
> Why not make the second patch as the first, this seems to work better.
Sure, that works for me. Let me update it in v2.