2014-11-17 09:17:43

by Jason Wang

Subject: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

A buggy host may advertise buggy features (a common case is that the host
advertises a feature whose dependencies are missing). In this case, the
driver should detect and disable the buggy features itself.

This patch introduces a driver-specific sanitize_features() method, which
is called just before the features are finalized, to detect and disable
buggy features advertised by the host.

Virtio-net will be the first user.

Cc: Rusty Russell <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: Wanlong Gao <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
---
Changes from V2:
- fix typo
- rename fix_features to sanitize_features
---
drivers/virtio/virtio.c | 4 ++++
include/linux/virtio.h | 1 +
include/linux/virtio_config.h | 12 ++++++++++++
3 files changed, 17 insertions(+)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index df598dd..6a86b4f 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -181,6 +181,10 @@ static int virtio_dev_probe(struct device *_d)
 		if (device_features & (1 << i))
 			set_bit(i, dev->features);

+	/* Sanitize buggy features advertised by host */
+	if (drv->sanitize_features)
+		drv->sanitize_features(dev);
+
 	dev->config->finalize_features(dev);

 	err = drv->probe(dev);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 65261a7..5aed283 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -142,6 +142,7 @@ struct virtio_driver {
 	void (*scan)(struct virtio_device *dev);
 	void (*remove)(struct virtio_device *dev);
 	void (*config_changed)(struct virtio_device *dev);
+	void (*sanitize_features)(struct virtio_device *dev);
 #ifdef CONFIG_PM
 	int (*freeze)(struct virtio_device *dev);
 	int (*restore)(struct virtio_device *dev);
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 7f4ef66..7bd89ea 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -96,6 +96,18 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
 	return test_bit(fbit, vdev->features);
 }

+static inline void virtio_disable_feature(struct virtio_device *vdev,
+					  unsigned int fbit)
+{
+	BUG_ON(fbit >= VIRTIO_TRANSPORT_F_START);
+	BUG_ON(vdev->config->get_status(vdev) &
+	       ~(VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER));
+
+	virtio_check_driver_offered_feature(vdev, fbit);
+
+	clear_bit(fbit, vdev->features);
+}
+
 static inline
 struct virtqueue *virtio_find_single_vq(struct virtio_device *vdev,
 					vq_callback_t *c, const char *n)
--
1.9.1


2014-11-17 09:17:49

by Jason Wang

Subject: [PATCH V3 2/2] virtio-net: sanitize buggy features advertised by host

This patch tries to detect possibly buggy features advertised by the host
and sanitize them. One example is booting virtio-net with only ctrl_vq
disabled: qemu may still advertise many features that depend on it. This
will trigger several BUG()s in virtnet_send_command().

This patch uses the sanitize_features() method to disable all features
that depend on ctrl_vq if it was not advertised.

This fixes the crash when booting with ctrl_vq=off using qemu.

Cc: Rusty Russell <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: Wanlong Gao <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
---
Changes from V1:
- fix the cut-and-paste error
Changes from V2:
- loop through an array of feature bits
- switch to use dev_warn()
---
drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ec2a8b4..6fadd8c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1948,6 +1948,31 @@ static int virtnet_restore(struct virtio_device *vdev)
 }
 #endif

+static void virtnet_sanitize_features(struct virtio_device *dev)
+{
+	unsigned int features_for_ctrl_vq[] = {
+		VIRTIO_NET_F_CTRL_RX,
+		VIRTIO_NET_F_CTRL_VLAN,
+		VIRTIO_NET_F_GUEST_ANNOUNCE,
+		VIRTIO_NET_F_MQ,
+		VIRTIO_NET_F_CTRL_MAC_ADDR
+	};
+	int i;
+
+	if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
+		for (i = 0; i < ARRAY_SIZE(features_for_ctrl_vq); i++) {
+			unsigned int f = features_for_ctrl_vq[i];
+			if (virtio_has_feature(dev, f)) {
+				virtio_disable_feature(dev, f);
+				dev_warn(&dev->dev,
+					 "buggy hypervisor: disable feature "
+					 "0x%x since VIRTIO_NET_F_CTRL_VQ was "
+					 "not advertised.\n", f);
+			}
+		}
+	}
+}
+
 static struct virtio_device_id id_table[] = {
 	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
 	{ 0 },
@@ -1975,6 +2000,7 @@ static struct virtio_driver virtio_net_driver = {
 	.probe = virtnet_probe,
 	.remove = virtnet_remove,
 	.config_changed = virtnet_config_changed,
+	.sanitize_features = virtnet_sanitize_features,
 #ifdef CONFIG_PM_SLEEP
 	.freeze = virtnet_freeze,
 	.restore = virtnet_restore,
--
1.9.1

2014-11-17 09:37:21

by Michael S. Tsirkin

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> Buggy host may advertised buggy host features (a usual case is that host
> advertise a feature whose dependencies were missed). In this case, driver
> should detect and disable the buggy features by itself.
>
> This patch introduces driver specific sanitize_features() method which is
> called just before features finalizing to detect and disable buggy features
> advertised by host.
>
> Virtio-net will be the first user.
>
> Cc: Rusty Russell <[email protected]>
> Cc: Michael S. Tsirkin <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: Wanlong Gao <[email protected]>
> Signed-off-by: Jason Wang <[email protected]>

Hmm, this conflicts with the virtio 1.0 work: we drop
features as a bitmap there.

> ---
> Changes from V2:
> - fix typo
> - rename fix_features to sanitize_features
> ---
> drivers/virtio/virtio.c | 4 ++++
> include/linux/virtio.h | 1 +
> include/linux/virtio_config.h | 12 ++++++++++++
> 3 files changed, 17 insertions(+)
>
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index df598dd..6a86b4f 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -181,6 +181,10 @@ static int virtio_dev_probe(struct device *_d)
> if (device_features & (1 << i))
> set_bit(i, dev->features);
>
> + /* Sanitize buggy features advertised by host */
> + if (drv->sanitize_features)
> + drv->sanitize_features(dev);
> +
> dev->config->finalize_features(dev);
>
> err = drv->probe(dev);
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 65261a7..5aed283 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -142,6 +142,7 @@ struct virtio_driver {
> void (*scan)(struct virtio_device *dev);
> void (*remove)(struct virtio_device *dev);
> void (*config_changed)(struct virtio_device *dev);
> + void (*sanitize_features)(struct virtio_device *dev);
> #ifdef CONFIG_PM
> int (*freeze)(struct virtio_device *dev);
> int (*restore)(struct virtio_device *dev);
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index 7f4ef66..7bd89ea 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -96,6 +96,18 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> return test_bit(fbit, vdev->features);
> }
>
> +static inline void virtio_disable_feature(struct virtio_device *vdev,
> + unsigned int fbit)
> +{
> + BUG_ON(fbit >= VIRTIO_TRANSPORT_F_START);
> + BUG_ON(vdev->config->get_status(vdev) &
> + ~(VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER));
> +
> + virtio_check_driver_offered_feature(vdev, fbit);
> +
> + clear_bit(fbit, vdev->features);
> +}
> +
> static inline
> struct virtqueue *virtio_find_single_vq(struct virtio_device *vdev,
> vq_callback_t *c, const char *n)
> --
> 1.9.1

2014-11-17 09:44:44

by Cornelia Huck

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, 17 Nov 2014 11:37:01 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> > Buggy host may advertised buggy host features (a usual case is that host
> > advertise a feature whose dependencies were missed). In this case, driver
> > should detect and disable the buggy features by itself.
> >
> > This patch introduces driver specific sanitize_features() method which is
> > called just before features finalizing to detect and disable buggy features
> > advertised by host.
> >
> > Virtio-net will be the first user.
> >
> > Cc: Rusty Russell <[email protected]>
> > Cc: Michael S. Tsirkin <[email protected]>
> > Cc: Cornelia Huck <[email protected]>
> > Cc: Wanlong Gao <[email protected]>
> > Signed-off-by: Jason Wang <[email protected]>
>
> Hmm this conflicts with virtio 1.0 work: we drop
> features as bitmap there.

But that's an implementation detail, no? We'll still need a way for the
driver to sanitize features, and I think this interface works just fine.

2014-11-17 10:08:54

by Michael S. Tsirkin

Subject: Re: [PATCH V3 2/2] virtio-net: sanitize buggy features advertised by host

On Mon, Nov 17, 2014 at 05:17:18PM +0800, Jason Wang wrote:
> This patch tries to detect the possible buggy features advertised by host
> and sanitize them. One example is booting virtio-net with only ctrl_vq
> disabled, qemu may still advertise many features which depends on it. This
> will trigger several BUG()s in virtnet_send_command().
>
> This patch utilizes the sanitize_features() method, and disables all
> features that depends on ctrl_vq if it was not advertised.
>
> This fixes the crash when booting with ctrl_vq=off using qemu.
>
> Cc: Rusty Russell <[email protected]>
> Cc: Michael S. Tsirkin <[email protected]>
> Cc: Cornelia Huck <[email protected]>
> Cc: Wanlong Gao <[email protected]>
> Signed-off-by: Jason Wang <[email protected]>


So I'm not sure this is useful.
The spec says:
	The device MUST NOT offer a feature which requires another
	feature which was not offered.
So this is a buggy hypervisor, and I believe we should just fail probe.
This can be done without crashing, and is generally a better
idea than second-guessing what the hypervisor wants us to do.





However, assuming that we do want this change:
this could be replaced with a table-driven design in the virtio core, but
since you chose to open-code it, I would drop the table below altogether.

Just make it
	if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
		virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_RX);
		virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_VLAN);
		virtio_disable_feature(dev, VIRTIO_NET_F_GUEST_ANNOUNCE);
		virtio_disable_feature(dev, VIRTIO_NET_F_MQ);
		virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_MAC_ADDR);
	}




> ---
> Changes from V1:
> - fix the cut-and-paste error
> Changes from V2:
> - loop through an array of feature bits
> - switch to use dev_warn()
> ---
> drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ec2a8b4..6fadd8c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1948,6 +1948,31 @@ static int virtnet_restore(struct virtio_device *vdev)
> }
> #endif
>
> +static void virtnet_sanitize_features(struct virtio_device *dev)
> +{
> + unsigned int features_for_ctrl_vq[] = {
> + VIRTIO_NET_F_CTRL_RX,
> + VIRTIO_NET_F_CTRL_VLAN,
> + VIRTIO_NET_F_GUEST_ANNOUNCE,
> + VIRTIO_NET_F_MQ,
> + VIRTIO_NET_F_CTRL_MAC_ADDR
> + };

This is not the only dependency: checksums
have dependencies too. See virtio 1.0 spec.
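
As a rough illustration of the table-driven idea (an editorial sketch, not
code from this series): the userspace C below models the feature set as a
plain 64-bit mask and clears every feature whose prerequisite is missing.
The bit numbers and dependency pairs follow the virtio-net feature list in
the virtio 1.0 spec; struct feature_dep, virtnet_deps and
virtnet_sanitize_feature_bits() are hypothetical names. The ECN bits, which
need either of two TSO features, are left out to keep the table to simple
pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* virtio-net feature bit numbers, as listed in the virtio spec. */
enum {
	VIRTIO_NET_F_CSUM = 0,
	VIRTIO_NET_F_GUEST_CSUM = 1,
	VIRTIO_NET_F_GUEST_TSO4 = 7,
	VIRTIO_NET_F_GUEST_TSO6 = 8,
	VIRTIO_NET_F_GUEST_UFO = 10,
	VIRTIO_NET_F_HOST_TSO4 = 11,
	VIRTIO_NET_F_HOST_TSO6 = 12,
	VIRTIO_NET_F_HOST_UFO = 14,
	VIRTIO_NET_F_CTRL_VQ = 17,
	VIRTIO_NET_F_CTRL_RX = 18,
	VIRTIO_NET_F_CTRL_VLAN = 19,
	VIRTIO_NET_F_GUEST_ANNOUNCE = 21,
	VIRTIO_NET_F_MQ = 22,
	VIRTIO_NET_F_CTRL_MAC_ADDR = 23,
};

/* Hypothetical table entry: "feature" is only valid if "requires" is offered. */
struct feature_dep {
	unsigned int feature;
	unsigned int requires;
};

static const struct feature_dep virtnet_deps[] = {
	{ VIRTIO_NET_F_CTRL_RX,        VIRTIO_NET_F_CTRL_VQ },
	{ VIRTIO_NET_F_CTRL_VLAN,      VIRTIO_NET_F_CTRL_VQ },
	{ VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_CTRL_VQ },
	{ VIRTIO_NET_F_MQ,             VIRTIO_NET_F_CTRL_VQ },
	{ VIRTIO_NET_F_CTRL_MAC_ADDR,  VIRTIO_NET_F_CTRL_VQ },
	{ VIRTIO_NET_F_GUEST_TSO4,     VIRTIO_NET_F_GUEST_CSUM },
	{ VIRTIO_NET_F_GUEST_TSO6,     VIRTIO_NET_F_GUEST_CSUM },
	{ VIRTIO_NET_F_GUEST_UFO,      VIRTIO_NET_F_GUEST_CSUM },
	{ VIRTIO_NET_F_HOST_TSO4,      VIRTIO_NET_F_CSUM },
	{ VIRTIO_NET_F_HOST_TSO6,      VIRTIO_NET_F_CSUM },
	{ VIRTIO_NET_F_HOST_UFO,       VIRTIO_NET_F_CSUM },
};

/* Return "features" with every bit cleared whose prerequisite is absent. */
uint64_t virtnet_sanitize_feature_bits(uint64_t features)
{
	size_t i;

	for (i = 0; i < sizeof(virtnet_deps) / sizeof(virtnet_deps[0]); i++) {
		const struct feature_dep *d = &virtnet_deps[i];

		if (!(features & (1ULL << d->requires)))
			features &= ~(1ULL << d->feature);
	}
	return features;
}
```

A single pass is enough here because no prerequisite in this table is
itself listed as a dependent feature.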



> + int i;
> +
> + if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
> + for (i = 0; i < ARRAY_SIZE(features_for_ctrl_vq); i++) {
> + unsigned int f = features_for_ctrl_vq[i];
> + if (virtio_has_feature(dev, f)) {
> + virtio_disable_feature(dev, f);
> + dev_warn(&dev->dev,
> + "buggy hyperviser: disable feature "
> + "0x%x since VIRTIO_NET_F_CTRL_VQ was "
> + "not advertised.\n", f);
> + }
> + }
> + }
> +}
> +
> static struct virtio_device_id id_table[] = {
> { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> { 0 },
> @@ -1975,6 +2000,7 @@ static struct virtio_driver virtio_net_driver = {
> .probe = virtnet_probe,
> .remove = virtnet_remove,
> .config_changed = virtnet_config_changed,
> + .sanitize_features = virtnet_sanitize_features,
> #ifdef CONFIG_PM_SLEEP
> .freeze = virtnet_freeze,
> .restore = virtnet_restore,
> --
> 1.9.1

2014-11-17 10:11:53

by Michael S. Tsirkin

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
> On Mon, 17 Nov 2014 11:37:01 +0200
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> > > Buggy host may advertised buggy host features (a usual case is that host
> > > advertise a feature whose dependencies were missed). In this case, driver
> > > should detect and disable the buggy features by itself.
> > >
> > > This patch introduces driver specific sanitize_features() method which is
> > > called just before features finalizing to detect and disable buggy features
> > > advertised by host.
> > >
> > > Virtio-net will be the first user.
> > >
> > > Cc: Rusty Russell <[email protected]>
> > > Cc: Michael S. Tsirkin <[email protected]>
> > > Cc: Cornelia Huck <[email protected]>
> > > Cc: Wanlong Gao <[email protected]>
> > > Signed-off-by: Jason Wang <[email protected]>
> >
> > Hmm this conflicts with virtio 1.0 work: we drop
> > features as bitmap there.
>
> But that's an implementation detail, no? We'll still need a way for the
> driver to sanitize features, and I think this interface works just fine.

Now that you mention it, I don't think we do.

The spec is quite explicit that devices must not expose invalid
combinations of features.

Admittedly, BUG_ON isn't very friendly to hypervisors.

But e.g. failing probe seems better than trying to work around
hypervisor bugs - otherwise we'll be stuck maintaining compatibility
with hypervisors forever.

--
MST

2014-11-17 10:20:56

by Cornelia Huck

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, 17 Nov 2014 12:11:39 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
> > On Mon, 17 Nov 2014 11:37:01 +0200
> > "Michael S. Tsirkin" <[email protected]> wrote:
> >
> > > On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> > > > Buggy host may advertised buggy host features (a usual case is that host
> > > > advertise a feature whose dependencies were missed). In this case, driver
> > > > should detect and disable the buggy features by itself.
> > > >
> > > > This patch introduces driver specific sanitize_features() method which is
> > > > called just before features finalizing to detect and disable buggy features
> > > > advertised by host.
> > > >
> > > > Virtio-net will be the first user.
> > > >
> > > > Cc: Rusty Russell <[email protected]>
> > > > Cc: Michael S. Tsirkin <[email protected]>
> > > > Cc: Cornelia Huck <[email protected]>
> > > > Cc: Wanlong Gao <[email protected]>
> > > > Signed-off-by: Jason Wang <[email protected]>
> > >
> > > Hmm this conflicts with virtio 1.0 work: we drop
> > > features as bitmap there.
> >
> > But that's an implementation detail, no? We'll still need a way for the
> > driver to sanitize features, and I think this interface works just fine.
>
> Now that you mention it, I don't think we do.
>
> The spec is quite explicit that devices must not expose invalid
> combinations of features.

Unfortunately, this does not ensure that there won't be buggy
hypervisors out there, just as there's buggy hardware floating around.

>
> Admittedly, BUG_ON isn't very friendly to hypervisors.
>
> But e.g. failing probe seems better than trying to work around
> hypervisor bugs - otherwise we'll be stuck maintaining compatibility
> with hypervisors forever.

Good point. Failing probe is still much better than hitting BUG_ONs.

We'll still need a driver callback, though, that can return an error on
bogus feature bit combinations.

2014-11-17 10:29:03

by Michael S. Tsirkin

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, Nov 17, 2014 at 11:20:48AM +0100, Cornelia Huck wrote:
> On Mon, 17 Nov 2014 12:11:39 +0200
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
> > > On Mon, 17 Nov 2014 11:37:01 +0200
> > > "Michael S. Tsirkin" <[email protected]> wrote:
> > >
> > > > On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> > > > > Buggy host may advertised buggy host features (a usual case is that host
> > > > > advertise a feature whose dependencies were missed). In this case, driver
> > > > > should detect and disable the buggy features by itself.
> > > > >
> > > > > This patch introduces driver specific sanitize_features() method which is
> > > > > called just before features finalizing to detect and disable buggy features
> > > > > advertised by host.
> > > > >
> > > > > Virtio-net will be the first user.
> > > > >
> > > > > Cc: Rusty Russell <[email protected]>
> > > > > Cc: Michael S. Tsirkin <[email protected]>
> > > > > Cc: Cornelia Huck <[email protected]>
> > > > > Cc: Wanlong Gao <[email protected]>
> > > > > Signed-off-by: Jason Wang <[email protected]>
> > > >
> > > > Hmm this conflicts with virtio 1.0 work: we drop
> > > > features as bitmap there.
> > >
> > > But that's an implementation detail, no? We'll still need a way for the
> > > driver to sanitize features, and I think this interface works just fine.
> >
> > Now that you mention it, I don't think we do.
> >
> > The spec is quite explicit that devices must not expose invalid
> > combinations of features.
>
> Unfortunately, this does not ensure that there won't be buggy
> hypervisors out there, just as there's buggy hardware floating around.
>
> >
> > Admittedly, BUG_ON isn't very friendly to hypervisors.
> >
> > But e.g. failing probe seems better than trying to work around
> > hypervisor bugs - otherwise we'll be stuck maintaining compatibility
> > with hypervisors forever.
>
> Good point. Failing probe is still much better than hitting BUG_ONs.
>
> We'll still need a driver callback, though, that can return an error on
> bogus feature bit combinations.

Why bother? Just check features at start of probe, and return an error.
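
A minimal sketch of such a check, assuming the negotiated features are
available as a 64-bit mask. virtnet_validate_feature_bits() is a
hypothetical helper, not an existing kernel function; the bit numbers are
the virtio-net ones from the spec, and only the ctrl_vq dependency is
covered.

```c
#include <errno.h>
#include <stdint.h>

/* virtio-net feature bit numbers, as listed in the virtio spec. */
enum {
	VIRTIO_NET_F_CTRL_VQ = 17,
	VIRTIO_NET_F_CTRL_RX = 18,
	VIRTIO_NET_F_CTRL_VLAN = 19,
	VIRTIO_NET_F_GUEST_ANNOUNCE = 21,
	VIRTIO_NET_F_MQ = 22,
	VIRTIO_NET_F_CTRL_MAC_ADDR = 23,
};

/* Every feature that is only meaningful when the control vq is offered. */
#define CTRL_VQ_DEPENDENTS				\
	((1ULL << VIRTIO_NET_F_CTRL_RX) |		\
	 (1ULL << VIRTIO_NET_F_CTRL_VLAN) |		\
	 (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |	\
	 (1ULL << VIRTIO_NET_F_MQ) |			\
	 (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR))

/*
 * Hypothetical check for the start of probe: returns 0 if the feature
 * set is consistent, -EINVAL if a ctrl_vq-dependent feature was offered
 * without VIRTIO_NET_F_CTRL_VQ itself.
 */
int virtnet_validate_feature_bits(uint64_t features)
{
	if (!(features & (1ULL << VIRTIO_NET_F_CTRL_VQ)) &&
	    (features & CTRL_VQ_DEPENDENTS))
		return -EINVAL;

	return 0;
}
```

The driver's probe would call this first and bail out with the error,
before allocating anything that would need to be unwound.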

--
MST

2014-11-17 11:20:30

by Cornelia Huck

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Mon, 17 Nov 2014 12:28:49 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Mon, Nov 17, 2014 at 11:20:48AM +0100, Cornelia Huck wrote:
> > On Mon, 17 Nov 2014 12:11:39 +0200
> > "Michael S. Tsirkin" <[email protected]> wrote:
> >
> > > On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
> > > > On Mon, 17 Nov 2014 11:37:01 +0200
> > > > "Michael S. Tsirkin" <[email protected]> wrote:
> > > >
> > > > > On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> > > > > > Buggy host may advertised buggy host features (a usual case is that host
> > > > > > advertise a feature whose dependencies were missed). In this case, driver
> > > > > > should detect and disable the buggy features by itself.
> > > > > >
> > > > > > This patch introduces driver specific sanitize_features() method which is
> > > > > > called just before features finalizing to detect and disable buggy features
> > > > > > advertised by host.
> > > > > >
> > > > > > Virtio-net will be the first user.
> > > > > >
> > > > > > Cc: Rusty Russell <[email protected]>
> > > > > > Cc: Michael S. Tsirkin <[email protected]>
> > > > > > Cc: Cornelia Huck <[email protected]>
> > > > > > Cc: Wanlong Gao <[email protected]>
> > > > > > Signed-off-by: Jason Wang <[email protected]>
> > > > >
> > > > > Hmm this conflicts with virtio 1.0 work: we drop
> > > > > features as bitmap there.
> > > >
> > > > But that's an implementation detail, no? We'll still need a way for the
> > > > driver to sanitize features, and I think this interface works just fine.
> > >
> > > Now that you mention it, I don't think we do.
> > >
> > > The spec is quite explicit that devices must not expose invalid
> > > combinations of features.
> >
> > Unfortunately, this does not ensure that there won't be buggy
> > hypervisors out there, just as there's buggy hardware floating around.
> >
> > >
> > > Admittedly, BUG_ON isn't very friendly to hypervisors.
> > >
> > > But e.g. failing probe seems better than trying to work around
> > > hypervisor bugs - otherwise we'll be stuck maintaining compatibility
> > > with hypervisors forever.
> >
> > Good point. Failing probe is still much better than hitting BUG_ONs.
> >
> > We'll still need a driver callback, though, that can return an error on
> > bogus feature bit combinations.
>
> Why bother? Just check features at start of probe, and return an error.

So we'd fail probing due to bogus features after setting FEATURES_OK in
the virtio-1 case, wouldn't we? Feels a bit weird, but seems to be covered
by the spec.
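
For reference, a small userspace model of the status sequence in question.
The status bit values are the ones defined by the virtio spec;
simulate_probe() is a hypothetical stand-in for the core's probe path, and
it assumes the core sets FAILED whenever the driver's probe returns an
error.

```c
#include <stdint.h>

/* Device status bits as defined by the virtio spec. */
enum {
	VIRTIO_CONFIG_S_ACKNOWLEDGE = 1,
	VIRTIO_CONFIG_S_DRIVER      = 2,
	VIRTIO_CONFIG_S_DRIVER_OK   = 4,
	VIRTIO_CONFIG_S_FEATURES_OK = 8,
	VIRTIO_CONFIG_S_FAILED      = 0x80,
};

/*
 * Hypothetical model of a virtio 1.0 probe: probe_err is what the
 * driver's probe() returned after inspecting the negotiated features.
 * Returns the resulting device status byte.
 */
uint8_t simulate_probe(int probe_err)
{
	uint8_t status = 0;

	status |= VIRTIO_CONFIG_S_ACKNOWLEDGE;	/* guest noticed the device */
	status |= VIRTIO_CONFIG_S_DRIVER;	/* a matching driver exists */
	status |= VIRTIO_CONFIG_S_FEATURES_OK;	/* negotiation is complete */

	if (probe_err)
		status |= VIRTIO_CONFIG_S_FAILED;	/* driver gave up */
	else
		status |= VIRTIO_CONFIG_S_DRIVER_OK;	/* device is live */

	return status;
}
```

In this model a rejected feature set leaves the device with both
FEATURES_OK and FAILED set, which is the slightly odd-looking state
described above.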

2014-11-18 03:03:53

by Jason Wang

Subject: Re: [PATCH V3 2/2] virtio-net: sanitize buggy features advertised by host

On 11/17/2014 06:08 PM, Michael S. Tsirkin wrote:
> On Mon, Nov 17, 2014 at 05:17:18PM +0800, Jason Wang wrote:
>> This patch tries to detect the possible buggy features advertised by host
>> and sanitize them. One example is booting virtio-net with only ctrl_vq
>> disabled, qemu may still advertise many features which depends on it. This
>> will trigger several BUG()s in virtnet_send_command().
>>
>> This patch utilizes the sanitize_features() method, and disables all
>> features that depends on ctrl_vq if it was not advertised.
>>
>> This fixes the crash when booting with ctrl_vq=off using qemu.
>>
>> Cc: Rusty Russell <[email protected]>
>> Cc: Michael S. Tsirkin <[email protected]>
>> Cc: Cornelia Huck <[email protected]>
>> Cc: Wanlong Gao <[email protected]>
>> Signed-off-by: Jason Wang <[email protected]>
>
> So I'm not sure this is useful.
> The spec says:
> The device MUST NOT offer a feature which requires another feature which
> was not offered.

We can't guarantee that a hypervisor's implementation is 100% correct.
> So this is a buggy hypervisor, and I believe we should just fail probe.
> This can be done without crashing, and is generally a better
> idea that second-guessing what hypervisor wants us to do.
>

So we still need something like this patch to detect the wrong
dependencies. And fixing up features like this is nothing new: see how
net devices fix their features through ndo_fix_features().
>
>
>
> However, assuming that we do want this change:
> This can be replaced with a table driven design in virtio core, but
> since you chose to open code it, I would drop table below altogether.
>
>
> Just make it
> if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
> virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_RX);
> virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_VLAN);
> virtio_disable_feature(dev, VIRTIO_NET_F_GUEST_ANNOUNCE);
> virtio_disable_feature(dev, VIRTIO_NET_F_MQ);
> virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_MAC_ADDR);
> }
>

This is similar to what I did in v1 and v2. Either is OK with me.
>
>
>> ---
>> Changes from V1:
>> - fix the cut-and-paste error
>> Changes from V2:
>> - loop through an array of feature bits
>> - switch to use dev_warn()
>> ---
>> drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++++
>> 1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index ec2a8b4..6fadd8c 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1948,6 +1948,31 @@ static int virtnet_restore(struct virtio_device *vdev)
>> }
>> #endif
>>
>> +static void virtnet_sanitize_features(struct virtio_device *dev)
>> +{
>> + unsigned int features_for_ctrl_vq[] = {
>> + VIRTIO_NET_F_CTRL_RX,
>> + VIRTIO_NET_F_CTRL_VLAN,
>> + VIRTIO_NET_F_GUEST_ANNOUNCE,
>> + VIRTIO_NET_F_MQ,
>> + VIRTIO_NET_F_CTRL_MAC_ADDR
>> + };
> This is not the only dependency: checksums
> have dependencies too. See virtio 1.0 spec.
>

I see, and this kind of check could be added. But we're actually safe
here, since qemu handles such cases and won't advertise any offload
feature if csum is not supported.
>
>> + int i;
>> +
>> + if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
>> + for (i = 0; i < ARRAY_SIZE(features_for_ctrl_vq); i++) {
>> + unsigned int f = features_for_ctrl_vq[i];
>> + if (virtio_has_feature(dev, f)) {
>> + virtio_disable_feature(dev, f);
>> + dev_warn(&dev->dev,
>> + "buggy hyperviser: disable feature "
>> + "0x%x since VIRTIO_NET_F_CTRL_VQ was "
>> + "not advertised.\n", f);
>> + }
>> + }
>> + }
>> +}
>> +
>> static struct virtio_device_id id_table[] = {
>> { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
>> { 0 },
>> @@ -1975,6 +2000,7 @@ static struct virtio_driver virtio_net_driver = {
>> .probe = virtnet_probe,
>> .remove = virtnet_remove,
>> .config_changed = virtnet_config_changed,
>> + .sanitize_features = virtnet_sanitize_features,
>> #ifdef CONFIG_PM_SLEEP
>> .freeze = virtnet_freeze,
>> .restore = virtnet_restore,
>> --
>> 1.9.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2014-11-18 03:24:05

by Jason Wang

Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On 11/17/2014 06:11 PM, Michael S. Tsirkin wrote:
> On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
>> On Mon, 17 Nov 2014 11:37:01 +0200
>> "Michael S. Tsirkin" <[email protected]> wrote:
>>
>>> On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
>>>> Buggy host may advertised buggy host features (a usual case is that host
>>>> advertise a feature whose dependencies were missed). In this case, driver
>>>> should detect and disable the buggy features by itself.
>>>>
>>>> This patch introduces driver specific sanitize_features() method which is
>>>> called just before features finalizing to detect and disable buggy features
>>>> advertised by host.
>>>>
>>>> Virtio-net will be the first user.
>>>>
>>>> Cc: Rusty Russell <[email protected]>
>>>> Cc: Michael S. Tsirkin <[email protected]>
>>>> Cc: Cornelia Huck <[email protected]>
>>>> Cc: Wanlong Gao <[email protected]>
>>>> Signed-off-by: Jason Wang <[email protected]>
>>> Hmm this conflicts with virtio 1.0 work: we drop
>>> features as bitmap there.
>> But that's an implementation detail, no? We'll still need a way for the
>> driver to sanitize features, and I think this interface works just fine.
> Now that you mention it, I don't think we do.
>
> The spec is quite explicit that devices must not expose invalid
> combinations of features.
>
> Admittedly, BUG_ON isn't very friendly to hypervisors.
>
> But e.g. failing probe seems better than trying to work around
> hypervisor bugs - otherwise we'll be stuck maintaining compatibility
> with hypervisors forever.
>

I'm OK with failing the probe.

But it wouldn't take much effort to work around just the feature
dependency issue, and I don't see how this blocks the implementation of
any further features. Looking at virtio-net, it also depends on the
network core to fix up NETIF_F_* dependencies.

There seems to be no way to avoid maintaining compatibility, e.g. the
workarounds for buggy hypervisors without VIRTIO_F_ANY_LAYOUT support.

2014-11-18 11:02:42

by Michael S. Tsirkin

Subject: Re: [PATCH V3 2/2] virtio-net: sanitize buggy features advertised by host

On Tue, Nov 18, 2014 at 11:03:32AM +0800, Jason Wang wrote:
> On 11/17/2014 06:08 PM, Michael S. Tsirkin wrote:
> > On Mon, Nov 17, 2014 at 05:17:18PM +0800, Jason Wang wrote:
> >> This patch tries to detect the possible buggy features advertised by host
> >> and sanitize them. One example is booting virtio-net with only ctrl_vq
> >> disabled, qemu may still advertise many features which depends on it. This
> >> will trigger several BUG()s in virtnet_send_command().
> >>
> >> This patch utilizes the sanitize_features() method, and disables all
> >> features that depends on ctrl_vq if it was not advertised.
> >>
> >> This fixes the crash when booting with ctrl_vq=off using qemu.
> >>
> >> Cc: Rusty Russell <[email protected]>
> >> Cc: Michael S. Tsirkin <[email protected]>
> >> Cc: Cornelia Huck <[email protected]>
> >> Cc: Wanlong Gao <[email protected]>
> >> Signed-off-by: Jason Wang <[email protected]>
> >
> > So I'm not sure this is useful.
> > The spec says:
> > The device MUST NOT offer a feature which requires another feature which
> > was not offered.
>
> We can't guarantee that hypervisor's implementation are 100% correct.
> > So this is a buggy hypervisor, and I believe we should just fail probe.
> > This can be done without crashing, and is generally a better
> > idea that second-guessing what hypervisor wants us to do.
> >
>
> So we still need something like this patch to detect the wrong
> dependencies. And the features fixing like this is not something new,
> see how net device fix the features through ndo_fix_features().

I think that's different. If Linux refuses to work with a broken
hardware NIC, you cannot do anything about it, and hardware is
mass-produced, so we know many people are affected.
If Linux refuses to work with a misconfigured hypervisor, it's just one
user who misconfigured it, and the hypervisor can be fixed.

So let's not work around bugs, at least unless we know that many people
already use a broken hypervisor and something prevents the vendor from
fixing it.


> >
> >
> >
> > However, assuming that we do want this change:
> > This can be replaced with a table driven design in virtio core, but
> > since you chose to open code it, I would drop table below altogether.
> >
> >
> > Just make it
> > if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
> > virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_RX);
> > virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_VLAN);
> > virtio_disable_feature(dev, VIRTIO_NET_F_GUEST_ANNOUNCE);
> > virtio_disable_feature(dev, VIRTIO_NET_F_MQ);
> > virtio_disable_feature(dev, VIRTIO_NET_F_CTRL_MAC_ADDR);
> > }
> >
>
> This is similar to what I did in v1 and v2. Either are ok for me.
> >
> >
> >> ---
> >> Changes from V1:
> >> - fix the cut-and-paste error
> >> Changes from V2:
> >> - loop through an array of feature bits
> >> - switch to use dev_warn()
> >> ---
> >> drivers/net/virtio_net.c | 26 ++++++++++++++++++++++++++
> >> 1 file changed, 26 insertions(+)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index ec2a8b4..6fadd8c 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -1948,6 +1948,31 @@ static int virtnet_restore(struct virtio_device *vdev)
> >> }
> >> #endif
> >>
> >> +static void virtnet_sanitize_features(struct virtio_device *dev)
> >> +{
> >> + unsigned int features_for_ctrl_vq[] = {
> >> + VIRTIO_NET_F_CTRL_RX,
> >> + VIRTIO_NET_F_CTRL_VLAN,
> >> + VIRTIO_NET_F_GUEST_ANNOUNCE,
> >> + VIRTIO_NET_F_MQ,
> >> + VIRTIO_NET_F_CTRL_MAC_ADDR
> >> + };
> > This is not the only dependency: checksums
> > have dependencies too. See virtio 1.0 spec.
> >
>
> I see ,and this kind of check could be added. But we're really safe
> since qemu handle such cases and won't advertise any offload feature is
> csum is not supported.
> >
> >> + int i;
> >> +
> >> + if (!virtio_has_feature(dev, VIRTIO_NET_F_CTRL_VQ)) {
> >> + for (i = 0; i < ARRAY_SIZE(features_for_ctrl_vq); i++) {
> >> + unsigned int f = features_for_ctrl_vq[i];
> >> + if (virtio_has_feature(dev, f)) {
> >> + virtio_disable_feature(dev, f);
> >> + dev_warn(&dev->dev,
> >> + "buggy hypervisor: disable feature "
> >> + "0x%x since VIRTIO_NET_F_CTRL_VQ was "
> >> + "not advertised.\n", f);
> >> + }
> >> + }
> >> + }
> >> +}
> >> +
> >> static struct virtio_device_id id_table[] = {
> >> { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
> >> { 0 },
> >> @@ -1975,6 +2000,7 @@ static struct virtio_driver virtio_net_driver = {
> >> .probe = virtnet_probe,
> >> .remove = virtnet_remove,
> >> .config_changed = virtnet_config_changed,
> >> + .sanitize_features = virtnet_sanitize_features,
> >> #ifdef CONFIG_PM_SLEEP
> >> .freeze = virtnet_freeze,
> >> .restore = virtnet_restore,
> >> --
> >> 1.9.1

2014-11-18 11:05:08

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On Tue, Nov 18, 2014 at 11:23:49AM +0800, Jason Wang wrote:
> On 11/17/2014 06:11 PM, Michael S. Tsirkin wrote:
> > On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
> >> On Mon, 17 Nov 2014 11:37:01 +0200
> >> "Michael S. Tsirkin" <[email protected]> wrote:
> >>
> >>> On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
> >>>> Buggy host may advertised buggy host features (a usual case is that host
> >>>> advertise a feature whose dependencies were missed). In this case, driver
> >>>> should detect and disable the buggy features by itself.
> >>>>
> >>>> This patch introduces driver specific sanitize_features() method which is
> >>>> called just before features finalizing to detect and disable buggy features
> >>>> advertised by host.
> >>>>
> >>>> Virtio-net will be the first user.
> >>>>
> >>>> Cc: Rusty Russell <[email protected]>
> >>>> Cc: Michael S. Tsirkin <[email protected]>
> >>>> Cc: Cornelia Huck <[email protected]>
> >>>> Cc: Wanlong Gao <[email protected]>
> >>>> Signed-off-by: Jason Wang <[email protected]>
> >>> Hmm this conflicts with virtio 1.0 work: we drop
> >>> features as bitmap there.
> >> But that's an implementation detail, no? We'll still need a way for the
> >> driver to sanitize features, and I think this interface works just fine.
> > Now that you mention it, I don't think we do.
> >
> > The spec is quite explicit that devices must not expose invalid
> > combinations of features.
> >
> > Admittedly, BUG_ON isn't very friendly to hypervisors.
> >
> > But e.g. failing probe seems better than trying to work around
> > hypervisor bugs - otherwise we'll be stuck maintaining compatibility
> > with hypervisors forever.
> >
>
> I'm ok with failing the probe.
>
> But it won't cost big effort to workaround only features dependencies
> issue.

From experience, second-guessing the user always adds maintenance.

> I don't see how this block any further features implementation.
> Looking at virtio-net, it also depends on network core to fix NETIF_F_*
> dependencies.

That code is common for all drivers, so it was moved to core.

> There seems no way to get rid of maintaining compatibility, e.g the
> workarounds for the buggy hypervisor without VIRTIO_F_ANY_LAYOUT support.

Right - because too many hypervisors shipped without it, it's too
much work to fix them all.
No such motivation here, right?

--
MST

2014-11-19 03:00:49

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH V3 1/2] virtio: introduce methods of sanitizing device features

On 11/18/2014 07:04 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2014 at 11:23:49AM +0800, Jason Wang wrote:
>> On 11/17/2014 06:11 PM, Michael S. Tsirkin wrote:
>>> On Mon, Nov 17, 2014 at 10:44:30AM +0100, Cornelia Huck wrote:
>>>> On Mon, 17 Nov 2014 11:37:01 +0200
>>>> "Michael S. Tsirkin" <[email protected]> wrote:
>>>>
>>>>> On Mon, Nov 17, 2014 at 05:17:17PM +0800, Jason Wang wrote:
>>>>>> Buggy host may advertised buggy host features (a usual case is that host
>>>>>> advertise a feature whose dependencies were missed). In this case, driver
>>>>>> should detect and disable the buggy features by itself.
>>>>>>
>>>>>> This patch introduces driver specific sanitize_features() method which is
>>>>>> called just before features finalizing to detect and disable buggy features
>>>>>> advertised by host.
>>>>>>
>>>>>> Virtio-net will be the first user.
>>>>>>
>>>>>> Cc: Rusty Russell <[email protected]>
>>>>>> Cc: Michael S. Tsirkin <[email protected]>
>>>>>> Cc: Cornelia Huck <[email protected]>
>>>>>> Cc: Wanlong Gao <[email protected]>
>>>>>> Signed-off-by: Jason Wang <[email protected]>
>>>>> Hmm this conflicts with virtio 1.0 work: we drop
>>>>> features as bitmap there.
>>>> But that's an implementation detail, no? We'll still need a way for the
>>>> driver to sanitize features, and I think this interface works just fine.
>>> Now that you mention it, I don't think we do.
>>>
>>> The spec is quite explicit that devices must not expose invalid
>>> combinations of features.
>>>
>>> Admittedly, BUG_ON isn't very friendly to hypervisors.
>>>
>>> But e.g. failing probe seems better than trying to work around
>>> hypervisor bugs - otherwise we'll be stuck maintaining compatibility
>>> with hypervisors forever.
>>>
>> I'm ok with failing the probe.
>>
>> But it won't cost big effort to workaround only features dependencies
>> issue.
> From experience, second-guessing the user always adds maintenance.
>
>> I don't see how this block any further features implementation.
>> Looking at virtio-net, it also depends on network core to fix NETIF_F_*
>> dependencies.
> That code is common for all drivers, so it was moved to core.
>
>> There seems no way to get rid of maintaining compatibility, e.g the
>> workarounds for the buggy hypervisor without VIRTIO_F_ANY_LAYOUT support.
> Right - because too many hypervisors shipped without it, it's too
> much work to fix them all.
> No such motivation here, right?
>

Right, I will post a patch that just fails the probe of virtio-net.

Thanks
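
For illustration only, the "fail the probe" approach agreed on above could
look roughly like the user-space sketch below. It is not the patch that was
eventually posted. The helper names has_feature() and check_ctrl_vq_deps()
are hypothetical, and the features are modelled as a plain 64-bit word
rather than the kernel's own representation. The feature bit numbers are
the ones the virtio spec assigns to virtio-net.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers for virtio-net, as listed in the virtio spec. */
#define VIRTIO_NET_F_CTRL_VQ        17
#define VIRTIO_NET_F_CTRL_RX        18
#define VIRTIO_NET_F_CTRL_VLAN      19
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21
#define VIRTIO_NET_F_MQ             22
#define VIRTIO_NET_F_CTRL_MAC_ADDR  23

/* Hypothetical helper: test one bit (0..63) in a 64-bit feature word. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/*
 * Hypothetical validation step. Returns 0 when the advertised features
 * are consistent, or -EINVAL when a feature that needs the control
 * virtqueue is offered without VIRTIO_NET_F_CTRL_VQ. A caller modelled
 * on probe() would return this error instead of clearing the bits.
 */
int check_ctrl_vq_deps(uint64_t features)
{
	static const unsigned int needs_ctrl_vq[] = {
		VIRTIO_NET_F_CTRL_RX,
		VIRTIO_NET_F_CTRL_VLAN,
		VIRTIO_NET_F_GUEST_ANNOUNCE,
		VIRTIO_NET_F_MQ,
		VIRTIO_NET_F_CTRL_MAC_ADDR,
	};
	size_t i;

	/* With the control virtqueue present, nothing can be missing. */
	if (has_feature(features, VIRTIO_NET_F_CTRL_VQ))
		return 0;

	for (i = 0; i < sizeof(needs_ctrl_vq) / sizeof(needs_ctrl_vq[0]); i++)
		if (has_feature(features, needs_ctrl_vq[i]))
			return -EINVAL;

	return 0;
}
```

Compared with sanitize_features(), this keeps the driver from silently
running with a reduced feature set: an inconsistent host is rejected
outright, which is the behaviour argued for earlier in the thread.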