2021-10-11 07:03:40

by Halil Pasic

Subject: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

The virtio specification virtio-v1.1-cs01 states: "Transitional devices
MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
been acknowledged by the driver." This is exactly what QEMU does as of
6.1: it relies solely on VIRTIO_F_VERSION_1 to detect legacy drivers.

However, the specification also says: "... the driver MAY read (but MUST
NOT write) the device-specific configuration fields to check that it can
support the device ..." before setting FEATURES_OK.

If the driver does read config space at that point, any transitional
device relying solely on VIRTIO_F_VERSION_1 for detecting legacy drivers
will return data in legacy format. In particular, this implies that it
is in big endian format for big endian guests. This naturally confuses
the driver, which expects little endian in modern mode.
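
To make the mismatch concrete, here is a minimal userspace sketch (not
kernel code). It assumes a 16-bit config field such as the virtio-net
MTU; the helper names are hypothetical and only model a device that
stores the field in big endian order and a driver that decodes it as
little endian.

```c
#include <stdint.h>

/* Store a 16-bit value most significant byte first, as a device using
 * the legacy format would for a big endian guest. */
static void store_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* Decode a 16-bit field least significant byte first, as a modern
 * driver does. */
static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Value the driver ends up with when the two sides disagree. */
static uint16_t misread16(uint16_t device_value)
{
	uint8_t cfg[2];

	store_be16(cfg, device_value);
	return load_le16(cfg);
}
```

With these helpers, misread16(1500) yields 56325 (0x05dc becomes
0xdc05), so the driver sees a value the device never advertised.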

It is probably a good idea to amend the spec to clarify that
VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
is complete. Before the validate callback existed, config space was only
read after FEATURES_OK. However, we already have two regressions, so
let's address this here as well.

The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
with DASD backing, because things simply don't work with the default.
See Fixes tags for relevant commits.
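
As an illustration of the virtio-blk case, the following userspace
sketch assumes that the validation is a range check which ignores a
block size outside [512, 4096] and falls back to the 512-byte default.
The bounds and helper names are assumptions made for this example, not
copied from the driver.

```c
#include <stdint.h>

#define DEFAULT_BLK_SIZE 512u	/* assumed fallback when the field is ignored */
#define MAX_BLK_SIZE     4096u	/* assumed upper bound (4 KiB pages) */

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Assumed shape of the check: out-of-range sizes are dropped and the
 * default applies instead. */
static uint32_t effective_blk_size(uint32_t cfg_value)
{
	if (cfg_value < DEFAULT_BLK_SIZE || cfg_value > MAX_BLK_SIZE)
		return DEFAULT_BLK_SIZE;
	return cfg_value;
}
```

Under these assumptions a device reporting 4096 is read as
swap32(4096) = 1048576, which fails the range check, so the driver
falls back to 512 even though the backing device needs 4096.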

For QEMU, we can work around the issue by writing out the feature bits
with the VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
config op for this. This isn't enough to address all vhost devices, since
these do not get the features until FEATURES_OK; however, it looks like
the affected devices never actually handled the endianness for legacy
mode correctly, so at least that's not a regression.
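
The condition under which the early write-back happens can be modelled
in userspace as below. This is only a sketch of the check in the diff:
the function and its parameters are hypothetical stand-ins for
drv->validate, virtio_legacy_is_little_endian() and device_features.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_VERSION_1 32	/* feature bit number from the virtio spec */

/*
 * Return true when F_VERSION_1 should be written back before validate:
 * the driver has a validate callback (so it may read config space
 * early), the legacy format is not little endian, and the device
 * offers VIRTIO_F_VERSION_1.
 */
static bool needs_early_version_1(bool drv_has_validate,
				  bool legacy_is_little_endian,
				  uint64_t device_features)
{
	return drv_has_validate && !legacy_is_little_endian &&
	       (device_features & (1ULL << VIRTIO_F_VERSION_1)) != 0;
}
```

Little endian guests and drivers without a validate callback are left
alone, which keeps the workaround confined to the configurations that
regressed.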

No devices except virtio-net and virtio-blk seem to be affected.

Long term, the right thing to do is to fix the hypervisors.

Cc: <[email protected]> #v4.11
Signed-off-by: Halil Pasic <[email protected]>
Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
Reported-by: [email protected]
Reviewed-by: Cornelia Huck <[email protected]>
---

@Connie: I made some more commit message changes to accommodate Michael's
requests. I just assumed these will work for you as well and kept your
r-b. Please shout at me if it needs to be dropped :)
---
 drivers/virtio/virtio.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 0a5b54034d4b..236081afe9a2 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -239,6 +239,17 @@ static int virtio_dev_probe(struct device *_d)
 		driver_features_legacy = driver_features;
 	}

+	/*
+	 * Some devices detect legacy solely via F_VERSION_1. Write
+	 * F_VERSION_1 to force LE config space accesses before FEATURES_OK for
+	 * these when needed.
+	 */
+	if (drv->validate && !virtio_legacy_is_little_endian()
+			  && device_features & BIT_ULL(VIRTIO_F_VERSION_1)) {
+		dev->features = BIT_ULL(VIRTIO_F_VERSION_1);
+		dev->config->finalize_features(dev);
+	}
+
 	if (device_features & (1ULL << VIRTIO_F_VERSION_1))
 		dev->features = driver_features & device_features;
 	else

base-commit: 60a9483534ed0d99090a2ee1d4bb0b8179195f51
--
2.25.1


2021-10-11 13:17:04

by Cornelia Huck

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Mon, Oct 11 2021, Halil Pasic <[email protected]> wrote:

> The virtio specification virtio-v1.1-cs01 states: "Transitional devices
> MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
> been acknowledged by the driver." This is exactly what QEMU as of 6.1
> has done relying solely on VIRTIO_F_VERSION_1 for detecting that.
>
> However, the specification also says: "... the driver MAY read (but MUST
> NOT write) the device-specific configuration fields to check that it can
> support the device ..." before setting FEATURES_OK.
>
> In that case, any transitional device relying solely on
> VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in
> legacy format. In particular, this implies that it is in big endian
> format for big endian guests. This naturally confuses the driver which
> expects little endian in the modern mode.
>
> It is probably a good idea to amend the spec to clarify that
> VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
> is complete. Before validate callback existed, config space was only
> read after FEATURES_OK. However, we already have two regressions, so
> let's address this here as well.
>
> The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
> the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
> virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
> with DASD backing, because things simply don't work with the default.
> See Fixes tags for relevant commits.
>
> For QEMU, we can work around the issue by writing out the feature bits
> with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
> config op for this. This isn't enough to address all vhost devices since
> these do not get the features until FEATURES_OK, however it looks like
> the affected devices actually never handled the endianness for legacy
> mode correctly, so at least that's not a regression.
>
> No devices except virtio net and virtio blk seem to be affected.
>
> Long term the right thing to do is to fix the hypervisors.
>
> Cc: <[email protected]> #v4.11
> Signed-off-by: Halil Pasic <[email protected]>
> Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
> Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
> Reported-by: [email protected]
> Reviewed-by: Cornelia Huck <[email protected]>
> ---
>
> @Connie: I made some more commit message changes to accommodate Michael's
> requests. I just assumed these will work for you as well and kept your
> r-b. Please shout at me if it needs to be dropped :)

No need to shout, still looks good to me :)


2021-10-13 10:12:33

by Michael S. Tsirkin

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Mon, Oct 11, 2021 at 07:39:21AM +0200, Halil Pasic wrote:
> The virtio specification virtio-v1.1-cs01 states: "Transitional devices
> MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
> been acknowledged by the driver." This is exactly what QEMU as of 6.1
> has done relying solely on VIRTIO_F_VERSION_1 for detecting that.
>
> However, the specification also says: "... the driver MAY read (but MUST
> NOT write) the device-specific configuration fields to check that it can
> support the device ..." before setting FEATURES_OK.
>
> In that case, any transitional device relying solely on
> VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in
> legacy format. In particular, this implies that it is in big endian
> format for big endian guests. This naturally confuses the driver which
> expects little endian in the modern mode.
>
> It is probably a good idea to amend the spec to clarify that
> VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
> is complete. Before validate callback existed, config space was only
> read after FEATURES_OK. However, we already have two regressions, so
> let's address this here as well.
>
> The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
> the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
> virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
> with DASD backing, because things simply don't work with the default.
> See Fixes tags for relevant commits.
>
> For QEMU, we can work around the issue by writing out the feature bits
> with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
> config op for this. This isn't enough to address all vhost devices since
> these do not get the features until FEATURES_OK, however it looks like
> the affected devices actually never handled the endianness for legacy
> mode correctly, so at least that's not a regression.
>
> No devices except virtio net and virtio blk seem to be affected.
>
> Long term the right thing to do is to fix the hypervisors.
>
> Cc: <[email protected]> #v4.11
> Signed-off-by: Halil Pasic <[email protected]>
> Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
> Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
> Reported-by: [email protected]
> Reviewed-by: Cornelia Huck <[email protected]>

OK this looks good! How about a QEMU patch to make it spec compliant on
BE?


2021-10-13 11:27:37

by Christian Borntraeger

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate



Am 13.10.21 um 12:10 schrieb Michael S. Tsirkin:
> On Mon, Oct 11, 2021 at 07:39:21AM +0200, Halil Pasic wrote:
>> The virtio specification virtio-v1.1-cs01 states: "Transitional devices
>> MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
>> been acknowledged by the driver." This is exactly what QEMU as of 6.1
>> has done relying solely on VIRTIO_F_VERSION_1 for detecting that.
>>
>> However, the specification also says: "... the driver MAY read (but MUST
>> NOT write) the device-specific configuration fields to check that it can
>> support the device ..." before setting FEATURES_OK.
>>
>> In that case, any transitional device relying solely on
>> VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in
>> legacy format. In particular, this implies that it is in big endian
>> format for big endian guests. This naturally confuses the driver which
>> expects little endian in the modern mode.
>>
>> It is probably a good idea to amend the spec to clarify that
>> VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
>> is complete. Before validate callback existed, config space was only
>> read after FEATURES_OK. However, we already have two regressions, so
>> let's address this here as well.
>>
>> The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
>> the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
>> virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
>> with DASD backing, because things simply don't work with the default.
>> See Fixes tags for relevant commits.
>>
>> For QEMU, we can work around the issue by writing out the feature bits
>> with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
>> config op for this. This isn't enough to address all vhost devices since
>> these do not get the features until FEATURES_OK, however it looks like
>> the affected devices actually never handled the endianness for legacy
>> mode correctly, so at least that's not a regression.
>>
>> No devices except virtio net and virtio blk seem to be affected.
>>
>> Long term the right thing to do is to fix the hypervisors.
>>
>> Cc: <[email protected]> #v4.11
>> Signed-off-by: Halil Pasic <[email protected]>
>> Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
>> Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
>> Reported-by: [email protected]
>> Reviewed-by: Cornelia Huck <[email protected]>
>
> OK this looks good! How about a QEMU patch to make it spec compliant on
> BE?

Who is going to do that? Halil? you? Conny?

Can we get this kernel patch queued for 5.15 and stable without waiting for the QEMU patch
as we have a regression with 4.14?

2021-10-13 12:26:33

by Michael S. Tsirkin

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Wed, Oct 13, 2021 at 01:23:50PM +0200, Christian Borntraeger wrote:
>
>
> Am 13.10.21 um 12:10 schrieb Michael S. Tsirkin:
> > On Mon, Oct 11, 2021 at 07:39:21AM +0200, Halil Pasic wrote:
> > > The virtio specification virtio-v1.1-cs01 states: "Transitional devices
> > > MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
> > > been acknowledged by the driver." This is exactly what QEMU as of 6.1
> > > has done relying solely on VIRTIO_F_VERSION_1 for detecting that.
> > >
> > > However, the specification also says: "... the driver MAY read (but MUST
> > > NOT write) the device-specific configuration fields to check that it can
> > > support the device ..." before setting FEATURES_OK.
> > >
> > > In that case, any transitional device relying solely on
> > > VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in
> > > legacy format. In particular, this implies that it is in big endian
> > > format for big endian guests. This naturally confuses the driver which
> > > expects little endian in the modern mode.
> > >
> > > It is probably a good idea to amend the spec to clarify that
> > > VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
> > > is complete. Before validate callback existed, config space was only
> > > read after FEATURES_OK. However, we already have two regressions, so
> > > let's address this here as well.
> > >
> > > The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
> > > the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
> > > virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
> > > with DASD backing, because things simply don't work with the default.
> > > See Fixes tags for relevant commits.
> > >
> > > For QEMU, we can work around the issue by writing out the feature bits
> > > with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
> > > config op for this. This isn't enough to address all vhost devices since
> > > these do not get the features until FEATURES_OK, however it looks like
> > > the affected devices actually never handled the endianness for legacy
> > > mode correctly, so at least that's not a regression.
> > >
> > > No devices except virtio net and virtio blk seem to be affected.
> > >
> > > Long term the right thing to do is to fix the hypervisors.
> > >
> > > Cc: <[email protected]> #v4.11
> > > Signed-off-by: Halil Pasic <[email protected]>
> > > Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
> > > Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
> > > Reported-by: [email protected]
> > > Reviewed-by: Cornelia Huck <[email protected]>
> >
> > OK this looks good! How about a QEMU patch to make it spec compliant on
> > BE?
>
> Who is going to do that? Halil? you? Conny?

Halil said he'll do it... Right, Halil?

> Can we get this kernel patch queued for 5.15 and stable without waiting for the QEMU patch
> as we have a regression with 4.14?

Probably. Still trying to decide between this and plain revert for 5.15
and back. Maybe both?


2021-10-13 12:48:11

by Halil Pasic

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Wed, 13 Oct 2021 08:24:53 -0400
"Michael S. Tsirkin" <[email protected]> wrote:

> > > OK this looks good! How about a QEMU patch to make it spec compliant on
> > > BE?
> >
> > Who is going to do that? Halil? you? Conny?
>
> Halil said he'll do it... Right, Halil?

I can do it but not right away. Maybe in a couple of weeks. I have some
other bugs to hunt down, before proceeding to this. If somebody else
wants to do it, I'm fine with that as well.

Regards,
Halil

2021-10-13 12:49:31

by Michael S. Tsirkin

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Wed, Oct 13, 2021 at 02:44:08PM +0200, Halil Pasic wrote:
> On Wed, 13 Oct 2021 08:24:53 -0400
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > > > OK this looks good! How about a QEMU patch to make it spec compliant on
> > > > BE?
> > >
> > > Who is going to do that? Halil? you? Conny?
> >
> > Halil said he'll do it... Right, Halil?
>
> I can do it but not right away. Maybe in a couple of weeks. I have some
> other bugs to hunt down, before proceeding to this. If somebody else
> wants to do it, I'm fine with that as well.
>
> Regards,
> Halil

Couple of weeks is ok I think.

--
MST

2021-10-13 12:55:20

by Cornelia Huck

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Wed, Oct 13 2021, "Michael S. Tsirkin" <[email protected]> wrote:

> On Wed, Oct 13, 2021 at 01:23:50PM +0200, Christian Borntraeger wrote:
>> Can we get this kernel patch queued for 5.15 and stable without waiting for the QEMU patch
>> as we have a regression with 4.14?
>
> Probably. Still trying to decide between this and plain revert for 5.15
> and back. Maybe both?

Probably better queue this one, in case we have some undiscovered
problems with the config space access in virtio-net?

2021-10-13 12:56:44

by Michael S. Tsirkin

Subject: Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate

On Wed, Oct 13, 2021 at 02:52:38PM +0200, Cornelia Huck wrote:
> On Wed, Oct 13 2021, "Michael S. Tsirkin" <[email protected]> wrote:
>
> > On Wed, Oct 13, 2021 at 01:23:50PM +0200, Christian Borntraeger wrote:
> >> Can we get this kernel patch queued for 5.15 and stable without waiting for the QEMU patch
> >> as we have a regression with 4.14?
> >
> > Probably. Still trying to decide between this and plain revert for 5.15
> > and back. Maybe both?
>
> Probably better queue this one, in case we have some undiscovered
> problems with the config space access in virtio-net?

So both then. I think you are right. Pushed out to -next. Will do a pull
towards end of the week.

--
MST