2020-02-20 16:10:13

by Halil Pasic

Subject: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

Currently, if one intends to run a memory-protection-enabled VM with
virtio devices and Linux as the guest OS, one needs to specify the
VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
Linux use the DMA API, which in turn handles the memory
encryption/protection if the guest decides to turn itself into
a protected one. This, however, makes no sense for multiple reasons:
* The device is not changed by the fact that the guest RAM is
protected. The so-called IOMMU bypass quirk is not affected.
* This usage is not congruent with the standardised semantics of
VIRTIO_F_IOMMU_PLATFORM. Guest memory protection is an orthogonal reason
for using the DMA API in virtio (orthogonal with respect to what is
expressed by VIRTIO_F_IOMMU_PLATFORM).

This series aims to decouple 'have to use DMA API because my (guest) RAM
is protected' and 'have to use DMA API because the device told me
VIRTIO_F_IOMMU_PLATFORM'.

Please find more detailed explanations about the conceptual aspects in
the individual patches. There is however also a very practical problem
that is addressed by this series.

For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
effect: the vhost code assumes that the addresses on the virtio descriptor
ring are not guest physical addresses but IOVAs, and insists on
translating these regardless of which transport is used (e.g. whether
we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
"vhost: new device IOTLB API".) On s390 this results in severe
performance degradation (roughly a factor of 10). BTW, with CCW I/O there is
(architecturally) no IOMMU, so the whole address translation makes no
sense in the context of virtio-ccw.

Halil Pasic (2):
mm: move force_dma_unencrypted() to mem_encrypt.h
virtio: let virtio use DMA API when guest RAM is protected

drivers/virtio/virtio_ring.c | 3 +++
include/linux/dma-direct.h | 9 ---------
include/linux/mem_encrypt.h | 10 ++++++++++
3 files changed, 13 insertions(+), 9 deletions(-)


base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
--
2.17.1


2020-02-20 20:49:47

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> Currently if one intends to run a memory protection enabled VM with
> virtio devices and linux as the guest OS, one needs to specify the
> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> linux use the DMA API, which in turn handles the memory
> encryption/protection stuff if the guest decides to turn itself into
> a protected one. This however makes no sense due to multiple reasons:
> * The device is not changed by the fact that the guest RAM is
> protected. The so called IOMMU bypass quirk is not affected.
> * This usage is not congruent with standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> for using DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).
>
> This series aims to decouple 'have to use DMA API because my (guest) RAM
> is protected' and 'have to use DMA API because the device told me
> VIRTIO_F_IOMMU_PLATFORM'.
>
> Please find more detailed explanations about the conceptual aspects in
> the individual patches. There is however also a very practical problem
> that is addressed by this series.
>
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect The vhost code assumes it the addresses on the virtio descriptor
> ring are not guest physical addresses but iova's, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (c.a. factor 10). BTW with ccw I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

That's just a QEMU thing. It uses the same flag for VIRTIO_F_ACCESS_PLATFORM
and the vhost IOTLB. QEMU can separate them; no need to change Linux.


> Halil Pasic (2):
> mm: move force_dma_unencrypted() to mem_encrypt.h
> virtio: let virtio use DMA API when guest RAM is protected
>
> drivers/virtio/virtio_ring.c | 3 +++
> include/linux/dma-direct.h | 9 ---------
> include/linux/mem_encrypt.h | 10 ++++++++++
> 3 files changed, 13 insertions(+), 9 deletions(-)
>
>
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> --
> 2.17.1

2020-02-20 21:30:24

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> * This usage is not congruent with standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> for using DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).

Quoting the spec:

\item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that
the device can be used on a platform where device access to data
in memory is limited and/or translated. E.g. this is the case if the device can be located
behind an IOMMU that translates bus addresses from the device into physical
addresses in memory, if the device can be limited to only access
certain memory addresses or if special commands such as
a cache flush can be needed to synchronise data in memory with
the device. Whether accesses are actually limited or translated
is described by platform-specific means.
If this feature bit is set to 0, then the device
has same access to memory addresses supplied to it as the
driver has.
In particular, the device will always use physical addresses
matching addresses used by the driver (typically meaning
physical addresses used by the CPU)
and not translated further, and can access any address supplied to it by
the driver. When clear, this overrides any platform-specific description of
whether device access is limited or translated in any way, e.g.
whether an IOMMU may be present.

Since the device can't access encrypted memory,
this seems to match your case reasonably well.

--
MST

2020-02-20 21:34:47

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect The vhost code assumes it the addresses on the virtio descriptor
> ring are not guest physical addresses but iova's, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (c.a. factor 10). BTW with ccw I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.

So it sounds like a host issue: the emulation of s390 is unnecessarily
complicated. Working around it in the guest looks wrong ...

> Halil Pasic (2):
> mm: move force_dma_unencrypted() to mem_encrypt.h
> virtio: let virtio use DMA API when guest RAM is protected
>
> drivers/virtio/virtio_ring.c | 3 +++
> include/linux/dma-direct.h | 9 ---------
> include/linux/mem_encrypt.h | 10 ++++++++++
> 3 files changed, 13 insertions(+), 9 deletions(-)
>
>
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> --
> 2.17.1

2020-02-21 06:23:08

by Jason Wang

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM


On 2020/2/21 上午12:06, Halil Pasic wrote:
> Currently if one intends to run a memory protection enabled VM with
> virtio devices and linux as the guest OS, one needs to specify the
> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> linux use the DMA API, which in turn handles the memory
> encryption/protection stuff if the guest decides to turn itself into
> a protected one. This however makes no sense due to multiple reasons:
> * The device is not changed by the fact that the guest RAM is
> protected. The so called IOMMU bypass quirk is not affected.
> * This usage is not congruent with standardised semantics of
> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> for using DMA API in virtio (orthogonal with respect to what is
> expressed by VIRTIO_F_IOMMU_PLATFORM).
>
> This series aims to decouple 'have to use DMA API because my (guest) RAM
> is protected' and 'have to use DMA API because the device told me
> VIRTIO_F_IOMMU_PLATFORM'.
>
> Please find more detailed explanations about the conceptual aspects in
> the individual patches. There is however also a very practical problem
> that is addressed by this series.
>
> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> effect The vhost code assumes it the addresses on the virtio descriptor
> ring are not guest physical addresses but iova's, and insists on doing a
> translation of these regardless of what transport is used (e.g. whether
> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> "vhost: new device IOTLB API".) On s390 this results in severe
> performance degradation (c.a. factor 10).


Do you see a consistent degradation in performance, or does it only
happen during the beginning of the test?


> BTW with ccw I/O there is
> (architecturally) no IOMMU, so the whole address translation makes no
> sense in the context of virtio-ccw.


I suspect we can optimize this on the QEMU side.

E.g. send memtable entries via the IOTLB API when the vIOMMU is not enabled.

If this makes sense, I can draft a patch to see if there's any difference.

Thanks


>
> Halil Pasic (2):
> mm: move force_dma_unencrypted() to mem_encrypt.h
> virtio: let virtio use DMA API when guest RAM is protected
>
> drivers/virtio/virtio_ring.c | 3 +++
> include/linux/dma-direct.h | 9 ---------
> include/linux/mem_encrypt.h | 10 ++++++++++
> 3 files changed, 13 insertions(+), 9 deletions(-)
>
>
> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2

2020-02-21 13:39:31

by Halil Pasic

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, 20 Feb 2020 16:29:50 -0500
"Michael S. Tsirkin" <[email protected]> wrote:

> On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> > * This usage is not congruent with standardised semantics of
> > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > for using DMA API in virtio (orthogonal with respect to what is
> > expressed by VIRTIO_F_IOMMU_PLATFORM).
>
> Quoting the spec:
>
> \item[VIRTIO_F_ACCESS_PLATFORM(33)] This feature indicates that
> the device can be used on a platform where device access to data
> in memory is limited and/or translated. E.g. this is the case if the device can be located
> behind an IOMMU that translates bus addresses from the device into physical
> addresses in memory, if the device can be limited to only access
> certain memory addresses or if special commands such as
> a cache flush can be needed to synchronise data in memory with
> the device. Whether accesses are actually limited or translated
> is described by platform-specific means.
> If this feature bit is set to 0, then the device
> has same access to memory addresses supplied to it as the
> driver has.
> In particular, the device will always use physical addresses
> matching addresses used by the driver (typically meaning
> physical addresses used by the CPU)
> and not translated further, and can access any address supplied to it by
> the driver. When clear, this overrides any platform-specific description of
> whether device access is limited or translated in any way, e.g.
> whether an IOMMU may be present.
>
> since device can't access encrypted memory,
> this seems to match your case reasonably well.
>

As David already explained, the device does not have to access encrypted
memory. In fact, we don't have memory encryption but memory protection on
s390. All the device *should* ever see is non-protected memory (memory that
was previously shared by the guest).

Our protected guests start as non-protected ones, and may or may not
turn protected during their life-span. From the device's perspective,
really, nothing changes. I believe David explained this much better than
I did.

Regards,
Halil

2020-02-21 13:50:23

by Halil Pasic

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, 20 Feb 2020 16:33:35 -0500
"Michael S. Tsirkin" <[email protected]> wrote:

> On Thu, Feb 20, 2020 at 05:06:04PM +0100, Halil Pasic wrote:
> > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > effect The vhost code assumes it the addresses on the virtio descriptor
> > ring are not guest physical addresses but iova's, and insists on doing a
> > translation of these regardless of what transport is used (e.g. whether
> > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > "vhost: new device IOTLB API".) On s390 this results in severe
> > performance degradation (c.a. factor 10). BTW with ccw I/O there is
> > (architecturally) no IOMMU, so the whole address translation makes no
> > sense in the context of virtio-ccw.
>
> So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> Working around it by the guest looks wrong ...

While I do think that forcing IOMMU_PLATFORM on devices that do not
want or need it, just to trigger DMA API usage in the guest, is conceptually
wrong, I do agree that we might have a host issue. Namely, unlike PCI,
CCW does not have an IOMMU, and the spec clearly states that "Whether
accesses are actually limited or translated is described by
platform-specific means." With CCW devices we don't have address translation
by an IOMMU, and in that sense vhost is probably wrong to attempt
the translation. That is why I mentioned the patch, and that the translation
is done regardless of the transport/platform.

Regards,
Halil


>
> > Halil Pasic (2):
> > mm: move force_dma_unencrypted() to mem_encrypt.h
> > virtio: let virtio use DMA API when guest RAM is protected
> >
> > drivers/virtio/virtio_ring.c | 3 +++
> > include/linux/dma-direct.h | 9 ---------
> > include/linux/mem_encrypt.h | 10 ++++++++++
> > 3 files changed, 13 insertions(+), 9 deletions(-)
> >
> >
> > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > --
> > 2.17.1
>

2020-02-21 14:57:36

by Halil Pasic

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Fri, 21 Feb 2020 14:22:26 +0800
Jason Wang <[email protected]> wrote:

>
> On 2020/2/21 上午12:06, Halil Pasic wrote:
> > Currently if one intends to run a memory protection enabled VM with
> > virtio devices and linux as the guest OS, one needs to specify the
> > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > linux use the DMA API, which in turn handles the memory
> > encryption/protection stuff if the guest decides to turn itself into
> > a protected one. This however makes no sense due to multiple reasons:
> > * The device is not changed by the fact that the guest RAM is
> > protected. The so called IOMMU bypass quirk is not affected.
> > * This usage is not congruent with standardised semantics of
> > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > for using DMA API in virtio (orthogonal with respect to what is
> > expressed by VIRTIO_F_IOMMU_PLATFORM).
> >
> > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > is protected' and 'have to use DMA API because the device told me
> > VIRTIO_F_IOMMU_PLATFORM'.
> >
> > Please find more detailed explanations about the conceptual aspects in
> > the individual patches. There is however also a very practical problem
> > that is addressed by this series.
> >
> > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > effect The vhost code assumes it the addresses on the virtio descriptor
> > ring are not guest physical addresses but iova's, and insists on doing a
> > translation of these regardless of what transport is used (e.g. whether
> > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > "vhost: new device IOTLB API".) On s390 this results in severe
> > performance degradation (c.a. factor 10).
>
>
> Do you see a consistent degradation on the performance, or it only
> happen when for during the beginning of the test?
>

AFAIK the degradation is consistent.

>
> > BTW with ccw I/O there is
> > (architecturally) no IOMMU, so the whole address translation makes no
> > sense in the context of virtio-ccw.
>
>
> I suspect we can do optimization in qemu side.
>
> E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
>
> If this makes sense, I can draft patch to see if there's any difference.

Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
then necessary translation) for virtio-ccw altogether. But Michael
voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
that could be used with guests running in protected mode. I don't share
his opinion, but that's an ongoing discussion.

Should we end up having to do translation from IOVA in vhost, we are
very interested in that translation being fast and efficient.

In that sense we would be very happy to test any optimization that aims
in that direction.

Thank you very much for your input!

Regards,
Halil

>
> Thanks
>
>
> >
> > Halil Pasic (2):
> > mm: move force_dma_unencrypted() to mem_encrypt.h
> > virtio: let virtio use DMA API when guest RAM is protected
> >
> > drivers/virtio/virtio_ring.c | 3 +++
> > include/linux/dma-direct.h | 9 ---------
> > include/linux/mem_encrypt.h | 10 ++++++++++
> > 3 files changed, 13 insertions(+), 9 deletions(-)
> >
> >
> > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
>

2020-02-21 16:43:15

by Christoph Hellwig

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote:
> So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> Working around it by the guest looks wrong ...

Yes. If your host (and I don't care if you split hypervisor, ultravisor
and megavisor out in your implementation) wants to support a VM
architecture where the host can't access all guest memory you need to
ensure the DMA API is used. Extra points for simply always setting the
flag and thus future-proofing the scheme.

2020-02-24 04:04:57

by Jason Wang

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM


On 2020/2/21 下午10:56, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:22:26 +0800
> Jason Wang <[email protected]> wrote:
>
>> On 2020/2/21 上午12:06, Halil Pasic wrote:
>>> Currently if one intends to run a memory protection enabled VM with
>>> virtio devices and linux as the guest OS, one needs to specify the
>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>> linux use the DMA API, which in turn handles the memory
>>> encryption/protection stuff if the guest decides to turn itself into
>>> a protected one. This however makes no sense due to multiple reasons:
>>> * The device is not changed by the fact that the guest RAM is
>>> protected. The so called IOMMU bypass quirk is not affected.
>>> * This usage is not congruent with standardised semantics of
>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>> for using DMA API in virtio (orthogonal with respect to what is
>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>
>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>> is protected' and 'have to use DMA API because the device told me
>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>
>>> Please find more detailed explanations about the conceptual aspects in
>>> the individual patches. There is however also a very practical problem
>>> that is addressed by this series.
>>>
>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>> effect The vhost code assumes it the addresses on the virtio descriptor
>>> ring are not guest physical addresses but iova's, and insists on doing a
>>> translation of these regardless of what transport is used (e.g. whether
>>> we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
>>> "vhost: new device IOTLB API".) On s390 this results in severe
>>> performance degradation (c.a. factor 10).
>>
>> Do you see a consistent degradation on the performance, or it only
>> happen when for during the beginning of the test?
>>
> AFAIK the degradation is consistent.
>
>>> BTW with ccw I/O there is
>>> (architecturally) no IOMMU, so the whole address translation makes no
>>> sense in the context of virtio-ccw.
>>
>> I suspect we can do optimization in qemu side.
>>
>> E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
>>
>> If this makes sense, I can draft patch to see if there's any difference.
> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> then necessary translation) for virtio-ccw altogether. But Michael
> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> that could be used with guests running in protected mode. I don't share
> his opinion, but that's an ongoing discussion.
>
> Should we end up having to do translation from IOVA in vhost, we are
> very interested in that translation being fast and efficient.
>
> In that sense we would be very happy to test any optimization that aim
> into that direction.
>
> Thank you very much for your input!


Using the IOTLB API on a platform without IOMMU support is not intended.
Please try the attached patch to see if it helps.

Thanks


>
> Regards,
> Halil
>
>> Thanks
>>
>>
>>> Halil Pasic (2):
>>> mm: move force_dma_unencrypted() to mem_encrypt.h
>>> virtio: let virtio use DMA API when guest RAM is protected
>>>
>>> drivers/virtio/virtio_ring.c | 3 +++
>>> include/linux/dma-direct.h | 9 ---------
>>> include/linux/mem_encrypt.h | 10 ++++++++++
>>> 3 files changed, 13 insertions(+), 9 deletions(-)
>>>
>>>
>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2


Attachments:
0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch (1.71 kB)

2020-02-24 06:07:15

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
>
> On 2020/2/21 下午10:56, Halil Pasic wrote:
> > On Fri, 21 Feb 2020 14:22:26 +0800
> > Jason Wang <[email protected]> wrote:
> >
> > > On 2020/2/21 上午12:06, Halil Pasic wrote:
> > > > Currently if one intends to run a memory protection enabled VM with
> > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > linux use the DMA API, which in turn handles the memory
> > > > encryption/protection stuff if the guest decides to turn itself into
> > > > a protected one. This however makes no sense due to multiple reasons:
> > > > * The device is not changed by the fact that the guest RAM is
> > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > * This usage is not congruent with standardised semantics of
> > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > >
> > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > is protected' and 'have to use DMA API because the device told me
> > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > >
> > > > Please find more detailed explanations about the conceptual aspects in
> > > > the individual patches. There is however also a very practical problem
> > > > that is addressed by this series.
> > > >
> > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > effect The vhost code assumes it the addresses on the virtio descriptor
> > > > ring are not guest physical addresses but iova's, and insists on doing a
> > > > translation of these regardless of what transport is used (e.g. whether
> > > > we emulate a PCI or a CCW device). (For details see commit 6b1e6cc7855b
> > > > "vhost: new device IOTLB API".) On s390 this results in severe
> > > > performance degradation (c.a. factor 10).
> > >
> > > Do you see a consistent degradation on the performance, or it only
> > > happen when for during the beginning of the test?
> > >
> > AFAIK the degradation is consistent.
> >
> > > > BTW with ccw I/O there is
> > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > sense in the context of virtio-ccw.
> > >
> > > I suspect we can do optimization in qemu side.
> > >
> > > E.g send memtable entry via IOTLB API when vIOMMU is not enabled.
> > >
> > > If this makes sense, I can draft patch to see if there's any difference.
> > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > then necessary translation) for virtio-ccw altogether. But Michael
> > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > that could be used with guests running in protected mode. I don't share
> > his opinion, but that's an ongoing discussion.
> >
> > Should we end up having to do translation from IOVA in vhost, we are
> > very interested in that translation being fast and efficient.
> >
> > In that sense we would be very happy to test any optimization that aim
> > into that direction.
> >
> > Thank you very much for your input!
>
>
> Using IOTLB API on platform without IOMMU support is not intended. Please
> try the attached patch to see if it helps.
>
> Thanks
>
>
> >
> > Regards,
> > Halil
> >
> > > Thanks
> > >
> > >
> > > > Halil Pasic (2):
> > > > mm: move force_dma_unencrypted() to mem_encrypt.h
> > > > virtio: let virtio use DMA API when guest RAM is protected
> > > >
> > > > drivers/virtio/virtio_ring.c | 3 +++
> > > > include/linux/dma-direct.h | 9 ---------
> > > > include/linux/mem_encrypt.h | 10 ++++++++++
> > > > 3 files changed, 13 insertions(+), 9 deletions(-)
> > > >
> > > >
> > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2

> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> From: Jason Wang <[email protected]>
> Date: Mon, 24 Feb 2020 12:00:10 +0800
> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>
> When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
> even if the device and vhost claim to support it. This helps to
> avoid the performance overhead caused by unnecessary IOTLB miss/update
> transactions on such platforms.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> hw/virtio/virtio-bus.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6332d45c3..2741b9fdd2 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> VirtioBusState *bus = VIRTIO_BUS(qbus);
> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> Error *local_err = NULL;
>
> DPRINTF("%s: plug device.\n", qbus->name);
> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> return;
> }
>
> - if (klass->get_dma_as != NULL && has_iommu) {
> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> + if (false && klass->get_dma_as != NULL &&
> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> vdev->dma_as = klass->get_dma_as(qbus->parent);
> } else {
> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> vdev->dma_as = &address_space_memory;
> }
> }


This seems to clear it unconditionally. I guess it's just a debugging
patch, the real one will come later?

> --
> 2.19.1
>

2020-02-24 06:44:57

by David Gibson

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Fri, Feb 21, 2020 at 05:41:51PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 04:33:35PM -0500, Michael S. Tsirkin wrote:
> > So it sounds like a host issue: the emulation of s390 unnecessarily complicated.
> > Working around it by the guest looks wrong ...
>
> Yes. If your host (and I don't care if you split hypervisor,
> ultravisor and megavisor out in your implementation) wants to
> support a VM architecture where the host can't access all guest
> memory you need to ensure the DMA API is used. Extra points for
> simply always setting the flag and thus future proofing the scheme.

Moving towards F_ACCESS_PLATFORM everywhere is a good idea (for other
reasons), but that doesn't make the problem as it exists right now go
away.

But "you need to ensure the DMA API is used" makes no sense from the
host's point of view. The DMA API is an entirely guest-side,
Linux-specific detail; the host can't make decisions based
on that.

For POWER - possibly s390 as well - the hypervisor has no way of
knowing at machine construction time whether it will be an old kernel
(or non Linux OS) which can't support F_ACCESS_PLATFORM, or a guest
which will enter secure mode and therefore requires F_ACCESS_PLATFORM
(according to you). That's the fundamental problem here.

The normal virtio model of features that the guest can optionally
accept would work nicely here - except that that wouldn't work for the
case of hardware virtio devices, where the access limitations come
from "host" (platform) side and therefore can't be disabled by that
host.

We really do have two cases here: 1) access restrictions originating
with the host/platform (e.g. hardware virtio) and 2) access
restrictions originating with the guest (e.g. secure VMs). What we
need to do to deal with them is basically the same at the driver
level, but it has subtle and important differences at the platform
level.

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



2020-02-24 06:45:21

by David Gibson

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Fri, Feb 21, 2020 at 03:56:02PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 14:22:26 +0800
> Jason Wang <[email protected]> wrote:
>
> >
> > On 2020/2/21 上午12:06, Halil Pasic wrote:
> > > Currently if one intends to run a memory protection enabled VM with
> > > virtio devices and linux as the guest OS, one needs to specify the
> > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > linux use the DMA API, which in turn handles the memory
> > > encryption/protection stuff if the guest decides to turn itself into
> > > a protected one. This however makes no sense due to multiple reasons:
> > > * The device is not changed by the fact that the guest RAM is
> > > protected. The so called IOMMU bypass quirk is not affected.
> > > * This usage is not congruent with standardised semantics of
> > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > for using DMA API in virtio (orthogonal with respect to what is
> > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > >
> > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > is protected' and 'have to use DMA API because the device told me
> > > VIRTIO_F_IOMMU_PLATFORM'.
> > >
> > > Please find more detailed explanations about the conceptual aspects in
> > > the individual patches. There is however also a very practical problem
> > > that is addressed by this series.
> > >
> > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > effect: the vhost code assumes that the addresses on the virtio descriptor
> > > ring are not guest physical addresses but IOVAs, and insists on translating
> > > them regardless of which transport is used (e.g. whether we emulate a PCI
> > > or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device
> > > IOTLB API".) On s390 this results in severe performance degradation
> > > (roughly a factor of 10).
> >
> >
> > Do you see a consistent performance degradation, or does it only
> > happen at the beginning of the test?
> >
>
> AFAIK the degradation is consistent.
>
> >
> > > BTW with ccw I/O there is
> > > (architecturally) no IOMMU, so the whole address translation makes no
> > > sense in the context of virtio-ccw.
> >
> >
> > I suspect we can do an optimization on the QEMU side.
> >
> > E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.
> >
> > If this makes sense, I can draft a patch to see if there's any difference.
>
> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> translation they then require) for virtio-ccw altogether. But Michael
> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> that could be used with guests running in protected mode. I don't share
> his opinion, but that's an ongoing discussion.

I'm a bit confused by this. For the ccw-specific case,
F_ACCESS_PLATFORM shouldn't have any impact: for you, IOVA == GPA, so
everything is easy.

> Should we end up having to do translation from IOVA in vhost, we are
> very interested in that translation being fast and efficient.
>
> In that sense we would be very happy to test any optimization that aims
> in that direction.
>
> Thank you very much for your input!
>
> Regards,
> Halil
>
> >
> > Thanks
> >
> >
> > >
> > > Halil Pasic (2):
> > > mm: move force_dma_unencrypted() to mem_encrypt.h
> > > virtio: let virtio use DMA API when guest RAM is protected
> > >
> > > drivers/virtio/virtio_ring.c | 3 +++
> > > include/linux/dma-direct.h | 9 ---------
> > > include/linux/mem_encrypt.h | 10 ++++++++++
> > > 3 files changed, 13 insertions(+), 9 deletions(-)
> > >
> > >
> > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> >
>

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



2020-02-24 06:45:59

by Jason Wang

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM


On 2020/2/24 下午2:06, Michael S. Tsirkin wrote:
> On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
>> On 2020/2/21 下午10:56, Halil Pasic wrote:
>>> On Fri, 21 Feb 2020 14:22:26 +0800
>>> Jason Wang <[email protected]> wrote:
>>>
>>>> On 2020/2/21 上午12:06, Halil Pasic wrote:
>>>>> Currently if one intends to run a memory protection enabled VM with
>>>>> virtio devices and linux as the guest OS, one needs to specify the
>>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>>>> linux use the DMA API, which in turn handles the memory
>>>>> encryption/protection stuff if the guest decides to turn itself into
>>>>> a protected one. This however makes no sense due to multiple reasons:
>>>>> * The device is not changed by the fact that the guest RAM is
>>>>> protected. The so called IOMMU bypass quirk is not affected.
>>>>> * This usage is not congruent with standardised semantics of
>>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>>>> for using DMA API in virtio (orthogonal with respect to what is
>>>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>>>
>>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>>>> is protected' and 'have to use DMA API because the device told me
>>>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>>>
>>>>> Please find more detailed explanations about the conceptual aspects in
>>>>> the individual patches. There is however also a very practical problem
>>>>> that is addressed by this series.
>>>>>
>>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>>>> effect: the vhost code assumes that the addresses on the virtio descriptor
>>>>> ring are not guest physical addresses but IOVAs, and insists on translating
>>>>> them regardless of which transport is used (e.g. whether we emulate a PCI
>>>>> or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device
>>>>> IOTLB API".) On s390 this results in severe performance degradation
>>>>> (roughly a factor of 10).
>>>> Do you see a consistent performance degradation, or does it only
>>>> happen at the beginning of the test?
>>>>
>>> AFAIK the degradation is consistent.
>>>
>>>>> BTW with ccw I/O there is
>>>>> (architecturally) no IOMMU, so the whole address translation makes no
>>>>> sense in the context of virtio-ccw.
>>>> I suspect we can do an optimization on the QEMU side.
>>>>
>>>> E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.
>>>>
>>>> If this makes sense, I can draft a patch to see if there's any difference.
>>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
>>> translation they then require) for virtio-ccw altogether. But Michael
>>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
>>> that could be used with guests running in protected mode. I don't share
>>> his opinion, but that's an ongoing discussion.
>>>
>>> Should we end up having to do translation from IOVA in vhost, we are
>>> very interested in that translation being fast and efficient.
>>>
>>> In that sense we would be very happy to test any optimization that aims
>>> in that direction.
>>>
>>> Thank you very much for your input!
>>
>> Using the IOTLB API on a platform without IOMMU support is not intended.
>> Please try the attached patch to see if it helps.
>>
>> Thanks
>>
>>
>>> Regards,
>>> Halil
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Halil Pasic (2):
>>>>> mm: move force_dma_unencrypted() to mem_encrypt.h
>>>>> virtio: let virtio use DMA API when guest RAM is protected
>>>>>
>>>>> drivers/virtio/virtio_ring.c | 3 +++
>>>>> include/linux/dma-direct.h | 9 ---------
>>>>> include/linux/mem_encrypt.h | 10 ++++++++++
>>>>> 3 files changed, 13 insertions(+), 9 deletions(-)
>>>>>
>>>>>
>>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
>> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
>> From: Jason Wang <[email protected]>
>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>
>> When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
>> even if the device and vhost claim to support it. This helps to
>> avoid the performance overhead caused by unnecessary IOTLB miss/update
>> transactions on such platforms.
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> hw/virtio/virtio-bus.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>> index d6332d45c3..2741b9fdd2 100644
>> --- a/hw/virtio/virtio-bus.c
>> +++ b/hw/virtio/virtio-bus.c
>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>> VirtioBusState *bus = VIRTIO_BUS(qbus);
>> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>> Error *local_err = NULL;
>>
>> DPRINTF("%s: plug device.\n", qbus->name);
>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>> return;
>> }
>>
>> - if (klass->get_dma_as != NULL && has_iommu) {
>> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>> + if (false && klass->get_dma_as != NULL &&
>> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>> vdev->dma_as = klass->get_dma_as(qbus->parent);
>> } else {
>> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>> vdev->dma_as = &address_space_memory;
>> }
>> }
>
> This seems to clear it unconditionally. I guess it's just a debugging
> patch, the real one will come later?


My bad, here's the correct one.

Thanks


>
>> --
>> 2.19.1
>>


Attachments:
0001-virtio-turn-on-IOMMU_PLATFORM-properly.patch (1.70 kB)

2020-02-24 07:50:32

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
>
> On 2020/2/24 下午2:06, Michael S. Tsirkin wrote:
> > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
> > > On 2020/2/21 下午10:56, Halil Pasic wrote:
> > > > On Fri, 21 Feb 2020 14:22:26 +0800
> > > > Jason Wang <[email protected]> wrote:
> > > >
> > > > > On 2020/2/21 上午12:06, Halil Pasic wrote:
> > > > > > Currently if one intends to run a memory protection enabled VM with
> > > > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > > > linux use the DMA API, which in turn handles the memory
> > > > > > encryption/protection stuff if the guest decides to turn itself into
> > > > > > a protected one. This however makes no sense due to multiple reasons:
> > > > > > * The device is not changed by the fact that the guest RAM is
> > > > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > > > * This usage is not congruent with standardised semantics of
> > > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > > > >
> > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > > > is protected' and 'have to use DMA API because the device told me
> > > > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > > > >
> > > > > > Please find more detailed explanations about the conceptual aspects in
> > > > > > the individual patches. There is however also a very practical problem
> > > > > > that is addressed by this series.
> > > > > >
> > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > > > effect: the vhost code assumes that the addresses on the virtio descriptor
> > > > > > ring are not guest physical addresses but IOVAs, and insists on translating
> > > > > > them regardless of which transport is used (e.g. whether we emulate a PCI
> > > > > > or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device
> > > > > > IOTLB API".) On s390 this results in severe performance degradation
> > > > > > (roughly a factor of 10).
> > > > > Do you see a consistent performance degradation, or does it only
> > > > > happen at the beginning of the test?
> > > > >
> > > > AFAIK the degradation is consistent.
> > > >
> > > > > > BTW with ccw I/O there is
> > > > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > > > sense in the context of virtio-ccw.
> > > > > I suspect we can do an optimization on the QEMU side.
> > > > >
> > > > > E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.
> > > > >
> > > > > If this makes sense, I can draft a patch to see if there's any difference.
> > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > > > translation they then require) for virtio-ccw altogether. But Michael
> > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > > > that could be used with guests running in protected mode. I don't share
> > > > his opinion, but that's an ongoing discussion.
> > > >
> > > > Should we end up having to do translation from IOVA in vhost, we are
> > > > very interested in that translation being fast and efficient.
> > > >
> > > > In that sense we would be very happy to test any optimization that aims
> > > > in that direction.
> > > >
> > > > Thank you very much for your input!
> > >
> > > Using the IOTLB API on a platform without IOMMU support is not intended.
> > > Please try the attached patch to see if it helps.
> > >
> > > Thanks
> > >
> > >
> > > > Regards,
> > > > Halil
> > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Halil Pasic (2):
> > > > > > mm: move force_dma_unencrypted() to mem_encrypt.h
> > > > > > virtio: let virtio use DMA API when guest RAM is protected
> > > > > >
> > > > > > drivers/virtio/virtio_ring.c | 3 +++
> > > > > > include/linux/dma-direct.h | 9 ---------
> > > > > > include/linux/mem_encrypt.h | 10 ++++++++++
> > > > > > 3 files changed, 13 insertions(+), 9 deletions(-)
> > > > > >
> > > > > >
> > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > > From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> > > From: Jason Wang <[email protected]>
> > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > >
> > > When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
> > > even if the device and vhost claim to support it. This helps to
> > > avoid the performance overhead caused by unnecessary IOTLB miss/update
> > > transactions on such platforms.
> > >
> > > Signed-off-by: Jason Wang <[email protected]>
> > > ---
> > > hw/virtio/virtio-bus.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index d6332d45c3..2741b9fdd2 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > VirtioBusState *bus = VIRTIO_BUS(qbus);
> > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > > Error *local_err = NULL;
> > > DPRINTF("%s: plug device.\n", qbus->name);
> > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > return;
> > > }
> > > - if (klass->get_dma_as != NULL && has_iommu) {
> > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > + if (false && klass->get_dma_as != NULL &&
> > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > vdev->dma_as = klass->get_dma_as(qbus->parent);
> > > } else {
> > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > vdev->dma_as = &address_space_memory;
> > > }
> > > }
> >
> > This seems to clear it unconditionally. I guess it's just a debugging
> > patch, the real one will come later?
>
>
> My bad, here's the correct one.
>
> Thanks
>
>
> >
> > > --
> > > 2.19.1
> > >

> From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
> From: Jason Wang <[email protected]>
> Date: Mon, 24 Feb 2020 12:00:10 +0800
> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>
> When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
> even if the device and vhost claim to support it. This helps to
> avoid the performance overhead caused by unnecessary IOTLB miss/update
> transactions on such platforms.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> hw/virtio/virtio-bus.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6332d45c3..4be64e193e 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> VirtioBusState *bus = VIRTIO_BUS(qbus);
> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> Error *local_err = NULL;
>
> DPRINTF("%s: plug device.\n", qbus->name);
> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> return;
> }
>
> - if (klass->get_dma_as != NULL && has_iommu) {
> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);

So it looks like this line is unnecessary, but it's an unrelated
cleanup, right?

> + if (klass->get_dma_as != NULL &&
> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> vdev->dma_as = klass->get_dma_as(qbus->parent);
> } else {
> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);


Of course any change like that will have to affect migration compat, etc.
Can't we clear the bit when we are sending the features to vhost
instead?


> vdev->dma_as = &address_space_memory;
> }
> }
> --
> 2.19.1
>

2020-02-24 09:31:08

by Jason Wang

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM


On 2020/2/24 下午3:48, Michael S. Tsirkin wrote:
> On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
>> On 2020/2/24 下午2:06, Michael S. Tsirkin wrote:
>>> On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
>>>> On 2020/2/21 下午10:56, Halil Pasic wrote:
>>>>> On Fri, 21 Feb 2020 14:22:26 +0800
>>>>> Jason Wang <[email protected]> wrote:
>>>>>
>>>>>> On 2020/2/21 上午12:06, Halil Pasic wrote:
>>>>>>> Currently if one intends to run a memory protection enabled VM with
>>>>>>> virtio devices and linux as the guest OS, one needs to specify the
>>>>>>> VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
>>>>>>> linux use the DMA API, which in turn handles the memory
>>>>>>> encryption/protection stuff if the guest decides to turn itself into
>>>>>>> a protected one. This however makes no sense due to multiple reasons:
>>>>>>> * The device is not changed by the fact that the guest RAM is
>>>>>>> protected. The so called IOMMU bypass quirk is not affected.
>>>>>>> * This usage is not congruent with standardised semantics of
>>>>>>> VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
>>>>>>> for using DMA API in virtio (orthogonal with respect to what is
>>>>>>> expressed by VIRTIO_F_IOMMU_PLATFORM).
>>>>>>>
>>>>>>> This series aims to decouple 'have to use DMA API because my (guest) RAM
>>>>>>> is protected' and 'have to use DMA API because the device told me
>>>>>>> VIRTIO_F_IOMMU_PLATFORM'.
>>>>>>>
>>>>>>> Please find more detailed explanations about the conceptual aspects in
>>>>>>> the individual patches. There is however also a very practical problem
>>>>>>> that is addressed by this series.
>>>>>>>
>>>>>>> For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
>>>>>>> effect: the vhost code assumes that the addresses on the virtio descriptor
>>>>>>> ring are not guest physical addresses but IOVAs, and insists on translating
>>>>>>> them regardless of which transport is used (e.g. whether we emulate a PCI
>>>>>>> or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device
>>>>>>> IOTLB API".) On s390 this results in severe performance degradation
>>>>>>> (roughly a factor of 10).
>>>>>> Do you see a consistent performance degradation, or does it only
>>>>>> happen at the beginning of the test?
>>>>>>
>>>>> AFAIK the degradation is consistent.
>>>>>
>>>>>>> BTW with ccw I/O there is
>>>>>>> (architecturally) no IOMMU, so the whole address translation makes no
>>>>>>> sense in the context of virtio-ccw.
>>>>>> I suspect we can do an optimization on the QEMU side.
>>>>>>
>>>>>> E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.
>>>>>>
>>>>>> If this makes sense, I can draft a patch to see if there's any difference.
>>>>> Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
>>>>> translation they then require) for virtio-ccw altogether. But Michael
>>>>> voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
>>>>> that could be used with guests running in protected mode. I don't share
>>>>> his opinion, but that's an ongoing discussion.
>>>>>
>>>>> Should we end up having to do translation from IOVA in vhost, we are
>>>>> very interested in that translation being fast and efficient.
>>>>>
>>>>> In that sense we would be very happy to test any optimization that aims
>>>>> in that direction.
>>>>>
>>>>> Thank you very much for your input!
>>>> Using the IOTLB API on a platform without IOMMU support is not intended.
>>>> Please try the attached patch to see if it helps.
>>>>
>>>> Thanks
>>>>
>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> Halil Pasic (2):
>>>>>>> mm: move force_dma_unencrypted() to mem_encrypt.h
>>>>>>> virtio: let virtio use DMA API when guest RAM is protected
>>>>>>>
>>>>>>> drivers/virtio/virtio_ring.c | 3 +++
>>>>>>> include/linux/dma-direct.h | 9 ---------
>>>>>>> include/linux/mem_encrypt.h | 10 ++++++++++
>>>>>>> 3 files changed, 13 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
>>>> From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
>>>> From: Jason Wang <[email protected]>
>>>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>>>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>>>
>>>> When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
>>>> even if the device and vhost claim to support it. This helps to
>>>> avoid the performance overhead caused by unnecessary IOTLB miss/update
>>>> transactions on such platforms.
>>>>
>>>> Signed-off-by: Jason Wang <[email protected]>
>>>> ---
>>>> hw/virtio/virtio-bus.c | 6 +++---
>>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>>>> index d6332d45c3..2741b9fdd2 100644
>>>> --- a/hw/virtio/virtio-bus.c
>>>> +++ b/hw/virtio/virtio-bus.c
>>>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>> VirtioBusState *bus = VIRTIO_BUS(qbus);
>>>> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>>>> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>>>> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>> Error *local_err = NULL;
>>>> DPRINTF("%s: plug device.\n", qbus->name);
>>>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>> return;
>>>> }
>>>> - if (klass->get_dma_as != NULL && has_iommu) {
>>>> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>> + if (false && klass->get_dma_as != NULL &&
>>>> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>>>> vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>> } else {
>>>> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>> vdev->dma_as = &address_space_memory;
>>>> }
>>>> }
>>> This seems to clear it unconditionally. I guess it's just a debugging
>>> patch, the real one will come later?
>>
>> My bad, here's the correct one.
>>
>> Thanks
>>
>>
>>>> --
>>>> 2.19.1
>>>>
>> From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
>> From: Jason Wang <[email protected]>
>> Date: Mon, 24 Feb 2020 12:00:10 +0800
>> Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
>>
>> When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
>> even if the device and vhost claim to support it. This helps to
>> avoid the performance overhead caused by unnecessary IOTLB miss/update
>> transactions on such platforms.
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> hw/virtio/virtio-bus.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>> index d6332d45c3..4be64e193e 100644
>> --- a/hw/virtio/virtio-bus.c
>> +++ b/hw/virtio/virtio-bus.c
>> @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>> VirtioBusState *bus = VIRTIO_BUS(qbus);
>> VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>> VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>> - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>> Error *local_err = NULL;
>>
>> DPRINTF("%s: plug device.\n", qbus->name);
>> @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>> return;
>> }
>>
>> - if (klass->get_dma_as != NULL && has_iommu) {
>> - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> So it looks like this line is unnecessary, but it's an unrelated
> cleanup, right?


Yes.


>
>> + if (klass->get_dma_as != NULL &&
>> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>> vdev->dma_as = klass->get_dma_as(qbus->parent);
>> } else {
>> + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>
> Of course any change like that will have to affect migration compat, etc.
> Can't we clear the bit when we are sending the features to vhost
> instead?


That's better.

How about attached?

Thanks


>
>
>> vdev->dma_as = &address_space_memory;
>> }
>> }
>> --
>> 2.19.1
>>


Attachments:
0001-vhost-do-not-set-VIRTIO_F_IOMMU_PLATFORM-when-IOMMU-.patch (1.65 kB)

2020-02-24 13:41:08

by Michael S. Tsirkin

Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM

On Mon, Feb 24, 2020 at 05:26:20PM +0800, Jason Wang wrote:
>
> On 2020/2/24 下午3:48, Michael S. Tsirkin wrote:
> > On Mon, Feb 24, 2020 at 02:45:03PM +0800, Jason Wang wrote:
> > > On 2020/2/24 下午2:06, Michael S. Tsirkin wrote:
> > > > On Mon, Feb 24, 2020 at 12:01:57PM +0800, Jason Wang wrote:
> > > > > On 2020/2/21 下午10:56, Halil Pasic wrote:
> > > > > > On Fri, 21 Feb 2020 14:22:26 +0800
> > > > > > Jason Wang <[email protected]> wrote:
> > > > > >
> > > > > > > On 2020/2/21 上午12:06, Halil Pasic wrote:
> > > > > > > > Currently if one intends to run a memory protection enabled VM with
> > > > > > > > virtio devices and linux as the guest OS, one needs to specify the
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM flag for each virtio device to make the guest
> > > > > > > > linux use the DMA API, which in turn handles the memory
> > > > > > > > encryption/protection stuff if the guest decides to turn itself into
> > > > > > > > a protected one. This however makes no sense due to multiple reasons:
> > > > > > > > * The device is not changed by the fact that the guest RAM is
> > > > > > > > protected. The so called IOMMU bypass quirk is not affected.
> > > > > > > > * This usage is not congruent with standardised semantics of
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM. Guest memory protected is an orthogonal reason
> > > > > > > > for using DMA API in virtio (orthogonal with respect to what is
> > > > > > > > expressed by VIRTIO_F_IOMMU_PLATFORM).
> > > > > > > >
> > > > > > > > This series aims to decouple 'have to use DMA API because my (guest) RAM
> > > > > > > > is protected' and 'have to use DMA API because the device told me
> > > > > > > > VIRTIO_F_IOMMU_PLATFORM'.
> > > > > > > >
> > > > > > > > Please find more detailed explanations about the conceptual aspects in
> > > > > > > > the individual patches. There is however also a very practical problem
> > > > > > > > that is addressed by this series.
> > > > > > > >
> > > > > > > > For vhost-net the feature VIRTIO_F_IOMMU_PLATFORM has the following side
> > > > > > > > effect: the vhost code assumes that the addresses on the virtio descriptor
> > > > > > > > ring are not guest physical addresses but IOVAs, and insists on translating
> > > > > > > > them regardless of which transport is used (e.g. whether we emulate a PCI
> > > > > > > > or a CCW device). (For details see commit 6b1e6cc7855b "vhost: new device
> > > > > > > > IOTLB API".) On s390 this results in severe performance degradation
> > > > > > > > (roughly a factor of 10).
> > > > > > > Do you see a consistent performance degradation, or does it only
> > > > > > > happen at the beginning of the test?
> > > > > > >
> > > > > > AFAIK the degradation is consistent.
> > > > > >
> > > > > > > > BTW with ccw I/O there is
> > > > > > > > (architecturally) no IOMMU, so the whole address translation makes no
> > > > > > > > sense in the context of virtio-ccw.
> > > > > > > I suspect we can do an optimization on the QEMU side.
> > > > > > >
> > > > > > > E.g. send memtable entries via the IOTLB API when vIOMMU is not enabled.
> > > > > > >
> > > > > > > If this makes sense, I can draft a patch to see if there's any difference.
> > > > > > Frankly I would prefer to avoid IOVAs on the descriptor ring (and the
> > > > > > translation they then require) for virtio-ccw altogether. But Michael
> > > > > > voiced his opinion that we should mandate F_IOMMU_PLATFORM for devices
> > > > > > that could be used with guests running in protected mode. I don't share
> > > > > > his opinion, but that's an ongoing discussion.
> > > > > >
> > > > > > Should we end up having to do translation from IOVA in vhost, we are
> > > > > > very interested in that translation being fast and efficient.
> > > > > >
> > > > > > In that sense we would be very happy to test any optimization that aims
> > > > > > in that direction.
> > > > > >
> > > > > > Thank you very much for your input!
> > > > > Using the IOTLB API on a platform without IOMMU support is not intended.
> > > > > Please try the attached patch to see if it helps.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Regards,
> > > > > > Halil
> > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > > Halil Pasic (2):
> > > > > > > > mm: move force_dma_unencrypted() to mem_encrypt.h
> > > > > > > > virtio: let virtio use DMA API when guest RAM is protected
> > > > > > > >
> > > > > > > > drivers/virtio/virtio_ring.c | 3 +++
> > > > > > > > include/linux/dma-direct.h | 9 ---------
> > > > > > > > include/linux/mem_encrypt.h | 10 ++++++++++
> > > > > > > > 3 files changed, 13 insertions(+), 9 deletions(-)
> > > > > > > >
> > > > > > > >
> > > > > > > > base-commit: ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2
> > > > > From 66fa730460875ac99e81d7db2334cd16bb1d2b27 Mon Sep 17 00:00:00 2001
> > > > > From: Jason Wang <[email protected]>
> > > > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > > > >
> > > > > When the transport does not support an IOMMU, we should clear IOMMU_PLATFORM
> > > > > even if the device and vhost claim to support it. This helps to
> > > > > avoid the performance overhead caused by unnecessary IOTLB miss/update
> > > > > transactions on such platforms.
> > > > >
> > > > > Signed-off-by: Jason Wang <[email protected]>
> > > > > ---
> > > > > hw/virtio/virtio-bus.c | 6 +++---
> > > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > > > index d6332d45c3..2741b9fdd2 100644
> > > > > --- a/hw/virtio/virtio-bus.c
> > > > > +++ b/hw/virtio/virtio-bus.c
> > > > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > > > VirtioBusState *bus = VIRTIO_BUS(qbus);
> > > > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > > > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > > > > Error *local_err = NULL;
> > > > > DPRINTF("%s: plug device.\n", qbus->name);
> > > > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > > > return;
> > > > > }
> > > > > - if (klass->get_dma_as != NULL && has_iommu) {
> > > > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > > > + if (false && klass->get_dma_as != NULL &&
> > > > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > > > vdev->dma_as = klass->get_dma_as(qbus->parent);
> > > > > } else {
> > > > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > > > vdev->dma_as = &address_space_memory;
> > > > > }
> > > > > }
> > > > This seems to clear it unconditionally. I guess it's just a debugging
> > > > patch, the real one will come later?
> > >
> > > My bad, here's the correct one.
> > >
> > > Thanks
> > >
> > >
> > > > > --
> > > > > 2.19.1
> > > > >
> > > From b8a8b582f46bb86c7a745b272db7b744779e5cc7 Mon Sep 17 00:00:00 2001
> > > From: Jason Wang <[email protected]>
> > > Date: Mon, 24 Feb 2020 12:00:10 +0800
> > > Subject: [PATCH] virtio: turn on IOMMU_PLATFORM properly
> > >
> > > When the transport does not support an IOMMU, we should clear
> > > IOMMU_PLATFORM even if the device and vhost claim to support it. This
> > > helps to avoid the performance overhead caused by unnecessary IOTLB
> > > miss/update transactions on such platforms.
> > >
> > > Signed-off-by: Jason Wang <[email protected]>
> > > ---
> > > hw/virtio/virtio-bus.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index d6332d45c3..4be64e193e 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -47,7 +47,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > VirtioBusState *bus = VIRTIO_BUS(qbus);
> > > VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> > > VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > - bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > > Error *local_err = NULL;
> > > DPRINTF("%s: plug device.\n", qbus->name);
> > > @@ -77,10 +76,11 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > > return;
> > > }
> > > - if (klass->get_dma_as != NULL && has_iommu) {
> > > - virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > So it looks like this line is unnecessary, but it's an unrelated
> > cleanup, right?
>
>
> Yes.
>
>
> >
> > > + if (klass->get_dma_as != NULL &&
> > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > vdev->dma_as = klass->get_dma_as(qbus->parent);
> > > } else {
> > > + virtio_clear_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >
> > Of course any change like that will have to affect migration compat, etc.
> > Can't we clear the bit when we are sending the features to vhost
> > instead?
>
>
> That's better.
>
> How about attached?
>
> Thanks
>
>
> >
> >
> > > vdev->dma_as = &address_space_memory;
> > > }
> > > }
> > > --
> > > 2.19.1
> > >

> From 3177c5194c729f3056b84c67664c59b9b949bb76 Mon Sep 17 00:00:00 2001
> From: Jason Wang <[email protected]>
> Date: Mon, 24 Feb 2020 17:24:14 +0800
> Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not
> used
>
> We enable the device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is
> negotiated. This leads to unnecessary IOTLB miss/update transactions when
> no IOMMU is used. This patch fixes this.
>
> Signed-off-by: Jason Wang <[email protected]>
> ---
> hw/net/virtio-net.c | 3 +++
> hw/virtio/vhost.c | 4 +---
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3627bb1717..0d50e8bd34 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
> virtio_net_apply_guest_offloads(n);
> }
>
> + if (vdev->dma_as == &address_space_memory)
> + features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
> +
> for (i = 0; i < n->max_queues; i++) {
> NetClientState *nc = qemu_get_subqueue(n->nic, i);

This pokes at acked features. I think they are also
guest visible ...

> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 9edfadc81d..711b1136f6 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -288,9 +288,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
>
> static int vhost_dev_has_iommu(struct vhost_dev *dev)
> {
> - VirtIODevice *vdev = dev->vdev;
> -
> - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> + return virtio_has_feature(dev->acked_features, VIRTIO_F_IOMMU_PLATFORM);
> }
>
> static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> --
> 2.19.1
>

2020-02-24 13:57:05

by Halil Pasic

[permalink] [raw]
Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM

On Mon, 24 Feb 2020 17:26:20 +0800
Jason Wang <[email protected]> wrote:

> That's better.
>
> How about attached?
>
> Thanks

Thanks Jason! It does avoid the translation overhead in vhost.

Tested-by: Halil Pasic <[email protected]>

Regarding the code, you fence it in virtio-net.c, but AFAIU this feature
has relevance for other vhost devices as well. E.g. what about vhost
user? Would it be the responsibility of each virtio device to fence this
on its own?

I'm also a bit confused about the semantics of the vhost feature bit
F_ACCESS_PLATFORM. What we have specified on virtio level is:
"""
This feature indicates that the device can be used on a platform where
device access to data in memory is limited and/or translated. E.g. this
is the case if the device can be located behind an IOMMU that translates
bus addresses from the device into physical addresses in memory, if the
device can be limited to only access certain memory addresses or if
special commands such as a cache flush can be needed to synchronise data
in memory with the device. Whether accesses are actually limited or
translated is described by platform-specific means. If this feature bit
is set to 0, then the device has same access to memory addresses
supplied to it as the driver has. In particular, the device will always
use physical addresses matching addresses used by the driver (typically
meaning physical addresses used by the CPU) and not translated further,
and can access any address supplied to it by the driver. When clear,
this overrides any platform-specific description of whether device
access is limited or translated in any way, e.g. whether an IOMMU may be
present.
"""

I read this as: the addresses may be IOVAs, which require IOMMU
translation, or GPAs, which don't.

On the vhost level however, it seems that F_IOMMU_PLATFORM means that
vhost has to do the translation (via IOTLB API).

Do I understand this correctly? If yes, I believe we should document
this properly.
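
For what it's worth, the difference between the two readings can be modelled
in a few lines. This is a toy model, not QEMU code: the function names, the
translation stand-in, and the +0x1000 offset are all invented for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 33

/* Stand-ins for the two resolution paths: a GPA needs no translation,
 * while an IOVA goes through an IOTLB miss/update cycle.  The +0x1000
 * offset is purely illustrative. */
static uint64_t gpa_identity(uint64_t gpa)     { return gpa; }
static uint64_t iotlb_translate(uint64_t iova) { return iova + 0x1000; }

/* Current vhost behaviour as described above: the feature bit alone
 * selects the IOTLB path, even on transports (such as ccw) that have
 * no IOMMU at all. */
static uint64_t vhost_resolve(uint64_t features, uint64_t addr)
{
    if (features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) {
        return iotlb_translate(addr);
    }
    return gpa_identity(addr);
}
```

The virtio-level spec text only says the first branch may be needed; vhost
currently takes the feature bit to mean it always is.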

BTW I'm still not 100% on the purpose and semantics of the
F_ACCESS_PLATFORM feature bit. But that is a different problem.

Regards,
Halil

2020-02-25 03:31:01

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM


On 2020/2/24 9:56 PM, Halil Pasic wrote:
> On Mon, 24 Feb 2020 17:26:20 +0800
> Jason Wang <[email protected]> wrote:
>
>> That's better.
>>
>> How about attached?
>>
>> Thanks
> Thanks Jason! It does avoid the translation overhead in vhost.
>
> Tested-by: Halil Pasic <[email protected]>
>
> Regarding the code, you fence it in virtio-net.c, but AFAIU this feature
> has relevance for other vhost devices as well. E.g. what about vhost
> user? Would it be the responsibility of each virtio device to fence this
> on its own?


Yes, it looks to me like it's better to do that in virtio_set_features_nocheck().
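
Roughly, the central fencing being suggested would have this shape (a toy
model with illustrative names, not the actual QEMU code path; the
`dma_as_is_memory` flag stands in for `vdev->dma_as == &address_space_memory`):

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 33

/* Model of fencing the feature in one central place rather than in each
 * device: when the device's DMA address space is plain guest memory,
 * mask the bit out of the features propagated to the vhost backend. */
struct vdev_model {
    int dma_as_is_memory;
};

static uint64_t features_for_vhost(const struct vdev_model *vdev,
                                   uint64_t guest_acked)
{
    uint64_t val = guest_acked;

    if (vdev->dma_as_is_memory) {
        val &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
    }
    return val;
}
```

Doing this once in the common set_features path would cover virtio-net,
vhost-user and any other backend without per-device fencing.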


>
> I'm also a bit confused about the semantics of the vhost feature bit
> F_ACCESS_PLATFORM. What we have specified on virtio level is:
> """
> This feature indicates that the device can be used on a platform where
> device access to data in memory is limited and/or translated. E.g. this
> is the case if the device can be located behind an IOMMU that translates
> bus addresses from the device into physical addresses in memory, if the
> device can be limited to only access certain memory addresses or if
> special commands such as a cache flush can be needed to synchronise data
> in memory with the device. Whether accesses are actually limited or
> translated is described by platform-specific means. If this feature bit
> is set to 0, then the device has same access to memory addresses
> supplied to it as the driver has. In particular, the device will always
> use physical addresses matching addresses used by the driver (typically
> meaning physical addresses used by the CPU) and not translated further,
> and can access any address supplied to it by the driver. When clear,
> this overrides any platform-specific description of whether device
> access is limited or translated in any way, e.g. whether an IOMMU may be
> present.
> """
>
> I read this as: the addresses may be IOVAs, which require IOMMU
> translation, or GPAs, which don't.
>
> On the vhost level however, it seems that F_IOMMU_PLATFORM means that
> vhost has to do the translation (via IOTLB API).


Yes.


>
> Do I understand this correctly? If yes, I believe we should document
> this properly.


Good point. I think it was probably wrong to tie F_IOMMU_PLATFORM to the
IOTLB API. Technically, the IOTLB can work with a GPA->HVA mapping. I
originally used a dedicated feature bit (you can see that from the commit
log), but for some reason Michael tweaked it into a virtio feature bit. I
guess that was because at that time there was no way to specify e.g. a
backend capability, but now we have VHOST_GET_BACKEND_FEATURES.

For now, it is probably too late to fix that, so we should either document
the current behavior or add support for enabling the IOTLB via new backend features.
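
A hypothetical sketch of that second option, gating the IOTLB on a
separately negotiated backend capability instead of the virtio feature bit
alone. The `VHOST_BACKEND_F_IOTLB` name and its bit position are invented
here for illustration; they are not taken from the vhost UAPI.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 33
/* Hypothetical backend feature bit, for illustration only. */
#define VHOST_BACKEND_F_IOTLB 0

/* Enable the IOTLB only when both the virtio feature and a negotiated
 * backend capability are present, instead of keying everything off
 * VIRTIO_F_IOMMU_PLATFORM alone. */
static int dev_needs_iotlb(uint64_t virtio_features, uint64_t backend_features)
{
    return (virtio_features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) &&
           (backend_features & (1ULL << VHOST_BACKEND_F_IOTLB));
}
```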


>
> BTW I'm still not 100% on the purpose and semantics of the
> F_ACCESS_PLATFORM feature bit. But that is a different problem.


Yes, I agree that we should decouple the features that do not belong to
the device (protected, encrypted, swiotlb, etc.) from F_IOMMU_PLATFORM.
But Michael and others have their points as well.

Thanks


>
> Regards,
> Halil
>

2020-02-25 03:39:12

by Jason Wang

[permalink] [raw]
Subject: Re: [PATCH 0/2] virtio: decouple protected guest RAM from VIRTIO_F_IOMMU_PLATFORM


On 2020/2/24 9:40 PM, Michael S. Tsirkin wrote:
>> Subject: [PATCH] vhost: do not set VIRTIO_F_IOMMU_PLATFORM when IOMMU is not
>> used
>>
>> We enable the device IOTLB unconditionally when VIRTIO_F_IOMMU_PLATFORM is
>> negotiated. This leads to unnecessary IOTLB miss/update transactions when
>> no IOMMU is used. This patch fixes this.
>>
>> Signed-off-by: Jason Wang <[email protected]>
>> ---
>> hw/net/virtio-net.c | 3 +++
>> hw/virtio/vhost.c | 4 +---
>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 3627bb1717..0d50e8bd34 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -879,6 +879,9 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
>> virtio_net_apply_guest_offloads(n);
>> }
>>
>> + if (vdev->dma_as == &address_space_memory)
>> + features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
>> +
>> for (i = 0; i < n->max_queues; i++) {
>> NetClientState *nc = qemu_get_subqueue(n->nic, i);
> This pokes at acked features. I think they are also
> guest visible ...


It's the acked features of the vhost device, so I guess not?

E.g virtio_set_features_nocheck() did:

    val &= vdev->host_features;
    if (k->set_features) {
        k->set_features(vdev, val);
    }
    vdev->guest_features = val;

Thanks
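
To model the point being made (a toy model with illustrative names, not
QEMU code): the guest-visible feature set and the copy handed to the vhost
backend are separate, so masking the vhost copy does not alter what the
guest negotiated.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM 33

struct model {
    uint64_t guest_features; /* what the guest acked; guest visible */
    uint64_t vhost_acked;    /* what the vhost backend sees */
};

static void ack_features(struct model *m, uint64_t guest_acked,
                         int dma_as_is_memory)
{
    m->guest_features = guest_acked;
    m->vhost_acked = guest_acked;
    /* mask only the copy propagated to vhost */
    if (dma_as_is_memory) {
        m->vhost_acked &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
    }
}
```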

>