2019-11-08 23:01:48

by Ram Pai

[permalink] [raw]
Subject: [RFC v2 0/2] Enable IOMMU support for pseries Secure VMs

This patch series enables IOMMU support for pseries Secure VMs.

Tested using QEMU command line option:

"-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4,
iommu_platform=on,disable-modern=off,disable-legacy=on"
and

"-device virtio-blk-pci,scsi=off,bus=pci.0,
addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,
iommu_platform=on,disable-modern=off,disable-legacy=on"

changelog:
v2: added comments describing the changes.
requested by Alexey and Michael Ellermen.

Ram Pai (2):
powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
powerpc/pseries/iommu: Use dma_iommu_ops for Secure VMs aswell.

arch/powerpc/platforms/pseries/iommu.c | 34 +++++++++++++++++++++-------------
1 file changed, 21 insertions(+), 13 deletions(-)

--
1.8.3.1


2019-11-08 23:01:55

by Ram Pai

[permalink] [raw]
Subject: [RFC v2 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

The hypervisor needs to access the contents of the page holding the TCE
entries while setting up the TCE entries in the IOMMU's TCE table.

For SecureVMs, since this page is encrypted, the hypervisor cannot
access valid entries. Share the page with the hypervisor. This ensures
that the hypervisor sees those valid entries.

Why is this safe?
The page contains only TCE entries; not any sensitive data
belonging to the Secure VM. The hypervisor has a genuine need to know
the value of the TCE entries, without which it will not be able to
DMA to/from the pages pointed to by the TCE entries. In a Secure
VM the TCE entries point to pages that are also shared with the
hypervisor; example: pages containing bounce buffers.

Signed-off-by: Ram Pai <[email protected]>
---
arch/powerpc/platforms/pseries/iommu.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 8d9c2b1..a302aaa 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -37,6 +37,7 @@
#include <asm/mmzone.h>
#include <asm/plpar_wrappers.h>
#include <asm/svm.h>
+#include <asm/ultravisor.h>

#include "pseries.h"

@@ -179,6 +180,23 @@ static int tce_build_pSeriesLP(struct iommu_table *tbl, long tcenum,

static DEFINE_PER_CPU(__be64 *, tce_page);

+/*
+ * Allocate a tce page. If secure VM, share the page with the hypervisor.
+ *
+ * NOTE: the TCE page is shared with the hypervisor explicitly and remains
+ * shared for the lifetime of the kernel. It is implicitly unshared at kernel
+ * shutdown through a UV_UNSHARE_ALL_PAGES ucall.
+ */
+static __be64 *alloc_tce_page(void)
+{
+ __be64 *tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
+
+ if (tcep && is_secure_guest())
+ uv_share_page(PHYS_PFN(__pa(tcep)), 1);
+
+ return tcep;
+}
+
static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
long npages, unsigned long uaddr,
enum dma_data_direction direction,
@@ -206,8 +224,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
* from iommu_alloc{,_sg}()
*/
if (!tcep) {
- tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
- /* If allocation fails, fall back to the loop implementation */
+ tcep = alloc_tce_page();
if (!tcep) {
local_irq_restore(flags);
return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
@@ -405,7 +422,7 @@ static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn,
tcep = __this_cpu_read(tce_page);

if (!tcep) {
- tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
+ tcep = alloc_tce_page();
if (!tcep) {
local_irq_enable();
return -ENOMEM;
--
1.8.3.1

2019-11-08 23:03:44

by Ram Pai

[permalink] [raw]
Subject: [RFC v2 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VMs aswell.

Commit edea902c1c1e disabled dma_iommu_ops path, for secure VMs. The
rationale for disabling the dma_iommu_ops path, was to enable use of the
dma_direct path, which had inbuilt support for bounce-buffering through
SWIOTLB.

However dma_iommu_ops is functionally much richer. Depending on the
capabilities of the platform, it can handle direct DMA; with or without
bounce buffering, and it can handle indirect DMA. Hence its better to
leverage the richer functionality supported by dma_iommu_ops.

Revert edea902c1c1e and renable dma_iommu_ops path for pseries Secure
VMs.

Signed-off-by: Ram Pai <[email protected]>
---
arch/powerpc/platforms/pseries/iommu.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index a302aaa..3ffcb54 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -36,7 +36,6 @@
#include <asm/udbg.h>
#include <asm/mmzone.h>
#include <asm/plpar_wrappers.h>
-#include <asm/svm.h>
#include <asm/ultravisor.h>

#include "pseries.h"
@@ -1336,15 +1335,7 @@ void iommu_init_early_pSeries(void)
of_reconfig_notifier_register(&iommu_reconfig_nb);
register_memory_notifier(&iommu_mem_nb);

- /*
- * Secure guest memory is inacessible to devices so regular DMA isn't
- * possible.
- *
- * In that case keep devices' dma_map_ops as NULL so that the generic
- * DMA code path will use SWIOTLB to bounce buffers for DMA.
- */
- if (!is_secure_guest())
- set_pci_dma_ops(&dma_iommu_ops);
+ set_pci_dma_ops(&dma_iommu_ops);
}

static int __init disable_multitce(char *str)
--
1.8.3.1

2019-11-10 21:11:55

by David Gibson

[permalink] [raw]
Subject: Re: [RFC v2 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

On Fri, Nov 08, 2019 at 03:00:10PM -0800, Ram Pai wrote:
> The hypervisor needs to access the contents of the page holding the TCE
> entries while setting up the TCE entries in the IOMMU's TCE table.
>
> For SecureVMs, since this page is encrypted, the hypervisor cannot
> access valid entries. Share the page with the hypervisor. This ensures
> that the hypervisor sees those valid entries.
>
> Why is this safe?
> The page contains only TCE entries; not any sensitive data
> belonging to the Secure VM. The hypervisor has a genuine need to know
> the value of the TCE entries, without which it will not be able to
> DMA to/from the pages pointed to by the TCE entries. In a Secure
> VM the TCE entries point to pages that are also shared with the
> hypervisor; example: pages containing bounce buffers.

The bit that may not be obvious to reviewers from the above is this:

This is *not* a page of "live" TCEs which are actively used for
translation. Instead this is just a transient buffer with a batch of
TCEs to set, passed to the hypervisor with the H_PUT_TCE_INDIRECT call.

>
> Signed-off-by: Ram Pai <[email protected]>
> ---
> arch/powerpc/platforms/pseries/iommu.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 8d9c2b1..a302aaa 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -37,6 +37,7 @@
> #include <asm/mmzone.h>
> #include <asm/plpar_wrappers.h>
> #include <asm/svm.h>
> +#include <asm/ultravisor.h>
>
> #include "pseries.h"
>
> @@ -179,6 +180,23 @@ static int tce_build_pSeriesLP(struct iommu_table *tbl, long tcenum,
>
> static DEFINE_PER_CPU(__be64 *, tce_page);
>
> +/*
> + * Allocate a tce page. If secure VM, share the page with the hypervisor.
> + *
> + * NOTE: the TCE page is shared with the hypervisor explicitly and remains
> + * shared for the lifetime of the kernel. It is implicitly unshared at kernel
> + * shutdown through a UV_UNSHARE_ALL_PAGES ucall.
> + */
> +static __be64 *alloc_tce_page(void)
> +{
> + __be64 *tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
> +
> + if (tcep && is_secure_guest())
> + uv_share_page(PHYS_PFN(__pa(tcep)), 1);
> +
> + return tcep;
> +}
> +
> static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
> long npages, unsigned long uaddr,
> enum dma_data_direction direction,
> @@ -206,8 +224,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
> * from iommu_alloc{,_sg}()
> */
> if (!tcep) {
> - tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
> - /* If allocation fails, fall back to the loop implementation */
> + tcep = alloc_tce_page();
> if (!tcep) {
> local_irq_restore(flags);
> return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
> @@ -405,7 +422,7 @@ static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn,
> tcep = __this_cpu_read(tce_page);
>
> if (!tcep) {
> - tcep = (__be64 *)__get_free_page(GFP_ATOMIC);
> + tcep = alloc_tce_page();
> if (!tcep) {
> local_irq_enable();
> return -ENOMEM;

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Attachments:
(No filename) (3.42 kB)
signature.asc (849.00 B)
Download all attachments

2019-11-12 01:17:24

by Ram Pai

[permalink] [raw]
Subject: RE: [RFC v2 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.

On Sun, Nov 10, 2019 at 07:40:06PM +0000, David Gibson wrote:
> On Fri, Nov 08, 2019 at 03:00:10PM -0800, Ram Pai wrote:
> > The hypervisor needs to access the contents of the page holding the TCE
> > entries while setting up the TCE entries in the IOMMU's TCE table.
> >
> > For SecureVMs, since this page is encrypted, the hypervisor cannot
> > access valid entries. Share the page with the hypervisor. This ensures
> > that the hypervisor sees those valid entries.
> >
> > Why is this safe?
> > The page contains only TCE entries; not any sensitive data
> > belonging to the Secure VM. The hypervisor has a genuine need to know
> > the value of the TCE entries, without which it will not be able to
> > DMA to/from the pages pointed to by the TCE entries. In a Secure
> > VM the TCE entries point to pages that are also shared with the
> > hypervisor; example: pages containing bounce buffers.
>
> The bit that may not be obvious to reviewers from the above is this:
>
> This is *not* a page of "live" TCEs which are actively used for
> translation. Instead this is just a transient buffer with a batch of
> TCEs to set, passed to the hypervisor with the H_PUT_TCE_INDIRECT call.


That is true. I should have stated that explicitly. Thanks. will
update the commit log.


RP