Subject: [PATCH v2 0/4] Have TTM support SEV encryption with coherent memory

With SEV memory encryption and in some cases also with SME memory
encryption, coherent memory is unencrypted. In those cases, TTM doesn't
set up the correct page protection. Fix this by having the TTM
coherent page allocator call into the platform code to determine whether
coherent memory is encrypted or not, and modify the page protection if
it is not.

v2:
- Use force_dma_unencrypted() rather than sev_active() to also catch the
special SME encryption cases.
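
In short, the coherent allocator tags the affected pages and the
protection helper strips the encryption bit. A condensed sketch of the
idea, using the names introduced by the patches below:

/* At populate time: ask the platform whether coherent DMA memory
 * for this device is forced unencrypted (SEV, or special SME cases). */
if (force_dma_unencrypted(dev))
	ttm->page_flags |= TTM_PAGE_FLAG_DECRYPTED;

/* At mapping time: strip the encryption bit from the PTE protection
 * of pages tagged above. */
if (tt_page_flags & TTM_PAGE_FLAG_DECRYPTED)
	tmp = pgprot_decrypted(tmp);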


Subject: [PATCH v2 2/4] s390/mm: Export force_dma_unencrypted

From: Thomas Hellstrom <[email protected]>

The force_dma_unencrypted symbol is needed by TTM to set up the correct
page protection when memory encryption is active. Export it.

Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
---
arch/s390/mm/init.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 20340a03ad90..eec7cc303a31 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -165,6 +165,7 @@ bool force_dma_unencrypted(struct device *dev)
{
return sev_active();
}
+EXPORT_SYMBOL(force_dma_unencrypted);

/* protected virtualization */
static void pv_init(void)
--
2.20.1

Subject: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

From: Thomas Hellstrom <[email protected]>

The force_dma_unencrypted symbol is needed by TTM to set up the correct
page protection when memory encryption is active. Export it.

Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
---
arch/x86/mm/mem_encrypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index fece30ca8b0c..bbfe8802d63a 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -377,6 +377,7 @@ bool force_dma_unencrypted(struct device *dev)

return false;
}
+EXPORT_SYMBOL(force_dma_unencrypted);

/* Architecture __weak replacement functions */
void __init mem_encrypt_free_decrypted_mem(void)
--
2.20.1

Subject: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

From: Thomas Hellstrom <[email protected]>

With TTM pages allocated out of the DMA pool, use the
force_dma_unencrypted function to be able to set up the correct
page protection. Previously it was unconditionally set to encrypted,
which only works with SME encryption on devices with a large enough DMA
mask.

Tested with vmwgfx and SEV-ES. Screen garbage without this patch and normal
functionality with it.

Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
---
drivers/gpu/drm/ttm/ttm_bo_util.c | 17 +++++++++++++----
drivers/gpu/drm/ttm/ttm_bo_vm.c | 21 ++++++++++-----------
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 4 ++++
drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 6 ++++--
include/drm/ttm/ttm_bo_driver.h | 8 +++++---
include/drm/ttm/ttm_tt.h | 1 +
6 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index fe81c565e7ef..d5ad8f03b63f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -419,11 +419,13 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
page = i * dir + add;
if (old_iomap == NULL) {
pgprot_t prot = ttm_io_prot(old_mem->placement,
+ ttm->page_flags,
PAGE_KERNEL);
ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
prot);
} else if (new_iomap == NULL) {
pgprot_t prot = ttm_io_prot(new_mem->placement,
+ ttm->page_flags,
PAGE_KERNEL);
ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
prot);
@@ -526,11 +528,11 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
return 0;
}

-pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
+pgprot_t ttm_io_prot(u32 caching_flags, u32 tt_page_flags, pgprot_t tmp)
{
/* Cached mappings need no adjustment */
if (caching_flags & TTM_PL_FLAG_CACHED)
- return tmp;
+ goto check_encryption;

#if defined(__i386__) || defined(__x86_64__)
if (caching_flags & TTM_PL_FLAG_WC)
@@ -548,6 +550,11 @@ pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
#if defined(__sparc__)
tmp = pgprot_noncached(tmp);
#endif
+
+check_encryption:
+ if (tt_page_flags & TTM_PAGE_FLAG_DECRYPTED)
+ tmp = pgprot_decrypted(tmp);
+
return tmp;
}
EXPORT_SYMBOL(ttm_io_prot);
@@ -594,7 +601,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
if (ret)
return ret;

- if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED)) {
+ if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED) &&
+ !(ttm->page_flags & TTM_PAGE_FLAG_DECRYPTED)) {
/*
* We're mapping a single page, and the desired
* page protection is consistent with the bo.
@@ -608,7 +616,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
* We need to use vmap to get the desired page protection
* or to make the buffer object look contiguous.
*/
- prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
+ prot = ttm_io_prot(mem->placement, ttm->page_flags,
+ PAGE_KERNEL);
map->bo_kmap_type = ttm_bo_map_vmap;
map->virtual = vmap(ttm->pages + start_page, num_pages,
0, prot);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 76eedb963693..194d8d618d23 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -226,12 +226,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
* by mmap_sem in write mode.
*/
cvma = *vma;
- cvma.vm_page_prot = vm_get_page_prot(cvma.vm_flags);
-
- if (bo->mem.bus.is_iomem) {
- cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
- cvma.vm_page_prot);
- } else {
+ if (!bo->mem.bus.is_iomem) {
struct ttm_operation_ctx ctx = {
.interruptible = false,
.no_wait_gpu = false,
@@ -240,14 +235,18 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
};

ttm = bo->ttm;
- cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
- cvma.vm_page_prot);
-
- /* Allocate all page at once, most common usage */
- if (ttm_tt_populate(ttm, &ctx)) {
+ if (ttm_tt_populate(bo->ttm, &ctx)) {
ret = VM_FAULT_OOM;
goto out_io_unlock;
}
+ cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
+ ttm->page_flags,
+ cvma.vm_page_prot);
+ } else {
+ /* Iomem should not be marked encrypted */
+ cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
+ TTM_PAGE_FLAG_DECRYPTED,
+ cvma.vm_page_prot);
}

/*
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 7d78e6deac89..9b15df8ecd49 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -48,6 +48,7 @@
#include <linux/atomic.h>
#include <linux/device.h>
#include <linux/kthread.h>
+#include <linux/dma-direct.h>
#include <drm/ttm/ttm_bo_driver.h>
#include <drm/ttm/ttm_page_alloc.h>
#include <drm/ttm/ttm_set_memory.h>
@@ -984,6 +985,9 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev,
}

ttm->state = tt_unbound;
+ if (force_dma_unencrypted(dev))
+ ttm->page_flags |= TTM_PAGE_FLAG_DECRYPTED;
+
return 0;
}
EXPORT_SYMBOL_GPL(ttm_dma_populate);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
index bb46ca0c458f..d3ced89a37e9 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -483,8 +483,10 @@ int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
d.src_pages = src->ttm->pages;
d.dst_num_pages = dst->num_pages;
d.src_num_pages = src->num_pages;
- d.dst_prot = ttm_io_prot(dst->mem.placement, PAGE_KERNEL);
- d.src_prot = ttm_io_prot(src->mem.placement, PAGE_KERNEL);
+ d.dst_prot = ttm_io_prot(dst->mem.placement, dst->ttm->page_flags,
+ PAGE_KERNEL);
+ d.src_prot = ttm_io_prot(src->mem.placement, src->ttm->page_flags,
+ PAGE_KERNEL);
d.diff = diff;

for (j = 0; j < h; ++j) {
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 6f536caea368..68ead1bd3042 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -893,13 +893,15 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo);
/**
* ttm_io_prot
*
- * @c_state: Caching state.
+ * @caching_flags: The caching flags of the map.
+ * @tt_page_flags: The tt_page_flags of the map, TTM_PAGE_FLAG_*
* @tmp: Page protection flag for a normal, cached mapping.
*
* Utility function that returns the pgprot_t that should be used for
- * setting up a PTE with the caching model indicated by @c_state.
+ * setting up a PTE with the caching model indicated by @caching_flags,
+ * and encryption state indicated by @tt_page_flags.
*/
-pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp);
+pgprot_t ttm_io_prot(u32 caching_flags, u32 tt_page_flags, pgprot_t tmp);

extern const struct ttm_mem_type_manager_func ttm_bo_manager_func;

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index c0e928abf592..45cc26355513 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -41,6 +41,7 @@ struct ttm_operation_ctx;
#define TTM_PAGE_FLAG_DMA32 (1 << 7)
#define TTM_PAGE_FLAG_SG (1 << 8)
#define TTM_PAGE_FLAG_NO_RETRY (1 << 9)
+#define TTM_PAGE_FLAG_DECRYPTED (1 << 10)

enum ttm_caching_state {
tt_uncached,
--
2.20.1

Subject: [PATCH v2 4/4] drm/ttm: Cache dma pool decrypted pages when AMD SEV is active

From: Thomas Hellstrom <[email protected]>

The TTM dma pool allocates coherent pages for use with TTM. When forcing
unencrypted DMA, such allocations become very expensive since the linear
kernel map has to be changed to mark the pages decrypted. To avoid too many
such allocations and frees, cache the decrypted pages even if they
are in the normal CPU caching state, in which the pool otherwise frees
them immediately when unused.

Tested with vmwgfx on SEV-ES.

Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
---
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 9b15df8ecd49..a3247f24e106 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -1000,7 +1000,7 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
struct dma_pool *pool;
struct dma_page *d_page, *next;
enum pool_type type;
- bool is_cached = false;
+ bool immediate_free = false;
unsigned count, i, npages = 0;
unsigned long irq_flags;

@@ -1035,8 +1035,17 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
if (!pool)
return;

- is_cached = (ttm_dma_find_pool(pool->dev,
- ttm_to_type(ttm->page_flags, tt_cached)) == pool);
+ /*
+ * If memory is cached and SEV encryption is not active, allocating
+ * and freeing coherent memory is relatively cheap, so we can free
+ * it immediately. If SEV encryption is active, allocating coherent
+ * memory involves a call to set_memory_decrypted() which is very
+ * expensive, so cache coherent pages if SEV is active.
+ */
+ immediate_free = (ttm_dma_find_pool
+ (pool->dev,
+ ttm_to_type(ttm->page_flags, tt_cached)) == pool &&
+ !force_dma_unencrypted(dev));

/* make sure pages array match list and count number of pages */
count = 0;
@@ -1051,13 +1060,13 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
d_page->vaddr &= ~VADDR_FLAG_UPDATED_COUNT;
}

- if (is_cached)
+ if (immediate_free)
ttm_dma_page_put(pool, d_page);
}

spin_lock_irqsave(&pool->lock, irq_flags);
pool->npages_in_use -= count;
- if (is_cached) {
+ if (immediate_free) {
pool->nfrees += count;
} else {
pool->npages_free += count;
--
2.20.1

2019-09-03 13:47:49

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On Tue, Sep 03, 2019 at 03:15:01PM +0200, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom <[email protected]>
>
> The force_dma_unencrypted symbol is needed by TTM to set up the correct
> page protection when memory encryption is active. Export it.

NAK. This is a helper for the core DMA code and drivers have no
business looking at it.

2019-09-03 13:49:28

by Christoph Hellwig

Subject: Re: [PATCH v2 2/4] s390/mm: Export force_dma_unencrypted

On Tue, Sep 03, 2019 at 03:15:02PM +0200, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom <[email protected]>
>
> The force_dma_unencrypted symbol is needed by TTM to set up the correct
> page protection when memory encryption is active. Export it.

Same here. None of a driver's business. DMA decisions are hidden
behind the DMA API.

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

Hi, Christoph,

On 9/3/19 3:46 PM, Christoph Hellwig wrote:
> On Tue, Sep 03, 2019 at 03:15:01PM +0200, Thomas Hellström (VMware) wrote:
>> From: Thomas Hellstrom <[email protected]>
>>
>> The force_dma_unencrypted symbol is needed by TTM to set up the correct
>> page protection when memory encryption is active. Export it.
> NAK. This is a helper for the core DMA code and drivers have no
> business looking at it.

Is this a layer violation concern, that is, would you be ok with a
similar helper for TTM, or is it that you want to force the graphics
drivers into adhering strictly to the DMA API, even when, from an
engineering perspective, it makes no sense?

If it's the latter, then I would like to reiterate that it would be
better if we worked on a long-term plan to add what's missing to the
DMA API to help graphics drivers use coherent memory.

Thanks,

Thomas


2019-09-03 15:18:10

by Dave Hansen

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On 9/3/19 6:15 AM, Thomas Hellström (VMware) wrote:
> The force_dma_unencrypted symbol is needed by TTM to set up the correct
> page protection when memory encryption is active. Export it.

It would be great if this had enough background that I didn't have to
look at patch 4 to figure out what TTM might be.

Why is TTM special? How many other drivers would have to be modified in
a one-off fashion if we go this way? What's the logic behind this being
a non-GPL export?

2019-09-03 15:20:26

by Daniel Vetter

Subject: Re: [PATCH v2 0/4] Have TTM support SEV encryption with coherent memory

On Tue, Sep 03, 2019 at 03:15:00PM +0200, Thomas Hellström (VMware) wrote:
> With SEV memory encryption and in some cases also with SME memory
> encryption, coherent memory is unencrypted. In those cases, TTM doesn't
> set up the correct page protection. Fix this by having the TTM
> coherent page allocator call into the platform code to determine whether
> coherent memory is encrypted or not, and modify the page protection if
> it is not.
>
> v2:
> - Use force_dma_unencrypted() rather than sev_active() to catch also the
> special SME encryption cases.

We should probably cc Christoph Hellwig on this ... better to hear his
screams before merging than afterwards. As much as I don't support
screaming maintainers, that seems the least bad option here.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2019-09-03 16:23:17

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On Tue, Sep 03, 2019 at 04:32:45PM +0200, Thomas Hellström (VMware) wrote:
> Is this a layer violation concern, that is, would you be ok with a similar
> helper for TTM, or is it that you want to force the graphics drivers into
> adhering strictly to the DMA api, even when it from an engineering
> perspective makes no sense?

From looking at DRM I strongly believe that making DRM use the DMA
mapping properly makes a lot of sense from the engineering perspective,
and this series is a good argument for that position. If DRM was using
the DMA API properly we would not need this series to start with; all the
SEV handling is hidden behind the DMA API. While we have had occasional
bugs in that support, fixing them meant that the fix covered all drivers
properly using that API.

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On 9/3/19 5:14 PM, Dave Hansen wrote:
> On 9/3/19 6:15 AM, Thomas Hellström (VMware) wrote:
>> The force_dma_unencrypted symbol is needed by TTM to set up the correct
>> page protection when memory encryption is active. Export it.
> It would be great if this had enough background that I didn't have to
> look at patch 4 to figure out what TTM might be.
>
> Why is TTM special? How many other drivers would have to be modified in
> a one-off fashion if we go this way? What's the logic behind this being
> a non-GPL export?

TTM tries to abstract mapping of graphics buffer objects regardless of
where they live, be it in PCI memory or system memory. As such it needs
to figure out the proper page protection. For example, if a buffer object
is moved from PCI memory to system memory transparently to a user-space
application, all user-space mappings need to be killed and then
reinstated pointing to the new location, sometimes with a new page
protection.
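
For the curious, the "kill mappings" step boils down to a single call.
A condensed sketch modeled on TTM's ttm_bo_unmap_virtual(), with member
names simplified and locking omitted:

/* Zap every user-space PTE of the mmap range backing this object.
 * The next fault re-inserts PTEs pointing at the new backing store,
 * possibly with a different page protection. */
static void bo_kill_user_mappings(struct ttm_buffer_object *bo)
{
	drm_vma_node_unmap(&bo->vma_node, bo->bdev->dev_mapping);
}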

I try to keep away as much as possible from the non-GPL vs GPL export
discussions. I have no strong opinion on the subject. Although since
sev_active() is a non-GPL export, I decided to mimic that.

Thanks
Thomas



2019-09-03 19:39:36

by Dave Hansen

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

This whole thing looks like a fascinating collection of hacks. :)

ttm is taking a stack-allocated "VMA" and handing it to vmf_insert_*()
which obviously are expecting "real" VMAs that are linked into the mm.
It's extracting some pgprot_t information from the real VMA, making a
pseudo-temporary VMA, then passing the temporary one back into the
insertion functions:

> static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
> {
...
> struct vm_area_struct cvma;
...
> if (vma->vm_flags & VM_MIXEDMAP)
> ret = vmf_insert_mixed(&cvma, address,
> __pfn_to_pfn_t(pfn, PFN_DEV));
> else
> ret = vmf_insert_pfn(&cvma, address, pfn);

I can totally see why this needs new exports. But, man, it doesn't seem
like something we want to keep *feeding*.

The real problem here is that the encryption bits from the device VMA's
"true" vma->vm_page_prot don't match the ones that actually get
inserted, probably because the device ptes need the encryption bits
cleared but the system memory PTEs need them set *and* they're mixed
under one VMA.

The thing we need to stop is having mixed encryption rules under one VMA.

2019-09-03 19:52:41

by Daniel Vetter

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On Tue, Sep 3, 2019 at 9:38 PM Dave Hansen <[email protected]> wrote:
>
> This whole thing looks like a fascinating collection of hacks. :)
>
> ttm is taking a stack-allocated "VMA" and handing it to vmf_insert_*()
> which obviously are expecting "real" VMAs that are linked into the mm.
> It's extracting some pgprot_t information from the real VMA, making a
> pseudo-temporary VMA, then passing the temporary one back into the
> insertion functions:
>
> > static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
> > {
> ...
> > struct vm_area_struct cvma;
> ...
> > if (vma->vm_flags & VM_MIXEDMAP)
> > ret = vmf_insert_mixed(&cvma, address,
> > __pfn_to_pfn_t(pfn, PFN_DEV));
> > else
> > ret = vmf_insert_pfn(&cvma, address, pfn);
>
> I can totally see why this needs new exports. But, man, it doesn't seem
> like something we want to keep *feeding*.
>
> The real problem here is that the encryption bits from the device VMA's
> "true" vma->vm_page_prot don't match the ones that actually get
> inserted, probably because the device ptes need the encryption bits
> cleared but the system memory PTEs need them set *and* they're mixed
> under one VMA.
>
> The thing we need to stop is having mixed encryption rules under one VMA.

The point here is that we want this. We need to be able to move the
buffer between device ptes and system memory ptes, transparently,
behind userspace back, without races. And the fast path (which is "no
pte exists for this vma") must be real fast, so taking mmap_sem and
replacing the vma is no-go.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2019-09-03 19:56:57

by Dave Hansen

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/3/19 12:51 PM, Daniel Vetter wrote:
>> The thing we need to stop is having mixed encryption rules under one VMA.
> The point here is that we want this. We need to be able to move the
> buffer between device ptes and system memory ptes, transparently,
> behind userspace back, without races. And the fast path (which is "no
> pte exists for this vma") must be real fast, so taking mmap_sem and
> replacing the vma is no-go.

So, when the user asks for encryption and we say, "sure, we'll encrypt
that", then we want the device driver to be able to transparently undo
that encryption under the covers for device memory? That seems suboptimal.

I'd rather the device driver just say: "Nope, you can't encrypt my VMA".
Because that's the truth.

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/3/19 9:55 PM, Dave Hansen wrote:
> On 9/3/19 12:51 PM, Daniel Vetter wrote:
>>> The thing we need to stop is having mixed encryption rules under one VMA.
>> The point here is that we want this. We need to be able to move the
>> buffer between device ptes and system memory ptes, transparently,
>> behind userspace back, without races. And the fast path (which is "no
>> pte exists for this vma") must be real fast, so taking mmap_sem and
>> replacing the vma is no-go.
> So, when the user asks for encryption and we say, "sure, we'll encrypt
> that", then we want the device driver to be able to transparently undo
> that encryption under the covers for device memory? That seems suboptimal.
>
> I'd rather the device driver just say: "Nope, you can't encrypt my VMA".
> Because that's the truth.

The thing here is that it's the underlying physical memory that defines
the correct encryption flags. If it's DMA memory with SEV active, or PCI
memory, it's always unencrypted. User-space in a SEV VM should always,
from a data protection point of view, *assume* that graphics buffers are
unencrypted. (Which will of course limit the use of GPUs and display
controllers in a SEV VM.) Platform code sets the vma encryption to on by
default.

So the question here should really be, can we determine already at mmap
time whether backing memory will be unencrypted and adjust the *real*
vma->vm_page_prot under the mmap_sem?

Possibly, but that requires populating the buffer with memory at mmap
time rather than at first fault time.

And it still requires knowledge of whether the device DMA is always
unencrypted (or if SEV is active).
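
As a thought experiment, the mmap-time variant would be a one-liner in
the driver's mmap callback; a minimal sketch, where
backing_is_unencrypted() is an assumed, hypothetical driver-side
predicate that would have to hold for the lifetime of the mapping:

/* Hypothetical mmap-time fixup, run with mmap_sem held for write.
 * backing_is_unencrypted() is an assumption, not an existing helper. */
static int bo_mmap(struct file *filp, struct vm_area_struct *vma)
{
	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
	if (backing_is_unencrypted(filp))
		vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
	return 0;
}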

/Thomas




Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On 9/3/19 6:22 PM, Christoph Hellwig wrote:
> On Tue, Sep 03, 2019 at 04:32:45PM +0200, Thomas Hellström (VMware) wrote:
>> Is this a layer violation concern, that is, would you be ok with a similar
>> helper for TTM, or is it that you want to force the graphics drivers into
>> adhering strictly to the DMA api, even when it from an engineering
>> perspective makes no sense?
> From looking at DRM I strongly believe that making DRM use the DMA
> mapping properly makes a lot of sense from the engineering perspective,
> and this series is a good argument for that position.

What I mean with "from an engineering perspective" is that drivers would
end up with a non-trivial amount of code supporting purely academic
cases: setups where software rendering would be faster than GPU
acceleration, and setups on platforms where the driver would never run
anyway because the device would never be supported on that platform...

> If DRM was using
> the DMA API properly we would not need this series to start with; all the
> SEV handling is hidden behind the DMA API. While we have had occasional
> bugs in that support, fixing them meant that the fix covered all drivers
> properly using that API.

That is not really true. The DMA API can't handle faulting of coherent
pages, which is what this series is really all about supporting, also
with SEV active: handling the case where we move graphics buffers or
send them to swap space while user-space has them mapped.

To do that and still be fully DMA API compliant we would ideally need,
for example, an exported dma_pgprot(). (dma_pgprot(), by the way, is
still suffering from one of the bugs that you mention above.)

Still, I need a way forward and my questions weren't really answered by
this.

/Thomas





2019-09-03 20:52:46

by Dave Hansen

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
> So the question here should really be, can we determine already at mmap
> time whether backing memory will be unencrypted and adjust the *real*
> vma->vm_page_prot under the mmap_sem?
>
> Possibly, but that requires populating the buffer with memory at mmap
> time rather than at first fault time.

I'm not connecting the dots.

vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
are created at mmap() or fault time. If we establish a good
vma->vm_page_prot, can't we just use it forever for demand faults?

Or, are you concerned that if an attempt is made to demand-fault page
that's incompatible with vma->vm_page_prot that we have to SEGV?

> And it still requires knowledge whether the device DMA is always
> unencrypted (or if SEV is active).

I may be getting mixed up on MKTME (the Intel memory encryption) and
SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
anonymous? Or just anonymous?

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/3/19 10:51 PM, Dave Hansen wrote:
> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>> So the question here should really be, can we determine already at mmap
>> time whether backing memory will be unencrypted and adjust the *real*
>> vma->vm_page_prot under the mmap_sem?
>>
>> Possibly, but that requires populating the buffer with memory at mmap
>> time rather than at first fault time.
> I'm not connecting the dots.
>
> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
> are created at mmap() or fault time. If we establish a good
> vma->vm_page_prot, can't we just use it forever for demand faults?

With SEV I think that we could possibly establish the encryption flags
at vma creation time. But thinking of it, it would actually break with
SME, where buffer content can be moved between encrypted system memory
and unencrypted graphics card PCI memory behind user-space's back. That
would imply killing all user-space encrypted PTEs and setting up new
ones at fault time pointing to unencrypted PCI memory.

>
> Or, are you concerned that if an attempt is made to demand-fault page
> that's incompatible with vma->vm_page_prot that we have to SEGV?
>
>> And it still requires knowledge whether the device DMA is always
>> unencrypted (or if SEV is active).
> I may be getting mixed up on MKTME (the Intel memory encryption) and
> SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
> anonymous? Or just anonymous?

SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
memory to unencrypted (which is a very slow operation and patch 4 deals
with caching such memory).

/Thomas




2019-09-03 21:42:30

by Andy Lutomirski

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On Tue, Sep 3, 2019 at 1:46 PM Thomas Hellström (VMware)
<[email protected]> wrote:
>
> On 9/3/19 6:22 PM, Christoph Hellwig wrote:
> > On Tue, Sep 03, 2019 at 04:32:45PM +0200, Thomas Hellström (VMware) wrote:
> >> Is this a layer violation concern, that is, would you be ok with a similar
> >> helper for TTM, or is it that you want to force the graphics drivers into
> >> adhering strictly to the DMA api, even when it from an engineering
> >> perspective makes no sense?
> > From looking at DRM I strongly believe that making DRM use the DMA
> > mapping properly makes a lot of sense from the engineering perspective,
> > and this series is a good argument for that position.
>
> What I mean with "from an engineering perspective" is that drivers would
> end up with a non-trivial amount of code supporting purely academic
> cases: Setups where software rendering would be faster than gpu
> accelerated, and setups on platforms where the driver would never run
> anyway because the device would never be supported on that platform...
>
> > If DRM was using
> > the DMA API properly we would not need this series to start with; all the
> > SEV handling is hidden behind the DMA API. While we have had occasional
> > bugs in that support, fixing them meant that the fix covered all drivers
> > properly using that API.
>
> That is not really true. The dma API can't handle faulting of coherent
> pages which is what this series is really all about supporting also with
> SEV active. To handle the case where we move graphics buffers or send
> them to swap space while user-space have them mapped.
>
> To do that and still be fully dma-api compliant we would ideally need,
> for example, an exported dma_pgprot(). (dma_pgprot() by the way is still
> suffering from one of the bugs that you mention above).
>
> Still, I need a way forward and my questions weren't really answered by
> this.
>
>

I read this patch, I read force_dma_unencrypted(), I read the changelog
again, and I haven't the faintest clue what TTM could possibly be
doing with force_dma_unencrypted().

You're saying that TTM needs to transparently change mappings to
relocate objects in memory between system memory and device memory.
Great, I don't see the problem. Is the issue that you need to
allocate system memory that is addressable by the GPU and that, if the
GPU has insufficient PA bits, you need unencrypted memory? If so,
this sounds like an excellent use for the DMA API. Rather than
kludging knowledge of force_dma_unencrypted() directly into the driver,
can't you at least add, if needed, a new helper specifically to
allocate memory that can be addressed by the device? Like
dma_alloc_coherent()? Or, if for some reason, dma_alloc_coherent()
doesn't do what you need or your driver isn't ready to use it, then
explain *why* and introduce a new function to solve your problem?

Keep in mind that, depending on just how MKTME ends up being supported
in Linux, it's entirely possible that it will be *backwards* from what
you expect -- high address bits will be needed to ask for
*unencrypted* memory.

--Andy

2019-09-03 21:47:58

by Andy Lutomirski

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
<[email protected]> wrote:
>
> On 9/3/19 10:51 PM, Dave Hansen wrote:
> > On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
> >> So the question here should really be, can we determine already at mmap
> >> time whether backing memory will be unencrypted and adjust the *real*
> >> vma->vm_page_prot under the mmap_sem?
> >>
> >> Possibly, but that requires populating the buffer with memory at mmap
> >> time rather than at first fault time.
> > I'm not connecting the dots.
> >
> > vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
> > are created at mmap() or fault time. If we establish a good
> > vma->vm_page_prot, can't we just use it forever for demand faults?
>
> With SEV I think that we could possibly establish the encryption flags
> at vma creation time. But thinking of it, it would actually break with
> SME where buffer content can be moved between encrypted system memory
> and unencrypted graphics card PCI memory behind user-space's back. That
> would imply killing all user-space encrypted PTEs and at fault time set
> up new ones pointing to unencrypted PCI memory..
>
> >
> > Or, are you concerned that if an attempt is made to demand-fault page
> > that's incompatible with vma->vm_page_prot that we have to SEGV?
> >
> >> And it still requires knowledge whether the device DMA is always
> >> unencrypted (or if SEV is active).
> > I may be getting mixed up on MKTME (the Intel memory encryption) and
> > SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
> > anonymous? Or just anonymous?
>
> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
> memory to unencrypted (which is a very slow operation and patch 4 deals
> with caching such memory).
>

I'm still lost. You have some fancy VMA where the backing pages
change behind the application's back. This isn't particularly novel
-- plain old anonymous memory and plain old mapped files do this too.
Can't you call the insert_pfn APIs and call it a day? What's so
special that you need all this magic? ISTM you should be able to
allocate memory that's addressable by the device (dma_alloc_coherent()
or whatever) and then map it into user memory just like you'd map any
other page.

I feel like I'm missing something here.

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/3/19 11:46 PM, Andy Lutomirski wrote:
> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
> <[email protected]> wrote:
>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>> So the question here should really be, can we determine already at mmap
>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>> vma->vm_page_prot under the mmap_sem?
>>>>
>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>> time rather than at first fault time.
>>> I'm not connecting the dots.
>>>
>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>> are created at mmap() or fault time. If we establish a good
>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>> With SEV I think that we could possibly establish the encryption flags
>> at vma creation time. But thinking of it, it would actually break with
>> SME where buffer content can be moved between encrypted system memory
>> and unencrypted graphics card PCI memory behind user-space's back. That
>> would imply killing all user-space encrypted PTEs and at fault time set
>> up new ones pointing to unencrypted PCI memory..
>>
>>> Or, are you concerned that if an attempt is made to demand-fault page
>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>
>>>> And it still requires knowledge whether the device DMA is always
>>>> unencrypted (or if SEV is active).
>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>> SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
>>> anonymous? Or just anonymous?
>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>> memory to unencrypted (which is a very slow operation and patch 4 deals
>> with caching such memory).
>>
> I'm still lost. You have some fancy VMA where the backing pages
> change behind the application's back. This isn't particularly novel
> -- plain old anonymous memory and plain old mapped files do this too.
> Can't you all the insert_pfn APIs and call it a day? What's so
> special that you need all this magic? ISTM you should be able to
> allocate memory that's addressable by the device (dma_alloc_coherent()
> or whatever) and then map it into user memory just like you'd map any
> other page.
>
> I feel like I'm missing something here.

Yes, so in this case we use dma_alloc_coherent().

With SEV, that gives us unencrypted pages (pages whose linear kernel
map is marked unencrypted). With SME that (typically) gives us encrypted
pages. In both these cases, vm_get_page_prot() returns
an encrypted page protection, which lands in vma->vm_page_prot.

In the SEV case, we therefore need to modify the page protection to
unencrypted, and hence we need to know whether we're running under SEV.
Otherwise the user-space PTE would incorrectly have the encryption flag
set.
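
Concretely, the fault-time fixup in this series boils down to the
following; a condensed sketch of the ttm_io_prot() path from patch 3:

/* Condensed from patch 3: strip the encryption bit from the PTE
 * protection when the backing pages were forced unencrypted. */
pgprot_t prot = vm_get_page_prot(vma->vm_flags);

if (ttm->page_flags & TTM_PAGE_FLAG_DECRYPTED)
	prot = pgprot_decrypted(prot);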

/Thomas


Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>> <[email protected]> wrote:
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>> would imply killing all user-space encrypted PTEs and at fault time set
>>> up new ones pointing to unencrypted PCI memory..
>>>
>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>
>>>>> And it still requires knowledge whether the device DMA is always
>>>>> unencrypted (or if SEV is active).
>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
>>>> anonymous?  Or just anonymous?
>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>> with caching such memory).
>>>
>> I'm still lost.  You have some fancy VMA where the backing pages
>> change behind the application's back.  This isn't particularly novel
>> -- plain old anonymous memory and plain old mapped files do this too.
>> Can't you all the insert_pfn APIs and call it a day?  What's so
>> special that you need all this magic?  ISTM you should be able to
>> allocate memory that's addressable by the device (dma_alloc_coherent()
>> or whatever) and then map it into user memory just like you'd map any
>> other page.
>>
>> I feel like I'm missing something here.
>
> Yes, so in this case we use dma_alloc_coherent().
>
> With SEV, that gives us unencrypted pages. (Pages whose linear kernel
> map is marked unencrypted). With SME that (typcially) gives us
> encrypted pages. In both these cases, vm_get_page_prot() returns
> an encrypted page protection, which lands in vma->vm_page_prot.
>
> In the SEV case, we therefore need to modify the page protection to
> unencrypted. Hence we need to know whether we're running under SEV and
> therefore need to modify the protection. If not, the user-space PTE
> would incorrectly have the encryption flag set.
>
> /Thomas
>
>
And, of course, had we not been "fancy", we could have used
dma_mmap_coherent(), which in theory should set up the correct
user-space page protection. But now we're moving stuff around so we can't.

/Thomas


2019-09-03 23:12:26

by Dave Hansen

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

Thomas, this series has garnered a nak and a whole pile of thoroughly
confused reviewers.

Could you take another stab at this along with a more ample changelog
explaining the context of the problem? I suspect that's a better place
to start than having us all piece together the disparate parts of the
thread.

2019-09-03 23:16:56

by Andy Lutomirski

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption



> On Sep 3, 2019, at 3:15 PM, Thomas Hellström (VMware) <[email protected]> wrote:
>
>> On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
>>> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>>> <[email protected]> wrote:
>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>> So the question here should really be, can we determine already at mmap
>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>>
>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>> time rather than at first fault time.
>>>>> I'm not connecting the dots.
>>>>>
>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>> are created at mmap() or fault time. If we establish a good
>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>> With SEV I think that we could possibly establish the encryption flags
>>>> at vma creation time. But thinking of it, it would actually break with
>>>> SME where buffer content can be moved between encrypted system memory
>>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>>> would imply killing all user-space encrypted PTEs and at fault time set
>>>> up new ones pointing to unencrypted PCI memory..
>>>>
>>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>>
>>>>>> And it still requires knowledge whether the device DMA is always
>>>>>> unencrypted (or if SEV is active).
>>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>>> SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
>>>>> anonymous? Or just anonymous?
>>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>>> with caching such memory).
>>>>
>>> I'm still lost. You have some fancy VMA where the backing pages
>>> change behind the application's back. This isn't particularly novel
>>> -- plain old anonymous memory and plain old mapped files do this too.
>>> Can't you all the insert_pfn APIs and call it a day? What's so
>>> special that you need all this magic? ISTM you should be able to
>>> allocate memory that's addressable by the device (dma_alloc_coherent()
>>> or whatever) and then map it into user memory just like you'd map any
>>> other page.
>>>
>>> I feel like I'm missing something here.
>>
>> Yes, so in this case we use dma_alloc_coherent().
>>
>> With SEV, that gives us unencrypted pages. (Pages whose linear kernel map is marked unencrypted). With SME that (typcially) gives us encrypted pages. In both these cases, vm_get_page_prot() returns
>> an encrypted page protection, which lands in vma->vm_page_prot.
>>
>> In the SEV case, we therefore need to modify the page protection to unencrypted. Hence we need to know whether we're running under SEV and therefore need to modify the protection. If not, the user-space PTE would incorrectly have the encryption flag set.
>>

I’m still confused. You got unencrypted pages with an unencrypted PFN.
Why do you need to fiddle? You have a PFN, and you’re inserting it with
vmf_insert_pfn(). This should just work, no? There doesn’t seem to be
any real funny business in dma_mmap_attrs() or dma_common_mmap().

But, reading this, I have more questions:

Can’t you get rid of cvma by using vmf_insert_pfn_prot()?

Would it make sense to add a vmf_insert_dma_page() to directly do exactly what you’re trying to do?

And a broader question just because I’m still confused: why isn’t the
encryption bit in the PFN? The whole SEV/SME system seems like it’s
trying a bit too hard to be fully invisible to the kernel.

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

On 9/4/19 1:15 AM, Andy Lutomirski wrote:
>
>> On Sep 3, 2019, at 3:15 PM, Thomas Hellström (VMware) <[email protected]> wrote:
>>
>>> On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
>>>> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>>>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>>>> <[email protected]> wrote:
>>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>>> So the question here should really be, can we determine already at mmap
>>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>>>
>>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>>> time rather than at first fault time.
>>>>>> I'm not connecting the dots.
>>>>>>
>>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>>> are created at mmap() or fault time. If we establish a good
>>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>>> With SEV I think that we could possibly establish the encryption flags
>>>>> at vma creation time. But thinking of it, it would actually break with
>>>>> SME where buffer content can be moved between encrypted system memory
>>>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>>>> would imply killing all user-space encrypted PTEs and at fault time set
>>>>> up new ones pointing to unencrypted PCI memory..
>>>>>
>>>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>>>
>>>>>>> And it still requires knowledge whether the device DMA is always
>>>>>>> unencrypted (or if SEV is active).
>>>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>>>> SEV. Is SEV supported on all memory types? Page cache, hugetlbfs,
>>>>>> anonymous? Or just anonymous?
>>>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>>>> with caching such memory).
>>>>>
>>>> I'm still lost. You have some fancy VMA where the backing pages
>>>> change behind the application's back. This isn't particularly novel
>>>> -- plain old anonymous memory and plain old mapped files do this too.
>>>> Can't you all the insert_pfn APIs and call it a day? What's so
>>>> special that you need all this magic? ISTM you should be able to
>>>> allocate memory that's addressable by the device (dma_alloc_coherent()
>>>> or whatever) and then map it into user memory just like you'd map any
>>>> other page.
>>>>
>>>> I feel like I'm missing something here.
>>> Yes, so in this case we use dma_alloc_coherent().
>>>
>>> With SEV, that gives us unencrypted pages. (Pages whose linear kernel map is marked unencrypted). With SME that (typcially) gives us encrypted pages. In both these cases, vm_get_page_prot() returns
>>> an encrypted page protection, which lands in vma->vm_page_prot.
>>>
>>> In the SEV case, we therefore need to modify the page protection to unencrypted. Hence we need to know whether we're running under SEV and therefore need to modify the protection. If not, the user-space PTE would incorrectly have the encryption flag set.
>>>
> I’m still confused. You got unencrypted pages with an unencrypted PFN. Why do you need to fiddle? You have a PFN, and you’re inserting it with vmf_insert_pfn(). This should just work, no?

OK now I see what causes the confusion.

With SEV, the encryption state is, while *physically* encoded in an
address bit, from what I can tell not *logically* encoded in the pfn,
but in the page_prot for CPU mapping purposes. That is, page_to_pfn()
returns the same pfn whether the page is encrypted or unencrypted. Hence
nobody can tell from the pfn whether the page is unencrypted or encrypted.

For device DMA address purposes, the encryption status is encoded in the
DMA address by the DMA layer in phys_to_dma().
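
For reference, dma-direct did roughly the following at the time; a
paraphrase of include/linux/dma-direct.h, not a verbatim quote:

/* The SME/SEV C-bit is applied to the DMA address handed to the
 * device, not to the pfn used for CPU mappings. */
static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
{
	return __sme_set(__phys_to_dma(dev, paddr));
}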


> There doesn’t seem to be any real funny business in dma_mmap_attrs() or dma_common_mmap().

No, from what I can tell the call in these functions to dma_pgprot()
generates an incorrect page protection since it doesn't take unencrypted
coherent memory into account. I don't think anybody has used these
functions yet with SEV.

>
> But, reading this, I have more questions:
>
> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?

It looks like that, although there are comments in the code about
serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
write-combining and PAT, so that would require some serious testing with
hardware I don't have. But I guess there is definitely room for
improvement here. Ideally we'd like to be able to change the
vma->vm_page_prot within fault(). But we can't.
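
A minimal sketch of the cvma-free variant, assuming
vmf_insert_pfn_prot() turns out to perform adequately with
write-combining and PAT:

/* Hypothetical fault path without the stack-allocated cvma: compute
 * the desired protection and hand it straight to the insertion
 * helper instead of faking up a temporary VMA. */
pgprot_t prot = ttm_io_prot(bo->mem.placement, ttm->page_flags,
			    vm_get_page_prot(vma->vm_flags));

ret = vmf_insert_pfn_prot(vma, address, pfn, prot);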

>
> Would it make sense to add a vmf_insert_dma_page() to directly do exactly what you’re trying to do?

Yes, but as a longer term solution I would prefer a general dma_pgprot()
exported, so that we could, in a DMA-compliant way, use coherent pages
with other APIs, like kmap_atomic_prot() and vmap(). That is, basically
split coherent page allocation in two steps: allocation and mapping.
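
Sketched usage, assuming dma_pgprot() were exported with its current
in-kernel signature (dev, prot, attrs):

/* Hypothetical two-step use of coherent pages: allocate them through
 * the DMA API, then map them elsewhere with a DMA-derived pgprot. */
pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, 0);
void *vaddr = vmap(pages, num_pages, 0, prot);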

>
> And a broader question just because I’m still confused: why isn’t the encryption bit in the PFN? The whole SEV/SME system seems like it’s trying a bit to hard to be fully invisible to the kernel.

I guess you'd have to ask AMD about that. But my understanding is that
encoding it in an address bit does make it trivial to do decryption /
encryption on the fly to DMA devices that are not otherwise aware of it,
just by handing them a special physical address. For cpu mapping
purposes it might become awkward to encode it in the pfn since
pfn_to_page and friends would need knowledge about this. Personally I
think it would have made sense to track it like PAT in track_pfn_insert().

Thanks,

Thomas



2019-09-04 06:59:45

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On Tue, Sep 03, 2019 at 10:46:18PM +0200, Thomas Hellström (VMware) wrote:
> What I mean with "from an engineering perspective" is that drivers would end
> up with a non-trivial amount of code supporting purely academic cases:
> Setups where software rendering would be faster than gpu accelerated, and
> setups on platforms where the driver would never run anyway because the
> device would never be supported on that platform...

And actually work on cases you previously called academic and which now
matter to you because your employer has a sudden interest in SEV.
Academic really is in the eye of the beholder (and of those who pay
the bills).

> That is not really true. The dma API can't handle faulting of coherent pages
> which is what this series is really all about supporting also with SEV
> active. To handle the case where we move graphics buffers or send them to
> swap space while user-space have them mapped.

And the only thing we need to support the fault handler is to add an
offset to the dma_mmap_* APIs. Which I had planned to do for Christian
(one of the few graphics developers who actually tries to play well
with the rest of the kernel instead of piling hacks over hacks like
many others) anyway, but which hasn't happened yet.

> Still, I need a way forward and my questions weren't really answered by
> this.

This is pretty demanding. If you "need" a way forward just work with
all the relevant people instead of piling on local hacks.

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On 9/4/19 8:58 AM, Christoph Hellwig wrote:
> On Tue, Sep 03, 2019 at 10:46:18PM +0200, Thomas Hellström (VMware) wrote:
>> What I mean with "from an engineering perspective" is that drivers would end
>> up with a non-trivial amount of code supporting purely academic cases:
>> Setups where software rendering would be faster than gpu accelerated, and
>> setups on platforms where the driver would never run anyway because the
>> device would never be supported on that platform...
> And actually work on cases you previously called academic and which now
> matter to you because your employer has a suddent interest in SEV.
> Academic really is in the eye of the beholder (and of those who pay
> the bills).

But in this particular case we *do* adhere to the DMA API, at least as
far as we can. But we're missing functionality.

>
>> That is not really true. The dma API can't handle faulting of coherent pages
>> which is what this series is really all about supporting also with SEV
>> active. To handle the case where we move graphics buffers or send them to
>> swap space while user-space have them mapped.
> And the only thing we need to support the fault handler is to add an
> offset to the dma_mmap_* APIs. Which I had planned to do for Christian
> (one of the few grapics developers who actually tries to play well
> with the rest of the kernel instead of piling hacks over hacks like
> many others) anyway, but which hasn't happened yet.

That sounds great. Is there anything I can do to help out? I thought
this was more or less a dead end since the current dma_mmap_* API
requires the mmap_sem to be held in write mode (modifying the
vma->vm_flags) whereas fault() only offers read mode. But that would
definitely work.

>
>> Still, I need a way forward and my questions weren't really answered by
>> this.
> This is pretty demanding. If you "need" a way forward just work with
> all the relevant people instead of piling on local hacks.

But I think that was what I was trying to initiate. The question was

"If it's the latter, then I would like to reiterate that it would be
better that we work to come up with a long term plan to add what's
missing to the DMA api to help graphics drivers use coherent memory?"

And since you NAK'd the original patches, I was sort of hoping for a
point in the right direction.

Thanks,

Thomas




2019-09-04 07:36:02

by Christian König

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
> On 9/3/19 10:51 PM, Dave Hansen wrote:
>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>> So the question here should really be, can we determine already at mmap
>>> time whether backing memory will be unencrypted and adjust the *real*
>>> vma->vm_page_prot under the mmap_sem?
>>>
>>> Possibly, but that requires populating the buffer with memory at mmap
>>> time rather than at first fault time.
>> I'm not connecting the dots.
>>
>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>> are created at mmap() or fault time.  If we establish a good
>> vma->vm_page_prot, can't we just use it forever for demand faults?
>
> With SEV I think that we could possibly establish the encryption flags
> at vma creation time. But thinking of it, it would actually break with
> SME where buffer content can be moved between encrypted system memory
> and unencrypted graphics card PCI memory behind user-space's back.
> That would imply killing all user-space encrypted PTEs and at fault
> time set up new ones pointing to unencrypted PCI memory..

Well my problem is where do you see encrypted system memory here?

At least for AMD GPUs all memory accessed must be unencrypted and that
counts for both system as well as PCI memory.

So I don't get why we can't assume always unencrypted and keep it like that.

Regards,
Christian.
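
As a rough sketch of the "kill the PTEs, re-fault with new protection"
scheme Thomas describes (the struct and its fields are hypothetical,
but unmap_mapping_range() is the real primitive TTM uses for this):

#include <linux/mm.h>

struct my_bo {				/* hypothetical buffer object */
	struct address_space *dev_mapping;
	unsigned long map_offset;	/* mmap offset, in pages */
	unsigned long num_pages;
};

/* After moving the buffer (e.g. encrypted system RAM -> unencrypted
 * PCI memory under SME), zap every user-space PTE; the next access
 * faults, and the fault handler can insert PTEs with a protection
 * matching the new placement. */
static void my_bo_unmap_virtual(struct my_bo *bo)
{
	unmap_mapping_range(bo->dev_mapping,
			    (loff_t)bo->map_offset << PAGE_SHIFT,
			    (loff_t)bo->num_pages << PAGE_SHIFT, 1);
}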

2019-09-04 07:55:30

by Daniel Vetter

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
<[email protected]> wrote:
> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
> > But, reading this, I have more questions:
> >
> > Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
>
> It looks like that, although there are comments in the code about
> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
> write-combining and PAT, so that would require some serious testing with
> hardware I don't have. But I guess there is definitely room for
> improvement here. Ideally we'd like to be able to change the
> vma->vm_page_prot within fault(). But we can

Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
of the PAT tables, which are dead slow. If you have a struct
io_mapping then that can be done once, and then just blindly inserted.
See remap_io_mapping in i915.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
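
A sketch of the vmf_insert_pfn_prot() variant Andy suggests, with the
pfn and protection lookups left as hypothetical driver helpers:

#include <linux/mm.h>

static vm_fault_t my_bo_fault(struct vm_fault *vmf)
{
	unsigned long pfn = my_bo_pfn(vmf);	/* hypothetical lookup */
	pgprot_t prot = my_bo_prot(vmf->vma);	/* hypothetical lookup */

	/* Inserts one pfn with an explicit protection, so no copied
	 * vma ("cvma") is needed. Note Daniel's caveat above: per-pte
	 * PAT lookups are slow unless amortized via an io_mapping. */
	return vmf_insert_pfn_prot(vmf->vma, vmf->address, pfn, prot);
}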

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

Hi, Christian,

On 9/4/19 9:33 AM, Koenig, Christian wrote:
> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>> So the question here should really be, can we determine already at mmap
>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>> vma->vm_page_prot under the mmap_sem?
>>>>
>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>> time rather than at first fault time.
>>> I'm not connecting the dots.
>>>
>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>> are created at mmap() or fault time.  If we establish a good
>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>> With SEV I think that we could possibly establish the encryption flags
>> at vma creation time. But thinking of it, it would actually break with
>> SME where buffer content can be moved between encrypted system memory
>> and unencrypted graphics card PCI memory behind user-space's back.
>> That would imply killing all user-space encrypted PTEs and at fault
>> time set up new ones pointing to unencrypted PCI memory..
> Well my problem is where do you see encrypted system memory here?
>
> At least for AMD GPUs all memory accessed must be unencrypted and that
> counts for both system as well as PCI memory.

We're talking SME now right?

The current SME setup is that if a device's DMA mask says it's capable
of addressing the encryption bit, coherent memory will be encrypted. The
memory controllers will decrypt for the device on the fly. Otherwise
coherent memory will be decrypted.
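
(For reference, this is roughly the rule that the 5.3-era x86
force_dma_unencrypted() from patch 1 implements; a sketch from memory,
not the verbatim kernel code:)

/* Sketch of the 5.3-era x86 logic; not verbatim kernel code. */
bool force_dma_unencrypted(struct device *dev)
{
	if (sev_active())		/* SEV: always unencrypted */
		return true;

	if (sme_active()) {		/* SME: depends on the DMA mask */
		u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
		u64 dma_dev_mask = min_not_zero(dev->coherent_dma_mask,
						dev->bus_dma_mask);

		/* Device cannot address the encryption bit. */
		if ((dma_dev_mask & dma_enc_mask) != dma_enc_mask)
			return true;
	}

	return false;
}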

>
> So I don't get why we can't assume always unencrypted and keep it like that.

I see two reasons. First, it would break with a real device that signals
it's capable of addressing the encryption bit.

Second I can imagine unaccelerated setups (something like vkms using
prime feeding a VNC connection) where we actually want the TTM buffers
encrypted to protect data.

But at least the latter reason is way far out in the future.

So for me I'm ok with that if that works for you?

/Thomas


>
> Regards,
> Christian.


Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

Hi, Dave,

On 9/4/19 1:10 AM, Dave Hansen wrote:
> Thomas, this series has garnered a nak and a whole pile of thoroughly
> confused reviewers.
>
> Could you take another stab at this along with a more ample changelog
> explaining the context of the problem? I suspect that's a better place
> to start than having us all piece together the disparate parts of the
> thread.

Sure.

I was just trying to follow up on the emails to get a better
understanding of what got people confused in the first place.

Thanks,

Thomas


Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On 9/4/19 10:19 AM, Thomas Hellström (VMware) wrote:
> Hi, Christian,
>
> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back.
>>> That would imply killing all user-space encrypted PTEs and at fault
>>> time set up new ones pointing to unencrypted PCI memory..
>> Well my problem is where do you see encrypted system memory here?
>>
>> At least for AMD GPUs all memory accessed must be unencrypted and that
>> counts for both system as well as PCI memory.
>
> We're talking SME now right?
>
> The current SME setup is that if a device's DMA mask says it's capable
> of addressing the encryption bit, coherent memory will be encrypted.
> The memory controllers will decrypt for the device on the fly.
> Otherwise coherent memory will be decrypted.
>
>>
>> So I don't get why we can't assume always unencrypted and keep it
>> like that.
>
> I see two reasons. First, it would break with a real device that
> signals it's capable of addressing the encryption bit.
>
> Second I can imagine unaccelerated setups (something like vkms using
> prime feeding a VNC connection) where we actually want the TTM buffers
> encrypted to protect data.
>
> But at least the latter reason is way far out in the future.
>
> So for me I'm ok with that if that works for you?

Hmm, BTW,

Are you sure the AMD GPUs use unencrypted system memory rather than
relying on the memory controllers to decrypt?

In that case it seems strange that they get away with encrypted TTM
PTEs, whereas vmwgfx doesn't...

/Thomas

>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>
>

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On 9/4/19 9:53 AM, Daniel Vetter wrote:
> On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
> <[email protected]> wrote:
>> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
>>> But, reading this, I have more questions:
>>>
>>> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
>> It looks like that, although there are comments in the code about
>> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
>> write-combining and PAT, so that would require some serious testing with
>> hardware I don't have. But I guess there is definitely room for
>> improvement here. Ideally we'd like to be able to change the
>> vma->vm_page_prot within fault(). But we can
> Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
> of the PAT tables, which are dead slow. If you have a struct
> io_mapping then that can be done once, and then just blindly inserted.
> See remap_io_mapping in i915.
> -Daniel

Thanks, Daniel.

Indeed looks a lot like remap_pfn_range(), but usable at fault time?

/Thomas


2019-09-04 11:13:15

by Christian König

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

Am 04.09.19 um 10:19 schrieb Thomas Hellström (VMware):
> Hi, Christian,
>
> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back.
>>> That would imply killing all user-space encrypted PTEs and at fault
>>> time set up new ones pointing to unencrypted PCI memory..
>> Well my problem is where do you see encrypted system memory here?
>>
>> At least for AMD GPUs all memory accessed must be unencrypted and that
>> counts for both system as well as PCI memory.
>
> We're talking SME now right?
>
> The current SME setup is that if a device's DMA mask says it's capable
> of addressing the encryption bit, coherent memory will be encrypted.
> The memory controllers will decrypt for the device on the fly.
> Otherwise coherent memory will be decrypted.
>
>>
>> So I don't get why we can't assume always unencrypted and keep it
>> like that.
>
> I see two reasons. First, it would break with a real device that
> signals it's capable of addressing the encryption bit.

Why? Because we don't use dma_mmap_coherent()?

I've already discussed with Christoph that we probably want to switch TTM
over to using that instead, to also get rid of the ttm_io_prot() hack.

Regards,
Christian.

>
> Second I can imagine unaccelerated setups (something like vkms using
> prime feeding a VNC connection) where we actually want the TTM buffers
> encrypted to protect data.
>
> But at least the latter reason is way far out in the future.
>
> So for me I'm ok with that if that works for you?
>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>
>

2019-09-04 11:46:41

by Daniel Vetter

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On Wed, Sep 4, 2019 at 12:38 PM Thomas Hellström (VMware)
<[email protected]> wrote:
>
> On 9/4/19 9:53 AM, Daniel Vetter wrote:
> > On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
> > <[email protected]> wrote:
> >> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
> >>> But, reading this, I have more questions:
> >>>
> >>> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
> >> It looks like that, although there are comments in the code about
> >> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
> >> write-combining and PAT, so that would require some serious testing with
> >> hardware I don't have. But I guess there is definitely room for
> >> improvement here. Ideally we'd like to be able to change the
> >> vma->vm_page_prot within fault(). But we can
> > Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
> > of the PAT tables, which are dead slow. If you have a struct
> > io_mapping then that can be done once, and then just blindly inserted.
> > See remap_io_mapping in i915.
> > -Daniel
>
> Thanks, Daniel.
>
> Indeed looks a lot like remap_pfn_range(), but usable at fault time?

Yeah, we call it from our fault handler. It's essentially vm_insert_pfn,
except the PAT tracking isn't there; it instead relies on the PAT
tracking that io_mapping has already done.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
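
A sketch of that io_mapping approach: reserve the write-combine
attribute for the whole region once at init time, so fault-time
insertion can skip the per-pte PAT lookup (remap_io_mapping() itself is
i915-internal; the helper name here is hypothetical):

#include <linux/io-mapping.h>

static struct io_mapping *my_init_iomap(resource_size_t base,
					unsigned long size)
{
	/* One PAT/WC reservation for the whole BAR, instead of one
	 * lookup per inserted pte. */
	return io_mapping_create_wc(base, size);
}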

2019-09-04 12:24:40

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On Wed, Sep 04, 2019 at 09:32:30AM +0200, Thomas Hellström (VMware) wrote:
> That sounds great. Is there anything I can do to help out? I thought this
> was more or less a dead end, since the current dma_mmap_ API requires the
> mmap_sem to be held in write mode (to modify vma->vm_flags), whereas
> fault() only offers read mode. But that would definitely work.

We'll just need to split it into a setup and a fault phase. I have some
sketches from a while ago; let me dust them off so that you can
try them.
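
(No such API exists yet; as a purely hypothetical shape of that split,
with every name below invented for illustration:)

/* All names here are hypothetical; Christoph's sketches were never
 * posted in this thread. */

/* mmap time, mmap_sem held for write: validate the vma and set up
 * vm_flags/vm_page_prot for the coherent allocation. */
int dma_mmap_setup(struct device *dev, struct vm_area_struct *vma,
		   void *cpu_addr, dma_addr_t dma_addr, size_t size);

/* fault time, mmap_sem held for read: insert the single page that
 * backs the faulting address. */
vm_fault_t dma_mmap_fault(struct device *dev, struct vm_fault *vmf,
			  void *cpu_addr, dma_addr_t dma_addr,
			  unsigned long pgoff);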

> "If it's the latter, then I would like to reiterate that it would be better
> that we work to come up with a long term plan to add what's missing to the
> DMA api to help graphics drivers use coherent memory?"

I don't think we need a long term plan. We've been adding features
on an as-needed basis. And now that we have significantly fewer
implementations of the API, this actually becomes much easier as well.

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On 9/4/19 1:10 PM, Koenig, Christian wrote:
> Am 04.09.19 um 10:19 schrieb Thomas Hellström (VMware):
>> Hi, Christian,
>>
>> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>> So the question here should really be, can we determine already at
>>>>>> mmap
>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>>
>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>> time rather than at first fault time.
>>>>> I'm not connecting the dots.
>>>>>
>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>> are created at mmap() or fault time.  If we establish a good
>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>> With SEV I think that we could possibly establish the encryption flags
>>>> at vma creation time. But thinking of it, it would actually break with
>>>> SME where buffer content can be moved between encrypted system memory
>>>> and unencrypted graphics card PCI memory behind user-space's back.
>>>> That would imply killing all user-space encrypted PTEs and at fault
>>>> time set up new ones pointing to unencrypted PCI memory..
>>> Well my problem is where do you see encrypted system memory here?
>>>
>>> At least for AMD GPUs all memory accessed must be unencrypted and that
>>> counts for both system as well as PCI memory.
>> We're talking SME now right?
>>
>> The current SME setup is that if a device's DMA mask says it's capable
>> of addressing the encryption bit, coherent memory will be encrypted.
>> The memory controllers will decrypt for the device on the fly.
>> Otherwise coherent memory will be decrypted.
>>
>>> So I don't get why we can't assume always unencrypted and keep it
>>> like that.
>> I see two reasons. First, it would break with a real device that
>> signals it's capable of addressing the encryption bit.
> Why? Because we don't use dma_mmap_coherent()?

Well, assuming always unencrypted would obviously break on a real device
with encrypted coherent memory?

dma_mmap_coherent() would work from the encryption point of view
(although I think it's currently buggy; I will send out an RFC with what
I believe is a fix for that).

>
> I've already discussed with Christoph that we probably want to switch TTM
> over to using that instead, to also get rid of the ttm_io_prot() hack.

OK, would that mean us ditching other memory modes completely? And
on-the-fly caching transitions? or is it just for the special case of
cached coherent memory? Do we need to cache the coherent kernel mappings
in TTM as well, for ttm_bo_kmap()?

/Thomas

>
> Regards,
> Christian.
>
>> Second I can imagine unaccelerated setups (something like vkms using
>> prime feeding a VNC connection) where we actually want the TTM buffers
>> encrypted to protect data.
>>
>> But at least the latter reason is way far out in the future.
>>
>> So for me I'm ok with that if that works for you?
>>
>> /Thomas
>>
>>
>>> Regards,
>>> Christian.
>>
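
(For context, the direction the withdrawn patch 3 took, as a sketch of
the ttm_io_prot() "hack" being discussed; the struct and its fields are
illustrative, while force_dma_unencrypted() and pgprot_decrypted() are
real:)

/* Illustrative sketch; my_ttm_tt and its fields are made up. */
static pgprot_t my_ttm_io_prot(struct my_ttm_tt *ttm, pgprot_t prot)
{
	/* Pages from the coherent/DMA pool that the platform forces
	 * unencrypted must not keep the encryption bit in their
	 * user- and kernel-space protections. */
	if (ttm->use_dma_pool && force_dma_unencrypted(ttm->dev))
		prot = pgprot_decrypted(prot);

	return prot;
}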

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On 9/4/19 2:35 PM, Thomas Hellström (VMware) wrote:
>
>>
>> I've already discussed with Christoph that we probably want to switch TTM
>> over to using that instead, to also get rid of the ttm_io_prot() hack.
>
> OK, would that mean us ditching other memory modes completely? And
> on-the-fly caching transitions? or is it just for the special case of
> cached coherent memory? Do we need to cache the coherent kernel
> mappings in TTM as well, for ttm_bo_kmap()?

Reading this again, I wanted to point out that I'm not against this.
Just curious.

/Thomas


Subject: Re: [PATCH v2 1/4] x86/mm: Export force_dma_unencrypted

On 9/4/19 2:22 PM, Christoph Hellwig wrote:
> On Wed, Sep 04, 2019 at 09:32:30AM +0200, Thomas Hellström (VMware) wrote:
>> That sounds great. Is there anything I can do to help out? I thought this
>> was more or less a dead end, since the current dma_mmap_ API requires the
>> mmap_sem to be held in write mode (to modify vma->vm_flags), whereas
>> fault() only offers read mode. But that would definitely work.
> We'll just need to split it into a setup and a fault phase. I have some
> sketches from a while ago; let me dust them off so that you can
> try them.

I'd be happy to.

Thanks,

Thomas


2019-09-04 18:19:39

by Christoph Hellwig

Subject: Re: [PATCH v2 3/4] drm/ttm, drm/vmwgfx: Correctly support support AMD memory encryption

On Wed, Sep 04, 2019 at 08:49:03AM +0200, Thomas Hellström (VMware) wrote:
> For device DMA address purposes, the encryption status is encoded in the dma
> address by the dma layer in phys_to_dma().
>
>
> > There doesn’t seem to be any real funny business in dma_mmap_attrs() or dma_common_mmap().
>
> No, from what I can tell the call in these functions to dma_pgprot()
> generates an incorrect page protection since it doesn't take unencrypted
> coherent memory into account. I don't think anybody has used these functions
> yet with SEV.

Yes, I think dma_pgprot is not correct for SEV. Right now that function
isn't used much on x86; it had more serious bugs up to a few -rcs ago.

> > Would it make sense to add a vmf_insert_dma_page() to directly do exactly what you’re trying to do?
>
> Yes, but as a longer term solution I would prefer a general dma_pgprot()
> exported, so that we could, in a dma-compliant way, use coherent pages with
> other apis, like kmap_atomic_prot() and vmap(). That is, basically split
> coherent page allocation in two steps: Allocation and mapping.

The thing is that dma_pgprot is of no help for you at all, as the DMA
API hides the page from you entirely. In fact we do have backends that
do not even have a page backing. But I think we can have a
vmf_insert_page equivalent that does the right thing behind your back
for the various different implementations (contiguous page(s) in the
kernel linear mapping, contiguous page(s) with a vmap/ioremap remapping
in various flavours, non-contiguous page(s) with a vmap remapping, and
deeply magic firmware-populated pools (well, except maybe for the last,
but at least we can fail gracefully there)).
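
(The dma_pgprot() fix that Thomas's upcoming RFC hints at would
presumably look something like this sketch; not the merged code:)

/* Sketch of a possible dma_pgprot() fix, not the merged code. */
pgprot_t dma_pgprot(struct device *dev, pgprot_t prot, unsigned long attrs)
{
	/* Take unencrypted coherent memory into account before
	 * applying the caching attributes. */
	if (force_dma_unencrypted(dev))
		prot = pgprot_decrypted(prot);
	if (attrs & DMA_ATTR_WRITE_COMBINE)
		return pgprot_writecombine(prot);
	return prot;
}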

Subject: Re: [PATCH v2 0/4] Have TTM support SEV encryption with coherent memory

On 9/3/19 3:15 PM, Thomas Hellström (VMware) wrote:
> With SEV memory encryption and in some cases also with SME memory
> encryption, coherent memory is unencrypted. In those cases, TTM doesn't
> set up the correct page protection. Fix this by having the TTM
> coherent page allocator call into the platform code to determine whether
> coherent memory is encrypted or not, and modify the page protection if
> it is not.
>
> v2:
> - Use force_dma_unencrypted() rather than sev_active() to catch also the
> special SME encryption cases.

So, this patchset is obviously withdrawn, since

a) We shouldn't have TTM shortcut the DMA API in this way.
b) To reviewers it was pretty unclear why this was needed in the first
   place, and it became even more unclear in the context of the TTM
   fault handler.

I've just sent out an RFC patchset that basically does the same, but in
the context of dma_mmap_coherent(). I hope this clears things up, and we
should hopefully be able to use a new DMA API function from within the
TTM fault handler.

Thanks,

Thomas