Subject: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

set_memory_*crypted() functions are used to modify the "shared" page
attribute of the given memory. Using these APIs will modify the page
attributes of the aliased mappings (which also includes the direct
mapping).

But such aliased mappings modification is not desirable in use cases
like TDX guest, where the requirement is to create the shared mapping
without touching the direct map. It is used when allocating VMM shared
buffers using alloc_pages()/vmap()/set_memory_*crypted() API
combinations.

So to support such use cases, add support for noalias variants of
set_memory_*crypted() functions.
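
For illustration, the intended usage pattern might look like the sketch
below (assumptions: the noalias variant added by this patch; the vmap()
flags, GFP flags, and error paths are simplified and not from this series):

```c
/*
 * Sketch only, not part of this patch: back a buffer with alloc_page(),
 * create a separate vmap() alias for it, and flip only that alias to
 * shared, leaving the direct map untouched.
 */
static void *alloc_vmm_shared_page(struct page **pagep)
{
	struct page *page = alloc_page(GFP_KERNEL);
	void *vaddr;

	if (!page)
		return NULL;

	vaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
	if (!vaddr)
		goto free_page;

	/* Modify only this mapping; the direct map PTE stays private. */
	if (set_memory_decrypted_noalias((unsigned long)vaddr, 1))
		goto unmap;

	*pagep = page;
	return vaddr;

unmap:
	vunmap(vaddr);
free_page:
	__free_page(page);
	return NULL;
}
```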

Acked-by: Wander Lairson Costa <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/include/asm/set_memory.h | 2 ++
arch/x86/mm/pat/set_memory.c | 26 ++++++++++++++++++++------
2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index b45c4d27fd46..a79d59692494 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -46,7 +46,9 @@ int set_memory_wb(unsigned long addr, int numpages);
int set_memory_np(unsigned long addr, int numpages);
int set_memory_4k(unsigned long addr, int numpages);
int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_encrypted_noalias(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
+int set_memory_decrypted_noalias(unsigned long addr, int numpages);
int set_memory_np_noalias(unsigned long addr, int numpages);
int set_memory_nonglobal(unsigned long addr, int numpages);
int set_memory_global(unsigned long addr, int numpages);
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 1abd5438f126..ad6c8ece737d 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2021,7 +2021,8 @@ int set_memory_global(unsigned long addr, int numpages)
* __set_memory_enc_pgtable() is used for the hypervisors that get
* informed about "encryption" status via page tables.
*/
-static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
+static int __set_memory_enc_pgtable(unsigned long addr, int numpages,
+ bool enc, int checkalias)
{
pgprot_t empty = __pgprot(0);
struct cpa_data cpa;
@@ -2049,7 +2050,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
/* Notify hypervisor that we are about to set/clr encryption attribute. */
x86_platform.guest.enc_status_change_prepare(addr, numpages, enc);

- ret = __change_page_attr_set_clr(&cpa, 1);
+ ret = __change_page_attr_set_clr(&cpa, checkalias);

/*
* After changing the encryption attribute, we need to flush TLBs again
@@ -2069,29 +2070,42 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
return ret;
}

-static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
+static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc,
+ int checkalias)
{
if (hv_is_isolation_supported())
return hv_set_mem_host_visibility(addr, numpages, !enc);

if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
- return __set_memory_enc_pgtable(addr, numpages, enc);
+ return __set_memory_enc_pgtable(addr, numpages, enc, checkalias);

return 0;
}

int set_memory_encrypted(unsigned long addr, int numpages)
{
- return __set_memory_enc_dec(addr, numpages, true);
+ return __set_memory_enc_dec(addr, numpages, true, 1);
}
EXPORT_SYMBOL_GPL(set_memory_encrypted);

int set_memory_decrypted(unsigned long addr, int numpages)
{
- return __set_memory_enc_dec(addr, numpages, false);
+ return __set_memory_enc_dec(addr, numpages, false, 1);
}
EXPORT_SYMBOL_GPL(set_memory_decrypted);

+int set_memory_encrypted_noalias(unsigned long addr, int numpages)
+{
+ return __set_memory_enc_dec(addr, numpages, true, 0);
+}
+EXPORT_SYMBOL_GPL(set_memory_encrypted_noalias);
+
+int set_memory_decrypted_noalias(unsigned long addr, int numpages)
+{
+ return __set_memory_enc_dec(addr, numpages, false, 0);
+}
+EXPORT_SYMBOL_GPL(set_memory_decrypted_noalias);
+
int set_pages_uc(struct page *page, int numpages)
{
unsigned long addr = (unsigned long)page_address(page);
--
2.25.1


2022-06-24 13:56:23

by Dave Hansen

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On 6/8/22 19:52, Kuppuswamy Sathyanarayanan wrote:
> set_memory_*crypted() functions are used to modify the "shared" page
> attribute of the given memory. Using these APIs will modify the page
> attributes of the aliased mappings (which also includes the direct
> mapping).
>
> But such aliased mappings modification is not desirable in use cases
> like TDX guest, where the requirement is to create the shared mapping
> without touching the direct map. It is used when allocating VMM shared
> buffers using alloc_pages()/vmap()/set_memory_*crypted() API
> combinations.
>
> So to support such use cases, add support for noalias variants of
> set_memory_*crypted() functions.

This changelog has a lot of words, but doesn't tell us much.

It basically says, we don't want to touch the direct map in use cases
where we don't want to touch the direct map. Not helpful.

The alias processing is there for a reason. What is it? Why don't you
need alias processing for TDX? Sure, you decided you don't want to
touch the direct map, but *WHY*? Why is it normally OK to touch the
direct map, but not OK in this case? Even better, why is it *DESIRABLE*
to leave the direct map alone? Lastly, why is this safe? If alias
processing was to protect us from something, why is losing that
protection OK here?

2022-06-27 15:20:58

by Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On Fri, Jun 24, 2022 at 06:19:23AM -0700, Dave Hansen wrote:
> On 6/8/22 19:52, Kuppuswamy Sathyanarayanan wrote:
> > set_memory_*crypted() functions are used to modify the "shared" page
> > attribute of the given memory. Using these APIs will modify the page
> > attributes of the aliased mappings (which also includes the direct
> > mapping).
> >
> > But such aliased mappings modification is not desirable in use cases
> > like TDX guest, where the requirement is to create the shared mapping
> > without touching the direct map. It is used when allocating VMM shared
> > buffers using alloc_pages()/vmap()/set_memory_*crypted() API
> > combinations.
> >
> > So to support such use cases, add support for noalias variants of
> > set_memory_*crypted() functions.
>
> This changelog has a lot of words, but doesn't tell us much.
>
> It basically says, we don't want to touch the direct map in use cases
> where we don't want to touch the direct map. Not helpful.
>
> The alias processing is there for a reason. What is it? Why don't you
> need alias processing for TDX? Sure, you decided you don't want to
> touch the direct map, but *WHY*? Why is it normally OK to touch the
> direct map, but not OK in this case? Even better, why is it *DESIRABLE*
> to leave the direct map alone? Lastly, why is this safe? If alias
> processing was to protect us from something, why is losing that
> protection OK here?

The whole idea of the alloc_pages()/vmap()/set_memory_decrypted() exercise is
to avoid direct map fragmentation.

Alias processing modifies the direct mapping and the highmap in addition to
the mapping the user requested the modification for. Alias processing is
required to keep the different mappings in sync, but here we want to avoid it
because it leads to direct map fragmentation.

Normally, direct mapping modifications are done for long-lasting allocations,
where fragmentation is a lesser concern. Here we have transient allocations
which may repeat over and over, trashing the direct mapping.

Speaking of safety, I initially wanted to claim that it is safe, as only the
owner of the allocation will touch the memory, and it does so only via the
vmap mapping.

*BUT*

It made me think about my recent story with load_unaligned_zeropad().
If we leave the page in the direct mapping mapped as private and
load_unaligned_zeropad() rolls off onto it, we will get a SEPT violation
that will terminate the TD, as the page is considered unaccepted.

I think we must keep the aliases in sync. And vmap() doesn't make much
sense in this case :/

I urge you folks to consider the DMA API again. Or find some other way to
tap into the swiotlb pool.
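
To make the hazard concrete, here is a conceptual sketch (the real
load_unaligned_zeropad() is implemented with an exception-table fixup;
only the access pattern matters here):

```c
/*
 * Word-at-a-time code may read the trailing bytes of an object with a
 * single unaligned 8-byte load. If the object ends 3 bytes before a
 * page boundary, the load also touches 5 bytes of the *next* page:
 *
 *   | page N (what the caller owns) | page N+1 (private, unaccepted) |
 *                          [-- 8-byte load spans the boundary --]
 *
 * Normally the fault fixup zero-pads the inaccessible part. In a TD,
 * touching a private, unaccepted page through the direct map is a SEPT
 * violation the guest cannot fix up, so the TD is terminated.
 */
unsigned long v = load_unaligned_zeropad(buf + len - 3); /* may cross */
```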

--
Kirill A. Shutemov

2022-06-27 18:36:27

by Dave Hansen

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On 6/27/22 08:12, Kirill A. Shutemov wrote:
> It made me think about my recent story with load_unaligned_zeropad().
> If we leave the page in the direct mapping mapped as private and
> load_unaligned_zeropad() rolls off onto it, we will get a SEPT violation
> that will terminate the TD, as the page is considered unaccepted.
>
> I think we must keep the aliases in sync. And vmap() doesn't make much
> sense in this case :/
>
> I urge you folks to consider the DMA API again. Or find some other way to
> tap into the swiotlb pool.

Ugh. This is a good point. We need *PHYSICAL* pages to pad the front
of any page we use for the quotes. That means some crazy code like:

struct page *pages[nr_pages];
struct page *pages_vmap[nr_pages];

for (i = 0; i < nr_pages; i++) {
	/* allocate an order-1 block to get an extra "padding" page: */
	pages[i] = alloc_pages(GFP_WHATEVER, 1);
	/* record the page that will be vmap()'d: */
	pages_vmap[i] = pages[i] + 1;
	set_memory_decrypted((unsigned long)page_to_virt(pages_vmap[i]), 1);
}

vmap(pages_vmap, nr_pages, VM_MAP, PAGE_KERNEL);

That's just adorable. The other way is to do alloc_pages_exact() with
*one* extra page and just use contiguous memory for it.

I still don't like the idea of using the DMA API itself. But, maybe we
need some common infrastructure that the DMA API and this code use which
says, "get me some pages that I can safely make shared".
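
The contiguous alternative might look roughly like this (a sketch with
names invented here, not code from the series; GFP flags and error
handling are simplified):

```c
/*
 * Sketch: allocate n + 1 physically contiguous pages, keep the first
 * page private as front padding, and share only the remaining n pages.
 */
static void *alloc_padded_shared(size_t n_pages)
{
	size_t sz = (n_pages + 1) * PAGE_SIZE;
	void *buf = alloc_pages_exact(sz, GFP_KERNEL | __GFP_ZERO);
	void *shared;

	if (!buf)
		return NULL;

	shared = buf + PAGE_SIZE;	/* first page stays private padding */
	if (set_memory_decrypted((unsigned long)shared, n_pages)) {
		free_pages_exact(buf, sz);
		return NULL;
	}

	return shared;
}
```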

2022-06-28 01:57:38

by Kai Huang

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions


>
> I still don't like the idea of using the DMA API itself. But, maybe we
> need some common infrastructure that the DMA API and this code use which
> says, "get me some pages that I can safely make shared".

Right. For instance any KVM PV feature would require shared memory, and DMA API
normally doesn't fit (taking 'struct kvm_steal_time' as example).

Maybe we can reserve a CMA for this purpose.

--
Thanks,
-Kai


2022-07-05 16:10:06

by Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On Tue, Jun 28, 2022 at 01:15:21PM +1200, Kai Huang wrote:
>
> >
> > I still don't like the idea of using the DMA API itself. But, maybe we
> > need some common infrastructure that the DMA API and this code use which
> > says, "get me some pages that I can safely make shared".
>
> Right. For instance any KVM PV feature would require shared memory, and DMA API
> normally doesn't fit (taking 'struct kvm_steal_time' as example).
>
> Maybe we can reserve a CMA for this purpose.

CMA for a couple of low-traffic users sounds like overkill. It would create
a separate pool just for them.

I think the best way is to add a dummy device and a couple of helpers
around the DMA API that would allow tapping into swiotlb.

Maybe hide it inside the CC infrastructure. Like cc_decrypted_alloc() and
cc_decrypted_free().

--
Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

Hi Dave,

On 7/5/22 8:29 AM, Kirill A. Shutemov wrote:
>>> I still don't like the idea of using the DMA API itself. But, maybe we
>>> need some common infrastructure that the DMA API and this code use which
>>> says, "get me some pages that I can safely make shared".
>> Right. For instance any KVM PV feature would require shared memory, and DMA API
>> normally doesn't fit (taking 'struct kvm_steal_time' as example).
>>
>> Maybe we can reserve a CMA for this purpose.
> CMA for a couple of low-traffic users sounds like overkill. It would
> create a separate pool just for them.
>
> I think the best way is to add a dummy device and a couple of helpers
> around the DMA API that would allow tapping into swiotlb.
>
> Maybe hide it inside the CC infrastructure. Like cc_decrypted_alloc() and
> cc_decrypted_free().

I also think creating a generic device in the CC layer and using it to
allocate memory via the DMA API is a better approach for this issue. Below
is a sample implementation to give you an idea of how it looks. Please let
me know your comments.

cc_shmem_alloc/free() APIs can be used by CC users to allocate and
free shared memory.

Other vendors can define their own shared memory allocation and free
logic via struct shmem_priv alloc/free/init hooks.

--- a/arch/x86/coco/Makefile
+++ b/arch/x86/coco/Makefile
@@ -3,6 +3,6 @@ CFLAGS_REMOVE_core.o = -pg
KASAN_SANITIZE_core.o := n
CFLAGS_core.o += -fno-stack-protector

-obj-y += core.o
+obj-y += core.o shmem.o

obj-$(CONFIG_INTEL_TDX_GUEST) += tdx/
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 49b44f881484..62fe68d1f60a 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -13,7 +13,7 @@
#include <asm/coco.h>
#include <asm/processor.h>

-static enum cc_vendor vendor __ro_after_init;
+enum cc_vendor vendor __ro_after_init;
static u64 cc_mask __ro_after_init;

static bool intel_cc_platform_has(enum cc_attr attr)
@@ -23,6 +23,7 @@ static bool intel_cc_platform_has(enum cc_attr attr)
case CC_ATTR_HOTPLUG_DISABLED:
case CC_ATTR_GUEST_MEM_ENCRYPT:
case CC_ATTR_MEM_ENCRYPT:
+ case CC_ATTR_SHMEM:
return true;
default:
return false;
@@ -134,6 +135,11 @@ __init void cc_set_vendor(enum cc_vendor v)
vendor = v;
}

+enum cc_vendor cc_get_vendor(void)
+{
+ return vendor;
+}
+
__init void cc_set_mask(u64 mask)
{
cc_mask = mask;

+++ b/arch/x86/coco/shmem.c
@@ -0,0 +1,156 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Confidential Computing Shared Memory Allocator
+ *
+ * Copyright (C) 2022 Intel Corporation, Inc.
+ *
+ */
+
+#undef pr_fmt
+#define pr_fmt(fmt) "CC: " fmt
+
+#include <linux/export.h>
+#include <linux/cma.h>
+#include <linux/mm.h>
+#include <linux/cc_platform.h>
+#include <linux/set_memory.h>
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+
+#include <asm/coco.h>
+#include <asm/processor.h>
+
+#define CC_SHMEM_DRIVER "cc-shmem"
+
+struct platform_device *shmem_pdev;
+
+struct shmem_priv
+{
+ int (*init)(struct platform_device *dev);
+ void* (*alloc)(size_t size, gfp_t gfp, struct shmem_priv *priv);
+ void (*free)(void *addr, size_t size, struct shmem_priv *priv);
+ struct platform_device *pdev;
+ void *data;
+};
+
+void *cc_shmem_alloc(size_t size, gfp_t gfp)
+{
+ struct shmem_priv *priv;
+
+ if (!shmem_pdev)
+ return NULL;
+
+ priv = platform_get_drvdata(shmem_pdev);
+
+ return priv->alloc(size, gfp, priv);
+}
+
+void cc_shmem_free(void *addr, size_t size)
+{
+ struct shmem_priv *priv;
+
+ if (!shmem_pdev)
+ return;
+
+ priv = platform_get_drvdata(shmem_pdev);
+
+ priv->free(addr, size, priv);
+}
+
+static int intel_shmem_init(struct platform_device *pdev)
+{
+ struct shmem_priv *priv;
+ dma_addr_t *handle;
+
+ handle = devm_kmalloc(&pdev->dev, sizeof(*handle), GFP_KERNEL);
+ if (!handle)
+ return -ENOMEM;
+
+ priv = platform_get_drvdata(pdev);
+
+ priv->data = handle;
+
+ return dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
+}
+
+static void *intel_shmem_alloc(size_t size, gfp_t gfp, struct shmem_priv *priv)
+{
+ dma_addr_t *handle = (dma_addr_t *) priv->data;
+
+ return dma_alloc_coherent(&priv->pdev->dev, size, handle, gfp);
+}
+
+static void intel_shmem_free(void *addr, size_t size, struct shmem_priv *priv)
+{
+ dma_addr_t *handle = (dma_addr_t *) priv->data;
+
+ return dma_free_coherent(&priv->pdev->dev, size, addr, *handle);
+}
+
+static struct shmem_priv intel_shmem = {
+ .init = intel_shmem_init,
+ .alloc = intel_shmem_alloc,
+ .free = intel_shmem_free,
+};
+
+static int cc_set_drv_data(struct platform_device *pdev)
+{
+ enum cc_vendor vendor = cc_get_vendor();
+
+ switch (vendor) {
+ case CC_VENDOR_INTEL:
+ platform_set_drvdata(pdev, &intel_shmem);
+ return 0;
+ default:
+ break;
+ }
+
+ return -ENODEV;
+}
+
+static int cc_shmem_probe(struct platform_device *pdev)
+{
+ struct shmem_priv *priv;
+ void *data;
+
+ if (cc_set_drv_data(pdev))
+ return -ENODEV;
+
+ priv = platform_get_drvdata(pdev);
+
+ shmem_pdev = priv->pdev = pdev;
+
+ if (priv->init(pdev))
+ return -EIO;
+
+ return 0;
+}
+
+static struct platform_driver cc_shmem_driver = {
+ .probe = cc_shmem_probe,
+ .driver = {
+ .name = CC_SHMEM_DRIVER,
+ },
+};
+
+static int __init cc_shmem_init(void)
+{
+ struct platform_device *pdev;
+ int ret;
+
+ if (!cc_platform_has(CC_ATTR_SHMEM))
+ return -ENODEV;
+
+ ret = platform_driver_register(&cc_shmem_driver);
+ if (ret)
+ return ret;
+
+ pdev = platform_device_register_simple(CC_SHMEM_DRIVER, -1, NULL, 0);
+ if (IS_ERR(pdev)) {
+ platform_driver_unregister(&cc_shmem_driver);
+ return PTR_ERR(pdev);
+ }
+
+ return 0;
+}
+device_initcall(cc_shmem_init);
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 3d98c3a60d34..5c259135c8fc 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -12,11 +12,14 @@ enum cc_vendor {
};

void cc_set_vendor(enum cc_vendor v);
+enum cc_vendor cc_get_vendor(void);
void cc_set_mask(u64 mask);

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
u64 cc_mkenc(u64 val);
u64 cc_mkdec(u64 val);
+void *cc_shmem_alloc(size_t size, gfp_t gfp);
+void cc_shmem_free(void *addr, size_t size);
#else
static inline u64 cc_mkenc(u64 val)
{
@@ -27,6 +30,10 @@ static inline u64 cc_mkdec(u64 val)
{
return val;
}
+
+static inline void *cc_shmem_alloc(size_t size, gfp_t gfp) { return NULL; }
+static inline void cc_shmem_free(void *addr, size_t size) { }
+
#endif

#endif /* _ASM_X86_COCO_H */
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index cb0d6cd1c12f..50e0db22056f 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -90,6 +90,17 @@ enum cc_attr {
* Examples include TDX Guest.
*/
CC_ATTR_HOTPLUG_DISABLED,
+
+ /**
+ * @CC_ATTR_SHMEM: CC shared memory is supported.
+ *
+ * The platform/OS, running as a guest/virtual machine,
+ * can allocate shared memory using the cc_shmem_{alloc,free}()
+ * APIs.
+ *
+ */
+ CC_ATTR_SHMEM,
+
};

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM


--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2022-07-19 16:20:39

by Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On Mon, Jul 18, 2022 at 07:22:52AM -0700, Sathyanarayanan Kuppuswamy wrote:
> Hi Dave,
>
> On 7/5/22 8:29 AM, Kirill A. Shutemov wrote:
> >>> I still don't like the idea of using the DMA API itself. But, maybe we
> >>> need some common infrastructure that the DMA API and this code use which
> >>> says, "get me some pages that I can safely make shared".
> >> Right. For instance any KVM PV feature would require shared memory, and DMA API
> >> normally doesn't fit (taking 'struct kvm_steal_time' as example).
> >>
> >> Maybe we can reserve a CMA for this purpose.
> > CMA for a couple of low-traffic users sounds like overkill. It would
> > create a separate pool just for them.
> >
> > I think the best way is to add a dummy device and a couple of helpers
> > around the DMA API that would allow tapping into swiotlb.
> >
> > Maybe hide it inside the CC infrastructure. Like cc_decrypted_alloc() and
> > cc_decrypted_free().
>
> I also think creating a generic device in the CC layer, and using it to allocate
> memory via DMA APIs is a better approach for this issue. Following is the sample
> implementation to give you an idea on how it looks. Please let me know
> your comments.
>
> cc_shmem_alloc/free() APIs can be used by CC users to allocate and
> free shared memory.

We usually use the 'decrypted' term in the kernel: cc_decrypted_alloc()/_free().

> Other vendors can define their own shared memory allocation and free
> logic via struct shmem_priv alloc/free/init hooks.
>
> --- a/arch/x86/coco/Makefile
> +++ b/arch/x86/coco/Makefile
> @@ -3,6 +3,6 @@ CFLAGS_REMOVE_core.o = -pg
> KASAN_SANITIZE_core.o := n
> CFLAGS_core.o += -fno-stack-protector
>
> -obj-y += core.o
> +obj-y += core.o shmem.o

Rename shmem.o -> mem.o ?

> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> index 49b44f881484..62fe68d1f60a 100644
> --- a/arch/x86/coco/core.c
> +++ b/arch/x86/coco/core.c
> @@ -13,7 +13,7 @@
> #include <asm/coco.h>
> #include <asm/processor.h>
>
> -static enum cc_vendor vendor __ro_after_init;
> +enum cc_vendor vendor __ro_after_init;
> static u64 cc_mask __ro_after_init;
>
> static bool intel_cc_platform_has(enum cc_attr attr)
> @@ -23,6 +23,7 @@ static bool intel_cc_platform_has(enum cc_attr attr)
> case CC_ATTR_HOTPLUG_DISABLED:
> case CC_ATTR_GUEST_MEM_ENCRYPT:
> case CC_ATTR_MEM_ENCRYPT:
> + case CC_ATTR_SHMEM:
> return true;
> default:
> return false;

I don't think we need a new attribute. CC_ATTR_MEM_ENCRYPT has to be
enough.

> @@ -134,6 +135,11 @@ __init void cc_set_vendor(enum cc_vendor v)
> vendor = v;
> }
>
> +enum cc_vendor cc_get_vendor(void)
> +{
> + return vendor;
> +}
> +
> __init void cc_set_mask(u64 mask)
> {
> cc_mask = mask;
>
> +++ b/arch/x86/coco/shmem.c
> @@ -0,0 +1,156 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Confidential Computing Shared Memory Allocator
> + *
> + * Copyright (C) 2022 Intel Corporation, Inc.
> + *
> + */
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt) "CC: " fmt
> +
> +#include <linux/export.h>
> +#include <linux/cma.h>
> +#include <linux/mm.h>
> +#include <linux/cc_platform.h>
> +#include <linux/set_memory.h>
> +#include <linux/platform_device.h>
> +#include <linux/dma-mapping.h>
> +
> +#include <asm/coco.h>
> +#include <asm/processor.h>
> +
> +#define CC_SHMEM_DRIVER "cc-shmem"
> +
> +struct platform_device *shmem_pdev;
> +
> +struct shmem_priv
> +{
> + int (*init)(struct platform_device *dev);
> + void* (*alloc)(size_t size, gfp_t gfp, struct shmem_priv *priv);
> + void (*free)(void *addr, size_t size, struct shmem_priv *priv);
> + struct platform_device *pdev;
> + void *data;
> +};
> +
> +void *cc_shmem_alloc(size_t size, gfp_t gfp)
> +{
> + struct shmem_priv *priv;
> +
> + if (!shmem_pdev)
> + return NULL;
> +
> + priv = platform_get_drvdata(shmem_pdev);
> +
> + return priv->alloc(size, gfp, priv);
> +}
> +
> +void cc_shmem_free(void *addr, size_t size)
> +{
> + struct shmem_priv *priv;
> +
> + if (!shmem_pdev)
> + return;
> +
> + priv = platform_get_drvdata(shmem_pdev);
> +
> + priv->free(addr, size, priv);
> +}
> +
> +static int intel_shmem_init(struct platform_device *pdev)
> +{
> + struct shmem_priv *priv;
> + dma_addr_t *handle;
> +
> + handle = devm_kmalloc(&pdev->dev, sizeof(*handle), GFP_KERNEL);
> + if (!handle)
> + return -ENOMEM;
> +
> + priv = platform_get_drvdata(pdev);
> +
> + priv->data = handle;
> +
> + return dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
> +}
> +
> +static void *intel_shmem_alloc(size_t size, gfp_t gfp, struct shmem_priv *priv)
> +{
> + dma_addr_t *handle = (dma_addr_t *) priv->data;
> +
> + return dma_alloc_coherent(&priv->pdev->dev, size, handle, gfp);
> +}
> +
> +static void intel_shmem_free(void *addr, size_t size, struct shmem_priv *priv)
> +{
> + dma_addr_t *handle = (dma_addr_t *) priv->data;
> +
> + return dma_free_coherent(&priv->pdev->dev, size, addr, *handle);
> +}
> +
> +static struct shmem_priv intel_shmem = {
> + .init = intel_shmem_init,
> + .alloc = intel_shmem_alloc,
> + .free = intel_shmem_free,
> +};

Hm. What is Intel-specific here. Looks like a generic thing, no?

Maybe just drop all vendor stuff. CC_ATTR_MEM_ENCRYPT should be enough.


--
Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions



On 7/19/22 9:13 AM, Kirill A. Shutemov wrote:
> On Mon, Jul 18, 2022 at 07:22:52AM -0700, Sathyanarayanan Kuppuswamy wrote:
>> Hi Dave,
>>
>> On 7/5/22 8:29 AM, Kirill A. Shutemov wrote:
>>>>> I still don't like the idea of using the DMA API itself. But, maybe we
>>>>> need some common infrastructure that the DMA API and this code use which
>>>>> says, "get me some pages that I can safely make shared".
>>>> Right. For instance any KVM PV feature would require shared memory, and DMA API
>>>> normally doesn't fit (taking 'struct kvm_steal_time' as example).
>>>>
>>>> Maybe we can reserve a CMA for this purpose.
>>> CMA for a couple of low-traffic users sounds like overkill. It would
>>> create a separate pool just for them.
>>>
>>> I think the best way is to add a dummy device and a couple of helpers
>>> around the DMA API that would allow tapping into swiotlb.
>>>
>>> Maybe hide it inside the CC infrastructure. Like cc_decrypted_alloc() and
>>> cc_decrypted_free().
>>
>> I also think creating a generic device in the CC layer, and using it to allocate
>> memory via DMA APIs is a better approach for this issue. Following is the sample
>> implementation to give you an idea on how it looks. Please let me know
>> your comments.
>>
>> cc_shmem_alloc/free() APIs can be used by CC users to allocate and
>> free shared memory.
>
> We usually use 'decrypted' term in kernel. cc_decrypted_alloc()/_free().

Fine with me. I will use it.

>
>> Other vendors can define their own shared memory allocation and free
>> logic via struct shmem_priv alloc/free/init hooks.
>>
>> --- a/arch/x86/coco/Makefile
>> +++ b/arch/x86/coco/Makefile
>> @@ -3,6 +3,6 @@ CFLAGS_REMOVE_core.o = -pg
>> KASAN_SANITIZE_core.o := n
>> CFLAGS_core.o += -fno-stack-protector
>>
>> -obj-y += core.o
>> +obj-y += core.o shmem.o
>
> Rename shmem.o -> mem.o ?

Ok.

>
>> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
>> index 49b44f881484..62fe68d1f60a 100644
>> --- a/arch/x86/coco/core.c
>> +++ b/arch/x86/coco/core.c
>> @@ -13,7 +13,7 @@
>> #include <asm/coco.h>
>> #include <asm/processor.h>
>>
>> -static enum cc_vendor vendor __ro_after_init;
>> +enum cc_vendor vendor __ro_after_init;
>> static u64 cc_mask __ro_after_init;
>>
>> static bool intel_cc_platform_has(enum cc_attr attr)
>> @@ -23,6 +23,7 @@ static bool intel_cc_platform_has(enum cc_attr attr)
>> case CC_ATTR_HOTPLUG_DISABLED:
>> case CC_ATTR_GUEST_MEM_ENCRYPT:
>> case CC_ATTR_MEM_ENCRYPT:
>> + case CC_ATTR_SHMEM:
>> return true;
>> default:
>> return false;
>
> I don't think we need a new attribute. CC_ATTR_MEM_ENCRYPT has to be
> enough.

Ok. Make sense.

>
>> @@ -134,6 +135,11 @@ __init void cc_set_vendor(enum cc_vendor v)
>> vendor = v;
>> }
>>
>> +enum cc_vendor cc_get_vendor(void)
>> +{
>> + return vendor;
>> +}
>> +
>> __init void cc_set_mask(u64 mask)
>> {
>> cc_mask = mask;
>>
>> +++ b/arch/x86/coco/shmem.c
>> @@ -0,0 +1,156 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Confidential Computing Shared Memory Allocator
>> + *
>> + * Copyright (C) 2022 Intel Corporation, Inc.
>> + *
>> + */
>> +
>> +#undef pr_fmt
>> +#define pr_fmt(fmt) "CC: " fmt
>> +
>> +#include <linux/export.h>
>> +#include <linux/cma.h>
>> +#include <linux/mm.h>
>> +#include <linux/cc_platform.h>
>> +#include <linux/set_memory.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/dma-mapping.h>
>> +
>> +#include <asm/coco.h>
>> +#include <asm/processor.h>
>> +
>> +#define CC_SHMEM_DRIVER "cc-shmem"
>> +
>> +struct platform_device *shmem_pdev;
>> +
>> +struct shmem_priv
>> +{
>> + int (*init)(struct platform_device *dev);
>> + void* (*alloc)(size_t size, gfp_t gfp, struct shmem_priv *priv);
>> + void (*free)(void *addr, size_t size, struct shmem_priv *priv);
>> + struct platform_device *pdev;
>> + void *data;
>> +};
>> +
>> +void *cc_shmem_alloc(size_t size, gfp_t gfp)
>> +{
>> + struct shmem_priv *priv;
>> +
>> + if (!shmem_pdev)
>> + return NULL;
>> +
>> + priv = platform_get_drvdata(shmem_pdev);
>> +
>> + return priv->alloc(size, gfp, priv);
>> +}
>> +
>> +void cc_shmem_free(void *addr, size_t size)
>> +{
>> + struct shmem_priv *priv;
>> +
>> + if (!shmem_pdev)
>> + return;
>> +
>> + priv = platform_get_drvdata(shmem_pdev);
>> +
>> + priv->free(addr, size, priv);
>> +}
>> +
>> +static int intel_shmem_init(struct platform_device *pdev)
>> +{
>> + struct shmem_priv *priv;
>> + dma_addr_t *handle;
>> +
>> + handle = devm_kmalloc(&pdev->dev, sizeof(*handle), GFP_KERNEL);
>> + if (!handle)
>> + return -ENOMEM;
>> +
>> + priv = platform_get_drvdata(pdev);
>> +
>> + priv->data = handle;
>> +
>> + return dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
>> +}
>> +
>> +static void *intel_shmem_alloc(size_t size, gfp_t gfp, struct shmem_priv *priv)
>> +{
>> + dma_addr_t *handle = (dma_addr_t *) priv->data;
>> +
>> + return dma_alloc_coherent(&priv->pdev->dev, size, handle, gfp);
>> +}
>> +
>> +static void intel_shmem_free(void *addr, size_t size, struct shmem_priv *priv)
>> +{
>> + dma_addr_t *handle = (dma_addr_t *) priv->data;
>> +
>> + return dma_free_coherent(&priv->pdev->dev, size, addr, *handle);
>> +}
>> +
>> +static struct shmem_priv intel_shmem = {
>> + .init = intel_shmem_init,
>> + .alloc = intel_shmem_alloc,
>> + .free = intel_shmem_free,
>> +};
>
> Hm. What is Intel-specific here. Looks like a generic thing, no?
>
> Maybe just drop all vendor stuff. CC_ATTR_MEM_ENCRYPT should be enough.

I thought that not all CC vendors would want to use DMA APIs for shared
buffer allocation. So adding a vendor layer would give them a way to implement
their own model.

>
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2022-07-19 22:24:03

by Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On Tue, Jul 19, 2022 at 10:10:20AM -0700, Sathyanarayanan Kuppuswamy wrote:
> >> +static struct shmem_priv intel_shmem = {
> >> + .init = intel_shmem_init,
> >> + .alloc = intel_shmem_alloc,
> >> + .free = intel_shmem_free,
> >> +};
> >
> > Hm. What is Intel-specific here. Looks like a generic thing, no?
> >
> > Maybe just drop all vendor stuff. CC_ATTR_MEM_ENCRYPT should be enough.
>
> I thought that not all CC vendors would want to use DMA APIs for shared
> buffer allocation. So adding a vendor layer would give them a way to implement
> their own model.

set_memory_decrypted() is gated by CC_ATTR_MEM_ENCRYPT and it is the only
requirement for functionality AFAICS.

--
Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

Hi Kirill,

On 7/19/22 2:55 PM, Kirill A. Shutemov wrote:
> On Tue, Jul 19, 2022 at 10:10:20AM -0700, Sathyanarayanan Kuppuswamy wrote:
>>>> +static struct shmem_priv intel_shmem = {
>>>> + .init = intel_shmem_init,
>>>> + .alloc = intel_shmem_alloc,
>>>> + .free = intel_shmem_free,
>>>> +};
>>>
>>> Hm. What is Intel-specific here. Looks like a generic thing, no?
>>>
>>> Maybe just drop all vendor stuff. CC_ATTR_MEM_ENCRYPT should be enough.
>>
>> I thought that not all CC vendors would want to use DMA APIs for shared
>> buffer allocation. So adding a vendor layer would give them a way to implement
>> their own model.
>
> set_memory_decrypted() is gated by CC_ATTR_MEM_ENCRYPT and it is the only
> requirement for functionality AFAICS.

Makes sense. I think CC_ATTR_GUEST_MEM_ENCRYPT is the better check here. It
force-enables SWIOTLB usage.

diff --git a/arch/x86/coco/Makefile b/arch/x86/coco/Makefile
index c816acf78b6a..96fc4ec4497f 100644
--- a/arch/x86/coco/Makefile
+++ b/arch/x86/coco/Makefile
@@ -3,6 +3,6 @@ CFLAGS_REMOVE_core.o = -pg
KASAN_SANITIZE_core.o := n
CFLAGS_core.o += -fno-stack-protector

-obj-y += core.o
+obj-y += core.o mem.o

obj-$(CONFIG_INTEL_TDX_GUEST) += tdx/
diff --git a/arch/x86/coco/mem.c b/arch/x86/coco/mem.c
new file mode 100644
index 000000000000..ef76a8accc1e
--- /dev/null
+++ b/arch/x86/coco/mem.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Confidential Computing Decrypted Memory Allocator
+ *
+ * Copyright (C) 2022 Intel Corporation, Inc.
+ *
+ */
+
+#undef pr_fmt
+#define pr_fmt(fmt) "cc/mem: " fmt
+
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <linux/cc_platform.h>
+#include <linux/set_memory.h>
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+
+#include <asm/coco.h>
+#include <asm/processor.h>
+
+#define CC_MEM_DRIVER "ccmem"
+
+struct platform_device *mem_pdev;
+dma_addr_t handle;
+
+/* Allocate decrypted memory of given size */
+void *cc_decrypted_alloc(size_t size, gfp_t gfp)
+{
+	if (!mem_pdev)
+		return NULL;
+
+	return dma_alloc_coherent(&mem_pdev->dev, size, &handle, gfp);
+}
+
+/* Free given decrypted memory */
+void cc_decrypted_free(void *addr, size_t size)
+{
+	if (!mem_pdev)
+		return;
+
+	dma_free_coherent(&mem_pdev->dev, size, addr, handle);
+}
+
+static int cc_mem_probe(struct platform_device *pdev)
+{
+	if (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64)))
+		return -EIO;
+
+	mem_pdev = pdev;
+
+	return 0;
+}
+
+static struct platform_driver cc_mem_driver = {
+	.probe = cc_mem_probe,
+	.driver = {
+		.name = CC_MEM_DRIVER,
+	},
+};
+
+static int __init cc_mem_init(void)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		return -ENODEV;
+
+	ret = platform_driver_register(&cc_mem_driver);
+	if (ret)
+		return ret;
+
+	pdev = platform_device_register_simple(CC_MEM_DRIVER, -1, NULL, 0);
+	if (IS_ERR(pdev)) {
+		platform_driver_unregister(&cc_mem_driver);
+		return PTR_ERR(pdev);
+	}
+
+	return 0;
+}
+device_initcall(cc_mem_init);
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 3d98c3a60d34..9d616d9e3405 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -17,6 +17,8 @@ void cc_set_mask(u64 mask);
#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
u64 cc_mkenc(u64 val);
u64 cc_mkdec(u64 val);
+void *cc_decrypted_alloc(size_t size, gfp_t gfp);
+void cc_decrypted_free(void *addr, size_t size);
#else
static inline u64 cc_mkenc(u64 val)
{
@@ -27,6 +29,10 @@ static inline u64 cc_mkdec(u64 val)
{
return val;
}
+
+static inline void *cc_decrypted_alloc(size_t size, gfp_t gfp) { return NULL; }
+static inline void cc_decrypted_free(void *addr, size_t size) { }
+
#endif

#endif /* _ASM_X86_COCO_H */




--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

From: Kirill A. Shutemov
Date: 2022-07-20 16:23:51
Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions

On Wed, Jul 20, 2022 at 07:56:04AM -0700, Sathyanarayanan Kuppuswamy wrote:
> diff --git a/arch/x86/coco/mem.c b/arch/x86/coco/mem.c
> new file mode 100644
> index 000000000000..ef76a8accc1e
> --- /dev/null
> +++ b/arch/x86/coco/mem.c
> @@ -0,0 +1,82 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Confidential Computing Decrypted Memory Allocator
> + *
> + * Copyright (C) 2022 Intel Corporation, Inc.
> + *
> + */
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt) "cc/mem: " fmt
> +
> +#include <linux/export.h>
> +#include <linux/mm.h>
> +#include <linux/cc_platform.h>
> +#include <linux/set_memory.h>
> +#include <linux/platform_device.h>
> +#include <linux/dma-mapping.h>
> +
> +#include <asm/coco.h>
> +#include <asm/processor.h>
> +
> +#define CC_MEM_DRIVER "ccmem"
> +
> +struct platform_device *mem_pdev;

'static'?

> +dma_addr_t handle;

Hm. How does it work with >1 allocation at a time?

> +
> +/* Allocate decrypted memory of given size */
> +void *cc_decrypted_alloc(size_t size, gfp_t gfp)
> +{
> +	if (!mem_pdev)
> +		return NULL;
> +
> +	return dma_alloc_coherent(&mem_pdev->dev, size, &handle, gfp);
> +}
> +
> +/* Free given decrypted memory */
> +void cc_decrypted_free(void *addr, size_t size)
> +{
> +	if (!mem_pdev)
> +		return;
> +
> +	dma_free_coherent(&mem_pdev->dev, size, addr, handle);
> +}
> +

--
Kirill A. Shutemov

Subject: Re: [PATCH v8 4/5] x86/mm: Add noalias variants of set_memory_*crypted() functions



On 7/20/22 9:17 AM, Kirill A. Shutemov wrote:
> On Wed, Jul 20, 2022 at 07:56:04AM -0700, Sathyanarayanan Kuppuswamy wrote:
>> diff --git a/arch/x86/coco/mem.c b/arch/x86/coco/mem.c
>> new file mode 100644
>> index 000000000000..ef76a8accc1e
>> --- /dev/null
>> +++ b/arch/x86/coco/mem.c
>> @@ -0,0 +1,82 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Confidential Computing Decrypted Memory Allocator
>> + *
>> + * Copyright (C) 2022 Intel Corporation, Inc.
>> + *
>> + */
>> +
>> +#undef pr_fmt
>> +#define pr_fmt(fmt) "cc/mem: " fmt
>> +
>> +#include <linux/export.h>
>> +#include <linux/mm.h>
>> +#include <linux/cc_platform.h>
>> +#include <linux/set_memory.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/dma-mapping.h>
>> +
>> +#include <asm/coco.h>
>> +#include <asm/processor.h>
>> +
>> +#define CC_MEM_DRIVER "ccmem"
>> +
>> +struct platform_device *mem_pdev;
>
> 'static'?

Agree.

>
>> +dma_addr_t handle;
>
> Hm. How does it work with >1 allocation at a time?

It is a bug. I think we should make it a parameter to the alloc/free APIs.

>
>> +
>> +/* Allocate decrypted memory of given size */
>> +void *cc_decrypted_alloc(size_t size, gfp_t gfp)
>> +{
>> +	if (!mem_pdev)
>> +		return NULL;
>> +
>> +	return dma_alloc_coherent(&mem_pdev->dev, size, &handle, gfp);
>> +}
>> +
>> +/* Free given decrypted memory */
>> +void cc_decrypted_free(void *addr, size_t size)
>> +{
>> +	if (!mem_pdev)
>> +		return;
>> +
>> +	dma_free_coherent(&mem_pdev->dev, size, addr, handle);
>> +}
>> +
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer