2022-05-03 00:50:12

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support


> +
> + /* Get order for Quote buffer page allocation */
> + order = get_order(quote_req.len);
> +
> + /*
> + * Allocate buffer to get TD Quote from the VMM.
> + * Size needs to be 4KB aligned (which is already
> + * met in page allocation).
> + */
> + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> + if (!tdquote) {
> + ret = -ENOMEM;
> + goto quote_failed;
> + }

You can use alloc_pages_exact().

> +
> + /*
> + * Since this buffer will be shared with the VMM via GetQuote
> + * hypercall, decrypt it.
> + */
> + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
> + if (ret)
> + goto quote_failed;


Again, Dave and Andi already commented that you should use vmap() to avoid breaking
up the direct mapping. Please use vmap() instead.

https://lore.kernel.org/all/[email protected]/

Will review the rest later.


--
Thanks,
-Kai



2022-05-03 01:35:47

by Kirill A. Shutemov

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Mon, May 02, 2022 at 02:40:26PM +1200, Kai Huang wrote:
>
> > +
> > + /* Get order for Quote buffer page allocation */
> > + order = get_order(quote_req.len);
> > +
> > + /*
> > + * Allocate buffer to get TD Quote from the VMM.
> > + * Size needs to be 4KB aligned (which is already
> > + * met in page allocation).
> > + */
> > + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > + if (!tdquote) {
> > + ret = -ENOMEM;
> > + goto quote_failed;
> > + }
>
> You can use alloc_pages_exact().
>
> > +
> > + /*
> > + * Since this buffer will be shared with the VMM via GetQuote
> > + * hypercall, decrypt it.
> > + */
> > + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
> > + if (ret)
> > + goto quote_failed;
>
>
> Again, Dave and Andi already commented you should use vmap() to avoid breaking
> up the direct-mapping. Please use vmap() instead.
>
> https://lore.kernel.org/all/[email protected]/
>
> Will review the rest later.

I would rather convert it to use the DMA API for memory allocation. It will
tap into the swiotlb buffer that is already converted, so there's no need to
touch the direct mapping. Both allocating and freeing such memory are cheaper
because of that.
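For context, a rough sketch of the route being suggested (`dev` would have to come from somewhere, e.g. a platform device; this is illustrative, not a tested implementation):

```c
/* Sketch only: allocate the Quote buffer via the DMA API. In a TDX
 * guest, dma_alloc_coherent() returns memory that is already shared
 * with the VMM (backed by the converted swiotlb pool), so no
 * direct-map private/shared conversion is needed at IOCTL time. */
dma_addr_t handle;
void *tdquote;

tdquote = dma_alloc_coherent(dev, PAGE_ALIGN(quote_req.len),
			     &handle, GFP_KERNEL);
if (!tdquote)
	return -ENOMEM;

/* ... issue the GetQuote hypercall on `handle` ... */

dma_free_coherent(dev, PAGE_ALIGN(quote_req.len), tdquote, handle);
```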

--
Kirill A. Shutemov

2022-05-03 02:18:50

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Tue, 2022-05-03 at 04:27 +0300, Kirill A. Shutemov wrote:
> On Mon, May 02, 2022 at 02:40:26PM +1200, Kai Huang wrote:
> >
> > > +
> > > + /* Get order for Quote buffer page allocation */
> > > + order = get_order(quote_req.len);
> > > +
> > > + /*
> > > + * Allocate buffer to get TD Quote from the VMM.
> > > + * Size needs to be 4KB aligned (which is already
> > > + * met in page allocation).
> > > + */
> > > + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > > + if (!tdquote) {
> > > + ret = -ENOMEM;
> > > + goto quote_failed;
> > > + }
> >
> > You can use alloc_pages_exact().
> >
> > > +
> > > + /*
> > > + * Since this buffer will be shared with the VMM via GetQuote
> > > + * hypercall, decrypt it.
> > > + */
> > > + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
> > > + if (ret)
> > > + goto quote_failed;
> >
> >
> > Again, Dave and Andi already commented you should use vmap() to avoid breaking
> > up the direct-mapping. Please use vmap() instead.
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > Will review the rest later.
>
> I would rather convert it to use DMA API for memory allocation. It will
> tap into swiotlb buffer that already converted and there's no need to
> touch direct mapping. Both allocation and freeing such memory is cheaper
> because of that.
>

Does each DMA allocation and free internally do the actual private/shared
conversion? Or is the swiotlb converted once at boot, so that a DMA
allocation always gets a shared buffer automatically?

The problem with using the DMA API is that it requires additional code to use a
platform device, which isn't necessary.

Using vmap() we can still (almost) avoid private/shared conversion at IOCTL time
by allocating a default-size buffer (large enough to cover 99% of cases) at
driver initialization time:

https://lore.kernel.org/lkml/20220422233418.1203092-2-sathyanarayanan.kuppuswamy@linux.intel.com/T/#maf7e5f6894548972c5de71f607199a79645856ff


--
Thanks,
-Kai


Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support



On 5/2/22 7:18 PM, Kai Huang wrote:
> On Tue, 2022-05-03 at 04:27 +0300, Kirill A. Shutemov wrote:
>> On Mon, May 02, 2022 at 02:40:26PM +1200, Kai Huang wrote:
>>>
>>>> +
>>>> + /* Get order for Quote buffer page allocation */
>>>> + order = get_order(quote_req.len);
>>>> +
>>>> + /*
>>>> + * Allocate buffer to get TD Quote from the VMM.
>>>> + * Size needs to be 4KB aligned (which is already
>>>> + * met in page allocation).
>>>> + */
>>>> + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
>>>> + if (!tdquote) {
>>>> + ret = -ENOMEM;
>>>> + goto quote_failed;
>>>> + }
>>>
>>> You can use alloc_pages_exact().
>>>
>>>> +
>>>> + /*
>>>> + * Since this buffer will be shared with the VMM via GetQuote
>>>> + * hypercall, decrypt it.
>>>> + */
>>>> + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
>>>> + if (ret)
>>>> + goto quote_failed;
>>>
>>>
>>> Again, Dave and Andi already commented you should use vmap() to avoid breaking
>>> up the direct-mapping. Please use vmap() instead.
>>>
>>> https://lore.kernel.org/all/[email protected]/
>>>
>>> Will review the rest later.
>>
>> I would rather convert it to use DMA API for memory allocation. It will
>> tap into swiotlb buffer that already converted and there's no need to
>> touch direct mapping. Both allocation and freeing such memory is cheaper
>> because of that.
>>
>
> Does each DMA allocation and free internally do the actual private/shared
> conversion? Or the swiotlb is converted at the beginning at boot and DMA
> allocation will always get the shared buffer automatically?

DMA allocation will always return a shared buffer.

>
> The problem of using DMA API is it will need to bring additional code to use
> platform device, which isn't necessary.

Yes.

>
> Using vmap() we can still (almost) avoid private/shared conversion at IOCTL time
> by allocating a default size buffer (which is large enough to cover 99% cases,
> etc) at driver initialization time:

Allocating a fixed-size buffer pool will work for DMA buffer allocation
as well.

So the comparison is between platform-driver boilerplate code vs. adding vmap()
and shared/unshared conversion code. It is arguable which is better; I
think it comes down to preference.

>
> https://lore.kernel.org/lkml/20220422233418.1203092-2-sathyanarayanan.kuppuswamy@linux.intel.com/T/#maf7e5f6894548972c5de71f607199a79645856ff
>
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2022-05-03 02:43:34

by Kirill A. Shutemov

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Tue, May 03, 2022 at 02:18:10PM +1200, Kai Huang wrote:
> On Tue, 2022-05-03 at 04:27 +0300, Kirill A. Shutemov wrote:
> > On Mon, May 02, 2022 at 02:40:26PM +1200, Kai Huang wrote:
> > >
> > > > +
> > > > + /* Get order for Quote buffer page allocation */
> > > > + order = get_order(quote_req.len);
> > > > +
> > > > + /*
> > > > + * Allocate buffer to get TD Quote from the VMM.
> > > > + * Size needs to be 4KB aligned (which is already
> > > > + * met in page allocation).
> > > > + */
> > > > + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > > > + if (!tdquote) {
> > > > + ret = -ENOMEM;
> > > > + goto quote_failed;
> > > > + }
> > >
> > > You can use alloc_pages_exact().
> > >
> > > > +
> > > > + /*
> > > > + * Since this buffer will be shared with the VMM via GetQuote
> > > > + * hypercall, decrypt it.
> > > > + */
> > > > + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
> > > > + if (ret)
> > > > + goto quote_failed;
> > >
> > >
> > > Again, Dave and Andi already commented you should use vmap() to avoid breaking
> > > up the direct-mapping. Please use vmap() instead.
> > >
> > > https://lore.kernel.org/all/[email protected]/
> > >
> > > Will review the rest later.
> >
> > I would rather convert it to use DMA API for memory allocation. It will
> > tap into swiotlb buffer that already converted and there's no need to
> > touch direct mapping. Both allocation and freeing such memory is cheaper
> > because of that.
> >
>
> Does each DMA allocation and free internally do the actual private/shared
> conversion? Or the swiotlb is converted at the beginning at boot and DMA
> allocation will always get the shared buffer automatically?

It can remap as a fallback, but usually it allocates from the pool.

> The problem of using DMA API is it will need to bring additional code to use
> platform device, which isn't necessary.

Heh? The DMA API is in the kernel anyway. Or do you mean some cost from the header
for the compilation unit? That's a strange argument. The kernel provides
infrastructure for a reason.

> Using vmap() we can still (almost) avoid private/shared conversion at IOCTL time
> by allocating a default size buffer (which is large enough to cover 99% cases,
> etc) at driver initialization time:
>
> https://lore.kernel.org/lkml/20220422233418.1203092-2-sathyanarayanan.kuppuswamy@linux.intel.com/T/#maf7e5f6894548972c5de71f607199a79645856ff

I don't see a reason to invent an ad-hoc solution when there's an established
API for the task.

--
Kirill A. Shutemov

2022-05-03 03:36:38

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Tue, 2022-05-03 at 05:45 +0300, Kirill A. Shutemov wrote:
> On Tue, May 03, 2022 at 02:18:10PM +1200, Kai Huang wrote:
> > On Tue, 2022-05-03 at 04:27 +0300, Kirill A. Shutemov wrote:
> > > On Mon, May 02, 2022 at 02:40:26PM +1200, Kai Huang wrote:
> > > >
> > > > > +
> > > > > + /* Get order for Quote buffer page allocation */
> > > > > + order = get_order(quote_req.len);
> > > > > +
> > > > > + /*
> > > > > + * Allocate buffer to get TD Quote from the VMM.
> > > > > + * Size needs to be 4KB aligned (which is already
> > > > > + * met in page allocation).
> > > > > + */
> > > > > + tdquote = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > > > > + if (!tdquote) {
> > > > > + ret = -ENOMEM;
> > > > > + goto quote_failed;
> > > > > + }
> > > >
> > > > You can use alloc_pages_exact().
> > > >
> > > > > +
> > > > > + /*
> > > > > + * Since this buffer will be shared with the VMM via GetQuote
> > > > > + * hypercall, decrypt it.
> > > > > + */
> > > > > + ret = set_memory_decrypted((unsigned long)tdquote, 1UL << order);
> > > > > + if (ret)
> > > > > + goto quote_failed;
> > > >
> > > >
> > > > Again, Dave and Andi already commented you should use vmap() to avoid breaking
> > > > up the direct-mapping. Please use vmap() instead.
> > > >
> > > > https://lore.kernel.org/all/[email protected]/
> > > >
> > > > Will review the rest later.
> > >
> > > I would rather convert it to use DMA API for memory allocation. It will
> > > tap into swiotlb buffer that already converted and there's no need to
> > > touch direct mapping. Both allocation and freeing such memory is cheaper
> > > because of that.
> > >
> >
> > Does each DMA allocation and free internally do the actual private/shared
> > conversion? Or the swiotlb is converted at the beginning at boot and DMA
> > allocation will always get the shared buffer automatically?
>
> It can remap as fallback, but usually it allocates from the pool.
>
> > The problem of using DMA API is it will need to bring additional code to use
> > platform device, which isn't necessary.
>
> Heh? DMA is in the kernel anyway. Or do you mean some cost from the header
> for the compilation unit? That's strange argument. Kernel provides
> infrastructure for a reason.

I mean that using a platform device is more complicated than using misc_register()
directly. You can compare the code between v4 and v5.

>
> > Using vmap() we can still (almost) avoid private/shared conversion at IOCTL time
> > by allocating a default size buffer (which is large enough to cover 99% cases,
> > etc) at driver initialization time:
> >
> > https://lore.kernel.org/lkml/20220422233418.1203092-2-sathyanarayanan.kuppuswamy@linux.intel.com/T/#maf7e5f6894548972c5de71f607199a79645856ff
>
> I don't see a reason to invent ad-hoc solution if there's an establised
> API for the task.
>

That the DMA API can do this job doesn't mean it fits best. And I don't see why
using vmap() as I described above is ad-hoc.

1) There's no real DMA involved in attestation. Using the DMA API is overkill.
2) DMA buffers are always shared, but this is only true for now. It's not
guaranteed to be true for future generations of TDX.

It's just a little bit weird to use the DMA API when there's no real device and
no real DMA involved. It's much more like using the DMA API purely for
convenience.


--
Thanks,
-Kai


2022-05-04 00:12:19

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Mon, 2022-05-02 at 19:39 -0700, Sathyanarayanan Kuppuswamy wrote:
> >
> > Using vmap() we can still (almost) avoid private/shared conversion at IOCTL
> > time
> > by allocating a default size buffer (which is large enough to cover 99%
> > cases,
> > etc) at driver initialization time:
>
> Allocating fixed size buffer pool will work for dma buffer allocation
> as well.
>
> So the comparison is between platform driver boilerplate code vs vmap
> and shared/unshared code addition. It is arguable which is better. I
> think it is about preference.

Not really. A DMA buffer is guaranteed to be shared for now, but that's not
guaranteed in future generations of TDX.

--
Thanks,
-Kai


2022-05-04 01:48:37

by Dave Hansen

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On 5/2/22 18:27, Kirill A. Shutemov wrote:
>> Again, Dave and Andi already commented you should use vmap() to avoid breaking
>> up the direct-mapping. Please use vmap() instead.
>>
>> https://lore.kernel.org/all/[email protected]/
>>
>> Will review the rest later.
> I would rather convert it to use DMA API for memory allocation. It will
> tap into swiotlb buffer that already converted and there's no need to
> touch direct mapping. Both allocation and freeing such memory is cheaper
> because of that.

Sathya, I don't quite understand why you are so forcefully declining to
incorporate review feedback on this point. I gave very specific
feedback about the kind of mapping you need and that you should avoid
fragmenting the direct map if at all possible.

Why is this code still fragmenting the direct map?

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support



On 5/3/22 3:24 PM, Dave Hansen wrote:
> On 5/2/22 18:27, Kirill A. Shutemov wrote:
>>> Again, Dave and Andi already commented you should use vmap() to avoid breaking
>>> up the direct-mapping. Please use vmap() instead.
>>>
>>> https://lore.kernel.org/all/[email protected]/
>>>
>>> Will review the rest later.
>> I would rather convert it to use DMA API for memory allocation. It will
>> tap into swiotlb buffer that already converted and there's no need to
>> touch direct mapping. Both allocation and freeing such memory is cheaper
>> because of that.
>
> Sathya, I don't quite understand why you are so forcefully declining to
> incorporate review feedback on this point. I gave very specific
> feedback about the kind of mapping you need and that you should avoid
> fragmenting the direct map if at all possible.
>
> Why is this code still fragmenting the direct map?

I have already implemented it and am testing it now.

In this discussion, we are comparing the use of the DMA API for memory
allocation vs. vmap()/sharing in the driver itself.

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

Hi,

On 5/3/22 3:24 PM, Dave Hansen wrote:
> On 5/2/22 18:27, Kirill A. Shutemov wrote:
>>> Again, Dave and Andi already commented you should use vmap() to avoid breaking
>>> up the direct-mapping. Please use vmap() instead.
>>>
>>> https://lore.kernel.org/all/[email protected]/
>>>
>>> Will review the rest later.
>> I would rather convert it to use DMA API for memory allocation. It will
>> tap into swiotlb buffer that already converted and there's no need to
>> touch direct mapping. Both allocation and freeing such memory is cheaper
>> because of that.
>
> Sathya, I don't quite understand why you are so forcefully declining to
> incorporate review feedback on this point. I gave very specific
> feedback about the kind of mapping you need and that you should avoid
> fragmenting the direct map if at all possible.
>
> Why is this code still fragmenting the direct map?

The next version will look like below. It includes vmap() support. It also
supports parallel GetQuote requests by using a list to track active
requests. I did some initial testing, and it seems to work fine.

Please take a look at it and let me know your comments.

I will continue to test more cases in parallel.

diff --git a/arch/x86/coco/tdx/attest.c b/arch/x86/coco/tdx/attest.c
index 81b59d060392..0ed959dacb7e 100644
--- a/arch/x86/coco/tdx/attest.c
+++ b/arch/x86/coco/tdx/attest.c
@@ -13,20 +13,44 @@
#include <linux/miscdevice.h>
#include <linux/mm.h>
#include <linux/io.h>
+#include <linux/set_memory.h>
+#include <linux/spinlock.h>
#include <asm/tdx.h>
+#include <asm/coco.h>
#include <uapi/asm/tdx.h>

#define DRIVER_NAME "tdx-attest"

/* TDREPORT module call leaf ID */
#define TDX_GET_REPORT 4
+/* GetQuote hypercall leaf ID */
+#define TDVMCALL_GET_QUOTE 0x10002

/* Upper 32 bits has the status code, so mask it */
#define TDCALL_STATUS_CODE_MASK 0xffffffff00000000
#define TDCALL_STATUS_CODE(a) ((a) & TDCALL_STATUS_CODE_MASK)

+/* Used for buffer allocation in GetQuote request */
+struct quote_buf {
+ struct page **pages;
+ int count;
+ void *vmaddr;
+};
+
+/* List entry of quote_list */
+struct quote_entry {
+ struct quote_buf *buf;
+ struct completion compl;
+ struct list_head list;
+};
+
static struct miscdevice miscdev;

+/* List to track active GetQuote requests */
+static LIST_HEAD(quote_list);
+/* Lock to protect quote_list */
+static DEFINE_SPINLOCK(quote_lock);
+
static long tdx_get_report(void __user *argp)
{
void *reportdata = NULL, *tdreport = NULL;
@@ -76,6 +100,214 @@ static long tdx_get_report(void __user *argp)
return ret;
}

+/* tdx_get_quote_hypercall() - Request to get TD Quote using TDREPORT */
+static long tdx_get_quote_hypercall(struct quote_buf *buf)
+{
+ struct tdx_hypercall_args args = {0};
+
+ args.r10 = TDX_HYPERCALL_STANDARD;
+ args.r11 = TDVMCALL_GET_QUOTE;
+ args.r12 = cc_mkdec(page_to_phys(vmalloc_to_page(buf->vmaddr)));
+ args.r13 = buf->count * PAGE_SIZE;
+
+ /*
+ * Pass the physical address of TDREPORT to the VMM and
+ * trigger the Quote generation. It is not a blocking
+ * call, hence completion of this request will be notified to
+ * the TD guest via a callback interrupt. More info about ABI
+ * can be found in TDX Guest-Host-Communication Interface
+ * (GHCI), sec titled "TDG.VP.VMCALL<GetQuote>".
+ */
+ return __tdx_hypercall(&args, 0);
+}
+
+static struct quote_buf *alloc_quote_buf(u64 req_size)
+{
+ int size = PAGE_ALIGN(req_size);
+ void *addr = NULL, *vmaddr = NULL;
+ int count = size >> PAGE_SHIFT;
+ struct quote_buf *buf;
+ struct page **pages;
+ int i;
+
+ buf = kmalloc(sizeof(*buf), GFP_KERNEL);
+ if (!buf)
+ return NULL;
+
+ /* Allocate mem for array of page ptrs */
+ pages = kcalloc(count, sizeof(*pages), GFP_KERNEL);
+ if (!pages)
+ goto alloc_failed;
+
+ addr = alloc_pages_exact(size, GFP_KERNEL);
+ if (!addr)
+ goto alloc_failed;
+
+ for (i = 0; i < count; i++)
+ pages[i] = virt_to_page(addr + i * PAGE_SIZE);
+
+ vmaddr = vmap(pages, count, VM_MAP, PAGE_KERNEL);
+ if (!vmaddr)
+ goto alloc_failed;
+
+ /* Mark the memory shared to allow VMM access it */
+ if (set_memory_decrypted((unsigned long)vmaddr, count))
+ goto alloc_failed;
+
+ buf->pages = pages;
+ buf->count = count;
+ buf->vmaddr = vmaddr;
+
+ return buf;
+
+alloc_failed:
+ if (vmaddr)
+ vunmap(vmaddr);
+ if (addr)
+ free_pages_exact(addr, size);
+ kfree(pages);
+ kfree(buf);
+ return NULL;
+}
+
+static void free_quote_buf(struct quote_buf *buf)
+{
+ if (!buf)
+ return;
+
+ /* Mark pages private */
+ if (set_memory_encrypted((unsigned long)buf->vmaddr, buf->count)) {
+ pr_warn("Failed to encrypt %d pages at %p", buf->count,
+ buf->vmaddr);
+ return;
+ }
+
+ vunmap(buf->vmaddr);
+
+ while (buf->count--)
+ __free_page(buf->pages[buf->count]);
+
+ kfree(buf->pages);
+ kfree(buf);
+}
+
+static struct quote_entry *alloc_quote_entry(struct tdx_quote_req *quote_req)
+{
+ struct quote_entry *entry = NULL;
+
+ entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return NULL;
+
+ /* Allocate buffer for quote request */
+ entry->buf = alloc_quote_buf(quote_req->len);
+ if (!entry->buf) {
+ kfree(entry);
+ return NULL;
+ }
+
+ init_completion(&entry->compl);
+
+ return entry;
+}
+
+static void free_quote_entry(struct quote_entry *entry)
+{
+ free_quote_buf(entry->buf);
+ kfree(entry);
+}
+
+void add_quote_entry(struct quote_entry *entry)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&quote_lock, flags);
+ list_add_tail(&entry->list, &quote_list);
+ spin_unlock_irqrestore(&quote_lock, flags);
+}
+
+void del_quote_entry(struct quote_entry *entry)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&quote_lock, flags);
+ list_del(&entry->list);
+ spin_unlock_irqrestore(&quote_lock, flags);
+}
+
+static long tdx_get_quote(void __user *argp)
+{
+ struct tdx_quote_req quote_req;
+ struct quote_entry *entry;
+ struct quote_buf *buf;
+ long ret = 0;
+
+ /* Copy GetQuote request struct from user buffer */
+ if (copy_from_user(&quote_req, argp, sizeof(struct tdx_quote_req)))
+ return -EFAULT;
+
+ /* Make sure the length is valid */
+ if (!quote_req.len)
+ return -EINVAL;
+
+ entry = alloc_quote_entry(&quote_req);
+ if (!entry)
+ return -ENOMEM;
+
+ buf = entry->buf;
+
+ /* Copy TDREPORT from user buffer to kernel Quote buffer */
+ if (copy_from_user(buf->vmaddr, (void __user *)quote_req.buf,
+ quote_req.len)) {
+ ret = -EFAULT;
+ goto free_entry;
+ }
+
+ /* Submit GetQuote Request */
+ ret = tdx_get_quote_hypercall(buf);
+ if (ret) {
+ pr_err("GetQuote hypercall failed, status:%lx\n", ret);
+ ret = -EIO;
+ goto free_entry;
+ }
+
+ /* Add current quote entry to quote_list */
+ add_quote_entry(entry);
+
+ /* Wait for attestation completion */
+ ret = wait_for_completion_interruptible(&entry->compl);
+ if (ret < 0) {
+ ret = -EIO;
+ goto del_entry;
+ }
+
+ /* Copy output data back to user buffer */
+ if (copy_to_user((void __user *)quote_req.buf, buf->vmaddr, quote_req.len))
+ ret = -EFAULT;
+
+del_entry:
+ del_quote_entry(entry);
+free_entry:
+ free_quote_entry(entry);
+ return ret;
+}
+
+static void attestation_callback_handler(void)
+{
+ struct tdx_quote_hdr *quote_hdr;
+ struct quote_entry *entry;
+
+ /* Find processed quote request and mark it complete */
+ spin_lock(&quote_lock);
+ list_for_each_entry(entry, &quote_list, list) {
+ quote_hdr = (struct tdx_quote_hdr *)entry->buf->vmaddr;
+ /* If status is either success or failure, mark it complete */
+ if (quote_hdr->status != GET_QUOTE_IN_FLIGHT)
+ complete(&entry->compl);
+ }
+ spin_unlock(&quote_lock);
+}
+
static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -86,6 +318,9 @@ static long tdx_attest_ioctl(struct file *file, unsigned int cmd,
case TDX_CMD_GET_REPORT:
ret = tdx_get_report(argp);
break;
+ case TDX_CMD_GET_QUOTE:
+ ret = tdx_get_quote(argp);
+ break;
default:
pr_debug("cmd %d not supported\n", cmd);
break;
@@ -108,6 +343,9 @@ static int __init tdx_attestation_init(void)
if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
return -EIO;

+ /* Register attestation event notify handler */
+ tdx_setup_ev_notify_handler(attestation_callback_handler);
+
miscdev.name = DRIVER_NAME;
miscdev.minor = MISC_DYNAMIC_MINOR;
miscdev.fops = &tdx_attest_fops;
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index b49211994864..ab6c94cd880d 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -15,6 +15,7 @@
#include <asm/idtentry.h>
#include <asm/irq_regs.h>
#include <asm/desc.h>
+#include <asm/io.h>

/* TDX module Call Leaf IDs */
#define TDX_GET_INFO 1
@@ -680,8 +681,15 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
*/
static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
{
- phys_addr_t start = __pa(vaddr);
- phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+ phys_addr_t start;
+ phys_addr_t end;
+
+ if (is_vmalloc_addr((void *)vaddr))
+ start = page_to_phys(vmalloc_to_page((void*)vaddr));
+ else
+ start = __pa(vaddr);
+
+ end = start + numpages * PAGE_SIZE;

if (!enc) {
/* Set the shared (decrypted) bits: */
diff --git a/arch/x86/include/uapi/asm/tdx.h b/arch/x86/include/uapi/asm/tdx.h
index 5b721e0ebbb8..3fb69fe3ffb7 100644
--- a/arch/x86/include/uapi/asm/tdx.h
+++ b/arch/x86/include/uapi/asm/tdx.h
@@ -33,4 +33,43 @@ struct tdx_report_req {
/* Get TDREPORT using TDCALL[TDG.MR.REPORT) */
#define TDX_CMD_GET_REPORT _IOWR('T', 0x01, struct tdx_report_req)

+/* struct tdx_quote_req: Request to generate TD Quote using TDREPORT
+ *
+ * @buf : Pass user data that includes TDREPORT as input. Upon
+ * successful completion of IOCTL, output is copied
+ * back to the same buffer.
+ * @len : Length of the buffer.
+ */
+struct tdx_quote_req {
+ __u64 buf;
+ __u64 len;
+};
+
+/* Get TD Quote from QE/QGS using GetQuote TDVMCALL */
+#define TDX_CMD_GET_QUOTE _IOR('T', 0x02, struct tdx_quote_req)
+
+/* TD Quote status codes */
+#define GET_QUOTE_SUCCESS 0
+#define GET_QUOTE_IN_FLIGHT 0xffffffffffffffff
+#define GET_QUOTE_ERROR 0x8000000000000000
+#define GET_QUOTE_SERVICE_UNAVAILABLE 0x8000000000000001
+
+/*
+ * Format of Quote data header. More details can be found in TDX
+ * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
+ * section titled "TDG.VP.VMCALL<GetQuote>"
+ */
+struct tdx_quote_hdr {
+ /* Quote version, filled by TD */
+ __u64 version;
+ /* Status code of Quote request, filled by VMM */
+ __u64 status;
+ /* Length of TDREPORT, filled by TD */
+ __u32 in_len;
+ /* Length of Quote, filled by VMM */
+ __u32 out_len;
+ /* Actual Quote data */
+ __u64 data[0];
+};
+
#endif /* _UAPI_ASM_X86_TDX_H */


--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

2022-05-07 12:57:13

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support


> + /* Submit GetQuote Request */
> + ret = tdx_get_quote_hypercall(buf);
> + if (ret) {
> + pr_err("GetQuote hypercall failed, status:%lx\n", ret);
> + ret = -EIO;
> + goto free_entry;
> + }
> +
> + /* Add current quote entry to quote_list */
> + add_quote_entry(entry);
> +
> + /* Wait for attestation completion */
> + ret = wait_for_completion_interruptible(&entry->compl);
> + if (ret < 0) {
> + ret = -EIO;
> + goto del_entry;
> + }

This is a misuse of wait_for_completion_interruptible().

xxx_interruptible() essentially means the operation can be interrupted by a
signal. Using xxx_interruptible() in a driver IOCTL essentially means that when
it returns due to a signal, the IOCTL should return -EINTR to let userspace know
that the application received a signal that needs handling, that the IOCTL isn't
finished, and that it should retry. So here we should return -EINTR (and clean
up all the stuff that has been done) when wait_for_completion_interruptible()
returns -ERESTARTSYS (in fact, it returns only -ERESTARTSYS or 0).

Since userspace applications normally just ignore signals, and in this
particular case asking userspace to retry just makes things more complicated to
handle, I think you can just use wait_for_completion_killable(), which only
returns early when the application receives a fatal signal.
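Applied to the quoted code, the suggestion would look roughly like this (a sketch against the names in the patch, not a tested change):

```c
/* Wait for attestation completion. wait_for_completion_killable()
 * returns 0 on completion, or -ERESTARTSYS only when a fatal signal
 * is pending; ordinary signals no longer interrupt the wait. */
ret = wait_for_completion_killable(&entry->compl);
if (ret < 0) {
	ret = -EINTR;	/* request interrupted, not completed */
	goto del_entry;
}
```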

> +
> + /* Copy output data back to user buffer */
> + if (copy_to_user((void __user *)quote_req.buf, buf->vmaddr,
> quote_req.len))
> + ret = -EFAULT;
> +
> +del_entry:
> + del_quote_entry(entry);
> +free_entry:
> + free_quote_entry(entry);

As I (and Isaku) mentioned before, when wait_for_completion_killable() returns
with an error, you cannot just convert the buffer back to private and free it.
The VMM still owns it (IN_FLIGHT).

One way to handle this is to put buffers that are still owned by the VMM on a
new list, and have a kernel thread periodically check the buffers' status and
free those that have already been released by the VMM. I haven't thought it
through thoroughly, so maybe there's a better way to handle it.
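One possible shape for that idea, reusing the structures from the quoted patch (the cleanup list and work function are hypothetical names here, and the re-arming policy is left open):

```c
/* Sketch: entries whose buffers the VMM still owns (IN_FLIGHT) are
 * parked on a cleanup list instead of being freed in the IOCTL path. */
static LIST_HEAD(quote_cleanup_list);

static void quote_cleanup_work_fn(struct work_struct *work)
{
	struct quote_entry *entry, *next;
	LIST_HEAD(done);
	unsigned long flags;

	spin_lock_irqsave(&quote_lock, flags);
	list_for_each_entry_safe(entry, next, &quote_cleanup_list, list) {
		struct tdx_quote_hdr *hdr = entry->buf->vmaddr;

		/* Move released buffers off the list under the lock... */
		if (hdr->status != GET_QUOTE_IN_FLIGHT)
			list_move_tail(&entry->list, &done);
	}
	spin_unlock_irqrestore(&quote_lock, flags);

	/* ...and free them outside it, since converting the memory
	 * back to private can sleep. */
	list_for_each_entry_safe(entry, next, &done, list)
		free_quote_entry(entry);

	/* Re-arm the work while quote_cleanup_list is non-empty ... */
}
```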

--
Thanks,
-Kai



2022-05-09 05:02:08

by Huang, Kai

Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

On Wed, 2022-05-04 at 15:49 -0700, Sathyanarayanan Kuppuswamy wrote:
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -15,6 +15,7 @@
>   #include <asm/idtentry.h>
>   #include <asm/irq_regs.h>
>   #include <asm/desc.h>
> +#include <asm/io.h>
>
>   /* TDX module Call Leaf IDs */
>   #define TDX_GET_INFO                   1
> @@ -680,8 +681,15 @@ static bool try_accept_one(phys_addr_t *start,
> unsigned long len,
>    */
>   static bool tdx_enc_status_changed(unsigned long vaddr, int numpages,
> bool enc)
>   {
> -       phys_addr_t start = __pa(vaddr);
> -       phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> +       phys_addr_t start;
> +       phys_addr_t end;
> +
> +       if (is_vmalloc_addr((void *)vaddr))
> +               start =  page_to_phys(vmalloc_to_page((void*)vaddr));
> +       else
> +               start = __pa(vaddr);
> +
> +       end = start + numpages * PAGE_SIZE;
>
>          if (!enc) {
>                  /* Set the shared (decrypted) bits: */

It looks like set_memory_decrypted() only works for the direct mapping, so you
should not use it here. Instead, you can pass the shared bit in the 'prot'
argument (using pgprot_decrypted()) when you call vmap(), and explicitly call
MapGPA().
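In code, that suggestion would look something like the following sketch (the MapGPA plumbing is shown only as a placeholder; the helper name is an assumption, not an existing function):

```c
/* Sketch: encode the shared bit in the vmap() protection up front,
 * instead of flipping it afterwards with set_memory_decrypted(),
 * so the direct mapping is never touched. */
vmaddr = vmap(pages, count, VM_MAP, pgprot_decrypted(PAGE_KERNEL));
if (!vmaddr)
	goto alloc_failed;

/*
 * The page tables now map the buffer shared, but the VMM still has
 * to be told via an explicit MapGPA hypercall on the physical range,
 * e.g. something like:
 *
 *	tdx_map_gpa(cc_mkdec(page_to_phys(pages[0])),
 *		    count * PAGE_SIZE);
 *
 * (helper name hypothetical)
 */
```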

--
Thanks,
-Kai



Subject: Re: [PATCH v5 3/3] x86/tdx: Add Quote generation support

Hi Kai,

On 5/4/22 4:28 PM, Kai Huang wrote:
> On Wed, 2022-05-04 at 15:49 -0700, Sathyanarayanan Kuppuswamy wrote:
>> --- a/arch/x86/coco/tdx/tdx.c
>> +++ b/arch/x86/coco/tdx/tdx.c
>> @@ -15,6 +15,7 @@
>>   #include <asm/idtentry.h>
>>   #include <asm/irq_regs.h>
>>   #include <asm/desc.h>
>> +#include <asm/io.h>
>>
>>   /* TDX module Call Leaf IDs */
>>   #define TDX_GET_INFO                   1
>> @@ -680,8 +681,15 @@ static bool try_accept_one(phys_addr_t *start,
>> unsigned long len,
>>    */
>>   static bool tdx_enc_status_changed(unsigned long vaddr, int numpages,
>> bool enc)
>>   {
>> -       phys_addr_t start = __pa(vaddr);
>> -       phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
>> +       phys_addr_t start;
>> +       phys_addr_t end;
>> +
>> +       if (is_vmalloc_addr((void *)vaddr))
>> +               start =  page_to_phys(vmalloc_to_page((void*)vaddr));
>> +       else
>> +               start = __pa(vaddr);
>> +
>> +       end = start + numpages * PAGE_SIZE;
>>
>>          if (!enc) {
>>                  /* Set the shared (decrypted) bits: */
>
> Looks set_memory_decrypted() only works for direct-mapping, so you should not
> use this. Instead, you can pass shared bit in 'prot' argument (using
> pgprot_decrypted()) when you call vmap(), and explicitly call MapGPA().

Is it because of the above change, or do you see other direct-mapping
dependencies in the set_memory_*() functions?


>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer