Changes since v3: [1]:
- Combine configfs-tsm + sev-guest conversion with the tdx-guest
extension
- Split PEM formatted certificate data to its own output attribute
(Jeremi)
- Parse the sev-guest output payload and emit the raw report without the
header (Jeremi)
- Drop @format as an input parameter and always request "extended"
reports in the sev-guest case with certificate data optionally
included (inspired by creation of separate @certs attribute)
- Drop usage of cleanup helpers in tdx_report_new() until
mutex_lock_interruptible() grows a guard() helper in v6.7. (Daniel and
Dave)
- Changelog grammar fixes for tdx-guest change (Kirill)
- Defer tdx-guest emitting its cert-chain through @certs pending
question on output payload versioning (i.e. kernel should only support
one). In the meantime zero-sized @certs is a valid output condition.
[1]: http://lore.kernel.org/r/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com
An attestation report is signed evidence of how a Trusted Virtual
Machine (TVM) was launched and its current state. A verifying party uses
the report to make judgements of the confidentiality and integrity of
that execution environment. Upon successful attestation the verifying
party may, for example, proceed to deploy secrets to the TVM to carry
out a workload. Multiple confidential computing platforms share this
similar flow.
The approach of adding adding new char devs and new ioctls, for what
amounts to the same logical functionality with minor formatting
differences across vendors [2], is untenable. Common concepts and the
community benefit from common infrastructure.
Use configfs for this facility for maintainability compared to ioctl(),
and for its scalability compared to sysfs. Atomicity can be enforced at
item creation time, and a conflict detection mechanism is included for
scenarios where multiple threads may share a single configuration
instance.
[2]: http://lore.kernel.org/r/[email protected]
---
Dan Williams (5):
virt: coco: Add a coco/Makefile and coco/Kconfig
configfs-tsm: Introduce a shared ABI for attestation reports
virt: sevguest: Prep for kernel internal {get,get_ext}_report()
mm/slab: Add __free() support for kvfree
virt: sevguest: Add TSM_REPORTS support for SNP_{GET,GET_EXT}_REPORT
Kuppuswamy Sathyanarayanan (1):
virt: tdx-guest: Add Quote generation support using TSM_REPORTS
Documentation/ABI/testing/configfs-tsm | 67 +++++
MAINTAINERS | 8 +
arch/x86/coco/tdx/tdx.c | 21 ++
arch/x86/include/asm/shared/tdx.h | 1
arch/x86/include/asm/tdx.h | 2
drivers/virt/Kconfig | 6
drivers/virt/Makefile | 4
drivers/virt/coco/Kconfig | 14 +
drivers/virt/coco/Makefile | 8 +
drivers/virt/coco/sev-guest/Kconfig | 1
drivers/virt/coco/sev-guest/sev-guest.c | 180 ++++++++++++--
drivers/virt/coco/tdx-guest/Kconfig | 1
drivers/virt/coco/tdx-guest/tdx-guest.c | 229 +++++++++++++++++
drivers/virt/coco/tsm.c | 411 +++++++++++++++++++++++++++++++
include/linux/slab.h | 2
include/linux/tsm.h | 63 +++++
16 files changed, 992 insertions(+), 26 deletions(-)
create mode 100644 Documentation/ABI/testing/configfs-tsm
create mode 100644 drivers/virt/coco/Kconfig
create mode 100644 drivers/virt/coco/Makefile
create mode 100644 drivers/virt/coco/tsm.c
create mode 100644 include/linux/tsm.h
base-commit: 6465e260f48790807eef06b583b38ca9789b6072
From: Kuppuswamy Sathyanarayanan <[email protected]>
In TDX guest, the attestation process is used to verify the TDX guest
trustworthiness to other entities before provisioning secrets to the
guest. The first step in the attestation process is TDREPORT
generation, which involves getting the guest measurement data in the
format of TDREPORT, which is further used to validate the authenticity
of the TDX guest. TDREPORT by design is integrity-protected and can
only be verified on the local machine.
To support remote verification of the TDREPORT in a SGX-based
attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
(QE) to convert it to a remotely verifiable Quote. SGX QE by design can
only run outside of the TDX guest (i.e. in a host process or in a
normal VM) and guest can use communication channels like vsock or
TCP/IP to send the TDREPORT to the QE. But for security concerns, the
TDX guest may not support these communication channels. To handle such
cases, TDX defines a GetQuote hypercall which can be used by the guest
to request the host VMM to communicate with the SGX QE. More details
about GetQuote hypercall can be found in TDX Guest-Host Communication
Interface (GHCI) for Intel TDX 1.0, section titled
"TDG.VP.VMCALL<GetQuote>".
Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
Computing Guest platforms to get the measurement data via ConfigFS.
Extend the TSM framework and add support to allow an attestation agent
to get the TDX Quote data (included usage example below).
report=/sys/kernel/config/tsm/report/report0
mkdir $report
dd if=/dev/urandom bs=64 count=1 > $report/inblob
hexdump -C $report/outblob
rmdir $report
GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
with TDREPORT data as input, which is further used by the VMM to copy
the TD Quote result after successful Quote generation. To create the
shared buffer, allocate a large enough memory and mark it shared using
set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
for GetQuote requests in the TDX TSM handler.
Although this method reserves a fixed chunk of memory for GetQuote
requests, such one time allocation can help avoid memory fragmentation
related allocation failures later in the uptime of the guest.
Since the Quote generation process is not time-critical or frequently
used, the current version uses a polling model for Quote requests and
it also does not support parallel GetQuote requests.
Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Erdem Aktas <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
arch/x86/coco/tdx/tdx.c | 21 +++
arch/x86/include/asm/shared/tdx.h | 1
arch/x86/include/asm/tdx.h | 2
drivers/virt/coco/tdx-guest/Kconfig | 1
drivers/virt/coco/tdx-guest/tdx-guest.c | 229 +++++++++++++++++++++++++++++++
5 files changed, 253 insertions(+), 1 deletion(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..752867b1d11b 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -104,6 +104,27 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
}
EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);
+/**
+ * tdx_hcall_get_quote() - Wrapper to request TD Quote using GetQuote
+ * hypercall.
+ * @buf: Address of the directly mapped shared kernel buffer which
+ * contains TDREPORT. The same buffer will be used by VMM to
+ * store the generated TD Quote output.
+ * @size: size of the tdquote buffer (4KB-aligned).
+ *
+ * Refer to section titled "TDG.VP.VMCALL<GetQuote>" in the TDX GHCI
+ * v1.0 specification for more information on GetQuote hypercall.
+ * It is used in the TDX guest driver module to get the TD Quote.
+ *
+ * Return 0 on success or error code on failure.
+ */
+u64 tdx_hcall_get_quote(u8 *buf, size_t size)
+{
+ /* Since buf is a shared memory, set the shared (decrypted) bits */
+ return _tdx_hypercall(TDVMCALL_GET_QUOTE, cc_mkdec(virt_to_phys(buf)), size, 0, 0);
+}
+EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
+
static void __noreturn tdx_panic(const char *msg)
{
struct tdx_hypercall_args args = {
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 7513b3bb69b7..9eab19950f39 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -22,6 +22,7 @@
/* TDX hypercall Leaf IDs */
#define TDVMCALL_MAP_GPA 0x10001
+#define TDVMCALL_GET_QUOTE 0x10002
#define TDVMCALL_REPORT_FATAL_ERROR 0x10003
#ifndef __ASSEMBLY__
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 603e6d1e9d4a..ebd1cda4875f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -52,6 +52,8 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
+u64 tdx_hcall_get_quote(u8 *buf, size_t size);
+
#else
static inline void tdx_early_init(void) { };
diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
index 14246fc2fb02..22dd59e19431 100644
--- a/drivers/virt/coco/tdx-guest/Kconfig
+++ b/drivers/virt/coco/tdx-guest/Kconfig
@@ -1,6 +1,7 @@
config TDX_GUEST_DRIVER
tristate "TDX Guest driver"
depends on INTEL_TDX_GUEST
+ select TSM_REPORTS
help
The driver provides userspace interface to communicate with
the TDX module to request the TDX guest details like attestation
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
index 5e44a0fa69bd..1253bf76b570 100644
--- a/drivers/virt/coco/tdx-guest/tdx-guest.c
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -12,12 +12,60 @@
#include <linux/mod_devicetable.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <linux/set_memory.h>
+#include <linux/io.h>
+#include <linux/delay.h>
+#include <linux/tsm.h>
+#include <linux/sizes.h>
#include <uapi/linux/tdx-guest.h>
#include <asm/cpu_device_id.h>
#include <asm/tdx.h>
+/*
+ * Intel's SGX QE implementation generally uses Quote size less
+ * than 8K (2K Quote data + ~5K of certificate blob).
+ */
+#define GET_QUOTE_BUF_SIZE SZ_8K
+
+#define GET_QUOTE_CMD_VER 1
+
+/* TDX GetQuote status codes */
+#define GET_QUOTE_SUCCESS 0
+#define GET_QUOTE_IN_FLIGHT 0xffffffffffffffff
+
+/* struct tdx_quote_buf: Format of Quote request buffer.
+ * @version: Quote format version, filled by TD.
+ * @status: Status code of Quote request, filled by VMM.
+ * @in_len: Length of TDREPORT, filled by TD.
+ * @out_len: Length of Quote data, filled by VMM.
+ * @data: Quote data on output or TDREPORT on input.
+ *
+ * More details of Quote request buffer can be found in TDX
+ * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
+ * section titled "TDG.VP.VMCALL<GetQuote>"
+ */
+struct tdx_quote_buf {
+ u64 version;
+ u64 status;
+ u32 in_len;
+ u32 out_len;
+ u8 data[];
+};
+
+/* Quote data buffer */
+static void *quote_data;
+
+/* Lock to streamline quote requests */
+static DEFINE_MUTEX(quote_lock);
+
+/*
+ * GetQuote request timeout in seconds. Expect that 30 seconds
+ * is enough time for QE to respond to any Quote requests.
+ */
+static u32 getquote_timeout = 30;
+
static long tdx_get_report0(struct tdx_report_req __user *req)
{
u8 *reportdata, *tdreport;
@@ -53,6 +101,154 @@ static long tdx_get_report0(struct tdx_report_req __user *req)
return ret;
}
+static void free_quote_buf(void *buf)
+{
+ size_t len = PAGE_ALIGN(GET_QUOTE_BUF_SIZE);
+ unsigned int count = len >> PAGE_SHIFT;
+
+ if (set_memory_encrypted((unsigned long)buf, count)) {
+ pr_err("Failed to restore encryption mask for Quote buffer, leak it\n");
+ return;
+ }
+
+ free_pages_exact(buf, len);
+}
+
+static void *alloc_quote_buf(void)
+{
+ size_t len = PAGE_ALIGN(GET_QUOTE_BUF_SIZE);
+ unsigned int count = len >> PAGE_SHIFT;
+ void *addr;
+
+ addr = alloc_pages_exact(len, GFP_KERNEL | __GFP_ZERO);
+ if (!addr)
+ return NULL;
+
+ if (set_memory_decrypted((unsigned long)addr, count)) {
+ free_pages_exact(addr, len);
+ return NULL;
+ }
+
+ return addr;
+}
+
+/*
+ * wait_for_quote_completion() - Wait for Quote request completion
+ * @quote_buf: Address of Quote buffer.
+ * @timeout: Timeout in seconds to wait for the Quote generation.
+ *
+ * As per TDX GHCI v1.0 specification, sec titled "TDG.VP.VMCALL<GetQuote>",
+ * the status field in the Quote buffer will be set to GET_QUOTE_IN_FLIGHT
+ * while VMM processes the GetQuote request, and will change it to success
+ * or error code after processing is complete. So wait till the status
+ * changes from GET_QUOTE_IN_FLIGHT or the request being timed out.
+ */
+static int wait_for_quote_completion(struct tdx_quote_buf *quote_buf, u32 timeout)
+{
+ int i = 0;
+
+ /*
+ * Quote requests usually take a few seconds to complete, so waking up
+ * once per second to recheck the status is fine for this use case.
+ */
+ while (quote_buf->status == GET_QUOTE_IN_FLIGHT && i++ < timeout) {
+ if (msleep_interruptible(MSEC_PER_SEC))
+ return -EINTR;
+ }
+
+ return (i == timeout) ? -ETIMEDOUT : 0;
+}
+
+static int tdx_report_new(struct tsm_report *report, void *data)
+{
+ u8 *buf, *reportdata = NULL, *tdreport = NULL;
+ struct tdx_quote_buf *quote_buf = quote_data;
+ struct tsm_desc *desc = &report->desc;
+ int ret;
+ u64 err;
+
+ /* TODO: switch to guard(mutex_intr) */
+ if (mutex_lock_interruptible("e_lock))
+ return -EINTR;
+
+ /*
+ * If the previous request is timedout or interrupted, and the
+ * Quote buf status is still in GET_QUOTE_IN_FLIGHT (owned by
+ * VMM), don't permit any new request.
+ */
+ if (quote_buf->status == GET_QUOTE_IN_FLIGHT) {
+ ret = -EBUSY;
+ goto done;
+ }
+
+ if (desc->inblob_len != TDX_REPORTDATA_LEN) {
+ ret = -EINVAL;
+ goto done;
+ }
+
+ reportdata = kmalloc(TDX_REPORTDATA_LEN, GFP_KERNEL);
+ if (!reportdata) {
+ ret = -ENOMEM;
+ goto done;
+ }
+
+ tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
+ if (!tdreport) {
+ ret = -ENOMEM;
+ goto done;
+ }
+
+ memcpy(reportdata, desc->inblob, desc->inblob_len);
+
+ /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
+ ret = tdx_mcall_get_report0(reportdata, tdreport);
+ if (ret) {
+ pr_err("GetReport call failed\n");
+ goto done;
+ }
+
+ memset(quote_data, 0, GET_QUOTE_BUF_SIZE);
+
+ /* Update Quote buffer header */
+ quote_buf->version = GET_QUOTE_CMD_VER;
+ quote_buf->in_len = TDX_REPORT_LEN;
+
+ memcpy(quote_buf->data, tdreport, TDX_REPORT_LEN);
+
+ err = tdx_hcall_get_quote(quote_data, GET_QUOTE_BUF_SIZE);
+ if (err) {
+ pr_err("GetQuote hypercall failed, status:%llx\n", err);
+ ret = -EIO;
+ goto done;
+ }
+
+ ret = wait_for_quote_completion(quote_buf, getquote_timeout);
+ if (ret) {
+ pr_err("GetQuote request timedout\n");
+ goto done;
+ }
+
+ buf = kvmemdup(quote_buf->data, quote_buf->out_len, GFP_KERNEL);
+ if (!buf) {
+ ret = -ENOMEM;
+ goto done;
+ }
+
+ report->outblob = buf;
+ report->outblob_len = quote_buf->out_len;
+
+ /*
+ * TODO: parse the PEM-formatted cert chain out of the quote buffer when
+ * provided
+ */
+done:
+ mutex_unlock("e_lock);
+ kfree(reportdata);
+ kfree(tdreport);
+
+ return ret;
+}
+
static long tdx_guest_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -82,17 +278,48 @@ static const struct x86_cpu_id tdx_guest_ids[] = {
};
MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
+static const struct tsm_ops tdx_tsm_ops = {
+ .name = KBUILD_MODNAME,
+ .report_new = tdx_report_new,
+};
+
static int __init tdx_guest_init(void)
{
+ int ret;
+
if (!x86_match_cpu(tdx_guest_ids))
return -ENODEV;
- return misc_register(&tdx_misc_dev);
+ ret = misc_register(&tdx_misc_dev);
+ if (ret)
+ return ret;
+
+ quote_data = alloc_quote_buf();
+ if (!quote_data) {
+ pr_err("Failed to allocate Quote buffer\n");
+ ret = -ENOMEM;
+ goto free_misc;
+ }
+
+ ret = tsm_register(&tdx_tsm_ops, NULL, NULL);
+ if (ret)
+ goto free_quote;
+
+ return 0;
+
+free_quote:
+ free_quote_buf(quote_data);
+free_misc:
+ misc_deregister(&tdx_misc_dev);
+
+ return ret;
}
module_init(tdx_guest_init);
static void __exit tdx_guest_exit(void)
{
+ tsm_unregister(&tdx_tsm_ops);
+ free_quote_buf(quote_data);
misc_deregister(&tdx_misc_dev);
}
module_exit(tdx_guest_exit);
In preparation for using the configs-tsm facility to convey attestation
blobs to userspace, switch to using the 'sockptr' api for copying
payloads to provided buffers where 'sockptr' handles user vs kernel
buffers.
While configfs-tsm is meant to replace existing confidential computing
ioctl() implementations for attestation report retrieval the old ioctl()
path needs to stick around for a deprecation period.
No behavior change intended.
Cc: Borislav Petkov <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Dionna Glaze <[email protected]>
Cc: Brijesh Singh <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/virt/coco/sev-guest/sev-guest.c | 50 ++++++++++++++++++++-----------
1 file changed, 33 insertions(+), 17 deletions(-)
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 97dbe715e96a..c3c9e9ea691f 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -19,6 +19,7 @@
#include <crypto/aead.h>
#include <linux/scatterlist.h>
#include <linux/psp-sev.h>
+#include <linux/sockptr.h>
#include <uapi/linux/sev-guest.h>
#include <uapi/linux/psp-sev.h>
@@ -470,7 +471,13 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
return 0;
}
-static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+struct snp_req_resp {
+ sockptr_t req_data;
+ sockptr_t resp_data;
+};
+
+static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg,
+ struct snp_req_resp *io)
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_report_resp *resp;
@@ -479,10 +486,10 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
lockdep_assert_held(&snp_cmd_mutex);
- if (!arg->req_data || !arg->resp_data)
+ if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
return -EINVAL;
- if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
return -EFAULT;
/*
@@ -501,7 +508,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
if (rc)
goto e_free;
- if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
rc = -EFAULT;
e_free:
@@ -550,22 +557,25 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
return rc;
}
-static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg,
+ struct snp_req_resp *io)
+
{
struct snp_guest_crypto *crypto = snp_dev->crypto;
struct snp_ext_report_req req;
struct snp_report_resp *resp;
int ret, npages = 0, resp_len;
+ sockptr_t certs_address;
lockdep_assert_held(&snp_cmd_mutex);
- if (!arg->req_data || !arg->resp_data)
+ if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
return -EINVAL;
- if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
return -EFAULT;
- /* userspace does not want certificate data */
+ /* caller does not want certificate data */
if (!req.certs_len || !req.certs_address)
goto cmd;
@@ -573,8 +583,13 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
!IS_ALIGNED(req.certs_len, PAGE_SIZE))
return -EINVAL;
- if (!access_ok((const void __user *)req.certs_address, req.certs_len))
- return -EFAULT;
+ if (sockptr_is_kernel(io->resp_data)) {
+ certs_address = KERNEL_SOCKPTR((void *)req.certs_address);
+ } else {
+ certs_address = USER_SOCKPTR((void __user *)req.certs_address);
+ if (!access_ok(certs_address.user, req.certs_len))
+ return -EFAULT;
+ }
/*
* Initialize the intermediate buffer with all zeros. This buffer
@@ -604,21 +619,19 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
- if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
+ if (copy_to_sockptr(io->req_data, &req, sizeof(req)))
ret = -EFAULT;
}
if (ret)
goto e_free;
- if (npages &&
- copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
- req.certs_len)) {
+ if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, req.certs_len)) {
ret = -EFAULT;
goto e_free;
}
- if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
ret = -EFAULT;
e_free:
@@ -631,6 +644,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
struct snp_guest_dev *snp_dev = to_snp_dev(file);
void __user *argp = (void __user *)arg;
struct snp_guest_request_ioctl input;
+ struct snp_req_resp io;
int ret = -ENOTTY;
if (copy_from_user(&input, argp, sizeof(input)))
@@ -651,15 +665,17 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
return -ENOTTY;
}
+ io.req_data = USER_SOCKPTR((void __user *)input.req_data);
+ io.resp_data = USER_SOCKPTR((void __user *)input.resp_data);
switch (ioctl) {
case SNP_GET_REPORT:
- ret = get_report(snp_dev, &input);
+ ret = get_report(snp_dev, &input, &io);
break;
case SNP_GET_DERIVED_KEY:
ret = get_derived_key(snp_dev, &input);
break;
case SNP_GET_EXT_REPORT:
- ret = get_ext_report(snp_dev, &input);
+ ret = get_ext_report(snp_dev, &input, &io);
break;
default:
break;
On 9/25/2023 9:17 PM, Dan Williams wrote:
> In preparation for using the configs-tsm facility to convey attestation
> blobs to userspace, switch to using the 'sockptr' api for copying
> payloads to provided buffers where 'sockptr' handles user vs kernel
> buffers.
>
> While configfs-tsm is meant to replace existing confidential computing
> ioctl() implementations for attestation report retrieval the old ioctl()
> path needs to stick around for a deprecation period.
>
> No behavior change intended.
>
> Cc: Borislav Petkov <[email protected]>
> Cc: Tom Lendacky <[email protected]>
> Cc: Dionna Glaze <[email protected]>
> Cc: Brijesh Singh <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> ---
Looks good to me.
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
> drivers/virt/coco/sev-guest/sev-guest.c | 50 ++++++++++++++++++++-----------
> 1 file changed, 33 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index 97dbe715e96a..c3c9e9ea691f 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -19,6 +19,7 @@
> #include <crypto/aead.h>
> #include <linux/scatterlist.h>
> #include <linux/psp-sev.h>
> +#include <linux/sockptr.h>
> #include <uapi/linux/sev-guest.h>
> #include <uapi/linux/psp-sev.h>
>
> @@ -470,7 +471,13 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> return 0;
> }
>
> -static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +struct snp_req_resp {
> + sockptr_t req_data;
> + sockptr_t resp_data;
> +};
> +
> +static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg,
> + struct snp_req_resp *io)
> {
> struct snp_guest_crypto *crypto = snp_dev->crypto;
> struct snp_report_resp *resp;
> @@ -479,10 +486,10 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
>
> lockdep_assert_held(&snp_cmd_mutex);
>
> - if (!arg->req_data || !arg->resp_data)
> + if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
> return -EINVAL;
>
> - if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> + if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
> return -EFAULT;
>
> /*
> @@ -501,7 +508,7 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
> if (rc)
> goto e_free;
>
> - if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> + if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
> rc = -EFAULT;
>
> e_free:
> @@ -550,22 +557,25 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> return rc;
> }
>
> -static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg,
> + struct snp_req_resp *io)
> +
> {
> struct snp_guest_crypto *crypto = snp_dev->crypto;
> struct snp_ext_report_req req;
> struct snp_report_resp *resp;
> int ret, npages = 0, resp_len;
> + sockptr_t certs_address;
>
> lockdep_assert_held(&snp_cmd_mutex);
>
> - if (!arg->req_data || !arg->resp_data)
> + if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data))
> return -EINVAL;
>
> - if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> + if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
> return -EFAULT;
>
> - /* userspace does not want certificate data */
> + /* caller does not want certificate data */
> if (!req.certs_len || !req.certs_address)
> goto cmd;
>
> @@ -573,8 +583,13 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> !IS_ALIGNED(req.certs_len, PAGE_SIZE))
> return -EINVAL;
>
> - if (!access_ok((const void __user *)req.certs_address, req.certs_len))
> - return -EFAULT;
> + if (sockptr_is_kernel(io->resp_data)) {
> + certs_address = KERNEL_SOCKPTR((void *)req.certs_address);
> + } else {
> + certs_address = USER_SOCKPTR((void __user *)req.certs_address);
> + if (!access_ok(certs_address.user, req.certs_len))
> + return -EFAULT;
> + }
>
> /*
> * Initialize the intermediate buffer with all zeros. This buffer
> @@ -604,21 +619,19 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
> req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
>
> - if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
> + if (copy_to_sockptr(io->req_data, &req, sizeof(req)))
> ret = -EFAULT;
> }
>
> if (ret)
> goto e_free;
>
> - if (npages &&
> - copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
> - req.certs_len)) {
> + if (npages && copy_to_sockptr(certs_address, snp_dev->certs_data, req.certs_len)) {
> ret = -EFAULT;
> goto e_free;
> }
>
> - if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> + if (copy_to_sockptr(io->resp_data, resp, sizeof(*resp)))
> ret = -EFAULT;
>
> e_free:
> @@ -631,6 +644,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> struct snp_guest_dev *snp_dev = to_snp_dev(file);
> void __user *argp = (void __user *)arg;
> struct snp_guest_request_ioctl input;
> + struct snp_req_resp io;
> int ret = -ENOTTY;
>
> if (copy_from_user(&input, argp, sizeof(input)))
> @@ -651,15 +665,17 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> return -ENOTTY;
> }
>
> + io.req_data = USER_SOCKPTR((void __user *)input.req_data);
> + io.resp_data = USER_SOCKPTR((void __user *)input.resp_data);
> switch (ioctl) {
> case SNP_GET_REPORT:
> - ret = get_report(snp_dev, &input);
> + ret = get_report(snp_dev, &input, &io);
> break;
> case SNP_GET_DERIVED_KEY:
> ret = get_derived_key(snp_dev, &input);
> break;
> case SNP_GET_EXT_REPORT:
> - ret = get_ext_report(snp_dev, &input);
> + ret = get_ext_report(snp_dev, &input, &io);
> break;
> default:
> break;
>
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Peter Gonda wrote:
> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> >
> > From: Kuppuswamy Sathyanarayanan <[email protected]>
> >
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest. The first step in the attestation process is TDREPORT
> > generation, which involves getting the guest measurement data in the
> > format of TDREPORT, which is further used to validate the authenticity
> > of the TDX guest. TDREPORT by design is integrity-protected and can
> > only be verified on the local machine.
> >
> > To support remote verification of the TDREPORT in a SGX-based
> > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > only run outside of the TDX guest (i.e. in a host process or in a
> > normal VM) and guest can use communication channels like vsock or
> > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > TDX guest may not support these communication channels. To handle such
> > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > to request the host VMM to communicate with the SGX QE. More details
> > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > Interface (GHCI) for Intel TDX 1.0, section titled
> > "TDG.VP.VMCALL<GetQuote>".
> >
> > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > Computing Guest platforms to get the measurement data via ConfigFS.
> > Extend the TSM framework and add support to allow an attestation agent
> > to get the TDX Quote data (included usage example below).
> >
> > report=/sys/kernel/config/tsm/report/report0
> > mkdir $report
> > dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > hexdump -C $report/outblob
> > rmdir $report
> >
> > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > with TDREPORT data as input, which is further used by the VMM to copy
> > the TD Quote result after successful Quote generation. To create the
> > shared buffer, allocate a large enough memory and mark it shared using
> > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > for GetQuote requests in the TDX TSM handler.
> >
> > Although this method reserves a fixed chunk of memory for GetQuote
> > requests, such one time allocation can help avoid memory fragmentation
> > related allocation failures later in the uptime of the guest.
> >
> > Since the Quote generation process is not time-critical or frequently
> > used, the current version uses a polling model for Quote requests and
> > it also does not support parallel GetQuote requests.
> >
> > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > Reviewed-by: Erdem Aktas <[email protected]>
> > Signed-off-by: Dan Williams <[email protected]>
>
> Hey Dan,
>
> I tried running your test commands on an SNP enabled guest. To build
> the kernel I just checked out linus/master and applied your series. I
> haven't done any debugging yet, so I will update with what I find.
Ok, not surprised just because I made major changes to the sev-guest interface
and did not have a way to test.
> root@Ubuntu2004:~# hexdump -C $report/outblob
> [ 219.871875] ------------[ cut here ]------------
> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> [ 219.882280] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 219.887628] CPU: 0 PID: 1317 Comm: hexdump Not tainted 6.6.0-rc3-xfstests-000
> 44-gf636850ddfc7 #1
> [ 219.896530] Hardware name: Google Google Compute Engine/Google Compute Engine
Might you be able to send me info offlist about how I can spin this up so I can
test without waiting for the community to try it out?
In any event, thanks for testing! Appreciate it.
On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
>
> From: Kuppuswamy Sathyanarayanan <[email protected]>
>
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest. The first step in the attestation process is TDREPORT
> generation, which involves getting the guest measurement data in the
> format of TDREPORT, which is further used to validate the authenticity
> of the TDX guest. TDREPORT by design is integrity-protected and can
> only be verified on the local machine.
>
> To support remote verification of the TDREPORT in a SGX-based
> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> only run outside of the TDX guest (i.e. in a host process or in a
> normal VM) and guest can use communication channels like vsock or
> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> TDX guest may not support these communication channels. To handle such
> cases, TDX defines a GetQuote hypercall which can be used by the guest
> to request the host VMM to communicate with the SGX QE. More details
> about GetQuote hypercall can be found in TDX Guest-Host Communication
> Interface (GHCI) for Intel TDX 1.0, section titled
> "TDG.VP.VMCALL<GetQuote>".
>
> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> Computing Guest platforms to get the measurement data via ConfigFS.
> Extend the TSM framework and add support to allow an attestation agent
> to get the TDX Quote data (included usage example below).
>
> report=/sys/kernel/config/tsm/report/report0
> mkdir $report
> dd if=/dev/urandom bs=64 count=1 > $report/inblob
> hexdump -C $report/outblob
> rmdir $report
>
> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> with TDREPORT data as input, which is further used by the VMM to copy
> the TD Quote result after successful Quote generation. To create the
> shared buffer, allocate a large enough memory and mark it shared using
> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> for GetQuote requests in the TDX TSM handler.
>
> Although this method reserves a fixed chunk of memory for GetQuote
> requests, such one time allocation can help avoid memory fragmentation
> related allocation failures later in the uptime of the guest.
>
> Since the Quote generation process is not time-critical or frequently
> used, the current version uses a polling model for Quote requests and
> it also does not support parallel GetQuote requests.
>
> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> Reviewed-by: Erdem Aktas <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
Hey Dan,
I tried running your test commands on an SNP enabled guest. To build
the kernel I just checked out linus/master and applied your series. I
haven't done any debugging yet, so I will update with what I find.
root@Ubuntu2004:~# hexdump -C $report/outblob
[ 219.871875] ------------[ cut here ]------------
[ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
[ 219.882280] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 219.887628] CPU: 0 PID: 1317 Comm: hexdump Not tainted 6.6.0-rc3-xfstests-000
44-gf636850ddfc7 #1
[ 219.896530] Hardware name: Google Google Compute Engine/Google Compute Engine
, BIOS Google 09/26/2023
[ 219.905859] RIP: 0010:enc_dec_message+0x4ed/0x570
[ 219.910673] Code: c7 c0 00 00 00 80 48 2b 05 b8 4f 99 00 e9 dc fd ff ff 0f 0b
bb f4 ff ff ff eb b5 0f 0b 0f 0b 0f 0b e8 e7 47 ce ff 89 c3 eb 94 <0f> 0b 0f 0b
0f 0b 0f 0b 48 8d 7c 24 38 e8 11 6b 23 00 8b 5c 24 58
[ 219.929547] RSP: 0018:ffffc90000e27a18 EFLAGS: 00010246
[ 219.934893] RAX: 0000000000000000 RBX: ffffc90000e27bf8 RCX: 0000000000081000
[ 219.942134] RDX: 0000000000000000 RSI: 0000000000080000 RDI: ffffc90080e27bf8
[ 219.949378] RBP: ffff8881018980a0 R08: 0000000000000000 R09: ffffc90000e27a78
[ 219.956621] R10: 0000000000026680 R11: 0000000000000008 R12: ffff888111a3c400
[ 219.963864] R13: ffff8881018980d0 R14: ffff8881003e7000 R15: 0000000000000060
[ 219.971106] FS: 00007fd7e75f5740(0000) GS:ffff888237c00000(0000) knlGS:00000
00000000000
[ 219.979303] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 219.985160] CR2: 00005563500e8808 CR3: 0008000111986001 CR4: 0000000000370ef0
[ 219.992401] Call Trace:
[ 219.994955] <TASK>
[ 219.997160] ? die+0x36/0x80
[ 220.000149] ? do_trap+0xf4/0x100
[ 220.003574] ? enc_dec_message+0x4ed/0x570
[ 220.007777] ? do_error_trap+0x65/0x80
[ 220.011632] ? enc_dec_message+0x4ed/0x570
[ 220.015842] ? exc_invalid_op+0x50/0x70
[ 220.019788] ? enc_dec_message+0x4ed/0x570
[ 220.023994] ? asm_exc_invalid_op+0x1a/0x20
[ 220.028288] ? enc_dec_message+0x4ed/0x570
[ 220.032505] ? enc_dec_message+0x16f/0x570
[ 220.036711] ? srso_alias_return_thunk+0x5/0x7f
[ 220.041352] ? srso_alias_return_thunk+0x5/0x7f
[ 220.045994] handle_guest_request+0xc6/0x330
[ 220.050375] get_ext_report+0x1e0/0x3d0
[ 220.054323] sev_report_new+0x159/0x460
[ 220.058267] tsm_report_read.part.0+0x96/0x120
[ 220.062818] configfs_bin_read_iter+0xe1/0x1e0
[ 220.067377] vfs_read+0x1db/0x310
[ 220.070813] ksys_read+0x6f/0xf0
[ 220.074152] do_syscall_64+0x3f/0x90
[ 220.077843] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 220.083007] RIP: 0033:0x7fd7e7705fd2
[ 220.086695] Code: c0 e9 c2 fe ff ff 50 48 8d 3d aa cb 0a 00 e8 d5 1a 02 00 0f
1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0
ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 220.105572] RSP: 002b:00007fff9bc8fc18 EFLAGS: 00000246 ORIG_RAX: 00000000000
00000
[ 220.113252] RAX: ffffffffffffffda RBX: 00007fd7e77e4980 RCX: 00007fd7e7705fd2
[ 220.120496] RDX: 0000000000001000 RSI: 00005563500e7800 RDI: 0000000000000000
[ 220.127742] RBP: 00007fd7e77e14a0 R08: 0000000000000000 R09: 000000000000007c
[ 220.134988] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[ 220.142232] R13: 00007fd7e77e08a0 R14: 0000000000000d68 R15: 0000000000000d68
[ 220.149479] </TASK>
[ 220.151865] ---[ end trace 0000000000000000 ]---
[ 220.156629] RIP: 0010:enc_dec_message+0x4ed/0x570
[ 220.161479] Code: c7 c0 00 00 00 80 48 2b 05 b8 4f 99 00 e9 dc fd ff ff 0f 0b
bb f4 ff ff ff eb b5 0f 0b 0f 0b 0f 0b e8 e7 47 ce ff 89 c3 eb 94 <0f> 0b 0f 0b
0f 0b 0f 0b 48 8d 7c 24 38 e8 11 6b 23 00 8b 5c 24 58
[ 220.180379] RSP: 0018:ffffc90000e27a18 EFLAGS: 00010246
[ 220.185743] RAX: 0000000000000000 RBX: ffffc90000e27bf8 RCX: 0000000000081000
[ 220.193012] RDX: 0000000000000000 RSI: 0000000000080000 RDI: ffffc90080e27bf8
[ 220.200280] RBP: ffff8881018980a0 R08: 0000000000000000 R09: ffffc90000e27a78
[ 220.207551] R10: 0000000000026680 R11: 0000000000000008 R12: ffff888111a3c400
[ 220.214822] R13: ffff8881018980d0 R14: ffff8881003e7000 R15: 0000000000000060
[ 220.222094] FS: 00007fd7e75f5740(0000) GS:ffff888237c00000(0000) knlGS:00000
00000000000
[ 220.230329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 220.236210] CR2: 00005563500e8808 CR3: 0008000111986001 CR4: 0000000000370ef0
Peter Gonda wrote:
> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> >
> > From: Kuppuswamy Sathyanarayanan <[email protected]>
> >
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest. The first step in the attestation process is TDREPORT
> > generation, which involves getting the guest measurement data in the
> > format of TDREPORT, which is further used to validate the authenticity
> > of the TDX guest. TDREPORT by design is integrity-protected and can
> > only be verified on the local machine.
> >
> > To support remote verification of the TDREPORT in a SGX-based
> > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > only run outside of the TDX guest (i.e. in a host process or in a
> > normal VM) and guest can use communication channels like vsock or
> > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > TDX guest may not support these communication channels. To handle such
> > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > to request the host VMM to communicate with the SGX QE. More details
> > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > Interface (GHCI) for Intel TDX 1.0, section titled
> > "TDG.VP.VMCALL<GetQuote>".
> >
> > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > Computing Guest platforms to get the measurement data via ConfigFS.
> > Extend the TSM framework and add support to allow an attestation agent
> > to get the TDX Quote data (included usage example below).
> >
> > report=/sys/kernel/config/tsm/report/report0
> > mkdir $report
> > dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > hexdump -C $report/outblob
> > rmdir $report
> >
> > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > with TDREPORT data as input, which is further used by the VMM to copy
> > the TD Quote result after successful Quote generation. To create the
> > shared buffer, allocate a large enough memory and mark it shared using
> > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > for GetQuote requests in the TDX TSM handler.
> >
> > Although this method reserves a fixed chunk of memory for GetQuote
> > requests, such one time allocation can help avoid memory fragmentation
> > related allocation failures later in the uptime of the guest.
> >
> > Since the Quote generation process is not time-critical or frequently
> > used, the current version uses a polling model for Quote requests and
> > it also does not support parallel GetQuote requests.
> >
> > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > Reviewed-by: Erdem Aktas <[email protected]>
> > Signed-off-by: Dan Williams <[email protected]>
>
> Hey Dan,
>
> I tried running your test commands on an SNP enabled guest. To build
> the kernel I just checked out linus/master and applied your series. I
> haven't done any debugging yet, so I will update with what I find.
>
> root@Ubuntu2004:~# hexdump -C $report/outblob
> [ 219.871875] ------------[ cut here ]------------
> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
Ok, it does not like virtual address of one of the buffers, but my
changes "should" not have affected that as get_ext_report() internally
uses snp_dev->certs_data and snp_dev->response for bounce buffering the
actual request / response memory. First test I want to try once I can
get on an SNP system is compare this to the ioctl path just make sure
that succeeds.
On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <[email protected]> wrote:
>
> Peter Gonda wrote:
> > On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> > >
> > > From: Kuppuswamy Sathyanarayanan <[email protected]>
> > >
> > > In TDX guest, the attestation process is used to verify the TDX guest
> > > trustworthiness to other entities before provisioning secrets to the
> > > guest. The first step in the attestation process is TDREPORT
> > > generation, which involves getting the guest measurement data in the
> > > format of TDREPORT, which is further used to validate the authenticity
> > > of the TDX guest. TDREPORT by design is integrity-protected and can
> > > only be verified on the local machine.
> > >
> > > To support remote verification of the TDREPORT in a SGX-based
> > > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > > only run outside of the TDX guest (i.e. in a host process or in a
> > > normal VM) and guest can use communication channels like vsock or
> > > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > > TDX guest may not support these communication channels. To handle such
> > > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > > to request the host VMM to communicate with the SGX QE. More details
> > > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > > Interface (GHCI) for Intel TDX 1.0, section titled
> > > "TDG.VP.VMCALL<GetQuote>".
> > >
> > > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > > Computing Guest platforms to get the measurement data via ConfigFS.
> > > Extend the TSM framework and add support to allow an attestation agent
> > > to get the TDX Quote data (included usage example below).
> > >
> > > report=/sys/kernel/config/tsm/report/report0
> > > mkdir $report
> > > dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > > hexdump -C $report/outblob
> > > rmdir $report
> > >
> > > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > > with TDREPORT data as input, which is further used by the VMM to copy
> > > the TD Quote result after successful Quote generation. To create the
> > > shared buffer, allocate a large enough memory and mark it shared using
> > > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > > for GetQuote requests in the TDX TSM handler.
> > >
> > > Although this method reserves a fixed chunk of memory for GetQuote
> > > requests, such one time allocation can help avoid memory fragmentation
> > > related allocation failures later in the uptime of the guest.
> > >
> > > Since the Quote generation process is not time-critical or frequently
> > > used, the current version uses a polling model for Quote requests and
> > > it also does not support parallel GetQuote requests.
> > >
> > > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > > Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > > Reviewed-by: Erdem Aktas <[email protected]>
> > > Signed-off-by: Dan Williams <[email protected]>
> >
> > Hey Dan,
> >
> > I tried running your test commands on an SNP enabled guest. To build
> > the kernel I just checked out linus/master and applied your series. I
> > haven't done any debugging yet, so I will update with what I find.
> >
> > root@Ubuntu2004:~# hexdump -C $report/outblob
> > [ 219.871875] ------------[ cut here ]------------
> > [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
>
> Ok, it does not like virtual address of one of the buffers, but my
> changes "should" not have affected that as get_ext_report() internally
> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> actual request / response memory. First test I want to try once I can
> get on an SNP system is compare this to the ioctl path just make sure
> that succeeds.
I tried the IOCTL with our attestation test and it seems to be working
fine. I was hoping to debug further next week.
I also mailed you offlist to discuss getting access to SNP for testing.
On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <[email protected]> wrote:
>
> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <[email protected]> wrote:
> >
> > Peter Gonda wrote:
> > > On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> > > >
> > > > From: Kuppuswamy Sathyanarayanan <[email protected]>
> > > >
> > > > In TDX guest, the attestation process is used to verify the TDX guest
> > > > trustworthiness to other entities before provisioning secrets to the
> > > > guest. The first step in the attestation process is TDREPORT
> > > > generation, which involves getting the guest measurement data in the
> > > > format of TDREPORT, which is further used to validate the authenticity
> > > > of the TDX guest. TDREPORT by design is integrity-protected and can
> > > > only be verified on the local machine.
> > > >
> > > > To support remote verification of the TDREPORT in a SGX-based
> > > > attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > > > (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > > > only run outside of the TDX guest (i.e. in a host process or in a
> > > > normal VM) and guest can use communication channels like vsock or
> > > > TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > > > TDX guest may not support these communication channels. To handle such
> > > > cases, TDX defines a GetQuote hypercall which can be used by the guest
> > > > to request the host VMM to communicate with the SGX QE. More details
> > > > about GetQuote hypercall can be found in TDX Guest-Host Communication
> > > > Interface (GHCI) for Intel TDX 1.0, section titled
> > > > "TDG.VP.VMCALL<GetQuote>".
> > > >
> > > > Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > > > Computing Guest platforms to get the measurement data via ConfigFS.
> > > > Extend the TSM framework and add support to allow an attestation agent
> > > > to get the TDX Quote data (included usage example below).
> > > >
> > > > report=/sys/kernel/config/tsm/report/report0
> > > > mkdir $report
> > > > dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > > > hexdump -C $report/outblob
> > > > rmdir $report
> > > >
> > > > GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > > > with TDREPORT data as input, which is further used by the VMM to copy
> > > > the TD Quote result after successful Quote generation. To create the
> > > > shared buffer, allocate a large enough memory and mark it shared using
> > > > set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > > > for GetQuote requests in the TDX TSM handler.
> > > >
> > > > Although this method reserves a fixed chunk of memory for GetQuote
> > > > requests, such one time allocation can help avoid memory fragmentation
> > > > related allocation failures later in the uptime of the guest.
> > > >
> > > > Since the Quote generation process is not time-critical or frequently
> > > > used, the current version uses a polling model for Quote requests and
> > > > it also does not support parallel GetQuote requests.
> > > >
> > > > Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > > > Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > > > Reviewed-by: Erdem Aktas <[email protected]>
> > > > Signed-off-by: Dan Williams <[email protected]>
> > >
> > > Hey Dan,
> > >
> > > I tried running your test commands on an SNP enabled guest. To build
> > > the kernel I just checked out linus/master and applied your series. I
> > > haven't done any debugging yet, so I will update with what I find.
> > >
> > > root@Ubuntu2004:~# hexdump -C $report/outblob
> > > [ 219.871875] ------------[ cut here ]------------
> > > [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >
> > Ok, it does not like virtual address of one of the buffers, but my
> > changes "should" not have affected that as get_ext_report() internally
> > uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > actual request / response memory. First test I want to try once I can
> > get on an SNP system is compare this to the ioctl path just make sure
> > that succeeds.
>
I think there may be an issue with CONFIG_DEBUG_SG. That was the
warning we were getting in my above stack trace:
> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
This was for this line in enc_dec_message():
sg_set_buf(&src[1], src_buf, hdr->msg_sz);
I am not sure why in sg_set_buf() virt_addr_valid() returns false for
the address given in the sev_report_new() which is from the variable
'ext_req' which is stack allocated?
static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
unsigned int buflen)
{
#ifdef CONFIG_DEBUG_SG
BUG_ON(!virt_addr_valid(buf));
#endif
sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
}
When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
well at least it doesn't crash the guest. I haven't checked if the
report is valid yet.
Hi,
On 10/3/2023 11:37 AM, Peter Gonda wrote:
> On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <[email protected]> wrote:
>>
>> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <[email protected]> wrote:
>>>
>>> Peter Gonda wrote:
>>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
>>>>>
>>>>> From: Kuppuswamy Sathyanarayanan <[email protected]>
>>>>>
>>>>> In TDX guest, the attestation process is used to verify the TDX guest
>>>>> trustworthiness to other entities before provisioning secrets to the
>>>>> guest. The first step in the attestation process is TDREPORT
>>>>> generation, which involves getting the guest measurement data in the
>>>>> format of TDREPORT, which is further used to validate the authenticity
>>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
>>>>> only be verified on the local machine.
>>>>>
>>>>> To support remote verification of the TDREPORT in a SGX-based
>>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
>>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
>>>>> only run outside of the TDX guest (i.e. in a host process or in a
>>>>> normal VM) and guest can use communication channels like vsock or
>>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
>>>>> TDX guest may not support these communication channels. To handle such
>>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
>>>>> to request the host VMM to communicate with the SGX QE. More details
>>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
>>>>> Interface (GHCI) for Intel TDX 1.0, section titled
>>>>> "TDG.VP.VMCALL<GetQuote>".
>>>>>
>>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
>>>>> Computing Guest platforms to get the measurement data via ConfigFS.
>>>>> Extend the TSM framework and add support to allow an attestation agent
>>>>> to get the TDX Quote data (included usage example below).
>>>>>
>>>>> report=/sys/kernel/config/tsm/report/report0
>>>>> mkdir $report
>>>>> dd if=/dev/urandom bs=64 count=1 > $report/inblob
>>>>> hexdump -C $report/outblob
>>>>> rmdir $report
>>>>>
>>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
>>>>> with TDREPORT data as input, which is further used by the VMM to copy
>>>>> the TD Quote result after successful Quote generation. To create the
>>>>> shared buffer, allocate a large enough memory and mark it shared using
>>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
>>>>> for GetQuote requests in the TDX TSM handler.
>>>>>
>>>>> Although this method reserves a fixed chunk of memory for GetQuote
>>>>> requests, such one time allocation can help avoid memory fragmentation
>>>>> related allocation failures later in the uptime of the guest.
>>>>>
>>>>> Since the Quote generation process is not time-critical or frequently
>>>>> used, the current version uses a polling model for Quote requests and
>>>>> it also does not support parallel GetQuote requests.
>>>>>
>>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
>>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
>>>>> Reviewed-by: Erdem Aktas <[email protected]>
>>>>> Signed-off-by: Dan Williams <[email protected]>
>>>>
>>>> Hey Dan,
>>>>
>>>> I tried running your test commands on an SNP enabled guest. To build
>>>> the kernel I just checked out linus/master and applied your series. I
>>>> haven't done any debugging yet, so I will update with what I find.
>>>>
>>>> root@Ubuntu2004:~# hexdump -C $report/outblob
>>>> [ 219.871875] ------------[ cut here ]------------
>>>> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
>>>
>>> Ok, it does not like virtual address of one of the buffers, but my
>>> changes "should" not have affected that as get_ext_report() internally
>>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
>>> actual request / response memory. First test I want to try once I can
>>> get on an SNP system is compare this to the ioctl path just make sure
>>> that succeeds.
>>
>
> I think there may be an issue with CONFIG_DEBUG_SG. That was the
> warning we were getting in my above stack trace:
>
>> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
>
> This was for this line in enc_dec_message():
>
> sg_set_buf(&src[1], src_buf, hdr->msg_sz);
>
> I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> the address given in the sev_report_new() which is from the variable
> 'ext_req' which is stack allocated?
>
> static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> unsigned int buflen)
> {
> #ifdef CONFIG_DEBUG_SG
> BUG_ON(!virt_addr_valid(buf));
> #endif
> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> }
>
> When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> well at least it doesn't crash the guest. I haven't checked if the
> report is valid yet.
>
Dan, do you think it is related to not allocating direct mapped memory (using
kvalloc)?
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
<[email protected]> wrote:
>
> Hi,
>
> On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <[email protected]> wrote:
> >>
> >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <[email protected]> wrote:
> >>>
> >>> Peter Gonda wrote:
> >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> >>>>>
> >>>>> From: Kuppuswamy Sathyanarayanan <[email protected]>
> >>>>>
> >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> >>>>> trustworthiness to other entities before provisioning secrets to the
> >>>>> guest. The first step in the attestation process is TDREPORT
> >>>>> generation, which involves getting the guest measurement data in the
> >>>>> format of TDREPORT, which is further used to validate the authenticity
> >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> >>>>> only be verified on the local machine.
> >>>>>
> >>>>> To support remote verification of the TDREPORT in a SGX-based
> >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> >>>>> normal VM) and guest can use communication channels like vsock or
> >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> >>>>> TDX guest may not support these communication channels. To handle such
> >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> >>>>> to request the host VMM to communicate with the SGX QE. More details
> >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> >>>>> "TDG.VP.VMCALL<GetQuote>".
> >>>>>
> >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> >>>>> Extend the TSM framework and add support to allow an attestation agent
> >>>>> to get the TDX Quote data (included usage example below).
> >>>>>
> >>>>> report=/sys/kernel/config/tsm/report/report0
> >>>>> mkdir $report
> >>>>> dd if=/dev/urandom bs=64 count=1 > $report/inblob
> >>>>> hexdump -C $report/outblob
> >>>>> rmdir $report
> >>>>>
> >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> >>>>> the TD Quote result after successful Quote generation. To create the
> >>>>> shared buffer, allocate a large enough memory and mark it shared using
> >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> >>>>> for GetQuote requests in the TDX TSM handler.
> >>>>>
> >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> >>>>> requests, such one time allocation can help avoid memory fragmentation
> >>>>> related allocation failures later in the uptime of the guest.
> >>>>>
> >>>>> Since the Quote generation process is not time-critical or frequently
> >>>>> used, the current version uses a polling model for Quote requests and
> >>>>> it also does not support parallel GetQuote requests.
> >>>>>
> >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> >>>>> Reviewed-by: Erdem Aktas <[email protected]>
> >>>>> Signed-off-by: Dan Williams <[email protected]>
> >>>>
> >>>> Hey Dan,
> >>>>
> >>>> I tried running your test commands on an SNP enabled guest. To build
> >>>> the kernel I just checked out linus/master and applied your series. I
> >>>> haven't done any debugging yet, so I will update with what I find.
> >>>>
> >>>> root@Ubuntu2004:~# hexdump -C $report/outblob
> >>>> [ 219.871875] ------------[ cut here ]------------
> >>>> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >>>
> >>> Ok, it does not like virtual address of one of the buffers, but my
> >>> changes "should" not have affected that as get_ext_report() internally
> >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> >>> actual request / response memory. First test I want to try once I can
> >>> get on an SNP system is compare this to the ioctl path just make sure
> >>> that succeeds.
> >>
> >
> > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > warning we were getting in my above stack trace:
> >
> >> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> >
> > This was for this line in enc_dec_message():
> >
> > sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> >
> > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > the address given in the sev_report_new() which is from the variable
> > 'ext_req' which is stack allocated?
> >
> > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> > unsigned int buflen)
> > {
> > #ifdef CONFIG_DEBUG_SG
> > BUG_ON(!virt_addr_valid(buf));
> > #endif
> > sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > }
> >
> > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > well at least it doesn't crash the guest. I haven't checked if the
> > report is valid yet.
> >
>
> Dan, do you think it is related to not allocating direct mapped memory (using
> kvalloc)?
But I think the issue is the stack allocated variable 'ext_req' here:
sev_report_new()
+ void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ guard(mutex)(&snp_cmd_mutex);
+ certs_address = buf + report_size;
+ struct snp_ext_report_req ext_req = {
+ .data = { .vmpl = desc->privlevel },
+ .certs_address = (__u64)certs_address,
+ .certs_len = ext_size,
+ };
+ memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
Peter Gonda wrote:
> On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> <[email protected]> wrote:
> >
> > Hi,
> >
> > On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <[email protected]> wrote:
> > >>
> > >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <[email protected]> wrote:
> > >>>
> > >>> Peter Gonda wrote:
> > >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <[email protected]> wrote:
> > >>>>>
> > >>>>> From: Kuppuswamy Sathyanarayanan <[email protected]>
> > >>>>>
> > >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> > >>>>> trustworthiness to other entities before provisioning secrets to the
> > >>>>> guest. The first step in the attestation process is TDREPORT
> > >>>>> generation, which involves getting the guest measurement data in the
> > >>>>> format of TDREPORT, which is further used to validate the authenticity
> > >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> > >>>>> only be verified on the local machine.
> > >>>>>
> > >>>>> To support remote verification of the TDREPORT in a SGX-based
> > >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> > >>>>> normal VM) and guest can use communication channels like vsock or
> > >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > >>>>> TDX guest may not support these communication channels. To handle such
> > >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> > >>>>> to request the host VMM to communicate with the SGX QE. More details
> > >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> > >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> > >>>>> "TDG.VP.VMCALL<GetQuote>".
> > >>>>>
> > >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> > >>>>> Extend the TSM framework and add support to allow an attestation agent
> > >>>>> to get the TDX Quote data (included usage example below).
> > >>>>>
> > >>>>> report=/sys/kernel/config/tsm/report/report0
> > >>>>> mkdir $report
> > >>>>> dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > >>>>> hexdump -C $report/outblob
> > >>>>> rmdir $report
> > >>>>>
> > >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> > >>>>> the TD Quote result after successful Quote generation. To create the
> > >>>>> shared buffer, allocate a large enough memory and mark it shared using
> > >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > >>>>> for GetQuote requests in the TDX TSM handler.
> > >>>>>
> > >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> > >>>>> requests, such one time allocation can help avoid memory fragmentation
> > >>>>> related allocation failures later in the uptime of the guest.
> > >>>>>
> > >>>>> Since the Quote generation process is not time-critical or frequently
> > >>>>> used, the current version uses a polling model for Quote requests and
> > >>>>> it also does not support parallel GetQuote requests.
> > >>>>>
> > >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@dwillia2-xfh.jf.intel.com/ [1]
> > >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
> > >>>>> Reviewed-by: Erdem Aktas <[email protected]>
> > >>>>> Signed-off-by: Dan Williams <[email protected]>
> > >>>>
> > >>>> Hey Dan,
> > >>>>
> > >>>> I tried running your test commands on an SNP enabled guest. To build
> > >>>> the kernel I just checked out linus/master and applied your series. I
> > >>>> haven't done any debugging yet, so I will update with what I find.
> > >>>>
> > >>>> root@Ubuntu2004:~# hexdump -C $report/outblob
> > >>>> [ 219.871875] ------------[ cut here ]------------
> > >>>> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >>>
> > >>> Ok, it does not like virtual address of one of the buffers, but my
> > >>> changes "should" not have affected that as get_ext_report() internally
> > >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > >>> actual request / response memory. First test I want to try once I can
> > >>> get on an SNP system is compare this to the ioctl path just make sure
> > >>> that succeeds.
> > >>
> > >
> > > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > > warning we were getting in my above stack trace:
> > >
> > >> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >
> > > This was for this line in enc_dec_message():
> > >
> > > sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> > >
> > > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > > the address given in the sev_report_new() which is from the variable
> > > 'ext_req' which is stack allocated?
> > >
> > > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> > > unsigned int buflen)
> > > {
> > > #ifdef CONFIG_DEBUG_SG
> > > BUG_ON(!virt_addr_valid(buf));
> > > #endif
> > > sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > > }
> > >
> > > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > > well at least it doesn't crash the guest. I haven't checked if the
> > > report is valid yet.
> > >
> >
> > Dan, do you think it is related to not allocating direct mapped memory (using
> > kvalloc)?
>
> But I think the issue is the stack allocated variable 'ext_req' here:
>
> sev_report_new()
> + void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + guard(mutex)(&snp_cmd_mutex);
> + certs_address = buf + report_size;
> + struct snp_ext_report_req ext_req = {
> + .data = { .vmpl = desc->privlevel },
> + .certs_address = (__u64)certs_address,
> + .certs_len = ext_size,
> + };
> + memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);
If the failure is coming from:
sg_set_buf(&src[1], src_buf, hdr->msg_sz);
...then that is always coming from the stack as get_ext_report()
internally copies either from the user ioctl() address or the kernel
stack into the local stack copy in both cases:
get_ext_report(...)
...
struct snp_ext_report_req req;
...
if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
return -EFAULT;
...
ret = handle_guest_request(..., &req.data, ...);
...where that "&req.data" always becomes the @src_buf argument to
enc_dec_message(). So while I do understand why sg_set_buf() is
complaining, I don't understand why it is not *always* complaining,
regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.
I will be able to dig deeper once I can test on hardware, but I am
thinking that the entire scheme to pass the source buffer on the kernel
stack is broken and is only happening to work because there are no
crypto-accelerators attached that require that the virtual addresses be
virt_addr_valid() for a later dma_map_sg() event.
...or my eyes are overlooking how the ioctl() path is succeeding.
Dan Williams wrote:
> Peter Gonda wrote:
> > On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> > <[email protected]> wrote:
[..]
> > > Dan, do you think it is related to not allocating direct mapped memory (using
> > > kvalloc)?
> >
> > But I think the issue is the stack allocated variable 'ext_req' here:
> >
> > sev_report_new()
> > + void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + guard(mutex)(&snp_cmd_mutex);
> > + certs_address = buf + report_size;
> > + struct snp_ext_report_req ext_req = {
> > + .data = { .vmpl = desc->privlevel },
> > + .certs_address = (__u64)certs_address,
> > + .certs_len = ext_size,
> > + };
> > + memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);
>
> If the failure is coming from:
>
> sg_set_buf(&src[1], src_buf, hdr->msg_sz);
>
> ...then that is always coming from the stack as get_ext_report()
> internally copies either from the user ioctl() address or the kernel
> stack into the local stack copy in both cases:
>
> get_ext_report(...)
> ...
> struct snp_ext_report_req req;
> ...
> if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
> return -EFAULT;
> ...
> ret = handle_guest_request(..., &req.data, ...);
>
> ...where that "&req.data" always becomes the @src_buf argument to
> enc_dec_message(). So while I do understand why sg_set_buf() is
> complaining, I don't understand why it is not *always* complaining,
> regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.
>
> I will be able to dig deeper once I can test on hardware, but I am
> thinking that the entire scheme to pass the source buffer on the kernel
> stack is broken and is only happening to work because there are no
> crypto-accelerators attached that require that the virtual addresses be
> virt_addr_valid() for a later dma_map_sg() event.
>
> ...or my eyes are overlooking how the ioctl() path is succeeding.
Confirmed, I can make DEBUG_SG equally unhappy via the ioctl() path.
Note I changed the BUG_ON() to WARN_ON() in this kernel to keep the
system alive over reproductions:
WARNING: CPU: 0 PID: 1175 at include/linux/scatterlist.h:187 enc_dec_message+0x518/0x5b0 [sev_guest]
[..]
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
RIP: 0010:enc_dec_message+0x518/0x5b0 [sev_guest]
Call Trace:
<TASK>
[..]
handle_guest_request+0x135/0x520 [sev_guest]
get_ext_report+0x1ec/0x3e0 [sev_guest]
snp_guest_ioctl+0x157/0x200 [sev_guest]
So the required fix here will address both cases.
I will note that on this instance no certificate data is being returned,
so I can't test that path, but at least the report retrieval will be
tested for the next posting of this series.