2022-11-04 23:05:09

by Dionna Amalie Glaze

Subject: [PATCH v8 0/4] Add throttling detection to sev-guest

The guest request synchronous API from SEV-SNP VMs to the host's security
processor consumes a global resource. For this reason, AMD's docs
recommend that the host implements a throttling mechanism. In order for
the guest to know it's been throttled and should try its request again,
we need some good-faith communication from the host that the request
has been throttled.

These patches work with the existing /dev/sev-guest ABI to detect a
throttling code.

Changes from v7:
* Replaced handle_guest_request arguments msg_ver and fw_err with a
pointer to the snp_guest_request_ioctl argument struct.
Changes from v6:
* Rebased on the IV reuse fix patch
* renamed rate_hz to rate_s and fixed its MODULE_PARM_DESC to use the
correct variable name.
* Changed sleep_timeout_interrutible (not defined) to
schedule_timeout_interruptible.
Changes from v5:
* Fixed commit prefix text
* Added all get_maintainers.pl folks to commits' Cc tags
* Changed SEV_RET_NO_FW_CALL commit's metadata to show pgonda signs
off and is the author.
Changes from v4:
* Clarified comment on SEV_RET_NO_FW_CALL
* Changed ratelimit loop to use sleep_timeout_interruptible
Changes from v3:
* sev-guest rate-limits itself to two requests per second.
* Fixed a type signature to use u64 instead of unsigned int
* Set *exitinfo2 unconditionally after the ghcb_hv_call.
Changes from v2:
* Codified the non-firmware-call firmware error code as (u32)-1.
* Changed sev_issue_guest_request unsigned long *fw_err argument to
u64 *exitinfo2 to more accurately and type-safely describe the
value that it outputs.
* Changed sev_issue_guest_request to always set its exitinfo2
argument to either the non-firmware-call error code, the
EXIT_INFO_2 returned from the VMM if the request failed, or 0 on
success. This fixes a bug that returned uninitialized kernel stack
memory to the user when there is no error.
* Changed the throttle behavior to retry in the driver instead of
returning -EAGAIN, due to possible message sequence number reuse
on different message contents.

Changes from v1:
* Changed throttle error code to 2

Cc: Tom Lendacky <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Gonda <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>

Signed-off-by: Dionna Glaze <[email protected]>

Dionna Glaze (3):
x86/sev: Change snp_guest_issue_request's fw_err
virt: sev-guest: Remove err in handle_guest_request
virt: sev-guest: interpret VMM errors from guest request

Peter Gonda (1):
crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

arch/x86/include/asm/sev.h | 4 +-
arch/x86/kernel/sev.c | 10 ++--
drivers/crypto/ccp/sev-dev.c | 2 +-
drivers/virt/coco/sev-guest/sev-guest.c | 76 +++++++++++++++++--------
include/uapi/linux/psp-sev.h | 7 +++
include/uapi/linux/sev-guest.h | 18 +++++-
6 files changed, 85 insertions(+), 32 deletions(-)

--
2.38.1.431.g37b22c650d-goog



2022-11-04 23:05:10

by Dionna Amalie Glaze

Subject: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

From: Peter Gonda <[email protected]>

The PSP can return a "firmware error" code of -1 in circumstances where
the PSP is not actually called. To make this protocol unambiguous, we
add a constant naming the return value.

Cc: Thomas Lendacky <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andy Lutomirsky <[email protected]>
Cc: John Allen <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: "David S. Miller" <[email protected]>

Signed-off-by: Peter Gonda <[email protected]>
Signed-off-by: Dionna Glaze <[email protected]>
---
drivers/crypto/ccp/sev-dev.c | 2 +-
include/uapi/linux/psp-sev.h | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 06fc7156c04f..97eb3544ab36 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -444,7 +444,7 @@ static int __sev_platform_init_locked(int *error)
{
struct psp_device *psp = psp_master;
struct sev_device *sev;
- int rc = 0, psp_ret = -1;
+ int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
int (*init_function)(int *error);

if (!psp || !psp->sev_data)
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 91b4c63d5cbf..1ad7f0a7e328 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -36,6 +36,13 @@ enum {
* SEV Firmware status code
*/
typedef enum {
+ /*
+ * This error code is not in the SEV spec but is added to convey that
+ * there was an error that prevented the SEV Firmware from being called.
+ * This is (u32)-1 since the firmware error code is represented as a
+ * 32-bit integer.
+ */
+ SEV_RET_NO_FW_CALL = 0xffffffff,
SEV_RET_SUCCESS = 0,
SEV_RET_INVALID_PLATFORM_STATE,
SEV_RET_INVALID_GUEST_STATE,
--
2.38.1.431.g37b22c650d-goog
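
For readers following the thread, here is a minimal user-space sketch of why the old `-1` initializer and the new named constant agree once the status is viewed as a 32-bit value. The `DEMO_`/`demo_` names are hypothetical stand-ins defined only for this illustration; they are not part of the patch or the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SEV_RET_NO_FW_CALL so this builds outside the kernel. */
#define DEMO_SEV_RET_NO_FW_CALL 0xffffffffu

/* Returns 1 when the 32-bit status says the firmware was never invoked. */
static int demo_fw_not_called(uint32_t fw_status)
{
	return fw_status == DEMO_SEV_RET_NO_FW_CALL;
}

/* Mirrors the old spelling: an int initialized to -1, then viewed as a u32. */
static int demo_legacy_matches(void)
{
	int psp_ret = -1;

	return demo_fw_not_called((uint32_t)psp_ret);
}
```

Converting `-1` to an unsigned 32-bit type yields 0xffffffff by the C conversion rules, so callers that compared against `-1` and callers that compare against the named constant see the same value.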


2022-11-04 23:06:17

by Dionna Amalie Glaze

Subject: [PATCH v8 3/4] virt: sev-guest: Remove err in handle_guest_request

The err variable may not be set in the call to snp_issue_guest_request,
yet it is unconditionally written back to fw_err if fw_err is non-null.
This is undefined behavior, and currently returns uninitialized kernel
stack memory to user space.

The fw_err argument is better to just pass through to
snp_issue_guest_request, so we do that by passing along the ioctl
argument. This removes the need for an argument to handle_guest_request.

Cc: Tom Lendacky <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Gonda <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Haowen Bai <[email protected]>
Cc: Liam Merwick <[email protected]>
Cc: Yang Yingliang <[email protected]>

Fixes: fce96cf04430 ("virt: Add SEV-SNP guest driver")
Signed-off-by: Dionna Glaze <[email protected]>
---
drivers/virt/coco/sev-guest/sev-guest.c | 37 ++++++++++++-------------
1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index d08ff87c2dac..5615d349b378 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -320,11 +320,11 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
return __enc_payload(snp_dev, req, payload, sz);
}

-static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
+ struct snp_guest_request_ioctl *arg,
u8 type, void *req_buf, size_t req_sz, void *resp_buf,
- u32 resp_sz, __u64 *fw_err)
+ u32 resp_sz)
{
- unsigned long err;
u64 seqno;
int rc;

@@ -336,12 +336,14 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));

/* Encrypt the userspace provided payload */
- rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
+ rc = enc_payload(snp_dev, seqno, arg->msg_version, type, req_buf,
+ req_sz);
if (rc)
return rc;

/* Call firmware to process the request */
- rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ rc = snp_issue_guest_request(exit_code, &snp_dev->input,
+ &arg->fw_err);

/*
* If the extended guest request fails due to having to small of a
@@ -349,23 +351,21 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
* extended data request.
*/
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
- err == SNP_GUEST_REQ_INVALID_LEN) {
+ arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
const unsigned int certs_npages = snp_dev->input.data_npages;

exit_code = SVM_VMGEXIT_GUEST_REQUEST;
- rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ rc = snp_issue_guest_request(exit_code, &snp_dev->input,
+ &arg->fw_err);

- err = SNP_GUEST_REQ_INVALID_LEN;
+ arg->fw_err = SNP_GUEST_REQ_INVALID_LEN;
snp_dev->input.data_npages = certs_npages;
}

- if (fw_err)
- *fw_err = err;
-
if (rc) {
dev_alert(snp_dev->dev,
"Detected error from ASP request. rc: %d, fw_err: %llu\n",
- rc, *fw_err);
+ rc, arg->fw_err);
goto disable_vmpck;
}

@@ -412,9 +412,9 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
if (!resp)
return -ENOMEM;

- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
- resp_len, &arg->fw_err);
+ resp_len);
if (rc)
goto e_free;

@@ -452,9 +452,8 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
return -EFAULT;

- rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
- SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
- &arg->fw_err);
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
+ SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len);
if (rc)
return rc;

@@ -514,9 +513,9 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
return -ENOMEM;

snp_dev->input.data_npages = npages;
- ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
+ ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg,
SNP_MSG_REPORT_REQ, &req.data,
- sizeof(req.data), resp->data, resp_len, &arg->fw_err);
+ sizeof(req.data), resp->data, resp_len);

/* If certs length is invalid then copy the returned length */
if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
--
2.38.1.431.g37b22c650d-goog
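
The pattern this patch moves to can be sketched in user space as follows. This is an illustration under assumptions, not the driver code: `demo_issue_request`, `demo_handle_request`, `struct demo_ioctl_arg`, and the `DEMO_NO_FW_CALL` sentinel are hypothetical names. The point is that when the callee may return without writing its output, passing the caller's already-initialized field straight through leaves a known value in place, whereas copying back from an unset local does not.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NO_FW_CALL 0xffffffffull /* hypothetical sentinel set before any call */

struct demo_ioctl_arg {
	uint64_t fw_err;
};

/* Stand-in for the issuing function: on early failure it never writes *fw_err. */
static int demo_issue_request(int fail_early, uint64_t *fw_err)
{
	if (fail_early)
		return -5; /* error return; output deliberately left untouched */
	*fw_err = 0;
	return 0;
}

/* Pass the caller's field straight through; no local copy to leave unset. */
static int demo_handle_request(struct demo_ioctl_arg *arg, int fail_early)
{
	return demo_issue_request(fail_early, &arg->fw_err);
}
```

With this shape, a failed call leaves `arg->fw_err` at whatever the ioctl entry point initialized it to, so nothing from the kernel stack can reach user space through that field.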


2022-11-04 23:06:17

by Dionna Amalie Glaze

Subject: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

The GHCB specification states that the upper 32 bits of exitinfo2 are
for the VMM's error codes. The sev-guest ABI has already locked in
that the fw_err status of the input will be 64 bits, and that
BIT_ULL(32) means that the extended guest request's data buffer was too
small, so we have to keep that ABI.

We can still interpret the upper 32 bits of exitinfo2 for the user
anyway in case the request gets throttled. For safety, since the
encryption algorithm in GHCBv2 is AES_GCM, we cannot return to user
space without having completed the request with the current sequence
number. If we were to return and the guest were to make another request
but with different message contents, then that would be IV reuse.

When throttled, the driver will reschedule itself and then try
again after sleeping half its ratelimit time to avoid a big wait queue.
The ioctl may block indefinitely, but that has always been the case
when deferring these requests to the host.

Cc: Tom Lendacky <[email protected]>
Cc: Peter Gonda <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Liam Merwick <[email protected]>
Cc: Yang Yingliang <[email protected]>
Cc: Haowen Bai <[email protected]>

Signed-off-by: Dionna Glaze <[email protected]>
---
drivers/virt/coco/sev-guest/sev-guest.c | 49 ++++++++++++++++++++-----
include/uapi/linux/sev-guest.h | 18 ++++++++-
2 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 5615d349b378..e8a9c07ea897 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -14,6 +14,7 @@
#include <linux/io.h>
#include <linux/platform_device.h>
#include <linux/miscdevice.h>
+#include <linux/ratelimit.h>
#include <linux/set_memory.h>
#include <linux/fs.h>
#include <crypto/aead.h>
@@ -48,12 +49,22 @@ struct snp_guest_dev {
struct snp_req_data input;
u32 *os_area_msg_seqno;
u8 *vmpck;
+
+ struct ratelimit_state rs;
};

static u32 vmpck_id;
module_param(vmpck_id, uint, 0444);
MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");

+static int rate_s = 1;
+module_param(rate_s, int, 0444);
+MODULE_PARM_DESC(rate_s, "The rate limit interval in seconds to limit requests to.");
+
+static int rate_burst = 2;
+module_param(rate_burst, int, 0444);
+MODULE_PARM_DESC(rate_burst, "The rate limit burst amount to limit requests to.");
+
/* Mutex to serialize the shared buffer access and command handling. */
static DEFINE_MUTEX(snp_cmd_mutex);

@@ -341,9 +352,27 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
if (rc)
return rc;

+retry:
+ /*
+ * Rate limit commands internally since the host can also throttle, and
+ * we don't want to create a tight request spin that could end up
+ * getting this VM throttled more heavily.
+ */
+ if (!__ratelimit(&snp_dev->rs)) {
+ schedule_timeout_interruptible((rate_s * HZ) / 2);
+ goto retry;
+ }
/* Call firmware to process the request */
rc = snp_issue_guest_request(exit_code, &snp_dev->input,
- &arg->fw_err);
+ &arg->exitinfo2);
+
+ /*
+ * The host may return EBUSY if the request has been throttled.
+ * We retry in the driver to avoid returning and reusing the message
+ * sequence number on a different message.
+ */
+ if (arg->vmm_error == SNP_GUEST_VMM_ERR_BUSY)
+ goto retry;

/*
* If the extended guest request fails due to having to small of a
@@ -351,21 +380,21 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
* extended data request.
*/
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
- arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+ arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
const unsigned int certs_npages = snp_dev->input.data_npages;

exit_code = SVM_VMGEXIT_GUEST_REQUEST;
rc = snp_issue_guest_request(exit_code, &snp_dev->input,
- &arg->fw_err);
+ &arg->exitinfo2);

- arg->fw_err = SNP_GUEST_REQ_INVALID_LEN;
+ arg->vmm_error = SNP_GUEST_VMM_ERR_INVALID_LEN;
snp_dev->input.data_npages = certs_npages;
}

if (rc) {
dev_alert(snp_dev->dev,
- "Detected error from ASP request. rc: %d, fw_err: %llu\n",
- rc, arg->fw_err);
+ "Detected error from ASP request. rc: %d, exitinfo2: %llu\n",
+ rc, arg->exitinfo2);
goto disable_vmpck;
}

@@ -518,7 +547,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
sizeof(req.data), resp->data, resp_len);

/* If certs length is invalid then copy the returned length */
- if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+ if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;

if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
@@ -553,7 +582,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
if (copy_from_user(&input, argp, sizeof(input)))
return -EFAULT;

- input.fw_err = 0xff;
+ input.exitinfo2 = SEV_RET_NO_FW_CALL;

/* Message version must be non-zero */
if (!input.msg_version)
@@ -584,7 +613,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long

mutex_unlock(&snp_cmd_mutex);

- if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
+ if (input.exitinfo2 && copy_to_user(argp, &input, sizeof(input)))
return -EFAULT;

return ret;
@@ -734,6 +763,8 @@ static int __init sev_guest_probe(struct platform_device *pdev)
if (ret)
goto e_free_cert_data;

+ ratelimit_state_init(&snp_dev->rs, rate_s * HZ, rate_burst);
+
dev_info(dev, "Initialized SEV guest driver (using vmpck_id %d)\n", vmpck_id);
return 0;

diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 256aaeff7e65..8e4144aa78c9 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -52,8 +52,15 @@ struct snp_guest_request_ioctl {
__u64 req_data;
__u64 resp_data;

- /* firmware error code on failure (see psp-sev.h) */
- __u64 fw_err;
+ /* bits[63:32]: VMM error code, bits[31:0] firmware error code (see psp-sev.h) */
+ union {
+ __u64 exitinfo2;
+ __u64 fw_err; /* Name deprecated in favor of others */
+ struct {
+ __u32 fw_error;
+ __u32 vmm_error;
+ };
+ };
};

struct snp_ext_report_req {
@@ -77,4 +84,11 @@ struct snp_ext_report_req {
/* Get SNP extended report as defined in the GHCB specification version 2. */
#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)

+/* Guest message request EXIT_INFO_2 constants */
+#define SNP_GUEST_FW_ERR_MASK GENMASK_ULL(31, 0)
+#define SNP_GUEST_VMM_ERR_SHIFT 32
+
+#define SNP_GUEST_VMM_ERR_INVALID_LEN 1
+#define SNP_GUEST_VMM_ERR_BUSY 2
+
#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
--
2.38.1.431.g37b22c650d-goog
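
As a rough illustration of the two ideas in this patch, the sketch below splits a 64-bit exitinfo2 value into its VMM and firmware halves with shifts and masks, and reissues a request while the VMM half reports busy. Everything here is a user-space assumption: the `DEMO_` constants mirror the values in the diff but are defined locally, `demo_host_call` is a simulated host, and the sleep and rate limiting the real driver performs between attempts are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FW_ERR_MASK         0xffffffffull /* bits[31:0]: firmware error */
#define DEMO_VMM_ERR_SHIFT       32            /* bits[63:32]: VMM error */
#define DEMO_VMM_ERR_INVALID_LEN 1u
#define DEMO_VMM_ERR_BUSY        2u

static uint32_t demo_fw_error(uint64_t exitinfo2)
{
	return (uint32_t)(exitinfo2 & DEMO_FW_ERR_MASK);
}

static uint32_t demo_vmm_error(uint64_t exitinfo2)
{
	return (uint32_t)(exitinfo2 >> DEMO_VMM_ERR_SHIFT);
}

/* Simulated host: reports BUSY until *budget reaches zero, then succeeds. */
static uint64_t demo_host_call(unsigned int *budget)
{
	if (*budget > 0) {
		(*budget)--;
		return (uint64_t)DEMO_VMM_ERR_BUSY << DEMO_VMM_ERR_SHIFT;
	}
	return 0;
}

/*
 * Reissue the same request while the host reports BUSY, so the caller never
 * returns with the current sequence number unused. Returns the attempt count.
 */
static unsigned int demo_issue_until_done(unsigned int busy_replies,
					  uint64_t *exitinfo2)
{
	unsigned int attempts = 0;

	do {
		*exitinfo2 = demo_host_call(&busy_replies);
		attempts++;
	} while (demo_vmm_error(*exitinfo2) == DEMO_VMM_ERR_BUSY);

	return attempts;
}
```

The shift-and-mask helpers give the same answer as the anonymous struct in the union on little-endian machines such as x86, where the low 32 bits of the 64-bit field come first in memory; the sketch uses shifts only so that it does not depend on byte order.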


2022-11-05 01:41:48

by Peter Gonda

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

On Fri, Nov 4, 2022 at 5:01 PM Dionna Glaze <[email protected]> wrote:
>
> The GHCB specification states that the upper 32 bits of exitinfo2 are
> for the VMM's error codes. The sev-guest ABI has already locked in
> that the fw_err status of the input will be 64 bits, and that
> BIT_ULL(32) means that the extended guest request's data buffer was too
> small, so we have to keep that ABI.
>
> We can still interpret the upper 32 bits of exitinfo2 for the user
> anyway in case the request gets throttled. For safety, since the
> encryption algorithm in GHCBv2 is AES_GCM, we cannot return to user
> space without having completed the request with the current sequence
> number. If we were to return and the guest were to make another request
> but with different message contents, then that would be IV reuse.
>
> When throttled, the driver will reschedule itself and then try
> again after sleeping half its ratelimit time to avoid a big wait queue.
> The ioctl may block indefinitely, but that has always been the case
> when deferring these requests to the host.
>
> Cc: Tom Lendacky <[email protected]>
> Cc: Peter Gonda <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Tom Lendacky <[email protected]>
> Cc: Liam Merwick <[email protected]>
> Cc: Yang Yingliang <[email protected]>
> Cc: Haowen Bai <[email protected]>
>
> Signed-off-by: Dionna Glaze <[email protected]>

Reviewed-by: Peter Gonda <[email protected]>

2022-11-05 01:59:43

by Peter Gonda

Subject: Re: [PATCH v8 3/4] virt: sev-guest: Remove err in handle_guest_request

On Fri, Nov 4, 2022 at 5:01 PM Dionna Glaze <[email protected]> wrote:
>
> The err variable may not be set in the call to snp_issue_guest_request,
> yet it is unconditionally written back to fw_err if fw_err is non-null.
> This is undefined behavior, and currently returns uninitialized kernel
> stack memory to user space.
>
> The fw_err argument is better to just pass through to
> snp_issue_guest_request, so we do that by passing along the ioctl
> argument. This removes the need for an argument to handle_guest_request.
>
> Cc: Tom Lendacky <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Peter Gonda <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Haowen Bai <[email protected]>
> Cc: Liam Merwick <[email protected]>
> Cc: Yang Yingliang <[email protected]>
>
> Fixes: fce96cf04430 ("virt: Add SEV-SNP guest driver")
> Signed-off-by: Dionna Glaze <[email protected]>

Reviewed-by: Peter Gonda <[email protected]>

2022-11-07 18:00:22

by Peter Gonda

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

On Fri, Nov 4, 2022 at 7:33 PM Peter Gonda <[email protected]> wrote:
>
> On Fri, Nov 4, 2022 at 5:01 PM Dionna Glaze <[email protected]> wrote:
> >
> > The GHCB specification states that the upper 32 bits of exitinfo2 are
> > for the VMM's error codes. The sev-guest ABI has already locked in
> > that the fw_err status of the input will be 64 bits, and that
> > BIT_ULL(32) means that the extended guest request's data buffer was too
> > small, so we have to keep that ABI.
> >
> > We can still interpret the upper 32 bits of exitinfo2 for the user
> > anyway in case the request gets throttled. For safety, since the
> > encryption algorithm in GHCBv2 is AES_GCM, we cannot return to user
> > space without having completed the request with the current sequence
> > number. If we were to return and the guest were to make another request
> > but with different message contents, then that would be IV reuse.
> >
> > When throttled, the driver will reschedule itself and then try
> > again after sleeping half its ratelimit time to avoid a big wait queue.
> > The ioctl may block indefinitely, but that has always been the case
> > when deferring these requests to the host.
> >
> > Cc: Tom Lendacky <[email protected]>
> > Cc: Peter Gonda <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Tom Lendacky <[email protected]>
> > Cc: Liam Merwick <[email protected]>
> > Cc: Yang Yingliang <[email protected]>
> > Cc: Haowen Bai <[email protected]>
> >
> > Signed-off-by: Dionna Glaze <[email protected]>
>
> Reviewed-by: Peter Gonda <[email protected]>

Tested-by: Peter Gonda <[email protected]>

Tested with the host throttling patches you shared offlist. Used a
pretty restrictive rate limit to ensure I hit the limit during
testing.

2022-11-14 21:33:05

by Tom Lendacky

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

On 11/4/22 18:00, Dionna Glaze wrote:
> The GHCB specification states that the upper 32 bits of exitinfo2 are
> for the VMM's error codes. The sev-guest ABI has already locked in
> that the fw_err status of the input will be 64 bits, and that
> BIT_ULL(32) means that the extended guest request's data buffer was too
> small, so we have to keep that ABI.
>
> We can still interpret the upper 32 bits of exitinfo2 for the user
> anyway in case the request gets throttled. For safety, since the
> encryption algorithm in GHCBv2 is AES_GCM, we cannot return to user
> space without having completed the request with the current sequence
> number. If we were to return and the guest were to make another request
> but with different message contents, then that would be IV reuse.
>
> When throttled, the driver will reschedule itself and then try
> again after sleeping half its ratelimit time to avoid a big wait queue.
> The ioctl may block indefinitely, but that has always been the case
> when deferring these requests to the host.
>
> Cc: Tom Lendacky <[email protected]>
> Cc: Peter Gonda <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Tom Lendacky <[email protected]>
> Cc: Liam Merwick <[email protected]>
> Cc: Yang Yingliang <[email protected]>
> Cc: Haowen Bai <[email protected]>
>
> Signed-off-by: Dionna Glaze <[email protected]>

Reviewed-by: Tom Lendacky <[email protected]>

I'm wondering if this should be targeted at stable so that older kernels
will be able to handle a host that returns a busy indicator without
destroying the key (Peter's IV re-use patch).

Thanks,
Tom

> ---
> drivers/virt/coco/sev-guest/sev-guest.c | 49 ++++++++++++++++++++-----
> include/uapi/linux/sev-guest.h | 18 ++++++++-
> 2 files changed, 56 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index 5615d349b378..e8a9c07ea897 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -14,6 +14,7 @@
> #include <linux/io.h>
> #include <linux/platform_device.h>
> #include <linux/miscdevice.h>
> +#include <linux/ratelimit.h>
> #include <linux/set_memory.h>
> #include <linux/fs.h>
> #include <crypto/aead.h>
> @@ -48,12 +49,22 @@ struct snp_guest_dev {
> struct snp_req_data input;
> u32 *os_area_msg_seqno;
> u8 *vmpck;
> +
> + struct ratelimit_state rs;
> };
>
> static u32 vmpck_id;
> module_param(vmpck_id, uint, 0444);
> MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
>
> +static int rate_s = 1;
> +module_param(rate_s, int, 0444);
> +MODULE_PARM_DESC(rate_s, "The rate limit interval in seconds to limit requests to.");
> +
> +static int rate_burst = 2;
> +module_param(rate_burst, int, 0444);
> +MODULE_PARM_DESC(rate_burst, "The rate limit burst amount to limit requests to.");
> +
> /* Mutex to serialize the shared buffer access and command handling. */
> static DEFINE_MUTEX(snp_cmd_mutex);
>
> @@ -341,9 +352,27 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> if (rc)
> return rc;
>
> +retry:
> + /*
> + * Rate limit commands internally since the host can also throttle, and
> + * we don't want to create a tight request spin that could end up
> + * getting this VM throttled more heavily.
> + */
> + if (!__ratelimit(&snp_dev->rs)) {
> + schedule_timeout_interruptible((rate_s * HZ) / 2);
> + goto retry;
> + }
> /* Call firmware to process the request */
> rc = snp_issue_guest_request(exit_code, &snp_dev->input,
> - &arg->fw_err);
> + &arg->exitinfo2);
> +
> + /*
> + * The host may return EBUSY if the request has been throttled.
> + * We retry in the driver to avoid returning and reusing the message
> + * sequence number on a different message.
> + */
> + if (arg->vmm_error == SNP_GUEST_VMM_ERR_BUSY)
> + goto retry;
>
> /*
> * If the extended guest request fails due to having to small of a
> @@ -351,21 +380,21 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> * extended data request.
> */
> if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> - arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
> + arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
> const unsigned int certs_npages = snp_dev->input.data_npages;
>
> exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> rc = snp_issue_guest_request(exit_code, &snp_dev->input,
> - &arg->fw_err);
> + &arg->exitinfo2);
>
> - arg->fw_err = SNP_GUEST_REQ_INVALID_LEN;
> + arg->vmm_error = SNP_GUEST_VMM_ERR_INVALID_LEN;
> snp_dev->input.data_npages = certs_npages;
> }
>
> if (rc) {
> dev_alert(snp_dev->dev,
> - "Detected error from ASP request. rc: %d, fw_err: %llu\n",
> - rc, arg->fw_err);
> + "Detected error from ASP request. rc: %d, exitinfo2: %llu\n",
> + rc, arg->exitinfo2);
> goto disable_vmpck;
> }
>
> @@ -518,7 +547,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> sizeof(req.data), resp->data, resp_len);
>
> /* If certs length is invalid then copy the returned length */
> - if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
> + if (arg->vmm_error == SNP_GUEST_VMM_ERR_INVALID_LEN) {
> req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
>
> if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
> @@ -553,7 +582,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
> if (copy_from_user(&input, argp, sizeof(input)))
> return -EFAULT;
>
> - input.fw_err = 0xff;
> + input.exitinfo2 = SEV_RET_NO_FW_CALL;
>
> /* Message version must be non-zero */
> if (!input.msg_version)
> @@ -584,7 +613,7 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>
> mutex_unlock(&snp_cmd_mutex);
>
> - if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
> + if (input.exitinfo2 && copy_to_user(argp, &input, sizeof(input)))
> return -EFAULT;
>
> return ret;
> @@ -734,6 +763,8 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> if (ret)
> goto e_free_cert_data;
>
> + ratelimit_state_init(&snp_dev->rs, rate_s * HZ, rate_burst);
> +
> dev_info(dev, "Initialized SEV guest driver (using vmpck_id %d)\n", vmpck_id);
> return 0;
>
> diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
> index 256aaeff7e65..8e4144aa78c9 100644
> --- a/include/uapi/linux/sev-guest.h
> +++ b/include/uapi/linux/sev-guest.h
> @@ -52,8 +52,15 @@ struct snp_guest_request_ioctl {
> __u64 req_data;
> __u64 resp_data;
>
> - /* firmware error code on failure (see psp-sev.h) */
> - __u64 fw_err;
> + /* bits[63:32]: VMM error code, bits[31:0] firmware error code (see psp-sev.h) */
> + union {
> + __u64 exitinfo2;
> + __u64 fw_err; /* Name deprecated in favor of others */
> + struct {
> + __u32 fw_error;
> + __u32 vmm_error;
> + };
> + };
> };
>
> struct snp_ext_report_req {
> @@ -77,4 +84,11 @@ struct snp_ext_report_req {
> /* Get SNP extended report as defined in the GHCB specification version 2. */
> #define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)
>
> +/* Guest message request EXIT_INFO_2 constants */
> +#define SNP_GUEST_FW_ERR_MASK GENMASK_ULL(31, 0)
> +#define SNP_GUEST_VMM_ERR_SHIFT 32
> +
> +#define SNP_GUEST_VMM_ERR_INVALID_LEN 1
> +#define SNP_GUEST_VMM_ERR_BUSY 2
> +
> #endif /* __UAPI_LINUX_SEV_GUEST_H_ */

2022-11-14 22:18:42

by Tom Lendacky

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On 11/4/22 18:00, Dionna Glaze wrote:
> From: Peter Gonda <[email protected]>
>
> The PSP can return a "firmware error" code of -1 in circumstances where
> the PSP is not actually called. To make this protocol unambiguous, we
> add a constant naming the return value.
>
> Cc: Thomas Lendacky <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Andy Lutomirsky <[email protected]>
> Cc: John Allen <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Cc: "David S. Miller" <[email protected]>
>
> Signed-off-by: Peter Gonda <[email protected]>
> Signed-off-by: Dionna Glaze <[email protected]>

Looks like you missed my Reviewed-by: from an earlier version, so...

Reviewed-by: Tom Lendacky <[email protected]>

> ---
> drivers/crypto/ccp/sev-dev.c | 2 +-
> include/uapi/linux/psp-sev.h | 7 +++++++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 06fc7156c04f..97eb3544ab36 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -444,7 +444,7 @@ static int __sev_platform_init_locked(int *error)
> {
> struct psp_device *psp = psp_master;
> struct sev_device *sev;
> - int rc = 0, psp_ret = -1;
> + int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
> int (*init_function)(int *error);
>
> if (!psp || !psp->sev_data)
> diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> index 91b4c63d5cbf..1ad7f0a7e328 100644
> --- a/include/uapi/linux/psp-sev.h
> +++ b/include/uapi/linux/psp-sev.h
> @@ -36,6 +36,13 @@ enum {
> * SEV Firmware status code
> */
> typedef enum {
> + /*
> + * This error code is not in the SEV spec but is added to convey that
> + * there was an error that prevented the SEV Firmware from being called.
> + * This is (u32)-1 since the firmware error code is represented as a
> + * 32-bit integer.
> + */
> + SEV_RET_NO_FW_CALL = 0xffffffff,
> SEV_RET_SUCCESS = 0,
> SEV_RET_INVALID_PLATFORM_STATE,
> SEV_RET_INVALID_GUEST_STATE,

2022-11-14 22:20:30

by Tom Lendacky

Subject: Re: [PATCH v8 3/4] virt: sev-guest: Remove err in handle_guest_request

On 11/4/22 18:00, Dionna Glaze wrote:
> The err variable may not be set in the call to snp_issue_guest_request,
> yet it is unconditionally written back to fw_err if fw_err is non-null.
> This is undefined behavior, and currently returns uninitialized kernel
> stack memory to user space.
>
> The fw_err argument is better to just pass through to
> snp_issue_guest_request, so we do that by passing along the ioctl
> argument. This removes the need for an argument to handle_guest_request.
>
> Cc: Tom Lendacky <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Peter Gonda <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Haowen Bai <[email protected]>
> Cc: Liam Merwick <[email protected]>
> Cc: Yang Yingliang <[email protected]>
>
> Fixes: fce96cf04430 ("virt: Add SEV-SNP guest driver")
> Signed-off-by: Dionna Glaze <[email protected]>

Reviewed-by: Tom Lendacky <[email protected]>

> ---
> drivers/virt/coco/sev-guest/sev-guest.c | 37 ++++++++++++-------------
> 1 file changed, 18 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index d08ff87c2dac..5615d349b378 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -320,11 +320,11 @@ static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8
> return __enc_payload(snp_dev, req, payload, sz);
> }
>
> -static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
> +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code,
> + struct snp_guest_request_ioctl *arg,
> u8 type, void *req_buf, size_t req_sz, void *resp_buf,
> - u32 resp_sz, __u64 *fw_err)
> + u32 resp_sz)
> {
> - unsigned long err;
> u64 seqno;
> int rc;
>
> @@ -336,12 +336,14 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
> memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
>
> /* Encrypt the userspace provided payload */
> - rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
> + rc = enc_payload(snp_dev, seqno, arg->msg_version, type, req_buf,
> + req_sz);
> if (rc)
> return rc;
>
> /* Call firmware to process the request */
> - rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> + rc = snp_issue_guest_request(exit_code, &snp_dev->input,
> + &arg->fw_err);
>
> /*
> * If the extended guest request fails due to having to small of a
> @@ -349,23 +351,21 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
> * extended data request.
> */
> if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> - err == SNP_GUEST_REQ_INVALID_LEN) {
> + arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
> const unsigned int certs_npages = snp_dev->input.data_npages;
>
> exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> - rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> + rc = snp_issue_guest_request(exit_code, &snp_dev->input,
> + &arg->fw_err);
>
> - err = SNP_GUEST_REQ_INVALID_LEN;
> + arg->fw_err = SNP_GUEST_REQ_INVALID_LEN;
> snp_dev->input.data_npages = certs_npages;
> }
>
> - if (fw_err)
> - *fw_err = err;
> -
> if (rc) {
> dev_alert(snp_dev->dev,
> "Detected error from ASP request. rc: %d, fw_err: %llu\n",
> - rc, *fw_err);
> + rc, arg->fw_err);
> goto disable_vmpck;
> }
>
> @@ -412,9 +412,9 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
> if (!resp)
> return -ENOMEM;
>
> - rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> + rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
> SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
> - resp_len, &arg->fw_err);
> + resp_len);
> if (rc)
> goto e_free;
>
> @@ -452,9 +452,8 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
> if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> return -EFAULT;
>
> - rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> - SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
> - &arg->fw_err);
> + rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg,
> + SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len);
> if (rc)
> return rc;
>
> @@ -514,9 +513,9 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> return -ENOMEM;
>
> snp_dev->input.data_npages = npages;
> - ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
> + ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg,
> SNP_MSG_REPORT_REQ, &req.data,
> - sizeof(req.data), resp->data, resp_len, &arg->fw_err);
> + sizeof(req.data), resp->data, resp_len);
>
> /* If certs length is invalid then copy the returned length */
> if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {

2022-11-16 01:26:34

by Dionna Amalie Glaze

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

Sorry for the top post.

--
-Dionna Glaze, PhD (she/her)

2022-11-29 17:53:04

by Dionna Amalie Glaze

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

Confirming with Borislav, since you've queued "[PATCH V4] virt: sev:
Prevent IV reuse in SNP guest driver" for inclusion, have you also
queued this patch series? The IV reuse patch without this patch series
will cause host throttling to render the guest unable to use
sev-guest.

--
-Dionna Glaze, PhD (she/her)

2022-12-02 13:57:44

by Borislav Petkov

Subject: Re: [PATCH v8 4/4] virt: sev-guest: interpret VMM errors from guest request

On Tue, Nov 29, 2022 at 09:18:17AM -0800, Dionna Amalie Glaze wrote:
> Confirming with Borislav, since you've queued "[PATCH V4] virt: sev:
> Prevent IV reuse in SNP guest driver" for inclusion, have you also
> queued this patch series? The IV reuse patch without this patch series
> will cause host throttling to render the guest unable to use
> sev-guest.

As Tom said, the relevant patches have Fixes: tags so they will get
picked up eventually. But lemme go through the set first.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-12-03 13:18:12

by Borislav Petkov

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On Fri, Nov 04, 2022 at 11:00:37PM +0000, Dionna Glaze wrote:
> From: Peter Gonda <[email protected]>
>
> The PSP can return a "firmware error" code of -1 in circumstances where
> the PSP is not actually called. To make this protocol unambiguous, we

Please use passive voice in your commit message: no "we" or "I", etc,
and describe your changes in imperative mood.

> add a constant naming the return value.
>
> Cc: Thomas Lendacky <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Andy Lutomirsky <[email protected]>
> Cc: John Allen <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Cc: "David S. Miller" <[email protected]>
>
> Signed-off-by: Peter Gonda <[email protected]>
> Signed-off-by: Dionna Glaze <[email protected]>
> ---
> drivers/crypto/ccp/sev-dev.c | 2 +-
> include/uapi/linux/psp-sev.h | 7 +++++++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 06fc7156c04f..97eb3544ab36 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -444,7 +444,7 @@ static int __sev_platform_init_locked(int *error)
> {
> struct psp_device *psp = psp_master;
> struct sev_device *sev;
> - int rc = 0, psp_ret = -1;
> + int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
> int (*init_function)(int *error);
>
> if (!psp || !psp->sev_data)

Ok, lemme chase down this flow here:

__sev_platform_init_locked() calls that automatic variable function
pointer ->init_function which already looks funky. See the end of this
mail for a diff removing it and making the code more readable.

The called function can be one of two and both get the pointer to
psp_ret as their only argument.

1. __sev_init_ex_locked() calls __sev_do_cmd_locked() and passes down
*psp_ret.

or

2. __sev_init_locked(). Ditto.

Now, __sev_do_cmd_locked() will overwrite psp_ret when
sev_wait_cmd_ioc() fails. So far so good.

In the case __sev_do_cmd_locked() succeeds, it'll put there something
else:

if (psp_ret)
*psp_ret = reg & PSP_CMDRESP_ERR_MASK;

So no caller will ever see SEV_RET_NO_FW_CALL, as far as I can tell.

And looking further through the rest of the set, nothing tests
SEV_RET_NO_FW_CALL - it only gets assigned.

So why are we even bothering with this?

You can hand in *psp_ret uninitialized and you'll get a value in all
cases. Unless I'm missing an angle.

> diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> index 91b4c63d5cbf..1ad7f0a7e328 100644
> --- a/include/uapi/linux/psp-sev.h
> +++ b/include/uapi/linux/psp-sev.h
> @@ -36,6 +36,13 @@ enum {
> * SEV Firmware status code
> */
> typedef enum {
> + /*
> + * This error code is not in the SEV spec but is added to convey that
> + * there was an error that prevented the SEV Firmware from being called.
> + * This is (u32)-1 since the firmware error code is represented as a
> + * 32-bit integer.
> + */
> + SEV_RET_NO_FW_CALL = 0xffffffff,

What's wrong with having -1 here?

> SEV_RET_SUCCESS = 0,
> SEV_RET_INVALID_PLATFORM_STATE,
> SEV_RET_INVALID_GUEST_STATE,
> --

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 97eb3544ab36..8bc4209b338b 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -440,12 +440,20 @@ static int __sev_init_ex_locked(int *error)
return __sev_do_cmd_locked(SEV_CMD_INIT_EX, &data, error);
}

+static inline int __sev_do_init_locked(int *psp_ret)
+{
+ if (sev_init_ex_buffer)
+ return __sev_init_ex_locked(psp_ret);
+ else
+
+ return __sev_init_locked(psp_ret);
+}
+
static int __sev_platform_init_locked(int *error)
{
struct psp_device *psp = psp_master;
struct sev_device *sev;
- int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
- int (*init_function)(int *error);
+ int rc = 0, psp_ret;

if (!psp || !psp->sev_data)
return -ENODEV;
@@ -456,15 +464,12 @@ static int __sev_platform_init_locked(int *error)
return 0;

if (sev_init_ex_buffer) {
- init_function = __sev_init_ex_locked;
rc = sev_read_init_ex_file();
if (rc)
return rc;
- } else {
- init_function = __sev_init_locked;
}

- rc = init_function(&psp_ret);
+ rc = __sev_do_init_locked(&psp_ret);
if (rc && psp_ret == SEV_RET_SECURE_DATA_INVALID) {
/*
* Initialization command returned an integrity check failure
@@ -473,9 +478,12 @@ static int __sev_platform_init_locked(int *error)
* initialization function should succeed by replacing the state
* with a reset state.
*/
- dev_err(sev->dev, "SEV: retrying INIT command because of SECURE_DATA_INVALID error. Retrying once to reset PSP SEV state.");
- rc = init_function(&psp_ret);
+ dev_err(sev->dev,
+"SEV: retrying INIT command because of SECURE_DATA_INVALID error. Retrying once to reset PSP SEV state.");
+
+ rc = __sev_do_init_locked(&psp_ret);
}
+
if (error)
*error = psp_ret;

diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 1ad7f0a7e328..a9ed9e846cd2 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -42,7 +42,7 @@ typedef enum {
* This is (u32)-1 since the firmware error code is represented as a
* 32-bit integer.
*/
- SEV_RET_NO_FW_CALL = 0xffffffff,
+ SEV_RET_NO_FW_CALL = -1,
SEV_RET_SUCCESS = 0,
SEV_RET_INVALID_PLATFORM_STATE,
SEV_RET_INVALID_GUEST_STATE,

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-12-03 19:54:07

by Dionna Amalie Glaze

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On Sat, Dec 3, 2022 at 4:26 AM Borislav Petkov <[email protected]> wrote:
>
> On Fri, Nov 04, 2022 at 11:00:37PM +0000, Dionna Glaze wrote:
> > From: Peter Gonda <[email protected]>
> >
> > The PSP can return a "firmware error" code of -1 in circumstances where
> > the PSP is not actually called. To make this protocol unambiguous, we
>
> Please use passive voice in your commit message: no "we" or "I", etc,
> and describe your changes in imperative mood.
>
> > add a constant naming the return value.
> >
> > Cc: Thomas Lendacky <[email protected]>
> > Cc: Paolo Bonzini <[email protected]>
> > Cc: Joerg Roedel <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Andy Lutomirsky <[email protected]>
> > Cc: John Allen <[email protected]>
> > Cc: Herbert Xu <[email protected]>
> > Cc: "David S. Miller" <[email protected]>
> >
> > Signed-off-by: Peter Gonda <[email protected]>
> > Signed-off-by: Dionna Glaze <[email protected]>
> > ---
> > drivers/crypto/ccp/sev-dev.c | 2 +-
> > include/uapi/linux/psp-sev.h | 7 +++++++
> > 2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> > index 06fc7156c04f..97eb3544ab36 100644
> > --- a/drivers/crypto/ccp/sev-dev.c
> > +++ b/drivers/crypto/ccp/sev-dev.c
> > @@ -444,7 +444,7 @@ static int __sev_platform_init_locked(int *error)
> > {
> > struct psp_device *psp = psp_master;
> > struct sev_device *sev;
> > - int rc = 0, psp_ret = -1;
> > + int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
> > int (*init_function)(int *error);
> >
> > if (!psp || !psp->sev_data)
>
> Ok, lemme chase down this flow here:
>
> __sev_platform_init_locked() calls that automatic variable function
> pointer ->init_function which already looks funky. See the end of this
> mail for a diff removing it and making the code more readable.
>

I'm fine removing it if possible for the sev-dev.c code, but I'll
still need the enum for the next patches in this series. I added it
specifically because of the uninitialized memory problem with `err`
that I witnessed in user space, and to replace the arbitrary 0xff
value in existing code.

> The called function can be one of two and both get the pointer to
> psp_ret as their only argument.
>
> 1. __sev_init_ex_locked() calls __sev_do_cmd_locked() and passes down
> *psp_ret.
>
> or
>
> 2. __sev_init_locked(). Ditto.
>
> Now, __sev_do_cmd_locked() will overwrite psp_ret when
> sev_wait_cmd_ioc() fails. So far so good.

It doesn't always overwrite psp_ret, such as the initial error checking.
The value remains uninitialized for -ENODEV, -EBUSY, -EINVAL.
Thus *error in __sev_platform_init_locked can be set to uninitialized
memory if psp_ret is not first initialized.
That error points to the kernel copy of the user's argument struct,
which the ioctl always copies back.
In the case of those error codes then, without SEV_RET_NO_FW_CALL,
user space will get uninitialized kernel memory.
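
The failure mode described above can be reproduced outside the kernel.
What follows is only a minimal userspace sketch, not the driver code:
demo_do_cmd(), demo_platform_init() and DEMO_RET_NO_FW_CALL are
hypothetical stand-ins for __sev_do_cmd_locked(),
__sev_platform_init_locked() and SEV_RET_NO_FW_CALL, and the two
early-return conditions are simplified assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for SEV_RET_NO_FW_CALL. */
#define DEMO_RET_NO_FW_CALL (-1)

/*
 * Hypothetical stand-in for __sev_do_cmd_locked(): the early-return
 * paths bail out before *fw_ret is ever written.
 */
static int demo_do_cmd(int device_present, int cmd_valid, int *fw_ret)
{
	assert(fw_ret != NULL);	/* the sketch always passes a valid pointer */

	if (!device_present)
		return -ENODEV;	/* *fw_ret left untouched */
	if (!cmd_valid)
		return -EINVAL;	/* *fw_ret left untouched */

	/* Only on this path does a "firmware" status reach the caller. */
	*fw_ret = 0;
	return 0;
}

/*
 * Hypothetical stand-in for the caller that copies the firmware status
 * back out unconditionally. Without the initializer, the early-return
 * paths above would hand an indeterminate stack value to *error.
 */
static int demo_platform_init(int device_present, int cmd_valid, int *error)
{
	int fw_ret = DEMO_RET_NO_FW_CALL;
	int rc = demo_do_cmd(device_present, cmd_valid, &fw_ret);

	if (error)
		*error = fw_ret;
	return rc;
}
```

With the initializer in place, a caller that fails with -ENODEV or
-EINVAL reads back the sentinel rather than whatever happened to be on
the stack.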

>
> In the case __sev_do_cmd_locked() succeeds, it'll put there something
> else:
>
> if (psp_ret)
> *psp_ret = reg & PSP_CMDRESP_ERR_MASK;
>
> So no caller will ever see SEV_RET_NO_FW_CALL, as far as I can tell.
>
> And looking further through the rest of the set, nothing tests
> SEV_RET_NO_FW_CALL - it only gets assigned.
>
> So why are we even bothering with this?
>
> You can hand in *psp_ret uninitialized and you'll get a value in all
> cases. Unless I'm missing an angle.
>

I think my above comment points out the wrinkle.

> > diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> > index 91b4c63d5cbf..1ad7f0a7e328 100644
> > --- a/include/uapi/linux/psp-sev.h
> > +++ b/include/uapi/linux/psp-sev.h
> > @@ -36,6 +36,13 @@ enum {
> > * SEV Firmware status code
> > */
> > typedef enum {
> > + /*
> > + * This error code is not in the SEV spec but is added to convey that
> > + * there was an error that prevented the SEV Firmware from being called.
> > + * This is (u32)-1 since the firmware error code is represented as a
> > + * 32-bit integer.
> > + */
> > + SEV_RET_NO_FW_CALL = 0xffffffff,
>
> What's wrong with having -1 here?
>

C++ brain not trusting what type enum has even in C. I can change it to -1.
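
For reference, a small userspace sketch of what an enumerator of -1
turns into when it is stored in unsigned fields of different widths.
The demo_* names are hypothetical and only mirror the shape of the enum
in psp-sev.h; the conversions themselves are plain C rules and do not
depend on the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the enum under discussion. */
typedef enum {
	DEMO_RET_NO_FW_CALL = -1,
	DEMO_RET_SUCCESS = 0,
	DEMO_RET_INVALID_PLATFORM_STATE,
} demo_sev_ret_code;

/* Implicit numbering still continues from the explicit 0. */
static_assert(DEMO_RET_INVALID_PLATFORM_STATE == 1,
	      "enumerators after SUCCESS keep their values");

/* Value observed when the code lands in a 32-bit unsigned field. */
static uint32_t demo_as_u32(demo_sev_ret_code code)
{
	return (uint32_t)code;
}

/*
 * Value observed when the code is converted straight to a 64-bit
 * unsigned field: -1 becomes all ones in 64 bits, not 0xffffffff.
 */
static uint64_t demo_as_u64(demo_sev_ret_code code)
{
	return (uint64_t)code;
}
```

So -1 and 0xffffffff agree for a 32-bit destination, but a direct
conversion to a 64-bit field needs an intermediate 32-bit cast if the
(u32)-1 value is what should be reported.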

> > SEV_RET_SUCCESS = 0,
> > SEV_RET_INVALID_PLATFORM_STATE,
> > SEV_RET_INVALID_GUEST_STATE,
> > --
>
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 97eb3544ab36..8bc4209b338b 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -440,12 +440,20 @@ static int __sev_init_ex_locked(int *error)
> return __sev_do_cmd_locked(SEV_CMD_INIT_EX, &data, error);
> }
>
> +static inline int __sev_do_init_locked(int *psp_ret)
> +{
> + if (sev_init_ex_buffer)
> + return __sev_init_ex_locked(psp_ret);
> + else
> +
> + return __sev_init_locked(psp_ret);
> +}
> +
> static int __sev_platform_init_locked(int *error)
> {
> struct psp_device *psp = psp_master;
> struct sev_device *sev;
> - int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
> - int (*init_function)(int *error);
> + int rc = 0, psp_ret;
>
> if (!psp || !psp->sev_data)
> return -ENODEV;
> @@ -456,15 +464,12 @@ static int __sev_platform_init_locked(int *error)
> return 0;
>
> if (sev_init_ex_buffer) {
> - init_function = __sev_init_ex_locked;
> rc = sev_read_init_ex_file();
> if (rc)
> return rc;
> - } else {
> - init_function = __sev_init_locked;
> }
>
> - rc = init_function(&psp_ret);
> + rc = __sev_do_init_locked(&psp_ret);
> if (rc && psp_ret == SEV_RET_SECURE_DATA_INVALID) {
> /*
> * Initialization command returned an integrity check failure
> @@ -473,9 +478,12 @@ static int __sev_platform_init_locked(int *error)
> * initialization function should succeed by replacing the state
> * with a reset state.
> */
> - dev_err(sev->dev, "SEV: retrying INIT command because of SECURE_DATA_INVALID error. Retrying once to reset PSP SEV state.");
> - rc = init_function(&psp_ret);
> + dev_err(sev->dev,
> +"SEV: retrying INIT command because of SECURE_DATA_INVALID error. Retrying once to reset PSP SEV state.");
> +
> + rc = __sev_do_init_locked(&psp_ret);
> }
> +
> if (error)
> *error = psp_ret;
>
> diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> index 1ad7f0a7e328..a9ed9e846cd2 100644
> --- a/include/uapi/linux/psp-sev.h
> +++ b/include/uapi/linux/psp-sev.h
> @@ -42,7 +42,7 @@ typedef enum {
> * This is (u32)-1 since the firmware error code is represented as a
> * 32-bit integer.
> */
> - SEV_RET_NO_FW_CALL = 0xffffffff,
> + SEV_RET_NO_FW_CALL = -1,
> SEV_RET_SUCCESS = 0,
> SEV_RET_INVALID_PLATFORM_STATE,
> SEV_RET_INVALID_GUEST_STATE,
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette



--
-Dionna Glaze, PhD (she/her)

2022-12-03 20:49:37

by Borislav Petkov

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On Sat, Dec 03, 2022 at 10:58:39AM -0800, Dionna Amalie Glaze wrote:
> It doesn't always overwrite psp_ret, such as the initial error checking.
> The value remains uninitialized for -ENODEV, -EBUSY, -EINVAL.
> Thus *error in __sev_platform_init_locked can be set to uninitialized
> memory if psp_ret is not first initialized.

Lemme see if I understand it correctly: you wanna signal that all early
return cases in __sev_do_cmd_locked() are such that no firmware was
called?

I.e., everything before the first iowrite into the command buffer?

But then the commit message says:

"The PSP can return a "firmware error" code of -1 in circumstances where
the PSP is not actually called."

which is confusing. How can the PSP return something if it wasn't called?

Or you mean those cases above where it would fail on some of the checks
before issuing a SEV command? I think you do...

So I see Tom has ACKed this but I have to ask: is the SEV spec not going
to use -1 ever?

Also, if this behavior is going to be user-visible, where are we
documenting it? Especially if nothing in the kernel is looking at
that value but only assigning it to a retval which gets looked at by
userspace. Especially then this should be documented.

Dunno, maybe somewhere in Documentation/x86/amd-memory-encryption.rst or
maybe Tom would have a better idea.

> That error points to the kernel copy of the user's argument struct,
> which the ioctl always copies back. In the case of those error codes
> then, without SEV_RET_NO_FW_CALL, user space will get uninitialized
> kernel memory.

Right, but having a return value which means "firmware wasn't called"
sounds weird. Why does userspace care?

I mean, you can just as well return any of the negative values -ENODEV,
-EBUSY, -EINVAL too, depending on where you exit. Having three different
retvals could tell you where exactly it failed, even.

But the question remains: why does userspace need to know that the
failure happened and firmware wasn't called, as long as it is getting
something negative to signal an error?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-12-05 17:10:27

by Dionna Amalie Glaze

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On Sat, Dec 3, 2022 at 11:37 AM Borislav Petkov <[email protected]> wrote:
>
> On Sat, Dec 03, 2022 at 10:58:39AM -0800, Dionna Amalie Glaze wrote:
> > It doesn't always overwrite psp_ret, such as the initial error checking.
> > The value remains uninitialized for -ENODEV, -EBUSY, -EINVAL.
> > Thus *error in __sev_platform_init_locked can be set to uninitialized
> > memory if psp_ret is not first initialized.
>
> Lemme see if I understand it correctly: you wanna signal that all early
> return cases in __sev_do_cmd_locked() are such that no firmware was
> called?
>
> I.e., everything before the first iowrite into the command buffer?
>
> But then the commit message says:
>
> "The PSP can return a "firmware error" code of -1 in circumstances where
> the PSP is not actually called."
>
> which is confusing. How can the PSP return something if it wasn't called?
>
> Or you mean those cases above where it would fail on some of the checks
> before issuing a SEV command? I think you do...
>
> So I see Tom has ACKed this but I have to ask: is the SEV spec not going
> to use -1 ever?
>

I'll confirm with Tom, since he's changing the GHCB spec for the
throttling value.

> Also, if this behavior is going to be user-visible, where are we
> documenting it? Especially if nothing in the kernel is looking at
> that value but only assigning it to a retval which gets looked at by
> userspace. Especially then this should be documented.
>
> Dunno, maybe somewhere in Documentation/x86/amd-memory-encryption.rst or
> maybe Tom would have a better idea.
>

Agreed it should be in both the Linux documentation and the GHCB spec.

> > That error points to the kernel copy of the user's argument struct,
> > which the ioctl always copies back. In the case of those error codes
> > then, without SEV_RET_NO_FW_CALL, user space will get uninitialized
> > kernel memory.
>
> Right, but having a return value which means "firmware wasn't called"
> sounds weird. Why does userspace care?
>

Arguably it shouldn't ever get this value. We're just not very
selective when we copy back the kernel copy of the ioctl argument.
In all cases user space should treat the value as undefined, but still
we don't want to leak uninitialized kernel stack values.

Host driver: only on platform init, should just see the negative error
value and not try to interpret the fw_err in the argument.
Still the data is copied back and therefore should not be
uninitialized kernel memory.
Possible name: SEV_RET_UNDEFINED, or a return value -1 anyway with a
comment that the argument is undefined.

Guest driver: The host is issuing a guest request on behalf of the
guest using patch 4/4 of this series.
The guest is responsible for keeping the sequence number in sync with
the PSP, so we want to track if the ghcb_hv_call completed
successfully to know we should continue with the incremented IV.
Otherwise we run the risk of the sequence numbers getting out of sync
and we lock down the VMPCK.

The guest driver actually sets exitinfo2 to an undocumented 0xff
initial value just in case.
If the host doesn't write back a documented EXIT_INFO_2 value like
invalid_len or throttled, then the kernel will emit a log with the
initial value 0xff (or -1 after this patch).

I've changed it to -1 to name the same kind of error across host and
guest: the communication with the PSP didn't complete successfully, so
the "error" value is not from the PSP.
This value can also get returned to user space during a -ENOTTY result.
We can call this NO_FW_CALL or UNDEFINED. I have no real preference.

Whatever value we set initially, the VMM can overwrite exitinfo2
during the ghcb_hv_call.
I'd rather that the "undefined" values were the same across both,
because the guest is merely receiving a value from the host's PSP
driver (or should be).
It keeps the enum for return values a bit tidier and not concerned
with whether the value is viewed from the host or guest.

I can see an argument for not using the PSP header for its enum type
and instead defining, documenting, and using the separate 0xff
value elsewhere, but this seemed as good a place as any.


> I mean, you can just as well return any of the negative values -ENODEV,
> -EBUSY, -EINVAL too, depending on where you exit. Having three different
> retvals could tell you where exactly it failed, even.
>

That's true, those values are already being returned to user space as
the result of the ioctl.

> But the question remains: why does userspace need to know that the
> failure happened and firmware wasn't called, as long as it is getting
> something negative to signal an error?
>

I hope the above discussion is clear that it's purely a defined
"undefined" because being pickier about what to copy_to_user during
exceptional circumstances in order to not overwrite the user's fw_err
value seems an unnecessary amount of code.
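
As a rough illustration of how a caller could treat that defined
"undefined" value, here is a userspace sketch. Everything in it is
hypothetical: struct demo_request is a trimmed-down stand-in for an
ioctl argument struct, DEMO_RET_NO_FW_CALL assumes the (u32)-1
definition from the patch comment, and the classification policy is one
possible reading of this discussion rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel: (u32)-1, as in the patch comment under discussion. */
#define DEMO_RET_NO_FW_CALL 0xffffffffu

/* Hypothetical, trimmed-down view of an ioctl argument struct. */
struct demo_request {
	uint64_t fw_err;
};

enum demo_outcome {
	DEMO_OK,		/* request completed */
	DEMO_FW_UNDEFINED,	/* failed before firmware; fw_err says nothing */
	DEMO_FW_ERROR,		/* failed and fw_err holds a reported code */
};

static enum demo_outcome demo_classify(int ioctl_rc,
				       const struct demo_request *req)
{
	assert(req != NULL);

	/* The ioctl result is checked first; fw_err is secondary. */
	if (ioctl_rc == 0)
		return DEMO_OK;

	/* Only the low 32 bits are compared, matching (u32)-1. */
	if ((uint32_t)req->fw_err == DEMO_RET_NO_FW_CALL)
		return DEMO_FW_UNDEFINED;

	return DEMO_FW_ERROR;
}
```

The point of the sketch is the ordering: the negative ioctl result is
the primary signal, and the sentinel only tells the caller not to read
anything into fw_err.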

> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette



--
-Dionna Glaze, PhD (she/her)

2022-12-06 20:47:46

by Tom Lendacky

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On 12/5/22 11:05, Dionna Amalie Glaze wrote:
> On Sat, Dec 3, 2022 at 11:37 AM Borislav Petkov <[email protected]> wrote:
>>
>> On Sat, Dec 03, 2022 at 10:58:39AM -0800, Dionna Amalie Glaze wrote:
>>> It doesn't always overwrite psp_ret, such as the initial error checking.
>>> The value remains uninitialized for -ENODEV, -EBUSY, -EINVAL.
>>> Thus *error in __sev_platform_init_locked can be set to uninitialized
>>> memory if psp_ret is not first initialized.
>>
>> Lemme see if I understand it correctly: you wanna signal that all early
>> return cases in __sev_do_cmd_locked() are such that no firmware was
>> called?
>>
>> I.e., everything before the first iowrite into the command buffer?
>>
>> But then the commit message says:
>>
>> "The PSP can return a "firmware error" code of -1 in circumstances where
>> the PSP is not actually called."
>>
>> which is confusing. How can the PSP return something if it wasn't called?
>>
>> Or you mean those cases above where it would fail on some of the checks
>> before issuing a SEV command? I think you do...
>>
>> So I see Tom has ACKed this but I have to ask: is the SEV spec not going
>> to use -1 ever?
>>
>
> I'll confirm with Tom, since he's changing the GHCB spec for the
> throttling value.

The SEV API error codes are 16 bits in size, so you'll never see a -1.
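
A quick sketch of why a 16-bit error field cannot collide with the
sentinel. The names and the 0xffff mask are assumptions made for
illustration (the mask simply encodes the 16-bit width stated above);
this is not the driver's register handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ERR_MASK		0xffffu		/* assumed 16-bit error field */
#define DEMO_RET_NO_FW_CALL	0xffffffffu	/* (u32)-1 sentinel */

/* The largest maskable value is still below the sentinel. */
static_assert(DEMO_ERR_MASK < DEMO_RET_NO_FW_CALL,
	      "a 16-bit code can never equal (u32)-1");

/* Extract the error field from a hypothetical response register. */
static uint32_t demo_extract_fw_error(uint32_t reg)
{
	return reg & DEMO_ERR_MASK;
}

/* True only if a masked firmware code could be mistaken for the sentinel. */
static bool demo_collides(uint32_t reg)
{
	return demo_extract_fw_error(reg) == DEMO_RET_NO_FW_CALL;
}
```

Even a register value of all ones masks down to 0xffff, so the sentinel
stays distinguishable from anything the firmware can report.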

>
>> Also, if this behavior is going to be user-visible, where are we
>> documenting it? Especially if nothing in the kernel is looking at
>> that value but only assigning it to a retval which gets looked at by
>> userspace. Especially then this should be documented.
>>
>> Dunno, maybe somewhere in Documentation/x86/amd-memory-encryption.rst or
>> maybe Tom would have a better idea.
>>
>
> Agreed it should be in both the Linux documentation and the GHCB spec.

Linux documentation, yes, GHCB spec, no.

Thanks,
Tom

>
>>> That error points to the kernel copy of the user's argument struct,
>>> which the ioctl always copies back. In the case of those error codes
>>> then, without SEV_RET_NO_FW_CALL, user space will get uninitialized
>>> kernel memory.
>>
>> Right, but having a return value which means "firmware wasn't called"
>> sounds weird. Why does userspace care?
>>
>
> Arguably it shouldn't ever get this value. We're just not very
> selective when we copy back the kernel copy of the ioctl argument.
> In all cases user space should treat the value as undefined, but still
> we don't want to leak uninitialized kernel stack values.
>
> Host driver: only on platform init, should just see the negative error
> value and not try to interpret the fw_err in the argument.
> Still the data is copied back and therefore should not be
> uninitialized kernel memory.
> Possible name: SEV_RET_UNDEFINED, or a return value -1 anyway with a
> comment that the argument is undefined.
>
> Guest driver: The host is issuing a guest request on behalf of the
> guest using patch 4/4 of this series.
> The guest is responsible for keeping the sequence number in sync with
> the PSP, so we want to track if the ghcb_hv_call completed
> successfully to know we should continue with the incremented IV.
> Otherwise we run the risk of the sequence numbers getting out of sync
> and we lock down the VMPCK.
>
> The guest driver actually sets exitinfo2 to an undocumented 0xff
> initial value just in case.
> If the host doesn't write back a documented EXIT_INFO_2 value like
> invalid_len or throttled, then the kernel will emit a log with the
> initial value 0xff (or -1 after this patch).
>
> I've changed it to -1 to name the same kind of error across host and
> guest: the communication with the PSP didn't complete successfully, so
> the "error" value is not from the PSP.
> This value can also get returned to user space during a -ENOTTY result.
> We can call this NO_FW_CALL or UNDEFINED. I have no real preference.
>
> Whatever value we set initially, the VMM can overwrite exitinfo2
> during the ghcb_hv_call.
> I'd rather that the "undefined" values were the same across both,
> because the guest is merely receiving a value from the host's PSP
> driver (or should be).
> It keeps the enum for return values a bit tidier and not concerned
> with whether the value is viewed from the host or guest.
>
> I can see an argument for not using the PSP header for its enum type
> and instead defining, documenting, and using the separate 0xff
> value elsewhere, but this seemed as good a place as any.
>
>
>> I mean, you can just as well return any of the negative values -ENODEV,
>> -EBUSY, -EINVAL too, depending on where you exit. Having three different
>> retvals could tell you where exactly it failed, even.
>>
>
> That's true, those values are already being returned to user space as
> the result of the ioctl.
>
>> But the question remains: why does userspace need to know that the
>> failure happened and firmware wasn't called, as long as it is getting
>> something negative to signal an error?
>>
>
> I hope the above discussion is clear that it's purely a defined
> "undefined" because being pickier about what to copy_to_user during
> exceptional circumstances in order to not overwrite the user's fw_err
> value seems an unnecessary amount of code.
>
>> Thx.
>>
>> --
>> Regards/Gruss,
>> Boris.
>>
>> https://people.kernel.org/tglx/notes-about-netiquette
>
>
>

2022-12-06 21:41:53

by Borislav Petkov

Subject: Re: [PATCH v8 1/4] crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL

On Mon, Dec 05, 2022 at 09:05:19AM -0800, Dionna Amalie Glaze wrote:
> Arguably it shouldn't ever get this value. We're just not very
> selective when we copy back the kernel copy of the ioctl argument.
> In all cases user space should treat the value as undefined, but still
> we don't want to leak uninitialized kernel stack values.

Absolutely.

> I've changed it to -1 to name the same kind of error across host and
> guest: the communication with the PSP didn't complete successfully, so
> the "error" value is not from the PSP.
> This value can also get returned to user space during a -ENOTTY result.
> We can call this NO_FW_CALL or UNDEFINED. I have no real preference.

Me neither as long as this is written down and agreed upon as a possible
value and not leaking kernel stack.

> Whatever value we set initially, the VMM can overwrite exitinfo2
> during the ghcb_hv_call.
> I'd rather that the "undefined" values were the same across both,
> because the guest is merely receiving a value from the host's PSP
> driver (or should be).
> It keeps the enum for return values a bit tidier and not concerned
> with whether the value is viewed from the host or guest.

Ack.

...

> I hope the above discussion is clear that it's purely a defined
> "undefined" because being pickier about what to copy_to_user during
> exceptional circumstances in order to not overwrite the user's fw_err
> value seems an unnecessary amount of code.

Ok, I think we're on the same page. So pls document that NO_FW_CALL or
so value and what it means and that thing should be taken care of.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette