2022-03-10 13:36:14

by Alejandro Jimenez

Subject: [RFC 0/3] Expose Confidential Computing capabilities on sysfs

Given the growing number of Confidential Computing features (AMD SME/SEV, Intel
TDX), I believe it is useful to expose the relevant state/parameters in sysfs.
For AMD memory encryption features, for example, the distinction between the
possible states (supported/enabled/active) is explained in the documentation at:

https://www.kernel.org/doc/Documentation/x86/amd-memory-encryption.txt

but there are currently no standard interfaces to determine the state and other
relevant info (e.g. the number of SEV ASIDs) besides searching dmesg or manually
reading various CPUID leaves and MSRs.

This patchset implements a sysfs interface where only the relevant attributes
are displayed depending on context (e.g. no SME entry or ASID attributes are
created when running as a guest).

On EPYC Milan host:

$ grep -r . /sys/kernel/mm/mem_encrypt/*
/sys/kernel/mm/mem_encrypt/c_bit_position:51
/sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
/sys/kernel/mm/mem_encrypt/sev/status:enabled
/sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
/sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
/sys/kernel/mm/mem_encrypt/sev_es/status:enabled
/sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
/sys/kernel/mm/mem_encrypt/sme/status:active

On SEV guest running on EPYC Milan host (displays only relevant entries):

$ grep -r . /sys/kernel/mm/mem_encrypt/*
/sys/kernel/mm/mem_encrypt/c_bit_position:51
/sys/kernel/mm/mem_encrypt/sev/status:active
/sys/kernel/mm/mem_encrypt/sev_es/status:unsupported

The full directory tree looks like:

/sys/kernel/mm/mem_encrypt/
├── c_bit_position
├── sev
│   ├── nr_asid_available
│   ├── nr_sev_asid
│   └── status
├── sev_es
│   ├── nr_asid_available
│   ├── nr_sev_es_asid
│   └── status
└── sme
    └── status

The goal is to be able to easily add new entries as new features (TDX, SEV-SNP)
are merged.

I'd appreciate any suggestions/comments.

Thank you,
Alejandro

Alejandro Jimenez (3):
x86: Expose Secure Memory Encryption capabilities in sysfs
x86: Expose SEV capabilities in sysfs
x86: Expose SEV-ES capabilities in sysfs

.../ABI/testing/sysfs-kernel-mm-mem-encrypt | 88 +++++
arch/x86/include/asm/mem_encrypt.h | 6 +
arch/x86/mm/mem_encrypt.c | 27 ++
arch/x86/mm/mem_encrypt_amd.c | 320 ++++++++++++++++++
4 files changed, 441 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt

--
2.34.1


2022-03-10 14:24:41

by Dave Hansen

Subject: Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs

On 3/9/22 14:06, Alejandro Jimenez wrote:
> On EPYC Milan host:
>
> $ grep -r . /sys/kernel/mm/mem_encrypt/*
> /sys/kernel/mm/mem_encrypt/c_bit_position:51

Why on earth would we want to expose this to userspace?

> /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
> /sys/kernel/mm/mem_encrypt/sev/status:enabled
> /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
> /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
> /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
> /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
> /sys/kernel/mm/mem_encrypt/sme/status:active

For all of this... What will userspace *do* with it?

For nr_asid_available, I get it. It tells you how many guests you can
still run. But, TDX will need the same logical thing. Should TDX hosts
go looking for this in:

/sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids

?

If it's something that's common, it needs to be somewhere common.

2022-03-10 16:02:39

by Alejandro Jimenez

Subject: [RFC 3/3] x86: Expose SEV-ES capabilities in sysfs

Expose the state of the SEV-ES feature via the new sysfs interface.
Document the new ABI.

Signed-off-by: Alejandro Jimenez <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
---
.../ABI/testing/sysfs-kernel-mm-mem-encrypt | 28 ++++++++++++-
arch/x86/mm/mem_encrypt_amd.c | 40 ++++++++++++++++++-
2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt b/Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt
index 68a932d4540b..ecd491c0a7bd 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt
@@ -49,7 +49,7 @@ Description: Expose status of sev feature. Valid values are:

inactive (Guest only): Running in unencrypted virtual machine.

-What: /sys/kernel/mm/mem_encrypt/sev/nr_asid_available
+What: /sys/kernel/mm/mem_encrypt/{sev,sev_es}/nr_asid_available
Date: March 2022
KernelVersion: 5.17
Description: (Host only) Total number of ASIDs available for encrypted
@@ -60,3 +60,29 @@ Date: March 2022
KernelVersion: 5.17
Description: (Host only) Number of ASIDs available for SEV guests with
SEV-ES disabled.
+
+What: /sys/kernel/mm/mem_encrypt/sev_es/status
+Date: March 2022
+KernelVersion: 5.17
+Description: Expose status of sev_es feature. Valid values are:
+
+ unsupported: Secure Encrypted Virtualization with Encrypted
+ State is not supported by the processor.
+
+ enabled (Host only): Hypervisor host capable of running SEV
+ guests.
+
+ disabled (Host only): Memory encryption has been disabled by
+ System-Configuration Register (SYSCFG) MemEncryptionModeEn bit.
+
+ active (Guest only): Running in virtual machine with encrypted
+ code, data, and guest register state.
+
+ inactive (Guest only): Running in virtual machine with
+ unencrypted register state.
+
+What: /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid
+Date: March 2022
+KernelVersion: 5.17
+Description: (Host only) Number of ASIDs available for SEV guests with
+ SEV-ES enabled.
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 86979e0e26c7..bafc34bf6121 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -39,6 +39,7 @@

#define AMD_SME_BIT BIT(0)
#define AMD_SEV_BIT BIT(1)
+#define AMD_SEV_ES_BIT BIT(3)

#define CC_ATTR_RO(_name) \
static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
@@ -98,7 +99,8 @@ static void encrypted_mem_caps_init(void)
cpuid(AMD_CPUID_ENCRYPTED_MEM, &eax, &ebx, &ecx, &edx);

cbit_pos = ebx & 0x3f;
- sec_encrypt_support_mask = eax & (AMD_SME_BIT | AMD_SEV_BIT);
+ sec_encrypt_support_mask = eax &
+ (AMD_SME_BIT | AMD_SEV_BIT | AMD_SEV_ES_BIT);

max_sev_asid = ecx;
min_sev_asid = edx;
@@ -174,6 +176,10 @@ static ssize_t status_show(struct kobject *kobj,
} else if (!strcmp(kobj->name, "sev")) {
return sev_status_show(AMD_SEV_BIT, X86_FEATURE_SEV,
CC_ATTR_GUEST_MEM_ENCRYPT, buf);
+
+ } else if (!strcmp(kobj->name, "sev_es")) {
+ return sev_status_show(AMD_SEV_ES_BIT, X86_FEATURE_SEV_ES,
+ CC_ATTR_GUEST_STATE_ENCRYPT, buf);
}

/*
@@ -210,6 +216,18 @@ static ssize_t nr_sev_asid_show(struct kobject *kobj,
}
CC_ATTR_RO(nr_sev_asid);

+static ssize_t nr_sev_es_asid_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ unsigned int nr_sev_es_asid = 0;
+
+ if (min_sev_asid)
+ nr_sev_es_asid = min_sev_asid - 1;
+
+ return sysfs_emit(buf, "%u\n", nr_sev_es_asid);
+}
+CC_ATTR_RO(nr_sev_es_asid);
+
static struct attribute *sme_attrs[] = {
&status_attr.attr,
NULL,
@@ -236,16 +254,36 @@ static const struct attribute_group sev_guest_attr_group = {
.attrs = sev_guest_attrs,
};

+static struct attribute *sev_es_host_attrs[] = {
+ &status_attr.attr,
+ &nr_asid_available_attr.attr,
+ &nr_sev_es_asid_attr.attr,
+ NULL,
+};
+static const struct attribute_group sev_es_host_attr_group = {
+ .attrs = sev_es_host_attrs,
+};
+
+static struct attribute *sev_es_guest_attrs[] = {
+ &status_attr.attr,
+ NULL,
+};
+static const struct attribute_group sev_es_guest_attr_group = {
+ .attrs = sev_es_guest_attrs,
+};
+
/* List of features to be exposed when running as hypervisor host */
static struct amd_cc_feature host_cc_feat_list[] = {
AMD_CC_FEATURE("sme", sme_attr_group, NULL),
AMD_CC_FEATURE("sev", sev_host_attr_group, NULL),
+ AMD_CC_FEATURE("sev_es", sev_es_host_attr_group, NULL),
{},
};

/* List of features to be exposed when running as guest */
static struct amd_cc_feature guest_cc_feat_list[] = {
AMD_CC_FEATURE("sev", sev_guest_attr_group, NULL),
+ AMD_CC_FEATURE("sev_es", sev_es_guest_attr_group, NULL),
{},
};

--
2.34.1

2022-03-11 20:35:50

by Alejandro Jimenez

Subject: Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs


On 3/9/2022 5:40 PM, Dave Hansen wrote:
> On 3/9/22 14:06, Alejandro Jimenez wrote:
>> On EPYC Milan host:
>>
>> $ grep -r . /sys/kernel/mm/mem_encrypt/*
>> /sys/kernel/mm/mem_encrypt/c_bit_position:51
> Why on earth would we want to expose this to userspace?
>
>> /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
>> /sys/kernel/mm/mem_encrypt/sev/status:enabled
>> /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
>> /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
>> /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
>> /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
>> /sys/kernel/mm/mem_encrypt/sme/status:active
> For all of this... What will userspace *do* with it?

In my case, this information was useful for debugging failures when testing
the various features (e.g. needing to specify the cbitpos property on the
QEMU sev-guest object).

It helps provide an account of what is currently supported/enabled/active on
the host/guest, given that some of these capabilities interact with other
components and cause boot hangs or errors (e.g. AVIC+SME or AVIC+SEV hangs at
boot, and SEV guests with some configurations need an increased SWIOTLB
limit).

The sysfs entry basically answers the questions in
https://github.com/AMDESE/AMDSEV#faq without needing to run
virsh/qmp-shell/rdmsr.

I am aware that having a new sysfs entry mostly to facilitate debugging might
not be warranted, so I have tagged this as an RFC to ask whether others
working in this space have found additional use cases, or simply want the
convenience of having the data for current and future CoCo features in a
single location.

>
> For nr_asid_available, I get it. It tells you how many guests you can
> still run. But, TDX will need the same logical thing. Should TDX hosts
> go looking for this in:
>
> /sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids
>
> ?
>
> If it's something that's common, it needs to be somewhere common.
I think it makes sense to have common attributes for all CoCo providers
under /sys/kernel/mm/mem_encrypt/. The various CoCo providers can create
entries under mem_encrypt/<feature> exposing the information relevant to
their specific features, as these patches implement for the AMD case, and
populate or link the <common_attr> attribute with the appropriate value.

Then we can have:

/sys/kernel/mm/mem_encrypt/
-- common_attr
-- sme/
-- sev/
-- sev_es/

or:

/sys/kernel/mm/mem_encrypt/
-- common_attr
-- tdx/

Note that at any given time we only create entries that are applicable to the
hardware we are running on, so there is never a mix of tdx and sme/sev
subdirs.

I suspect it will be difficult to agree on what is "common" or even on a
descriptive name. Let's say this common attribute will be:

        /sys/kernel/mm/mem_encrypt/common_key

Where common_key can represent AMD SEV ASIDs/AMD SEV-{ES,SNP} ASIDs, or
Intel TDX KeyIDs (private/shared), or s390x SEID (Secure Execution IDs),
or <insert relevant ARM CCA attribute>.

We can have a (probably long) discussion to agree on the above; this
patchset just attempts to provide a framework for registering different
providers, and implements the AMD current capabilities.

Thank you,
Alejandro

2022-03-15 11:03:21

by Kai Huang

Subject: Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs


>
> More concretely
> - CPU feature (Secure Arbitration Mode: SEAM) as "seam" flag in /proc/cpuinfo

In my current patchset we don't have a "seam" flag in /proc/cpuinfo.

https://lore.kernel.org/kvm/[email protected]/T/#m02542eb723394a81c35b9542b2763c783222d594

The TDX architecture doesn't have a CPUID bit to report SEAM, so we would need
a synthetic flag if we want to add one. If userspace has a requirement to use
it, then it makes sense to add it and expose it in /proc/cpuinfo. But so far I
don't know of any.

Thanks
-Kai


2022-03-16 10:27:48

by Dave Hansen

Subject: Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs

On 3/14/22 15:43, Isaku Yamahata wrote:
> xfam_fixed0 fixed-0 value for TD xfam value as a
> hexadecimal number with the "0x" prefix.
> xfam_fixed1 fixed-1 value for TD xfam value as a
> hexadecimal number with the "0x" prefix.

I don't think we should be exporting things and creating ABI just for
the heck of it. These are a prime example. XFAM is reported to the
guest in CPUID. Yes, these may be used to help *build* XFAM, but
userspace doesn't need to know how XFAM was built. It just needs to
know what features it can use.


2022-03-17 03:34:02

by Isaku Yamahata

Subject: Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs

Added [email protected].

On Thu, Mar 10, 2022 at 01:07:33PM -0500,
Alejandro Jimenez <[email protected]> wrote:

>
> On 3/9/2022 5:40 PM, Dave Hansen wrote:
> > On 3/9/22 14:06, Alejandro Jimenez wrote:
> > > On EPYC Milan host:
> > >
> > > $ grep -r . /sys/kernel/mm/mem_encrypt/*
> > > /sys/kernel/mm/mem_encrypt/c_bit_position:51
> > Why on earth would we want to expose this to userspace?
> >
> > > /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
> > > /sys/kernel/mm/mem_encrypt/sev/status:enabled
> > > /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
> > > /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
> > > /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
> > > /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
> > > /sys/kernel/mm/mem_encrypt/sme/status:active
> > For all of this... What will userspace *do* with it?
>
> In my case, this information was useful to know for debugging failures when
> testing the various features (e.g. need to specify cbitpos property on QEMU
> sev-guest object).
>
> It helps get an account of what is currently supported/enabled/active on the
> host/guest, given that some of these capabilities will interact with other
> components and cause boot hangs or errors (e.g. AVIC+SME or AVIC+SEV hangs
> at boot, SEV guests with some configurations need to increase SWIOTLB
> limit).
>
> The sysfs entry basically answers the questions in
> https://github.com/AMDESE/AMDSEV#faq without needing to run
> virsh/qmp-shell/rdmsr.
>
> I am aware that having a new sysfs entry mostly to facilitate debugging
> might not be warranted, so I have tagged this as an RFC to ask if others
> working in this space have found additional use cases, or just want the
> convenience of having the data for current and future CoCo features in a
> single location.
> >
> > For nr_asid_available, I get it. It tells you how many guests you can
> > still run. But, TDX will need the same logical thing. Should TDX hosts
> > go looking for this in:
> >
> > /sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids
> >
> > ?
> >
> > If it's something that's common, it needs to be somewhere common.
> I think it makes sense to have common attributes for all CoCo providers
> under /sys/kernel/mm/mem_encrypt/. The various CoCo providers can create
> entries under mem_encrypt/<feature> exposing the information relevant to
> their specific features like these patches implement for the AMD case, and
> populate or link the <common_attr> attribute with the appropriate value.
>
> Then we can have:
>
> /sys/kernel/mm/mem_encrypt/
> -- common_attr
> -- sme/
> -- sev/
> -- sev_es/
>
> or:
>
> /sys/kernel/mm/mem_encrypt/
> -- common_attr
> -- tdx/
>
> Note that at any single time, we are only creating entries that are
> applicable to the hardware we are running on, so there is not a mix of tdx
> and sme/sev subdirs.
>
> I suspect it will be difficult to agree on what is "common" or even a
> descriptive name. Let's say this common attribute will be:
>
>         /sys/kernel/mm/mem_encrypt/common_key
>
> Where common_key can represent AMD SEV ASIDs/AMD SEV-{ES,SNP} ASIDs, or
> Intel TDX KeyIDs (private/shared), or s390x SEID (Secure Execution IDs), or
> <insert relevant ARM CCA attribute>.
>
> We can have a (probably long) discussion to agree on the above; this
> patchset just attempts to provide a framework for registering different
> providers, and implements the AMD current capabilities.

The number of available Key IDs (TDX KeyID or whatever it is called) can be
common. Probably the common misc cgroup is desirable. I don't see any other
common things, though. I have no requirement to expose the bit position, etc.

TDX requires firmware components that provide information about themselves.
Because they are firmware, I'm going to use /sys/firmware/tdx.

More concretely:
- CPU feature (Secure Arbitration Mode: SEAM) as a "seam" flag in /proc/cpuinfo
- TDX firmware (P-SEAMLDR and TDX module) information in /sys/firmware/tdx/

What: /sys/firmware/tdx/
Description:
Intel's Trust Domain Extensions (TDX) protect guest VMs from
malicious hosts and some physical attacks. This directory
represents the entry point directory for TDX.

TDX requires its firmware to be loaded into an isolated memory
region, using a two-step loading process: a first-phase firmware
loader (a.k.a. NP-SEAMLDR) loads the next loader, and the
second-phase firmware loader (a.k.a. P-SEAMLDR) loads the TDX
firmware (a.k.a. the "TDX module").
=============== ================================================
keyid_num the number of SEAM KeyIDs as a hexadecimal
number with the "0x" prefix.
=============== ================================================
Users: libvirt

What: /sys/firmware/tdx/p_seamldr/
Description:
The P-SEAMLDR is the TDX module loader. The P-SEAMLDR comes
with attributes (vendor_id, build_date, build_num, minor
version, major version) to identify itself.

Provides the information about the P-SEAMLDR loaded on the
platform. This directory exists if the P-SEAMLDR is
successfully loaded. It contains the following read-only files.
The information corresponds to the data structure, SEAMLDR_INFO.
The admins or VMM management software like libvirt can refer to
that information, determine if P-SEAMLDR is supported, and
identify the loaded P-SEAMLDR.

=============== ================================================
version structure version of SEAMLDR_INFO as a
hexadecimal number with the "0x" prefix;
currently "0x0".
attributes 32bit flags as a hexadecimal number with the
"0x" prefix.
Bit 31 - Production-worthy (0) or
debug (1).
Bits 30:0 - Reserved 0.
vendor_id Vendor ID as a hexadecimal number with the "0x"
prefix.
"0x0806" (Intel P-SEAMLDR module).
build_date Build date in yyyy.mm.dd BCD format.
build_num Build number as a hexadecimal number with the
"0x" prefix.
minor Minor version number as a hexadecimal number
with the "0x" prefix.
major Major version number as a hexadecimal number
with the "0x" prefix.
seaminfo The SEAM information of the TDX module currently
loaded, as a binary file.
seam_ready A boolean flag that indicates that a debuggable
TDX module can be loaded as a hexadecimal number
with the "0x" prefix.
p_seamldr_ready A boolean flag that indicates that the P-SEAMLDR
module is ready for SEAMCALLs as a hexadecimal
number with the "0x" prefix.
=============== ================================================
Users: libvirt

What: /sys/firmware/tdx/tdx_module/
Description:
TDX requires a firmware known as the TDX module. It comes
with attributes (vendor_id, build_date, build_num,
minor_version, major_version, etc.) to identify itself.

Provides the information about the TDX module loaded on the
platform. It contains the following read-only files. The
information corresponds to the data structure, TDSYSINFO_STRUCT.
The admins or VMM management software like libvirt can refer to
that information, determine if TDX is supported, and identify
the loaded TDX module.

================== ============================================
status string of the TDX module status.
"unknown"
"none": the TDX module is not loaded
"loaded": The TDX module is loaded, but not
initialized
"initialized": the TDX module is fully
initialized
"shutdown": the TDX module was shut down due to
an error during initialization.
attributes 32bit flags of the TDX module attributes as
a hexadecimal number with the "0x" prefix.
Bit 31 - a production module (0) or
a debug module (1).
Bits 30:0 - Reserved, set to 0.
vendor_id vendor ID as a hexadecimal number with the
"0x" prefix.
build_date build date in yyyymmdd BCD format.
build_num build number as a hexadecimal number with
the "0x" prefix.
minor_version minor version as a hexadecimal number with
the "0x" prefix.
major_version major versionas a hexadecimal number with
the "0x" prefix.
attributes_fixed0 fixed-0 value for TD's attributes as a
hexadecimal number with the "0x" prefix.
attributes_fixed1 fixed-1 value for TD's attributes as a
hexadecimal number with the "0x" prefix.
xfam_fixed0 fixed-0 value for TD xfam value as a
hexadecimal number with the "0x" prefix.
xfam_fixed1 fixed-1 value for TD xfam value as a
hexadecimal number with the "0x" prefix.
================== =============================================

--
Isaku Yamahata <[email protected]>