LinuxLists.cc - [PATCH] docs: security: Confidential computing intro and threat model

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

Hello Greg,

On 3/29/23 5:40 AM, Greg KH wrote:
> On Mon, Mar 27, 2023 at 09:18:16AM -0500, Carlos Bilbao wrote:
>> Kernel developers working on confidential computing operate under a set of
>> assumptions regarding the Linux kernel threat model that differ from the
>> traditional view. In order to effectively engage with the linux-coco
>> mailing list and contribute to ongoing kernel efforts, one must have a
>> thorough familiarity with these concepts. Add a concise,
>> architecture-agnostic introduction and threat model to provide a reference
>> for ongoing design discussions and to help developers gain a foundational
>> understanding of the subject.
>
> Thanks for putting this together. Some questions below:

Thanks for looking into this!

>
>> +The basic CoCo layout includes the host, guest, the interfaces that
>> +communicate guest and host, a platform capable of supporting CoCo, and an
>> +intermediary between the guest virtual machine (VM) and the underlying
>> +platform that acts as security manager::
>> +
>> + +-------------------+ +-----------------------+
>> + | CoCo guest VM |<---->| |
>> + +-------------------+ | |
>> + | Interfaces | | CoCo security manager |
>> + +-------------------+ | |
>> + | Host VMM |<---->| |
>> + +-------------------+ | |
>> + | |
>> + +--------------------+ | |
>> + | CoCo platform |<--->| |
>> + +--------------------+ +-----------------------+
>
> I do not understand, what are the "<--->" lines representing? Function
> calls? APIs? something else?

The "<--->" lines in the diagram represent bidirectional communication
channels or interfaces between the CoCo security manager and the rest of
the components (guest, host, hardware). It's a graphical way to represent
data flow that I think will help some people.

>
>> +The specific details of the CoCo intermediary vastly diverge between
>> +technologies, so much so that in some cases it will be HW and in others
>> +SW.
>> +
>> +Existing Linux kernel threat model
>> +==================================
>> +
>> +The components of the current Linux kernel threat model are::
>> +
>> + +-----------------------+ +-------------------+
>> + | |<---->| Userspace |
>> + | | +-------------------+
>> + | External attack | | Interfaces |
>> + | vectors | +-------------------+
>> + | |<---->| Linux Kernel |
>> + | | +-------------------+
>> + +-----------------------+ +-------------------+
>> + | Bootloader/BIOS |
>> + +-------------------+
>> + +-------------------+
>> + | HW platform |
>> + +-------------------+
>
> Again, what do the "<---->" lines mean? there's no talking between the
> bootloader and the kernel? What about the kernel talking to the HW
> without the BIOS (as is most of the time)? What is "Interfaces"?

The "<---->" arrows here also represent the direction of data flow, in
this case between external attackers and userspace or kernel.

Yes, there is communication between the bootloader and the kernel during
the boot process, but this diagram does not represent it explicitly.

The "Interfaces" box represents the various interfaces that allow
communication between kernel and userspace. This includes system calls,
kernel APIs, device drivers, etc.

>
> And "external attack vectors" is odd, how can they get to the kernel
> without going through userspace?
>

It is true that in most cases external attackers will try to exploit
vulnerabilities in userspace first, but it is possible for an attacker to
directly target the kernel, particularly if the attacker has physical access.
Examples of direct kernel attacks include the vulnerabilities
CVE-2019-19524, CVE-2022-0435 [1] and CVE-2020-24490 [2].

Anyway, the main point we aimed to convey is that the kernel is
vulnerable to external attack vectors, whether they are direct attacks or
rely on userspace privilege escalation. In either case, the security of the
kernel can be compromised and this must be considered in the threat model.

>> +The existing Linux kernel threat model typically assumes execution on a
>> +trusted HW platform with all of the firmware and bootloaders included on
>> +its TCB. The primary attacker resides in the userspace and all of the data
>> +coming from there is generally considered untrusted, unless userspace is
>> +privileged enough to perform trusted actions. In addition, external
>> +attackers are typically considered, including those with access to enabled
>> +external networks (e.g. Ethernet, Wireless, Bluetooth), exposed hardware
>> +interfaces (e.g. USB, Thunderbolt), and the ability to modify the contents
>> +of disks offline.
>
> I can not parse that last sentence well, sorry. What is in addition?
> What are you trying to say, some hardware the kernel trusts and some it
> doesn't? Note there are different "levels" of trust for hardware as
> well (i.e. we attempt to accept any USB configuration header and treat
> that as untrusted but the USB data path we totally trust.)

Yes, this relates to my answer above; "In addition" introduces attackers
beyond the primary one residing in userspace. These additional vectors
matter in CoCo because the machine owner has access to components like
external networks, exposed hardware interfaces (such as USB and
Thunderbolt), and the disk itself.

The paragraph does not explicitly state that the kernel trusts some
hardware and not others. But yes, that happens in CoCo. For example, AMD
has the AMD Secure Processor (ASP), which is part of the TCB of the AMD
Secure Encrypted Virtualization (SEV) tech. In addition, the kernel may
trust some hardware if certain configuration is enforced.

>
>> +Confidential Computing threat model and security objectives
>> +===========================================================
>> +
>> +Confidential Cloud Computing adds a new type of attacker to the above list:
>> +an untrusted and potentially malicious host. This can be viewed as a more
>> +powerful type of external attacker, as it resides locally on the same
>> +physical machine, in contrast to a remote network attacker, and has control
>> +over the guest kernel communication with most of the HW::
>> +
>> + +------------------------+
>> + | CoCo guest VM |
>> + +-----------------------+ | +-------------------+ |
>> + | |<--->| | Userspace | |
>> + | | | +-------------------+ |
>> + | External attack | | | Interfaces | |
>> + | vectors | | +-------------------+ |
>> + | |<--->| | Linux Kernel | |
>> + | | | +-------------------+ |
>> + +-----------------------+ | +-------------------+ |
>> + | | Bootloader/BIOS | |
>> + +-----------------------+ | +-------------------+ |
>> + | |<--->+------------------------+
>> + | | | Interfaces |
>> + | | +------------------------+
>> + | CoCo security |<--->| Host VMM |
>> + | manager | +------------------------+
>> + | | +------------------------+
>> + | |<--->| CoCo platform |
>> + +-----------------------+ +------------------------+
>
> Again, I don't understand the layers or <---> here, sorry.

Ditto, data flow/interaction.

>
>> +While the traditional hypervisor has unlimited access to guest data and
>> +can leverage this access to attack the guest, the CoCo systems mitigate
>> +such attacks by adding security features like guest data confidentiality
>> +and integrity protection. This threat model assumes that those features
>> +are available and intact.
>> +
>> +The **Linux kernel CoCo security objectives** can be summarized as follows:
>> +
>> +1. Preserve the confidentiality and integrity of CoCo guest private memory.
>
> Confidentiality from/to what? Itself? Someone else? Userspace?

Confidentiality from everyone that is not explicitly authorized by the
guest. Not itself; the guest can access its own contents.

>
>> +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
>
> But a host has to have privileges in order to create/destroy/sleep the
> guest, right?

While it is true that the host system requires some level of privilege to
create, destroy, or pause the guest, part of the goal of preventing
privilege escalation is to ensure that these operations do not provide a
pathway for attackers to gain access to the guest's kernel.

BTW, abusing KVM's ability to perform these operations to somehow gain
access to the guest would be another example of an attack vector that does
not go through userspace.

>
>> +
>> +The above security objectives result in two primary **Linux kernel CoCo
>> +assets**:
>> +
>> +1. Guest kernel execution context.
>> +2. Guest kernel private memory.
>> +
>> +The host retains full control over the CoCo guest resources and can deny
>> +access to them at any time. Because of this, the host Denial of Service
>> +(DoS) attacks against CoCo guests are beyond the scope of this threat
>> +model.
>
> So all resources provided by the host to the guest are trusted? Or are
> not trusted? Confused...

I think I should clarify here that by 'resources' we meant things like
CPU time, memory that the guest can consume, network bandwidth, etc. So
the notion of "trust" is not applicable here. In this case, we would talk
about availability, which is not a CoCo guarantee.

>
>> +The **Linux CoCo attack surface** is any interface exposed from a CoCo
>> +guest Linux kernel towards an untrusted host that is not covered by the
>> +CoCo technology SW/HW protections.
>
> "not covered by" is an odd way to say "we trust lots of things, but not
> all", right? If not, I don't understand again.

We trust anything that the host is not able to manipulate (e.g., dedicated
CoCo HW); the rest (e.g., bootloader config) we validate and/or protect
against.

>
>> This includes any possible
>> +side-channels, as well as transient execution side channels. Examples of
>> +explicit (not side-channel) interfaces include accesses to port I/O, MMIO
>> +and DMA interfaces, access to PCI configuration space, VMM-specific
>> +hypercalls, access to shared memory pages, interrupts allowed to be
>> +injected to the guest kernel by the host, as well as CoCo technology
>> +specific hypercalls.
>
> So all of those things are trusted? Or are not trusted? Again, I'm
> confused. And who is trusting, or not trusting them? The host? The
> guest?

Same answer. It's worth noting that it's not that the guest does not
"trust" ACPI tables, it's that the guest does not trust the host that
provides them. Depending on the interface, different actions are taken,
and that's where the mitigation matrix comes into play.
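
To make "different actions are taken" a bit more concrete, here is a rough
sketch (not part of the patch) of the hardening idea: treat everything read
from a host-controlled buffer as hostile and validate it before use. The
message layout, the MAX_PAYLOAD limit and the KNOWN_TYPES set are all
hypothetical and do not correspond to any real kernel interface.

```python
import struct

# Hypothetical wire layout: message type (u16) and payload length (u16),
# both little-endian. None of this mirrors a real host/guest ABI.
HEADER = struct.Struct("<HH")
MAX_PAYLOAD = 256
KNOWN_TYPES = {1, 2}


def parse_host_message(buf):
    """Validate a host-supplied, bytes-like buffer before any field is used."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    # Read the header exactly once; later checks use these local copies.
    msg_type, length = HEADER.unpack_from(buf, 0)
    if msg_type not in KNOWN_TYPES:
        raise ValueError("unknown message type")
    if length > MAX_PAYLOAD or HEADER.size + length > len(buf):
        raise ValueError("length field exceeds buffer")
    # Copy the payload out so later writes to a shared buffer cannot
    # change what the caller sees after validation.
    payload = bytes(buf[HEADER.size:HEADER.size + length])
    return msg_type, payload


good = HEADER.pack(1, 3) + b"abc"
print(parse_host_message(good))  # (1, b'abc')
```

A buffer whose length field claims more bytes than are present, or whose type
is unknown, is rejected with ValueError instead of being processed.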

>
>> Additionally, the host in a CoCo system typically
>> +controls the process of creating a CoCo guest: it has a method to load
>> +into a guest the firmware and bootloader images, the kernel image
>> +together with the kernel command line. All of this data should also be
>> +considered untrusted until its integrity and authenticity is established.
>
> Who does the authentication? The host? The guest? Through what
> channel?
>

The authentication is carried out by the CoCo security manager, which is
the intermediary between the guest and the underlying platform, in
cooperation with some external third party. The security manager is in
charge of verifying the integrity and authenticity of all this (firmware,
bootloader, kernel, and other data loaded into the guest) before the
normal execution of the guest. The verification is usually based on digital
signatures or other forms of cryptographic authentication. The specific
details on how this will happen in the kernel are still an ongoing
discussion, AFAIK.
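
As a loose illustration of what "verifying integrity" can look like (this is
a generic hash-chain sketch, not how SEV-SNP, TDX or any specific security
manager computes its launch measurement), each loaded component is folded
into a running digest and the result is compared against an expected value.
The component contents and the function names are made up for the example.

```python
import hashlib
import hmac


def measure(components):
    """Fold each component into a running SHA-384 digest, in load order."""
    digest = bytes(48)  # start from an all-zero measurement
    for blob in components:
        digest = hashlib.sha384(digest + hashlib.sha384(blob).digest()).digest()
    return digest


def verify_launch(components, expected):
    """Return True only if the measured chain matches the expected value."""
    return hmac.compare_digest(measure(components), expected)


# Hypothetical stand-ins for the images the host loads into the guest.
firmware = b"hypothetical firmware image"
kernel = b"hypothetical kernel image"
cmdline = b"console=ttyS0 ro"

expected = measure([firmware, kernel, cmdline])
print(verify_launch([firmware, kernel, cmdline], expected))  # True
print(verify_launch([firmware, kernel, b"console=ttyS0 rw init=/bin/sh"], expected))  # False
```

Because the digest is chained, changing any component or the order in which
components are loaded produces a different measurement.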

>> +The table below shows a threat matrix for the CoCo guest Linux kernel with
>> +the potential mitigation strategies. The matrix refers to CoCo-specific
>> +versions of the guest, host and platform.
>> +
>> +.. list-table:: CoCo Linux guest kernel threat matrix
>> + :widths: auto
>> + :align: center
>> + :header-rows: 1
>> +
>> + * - Threat name
>> + - Threat description
>> + - Mitigation strategy
>> +
>> + * - Guest malicious configuration
>> + - A malicious host modifies one of the following guest's
>> + configuration:
>> +
>> + 1. Guest firmware or bootloader
>> +
>> + 2. Guest kernel or module binaries
>> +
>> + 3. Guest command line parameters
>> +
>> + This allows the host to break the integrity of the code running
>> + inside a CoCo guest and violate the CoCo security objectives.
>
> So hosts are not allowed to change this? I don't understand the use of
> "violate" here, sorry.

The host is capable of altering the configurations of these components.
Attestation can help identify these attacks.

>
>> + - The integrity of the guest's configuration passed via untrusted host
>> + must be ensured by methods such as remote attestation and signing.
>> + This should be largely transparent to the guest kernel and would
>> + allow it to assume a trusted state at the time of boot.
>
> How can it be transparent if the guest has to do this? If the guest
> isn't doing it, who is? Can configuration be changed while the guest is
> running?

See prior answer.

>
>> +
>> + * - CoCo guest data attacks
>> + - A malicious host retains full control of the CoCo guest's data
>> + in-transit between the guest and the host-managed physical or
>> + virtual devices. This allows any attack against confidentiality,
>> + integrity or freshness of such data.
>> + - The CoCo guest is responsible for ensuring the confidentiality,
>> + integrity and freshness of such data using well-established
>> + security mechanisms. For example, for any guest external network
>> + communications that are passed via the untrusted host, an end-to-end
>> + secure session must be established between a guest and a trusted
>> + remote endpoint using well-known protocols such as TLS.
>> + This requirement also applies to protection of the guest's disk
>> + image.
>
> So you trust all I/O into the guest by virtue of it having to be
> encrypted/protected somehow at the data layer? So the guest kernel
> doesn't have to worry about the data contents it is receiving any more
> than it does today?

Right, if there are properly secured control structures for the communication
channels, then the payload should be fine.
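
For what it's worth, here is a minimal sketch of the integrity and freshness
part of that idea, assuming the guest and a trusted remote endpoint already
share a key. In practice you would use TLS or an AEAD cipher, which also
gives confidentiality; this example deliberately leaves encryption out and
only shows a MAC plus a sequence counter. The record format and the names
seal() and Receiver are hypothetical.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # size of an HMAC-SHA256 tag


def seal(key, seq, payload):
    """Prefix the payload with a sequence number and append a MAC over both."""
    header = struct.pack(">Q", seq)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


class Receiver:
    """Accepts records only if they are authentic and arrive in order."""

    def __init__(self, key):
        self.key = key
        self.next_seq = 0

    def open(self, record):
        if len(record) < 8 + TAG_LEN:
            raise ValueError("record too short")
        body, tag = record[:-TAG_LEN], record[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("integrity check failed")
        (seq,) = struct.unpack(">Q", body[:8])
        if seq != self.next_seq:
            # A replayed or reordered record fails the freshness check.
            raise ValueError("stale or out-of-order record")
        self.next_seq += 1
        return body[8:]


key = b"k" * 32  # placeholder; a real key comes from a key exchange
rx = Receiver(key)
record = seal(key, 0, b"hello")
print(rx.open(record))  # b'hello'
```

If the host replays the same record, the second open() call raises ValueError
because the sequence number is no longer the expected one; a modified record
fails the MAC check before the sequence number is even looked at.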

>
> I'm stopping here, sorry...

Hopefully I was able to provide some clarity.

>
> greg k-h
>

Thanks,
Carlos

[1] https://www.appgate.com/blog/a-remote-stack-overflow-in-the-linux-kernel
[2] https://google.github.io/security-research/pocs/linux/bleedingtooth/writeup.html

2023-04-21 21:15:35

by Kaplan, David

Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

[AMD Official Use Only - General]

> -----Original Message-----
> From: Carlos Bilbao <[email protected]>
> Sent: Monday, March 27, 2023 9:18 AM
> To: [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; Giani, Dhaval
> <[email protected]>; Day, Michael <[email protected]>; Paluri,
> PavanKumar (Pavan Kumar) <[email protected]>; Kaplan, David
> <[email protected]>; Lal, Reshma <[email protected]>; Powell,
> Jeremy <[email protected]>;
> [email protected];
> [email protected]; Lendacky, Thomas
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; Bilbao, Carlos
> <[email protected]>
> Subject: [PATCH] docs: security: Confidential computing intro and threat
> model
>
> Kernel developers working on confidential computing operate under a set of
> assumptions regarding the Linux kernel threat model that differ from the
> traditional view. In order to effectively engage with the linux-coco mailing list
> and contribute to ongoing kernel efforts, one must have a thorough
> familiarity with these concepts. Add a concise, architecture-agnostic
> introduction and threat model to provide a reference for ongoing design
> discussions and to help developers gain a foundational understanding of the
> subject.
>
> Acked-by: Dave Hansen <[email protected]>
> Co-developed-by: Elena Reshetova <[email protected]>
> Signed-off-by: Elena Reshetova <[email protected]>
> Signed-off-by: Carlos Bilbao <[email protected]>
> ---
> .../security/confidential-computing.rst | 245 ++++++++++++++++++
> Documentation/security/index.rst | 1 +
> MAINTAINERS | 6 +
> 3 files changed, 252 insertions(+)
> create mode 100644 Documentation/security/confidential-computing.rst
>
> diff --git a/Documentation/security/confidential-computing.rst b/Documentation/security/confidential-computing.rst
> new file mode 100644
> index 000000000000..98439ef7ff9f
> --- /dev/null
> +++ b/Documentation/security/confidential-computing.rst
> @@ -0,0 +1,245 @@
> +===============================
> +Confidential Computing in Linux
> +===============================
> +
> +.. contents:: :local:
> +
> +By: Elena Reshetova <[email protected]> and Carlos Bilbao
> +<[email protected]>
> +
> +Motivation
> +==========
> +
> +Kernel developers working on confidential computing for the cloud
> +operate under a set of assumptions regarding the Linux kernel threat
> +model that differ from the traditional view. In order to effectively
> +engage with the linux-coco mailing list and contribute to its
> +initiatives, one must have a thorough familiarity with these concepts.
> +This document provides a concise, architecture-agnostic introduction to
> +help developers gain a foundational understanding of the subject.
> +
> +Overview and terminology
> +========================
> +
> +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
> +virtualization technologies that allow Cloud Service Providers (CSPs)
> +to provide stronger security guarantees to their clients (usually
> +referred to as tenants) by excluding all the CSP's infrastructure and
> +SW out of the tenant's Trusted Computing Base (TCB).
> +
> +While the concrete implementation details differ between technologies,
> +all of these mechanisms provide increased confidentiality and integrity
> +of CoCo guest memory and execution state (vCPU registers), more tightly
> +controlled guest interrupt injection, as well as some additional
> +mechanisms to control guest-host page mapping. More details on the
> +x86-specific solutions can be found in :doc:`Intel Trust Domain
> +Extensions (TDX) </x86/tdx>` and :doc:`AMD Memory Encryption
> +</x86/amd-memory-encryption>`.
> +
> +The basic CoCo layout includes the host, guest, the interfaces that
> +communicate guest and host, a platform capable of supporting CoCo, and
> +an intermediary between the guest virtual machine (VM) and the
> +underlying platform that acts as security manager::
> +
> + +-------------------+ +-----------------------+
> + | CoCo guest VM |<---->| |
> + +-------------------+ | |
> + | Interfaces | | CoCo security manager |
> + +-------------------+ | |
> + | Host VMM |<---->| |
> + +-------------------+ | |
> + | |
> + +--------------------+ | |
> + | CoCo platform |<--->| |
> + +--------------------+ +-----------------------+
> +
> +The specific details of the CoCo intermediary vastly diverge between
> +technologies, so much so that in some cases it will be HW and in others
> +SW.
> +
> +Existing Linux kernel threat model
> +==================================
> +
> +The components of the current Linux kernel threat model are::
> +
> + +-----------------------+ +-------------------+
> + | |<---->| Userspace |
> + | | +-------------------+
> + | External attack | | Interfaces |
> + | vectors | +-------------------+
> + | |<---->| Linux Kernel |
> + | | +-------------------+
> + +-----------------------+ +-------------------+
> + | Bootloader/BIOS |
> + +-------------------+
> + +-------------------+
> + | HW platform |
> + +-------------------+
> +
> +The existing Linux kernel threat model typically assumes execution on a
> +trusted HW platform with all of the firmware and bootloaders included
> +on its TCB. The primary attacker resides in the userspace and all of
> +the data coming from there is generally considered untrusted, unless
> +userspace is privileged enough to perform trusted actions. In addition,
> +external attackers are typically considered, including those with
> +access to enabled external networks (e.g. Ethernet, Wireless,
> +Bluetooth), exposed hardware interfaces (e.g. USB, Thunderbolt), and
> +the ability to modify the contents of disks offline.
> +
> +Confidential Computing threat model and security objectives
> +===========================================================
> +
> +Confidential Cloud Computing adds a new type of attacker to the above list:
> +an untrusted and potentially malicious host. This can be viewed as a
> +more powerful type of external attacker, as it resides locally on the
> +same physical machine, in contrast to a remote network attacker, and
> +has control over the guest kernel communication with most of the HW::
> +
> + +------------------------+
> + | CoCo guest VM |
> + +-----------------------+ | +-------------------+ |
> + | |<--->| | Userspace | |
> + | | | +-------------------+ |
> + | External attack | | | Interfaces | |
> + | vectors | | +-------------------+ |
> + | |<--->| | Linux Kernel | |
> + | | | +-------------------+ |
> + +-----------------------+ | +-------------------+ |
> + | | Bootloader/BIOS | |
> + +-----------------------+ | +-------------------+ |
> + | |<--->+------------------------+
> + | | | Interfaces |
> + | | +------------------------+
> + | CoCo security |<--->| Host VMM |
> + | manager | +------------------------+
> + | | +------------------------+
> + | |<--->| CoCo platform |
> + +-----------------------+ +------------------------+
> +
> +While the traditional hypervisor has unlimited access to guest data and
> +can leverage this access to attack the guest, the CoCo systems mitigate
> +such attacks by adding security features like guest data
> +confidentiality and integrity protection. This threat model assumes
> +that those features are available and intact.
> +
> +The **Linux kernel CoCo security objectives** can be summarized as follows:
> +
> +1. Preserve the confidentiality and integrity of CoCo guest private memory.
> +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
> +
> +The above security objectives result in two primary **Linux kernel CoCo
> +assets**:
> +
> +1. Guest kernel execution context.
> +2. Guest kernel private memory.
> +
> +The host retains full control over the CoCo guest resources and can
> +deny access to them at any time. Because of this, the host Denial of
> +Service (DoS) attacks against CoCo guests are beyond the scope of this
> +threat model.
> +
> +The **Linux CoCo attack surface** is any interface exposed from a CoCo
> +guest Linux kernel towards an untrusted host that is not covered by the
> +CoCo technology SW/HW protections. This includes any possible
> +side-channels, as well as transient execution side channels. Examples
> +of explicit (not side-channel) interfaces include accesses to port I/O,
> +MMIO and DMA interfaces, access to PCI configuration space,
> +VMM-specific hypercalls, access to shared memory pages, interrupts
> +allowed to be injected to the guest kernel by the host, as well as CoCo
> +technology specific hypercalls. Additionally, the host in a CoCo system
> +typically controls the process of creating a CoCo guest: it has a
> +method to load into a guest the firmware and bootloader images, the
> +kernel image together with the kernel command line. All of this data
> +should also be considered untrusted until its integrity and authenticity
> +is established.
> +
> +The table below shows a threat matrix for the CoCo guest Linux kernel
> +with the potential mitigation strategies. The matrix refers to
> +CoCo-specific versions of the guest, host and platform.
> +
> +.. list-table:: CoCo Linux guest kernel threat matrix
> + :widths: auto
> + :align: center
> + :header-rows: 1
> +
> + * - Threat name
> + - Threat description
> + - Mitigation strategy
> +
> + * - Guest malicious configuration
> + - A malicious host modifies one of the following guest's
> + configuration:
> +
> + 1. Guest firmware or bootloader
> +
> + 2. Guest kernel or module binaries
> +
> + 3. Guest command line parameters
> +
> + This allows the host to break the integrity of the code running
> + inside a CoCo guest and violate the CoCo security objectives.
> + - The integrity of the guest's configuration passed via untrusted host
> + must be ensured by methods such as remote attestation and signing.
> + This should be largely transparent to the guest kernel and would
> + allow it to assume a trusted state at the time of boot.
> +
> + * - CoCo guest data attacks
> + - A malicious host retains full control of the CoCo guest's data
> + in-transit between the guest and the host-managed physical or
> + virtual devices. This allows any attack against confidentiality,
> + integrity or freshness of such data.
> + - The CoCo guest is responsible for ensuring the confidentiality,
> + integrity and freshness of such data using well-established
> + security mechanisms. For example, for any guest external network
> + communications that are passed via the untrusted host, an end-to-end
> + secure session must be established between a guest and a trusted
> + remote endpoint using well-known protocols such as TLS.
> + This requirement also applies to protection of the guest's disk
> + image.
> +
> + * - Malformed runtime input
> + - A malicious host injects malformed input via any communication
> + interface used by guest's kernel code. If the code is not prepared
> + to handle this input correctly, this can result in a host --> guest
> + kernel privilege escalation. This includes classical side-channel
> + and/or transient execution attack vectors.
> + - The attestation or signing process cannot help to mitigate this
> + threat since this input is highly dynamic. Instead, a different set
> + of mechanisms is required:
> +
> + 1. *Limit the exposed attack surface*. Whenever possible, disable
> + complex kernel features and device drivers (not required for guest
> + operation) that actively use the communication interfaces between
> + the untrusted host and the guest. This is not a new concept for the
> + Linux kernel, since it already has mechanisms to disable external
> + interfaces such as attacker's access via USB/Thunderbolt subsystem.
> +
> + 2. *Harden the exposed attack surface*. Any code that uses such
> + interfaces must treat the input from the untrusted host as malicious
> + and do sanity checks before processing it. This can be ensured by
> + performing a code audit of such device drivers as well as employing
> + other standard techniques for testing the code robustness, such as
> + fuzzing. This is again a well-known concept for the Linux kernel
> + since all its networking code has been previously analyzed under
> + presumption of processing malformed input from a network attacker.
> +
> + * - Malicious runtime input
> + - A malicious host injects a specific input value via any
> + communication interface used by the guest's kernel code. The
> + difference with the previous attack vector (malformed runtime input)
> + is that this input is not malformed, but its value is crafted to
> + impact the guest's kernel security. Examples of such inputs include
> + providing a malicious time to the guest or the entropy to the guest
> + random number generator. Additionally, the timing of such events can
> + be an attack vector on its own, if it results in a particular guest
> + kernel action (i.e. processing of a host-injected interrupt).
> + - Similarly, as with the previous attack vector, it is not possible to
> + use attestation mechanisms to address this threat. Instead, such
> + attack vectors (i.e. interfaces) must be either disabled or made
> + resistant to supplied host input.
> +
> +As can be seen from the above table, the potential mitigation
> +strategies to secure the CoCo Linux guest kernel vary, but can be
> +roughly split into mechanisms that either require or do not require
> +changes to the existing Linux kernel code. One main goal of the CoCo
> +security architecture is to limit the changes to the Linux kernel code
> +to minimum, but at the same time to provide usable and scalable means
> +to facilitate the security of a CoCo guest kernel for all the users of
> +the CoCo ecosystem.
> diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
> index 6ed8d2fa6f9e..5de51b130e6a 100644
> --- a/Documentation/security/index.rst
> +++ b/Documentation/security/index.rst
> @@ -6,6 +6,7 @@ Security Documentation
> :maxdepth: 1
>
> credentials
> + confidential-computing
> IMA-templates
> keys/index
> lsm
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7f86d02cb427..4a16727bf7f9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5307,6 +5307,12 @@ S: Orphan
> W: http://accessrunner.sourceforge.net/
> F: drivers/usb/atm/cxacru.c
>
> +CONFIDENTIAL COMPUTING THREAT MODEL
> +M: Elena Reshetova <[email protected]>
> +M: Carlos Bilbao <[email protected]>
> +S: Maintained
> +F: Documentation/security/confidential-computing.rst
> +
> CONFIGFS
> M: Joel Becker <[email protected]>
> M: Christoph Hellwig <[email protected]>
> --
> 2.34.1

Reviewed-by: David Kaplan <[email protected]>

2023-04-22 03:23:12

by Bagas Sanjaya

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Mar 29, 2023 at 12:40:24PM +0200, Greg KH wrote:
> > + * - Guest malicious configuration
> > + - A malicious host modifies one of the following guest's
> > + configuration:
> > +
> > + 1. Guest firmware or bootloader
> > +
> > + 2. Guest kernel or module binaries
> > +
> > + 3. Guest command line parameters
> > +
> > + This allows the host to break the integrity of the code running
> > + inside a CoCo guest and violate the CoCo security objectives.
>
> So hosts are not allowed to change this? I don't understand the use of
> "violate" here, sorry.

I think the situation described above is when malicious actors gain
control of a CoCo host.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2023-04-25 13:49:08

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/21/23 16:09, Kaplan, David wrote:
> [AMD Official Use Only - General]
>
>> -----Original Message-----
>> From: Carlos Bilbao <[email protected]>
>> Sent: Monday, March 27, 2023 9:18 AM
>> To: [email protected]
>> Cc: [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; Giani, Dhaval
>> <[email protected]>; Day, Michael <[email protected]>; Paluri,
>> PavanKumar (Pavan Kumar) <[email protected]>; Kaplan, David
>> <[email protected]>; Lal, Reshma <[email protected]>; Powell,
>> Jeremy <[email protected]>;
>> [email protected];
>> [email protected]; Lendacky, Thomas
>> <[email protected]>; [email protected]; [email protected];
>> [email protected]; [email protected]; linux-
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected]; Bilbao, Carlos
>> <[email protected]>
>> Subject: [PATCH] docs: security: Confidential computing intro and threat
>> model
>>
>> Kernel developers working on confidential computing operate under a set of
>> assumptions regarding the Linux kernel threat model that differ from the
>> traditional view. In order to effectively engage with the linux-coco mailing list
>> and contribute to ongoing kernel efforts, one must have a thorough
>> familiarity with these concepts. Add a concise, architecture-agnostic
>> introduction and threat model to provide a reference for ongoing design
>> discussions and to help developers gain a foundational understanding of the
>> subject.
>>
>> Acked-by: Dave Hansen <[email protected]>
>> Co-developed-by: Elena Reshetova <[email protected]>
>> Signed-off-by: Elena Reshetova <[email protected]>
>> Signed-off-by: Carlos Bilbao <[email protected]>
>> ---
>> .../security/confidential-computing.rst | 245 ++++++++++++++++++
>> Documentation/security/index.rst | 1 +
>> MAINTAINERS | 6 +
>> 3 files changed, 252 insertions(+)
>> create mode 100644 Documentation/security/confidential-computing.rst
>>
>> diff --git a/Documentation/security/confidential-computing.rst
>> b/Documentation/security/confidential-computing.rst
>> new file mode 100644
>> index 000000000000..98439ef7ff9f
>> --- /dev/null
>> +++ b/Documentation/security/confidential-computing.rst
>> @@ -0,0 +1,245 @@
>> +===============================
>> +Confidential Computing in Linux
>> +===============================
>> +
>> +.. contents:: :local:
>> +
>> +By: Elena Reshetova <[email protected]> and Carlos Bilbao
>> +<[email protected]>
>> +
>> +Motivation
>> +==========
>> +
>> +Kernel developers working on confidential computing for the cloud
>> +operate under a set of assumptions regarding the Linux kernel threat
>> +model that differ from the traditional view. In order to effectively
>> +engage with the linux-coco mailing list and contribute to its
>> +initiatives, one must have a thorough familiarity with these concepts.
>> +This document provides a concise, architecture-agnostic introduction to
>> +help developers gain a foundational understanding of the subject.
>> +
>> +Overview and terminology
>> +========================
>> +
>> +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
>> +virtualization technologies that allow Cloud Service Providers (CSPs)
>> +to provide stronger security guarantees to their clients (usually
>> +referred to as tenants) by excluding all the CSP's infrastructure and
>> +SW out of the tenant's Trusted Computing Base (TCB).
>> +
>> +While the concrete implementation details differ between technologies,
>> +all of these mechanisms provide increased confidentiality and integrity
>> +of CoCo guest memory and execution state (vCPU registers), more tightly
>> +controlled guest interrupt injection, as well as some additional
>> +mechanisms to control guest-host page mapping. More details on the
>> +x86-specific solutions can be found in :doc:`Intel Trust Domain
>> +Extensions (TDX) </x86/tdx>` and :doc:`AMD Memory Encryption
>> +</x86/amd-memory-encryption>`.
>> +
>> +The basic CoCo layout includes the host, guest, the interfaces that
>> +communicate guest and host, a platform capable of supporting CoCo, and
>> +an intermediary between the guest virtual machine (VM) and the
>> +underlying platform that acts as security manager::
>> +
>> + +-------------------+ +-----------------------+
>> + | CoCo guest VM |<---->| |
>> + +-------------------+ | |
>> + | Interfaces | | CoCo security manager |
>> + +-------------------+ | |
>> + | Host VMM |<---->| |
>> + +-------------------+ | |
>> + | |
>> + +--------------------+ | |
>> + | CoCo platform |<--->| |
>> + +--------------------+ +-----------------------+
>> +
>> +The specific details of the CoCo intermediary vastly diverge between
>> +technologies, so much so that in some cases it will be HW and in others
>> +SW.
>> +
>> +Existing Linux kernel threat model
>> +==================================
>> +
>> +The components of the current Linux kernel threat model are::
>> +
>> + +-----------------------+ +-------------------+
>> + | |<---->| Userspace |
>> + | | +-------------------+
>> + | External attack | | Interfaces |
>> + | vectors | +-------------------+
>> + | |<---->| Linux Kernel |
>> + | | +-------------------+
>> + +-----------------------+ +-------------------+
>> + | Bootloader/BIOS |
>> + +-------------------+
>> + +-------------------+
>> + | HW platform |
>> + +-------------------+
>> +
>> +The existing Linux kernel threat model typically assumes execution on a
>> +trusted HW platform with all of the firmware and bootloaders included
>> +on its TCB. The primary attacker resides in the userspace and all of
>> +the data coming from there is generally considered untrusted, unless
>> +userspace is privileged enough to perform trusted actions. In addition,
>> +external attackers are typically considered, including those with
>> +access to enabled external networks (e.g. Ethernet, Wireless,
>> +Bluetooth), exposed hardware interfaces (e.g. USB, Thunderbolt), and
>> +the ability to modify the contents of disks offline.
>> +
>> +Confidential Computing threat model and security objectives
>> +===========================================================
>> +
>> +Confidential Cloud Computing adds a new type of attacker to the above list:
>> +an untrusted and potentially malicious host. This can be viewed as a
>> +more powerful type of external attacker, as it resides locally on the
>> +same physical machine, in contrast to a remote network attacker, and
>> +has control over the guest kernel communication with most of the HW::
>> +
>> + +------------------------+
>> + | CoCo guest VM |
>> + +-----------------------+ | +-------------------+ |
>> + | |<--->| | Userspace | |
>> + | | | +-------------------+ |
>> + | External attack | | | Interfaces | |
>> + | vectors | | +-------------------+ |
>> + | |<--->| | Linux Kernel | |
>> + | | | +-------------------+ |
>> + +-----------------------+ | +-------------------+ |
>> + | | Bootloader/BIOS | |
>> + +-----------------------+ | +-------------------+ |
>> + | |<--->+------------------------+
>> + | | | Interfaces |
>> + | | +------------------------+
>> + | CoCo security |<--->| Host VMM |
>> + | manager | +------------------------+
>> + | | +------------------------+
>> + | |<--->| CoCo platform |
>> + +-----------------------+ +------------------------+
>> +
>> +While the traditional hypervisor has unlimited access to guest data and
>> +can leverage this access to attack the guest, the CoCo systems mitigate
>> +such attacks by adding security features like guest data
>> +confidentiality and integrity protection. This threat model assumes
>> +that those features are available and intact.
>> +
>> +The **Linux kernel CoCo security objectives** can be summarized as follows:
>> +
>> +1. Preserve the confidentiality and integrity of CoCo guest private memory.
>> +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
>> +
>> +The above security objectives result in two primary **Linux kernel CoCo
>> +assets**:
>> +
>> +1. Guest kernel execution context.
>> +2. Guest kernel private memory.
>> +
>> +The host retains full control over the CoCo guest resources and can
>> +deny access to them at any time. Because of this, the host Denial of
>> +Service (DoS) attacks against CoCo guests are beyond the scope of this
>> +threat model.
>> +
>> +The **Linux CoCo attack surface** is any interface exposed from a CoCo
>> +guest Linux kernel towards an untrusted host that is not covered by the
>> +CoCo technology SW/HW protections. This includes any possible
>> +side-channels, as well as transient execution side channels. Examples
>> +of explicit (not side-channel) interfaces include accesses to port I/O,
>> +MMIO and DMA interfaces, access to PCI configuration space,
>> +VMM-specific hypercalls, access to shared memory pages, interrupts
>> +allowed to be injected to the guest kernel by the host, as well as CoCo
>> +technology specific hypercalls. Additionally, the host in a CoCo system
>> +typically controls the process of creating a CoCo guest: it has a
>> +method to load into a guest the firmware and bootloader images, the
>> +kernel image together with the kernel command line. All of this data
>> +should also be considered untrusted until its integrity and authenticity is
>> +established.
>> +
>> +The table below shows a threat matrix for the CoCo guest Linux kernel
>> +with the potential mitigation strategies. The matrix refers to
>> +CoCo-specific versions of the guest, host and platform.
>> +
>> +.. list-table:: CoCo Linux guest kernel threat matrix
>> + :widths: auto
>> + :align: center
>> + :header-rows: 1
>> +
>> + * - Threat name
>> + - Threat description
>> + - Mitigation strategy
>> +
>> + * - Guest malicious configuration
>> + - A malicious host modifies one of the following guest's
>> + configuration:
>> +
>> + 1. Guest firmware or bootloader
>> +
>> + 2. Guest kernel or module binaries
>> +
>> + 3. Guest command line parameters
>> +
>> + This allows the host to break the integrity of the code running
>> + inside a CoCo guest and violate the CoCo security objectives.
>> + - The integrity of the guest's configuration passed via untrusted host
>> + must be ensured by methods such as remote attestation and signing.
>> + This should be largely transparent to the guest kernel and would
>> + allow it to assume a trusted state at the time of boot.
>> +
>> + * - CoCo guest data attacks
>> + - A malicious host retains full control of the CoCo guest's data
>> + in-transit between the guest and the host-managed physical or
>> + virtual devices. This allows any attack against confidentiality,
>> + integrity or freshness of such data.
>> + - The CoCo guest is responsible for ensuring the confidentiality,
>> + integrity and freshness of such data using well-established
>> + security mechanisms. For example, for any guest external network
>> + communications that are passed via the untrusted host, an end-to-end
>> + secure session must be established between a guest and a trusted
>> + remote endpoint using well-known protocols such as TLS.
>> + This requirement also applies to protection of the guest's disk
>> + image.
>> +
>> + * - Malformed runtime input
>> + - A malicious host injects malformed input via any communication
>> + interface used by guest's kernel code. If the code is not prepared
>> + to handle this input correctly, this can result in a host --> guest
>> + kernel privilege escalation. This includes classical side-channel
>> + and/or transient execution attack vectors.
>> + - The attestation or signing process cannot help to mitigate this
>> + threat since this input is highly dynamic. Instead, a different set
>> + of mechanisms is required:
>> +
>> + 1. *Limit the exposed attack surface*. Whenever possible, disable
>> + complex kernel features and device drivers (not required for guest
>> + operation) that actively use the communication interfaces between
>> + the untrusted host and the guest. This is not a new concept for the
>> + Linux kernel, since it already has mechanisms to disable external
>> + interfaces such as attacker's access via USB/Thunderbolt subsystem.
>> +
>> + 2. *Harden the exposed attack surface*. Any code that uses such
>> + interfaces must treat the input from the untrusted host as malicious
>> + and do sanity checks before processing it. This can be ensured by
>> + performing a code audit of such device drivers as well as employing
>> + other standard techniques for testing the code robustness, such as
>> + fuzzing. This is again a well-known concept for the Linux kernel
>> + since all its networking code has been previously analyzed under
>> + presumption of processing malformed input from a network attacker.
>> +
>> + * - Malicious runtime input
>> + - A malicious host injects a specific input value via any
>> + communication interface used by the guest's kernel code. The
>> + difference with the previous attack vector (malformed runtime input)
>> + is that this input is not malformed, but its value is crafted to
>> + impact the guest's kernel security. Examples of such inputs include
>> + providing a malicious time to the guest or the entropy to the guest
>> + random number generator. Additionally, the timing of such events can
>> + be an attack vector on its own, if it results in a particular guest
>> + kernel action (i.e. processing of a host-injected interrupt).
>> + - Similarly, as with the previous attack vector, it is not possible to
>> + use attestation mechanisms to address this threat. Instead, such
>> + attack vectors (i.e. interfaces) must be either disabled or made
>> + resistant to supplied host input.
>> +
>> +As can be seen from the above table, the potential mitigation
>> +strategies to secure the CoCo Linux guest kernel vary, but can be
>> +roughly split into mechanisms that either require or do not require
>> +changes to the existing Linux kernel code. One main goal of the CoCo
>> +security architecture is to limit the changes to the Linux kernel code
>> +to minimum, but at the same time to provide usable and scalable means
>> +to facilitate the security of a CoCo guest kernel for all the users of the
>> +CoCo ecosystem.
>> diff --git a/Documentation/security/index.rst
>> b/Documentation/security/index.rst
>> index 6ed8d2fa6f9e..5de51b130e6a 100644
>> --- a/Documentation/security/index.rst
>> +++ b/Documentation/security/index.rst
>> @@ -6,6 +6,7 @@ Security Documentation
>> :maxdepth: 1
>>
>> credentials
>> + confidential-computing
>> IMA-templates
>> keys/index
>> lsm
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 7f86d02cb427..4a16727bf7f9 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -5307,6 +5307,12 @@ S: Orphan
>> W: http://accessrunner.sourceforge.net/
>> F: drivers/usb/atm/cxacru.c
>>
>> +CONFIDENTIAL COMPUTING THREAT MODEL
>> +M: Elena Reshetova <[email protected]>
>> +M: Carlos Bilbao <[email protected]>
>> +S: Maintained
>> +F: Documentation/security/confidential-computing.rst
>> +
>> CONFIGFS
>> M: Joel Becker <[email protected]>
>> M: Christoph Hellwig <[email protected]>
>> --
>> 2.34.1
>
> Reviewed-by: David Kaplan <[email protected]>

Does anyone have other concerns or questions? Otherwise, Jon, my "V2" will
be the same text with David's RB tag for the commit.

Thanks,
Carlos

2023-04-25 15:07:55


Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Mon, Mar 27, 2023, Carlos Bilbao wrote:
> +Kernel developers working on confidential computing for the cloud operate
> +under a set of assumptions regarding the Linux kernel threat model that
> +differ from the traditional view. In order to effectively engage with the
> +linux-coco mailing list and contribute to its initiatives, one must have a
> +thorough familiarity with these concepts. This document provides a concise,
> +architecture-agnostic introduction to help developers gain a foundational

Heh, vendor agnostic maybe, but certainly not architecture agnostic.

> +understanding of the subject.
> +
> +Overview and terminology
> +========================
> +
> +Confidential Cloud Computing (CoCo) refers to a set of HW and SW

As per Documentation/security/secrets/coco.rst and every discussion I've observed,
CoCo is Confidential Computing. "Cloud" is not part of the definition. That's
true even if this discussion is restricted to CoCo VMs, e.g. see pKVM.

> +virtualization technologies that allow Cloud Service Providers (CSPs) to

Again, CoCo isn't just for cloud use cases.

> +provide stronger security guarantees to their clients (usually referred to
> +as tenants) by excluding all the CSP's infrastructure and SW out of the
> +tenant's Trusted Computing Base (TCB).

This is inaccurate, the provider may still have software and/or hardware in the TCB.

And for the cloud use case, I very, very strongly object to implying that the goal
of CoCo is to exclude the CSP from the TCB. Getting out of the TCB is the goal for
_some_ CSPs, but it is not a fundamental tenet of CoCo. This viewpoint is heavily
tainted by Intel's and AMD's current offerings, which effectively disallow third
party code for reasons that have nothing to do with security.

https://lore.kernel.org/all/Y+aP8rHr6H3LIf%[email protected]

> +While the concrete implementation details differ between technologies, all
> +of these mechanisms provide increased confidentiality and integrity of CoCo
> +guest memory and execution state (vCPU registers), more tightly controlled
> +guest interrupt injection,

This is highly dependent on how "interrupt" is defined, and how "controlled" is
defined.

> as well as some additional mechanisms to control guest-host page mapping.

This is flat out wrong for SNP for any reasonable definition of "page mapping".
SNP has _zero_ "control" over page tables, which is what most people think of when they
see "page mapping".

> More details on the x86-specific solutions can be
> +found in
> +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
> +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.

So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
doesn't provide anything besides increased confidentiality of guest memory, and
SEV-ES doesn't provide integrity or validation of physical page assignment.

> +The basic CoCo layout includes the host, guest, the interfaces that
> +communicate guest and host, a platform capable of supporting CoCo,

CoCo VMs...

> and an intermediary between the guest virtual machine (VM) and the
> underlying platform that acts as security manager::

Having an intermediary is very much an implementation detail.

> +Confidential Computing threat model and security objectives
> +===========================================================
> +
> +Confidential Cloud Computing adds a new type of attacker to the above list:
> +an untrusted and potentially malicious host.

I object to splattering "malicious host" everywhere. Many people are going to
read this and interpret "host" as "the CSP", and then make assumptions like
"CoCo assumes the CSP is malicious!". AIUI, the vast majority of use cases aren't
concerned so much about "the CSP" being malicious, but rather they're concerned
about new attack vectors that come with running code/VMs on a stack that is
managed by a third party, on hardware that doesn't reside in a secured facility,
etc.

> +While the traditional hypervisor has unlimited access to guest data and
> +can leverage this access to attack the guest, the CoCo systems mitigate
> +such attacks by adding security features like guest data confidentiality
> +and integrity protection. This threat model assumes that those features
> +are available and intact.

Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
considered CoCo.

> +The **Linux kernel CoCo security objectives** can be summarized as follows:
> +
> +1. Preserve the confidentiality and integrity of CoCo guest private memory.

So, registers are fair game?

> +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
> +
> +The above security objectives result in two primary **Linux kernel CoCo
> +assets**:
> +
> +1. Guest kernel execution context.
> +2. Guest kernel private memory.

...

> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7f86d02cb427..4a16727bf7f9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5307,6 +5307,12 @@ S: Orphan
> W: http://accessrunner.sourceforge.net/
> F: drivers/usb/atm/cxacru.c
>
> +CONFIDENTIAL COMPUTING THREAT MODEL

This is not generic CoCo documentation, it's specific to CoCo VMs. E.g. SGX is
most definitely considered a CoCo feature, and it has no dependencies whatsoever
on virtualization.

> +M: Elena Reshetova <[email protected]>
> +M: Carlos Bilbao <[email protected]>

I would love to see an M: or R: entry for someone that is actually _using_ CoCo.

IMO, this document is way too Intel/AMD centric.

2023-04-26 13:39:45


Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

Hi Sean,

Thank you for your review! Please see my comments inline.

> On Mon, Mar 27, 2023, Carlos Bilbao wrote:
> > +Kernel developers working on confidential computing for the cloud operate
> > +under a set of assumptions regarding the Linux kernel threat model that
> > +differ from the traditional view. In order to effectively engage with the
> > +linux-coco mailing list and contribute to its initiatives, one must have a
> > +thorough familiarity with these concepts. This document provides a concise,
> > +architecture-agnostic introduction to help developers gain a foundational
>
> Heh, vendor agnostic maybe, but certainly not architecture agnostic.

I guess it depends on where you draw the distinction between vendor and architecture.
What was meant here is that we try to write down the overall threat model
and high-level design that existing technologies use today.
But I don't mind changing it to "vendor agnostic" if that seems more correct.

>
> > +understanding of the subject.
> > +
> > +Overview and terminology
> > +========================
> > +
> > +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
>
> As per Documentation/security/secrets/coco.rst and every discussion I've
> observed,
> CoCo is Confidential Computing. "Cloud" is not part of the definition. That's
> true even if this discussion is restricted to CoCo VMs, e.g. see pKVM.

Yes, I am personally not sure we have a single good term to describe this particular
angle of confidential computing. Generally, Confidential Computing can mean
any CoCo technology, including things that do not relate to virtualization (like SGX).
This document doesn't attempt to cover all of CoCo, only the subset that
relates to virtualization. Academic researchers have been using the term "Confidential Cloud
Computing" (a quick search on Google Scholar gives relevant papers), which was
the reason for adopting it. If you have a better proposal, please share it.

>
> > +virtualization technologies that allow Cloud Service Providers (CSPs) to
>
> Again, CoCo isn't just for cloud use cases.

See above.

>
> > +provide stronger security guarantees to their clients (usually referred to
> > +as tenants) by excluding all the CSP's infrastructure and SW out of the
> > +tenant's Trusted Computing Base (TCB).
>
> This is inaccurate, the provider may still have software and/or hardware in the
> TCB.

Well, this is the end goal where we want to be; practical deployments can
differ, of course. We can rephrase it to say that CoCo "allows excluding all the CSP's
infrastructure and SW from the tenant's TCB."

>
> And for the cloud use case, I very, very strongly object to implying that the goal
> of CoCo is to exclude the CSP from the TCB. Getting out of the TCB is the goal for
> _some_ CSPs, but it is not a fundamental tenet of CoCo. This viewpoint is
> heavily
> tainted by Intel's and AMD's current offerings, which effectively disallow third
> party code for reasons that have nothing to do with security.
>
> https://lore.kernel.org/all/Y+aP8rHr6H3LIf%[email protected]

I am not fully sure what you are implying here. A minimal TCB is always a good goal
from a security point of view (less HW/SW means fewer bugs). From a tenant's point
of view it is, of course, a question of risk evaluation: do they think the CSP stack
has a higher chance of containing an exploitable bug than the SW provided by
HW vendors? You seem to imply that some tenants might consider the CSP stack to
be more robust. If so, why would they use CoCo? In that case they are better off
with normal legacy VMs, no?

>
> > +While the concrete implementation details differ between technologies, all
> > +of these mechanisms provide increased confidentiality and integrity of CoCo
> > +guest memory and execution state (vCPU registers), more tightly controlled
> > +guest interrupt injection,
>
> This is highly dependent on how "interrupt" is defined, and how "controlled" is
> defined.

As you know, there are limitations on which interrupt vectors can be
injected into a TD guest. Vectors 0-30 are not injectable. This is what is meant by
"more tightly controlled".
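
To make the restriction concrete, here is a toy Python model of such a vector filter. It is only a sketch: the 0-30 range is taken from the statement above, while the function name and the 0-255 bound on vector numbers are assumptions made for illustration and do not come from any vendor specification.

```python
# Toy model only. The blocked range mirrors the "vectors 0-30" statement
# in this message; everything else here is a hypothetical illustration.
FIRST_BLOCKED_VECTOR = 0
LAST_BLOCKED_VECTOR = 30


def host_may_inject(vector: int) -> bool:
    """Return True if this toy model lets the host inject `vector`."""
    if not 0 <= vector <= 255:
        raise ValueError(f"vector out of range: {vector}")
    # Anything inside the blocked range is refused, everything else passes.
    return not (FIRST_BLOCKED_VECTOR <= vector <= LAST_BLOCKED_VECTOR)
```

In this model a call such as host_may_inject(14) is refused, while host_may_inject(32) is allowed.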

>
> > as well as some additional mechanisms to control guest-host page mapping.
>
> This is flat out wrong for SNP for any reasonable definition of "page mapping".
> SNP has _zero_ "control" over page tables, which is what most people think of when
> they
> see "page mapping".

Leaving for AMD guys to comment.

>
> > More details on the x86-specific solutions can be
> > +found in
> > +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
> > +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.
>
> So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
> doesn't provide anything besides increased confidentiality of guest memory, and
> SEV-ES doesn't provide integrity or validation of physical page assignment.
>

Same

> > +The basic CoCo layout includes the host, guest, the interfaces that
> > +communicate guest and host, a platform capable of supporting CoCo,
>
> CoCo VMs...

Will fix.

>
> > and an intermediary between the guest virtual machine (VM) and the
> > underlying platform that acts as security manager::
>
> Having an intermediary is very much an implementation detail.

True, but it is kind of a big component, so completely omitting it doesn't sound
right to me either.

>
> > +Confidential Computing threat model and security objectives
> > +===========================================================
> > +
> > +Confidential Cloud Computing adds a new type of attacker to the above list:
> > +an untrusted and potentially malicious host.
>
> I object to splattering "malicious host" everywhere. Many people are going to
> read this and interpret "host" as "the CSP", and then make assumptions like
> "CoCo assumes the CSP is malicious!". AIUI, the vast majority of use cases aren't
> concerned so much about "the CSP" being malicious, but rather they're
> concerned
> about new attack vectors that come with running code/VMs on a stack that is
> managed by a third party, on hardware that doesn't reside in a secured facility,
> etc.

I see your point. I propose adding a paragraph at the beginning explaining that
CSPs do not intend to be malicious (at least we hope they don't), but since they
have a big codebase to manage, bugs in that codebase are normal, and CoCo
helps protect tenants against such situations. We could also change "malicious host" to
"unintentionally misbehaving host" or something like that.

>
> > +While the traditional hypervisor has unlimited access to guest data and
> > +can leverage this access to attack the guest, the CoCo systems mitigate
> > +such attacks by adding security features like guest data confidentiality
> > +and integrity protection. This threat model assumes that those features
> > +are available and intact.
>
> Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
> considered CoCo.
>
> > +The **Linux kernel CoCo security objectives** can be summarized as follows:
> > +
> > +1. Preserve the confidentiality and integrity of CoCo guest private memory.
>
> So, registers are fair game?

No, you are right, this needs to be augmented. What we meant is that
the end goal of the attacker is the tenant's secrets, and those can also be in registers.

>
> > +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
> > +
> > +The above security objectives result in two primary **Linux kernel CoCo
> > +assets**:
> > +
> > +1. Guest kernel execution context.
> > +2. Guest kernel private memory.
>
> ...
>
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 7f86d02cb427..4a16727bf7f9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -5307,6 +5307,12 @@ S: Orphan
> > W: http://accessrunner.sourceforge.net/
> > F: drivers/usb/atm/cxacru.c
> >
> > +CONFIDENTIAL COMPUTING THREAT MODEL
>
> This is not generic CoCo documentation, it's specific to CoCo VMs. E.g. SGX is
> most definitely considered a CoCo feature, and it has no dependencies
> whatsoever
> on virtualization.

Yes, so what do we call it? "CoCo VM" is a term for a running entity.
That's why the academic term Confidential Cloud Computing was used in the
beginning, but you didn't like it either.

>
> > +M: Elena Reshetova <[email protected]>
> > +M: Carlos Bilbao <[email protected]>
>
> I would love to see an M: or R: entry for someone that is actually _using_ CoCo.

That would be more than welcome!

>
> IMO, this document is way too Intel/AMD centric.

Anyone is free to comment on or participate in writing this and to help us adjust it
further for the rest of the vendors, because it is hard for us to know the details and
applicability for other HW vendors.
Adding the Rivos guys explicitly to the CC list now.

Best Regards,
Elena.

2023-04-26 15:10:33


Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

Hello Sean,

On 4/26/23 8:32 AM, Reshetova, Elena wrote:
> Hi Sean,
>
> Thank you for your review! Please see my comments inline.
>
>> On Mon, Mar 27, 2023, Carlos Bilbao wrote:
>>> +Kernel developers working on confidential computing for the cloud operate
>>> +under a set of assumptions regarding the Linux kernel threat model that
>>> +differ from the traditional view. In order to effectively engage with the
>>> +linux-coco mailing list and contribute to its initiatives, one must have a
>>> +thorough familiarity with these concepts. This document provides a concise,
>>> +architecture-agnostic introduction to help developers gain a foundational
>>
>> Heh, vendor agnostic maybe, but certainly not architecture agnostic.
>
> I guess it depends on where you draw the distinction between vendor and architecture.
> What was meant here is that we try to write down the overall threat model
> and high-level design that existing technologies use today.
> But I don't mind changing it to "vendor agnostic" if that seems more correct.
>
>>
>>> +understanding of the subject.
>>> +
>>> +Overview and terminology
>>> +========================
>>> +
>>> +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
>>
>> As per Documentation/security/secrets/coco.rst and every discussion I've
>> observed,
>> CoCo is Confidential Computing. "Cloud" is not part of the definition. That's
>> true even if this discussion is restricted to CoCo VMs, e.g. see pKVM.
>
> Yes, I am personally not sure we have a single good term to describe this particular
> angle of confidential computing. Generally, Confidential Computing can mean
> any CoCo technology, including things that do not relate to virtualization (like SGX).
> This document doesn't attempt to cover all of CoCo, only the subset that
> relates to virtualization. Academic researchers have been using the term "Confidential Cloud
> Computing" (a quick search on Google Scholar gives relevant papers), which was
> the reason for adopting it. If you have a better proposal, please share it.
>
>>
>>> +virtualization technologies that allow Cloud Service Providers (CSPs) to
>>
>> Again, CoCo isn't just for cloud use cases.
>
> See above.
>
>>
>>> +provide stronger security guarantees to their clients (usually referred to
>>> +as tenants) by excluding all the CSP's infrastructure and SW out of the
>>> +tenant's Trusted Computing Base (TCB).
>>
>> This is inaccurate, the provider may still have software and/or hardware in the
>> TCB.
>
> Well, this is the end goal where we want to be; practical deployments can
> differ, of course. We can rephrase it to say that CoCo "allows excluding all the CSP's
> infrastructure and SW from the tenant's TCB."
>
>>
>> And for the cloud use case, I very, very strongly object to implying that the goal
>> of CoCo is to exclude the CSP from the TCB. Getting out of the TCB is the goal for
>> _some_ CSPs, but it is not a fundamental tenet of CoCo. This viewpoint is
>> heavily
>> tainted by Intel's and AMD's current offerings, which effectively disallow third
>> party code for reasons that have nothing to do with security.
>>
>> https://lore.kernel.org/all/Y+aP8rHr6H3LIf%[email protected]
>
> I am not fully sure what you imply with this. A minimal TCB is always a good goal
> from a security point of view (less hw/sw equals fewer bugs). From a tenant's point
> of view, of course, it is a question of risk evaluation: do they think the CSP stack
> has a higher chance of having an exploitable bug than the SW provided by
> HW vendors? You seem to imply that some tenants might consider the CSP stack to
> be more robust? If so, why would they use CoCo? In this case they are better off
> with just normal legacy VMs, no?
>
>
>>
>>> +While the concrete implementation details differ between technologies, all
>>> +of these mechanisms provide increased confidentiality and integrity of CoCo
>>> +guest memory and execution state (vCPU registers), more tightly controlled
>>> +guest interrupt injection,
>>
>> This is highly dependent on how "interrupt" is defined, and how "controlled" is
>> defined.
>
> As you know there are some limitations on what type of interrupt vectors can be
> injected into a TD guest. Vectors 0-30 are not injectable. This is what is meant by
> "more tightly controlled".
>
>>
>>> as well as some additional mechanisms to control guest-host page mapping.
>>
>> This is flat out wrong for SNP for any reasonable definition of "page mapping".
>> SNP has _zero_ "control" over page tables, which is what most people think of when
>> they see "page mapping".
>
> Leaving for AMD guys to comment.

In SNP, the guest controls the association of a guest physical address to a
host physical address, so that the host can't switch that through the nested
page tables [1]. We will be more specific to avoid misinterpretation.
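
As a rough illustration of the point above, here is a toy model of a reverse-map style check, loosely inspired by the RMP described in [1]. It is only a sketch under stated assumptions: the class names, fields, and methods are hypothetical, it does not reflect the real RMP entry format or any kernel or firmware API, and it ignores everything except the guest-physical to host-physical association.

```python
from dataclasses import dataclass


@dataclass
class RmpEntry:
    """Toy reverse-map entry for one host physical page (hypothetical layout)."""
    owner: int       # guest the page is assigned to
    gpa: int         # guest physical address the page was assigned at
    validated: bool  # in this model, only the guest can set this


class ToyPlatform:
    """Hypothetical platform: the host owns the nested page table (npt),
    but every guest access is also checked against the reverse map (rmp)."""

    def __init__(self):
        self.rmp = {}  # host physical page -> RmpEntry
        self.npt = {}  # (guest, gpa) -> host physical page

    def host_assign(self, guest, gpa, hpa):
        # The host assigns a page to a guest; it always starts unvalidated.
        self.rmp[hpa] = RmpEntry(owner=guest, gpa=gpa, validated=False)
        self.npt[(guest, gpa)] = hpa

    def guest_validate(self, guest, gpa):
        # The guest accepts the page currently backing this gpa.
        entry = self.rmp[self.npt[(guest, gpa)]]
        if entry.owner == guest and entry.gpa == gpa:
            entry.validated = True

    def host_remap(self, guest, gpa, new_hpa):
        # The host can always edit its own nested page table.
        self.npt[(guest, gpa)] = new_hpa

    def guest_access_ok(self, guest, gpa):
        # Access succeeds only if the backing page is assigned to this guest,
        # at this gpa, and the guest has validated it.
        entry = self.rmp.get(self.npt.get((guest, gpa)))
        return (entry is not None and entry.owner == guest
                and entry.gpa == gpa and entry.validated)


p = ToyPlatform()
p.host_assign(guest=1, gpa=0x1000, hpa=0xA000)
p.guest_validate(guest=1, gpa=0x1000)
before = p.guest_access_ok(1, 0x1000)   # page was assigned and validated

p.host_assign(guest=1, gpa=0x1000, hpa=0xB000)  # host swaps in a new page
after = p.guest_access_ok(1, 0x1000)    # new page was never validated
```

In this model the host is still free to rewrite the nested page table, but a swapped-in page fails the check until the guest validates it again, which is the sense in which the guest, not the host, controls the association.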

>
>>
>>> More details on the x86-specific solutions can be
>>> +found in
>>> +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
>>> +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.
>>
>> So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
>> doesn't provide anything besides increased confidentiality of guest memory, and
>> SEV-ES doesn't provide integrity or validation of physical page assignment.
>>
>
> Same
>

Personally, I think it's reasonable to mention SEV/SEV-ES in the context of
confidential computing and acknowledge their relevance in this area.

But there is no mention of SEV or SEV-ES in this draft. And the document we
reference there covers AMD-SNP, which provides integrity.

>>> +The basic CoCo layout includes the host, guest, the interfaces that
>>> +communicate guest and host, a platform capable of supporting CoCo,
>>
>> CoCo VMs...
>
> Will fix.
>
>>
>>> and an intermediary between the guest virtual machine (VM) and the
>>> underlying platform that acts as security manager::
>>
>> Having an intermediary is very much an implementation detail.
>
> True, but it is kind of a big component, so completely omitting it doesn’t sound
> right to me either.
>
>>
>>> +Confidential Computing threat model and security objectives
>>> +===========================================================
>>> +
>>> +Confidential Cloud Computing adds a new type of attacker to the above list:
>>> +an untrusted and potentially malicious host.
>>
>> I object to splattering "malicious host" everywhere. Many people are going to
>> read this and interpret "host" as "the CSP", and then make assumptions like
>> "CoCo assumes the CSP is malicious!". AIUI, the vast majority of use cases aren't
>> concerned so much about "the CSP" being malicious, but rather they're concerned
>> about new attack vectors that come with running code/VMs on a stack that is
>> managed by a third party, on hardware that doesn't reside in a secured facility,
>> etc.
>
> I see your point. I propose to add a paragraph at the beginning that explains that
> CSPs do not intend to be malicious (at least we hope they don't), but since they
> have a big codebase to manage, bugs in that codebase are normal, and CoCo
> helps to protect tenants against these situations. Also change "malicious host" to
> "unintentionally misbehaving host" or something like this.
>
>>
>>> +While the traditional hypervisor has unlimited access to guest data and
>>> +can leverage this access to attack the guest, the CoCo systems mitigate
>>> +such attacks by adding security features like guest data confidentiality
>>> +and integrity protection. This threat model assumes that those features
>>> +are available and intact.
>>
>> Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
>> considered CoCo.

Again, nobody mentioned SEV/SEV-ES here.

>>
>>> +The **Linux kernel CoCo security objectives** can be summarized as follows:
>>> +
>>> +1. Preserve the confidentiality and integrity of CoCo guest private memory.
>>
>> So, registers are fair game?
>
> No, you are right, needs to be augmented here. What we meant here is that
> the end goal of the attacker is the tenant secrets and they can also be in registers.
>
>>
>>> +2. Prevent privilege escalation from a host into a CoCo guest Linux kernel.
>>> +
>>> +The above security objectives result in two primary **Linux kernel CoCo
>>> +assets**:
>>> +
>>> +1. Guest kernel execution context.
>>> +2. Guest kernel private memory.
>>
>> ...
>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 7f86d02cb427..4a16727bf7f9 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -5307,6 +5307,12 @@ S: Orphan
>>> W: http://accessrunner.sourceforge.net/
>>> F: drivers/usb/atm/cxacru.c
>>>
>>> +CONFIDENTIAL COMPUTING THREAT MODEL
>>
>> This is not generic CoCo documentation, it's specific to CoCo VMs. E.g. SGX is
>> most definitely considered a CoCo feature, and it has no dependencies whatsoever
>> on virtualization.
>
> Yes, so what do we call it? CoCo VM is a term for a running entity.
> That's why the academic term Confidential Cloud Computing was used in the
> beginning, but you didn’t like it either.
>
>>
>>> +M: Elena Reshetova <[email protected]>
>>> +M: Carlos Bilbao <[email protected]>
>>
>> I would love to see an M: or R: entry for someone that is actually _using_ CoCo.
>
> Would be more than welcome!
>
>>
>> IMO, this document is way too Intel/AMD centric.
>
> Anyone is free to comment/participate in writing this and help us adjust it
> even further to the rest of the vendors, because it is hard for us to know the details and
> applicability for other hw vendors.
> Adding Rivos guys now explicitly to CC list.
>

I'm sure we can find a common ground for this document.

>
> Best Regards,
> Elena.

Thanks,
Carlos

[1] https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf

2023-04-26 15:27:39

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> > On Mon, Mar 27, 2023, Carlos Bilbao wrote:
[...]
> > > +provide stronger security guarantees to their clients (usually
> > > referred to +as tenants) by excluding all the CSP's
> > > infrastructure and SW out of the +tenant's Trusted Computing Base
> > > (TCB).
> >
> > This is inaccurate, the provider may still have software and/or
> > hardware in the TCB.
>
> Well, this is the end goal where we want to be, the practical
> deployment can differ of course. We can rephrase that it "allows to
> exclude all the CSP's infrastructure and SW out of tenant's TCB."

That's getting even more inaccurate. To run in a Cloud with CoCo you
usually have to insert some provided code, like OVMF and, for AMD, the
SVSM. These are often customized by the CSP to suit the cloud
infrastructure, so you're running their code. The goal, I think, is to
make sure you only run code you trust (some of which may come from the
CSP) in your TCB, which is very different from the statement above.

James

2023-04-26 15:53:20

by Dave Hansen

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/25/23 08:02, Sean Christopherson wrote:
>> +While the traditional hypervisor has unlimited access to guest data and
>> +can leverage this access to attack the guest, the CoCo systems mitigate
>> +such attacks by adding security features like guest data confidentiality
>> +and integrity protection. This threat model assumes that those features
>> +are available and intact.
> Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
> considered CoCo.

This document is clearly trying to draw a line in the sand and say:

CoCo on one side, non-CoCo on the other

I think it's less important to name that line than it is to realize what
we need to do on one side versus the other.

For instance, if the system doesn't have strong guest memory
confidentiality protection, then it's kinda silly to talk about the
guest's need to defend against "CoCo guest data attacks".

Sure, the mitigations for "CoCo guest data attacks" are pretty sane even
without all this CoCo jazz. But if your goal is to mitigate damage that
a VMM out of the TCB can do, then they don't do much if there isn't
VMM->guest memory confidentiality in the first place.

So, sure, CoCo implementations exist along a continuum. SGX is in there
(with and without integrity protection), as are SEV=>SEV-ES=>SEV-SNP and
MKTME=>TDX.

This document is making the case that the kernel should go to some new
(and extraordinary) lengths to defend itself against ... something.
Those defenses don't make much sense unless we've crossed that line in
the sand.

So, let's not quibble about where CoCo starts or ends, but let's _do_
make a list of things that we need before we do all the nonsense that
this doc suggests.

You're totally right that this doc forgot to mention guest registers
(whoops).

2023-04-26 16:04:36

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> Hello Sean,
>
> On 4/26/23 8:32 AM, Reshetova, Elena wrote:
> > Hi Sean,
> >
> > Thank you for your review! Please see my comments inline.
> >
> >> On Mon, Mar 27, 2023, Carlos Bilbao wrote:

...

> >>> More details on the x86-specific solutions can be
> >>> +found in
> >>> +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
> >>> +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.
> >>
> >> So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
> >> doesn't provide anything besides increased confidentiality of guest memory, and
> >> SEV-ES doesn't provide integrity or validation of physical page assignment.
> >>
> >
> > Same
> >
>
> Personally, I think it's reasonable to mention SEV/SEV-ES in the context of
> confidential computing and acknowledge their relevance in this area.
>
> But there is no mention of SEV or SEV-ES in this draft. And the document we
> reference there covers AMD-SNP, which provides integrity.

...

> >>> +While the traditional hypervisor has unlimited access to guest data and
> >>> +can leverage this access to attack the guest, the CoCo systems mitigate
> >>> +such attacks by adding security features like guest data confidentiality
> >>> +and integrity protection. This threat model assumes that those features
> >>> +are available and intact.
> >>
> >> Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
> >> considered CoCo.
>
> Again, nobody mentioned SEV/SEV-ES here.

Yes, somebody did. Unless your dictionary has a wildly different definition for
"all".

: +Overview and terminology
: +========================
: +
: +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
: +virtualization technologies that allow Cloud Service Providers (CSPs) to
: +provide stronger security guarantees to their clients (usually referred to
: +as tenants) by excluding all the CSP's infrastructure and SW out of the
: +tenant's Trusted Computing Base (TCB).
: +
: +While the concrete implementation details differ between technologies, all
                                                                          ^^^
: +of these mechanisms provide increased confidentiality and integrity of CoCo
: +guest memory and execution state (vCPU registers), more tightly controlled
: +guest interrupt injection, as well as some additional mechanisms to control
: +guest-host page mapping. More details on the x86-specific solutions can be
: +found in

This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
explicitly mentioning SEV doesn't magically warp reality to make descriptions like
this one from security/secrets/coco.rst disappear:

Introduction
============

Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
Virtualization) allows guest owners to inject secrets into the VMs
memory without the host/hypervisor being able to read them.

My complaint about this document being too Intel/AMD centric isn't that it doesn't
mention other implementations, it's that the doc describes CoCo purely from the
narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
release and not an objective overview of CoCo.

2023-04-26 16:07:46

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Apr 26, 2023, Dave Hansen wrote:
> On 4/25/23 08:02, Sean Christopherson wrote:
> >> +While the traditional hypervisor has unlimited access to guest data and
> >> +can leverage this access to attack the guest, the CoCo systems mitigate
> >> +such attacks by adding security features like guest data confidentiality
> >> +and integrity protection. This threat model assumes that those features
> >> +are available and intact.
> > Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
> > considered CoCo.
>
> This document is clearly trying to draw a line in the sand and say:
>
> CoCo on one side, non-CoCo on the other
>
> I think it's less important to name that line than it is to realize what
> we need to do on one side versus the other.
>
> For instance, if the system doesn't have strong guest memory
> confidentiality protection, then it's kinda silly to talk about the
> guest's need to defend against "CoCo guest data attacks".
>
> Sure, the mitigations for "CoCo guest data attacks" are pretty sane even
> without all this CoCo jazz. But if your goal is to mitigate damage that
> a VMM out of the TCB can do, then they don't do much if there isn't
> VMM->guest memory confidentiality in the first place.
>
> So, sure, CoCo implementations exist along a continuum. SGX is in there
> (with and without integrity protection), as are SEV=>SEV-ES=>SEV-SNP and
> MKTME=>TDX.
>
> This document is making the case that the kernel should go to some new
> (and extraordinary) lengths to defend itself against ... something.

Then name the document something other than confidential-computing.rst, e.g.
tdx-and-snp-threat-model.rst. Because this doc isn't remotely close to achieving
its stated goal of providing an "architecture-agnostic introduction ... to help
developers gain a foundational understanding of the subject". IMO, it does more
harm than good on that front because it presents Intel's and AMD's viewpoints as
if they are widely accepted for all of CoCo, and that is just flagrantly false.

: In order to effectively engage with the linux-coco mailing list and contribute
: to ongoing kernel efforts, one must have a thorough familiarity with these
: concepts. Add a concise, architecture-agnostic introduction and threat model
: to provide a reference for ongoing design discussions and to help developers
: gain a foundational understanding of the subject.

2023-04-26 16:20:20

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Apr 26, 2023, James Bottomley wrote:
> On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> > > On Mon, Mar 27, 2023, Carlos Bilbao wrote:
> [...]
> > > > +provide stronger security guarantees to their clients (usually
> > > > referred to +as tenants) by excluding all the CSP's
> > > > infrastructure and SW out of the +tenant's Trusted Computing Base
> > > > (TCB).
> > >
> > > This is inaccurate, the provider may still have software and/or
> > > hardware in the TCB.
> >
> > Well, this is the end goal where we want to be,

If by "we" you mean Intel and AMD, then yes, that is probably a true statement.
But those goals have nothing to do with security.

> > the practical deployment can differ of course. We can rephrase that it
> > "allows to exclude all the CSP's infrastructure and SW out of tenant's
> > TCB."
>
> That's getting even more inaccurate. To run in a Cloud with CoCo you
> usually have to insert some provided code, like OVMF and, for AMD, the
> SVSM. These are often customized by the CSP to suit the cloud
> infrastructure, so you're running their code. The goal, I think, is to
> make sure you only run code you trust (some of which may come from the
> CSP) in your TCB, which is very different from the statement above.

Yes. And taking things a step further, if we were to ask security-conscious users
what they would choose to have in their TCB: (a) closed-source firmware written by
a hardware vendor, or (b) open-source software that is provided by CSPs, I am
betting the overwhelming majority would choose (b).

2023-04-26 19:24:27

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/26/23 10:51 AM, Sean Christopherson wrote:
> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>> Hello Sean,
>>
>> On 4/26/23 8:32 AM, Reshetova, Elena wrote:
>>> Hi Sean,
>>>
>>> Thank you for your review! Please see my comments inline.
>>>
>>>> On Mon, Mar 27, 2023, Carlos Bilbao wrote:
>
> ...
>
>>>>> More details on the x86-specific solutions can be
>>>>> +found in
>>>>> +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
>>>>> +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.
>>>>
>>>> So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
>>>> doesn't provide anything besides increased confidentiality of guest memory, and
>>>> SEV-ES doesn't provide integrity or validation of physical page assignment.
>>>>
>>>
>>> Same
>>>
>>
>> Personally, I think it's reasonable to mention SEV/SEV-ES in the context of
>> confidential computing and acknowledge their relevance in this area.
>>
>> But there is no mention of SEV or SEV-ES in this draft. And the document we
>> reference there covers AMD-SNP, which provides integrity.
>
> ...
>
>>>>> +While the traditional hypervisor has unlimited access to guest data and
>>>>> +can leverage this access to attack the guest, the CoCo systems mitigate
>>>>> +such attacks by adding security features like guest data confidentiality
>>>>> +and integrity protection. This threat model assumes that those features
>>>>> +are available and intact.
>>>>
>>>> Again, if you're claiming integrity is a key tenet, then SEV and SEV-ES can't be
>>>> considered CoCo.
>>
>> Again, nobody mentioned SEV/SEV-ES here.
>
> Yes, somebody did. Unless your dictionary has a wildly different definition for
> "all".
>
> : +Overview and terminology
> : +========================
> : +
> : +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
> : +virtualization technologies that allow Cloud Service Providers (CSPs) to
> : +provide stronger security guarantees to their clients (usually referred to
> : +as tenants) by excluding all the CSP's infrastructure and SW out of the
> : +tenant's Trusted Computing Base (TCB).
> : +
> : +While the concrete implementation details differ between technologies, all
>                                                                           ^^^
> : +of these mechanisms provide increased confidentiality and integrity of CoCo
> : +guest memory and execution state (vCPU registers), more tightly controlled
> : +guest interrupt injection, as well as some additional mechanisms to control
> : +guest-host page mapping. More details on the x86-specific solutions can be
> : +found in
>
> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
> this one from security/secrets/coco.rst disappear:
>
> Introduction
> ============
>
> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
> Virtualization) allows guest owners to inject secrets into the VMs
> memory without the host/hypervisor being able to read them.
>
> My complaint about this document being too Intel/AMD centric isn't that it doesn't
> mention other implementations, it's that the doc describes CoCo purely from the
> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
> release and not an objective overview of CoCo.

Be specific about the parts of the document that you feel are too
AMD/Intel centric, and we will correct them.

Thanks,
Carlos

2023-04-26 19:59:57

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> On 4/26/23 10:51 AM, Sean Christopherson wrote:
> > This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
> > explicitly mentioning SEV doesn't magically warp reality to make descriptions like
> > this one from security/secrets/coco.rst disappear:
> >
> > Introduction
> > ============
> >
> > Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
> > Virtualization) allows guest owners to inject secrets into the VMs
> > memory without the host/hypervisor being able to read them.
> >
> > My complaint about this document being too Intel/AMD centric isn't that it doesn't
> > mention other implementations, it's that the doc describes CoCo purely from the
> > narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
> > release and not an objective overview of CoCo.
>
> Be specific about the parts of the document that you feel are too
> AMD/Intel centric, and we will correct them.

The whole thing? There aren't specific parts that are too SNP/TDX centric, the
entire tone and approach of the document is wrong. As I responded to Dave, I
would feel differently if the document were named tdx-and-snp-threat-model.rst,
but this patch proposes a generic confidential-computing.rst and presents the
SNP+TDX confidential VM use case as if it's the *only* confidential computing use
case.

2023-04-26 20:14:47

by Dave Hansen

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/26/23 12:21, Carlos Bilbao wrote:
>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
>> Virtualization) allows guest owners to inject secrets into the VMs
>> memory without the host/hypervisor being able to read them.
>>
>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
>> mention other implementations, it's that the doc describes CoCo purely from the
>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
>> release and not an objective overview of CoCo.
> Be specific about the parts of the document that you feel are too
> AMD/Intel centric, and we will correct them.

That's kinda not the point.

Confidential computing covers a *REALLY* wide swath of technologies,
even on just AMD and Intel: SGX, SEV{,-ES,-SNP}, MKTME, TDX.

But this document is talking about one *VERY* *SPECIFIC* thing: VMs
running under SEV-SNP or TDX and in a very specific environment: CSPs.
Also, not even *ALL* CSPs, a subset of CSPs. You're tossing out a huge
chunk of the confidential computing world without acknowledging it.

I don't have any great suggestions on what you call this subset. Maybe
you get an ack from the CoVE folks:

> https://lore.kernel.org/all/[email protected]/

and call it
tdx-and-snp-and-cove-at-some-random-unnamed-big-fancy-csps-threat-model.rst.
Just add an -and-foo each time a new hardware vendor shows up until
someone smarter than us finds a good name.

But I do think the difficulty here is in drawing that "line in the sand"
I was talking about. You're trying to make the argument that once you
get hardware support for:

1. Increased guest data and state confidentiality from a VMM
2. Better guest data and state integrity in the face of VMM modification
3. More tightly controlled guest interrupt injection
4. Some additional mechanisms to control guest-host page mapping.

... then you need all this *other* stuff that the document talks about.

I think #3 and #4 are really just (SEV and TDX) implementation details.
I can certainly imagine a sane architecture without all of x86's warts
that doesn't care much about #3.

I think I know what #4 is talking about, but it's too handwavy for me to
even offer any improvements. I actually think #4 is just a subset of
integrity protection: make sure that the same data that the guest puts
in memory at a guest physical address comes back out at that address
later. SEV and TDX implement that by preventing the host from remapping
guest physical address space willy nilly, but it's just integrity
protection by another name.
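
To make the "integrity protection by another name" argument concrete, here is a small sketch of the guest-visible property: data written at a guest physical address should read back unchanged at that address. Everything in it is an assumption made for illustration (the class, the `pin_mappings` flag, and the page names are hypothetical and do not model how SEV or TDX actually enforce this).

```python
class ToyGuestMemory:
    """Hypothetical guest memory whose backing pages are chosen by the host."""

    def __init__(self, pin_mappings):
        self.pin_mappings = pin_mappings  # True: host may not remap in-use addresses
        self.pages = {}                   # host page name -> contents
        self.mapping = {}                 # guest physical address -> host page name

    def host_map(self, gpa, host_page):
        if self.pin_mappings and gpa in self.mapping:
            raise PermissionError("remap of an in-use guest address refused")
        self.mapping[gpa] = host_page
        self.pages.setdefault(host_page, b"\x00")

    def guest_write(self, gpa, data):
        self.pages[self.mapping[gpa]] = data

    def guest_read(self, gpa):
        return self.pages[self.mapping[gpa]]


def read_back_holds(mem):
    """Write at a guest address, let the host try to swap the backing page,
    then check whether the guest still reads what it wrote."""
    mem.host_map(0x1000, "A")
    mem.guest_write(0x1000, b"secret")
    try:
        mem.host_map(0x1000, "B")  # host attempts to remap the address
    except PermissionError:
        pass
    return mem.guest_read(0x1000) == b"secret"


unprotected = read_back_holds(ToyGuestMemory(pin_mappings=False))
protected = read_back_holds(ToyGuestMemory(pin_mappings=True))
```

In the unprotected case the remap silently changes what the guest reads, so the read-back property fails; refusing the remap restores it, without the guest ever needing to reason about page tables directly.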

2023-04-26 20:18:00

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/26/23 2:53 PM, Sean Christopherson wrote:
> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>> On 4/26/23 10:51 AM, Sean Christopherson wrote:
>>> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
>>> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
>>> this one from security/secrets/coco.rst disappear:
>>>
>>> Introduction
>>> ============
>>>
>>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
>>> Virtualization) allows guest owners to inject secrets into the VMs
>>> memory without the host/hypervisor being able to read them.
>>>
>>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
>>> mention other implementations, it's that the doc describes CoCo purely from the
>>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
>>> release and not an objective overview of CoCo.
>>
>> Be specific about the parts of the document that you feel are too
>> AMD/Intel centric, and we will correct them.
>
> The whole thing? There aren't specific parts that are too SNP/TDX centric, the
> entire tone and approach of the document is wrong. As I responded to Dave, I
> would feel differently if the document were named tdx-and-snp-threat-model.rst,
> but this patch proposes a generic confidential-computing.rst and presents the
> SNP+TDX confidential VM use case as if it's the *only* confidential computing use
> case.

What part of us describing the current Linux kernel threat model or
defining basic concepts of confidential computing is SNP/TDX centric?

IMHO, simply stating that "the whole thing" is wrong and that you don't
like the "tone", is not making a good enough case for us to change
anything, including the name of the document.

Thanks,
Carlos

2023-04-26 21:33:59

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> On 4/26/23 2:53 PM, Sean Christopherson wrote:
> > On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> >> On 4/26/23 10:51 AM, Sean Christopherson wrote:
> >>> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
> >>> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
> >>> this one from security/secrets/coco.rst disappear:
> >>>
> >>> Introduction
> >>> ============
> >>>
> >>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
> >>> Virtualization) allows guest owners to inject secrets into the VMs
> >>> memory without the host/hypervisor being able to read them.
> >>>
> >>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
> >>> mention other implementations, it's that the doc describes CoCo purely from the
> >>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
> >>> release and not an objective overview of CoCo.
> >>
> >> Be specific about the parts of the document that you feel are too
> >> AMD/Intel centric, and we will correct them.
> >
> > The whole thing? There aren't specific parts that are too SNP/TDX centric, the
> > entire tone and approach of the document is wrong. As I responded to Dave, I
> > would feel differently if the document were named tdx-and-snp-threat-model.rst,
> > but this patch proposes a generic confidential-computing.rst and presents the
> > SNP+TDX confidential VM use case as if it's the *only* confidential computing use
> > case.
>
> What part of us describing the current Linux kernel threat model or
> defining basic concepts of confidential computing is SNP/TDX centric?
>
> IMHO, simply stating that "the whole thing" is wrong and that you don't
> like the "tone", is not making a good enough case for us to change
> anything, including the name of the document.

I honestly don't know how to respond since you are either unable or unwilling to
see the problems with naming a document "confidential computing" and then talking
only about one very, very specific flavor of confidential computing as if that is
the only flavor of confidential computing.

So if you want to push this doc as is, please add my

Nacked-by: Sean Christopherson <[email protected]>

2023-04-26 22:37:57

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/26/23 4:33 PM, Sean Christopherson wrote:
> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>> On 4/26/23 2:53 PM, Sean Christopherson wrote:
>>> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>>>> On 4/26/23 10:51 AM, Sean Christopherson wrote:
>>>>> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
>>>>> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
>>>>> this one from security/secrets/coco.rst disappear:
>>>>>
>>>>> Introduction
>>>>> ============
>>>>>
>>>>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
>>>>> Virtualization) allows guest owners to inject secrets into the VMs
>>>>> memory without the host/hypervisor being able to read them.
>>>>>
>>>>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
>>>>> mention other implementations, it's that the doc describes CoCo purely from the
>>>>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
>>>>> release and not an objective overview of CoCo.
>>>>
>>>> Be specific about the parts of the document that you feel are too
>>>> AMD/Intel centric, and we will correct them.
>>>
>>> The whole thing? There aren't specific parts that are too SNP/TDX centric, the
>>> entire tone and approach of the document is wrong. As I responded to Dave, I
>>> would feel differently if the document were named tdx-and-snp-threat-model.rst,
>>> but this patch proposes a generic confidential-computing.rst and presents the
>>> SNP+TDX confidential VM use case as if it's the *only* confidential computing use
>>> case.
>>
>> What part of us describing the current Linux kernel threat model or
>> defining basic concepts of confidential computing is SNP/TDX centric?
>>
>> IMHO, simply stating that "the whole thing" is wrong and that you don't
>> like the "tone", is not making a good enough case for us to change
>> anything, including the name of the document.
>
> I honestly don't know how to respond since you are either unable or unwilling to
> see the problems with naming a document "confidential computing" and then talking
> only about one very, very specific flavor of confidential computing as if that is
> the only flavor of confidential computing.
>
> So if you want to push this doc as is, please add my
>
> Nacked-by: Sean Christopherson <[email protected]>
>

Well, the intent was and still is to work with the community to collect
feedback and finish a version where all flavors are represented -- see the
Motivation section of the draft. But if you are unable or unwilling to
collaborate with us, just please make sure to read whatever is the final
version. I will assume it has your Nacked-by otherwise.

To the rest, please do point out the specific parts you consider to be
AMD/Intel centric. We will do our best to fix them.

Thanks,
Carlos

2023-04-27 12:40:26

Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> > On 4/26/23 2:53 PM, Sean Christopherson wrote:
> > > On Wed, Apr 26, 2023, Carlos Bilbao wrote:
> > >> On 4/26/23 10:51 AM, Sean Christopherson wrote:
> > >>> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
> > >>> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
> > >>> this one from security/secrets/coco.rst disappear:
> > >>>
> > >>> Introduction
> > >>> ============
> > >>>
> > >>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
> > >>> Virtualization) allows guest owners to inject secrets into the VMs
> > >>> memory without the host/hypervisor being able to read them.
> > >>>
> > >>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
> > >>> mention other implementations, it's that the doc describes CoCo purely from the
> > >>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
> > >>> release and not an objective overview of CoCo.
> > >>
> > >> Be specific about the parts of the document that you feel are too
> > >> AMD/Intel centric, and we will correct them.
> > >
> > > The whole thing? There aren't specific parts that are too SNP/TDX centric, the
> > > entire tone and approach of the document is wrong. As I responded to Dave, I
> > > would feel differently if the document were named tdx-and-snp-threat-model.rst,
> > > but this patch proposes a generic confidential-computing.rst and presents the
> > > SNP+TDX confidential VM use case as if it's the *only* confidential computing use
> > > case.
> >
> > What part of us describing the current Linux kernel threat model or
> > defining basic concepts of confidential computing is SNP/TDX centric?
> >
> > IMHO, simply stating that "the whole thing" is wrong and that you don't
> > like the "tone", is not making a good enough case for us to change
> > anything, including the name of the document.
>
> I honestly don't know how to respond since you are either unable or unwilling to
> see the problems with naming a document "confidential computing" and then talking
> only about one very, very specific flavor of confidential computing as if that is
> the only flavor of confidential computing.

This is simply an unfair statement. I replied yesterday on this particular angle, i.e.
let's think about how to name this properly: I explained our thinking behind using the
"Confidential Cloud Computing" term (with references to academia using it) and asked
what the better name should be. I didn’t get a reply to that, but here you say we
are not willing to cooperate...

So I don’t think it is fair to say that we don’t take feedback!

I agree with Dave that the goal of this document is not to come up with a
fancy name (I am fine with calling it anything), but to introduce kernel developers to the
new Linux threat model angle for this-particular-use-case-of-confidential-computing,
so that when we submit the hardening mechanisms in the future people are
already familiar with why we need to do this and we don’t have to repeat this story
again and again.

Best Regards,
Elena.

2023-04-27 12:55:13

Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

> On Wed, Apr 26, 2023, James Bottomley wrote:
> > On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> > > > On Mon, Mar 27, 2023, Carlos Bilbao wrote:
> > [...]
> > > > > +provide stronger security guarantees to their clients (usually
> > > > > referred to +as tenants) by excluding all the CSP's
> > > > > infrastructure and SW out of the +tenant's Trusted Computing Base
> > > > > (TCB).
> > > >
> > > > This is inaccurate, the provider may still have software and/or
> > > > hardware in the TCB.
> > >
> > > Well, this is the end goal where we want to be,
>
> If by "we" you mean Intel and AMD, then yes, that is probably a true statement.
> But those goals have nothing to do with security.

I disagree from a pure security point of view, see below.

>
> > > the practical deployment can differ of course. We can rephrase that it
> > > "allows to exclude all the CSP's infrastructure and SW out of tenant's
> > > TCB."
> >
> > That's getting even more inaccurate. To run in a Cloud with CoCo you
> > usually have to insert some provided code, like OVMF and, for AMD, the
> > SVSM. These are often customized by the CSP to suit the cloud
> > infrastructure, so you're running their code. The goal, I think, is to
> > make sure you only run code you trust (some of which may come from the
> > CSP) in your TCB, which is very different from the statement above.
>
> Yes. And taking things a step further, if we were to ask security concious users
> what they would choose to have in their TCB: (a) closed-source firmware written by
> a hardware vendor, or (b) open-source software that is provided by CSPs, I am
> betting the overwhelming majority would choose (b).

As I already replied in my earlier message from yesterday, yes, this is a choice
that anyone has, and everyone is free to make it. No questions asked.
(Btw, please note that the above statement is not 100% accurate, since the source
code for the Intel TDX module, at least, is public).
However, if, as you said, the majority choose (b), why do they need to enable
confidential cloud computing technologies like TDX or SEV-SNP?
If they choose (b), then the whole threat model described in this document simply
does not apply to them and they can forget about anything that we try to describe
here.

Now, from a pure security point of view, the choice between (a) and (b) is not so easily
made imo. Usually we take into account many factors that affect the likelihood
that a certain piece of SW has vulnerabilities. These include the
size of the codebase, its complexity, its attack surface exposure towards external
interfaces, the level of testing, whether the code is public, code dependency chains, etc.
A smaller codebase with no dependencies and a small set of exposed interfaces is usually
easier to review from a security point of view, given that the code is public.

Best Regards,
Elena.

2023-04-27 13:11:42

Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

> On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> > > On Mon, Mar 27, 2023, Carlos Bilbao wrote:
> [...]
> > > > +provide stronger security guarantees to their clients (usually
> > > > referred to +as tenants) by excluding all the CSP's
> > > > infrastructure and SW out of the +tenant's Trusted Computing Base
> > > > (TCB).
> > >
> > > This is inaccurate, the provider may still have software and/or
> > > hardware in the TCB.
> >
> > Well, this is the end goal where we want to be, the practical
> > deployment can differ of course. We can rephrase that it "allows to
> > exclude all the CSP's infrastructure and SW out of tenant's TCB."
>
> That's getting even more inaccurate. To run in a Cloud with CoCo you
> usually have to insert some provided code, like OVMF and, for AMD, the
> SVSM. These are often customized by the CSP to suit the cloud
> infrastructure, so you're running their code.

Agree, this *can be the case in practice*, but it doesn’t have to be. Nothing in the
CoCo technology itself prevents tenants in this model from having their own virtual FW.
The fact that a CSP's infrastructure might not support this case is a totally different story.

> The goal, I think, is to
> make sure you only run code you trust (some of which may come from the
> CSP) in your TCB, which is very different from the statement above.

In the end it comes down to the agreement between a CSP and a tenant, i.e.
how much tenants are willing to trust the CSP and how much of the CSP's code they
would take into their TCB (using proper means of establishing trust in these
pieces). This agreement is out of anyone's control here, and the only thing that
the CoCo technologies aim to provide is to enable all these different
models/agreements, including the ultimate case (if wanted) where tenants could run
without any CSP components in their TCB.

So, let’s fix the wording in the document so that it indeed doesn’t rule out any of
these agreement styles. But the goal of this document is not to describe CoCo
use cases; it is to talk about what the Linux kernel needs to do, assuming there are
tenants who want to make sure they run a kernel inside a CoCo VM that is ready
to take the new threat model into account.

Best Regards,
Elena.

2023-04-27 13:26:49

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Thu, 2023-04-27 at 12:43 +0000, Reshetova, Elena wrote:
>
> > On Wed, Apr 26, 2023, James Bottomley wrote:
> > > On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
[...]
> > > > the practical deployment can differ of course. We can rephrase
> > > > that it "allows to exclude all the CSP's infrastructure and SW
> > > > out of tenant's TCB."
> > >
> > > That's getting even more inaccurate. To run in a Cloud with
> > > CoCo you usually have to insert some provided code, like OVMF
> > > and, for AMD, the SVSM. These are often customized by the CSP to
> > > suit the cloud infrastructure, so you're running their code. The
> > > goal, I think, is to make sure you only run code you trust (some
> > > of which may come from the CSP) in your TCB, which is very
> > > different from the statement above.
> >
> > Yes. And taking things a step further, if we were to ask security
> > concious users what they would choose to have in their TCB: (a)
> > closed-source firmware written by a hardware vendor, or (b) open-
> > source software that is provided by CSPs, I am betting the
> > overwhelming majority would choose (b).
>
> As I already replied in my earlier message from yesterday, yes, this
> is the choice that anyone has and it is free to make this choice. No
> questions asked. (Btw, please note that the above statement is not
> 100% accurate since the source code for intel TDX module is at least
> public). However, if as you said the majority choose (b), why do they
> need to enable the Confidential cloud computing technologies like TDX
> or SEV-SNP? If they choose (b), then the whole threat model described
> in this document do not simply apply to them and they can forget
> about anything that we try to describe here.

I think the problem is that the tenor of the document is that the CSP
should be seen as the enemy of the tenant, whereas all CSPs want to be
seen as the partner of the tenant (admittedly so they can upsell
services). In particular, even if you adopt (b) there are several
reasons why you'd use confidential computing:

   1. Protection from other tenants who break containment in the cloud.
      These tenants could exfiltrate data from non-CoCo VMs, but likely
      would be detected before they had time to launch an attack using
      vulnerabilities in the current Linux device drivers.
   2. Legal data security. There's a lot of value in a CSP being able
      to make the legal statement that it does not have access to a
      customer's data because of CoCo.
   3. Insider threats (bribe a CSP admin employee). This one might get
      as far as trying to launch an attack on a CoCo VM, but having
      checks at the CSP to detect and defeat this would work instead of
      every insider threat having to be defeated inside the VM.

In all of those cases (which are not exhaustive) you can regard the CSP
as a partner of the tenant when it comes to preventing and detecting
threats to the CoCo VM, so extreme device driver hardening becomes far
less relevant to these fairly considerable use cases.

> Now from the pure security point of view the choice between (a) and
> (b) is not so easily done imo. Usually we take into account many
> factors that affect the risk/chances that certain piece of SW has a
> higher risk of having vulnerabilities. This includes the size of the
> codebase, its complexity, its attack surface exposure towards
> external interfaces, level of testing, whenever the code is public,
> code dependency chains, etc. Smaller codebase with no dependencies
> and small set of exposed interfaces is usually easier to review from
> security point of view given that the code is public.

This reads like an argument that, from a security point of view,
smaller proprietary code is better than larger, open source, code. I
really don't think we want to open this can of worms. Most industry
players have already bought the idea that open source improves security
because even if you can't rely on the community entirely, you can take
the code to a third party for analysis.

James

2023-04-27 14:21:26

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/27/23 7:29 AM, Reshetova, Elena wrote:
>> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>>> On 4/26/23 2:53 PM, Sean Christopherson wrote:
>>>> On Wed, Apr 26, 2023, Carlos Bilbao wrote:
>>>>> On 4/26/23 10:51 AM, Sean Christopherson wrote:
>>>>>> This document is named confidential-computing.rst, not tdx-and-snp.rst. Not
>>>>>> explicitly mentioning SEV doesn't magically warp reality to make descriptions like
>>>>>> this one from security/secrets/coco.rst disappear:
>>>>>>
>>>>>> Introduction
>>>>>> ============
>>>>>>
>>>>>> Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted
>>>>>> Virtualization) allows guest owners to inject secrets into the VMs
>>>>>> memory without the host/hypervisor being able to read them.
>>>>>>
>>>>>> My complaint about this document being too Intel/AMD centric isn't that it doesn't
>>>>>> mention other implementations, it's that the doc describes CoCo purely from the
>>>>>> narrow viewpoint of Intel TDX and AMD SNP, and to be blunt, reads like a press
>>>>>> release and not an objective overview of CoCo.
>>>>>
>>>>> Be specific about the parts of the document that you feel are too
>>>>> AMD/Intel centric, and we will correct them.
>>>>
>>>> The whole thing? There aren't specific parts that are too SNP/TDX centric, the
>>>> entire tone and approach of the document is wrong. As I responded to Dave, I
>>>> would feel differently if the document were named tdx-and-snp-threat-model.rst,
>>>> but this patch proposes a generic confidential-computing.rst and presents the
>>>> SNP+TDX confidential VM use case as if it's the *only* confidential computing use
>>>> case.
>>>
>>> What part of us describing the current Linux kernel threat model or
>>> defining basic concepts of confidential computing is SNP/TDX centric?
>>>
>>> IMHO, simply stating that "the whole thing" is wrong and that you don't
>>> like the "tone", is not making a good enough case for us to change
>>> anything, including the name of the document.
>>
>> I honestly don't know how to respond since you are either unable or unwilling to
>> see the problems with naming a document "confidential computing" and then talking
>> only about one very, very specific flavor of confidential computing as if that is
>> the only flavor of confidential computing.
>
> This is simply an unfair statement. I replied yesterday on this particular angle, i.e.
> let's think on how to name this properly: explained our thinking behind using the
> "Confidential Cloud Computing" term (with references to academia using it) and asked
> what the better name should be. I didn’t get a reply to that, but here you say we
> are not willing to cooperate...
>
> So I don’t think it is fair to say that we don’t take feedback!
>
> I agree with Dave that I think the goal of this document is not to come up with a
> fancy name (I am fine with call it anything), but to introduce kernel developers to the
> new Linux threat model angle for this-particular-use-case-of-confidential-computing.
> So that when we submit the hardening mechanisms in the future people are
> already familiar with why we need to do this and we don’t have to repeat this story
> again and again.

Yes! To reiterate, there are two things we definitely wish to do:

1. Narrow down the problem: This new document can be specific to CoCo in
virtual environments. v2 should be clear about that.

2. Gather feedback: we already received some input about potential bias
toward TDX/SNP, which should be addressed in v2.

Thanks,
Carlos

>
> Best Regards,
> Elena.

2023-04-27 15:27:37

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Thu, Apr 27, 2023, Carlos Bilbao wrote:
> On 4/27/23 7:29 AM, Reshetova, Elena wrote:
> > I agree with Dave that I think the goal of this document is not to come up with a
> > fancy name (I am fine with call it anything), but to introduce kernel developers to the
> > new Linux threat model angle for this-particular-use-case-of-confidential-computing.
> > So that when we submit the hardening mechanisms in the future people are
> > already familiar with why we need to do this and we don’t have to repeat this story
> > again and again.
>
> Yes! To reiterate, there's two things we definitely wish to do:
>
> 1. Narrow down the problem: This new document can be specific to CoCo in
> virtual environments. v2 should be clear about that.

Then rename the document as I already suggested. If you want to claim
confidential-computing.rst, then IMO such a doc needs to be written something
like the surprisingly good Wikipedia article[*]. Until one of those two things
happens, my NAK stands.

[*] https://en.wikipedia.org/wiki/Confidential_computing

2023-04-27 16:00:12

Subject: RE: [PATCH] docs: security: Confidential computing intro and threat model

> On Thu, 2023-04-27 at 12:43 +0000, Reshetova, Elena wrote:
> >
> > > On Wed, Apr 26, 2023, James Bottomley wrote:
> > > > On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> [...]
> > > > > the practical deployment can differ of course. We can rephrase
> > > > > that it "allows to exclude all the CSP's infrastructure and SW
> > > > > out of tenant's TCB."
> > > >
> > > > That's getting even more inaccurate. To run in a Cloud with
> > > > CoCo you usually have to insert some provided code, like OVMF
> > > > and, for AMD, the SVSM. These are often customized by the CSP to
> > > > suit the cloud infrastructure, so you're running their code. The
> > > > goal, I think, is to make sure you only run code you trust (some
> > > > of which may come from the CSP) in your TCB, which is very
> > > > different from the statement above.
> > >
> > > Yes. And taking things a step further, if we were to ask security
> > > concious users what they would choose to have in their TCB: (a)
> > > closed-source firmware written by a hardware vendor, or (b) open-
> > > source software that is provided by CSPs, I am betting the
> > > overwhelming majority would choose (b).
> >
> > As I already replied in my earlier message from yesterday, yes, this
> > is the choice that anyone has and it is free to make this choice. No
> > questions asked. (Btw, please note that the above statement is not
> > 100% accurate since the source code for intel TDX module is at least
> > public). However, if as you said the majority choose (b), why do they
> > need to enable the Confidential cloud computing technologies like TDX
> > or SEV-SNP? If they choose (b), then the whole threat model described
> > in this document do not simply apply to them and they can forget
> > about anything that we try to describe here.
>
> I think the problem is that the tenor of the document is that the CSP
> should be seen as the enemy of the tenant.

We didn’t intend this interpretation and it can certainly be fixed if
people see it this way.

> Whereas all CSP's want to be
> seen as the partner of the tenant (admittedly so they can upsell
> services). In particular, even if you adopt (b) there are several
> reasons why you'd use confidential computing:
>
> 1. Protection from other tenants who break containment in the cloud.
> These tenants could exfiltrate data from Non-CoCo VMs, but likely
> would be detected before they had time to launch an attack using
> vulnerabilities in the current linux device drivers.

Not sure how this "likely to be detected" is going to happen in practice.
If you have a known vulnerability against a CoCo VM (let's say in a device
driver interface it exposes), is it so much more difficult for an attacker
to break into a CoCo VM vs. a non-CoCo VM before being detected?

> 2. Legal data security. There's a lot of value in a CSP being able
> to make the legal statement that it does not have access to a
> customer data because of CoCo.

Let's leave legal out of the technical discussion; it is not my area.

> 3. Insider threats (bribe a CSP admin employee). This one might get
> as far as trying to launch an attack on a CoCo VM, but having
> checks at the CSP to detect and defeat this would work instead of
> every insider threat having to be defeated inside the VM.

Ok, this angle might be valid from the CSP's point of view, i.e. noticing such
insider attacks might be easier, I guess, with CoCo VMs.

>
> In all of those cases (which are not exhaustive) you can regard the CSP
> as a partner of the tenant when it comes to preventing and detecting
> threats to the CoCo VM, so extreme device driver hardening becomes far
> less relevant to these fairly considerable use cases.

I think the first case still holds, as well as one case that you have not listed:
a remote attacker attacking the CSP stack using some discovered and not yet
fixed vulnerability (the stack is big, bugs happen), getting control of the CSP stack
and then going after the CoCo VMs to see what it can get there.
What you are saying is that you (as a CSP) maintain a good first-level defense
to prevent an attacker from getting control of your/the CSP stack to begin with.
What we are trying to do is the next level of defense (very typical in security):
we assume that the first line of defense has been broken for some reason and
now there is a second one in place to actually protect the customer's end data.

>
> > Now from the pure security point of view the choice between (a) and
> > (b) is not so easily done imo. Usually we take into account many
> > factors that affect the risk/chances that certain piece of SW has a
> > higher risk of having vulnerabilities. This includes the size of the
> > codebase, its complexity, its attack surface exposure towards
> > external interfaces, level of testing, whenever the code is public,
> > code dependency chains, etc. Smaller codebase with no dependencies
> > and small set of exposed interfaces is usually easier to review from
> > security point of view given that the code is public.
>
> This reads like an argument that, from a security point of view,
> smaller proprietary code is better than larger, open source, code. I
> really don't think we want to open this can of worms.

I don’t think I made this statement: the code *has to be public*
for anyone to review, and I did explicitly list this in the statement above
as "given that the code is public". The only thing I meant is that it is
not so easy to make a call between (a) and (b) in all cases from a pure
security point of view.

Best Regards,
Elena.

2023-04-27 16:29:02

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Thu, 2023-04-27 at 15:47 +0000, Reshetova, Elena wrote:
> > On Thu, 2023-04-27 at 12:43 +0000, Reshetova, Elena wrote:
> > >
> > > > On Wed, Apr 26, 2023, James Bottomley wrote:
> > > > > On Wed, 2023-04-26 at 13:32 +0000, Reshetova, Elena wrote:
> > [...]
> > > > > > the practical deployment can differ of course. We can
> > > > > > rephrase that it "allows to exclude all the CSP's
> > > > > > infrastructure and SW out of tenant's TCB."
> > > > >
> > > > > That's getting even more inaccurate. To run in a Cloud with
> > > > > CoCo you usually have to insert some provided code, like OVMF
> > > > > and, for AMD, the SVSM. These are often customized by the
> > > > > CSP to suit the cloud infrastructure, so you're running their
> > > > > code. The goal, I think, is to make sure you only run code
> > > > > you trust (some of which may come from the CSP) in your TCB,
> > > > > which is very different from the statement above.
> > > >
> > > > Yes. And taking things a step further, if we were to ask
> > > > security concious users what they would choose to have in their
> > > > TCB: (a) closed-source firmware written by a hardware vendor,
> > > > or (b) open-source software that is provided by CSPs, I am
> > > > betting the overwhelming majority would choose (b).
> > >
> > > As I already replied in my earlier message from yesterday, yes,
> > > this is the choice that anyone has and it is free to make this
> > > choice. No questions asked. (Btw, please note that the above
> > > statement is not 100% accurate since the source code for intel
> > > TDX module is at least public). However, if as you said the
> > > majority choose (b), why do they need to enable the Confidential
> > > cloud computing technologies like TDX or SEV-SNP? If they choose
> > > (b), then the whole threat model described in this document do
> > > not simply apply to them and they can forget about anything that
> > > we try to describe here.
> >
> > I think the problem is that the tenor of the document is that the
> > CSP should be seen as the enemy of the tenant.
>
> We didn’t intend this interpretation and it can be certainly be fixed
> if people see it this way.
>
> > Whereas all CSP's want to be
> > seen as the partner of the tenant (admittedly so they can upsell
> > services). In particular, even if you adopt (b) there are several
> > reasons why you'd use confidential computing:
> >
> >    1. Protection from other tenants who break containment in the
> > cloud. These tenants could exfiltrate data from Non-CoCo VMs, but
> > likely would be detected before they had time to launch an attack
> > using vulnerabilities in the current linux device drivers.
>
> Not sure how this "likely to be detected" is going to happen in
> practice.

How do you arrive at that conclusion? Detecting malicious tenant
behaviour is bread and butter for clouds ... especially as a nasty
cloud break out is a potentially business destroying event.

> If you have a known vulnerability against a CoCo VM (let say in a
> device driver interface it exposes), is it so much more difficult for
> an attacker to break into CoCo VM vs non-CoCo VM before it is
> detected?

It's a question of practicality. Given that a tenant has broken
containment and potentially escalated to root, what, in addition, would
they have to do to exfiltrate data from a CoCo VM? The more they have
to do to launch the attack, the greater the chance of their being
detected.

> >    2. Legal data security. There's a lot of value in a CSP being
> > able to make the legal statement that it does not have access to a
> > customer data because of CoCo.
>
> Let's leave legal out of technical discussion, not my area.

It *is* a technical argument. This is about compliance and Data
Sovereignty, which are both services most clouds are interested in
providing because they're a potentially huge and fast growing market.

> >    3. Insider threats (bribe a CSP admin employee). This one might
> > get as far as trying to launch an attack on a CoCo VM, but having
> > checks at the CSP to detect and defeat this would work
> > instead of every insider threat having to be defeated inside the
> > VM.
>
> Ok, this angle might be valid from CSP point of view, i.e. noticing
> such insider attacks might be easier I guess with CoCo VMs.
>
> >
> > In all of those cases (which are not exhaustive) you can regard the
> > CSP as a partner of the tenant when it comes to preventing and
> > detecting threats to the CoCo VM, so extreme device driver
> > hardening becomes far less relevant to these fairly considerable
> > use cases.
>
> I think the first case still holds, as well as one case that you have
> not listed: a remote attacker attacking CSP stack using some
> discovered and not yet fixed vulnerability (stack is big, bugs
> happen), getting control of CSP stack and then going after the CoCo
> VMs to see what it can get there.

Well, that's not really any different from a containment break. Most
cloud security analysis is performed by outside entities who start with
"an attacker has gained root on your compute platform, what can they
do?". So they skip the how and move straight to what is the threat
potential.

> What you are saying is that you (as CSP) maintain the good first
> level defense to prevent attacker to get control of your/CSP stack to
> begin with. What we try to do is the next level of defense (very
> typical in any security): we assume that first line of defense has
> been broken for some reason and now there is a second one placed to
> actually protect customers end data.

Well, that's where cloud security analyses also start. However, what
you've missed is that the cloud detecting the attack and usually
shutting down the node is a valid response. Clouds actually invest
significantly in intrusion detection and remediation systems for this
reason.

> > > Now from the pure security point of view the choice between (a)
> > > and (b) is not so easily done imo. Usually we take into account
> > > many factors that affect the risk/chances that certain piece of
> > > SW has a higher risk of having vulnerabilities. This includes the
> > > size of the codebase, its complexity, its attack surface exposure
> > > towards external interfaces, level of testing, whenever the code
> > > is public, code dependency chains, etc. Smaller codebase with no
> > > dependencies and small set of exposed interfaces is usually
> > > easier to review from security point of view given that the code
> > > is public.
> >
> > This reads like an argument that, from a security point of view,
> > smaller proprietary code is better than larger, open source, code.
> > I really don't think we want to open this can of worms.
>
> I don’t think I have made this statement: the code *has to be public*
> for anyone to review and I did explicitly list this in the statement
> above as "given that the code is public".

Public but not open source is still a problem. The federal government
has walked into several cloud accounts demanding a source code security
review, which means the code was made public to them but not generally.
Without all customers or some third party being able to build the code
and verify it (or ideally supply it ... think something like Red Hat
built the OVMF code this cloud is using and you can prove it using
their build signatures) how do you know the source you're given
corresponds to the binary the signature verifies?

> Only thing I meant is that it is not not so easy to make a call
> between (a) and (b) in all cases from a pure security point of view.

Proper governance is usually listed as a requirement for security.
Public but not Open Source usually exists because of governance or
control issues, which can be cited as a security risk. After all,
whoever does this must have some reason for not running an open source
project in a security critical area.

James

2023-04-27 16:50:31

by Randy Dunlap

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/27/23 09:16, James Bottomley wrote:
> Public but not open source is still a problem. The federal government
> has walked into several cloud accounts demanding a source code security
> review, which means the code was made public to them but not generally.

Apparently we have different definitions of "public".
I don't call that public.

> Without all customers or some third party being able to build the code
> and verify it (or ideally supply it ... think something like Red Hat
> built the OVMF code this cloud is using and you can prove it using
> their build signatures) how do you know the source you're given
> corresponds to the binary the signature verifies?

--
~Randy

2023-04-27 17:26:20

by Michael S. Tsirkin

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On Thu, Apr 27, 2023 at 09:18:08AM -0400, James Bottomley wrote:
> I think the problem is that the tenor of the document is that the CSP
> should be seen as the enemy of the tenant. Whereas all CSPs want to be
> seen as the partner of the tenant (admittedly so they can upsell
> services). In particular, even if you adopt (b) there are several
> reasons why you'd use confidential computing:
>
> 1. Protection from other tenants who break containment in the cloud.
> These tenants could exfiltrate data from Non-CoCo VMs, but likely
> would be detected before they had time to launch an attack using
> vulnerabilities in the current linux device drivers.
> 2. Legal data security. There's a lot of value in a CSP being able
> to make the legal statement that it does not have access to a
> customer's data because of CoCo.
> 3. Insider threats (bribe a CSP admin employee). This one might get
> as far as trying to launch an attack on a CoCo VM, but having
> checks at the CSP to detect and defeat this would work instead of
> every insider threat having to be defeated inside the VM.

And generally, all these are instances of adopting a zero trust
architecture, right? Many CSPs have no need to access VM memory
so they would rather not have the ability.

--
MST

2023-04-27 18:01:32

Subject: Re: [PATCH] docs: security: Confidential computing intro and threat model

On 4/27/23 10:18 AM, Sean Christopherson wrote:
> On Thu, Apr 27, 2023, Carlos Bilbao wrote:
>> On 4/27/23 7:29 AM, Reshetova, Elena wrote:
>>> I agree with Dave: I think the goal of this document is not to come up with a
>>> fancy name (I am fine with calling it anything), but to introduce kernel developers to the
>>> new Linux threat model angle for this-particular-use-case-of-confidential-computing.
>>> So that when we submit the hardening mechanisms in the future people are
>>> already familiar with why we need to do this and we don’t have to repeat this story
>>> again and again.
>>
>> Yes! To reiterate, there are two things we definitely wish to do:
>>
>> 1. Narrow down the problem: This new document can be specific to CoCo in
>> virtual environments. v2 should be clear about that.
>
> Then rename the document as I already suggested. If you want to claim
> confidential-computing.rst, then IMO such a doc needs to be written something
> like the surprisingly good Wikipedia article[*]. Until one of those two things
> happens, my NAK stands.
>
> [*] https://en.wikipedia.org/wiki/Confidential_computing

That's "mea culpa". I should have made it clearer in my previous emails that
changing the name is a non-issue. Also that we are very interested in
feedback from other CoCo flavors. In this regard, we've reached out to ARM
and RISC-V folks. Hopefully, they help us improve the doc and we can add
them as maintainers. Whenever we have a v2, I'd like to CC people from the
CSP sector (Oracle, etc.) as well.

Thanks,
Carlos

2023-04-27 18:29:51