LinuxLists.cc - [RFC] Randomness on confidential computing platforms

2024-01-26 19:00:13

Subject: [RFC] Randomness on confidential computing platforms

Problem Statement

Currently, the Linux RNG uses random input obtained from the x86
RDRAND/RDSEED instructions (if present) during the early initialization
stage (by mixing the obtained input into the random pool via
_mix_pool_bytes()), as well as for seeding/reseeding the ChaCha-based
CRNG. When calls to both RDRAND and RDSEED fail (including RDRAND's
internal retries), timing-based fallbacks are used in the latter case,
and in the early boot case this source of entropy input is simply
skipped. Overall, the Linux RNG has many other sources of entropy
(depending on the HW in use), but the dominant one is interrupts.

In the Confidential Computing Guest threat model, given the absence of
any special trusted HW for secure entropy input, the RDRAND/RDSEED
instructions are the only entropy source that is unobservable outside of
the Confidential Computing Guest TCB. However, with enough pressure on
these instructions from multiple cores (see Intel SDM, Volume 1, Section
7.3.17, “Random Number Generator Instructions”), they can be made to
fail on purpose, forcing the Confidential Computing Guest's Linux RNG to
use only Host/VMM-controlled entropy sources.
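
To make the failure mode concrete, here is a minimal userspace model of
a bounded retry followed by a fallback. It is only a sketch: sim_hw,
sim_rdrand() and get_seed() are hypothetical names, the hardware is
simulated rather than executed, and the retry budget of 10 is assumed to
match the kernel's RDRAND_RETRY_LOOPS.

```c
#include <stdbool.h>

/* Assumption: the kernel's RDRAND retry budget (RDRAND_RETRY_LOOPS) is 10. */
#define RETRY_LOOPS 10

/* Hypothetical model of an RNG unit under pressure from other cores. */
struct sim_hw {
	unsigned int failures_left;	/* attempts that will still fail */
};

enum seed_source { SEED_HW, SEED_HOST_VISIBLE };

/* Stand-in for one RDRAND attempt: false means the carry flag was clear. */
static bool sim_rdrand(struct sim_hw *hw, unsigned long *out)
{
	if (hw->failures_left > 0) {
		hw->failures_left--;
		return false;
	}
	*out = 0x5eedUL;	/* placeholder value, not actually random */
	return true;
}

/*
 * Bounded retry followed by a fallback. The fallback argument stands
 * for input that the host can observe or influence.
 */
static enum seed_source get_seed(struct sim_hw *hw, unsigned long fallback,
				 unsigned long *seed)
{
	for (unsigned int i = 0; i < RETRY_LOOPS; i++) {
		if (sim_rdrand(hw, seed))
			return SEED_HW;
	}
	*seed = fallback;
	return SEED_HOST_VISIBLE;
}
```

With a few transient failures get_seed() still returns SEED_HW; once
failures outlast the retry budget, the caller silently ends up with
host-visible input, which is the case this RFC is about.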

Solution options

There are several possible solutions to this problem and the intention
of this RFC is to initiate a joint discussion. Here are some options
that have been considered:

1. Do nothing and accept the risk.
2. Force endless looping on RDRAND/RDSEED instructions when run in a
   Confidential Computing Guest (this patch). This option turns the
   attack against the quality of cryptographic randomness provided by
   the Confidential Computing Guest's Linux RNG into a DoS attack
   against the Confidential Computing Guest itself (DoS attacks are out
   of scope for the Confidential Computing threat model).
3. Panic after enough retries of RDRAND/RDSEED instructions fail.
   Another DoS variant against the Guest.
4. Exit to the host/VMM with an error indication after a Confidential
   Computing Guest fails to obtain random input from RDRAND/RDSEED
   instructions after a reasonable number of retries. This option allows
   the host/VMM to take some corrective action in cases where the load
   on RDRAND/RDSEED instructions comes from another actor, e.g. another
   guest VM. The exit to the host/VMM in such cases can be made
   transparent to the Confidential Computing Guest in the TDX case with
   the assistance of the TDX module component.
5. Any other, better option?

The patch below implements the second option. I believe the problem is
common to Intel TDX and AMD SEV. The patch covers both.
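
The difference between options 1 to 3 can be sketched as a small
userspace model. This is not the patch itself: the enum names, sim_rng
and request_random() are hypothetical, the instruction is simulated, and
the budget parameter exists only so that the "loop forever" case
terminates in a test.

```c
#include <stdbool.h>

enum policy { POLICY_FALLBACK, POLICY_LOOP, POLICY_PANIC };
enum outcome { OUT_HW_RANDOM, OUT_WEAK_FALLBACK, OUT_STALLED, OUT_PANIC };

#define RETRIES 10	/* hypothetical "reasonable number of retries" */

/* Simulated instruction: fails for the first fail_count attempts. */
struct sim_rng {
	unsigned long attempts;
	unsigned long fail_count;
};

static bool sim_try(struct sim_rng *rng)
{
	return rng->attempts++ >= rng->fail_count;
}

/*
 * budget bounds how long the simulation watches a looping guest; a real
 * option-2 guest has no such bound and simply never makes progress.
 */
static enum outcome request_random(struct sim_rng *rng, enum policy p,
				   unsigned long budget)
{
	unsigned long limit = (p == POLICY_LOOP) ? budget : RETRIES;

	for (unsigned long i = 0; i < limit; i++) {
		if (sim_try(rng))
			return OUT_HW_RANDOM;
	}

	switch (p) {
	case POLICY_LOOP:
		return OUT_STALLED;		/* option 2: DoS, no weak output */
	case POLICY_PANIC:
		return OUT_PANIC;		/* option 3: DoS via panic */
	default:
		return OUT_WEAK_FALLBACK;	/* option 1: accept the risk */
	}
}
```

Only POLICY_FALLBACK ever hands out data that did not come from the
hardware source; the other two trade that for loss of availability.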
---
arch/x86/boot/compressed/kaslr.c | 6 ++++++
arch/x86/boot/compressed/mem.c | 26 -------------------------
arch/x86/boot/compressed/misc.h | 3 +++
arch/x86/boot/compressed/sev.c | 5 +++++
arch/x86/boot/compressed/sev.h | 2 ++
arch/x86/boot/compressed/tdx.c | 32 ++++++++++++++++++++++++++-----
arch/x86/boot/compressed/tdx.h | 2 ++
arch/x86/coco/core.c | 2 ++
arch/x86/include/asm/archrandom.h | 22 ++++++++++++++++-----
include/linux/cc_platform.h | 11 +++++++++++
10 files changed, 75 insertions(+), 36 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index dec961c6d16a..a7bba37c7539 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -23,6 +23,7 @@
 #include "error.h"
 #include "../string.h"
 #include "efi.h"
+#include "sev.h"

 #include <generated/compile.h>
 #include <linux/module.h>
@@ -304,6 +305,11 @@ static void handle_mem_options(void)
 	return;
 }

+bool rd_loop(void)
+{
+	return early_is_tdx_guest() || sev_enabled();
+}
+
 /*
  * In theory, KASLR can put the kernel anywhere in the range of [16M, MAXMEM)
  * on 64-bit, and [16M, KERNEL_IMAGE_SIZE) on 32-bit.
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
index dbba332e4a12..84a9d9ad98b2 100644
--- a/arch/x86/boot/compressed/mem.c
+++ b/arch/x86/boot/compressed/mem.c
@@ -6,32 +6,6 @@
 #include "sev.h"
 #include <asm/shared/tdx.h>

-/*
- * accept_memory() and process_unaccepted_memory() called from EFI stub which
- * runs before decompressor and its early_tdx_detect().
- *
- * Enumerate TDX directly from the early users.
- */
-static bool early_is_tdx_guest(void)
-{
-	static bool once;
-	static bool is_tdx;
-
-	if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST))
-		return false;
-
-	if (!once) {
-		u32 eax, sig[3];
-
-		cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax,
-			    &sig[0], &sig[2], &sig[1]);
-		is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig));
-		once = true;
-	}
-
-	return is_tdx;
-}
-
 void arch_accept_memory(phys_addr_t start, phys_addr_t end)
 {
 	/* Platform-specific memory-acceptance call goes here */
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index bc2f0f17fb90..3fd0aba836e7 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -255,4 +255,7 @@ static inline bool init_unaccepted_memory(void) { return false; }
extern struct efi_unaccepted_memory *unaccepted_table;
void accept_memory(phys_addr_t start, phys_addr_t end);

+#define rd_loop rd_loop
+extern bool rd_loop(void);
+
#endif /* BOOT_COMPRESSED_MISC_H */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 454acd7a2daf..5e7fb31e630b 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -125,6 +125,11 @@ static bool fault_in_kernel_space(unsigned long address)
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"

+bool sev_enabled(void)
+{
+	return sev_status & MSR_AMD64_SEV_ENABLED;
+}
+
 bool sev_snp_enabled(void)
 {
 	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
diff --git a/arch/x86/boot/compressed/sev.h b/arch/x86/boot/compressed/sev.h
index fc725a981b09..ec99e0390324 100644
--- a/arch/x86/boot/compressed/sev.h
+++ b/arch/x86/boot/compressed/sev.h
@@ -10,11 +10,13 @@

#ifdef CONFIG_AMD_MEM_ENCRYPT

+bool sev_enabled(void);
bool sev_snp_enabled(void);
void snp_accept_memory(phys_addr_t start, phys_addr_t end);

#else

+static inline bool sev_enabled(void) { return false; }
static inline bool sev_snp_enabled(void) { return false; }
static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }

diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 8451d6a1030c..90dcfb9e82bf 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -61,13 +61,35 @@ static inline void tdx_outw(u16 value, u16 port)
 	tdx_io_out(2, port, value);
 }

+/*
+ * accept_memory() and process_unaccepted_memory() called from EFI stub which
+ * runs before decompressor and its early_tdx_detect().
+ *
+ * Enumerate TDX directly from the early users.
+ */
+bool early_is_tdx_guest(void)
+{
+	static bool once;
+	static bool is_tdx;
+
+	if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST))
+		return false;
+
+	if (!once) {
+		u32 eax, sig[3];
+
+		cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax,
+			    &sig[0], &sig[2], &sig[1]);
+		is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig));
+		once = true;
+	}
+
+	return is_tdx;
+}
+
 void early_tdx_detect(void)
 {
-	u32 eax, sig[3];
-
-	cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax, &sig[0], &sig[2], &sig[1]);
-
-	if (memcmp(TDX_IDENT, sig, sizeof(sig)))
+	if (!early_is_tdx_guest())
 		return;

 	/* Use hypercalls instead of I/O instructions */
diff --git a/arch/x86/boot/compressed/tdx.h b/arch/x86/boot/compressed/tdx.h
index 9055482cd35c..6c097de8392e 100644
--- a/arch/x86/boot/compressed/tdx.h
+++ b/arch/x86/boot/compressed/tdx.h
@@ -5,8 +5,10 @@
#include <linux/types.h>

#ifdef CONFIG_INTEL_TDX_GUEST
+bool early_is_tdx_guest(void);
void early_tdx_detect(void);
#else
+static inline bool early_is_tdx_guest(void) { return false; }
static inline void early_tdx_detect(void) { };
#endif

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index f07c3bb7deab..655d881a9cfa 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -22,6 +22,7 @@ static bool noinstr intel_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_GUEST_UNROLL_STRING_IO:
 	case CC_ATTR_GUEST_MEM_ENCRYPT:
 	case CC_ATTR_MEM_ENCRYPT:
+	case CC_ATTR_GUEST_RAND_LOOP:
 		return true;
 	default:
 		return false;
@@ -72,6 +73,7 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 		return sme_me_mask && !(sev_status & MSR_AMD64_SEV_ENABLED);

 	case CC_ATTR_GUEST_MEM_ENCRYPT:
+	case CC_ATTR_GUEST_RAND_LOOP:
 		return sev_status & MSR_AMD64_SEV_ENABLED;

 	case CC_ATTR_GUEST_STATE_ENCRYPT:
diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..63368227c9d6 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -10,6 +10,7 @@
#ifndef ASM_X86_ARCHRANDOM_H
#define ASM_X86_ARCHRANDOM_H

+#include <linux/cc_platform.h>
#include <asm/processor.h>
#include <asm/cpufeature.h>

@@ -17,6 +18,13 @@

/* Unconditional execution of RDRAND and RDSEED */

+#ifndef rd_loop
+static inline bool rd_loop(void)
+{
+	return cc_platform_has(CC_ATTR_GUEST_RAND_LOOP);
+}
+#endif
+
 static inline bool __must_check rdrand_long(unsigned long *v)
 {
 	bool ok;
@@ -27,17 +35,21 @@ static inline bool __must_check rdrand_long(unsigned long *v)
 			     : CC_OUT(c) (ok), [out] "=r" (*v));
 		if (ok)
 			return true;
-	} while (--retry);
+	} while (rd_loop() || --retry);
 	return false;
 }

 static inline bool __must_check rdseed_long(unsigned long *v)
 {
 	bool ok;
-	asm volatile("rdseed %[out]"
-		     CC_SET(c)
-		     : CC_OUT(c) (ok), [out] "=r" (*v));
-	return ok;
+	do {
+		asm volatile("rdseed %[out]"
+			     CC_SET(c)
+			     : CC_OUT(c) (ok), [out] "=r" (*v));
+		if (ok)
+			return ok;
+	} while (rd_loop());
+	return false;
 }

/*
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index d08dd65b5c43..e554e8919eb0 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -80,6 +80,17 @@ enum cc_attr {
 	 * using AMD SEV-SNP features.
 	 */
 	CC_ATTR_GUEST_SEV_SNP,
+
+	/**
+	 * @CC_ATTR_GUEST_RAND_LOOP: Make RDRAND/RDSEED loop forever to
+	 *			     harden the random number generation.
+	 *
+	 * The platform/OS is running as a guest/virtual machine and
+	 * needs to harden the random number generation.
+	 *
+	 * Examples include TDX guest & SEV.
+	 */
+	CC_ATTR_GUEST_RAND_LOOP,
 };

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
--
2.43.0

2024-01-29 16:34:47

by Dave Hansen

Subject: Re: [RFC] Randomness on confidential computing platforms

On 1/26/24 05:42, Kirill A. Shutemov wrote:
> 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
> Another DoS variant against the Guest.

I think Sean was going down the same path, but I really dislike the idea
of having TDX-specific (or CoCo-specific) policy here.

How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
can turn on panic_on_warn, if they haven't already.
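
A userspace model of that idea, as a sketch only: it imitates the
warn-once behavior with a static flag and models panic_on_warn as a
plain variable. warn_once_rng_failed() and checked_hw_random() are
hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stdio.h>

static bool panic_on_warn;		/* models the kernel's panic_on_warn knob */
static unsigned int warnings_printed;
static bool panicked;

/* Warn the first time only, like a WARN_ON_ONCE()-style check. */
static void warn_once_rng_failed(void)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	warnings_printed++;
	fprintf(stderr, "WARNING: RDRAND/RDSEED failed after retries\n");
	if (panic_on_warn)
		panicked = true;	/* a real kernel would panic here */
}

/* hw_ok is the result of a (retried) RDRAND/RDSEED attempt. */
static bool checked_hw_random(bool hw_ok)
{
	if (!hw_ok)
		warn_once_rng_failed();
	return hw_ok;
}
```

The policy stays generic: the caller still sees the failure and falls
back as before, and only systems that opted into panic_on_warn stop.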

2024-01-29 16:41:57

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 8:30:11 AM PST, Dave Hansen wrote:
>On 1/26/24 05:42, Kirill A. Shutemov wrote:
>> 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
>> Another DoS variant against the Guest.
>
>I think Sean was going down the same path, but I really dislike the idea
>of having TDX-specific (or CoCo-specific) policy here.
>
>How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
>can turn on panic_on_warn, if they haven't already.

That would be good anyway.

2024-01-29 17:17:55

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 8:41:49 AM PST, "Kirill A. Shutemov" wrote:
>On Mon, Jan 29, 2024 at 08:30:11AM -0800, Dave Hansen wrote:
>> On 1/26/24 05:42, Kirill A. Shutemov wrote:
>> > 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
>> > Another DoS variant against the Guest.
>>
>> I think Sean was going down the same path, but I really dislike the idea
>> of having TDX-specific (or CoCo-specific) policy here.
>>
>> How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
>> can turn on panic_on_warn, if they haven't already.
>
>Sure, we can do it for kernel, but we have no control on what userspace
>does.
>
>Sensible userspace on RDRAND/RDSEED failure should fallback to kernel
>asking for random bytes, but who knows if it happens in practice
>everywhere.
>
>Do we care?
>

You can't fix what you can't touch.

2024-01-29 18:55:49

by Dave Hansen

Subject: Re: [RFC] Randomness on confidential computing platforms

On 1/29/24 08:41, Kirill A. Shutemov wrote:
> On Mon, Jan 29, 2024 at 08:30:11AM -0800, Dave Hansen wrote:
>> On 1/26/24 05:42, Kirill A. Shutemov wrote:
>>> 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
>>> Another DoS variant against the Guest.
>>
>> I think Sean was going down the same path, but I really dislike the idea
>> of having TDX-specific (or CoCo-specific) policy here.
>>
>> How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
>> can turn on panic_on_warn, if they haven't already.
>
> Sure, we can do it for kernel, but we have no control on what userspace
> does.
>
> Sensible userspace on RDRAND/RDSEED failure should fallback to kernel
> asking for random bytes, but who knows if it happens in practice
> everywhere.
>
> Do we care?

I want to make sure I understand the scenario:

1. We're running in a guest under TDX (or SEV-SNP)
2. The VMM (or somebody) is attacking the guest by eating all the
hardware entropy and RDRAND is effectively busted
3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
failure, that rdrand_long() never gets called.
4. Userspace is using RDRAND output in some critical place like key
generation and is not checking it for failure, nor mixing it with
entropy from any other source
5. Userspace uses the failed RDRAND output to generate a key
6. Someone exploits the horrible key

Is that it?

2024-01-29 20:44:57

by Kirill A. Shutemov

Subject: Re: [RFC] Randomness on confidential computing platforms

On Mon, Jan 29, 2024 at 10:55:38AM -0800, Dave Hansen wrote:
> On 1/29/24 08:41, Kirill A. Shutemov wrote:
> > On Mon, Jan 29, 2024 at 08:30:11AM -0800, Dave Hansen wrote:
> >> On 1/26/24 05:42, Kirill A. Shutemov wrote:
> >>> 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
> >>> Another DoS variant against the Guest.
> >>
> >> I think Sean was going down the same path, but I really dislike the idea
> >> of having TDX-specific (or CoCo-specific) policy here.
> >>
> >> How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
> >> can turn on panic_on_warn, if they haven't already.
> >
> > Sure, we can do it for kernel, but we have no control on what userspace
> > does.
> >
> > Sensible userspace on RDRAND/RDSEED failure should fallback to kernel
> > asking for random bytes, but who knows if it happens in practice
> > everywhere.
> >
> > Do we care?
>
> I want to make sure I understand the scenario:
>
> 1. We're running in a guest under TDX (or SEV-SNP)
> 2. The VMM (or somebody) is attacking the guest by eating all the
> hardware entropy and RDRAND is effectively busted
> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
> failure, that rdrand_long() never gets called.

Never gets called during attack. It can be used before and after.

> 4. Userspace is using RDRAND output in some critical place like key
> generation and is not checking it for failure, nor mixing it with
> entropy from any other source
> 5. Userspace uses the failed RDRAND output to generate a key
> 6. Someone exploits the horrible key
>
> Is that it?

Yes.

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-01-29 21:04:37

by Dave Hansen

Subject: Re: [RFC] Randomness on confidential computing platforms

On 1/29/24 12:26, Kirill A. Shutemov wrote:
>>> Do we care?
>> I want to make sure I understand the scenario:
>>
>> 1. We're running in a guest under TDX (or SEV-SNP)
>> 2. The VMM (or somebody) is attacking the guest by eating all the
>> hardware entropy and RDRAND is effectively busted
>> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
>> failure, that rdrand_long() never gets called.
> Never gets called during attack. It can be used before and after.
>
>> 4. Userspace is using RDRAND output in some critical place like key
>> generation and is not checking it for failure, nor mixing it with
>> entropy from any other source
>> 5. Userspace uses the failed RDRAND output to generate a key
>> 6. Someone exploits the horrible key
>>
>> Is that it?
> Yes.

Is there something that fundamentally makes this a VMM vs. TDX guest
problem? If a malicious VMM can exhaust RDRAND, why can't malicious
userspace do the same?

Let's assume buggy userspace exists. Is that userspace *uniquely*
exposed to a naughty VMM or is that VMM just added to the list of things
that can attack buggy userspace?

2024-01-29 21:18:25

by Kirill A. Shutemov

Subject: Re: [RFC] Randomness on confidential computing platforms

On Mon, Jan 29, 2024 at 08:30:11AM -0800, Dave Hansen wrote:
> On 1/26/24 05:42, Kirill A. Shutemov wrote:
> > 3. Panic after enough re-tries of RDRAND/RDSEED instructions fail.
> > Another DoS variant against the Guest.
>
> I think Sean was going down the same path, but I really dislike the idea
> of having TDX-specific (or CoCo-specific) policy here.
>
> How about we WARN_ON() RDRAND/RDSEED going bonkers? The paranoid folks
> can turn on panic_on_warn, if they haven't already.

Sure, we can do it for the kernel, but we have no control over what
userspace does.

Sensible userspace should, on RDRAND/RDSEED failure, fall back to asking
the kernel for random bytes, but who knows if that happens in practice
everywhere.

Do we care?
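
For reference, a sketch of what such "sensible userspace" could look
like. The hardware attempt is passed in as a function pointer so the
example does not depend on the CPU; hw_stub_fails() and hw_stub_ok()
are illustrative stubs, random_word() and kernel_random() are
hypothetical helper names, and only getrandom(2) is a real interface.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* One hardware attempt (e.g. a RDRAND wrapper); false means failure. */
typedef bool (*hw_word_fn)(unsigned long *out);

/* Illustrative stubs standing in for a failing and a working CPU RNG. */
static bool hw_stub_fails(unsigned long *out) { (void)out; return false; }
static bool hw_stub_ok(unsigned long *out) { *out = 1UL; return true; }

enum rand_source { FROM_HW, FROM_KERNEL, FROM_NOWHERE };

/* Fill buf from the kernel RNG, handling EINTR and short reads. */
static int kernel_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Try the hardware a bounded number of times, then ask the kernel. */
static enum rand_source random_word(hw_word_fn hw, unsigned int retries,
				    unsigned long *out)
{
	for (unsigned int i = 0; i < retries; i++) {
		if (hw(out))
			return FROM_HW;
	}
	if (kernel_random(out, sizeof(*out)) == 0)
		return FROM_KERNEL;
	return FROM_NOWHERE;	/* caller must treat this as a hard error */
}
```

The important part is the last return: a caller that ignores
FROM_NOWHERE, or never checks the hardware result at all, is the buggy
userspace in question.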

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-01-29 21:19:04

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 1:04:23 PM PST, Dave Hansen wrote:
>On 1/29/24 12:26, Kirill A. Shutemov wrote:
>>>> Do we care?
>>> I want to make sure I understand the scenario:
>>>
>>> 1. We're running in a guest under TDX (or SEV-SNP)
>>> 2. The VMM (or somebody) is attacking the guest by eating all the
>>> hardware entropy and RDRAND is effectively busted
>>> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
>>> failure, that rdrand_long() never gets called.
>> Never gets called during attack. It can be used before and after.
>>
>>> 4. Userspace is using RDRAND output in some critical place like key
>>> generation and is not checking it for failure, nor mixing it with
>>> entropy from any other source
>>> 5. Userspace uses the failed RDRAND output to generate a key
>>> 6. Someone exploits the horrible key
>>>
>>> Is that it?
>> Yes.
>
>Is there something that fundamentally makes this a VMM vs. TDX guest
>problem? If a malicious VMM can exhaust RDRAND, why can't malicious
>userspace do the same?
>
>Let's assume buggy userspace exists. Is that userspace *uniquely*
>exposed to a naughty VMM or is that VMM just added to the list of things
>that can attack buggy userspace?

The concern, I believe, is that a TDX guest is vulnerable as a *victim*, especially if the OS is being malicious.

However, as you say a malicious user space including a conventional VM could try to use it to attack another. The only thing we can do in the kernel about that is to be resilient.

Note that there is an option to the kernel to suspend boot until enough entropy has been gathered that predicting the output of the entropy pool in the kernel ought to be equivalent to breaking AES (in which case we have far worse problems.) To harden the VM case in general perhaps we should consider RDRAND to have zero entropy credit when used as a fallback for RDSEED.
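
A toy model of the zero-credit idea, as a sketch under stated
assumptions: the 256-bit readiness threshold, the 64 bits credited per
RDSEED word, and all names (struct pool, pool_mix(), pool_add_hw(),
pool_ready()) are illustrative, and the mixer is deliberately simplistic
rather than cryptographically secure.

```c
#include <stdbool.h>
#include <stdint.h>

#define READY_BITS 256	/* assumption: credit needed before the pool is ready */

struct pool {
	uint64_t state;
	unsigned int credited_bits;
};

/* Toy mixer for illustration only; it is NOT cryptographically secure. */
static void pool_mix(struct pool *p, uint64_t v, unsigned int credit_bits)
{
	p->state = (p->state ^ v) * UINT64_C(0x9E3779B97F4A7C15);
	p->credited_bits += credit_bits;
	if (p->credited_bits > READY_BITS)
		p->credited_bits = READY_BITS;
}

/*
 * RDSEED output is credited; RDRAND used as a fallback is still mixed
 * in, but earns zero credit, so it cannot make the pool "ready".
 */
static void pool_add_hw(struct pool *p, bool rdseed_ok,
			uint64_t rdseed_val, uint64_t rdrand_val)
{
	if (rdseed_ok)
		pool_mix(p, rdseed_val, 64);
	else
		pool_mix(p, rdrand_val, 0);
}

/* Boot (or a blocking reader) would wait until this returns true. */
static bool pool_ready(const struct pool *p)
{
	return p->credited_bits >= READY_BITS;
}
```

Mixing without crediting never hurts the pool; it only prevents the
fallback path from shortening the wait for real entropy.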

2024-01-29 21:39:16

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 1:17:07 PM PST, "H. Peter Anvin" wrote:
>On January 29, 2024 1:04:23 PM PST, Dave Hansen wrote:
>>On 1/29/24 12:26, Kirill A. Shutemov wrote:
>>>>> Do we care?
>>>> I want to make sure I understand the scenario:
>>>>
>>>> 1. We're running in a guest under TDX (or SEV-SNP)
>>>> 2. The VMM (or somebody) is attacking the guest by eating all the
>>>> hardware entropy and RDRAND is effectively busted
>>>> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
>>>> failure, that rdrand_long() never gets called.
>>> Never gets called during attack. It can be used before and after.
>>>
>>>> 4. Userspace is using RDRAND output in some critical place like key
>>>> generation and is not checking it for failure, nor mixing it with
>>>> entropy from any other source
>>>> 5. Userspace uses the failed RDRAND output to generate a key
>>>> 6. Someone exploits the horrible key
>>>>
>>>> Is that it?
>>> Yes.
>>
>>Is there something that fundamentally makes this a VMM vs. TDX guest
>>problem? If a malicious VMM can exhaust RDRAND, why can't malicious
>>userspace do the same?
>>
>>Let's assume buggy userspace exists. Is that userspace *uniquely*
>>exposed to a naughty VMM or is that VMM just added to the list of things
>>that can attack buggy userspace?
>
>The concern, I believe, is that a TDX guest is vulnerable as a *victim*, especially if the OS is being malicious.
>
>However, as you say a malicious user space including a conventional VM could try to use it to attack another. The only thing we can do in the kernel about that is to be resilient.
>
>Note that there is an option to the kernel to suspend boot until enough entropy has been gathered that predicting the output of the entropy pool in the kernel ought to be equivalent to breaking AES (in which case we have far worse problems.) To harden the VM case in general perhaps we should consider RDRAND to have zero entropy credit when used as a fallback for RDSEED.
>

It is probably worth pointing out, too, that in reality the specs for RDRAND/RDSEED are *extremely* sandbagged. The architect told me that it is extremely unlikely that we will *ever* see a failure due to exhaustion, even if it is executed continuously on all cores – the randomness production rate exceeds the bandwidth of the bus in uncore.

2024-01-29 21:44:48

by Kirill A. Shutemov

Subject: Re: [RFC] Randomness on confidential computing platforms

On Mon, Jan 29, 2024 at 01:04:23PM -0800, Dave Hansen wrote:
> On 1/29/24 12:26, Kirill A. Shutemov wrote:
> >>> Do we care?
> >> I want to make sure I understand the scenario:
> >>
> >> 1. We're running in a guest under TDX (or SEV-SNP)
> >> 2. The VMM (or somebody) is attacking the guest by eating all the
> >> hardware entropy and RDRAND is effectively busted
> >> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
> >> failure, that rdrand_long() never gets called.
> > Never gets called during attack. It can be used before and after.
> >
> >> 4. Userspace is using RDRAND output in some critical place like key
> >> generation and is not checking it for failure, nor mixing it with
> >> entropy from any other source
> >> 5. Userspace uses the failed RDRAND output to generate a key
> >> 6. Someone exploits the horrible key
> >>
> >> Is that it?
> > Yes.
>
> Is there something that fundamentally makes this a VMM vs. TDX guest
> problem? If a malicious VMM can exhaust RDRAND, why can't malicious
> userspace do the same?
>
> Let's assume buggy userspace exists. Is that userspace *uniquely*
> exposed to a naughty VMM or is that VMM just added to the list of things
> that can attack buggy userspace?

This is a good question.

The VMM has control over when a VCPU gets scheduled and on what CPU,
which gives it tighter control over the target workload. It can make a
difference if there's a small window for an attack before RDRAND is
functional again.

Admittedly, I don't find my own argument very convincing :)

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-01-29 22:13:38

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 1:17:07 PM PST, "H. Peter Anvin" wrote:
>On January 29, 2024 1:04:23 PM PST, Dave Hansen wrote:
>>On 1/29/24 12:26, Kirill A. Shutemov wrote:
>>>>> Do we care?
>>>> I want to make sure I understand the scenario:
>>>>
>>>> 1. We're running in a guest under TDX (or SEV-SNP)
>>>> 2. The VMM (or somebody) is attacking the guest by eating all the
>>>> hardware entropy and RDRAND is effectively busted
>>>> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
>>>> failure, that rdrand_long() never gets called.
>>> Never gets called during attack. It can be used before and after.
>>>
>>>> 4. Userspace is using RDRAND output in some critical place like key
>>>> generation and is not checking it for failure, nor mixing it with
>>>> entropy from any other source
>>>> 5. Userspace uses the failed RDRAND output to generate a key
>>>> 6. Someone exploits the horrible key
>>>>
>>>> Is that it?
>>> Yes.
>>
>>Is there something that fundamentally makes this a VMM vs. TDX guest
>>problem? If a malicious VMM can exhaust RDRAND, why can't malicious
>>userspace do the same?
>>
>>Let's assume buggy userspace exists. Is that userspace *uniquely*
>>exposed to a naughty VMM or is that VMM just added to the list of things
>>that can attack buggy userspace?
>
>The concern, I believe, is that a TDX guest is vulnerable as a *victim*, especially if the OS is being malicious.
>
>However, as you say a malicious user space including a conventional VM could try to use it to attack another. The only thing we can do in the kernel about that is to be resilient.
>
>Note that there is an option to the kernel to suspend boot until enough entropy has been gathered that predicting the output of the entropy pool in the kernel ought to be equivalent to breaking AES (in which case we have far worse problems.) To harden the VM case in general perhaps we should consider RDRAND to have zero entropy credit when used as a fallback for RDSEED.
>

So as far as I understand, the uncore bus (at least at the time RDRAND/RDSEED was designed) is a single-transaction bus; once a read transaction has been accepted by the bus, the bus is locked until the reply is sent (like PCI). As such, the RNG unit simply doesn't have the option of not returning a response without holding the whole uncore bus locked. However, I believe that if another core is waiting for the bus, that request will be served before the other core can return for more.

If the RNG bit source is crippled for some reason to the point of being near failure, it is certainly possible for a livelock to happen, but at least as far as I understand, the likelihood of that happening enough to cause 16 failures in a row is so close to a total failure that it might as well be treated as one.

*Any* security sensitive application that doesn't take total RNG failure into account is fundamentally broken. *Any* hardware random number generator is inherently an analog device, and as such has a nonzero probability of failure. It has an integrity monitor, but all it can do is say "no" and not credit entropy, thereby slowing down and eventually stopping the unit (even RDRAND has a minimum seeding frequency guarantee, unlike /dev/urandom.)

2024-01-29 22:19:02

by Dave Hansen

Subject: Re: [RFC] Randomness on confidential computing platforms

On 1/29/24 13:33, Kirill A. Shutemov wrote:
>> Let's assume buggy userspace exists. Is that userspace *uniquely*
>> exposed to a naughty VMM or is that VMM just added to the list of things
>> that can attack buggy userspace?
> This is good question.
>
> VMM has control over when a VCPU gets scheduled and on what CPU which
> gives it tighter control over the target workload. It can make a
> difference if there's small window for an attack before RDRAND is
> functional again.

This is all a bit too theoretical for my taste. I'm fine with doing
some generic mitigation (WARN_ON_ONCE(hardware_is_exhausted)), but we're
talking about a theoretical attack with theoretical buggy software when
in a theoretically unreachable hardware state.

Until it's clearly much more practical, we have much bigger problems to
worry about.

2024-01-29 23:33:45

by H. Peter Anvin

Subject: Re: [RFC] Randomness on confidential computing platforms

On January 29, 2024 2:18:50 PM PST, Dave Hansen wrote:
>On 1/29/24 13:33, Kirill A. Shutemov wrote:
>>> Let's assume buggy userspace exists. Is that userspace *uniquely*
>>> exposed to a naughty VMM or is that VMM just added to the list of things
>>> that can attack buggy userspace?
>> This is good question.
>>
>> VMM has control over when a VCPU gets scheduled and on what CPU which
>> gives it tighter control over the target workload. It can make a
>> difference if there's small window for an attack before RDRAND is
>> functional again.
>
>This is all a bit too theoretical for my taste. I'm fine with doing
>some generic mitigation (WARN_ON_ONCE(hardware_is_exhausted)), but we're
>talking about a theoretical attack with theoretical buggy software when
>in a theoretically unreachable hardware state.
>
>Until it's clearly much more practical, we have much bigger problems to
>worry about.

Again, do we even have a problem with the "hold the boot until we have entropy" option?

2024-01-30 08:02:33

by Elena Reshetova

Subject: RE: [RFC] Randomness on confidential computing platforms

> -----Original Message-----
> From: Kirill A. Shutemov
> Sent: Monday, January 29, 2024 11:33 PM
> To: Hansen, Dave
> Cc: Thomas Gleixner; Ingo Molnar; Borislav Petkov; Dave Hansen;
> H. Peter Anvin; Theodore Ts'o; Jason A. Donenfeld;
> Kuppuswamy Sathyanarayanan; Reshetova, Elena; Nakajima, Jun;
> Tom Lendacky; Kalra, Ashish; Sean Christopherson
> Subject: Re: [RFC] Randomness on confidential computing platforms
>
> On Mon, Jan 29, 2024 at 01:04:23PM -0800, Dave Hansen wrote:
> > On 1/29/24 12:26, Kirill A. Shutemov wrote:
> > >>> Do we care?
> > >> I want to make sure I understand the scenario:
> > >>
> > >> 1. We're running in a guest under TDX (or SEV-SNP)
> > >> 2. The VMM (or somebody) is attacking the guest by eating all the
> > >> hardware entropy and RDRAND is effectively busted
> > >> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
> > >> failure, that rdrand_long() never gets called.
> > > Never gets called during attack. It can be used before and after.
> > >
> > >> 4. Userspace is using RDRAND output in some critical place like key
> > >> generation and is not checking it for failure, nor mixing it with
> > >> entropy from any other source
> > >> 5. Userspace uses the failed RDRAND output to generate a key
> > >> 6. Someone exploits the horrible key
> > >>
> > >> Is that it?
> > > Yes.
> >
> > Is there something that fundamentally makes this a VMM vs. TDX guest
> > problem? If a malicious VMM can exhaust RDRAND, why can't malicious
> > userspace do the same?

Let's be more concrete here: the main problem we are trying to fix is to
make sure the Linux RNG has entropy source(s) that are not under attacker
control. In the case of userspace attacking the kernel, yes, it can
exhaust RDRAND/RDSEED, but the kernel has other entropy sources
(interrupts) that are neither under full userspace control nor fully
observable by it.
What makes the confidential VM story different is that, after the VMM has
exhausted RDRAND/RDSEED, the guest Linux RNG will fall back to entropy
sources that the VMM can observe or control, and this is what we are
trying to avoid.

> >
> > Let's assume buggy userspace exists. Is that userspace *uniquely*
> > exposed to a naughty VMM or is that VMM just added to the list of things
> > that can attack buggy userspace?

Well-behaved userspace will ask the Linux RNG for its cryptographic
randomness (some might make direct RDRAND/RDSEED calls, but most will
rely on the Linux RNG). When it asks, it is going to get a number. The
fact that that number doesn’t have adequate security is not visible to
userspace in any way. I don’t think anyone will go to dmesg and check
the warning logs to determine this.
So, I don’t see how a warning helps here in practice.

Best Regards,
Elena

2024-01-30 08:31:21

by Elena Reshetova

Subject: RE: [RFC] Randomness on confidential computing platforms

> On January 29, 2024 2:18:50 PM PST, Dave Hansen wrote:
> >On 1/29/24 13:33, Kirill A. Shutemov wrote:
> >>> Let's assume buggy userspace exists. Is that userspace *uniquely*
> >>> exposed to a naughty VMM or is that VMM just added to the list of things
> >>> that can attack buggy userspace?
> >> This is good question.
> >>
> >> VMM has control over when a VCPU gets scheduled and on what CPU which
> >> gives it tighter control over the target workload. It can make a
> >> difference if there's small window for an attack before RDRAND is
> >> functional again.
> >
> >This is all a bit too theoretical for my taste. I'm fine with doing
> >some generic mitigation (WARN_ON_ONCE(hardware_is_exhausted)), but we're
> >talking about a theoretical attack with theoretical buggy software when
> >in a theoretically unreachable hardware state.
> >
> >Until it's clearly much more practical, we have much bigger problems to
> >worry about.
>
> Again, do we even have a problem with the "hold the boot until we have
> entropy"option?

Yes, we do have a problem. You cannot build a secure random number
generator when the attacker controls or observes all of your entropy
sources. The Linux RNG has many entropy sources (RDRAND/RDSEED is just
one of them), and as long as it gets at least some proper entropy input,
it is OK (I am greatly oversimplifying the RNG theory here).
What changes with confidential computing is that entropy sources like
interrupts or timing-based information can be viewed as being under
attacker control/observation. But this is *not* how the Linux RNG views
them in its threat model.
So, the Linux RNG will boot and run just fine in a confidential guest
when RDRAND/RDSEED always fails (it will use other entropy sources like
interrupts/timing info), but the quality of its output becomes
questionable, assuming the host/VMM is outside the TCB.

I hope we can get an opinion on this from maintainers of Linux RNG.

Best Regards,
Elena.