2022-07-29 14:38:28

by Tom Lendacky

Subject: Re: [PATCH v1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support



On 7/29/22 09:18, Dave Hansen wrote:
> On 7/29/22 07:01, Tom Lendacky wrote:
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index c05f0124c410..1f7f6205c4f6 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -104,6 +104,15 @@ struct sev_es_runtime_data {
>> * is currently unsupported in SEV-ES guests.
>> */
>> unsigned long dr7;
>> +
>> + /*
>> + * Page State Change structure for use when accepting memory or when
>> + * changing page state. Interrupts are disabled when using the structure
>> + * but an NMI could still be raised, so use a flag to indicate when the
>> + * structure is in use and use the MSR protocol in these cases.
>> + */
>> + struct snp_psc_desc psc_desc;
>> + bool psc_active;
>> };
>
> This thing:
>
> struct snp_psc_desc {
> struct psc_hdr hdr;
> struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
> } __packed;
>
> is 16k, right? Being per-cpu, this might eat up a MB or two of memory
> on a big server?

It's just under 2K, 2,032 bytes.
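
For reference, the number comes from the GHCB spec's 253-entry limit. A rough
sketch of the definitions behind it (paraphrased from the sev.h layout, not a
verbatim copy of the headers):

        #define VMGEXIT_PSC_MAX_ENTRY   253

        struct psc_hdr {                /* 8 bytes */
                u16 cur_entry;
                u16 end_entry;
                u32 reserved;
        } __packed;

        struct psc_entry {              /* 8 bytes: one 64-bit descriptor */
                u64 cur_page    : 12,
                    gfn         : 40,
                    operation   : 4,
                    pagesize    : 1,
                    reserved    : 7;
        } __packed;

        /* sizeof(struct snp_psc_desc) = 8 + 253 * 8 = 2,032 bytes */

So the per-CPU cost is on the order of half a megabyte even for a 256-CPU
guest (2,032 * 256 ~= 508K), not the 1-2 MB estimated above.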

>
> Considering that runtime acceptance is already single-threaded[1] *and*
> there's a fallback method, why not just have a single copy of this
> guarded by a single lock?

This function is called for more than just memory acceptance. It's also
called for any page state change between private and shared, and those
calls aren't single-threaded.

Thanks,
Tom

>
> 1.
> https://lore.kernel.org/all/[email protected]/


2022-07-29 19:10:50

by Dave Hansen

Subject: Re: [PATCH v1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

On 7/29/22 07:25, Tom Lendacky wrote:
>> Considering that runtime acceptance is already single-threaded[1] *and*
>> there's a fallback method, why not just have a single copy of this
>> guarded by a single lock?
>
> This function is called for more than just memory acceptance. It's also
> called for any page state change between private and shared, and those
> calls aren't single-threaded.

I think this tidbit from the changelog threw me off:

> Protect the use of the per-CPU structure by disabling interrupts during
> memory acceptance.

Could you please revise that to accurately capture the impact of this
change?

2022-07-29 19:37:44

by Tom Lendacky

Subject: Re: [PATCH v1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

On 7/29/22 14:08, Dave Hansen wrote:
> On 7/29/22 07:25, Tom Lendacky wrote:
>>> Considering that runtime acceptance is already single-threaded[1] *and*
>>> there's a fallback method, why not just have a single copy of this
>>> guarded by a single lock?
>>
>> This function is called for more than just memory acceptance. It's also
>> called for any page state change between private and shared, and those
>> calls aren't single-threaded.
>
> I think this tidbit from the changelog threw me off:
>
>> Protect the use of the per-CPU structure by disabling interrupts during
>> memory acceptance.
>
> Could you please revise that to accurately capture the impact of this
> change?

Is s/memory acceptance/page state changes/ enough of what you are looking
for or something more?

Thanks,
Tom

2022-07-29 19:39:44

by Dave Hansen

Subject: Re: [PATCH v1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

On 7/29/22 12:22, Tom Lendacky wrote:
>> I think this tidbit from the changelog threw me off:
>>
>>> Protect the use of the per-CPU structure by disabling interrupts during
>>> memory acceptance.
>>
>> Could you please revise that to accurately capture the impact of this
>> change?
>
> Is s/memory acceptance/page state changes/ enough of what you are
> looking for or something more?

That, plus a reminder of when "page state changes" are performed would
be nice. How frequent are they? Are they performance sensitive?
That'll help us decide if the design here is appropriate or not.

2022-07-29 20:15:45

by Tom Lendacky

Subject: Re: [PATCH v1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

On 7/29/22 14:28, Dave Hansen wrote:
> On 7/29/22 12:22, Tom Lendacky wrote:
>>> I think this tidbit from the changelog threw me off:
>>>
>>>> Protect the use of the per-CPU structure by disabling interrupts during
>>>> memory acceptance.
>>>
>>> Could you please revise that to accurately capture the impact of this
>>> change?
>>
>> Is s/memory acceptance/page state changes/ enough of what you are
>> looking for or something more?
>
> That, plus a reminder of when "page state changes" are performed would
> be nice. How frequent are they? Are they performance sensitive?
> That'll help us decide if the design here is appropriate or not.

Without submitting a v2, here's what the updated paragraph would look like:

Page state changes occur whenever DMA memory is allocated or memory needs
to be shared with the hypervisor (kvmclock, attestation reports, etc.).
A per-CPU structure is chosen over a single PSC structure protected with
a lock because these changes can be initiated from interrupt or
soft-interrupt context (e.g. the NVMe driver). Protect the use of the
per-CPU structure by disabling interrupts during page state changes.
Since the set_pages_state() path is the only path into vmgexit_psc(),
rename vmgexit_psc() to __vmgexit_psc() and remove the calls to disable
interrupts which are now performed by set_pages_state().

Hopefully there aren't a lot of page state changes occurring once a system
has booted, so maybe a static struct with a lock would work. I am a bit
worried about an NMI arriving during a page state change and itself needing
a page state change while the lock is held. I suppose in_nmi() can be used
to detect that case and fall back to the MSR protocol to avoid a deadlock.
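
Very roughly, a sketch of that fallback (illustrative only, not an actual
patch):

        static struct snp_psc_desc psc_desc;
        static DEFINE_SPINLOCK(psc_desc_lock);

        static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
        {
                unsigned long vaddr_end, next_vaddr;
                unsigned long flags;

                /*
                 * A PSC from NMI context must not spin on a lock that the
                 * interrupted context may already hold, so fall back to the
                 * GHCB MSR protocol.
                 */
                if (in_nmi())
                        return early_set_pages_state(__pa(vaddr), npages, op);

                spin_lock_irqsave(&psc_desc_lock, flags);

                vaddr &= PAGE_MASK;
                vaddr_end = vaddr + (npages << PAGE_SHIFT);

                while (vaddr < vaddr_end) {
                        /* One PSC descriptor covers at most VMGEXIT_PSC_MAX_ENTRY pages */
                        next_vaddr = min_t(unsigned long, vaddr_end,
                                           (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);

                        __set_pages_state(&psc_desc, vaddr, next_vaddr, op);

                        vaddr = next_vaddr;
                }

                spin_unlock_irqrestore(&psc_desc_lock, flags);
        }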

I can investigate that if the extra ~2K per CPU is not desired.

Thanks,
Tom

2022-08-03 18:18:32

by Tom Lendacky

Subject: [PATCH v1.1 0/2] Provide SEV-SNP support for unaccepted memory

This series adds SEV-SNP support for unaccepted memory to the patch series
titled:

[PATCHv7 00/14] mm, x86/cc: Implement support for unaccepted memory

Currently, when changing the state of a page under SNP, the page state
change structure is kmalloc()'d. This leads to hangs during boot when
accepting memory because the allocation itself can trigger the need to
accept more memory. So this series consists of two patches:

- A pre-patch that switches from a kmalloc()'d page state change structure
to a static page state change structure with access protected by a
spinlock.

- SNP support for unaccepted memory.

The series is based off of and tested against Kirill Shutemov's tree:
https://github.com/intel/tdx.git guest-unaccepted-memory

---

This is what the static structure / spinlock method looks like. Let me
know if this approach is preferred over the per-CPU structure. If so,
I'll submit this as a v2.

Thanks,
Tom

Tom Lendacky (2):
x86/sev: Use per-CPU PSC structure in prep for unaccepted memory
support
x86/sev: Add SNP-specific unaccepted memory support

arch/x86/Kconfig | 1 +
arch/x86/boot/compressed/mem.c | 3 ++
arch/x86/boot/compressed/sev.c | 10 ++++-
arch/x86/boot/compressed/sev.h | 23 +++++++++++
arch/x86/include/asm/sev.h | 3 ++
arch/x86/kernel/sev.c | 71 ++++++++++++++++++++++-----------
arch/x86/mm/unaccepted_memory.c | 4 ++
7 files changed, 91 insertions(+), 24 deletions(-)
create mode 100644 arch/x86/boot/compressed/sev.h

--
2.36.1


2022-08-03 18:18:32

by Tom Lendacky

Subject: [PATCH v1.1 2/2] x86/sev: Add SNP-specific unaccepted memory support

Add SNP-specific hooks to the unaccepted memory support in the boot
path (__accept_memory()) and the core kernel (accept_memory()) in order
to support booting SNP guests when unaccepted memory is present. Without
this support, SNP guests will fail to boot and/or panic() when unaccepted
memory is present in the EFI memory map.

The process of accepting memory under SNP involves invoking the hypervisor
to perform a page state change for the page to private memory and then
issuing a PVALIDATE instruction to accept the page.
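
For a single 4K page in the core kernel, that flow boils down to roughly the
following sketch (illustrative only; page_state_change() is a placeholder for
whichever PSC mechanism is in use, and the actual patch batches pages rather
than handling them one at a time):

        static void snp_accept_page(phys_addr_t paddr)
        {
                /* Step 1: ask the hypervisor to flip the page to the private state */
                page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);

                /* Step 2: validate the now-private page so the guest accepts it */
                if (pvalidate((unsigned long)__va(paddr), RMP_PG_SIZE_4K, true))
                        sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
        }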

Create the new header file arch/x86/boot/compressed/sev.h because adding
the function declaration to any of the existing SEV-related header files
pulls in too many other header files, causing the build to fail.

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/Kconfig | 1 +
arch/x86/boot/compressed/mem.c | 3 +++
arch/x86/boot/compressed/sev.c | 10 +++++++++-
arch/x86/boot/compressed/sev.h | 23 +++++++++++++++++++++++
arch/x86/include/asm/sev.h | 3 +++
arch/x86/kernel/sev.c | 16 ++++++++++++++++
arch/x86/mm/unaccepted_memory.c | 4 ++++
7 files changed, 59 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/boot/compressed/sev.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 34146ecc5bdd..0ad53c3533c2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1553,6 +1553,7 @@ config AMD_MEM_ENCRYPT
select INSTRUCTION_DECODER
select ARCH_HAS_CC_PLATFORM
select X86_MEM_ENCRYPT
+ select UNACCEPTED_MEMORY
help
Say yes to enable support for the encryption of system memory.
This requires an AMD processor that supports Secure Memory
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
index 48e36e640da1..3e19dc0da0d7 100644
--- a/arch/x86/boot/compressed/mem.c
+++ b/arch/x86/boot/compressed/mem.c
@@ -6,6 +6,7 @@
#include "find.h"
#include "math.h"
#include "tdx.h"
+#include "sev.h"
#include <asm/shared/tdx.h>

#define PMD_SHIFT 21
@@ -39,6 +40,8 @@ static inline void __accept_memory(phys_addr_t start, phys_addr_t end)
/* Platform-specific memory-acceptance call goes here */
if (is_tdx_guest())
tdx_accept_memory(start, end);
+ else if (sev_snp_enabled())
+ snp_accept_memory(start, end);
else
error("Cannot accept memory: unknown platform\n");
}
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 730c4677e9db..d4b06c862094 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -115,7 +115,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
/* Include code for early handlers */
#include "../../kernel/sev-shared.c"

-static inline bool sev_snp_enabled(void)
+bool sev_snp_enabled(void)
{
return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
}
@@ -161,6 +161,14 @@ void snp_set_page_shared(unsigned long paddr)
__page_state_change(paddr, SNP_PAGE_STATE_SHARED);
}

+void snp_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ while (end > start) {
+ snp_set_page_private(start);
+ start += PAGE_SIZE;
+ }
+}
+
static bool early_setup_ghcb(void)
{
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
diff --git a/arch/x86/boot/compressed/sev.h b/arch/x86/boot/compressed/sev.h
new file mode 100644
index 000000000000..fc725a981b09
--- /dev/null
+++ b/arch/x86/boot/compressed/sev.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * AMD SEV header for early boot related functions.
+ *
+ * Author: Tom Lendacky <[email protected]>
+ */
+
+#ifndef BOOT_COMPRESSED_SEV_H
+#define BOOT_COMPRESSED_SEV_H
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+
+bool sev_snp_enabled(void);
+void snp_accept_memory(phys_addr_t start, phys_addr_t end);
+
+#else
+
+static inline bool sev_snp_enabled(void) { return false; }
+static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
+
+#endif
+
+#endif
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 19514524f0f8..21db66bacefe 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -202,6 +202,7 @@ void snp_set_wakeup_secondary_cpu(void);
bool snp_init(struct boot_params *bp);
void snp_abort(void);
int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err);
+void snp_accept_memory(phys_addr_t start, phys_addr_t end);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -226,6 +227,8 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in
{
return -ENOTTY;
}
+
+static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
#endif

#endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 84d94fd2ec53..db74c38babf7 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -917,6 +917,22 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
pvalidate_pages(vaddr, npages, true);
}

+void snp_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ unsigned long vaddr;
+ unsigned int npages;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ vaddr = (unsigned long)__va(start);
+ npages = (end - start) >> PAGE_SHIFT;
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ pvalidate_pages(vaddr, npages, true);
+}
+
static int snp_set_vmsa(void *va, bool vmsa)
{
u64 attrs;
diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
index 9ec2304272dc..b86ad6a8ddf5 100644
--- a/arch/x86/mm/unaccepted_memory.c
+++ b/arch/x86/mm/unaccepted_memory.c
@@ -9,6 +9,7 @@
#include <asm/setup.h>
#include <asm/shared/tdx.h>
#include <asm/unaccepted_memory.h>
+#include <asm/sev.h>

/* Protects unaccepted memory bitmap */
static DEFINE_SPINLOCK(unaccepted_memory_lock);
@@ -66,6 +67,9 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
tdx_accept_memory(range_start * PMD_SIZE,
range_end * PMD_SIZE);
+ } else if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+ snp_accept_memory(range_start * PMD_SIZE,
+ range_end * PMD_SIZE);
} else {
panic("Cannot accept memory: unknown platform\n");
}
--
2.36.1


2022-08-03 18:25:28

by Tom Lendacky

Subject: [PATCH v1.1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

In advance of providing support for unaccepted memory, switch from using
kmalloc() for allocating the Page State Change (PSC) structure to using a
static structure. This is needed to avoid a possible recursive call into
set_pages_state() if the kmalloc() call requires (more) memory to be
accepted, which would result in a hang.

Page state changes occur whenever DMA memory is allocated or memory needs
to be shared with the hypervisor (kvmclock, attestation reports, etc.).
Since most page state changes occur early in boot and are limited in
number, a single static PSC structure is used and protected by a spin
lock with interrupts disabled.

Even with interrupts disabled, an NMI can be raised while performing
memory acceptance. The NMI could then cause further memory acceptance to
be performed. To prevent a deadlock, use the MSR protocol if executing in
an NMI context.

Since the set_pages_state() path is the only path into vmgexit_psc(),
rename vmgexit_psc() to __vmgexit_psc() and remove the calls to disable
interrupts which are now performed by set_pages_state().

Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kernel/sev.c | 55 +++++++++++++++++++++++++------------------
1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c05f0124c410..84d94fd2ec53 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -66,6 +66,9 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
*/
static struct ghcb *boot_ghcb __section(".data");

+/* Flag to indicate when the first per-CPU GHCB is registered */
+static bool ghcb_percpu_ready __section(".data");
+
/* Bitmap of SEV features supported by the hypervisor */
static u64 sev_hv_features __ro_after_init;

@@ -122,6 +125,15 @@ struct sev_config {

static struct sev_config sev_cfg __read_mostly;

+/*
+ * Page State Change structure for use when accepting memory or when changing
+ * page state. Use is protected by a spinlock with interrupts disabled, but an
+ * NMI could still be raised, so check if running in an NMI and use the MSR
+ * protocol in these cases.
+ */
+static struct snp_psc_desc psc_desc;
+static DEFINE_SPINLOCK(psc_desc_lock);
+
static __always_inline bool on_vc_stack(struct pt_regs *regs)
{
unsigned long sp = regs->sp;
@@ -660,7 +672,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
}
}

-static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+static void early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
{
unsigned long paddr_end;
u64 val;
@@ -742,26 +754,17 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
WARN(1, "invalid memory op %d\n", op);
}

-static int vmgexit_psc(struct snp_psc_desc *desc)
+static int __vmgexit_psc(struct snp_psc_desc *desc)
{
int cur_entry, end_entry, ret = 0;
struct snp_psc_desc *data;
struct ghcb_state state;
struct es_em_ctxt ctxt;
- unsigned long flags;
struct ghcb *ghcb;

- /*
- * __sev_get_ghcb() needs to run with IRQs disabled because it is using
- * a per-CPU GHCB.
- */
- local_irq_save(flags);
-
ghcb = __sev_get_ghcb(&state);
- if (!ghcb) {
- ret = 1;
- goto out_unlock;
- }
+ if (!ghcb)
+ return 1;

/* Copy the input desc into GHCB shared buffer */
data = (struct snp_psc_desc *)ghcb->shared_buffer;
@@ -820,9 +823,6 @@ static int vmgexit_psc(struct snp_psc_desc *desc)
out:
__sev_put_ghcb(&state);

-out_unlock:
- local_irq_restore(flags);
-
return ret;
}

@@ -861,18 +861,25 @@ static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
i++;
}

- if (vmgexit_psc(data))
+ if (__vmgexit_psc(data))
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
}

static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
{
unsigned long vaddr_end, next_vaddr;
- struct snp_psc_desc *desc;
+ unsigned long flags;

- desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
- if (!desc)
- panic("SNP: failed to allocate memory for PSC descriptor\n");
+ /*
+ * Use the MSR protocol when either:
+ * - executing in an NMI to avoid any possibility of a deadlock
+ * - per-CPU GHCBs are not yet registered, since __vmgexit_psc()
+ * uses the per-CPU GHCB.
+ */
+ if (in_nmi() || !ghcb_percpu_ready)
+ return early_set_pages_state(__pa(vaddr), npages, op);
+
+ spin_lock_irqsave(&psc_desc_lock, flags);

vaddr = vaddr & PAGE_MASK;
vaddr_end = vaddr + (npages << PAGE_SHIFT);
@@ -882,12 +889,12 @@ static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
next_vaddr = min_t(unsigned long, vaddr_end,
(VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);

- __set_pages_state(desc, vaddr, next_vaddr, op);
+ __set_pages_state(&psc_desc, vaddr, next_vaddr, op);

vaddr = next_vaddr;
}

- kfree(desc);
+ spin_unlock_irqrestore(&psc_desc_lock, flags);
}

void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
@@ -1254,6 +1261,8 @@ void setup_ghcb(void)
if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
snp_register_per_cpu_ghcb();

+ ghcb_percpu_ready = true;
+
return;
}

--
2.36.1


2022-08-03 18:56:25

by Tom Lendacky

Subject: Re: [PATCH v1.1 1/2] x86/sev: Use per-CPU PSC structure in prep for unaccepted memory support

On 8/3/22 13:11, Tom Lendacky wrote:

Of course, I'll fix the subject if submitting this for real... ugh.

Thanks,
Tom

> In advance of providing support for unaccepted memory, switch from using
> kmalloc() for allocating the Page State Change (PSC) structure to using a
> static structure. This is needed to avoid a possible recursive call into
> set_pages_state() if the kmalloc() call requires (more) memory to be
> accepted, which would result in a hang.
>
> Page state changes occur whenever DMA memory is allocated or memory needs
> to be shared with the hypervisor (kvmclock, attestation reports, etc.).
> Since most page state changes occur early in boot and are limited in
> number, a single static PSC structure is used and protected by a spin
> lock with interrupts disabled.
>
> Even with interrupts disabled, an NMI can be raised while performing
> memory acceptance. The NMI could then cause further memory acceptance to
> be performed. To prevent a deadlock, use the MSR protocol if executing in
> an NMI context.
>
> Since the set_pages_state() path is the only path into vmgexit_psc(),
> rename vmgexit_psc() to __vmgexit_psc() and remove the calls to disable
> interrupts which are now performed by set_pages_state().
>
> Signed-off-by: Tom Lendacky <[email protected]>
> ---
> arch/x86/kernel/sev.c | 55 +++++++++++++++++++++++++------------------
> 1 file changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index c05f0124c410..84d94fd2ec53 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -66,6 +66,9 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
> */
> static struct ghcb *boot_ghcb __section(".data");
>
> +/* Flag to indicate when the first per-CPU GHCB is registered */
> +static bool ghcb_percpu_ready __section(".data");
> +
> /* Bitmap of SEV features supported by the hypervisor */
> static u64 sev_hv_features __ro_after_init;
>
> @@ -122,6 +125,15 @@ struct sev_config {
>
> static struct sev_config sev_cfg __read_mostly;
>
> +/*
> + * Page State Change structure for use when accepting memory or when changing
> + * page state. Use is protected by a spinlock with interrupts disabled, but an
> + * NMI could still be raised, so check if running in an NMI and use the MSR
> + * protocol in these cases.
> + */
> +static struct snp_psc_desc psc_desc;
> +static DEFINE_SPINLOCK(psc_desc_lock);
> +
> static __always_inline bool on_vc_stack(struct pt_regs *regs)
> {
> unsigned long sp = regs->sp;
> @@ -660,7 +672,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
> }
> }
>
> -static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
> +static void early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
> {
> unsigned long paddr_end;
> u64 val;
> @@ -742,26 +754,17 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
> WARN(1, "invalid memory op %d\n", op);
> }
>
> -static int vmgexit_psc(struct snp_psc_desc *desc)
> +static int __vmgexit_psc(struct snp_psc_desc *desc)
> {
> int cur_entry, end_entry, ret = 0;
> struct snp_psc_desc *data;
> struct ghcb_state state;
> struct es_em_ctxt ctxt;
> - unsigned long flags;
> struct ghcb *ghcb;
>
> - /*
> - * __sev_get_ghcb() needs to run with IRQs disabled because it is using
> - * a per-CPU GHCB.
> - */
> - local_irq_save(flags);
> -
> ghcb = __sev_get_ghcb(&state);
> - if (!ghcb) {
> - ret = 1;
> - goto out_unlock;
> - }
> + if (!ghcb)
> + return 1;
>
> /* Copy the input desc into GHCB shared buffer */
> data = (struct snp_psc_desc *)ghcb->shared_buffer;
> @@ -820,9 +823,6 @@ static int vmgexit_psc(struct snp_psc_desc *desc)
> out:
> __sev_put_ghcb(&state);
>
> -out_unlock:
> - local_irq_restore(flags);
> -
> return ret;
> }
>
> @@ -861,18 +861,25 @@ static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> i++;
> }
>
> - if (vmgexit_psc(data))
> + if (__vmgexit_psc(data))
> sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
> }
>
> static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
> {
> unsigned long vaddr_end, next_vaddr;
> - struct snp_psc_desc *desc;
> + unsigned long flags;
>
> - desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
> - if (!desc)
> - panic("SNP: failed to allocate memory for PSC descriptor\n");
> + /*
> + * Use the MSR protocol when either:
> + * - executing in an NMI to avoid any possibility of a deadlock
> + * - per-CPU GHCBs are not yet registered, since __vmgexit_psc()
> + * uses the per-CPU GHCB.
> + */
> + if (in_nmi() || !ghcb_percpu_ready)
> + return early_set_pages_state(__pa(vaddr), npages, op);
> +
> + spin_lock_irqsave(&psc_desc_lock, flags);
>
> vaddr = vaddr & PAGE_MASK;
> vaddr_end = vaddr + (npages << PAGE_SHIFT);
> @@ -882,12 +889,12 @@ static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
> next_vaddr = min_t(unsigned long, vaddr_end,
> (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
>
> - __set_pages_state(desc, vaddr, next_vaddr, op);
> + __set_pages_state(&psc_desc, vaddr, next_vaddr, op);
>
> vaddr = next_vaddr;
> }
>
> - kfree(desc);
> + spin_unlock_irqrestore(&psc_desc_lock, flags);
> }
>
> void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
> @@ -1254,6 +1261,8 @@ void setup_ghcb(void)
> if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> snp_register_per_cpu_ghcb();
>
> + ghcb_percpu_ready = true;
> +
> return;
> }
>