2019-07-10 20:16:29

by Brijesh Singh

Subject: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

Invoke a hypercall when a memory region is changed from encrypted to
decrypted and vice versa. The hypervisor needs to know the page
encryption status during guest migration.
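
For reference, kvm_sev_hypercall3() is the guest-side hypercall wrapper
added earlier in this series. A minimal sketch of what it looks like,
assuming the VMMCALL-based implementation from that patch (the
authoritative version lives there):

static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
				      unsigned long p2, unsigned long p3)
{
	long ret;

	/* SEV guests run under SVM, hence VMMCALL rather than VMCALL */
	asm volatile("vmmcall"
		     : "=a"(ret)
		     : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
		     : "memory");
	return ret;
}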

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: "Radim Krčmář" <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/mem_encrypt.h |  3 ++
arch/x86/mm/mem_encrypt.c          | 45 +++++++++++++++++++++++++++++-
arch/x86/mm/pageattr.c             | 15 ++++++++++
3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 0c196c47d621..6e654ab5a8e4 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -94,4 +94,7 @@ extern char __start_bss_decrypted[], __end_bss_decrypted[], __start_bss_decrypted_unused[];

#endif /* __ASSEMBLY__ */

+extern void set_memory_enc_dec_hypercall(unsigned long vaddr,
+					 unsigned long size, bool enc);
+
#endif /* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index e0df96fdfe46..f3fda1de2869 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -15,6 +15,7 @@
#include <linux/dma-direct.h>
#include <linux/swiotlb.h>
#include <linux/mem_encrypt.h>
+#include <linux/kvm_para.h>

#include <asm/tlbflush.h>
#include <asm/fixmap.h>
@@ -25,6 +26,7 @@
#include <asm/processor-flags.h>
#include <asm/msr.h>
#include <asm/cmdline.h>
+#include <asm/kvm_para.h>

#include "mm_internal.h"

@@ -192,6 +194,45 @@ void __init sme_early_init(void)
swiotlb_force = SWIOTLB_FORCE;
}

+void set_memory_enc_dec_hypercall(unsigned long vaddr, unsigned long sz, bool enc)
+{
+	unsigned long vaddr_end, vaddr_next;
+
+	vaddr_end = vaddr + sz;
+
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		int psize, pmask, level;
+		unsigned long pfn;
+		pte_t *kpte;
+
+		kpte = lookup_address(vaddr, &level);
+		if (!kpte || pte_none(*kpte))
+			return;
+
+		switch (level) {
+		case PG_LEVEL_4K:
+			pfn = pte_pfn(*kpte);
+			break;
+		case PG_LEVEL_2M:
+			pfn = pmd_pfn(*(pmd_t *)kpte);
+			break;
+		case PG_LEVEL_1G:
+			pfn = pud_pfn(*(pud_t *)kpte);
+			break;
+		default:
+			return;
+		}
+
+		psize = page_level_size(level);
+		pmask = page_level_mask(level);
+
+		kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
+				   pfn << PAGE_SHIFT, psize >> PAGE_SHIFT, enc);
+
+		vaddr_next = (vaddr & pmask) + psize;
+	}
+}
+
static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
{
pgprot_t old_prot, new_prot;
@@ -249,12 +290,13 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
static int __init early_set_memory_enc_dec(unsigned long vaddr,
unsigned long size, bool enc)
{
-	unsigned long vaddr_end, vaddr_next;
+	unsigned long vaddr_end, vaddr_next, start;
unsigned long psize, pmask;
int split_page_size_mask;
int level, ret;
pte_t *kpte;

+	start = vaddr;
vaddr_next = vaddr;
vaddr_end = vaddr + size;

@@ -309,6 +351,7 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,

ret = 0;

+	set_memory_enc_dec_hypercall(start, size, enc);
out:
__flush_tlb_all();
return ret;
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6a9a77a403c9..971f70f58f49 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -26,6 +26,7 @@
#include <asm/proto.h>
#include <asm/pat.h>
#include <asm/set_memory.h>
+#include <asm/mem_encrypt.h>

#include "mm_internal.h"

@@ -2020,6 +2021,12 @@ int set_memory_global(unsigned long addr, int numpages)
__pgprot(_PAGE_GLOBAL), 0);
}

+void __attribute__((weak)) set_memory_enc_dec_hypercall(unsigned long addr,
+							 unsigned long size,
+							 bool enc)
+{
+}
+
static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
struct cpa_data cpa;
@@ -2060,6 +2067,14 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
*/
cpa_flush(&cpa, 0);

+	/*
+	 * When SEV is active, notify the hypervisor that a given memory range
+	 * is mapped encrypted or decrypted. The hypervisor will use this
+	 * information during VM migration.
+	 */
+	if (sev_active())
+		set_memory_enc_dec_hypercall(addr, numpages << PAGE_SHIFT, enc);
+
return ret;
}

--
2.17.1


2019-08-29 16:53:53

by Borislav Petkov

Subject: Re: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

On Wed, Jul 10, 2019 at 08:13:11PM +0000, Singh, Brijesh wrote:

> Subject: Re: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

Subject prefix: "x86/mm: Invoke ..."

git log <filename> would usually show you how the prefixing should look.

> Invoke a hypercall when a memory region is changed from encrypted to
> decrypted and vice versa. The hypervisor needs to know the page
> encryption status during guest migration.
>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: "Radim Krčmář" <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Tom Lendacky <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/mem_encrypt.h |  3 ++
> arch/x86/mm/mem_encrypt.c          | 45 +++++++++++++++++++++++++++++-
> arch/x86/mm/pageattr.c             | 15 ++++++++++
> 3 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 0c196c47d621..6e654ab5a8e4 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -94,4 +94,7 @@ extern char __start_bss_decrypted[], __end_bss_decrypted[], __start_bss_decrypted_unused[];
>
> #endif /* __ASSEMBLY__ */
>
> +extern void set_memory_enc_dec_hypercall(unsigned long vaddr,
> +					 unsigned long size, bool enc);
> +
> #endif /* __X86_MEM_ENCRYPT_H__ */
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index e0df96fdfe46..f3fda1de2869 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -15,6 +15,7 @@
> #include <linux/dma-direct.h>
> #include <linux/swiotlb.h>
> #include <linux/mem_encrypt.h>
> +#include <linux/kvm_para.h>
>
> #include <asm/tlbflush.h>
> #include <asm/fixmap.h>
> @@ -25,6 +26,7 @@
> #include <asm/processor-flags.h>
> #include <asm/msr.h>
> #include <asm/cmdline.h>
> +#include <asm/kvm_para.h>
>
> #include "mm_internal.h"
>
> @@ -192,6 +194,45 @@ void __init sme_early_init(void)
> swiotlb_force = SWIOTLB_FORCE;
> }
>
> +void set_memory_enc_dec_hypercall(unsigned long vaddr, unsigned long sz, bool enc)
> +{
> +	unsigned long vaddr_end, vaddr_next;
> +
> +	vaddr_end = vaddr + sz;
> +
> +	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
> +		int psize, pmask, level;
> +		unsigned long pfn;
> +		pte_t *kpte;
> +
> +		kpte = lookup_address(vaddr, &level);
> +		if (!kpte || pte_none(*kpte))
> +			return;
> +
> +		switch (level) {
> +		case PG_LEVEL_4K:
> +			pfn = pte_pfn(*kpte);
> +			break;
> +		case PG_LEVEL_2M:
> +			pfn = pmd_pfn(*(pmd_t *)kpte);
> +			break;
> +		case PG_LEVEL_1G:
> +			pfn = pud_pfn(*(pud_t *)kpte);
> +			break;
> +		default:
> +			return;
> +		}
> +
> +		psize = page_level_size(level);
> +		pmask = page_level_mask(level);
> +
> +		kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
> +				   pfn << PAGE_SHIFT, psize >> PAGE_SHIFT, enc);
> +
> +		vaddr_next = (vaddr & pmask) + psize;
> +	}
> +}
> +
> static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
> {
> pgprot_t old_prot, new_prot;
> @@ -249,12 +290,13 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
> static int __init early_set_memory_enc_dec(unsigned long vaddr,
> unsigned long size, bool enc)
> {
> -	unsigned long vaddr_end, vaddr_next;
> +	unsigned long vaddr_end, vaddr_next, start;
> unsigned long psize, pmask;
> int split_page_size_mask;
> int level, ret;
> pte_t *kpte;
>
> +	start = vaddr;
> vaddr_next = vaddr;
> vaddr_end = vaddr + size;
>
> @@ -309,6 +351,7 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,
>
> ret = 0;
>
> +	set_memory_enc_dec_hypercall(start, size, enc);

That function iterates the same way over the virtual addresses as
early_set_memory_enc_dec() does. Please call kvm_sev_hypercall3(),
wrapped of course, directly from early_set_memory_enc_dec(), for
each iteration of the loop instead of iterating over all the virtual
addresses a second time in set_memory_enc_dec_hypercall().
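
IOW, something like this (completely untested sketch, helper name
made up):

/* Notify the hypervisor about a single mapping; called once per
 * iteration of early_set_memory_enc_dec()'s loop, right after
 * __set_clr_pte_enc(), so the page tables are walked only once. */
static void notify_addr_enc_status_changed(unsigned long pfn, int level,
					   bool enc)
{
	if (!sev_active())
		return;

	kvm_sev_hypercall3(KVM_HC_PAGE_ENC_STATUS,
			   pfn << PAGE_SHIFT,
			   page_level_size(level) >> PAGE_SHIFT,
			   enc);
}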

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 247165, AG München

2019-08-29 18:09:08

by Borislav Petkov

Subject: Re: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

On Wed, Jul 10, 2019 at 08:13:11PM +0000, Singh, Brijesh wrote:
> @@ -2060,6 +2067,14 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> */
> cpa_flush(&cpa, 0);
>
> +	/*
> +	 * When SEV is active, notify the hypervisor that a given memory range
> +	 * is mapped encrypted or decrypted. The hypervisor will use this
> +	 * information during VM migration.
> +	 */
> +	if (sev_active())
> +		set_memory_enc_dec_hypercall(addr, numpages << PAGE_SHIFT, enc);

Btw, tglx has another valid design concern here: why isn't this a
pv_ops thing? So that it is active only when the hypervisor is actually
present?

I know, I know, this will run on SEV guests only because it is all
(hopefully) behind "if (sev_active())" checks but the clean and accepted
design is a paravirt call, I'd say.

Especially if some day other hypervisors should want to run SEV guests
too...
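
Something along these lines, maybe (pure sketch, member and function
names made up):

/* paravirt_types.h: new pv_mmu_ops member, default is a no-op */
void (*page_encryption_changed)(unsigned long vaddr, unsigned long size,
				bool enc);

/* kvm.c, guest init: only an SEV guest installs the hypercall */
if (sev_active())
	pv_ops.mmu.page_encryption_changed = kvm_sev_page_enc_status;

/* __set_memory_enc_dec() then just does the indirect call and stays
 * hypervisor-agnostic: */
pv_ops.mmu.page_encryption_changed(addr, numpages << PAGE_SHIFT, enc);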

Thx.

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 247165, AG München

2019-08-29 18:22:46

by Thomas Gleixner

Subject: Re: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

On Thu, 29 Aug 2019, Borislav Petkov wrote:

> On Wed, Jul 10, 2019 at 08:13:11PM +0000, Singh, Brijesh wrote:
> > @@ -2060,6 +2067,14 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> > */
> > cpa_flush(&cpa, 0);
> >
> > +	/*
> > +	 * When SEV is active, notify the hypervisor that a given memory range
> > +	 * is mapped encrypted or decrypted. The hypervisor will use this
> > +	 * information during VM migration.
> > +	 */
> > +	if (sev_active())
> > +		set_memory_enc_dec_hypercall(addr, numpages << PAGE_SHIFT, enc);
>
> Btw, tglx has another valid design concern here: why isn't this a
> pv_ops thing? So that it is active only when the hypervisor is actually
> present?
>
> I know, I know, this will run on SEV guests only because it is all
> (hopefully) behind "if (sev_active())" checks but the clean and accepted
> design is a paravirt call, I'd say.

No. sev_active() has nothing to do with guest mode. It tells whether SEV is
active or not. So yes, this path calls into this function on both guest and
host. The latter is beyond pointless.

Thanks,

tglx


2019-08-29 18:33:48

by Thomas Gleixner

Subject: Re: [PATCH v3 10/11] mm: x86: Invoke hypercall when page encryption status is changed

On Thu, 29 Aug 2019, Thomas Gleixner wrote:
> On Thu, 29 Aug 2019, Borislav Petkov wrote:
>
> > On Wed, Jul 10, 2019 at 08:13:11PM +0000, Singh, Brijesh wrote:
> > > @@ -2060,6 +2067,14 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> > > */
> > > cpa_flush(&cpa, 0);
> > >
> > > +	/*
> > > +	 * When SEV is active, notify the hypervisor that a given memory range
> > > +	 * is mapped encrypted or decrypted. The hypervisor will use this
> > > +	 * information during VM migration.
> > > +	 */
> > > +	if (sev_active())
> > > +		set_memory_enc_dec_hypercall(addr, numpages << PAGE_SHIFT, enc);
> >
> > Btw, tglx has another valid design concern here: why isn't this a
> > pv_ops thing? So that it is active only when the hypervisor is actually
> > present?
> >
> > I know, I know, this will run on SEV guests only because it is all
> > (hopefully) behind "if (sev_active())" checks but the clean and accepted
> > design is a paravirt call, I'd say.
>
> No. sev_active() has nothing to do with guest mode. It tells whether SEV is
> active or not. So yes, this path calls into this function on both guest and
> host. The latter is beyond pointless.

Oops. sme != sev.

But yes, can we please hide that a bit better....
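
e.g. behind a trivial wrapper so the sev_active() check lives in one
place instead of at every call site (untested sketch, name made up):

static inline void notify_page_enc_status_changed(unsigned long addr,
						  int numpages, bool enc)
{
	/* Only an SEV guest has anything to tell the hypervisor. */
	if (!sev_active())
		return;

	set_memory_enc_dec_hypercall(addr, numpages << PAGE_SHIFT, enc);
}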