2020-05-11 23:34:59

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 0/3] arch/x86: Enable MPK feature on AMD

AMD's next generation of EPYC processors support the MPK (Memory
Protection Keys) feature.

AMD documentation for MPK feature is available at "AMD64 Architecture
Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34,
Section 5.6.6 Memory Protection Keys (MPK) Bit".

The documentation can be obtained at the link below:
https://bugzilla.kernel.org/show_bug.cgi?id=206537

This series enables the feature on AMD and updates config parameters
to reflect the MPK support on generic x86 platforms.

---
v3:
- Fixed the problem Jim Mattson pointed out which can cause pkru
resources to get corrupted during host and guest switches.
- Moved the PKU feature detection code from VMX.c to common code.

v2:
https://lore.kernel.org/lkml/[email protected]/
- Introduced intermediate config option X86_MEMORY_PROTECTION_KEYS to
avoid user propmpts. Kept X86_INTEL_MEMORY_PROTECTION_KEYS as is.
Eventually, we will be moving to X86_MEMORY_PROTECTION_KEYS after
couple of kernel revisions.
- Moved pkru data structures to kvm_vcpu_arch. Moved save/restore pkru
to kvm_load_host_xsave_state/kvm_load_guest_xsave_state.

v1:
https://lore.kernel.org/lkml/[email protected]/

Babu Moger (3):
arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86
KVM: x86: Move pkru save/restore to x86.c
KVM: x86: Move MPK feature detection to common code


Documentation/core-api/protection-keys.rst | 3 ++-
arch/x86/Kconfig | 11 +++++++++--
arch/x86/include/asm/disabled-features.h | 4 ++--
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/asm/mmu.h | 2 +-
arch/x86/include/asm/mmu_context.h | 4 ++--
arch/x86/include/asm/pgtable.h | 4 ++--
arch/x86/include/asm/pgtable_types.h | 2 +-
arch/x86/include/asm/special_insns.h | 2 +-
arch/x86/include/uapi/asm/mman.h | 2 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kvm/cpuid.c | 4 +++-
arch/x86/kvm/vmx/vmx.c | 22 ----------------------
arch/x86/kvm/x86.c | 17 +++++++++++++++++
arch/x86/mm/Makefile | 2 +-
arch/x86/mm/pkeys.c | 2 +-
scripts/headers_install.sh | 2 +-
tools/arch/x86/include/asm/disabled-features.h | 4 ++--
18 files changed, 48 insertions(+), 42 deletions(-)

--


2020-05-11 23:35:04

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 1/3] arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86

AMD's next generation of EPYC processors support the MPK (Memory
Protection Keys) feature.

So, rename X86_INTEL_MEMORY_PROTECTION_KEYS to X86_MEMORY_PROTECTION_KEYS.

No functional changes.

AMD documentation for MPK feature is available at "AMD64 Architecture
Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34,
Section 5.6.6 Memory Protection Keys (MPK) Bit". Documentation can be
obtained at the link below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <[email protected]>
---
Documentation/core-api/protection-keys.rst | 3 ++-
arch/x86/Kconfig | 11 +++++++++--
arch/x86/include/asm/disabled-features.h | 4 ++--
arch/x86/include/asm/mmu.h | 2 +-
arch/x86/include/asm/mmu_context.h | 4 ++--
arch/x86/include/asm/pgtable.h | 4 ++--
arch/x86/include/asm/pgtable_types.h | 2 +-
arch/x86/include/asm/special_insns.h | 2 +-
arch/x86/include/uapi/asm/mman.h | 2 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/mm/Makefile | 2 +-
arch/x86/mm/pkeys.c | 2 +-
scripts/headers_install.sh | 2 +-
tools/arch/x86/include/asm/disabled-features.h | 4 ++--
14 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 49d9833af871..d25e89e53c59 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -6,7 +6,8 @@ Memory Protection Keys

Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
which is found on Intel's Skylake "Scalable Processor" Server CPUs.
-It will be avalable in future non-server parts.
+It will be available in future non-server parts. Also, AMD64
+Architecture Programmer’s Manual defines PKU feature in AMD processors.

For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1197b5596d5a..b6f1686526eb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1887,10 +1887,10 @@ config X86_UMIP
results are dummy.

config X86_INTEL_MEMORY_PROTECTION_KEYS
- prompt "Intel Memory Protection Keys"
+ prompt "Memory Protection Keys"
def_bool y
# Note: only available in 64-bit mode
- depends on CPU_SUP_INTEL && X86_64
+ depends on X86_64 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
---help---
@@ -1902,6 +1902,13 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS

If unsure, say y.

+config X86_MEMORY_PROTECTION_KEYS
+ # Note: This is an intermediate change to avoid config prompt to
+ # the users. Eventually, the option X86_INTEL_MEMORY_PROTECTION_KEYS
+ # should be changed to X86_MEMORY_PROTECTION_KEYS permanently after
+ # few kernel revisions.
+ def_bool X86_INTEL_MEMORY_PROTECTION_KEYS
+
choice
prompt "TSX enable mode"
depends on CPU_SUP_INTEL
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 4ea8584682f9..52dbdfed8043 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -36,13 +36,13 @@
# define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31))
#endif /* CONFIG_X86_64 */

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
# define DISABLE_PKU 0
# define DISABLE_OSPKE 0
#else
# define DISABLE_PKU (1<<(X86_FEATURE_PKU & 31))
# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31))
-#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
+#endif /* CONFIG_X86_MEMORY_PROTECTION_KEYS */

#ifdef CONFIG_X86_5LEVEL
# define DISABLE_LA57 0
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index bdeae9291e5c..351d22152709 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -42,7 +42,7 @@ typedef struct {
const struct vdso_image *vdso_image; /* vdso image in use */

atomic_t perf_rdpmc_allowed; /* nonzero if rdpmc is allowed */
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
/*
* One bit per protection key says whether userspace can
* use it or not. protected by mmap_sem.
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 4e55370e48e8..33f4a7ccac5e 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -118,7 +118,7 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id);
atomic64_set(&mm->context.tlb_gen, 0);

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
/* pkey 0 is the default and allocated implicitly */
mm->context.pkey_allocation_map = 0x1;
@@ -163,7 +163,7 @@ do { \
static inline void arch_dup_pkeys(struct mm_struct *oldmm,
struct mm_struct *mm)
{
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
return;

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 4d02e64af1b3..4265720d62c2 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1451,7 +1451,7 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
#define PKRU_WD_BIT 0x2
#define PKRU_BITS_PER_PKEY 2

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
extern u32 init_pkru_value;
#else
#define init_pkru_value 0
@@ -1475,7 +1475,7 @@ static inline bool __pkru_allows_write(u32 pkru, u16 pkey)

static inline u16 pte_flags_pkey(unsigned long pte_flags)
{
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
/* ifdef to avoid doing 59-bit shift on 32-bit values */
return (pte_flags & _PAGE_PKEY_MASK) >> _PAGE_BIT_PKEY_BIT0;
#else
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index b6606fe6cfdf..c61a1ff71d53 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -56,7 +56,7 @@
#define _PAGE_PAT_LARGE (_AT(pteval_t, 1) << _PAGE_BIT_PAT_LARGE)
#define _PAGE_SPECIAL (_AT(pteval_t, 1) << _PAGE_BIT_SPECIAL)
#define _PAGE_CPA_TEST (_AT(pteval_t, 1) << _PAGE_BIT_CPA_TEST)
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
#define _PAGE_PKEY_BIT0 (_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT0)
#define _PAGE_PKEY_BIT1 (_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT1)
#define _PAGE_PKEY_BIT2 (_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT2)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 6d37b8fcfc77..70eaae7e8f04 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -73,7 +73,7 @@ static inline unsigned long native_read_cr4(void)

void native_write_cr4(unsigned long val);

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
static inline u32 rdpkru(void)
{
u32 ecx = 0;
diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
index d4a8d0424bfb..d4da414a9de2 100644
--- a/arch/x86/include/uapi/asm/mman.h
+++ b/arch/x86/include/uapi/asm/mman.h
@@ -4,7 +4,7 @@

#define MAP_32BIT 0x40 /* only give out 32bit addresses */

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
/*
* Take the 4 protection key bits out of the vma->vm_flags
* value and turn them in to the bits that we can put in
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bed0cb83fe24..e5fb9955214c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -448,7 +448,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_OSPKE);
}

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
static __init int setup_disable_pku(char *arg)
{
/*
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 98f7c6fa2eaa..17ebf12ba8ff 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_AMD_NUMA) += amdtopology.o
obj-$(CONFIG_ACPI_NUMA) += srat.o
obj-$(CONFIG_NUMA_EMU) += numa_emulation.o

-obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
+obj-$(CONFIG_X86_MEMORY_PROTECTION_KEYS) += pkeys.o
obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o

diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 8873ed1438a9..a77497e8d58c 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Intel Memory Protection Keys management
+ * Memory Protection Keys management
* Copyright (c) 2015, Intel Corporation.
*/
#include <linux/debugfs.h> /* debugfs_create_u32() */
diff --git a/scripts/headers_install.sh b/scripts/headers_install.sh
index a07668a5c36b..6e60e5362d3e 100755
--- a/scripts/headers_install.sh
+++ b/scripts/headers_install.sh
@@ -86,7 +86,7 @@ arch/sh/include/uapi/asm/sigcontext.h:CONFIG_CPU_SH5
arch/sh/include/uapi/asm/stat.h:CONFIG_CPU_SH5
arch/x86/include/uapi/asm/auxvec.h:CONFIG_IA32_EMULATION
arch/x86/include/uapi/asm/auxvec.h:CONFIG_X86_64
-arch/x86/include/uapi/asm/mman.h:CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+arch/x86/include/uapi/asm/mman.h:CONFIG_X86_MEMORY_PROTECTION_KEYS
include/uapi/asm-generic/fcntl.h:CONFIG_64BIT
include/uapi/linux/atmdev.h:CONFIG_COMPAT
include/uapi/linux/elfcore.h:CONFIG_BINFMT_ELF_FDPIC
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 4ea8584682f9..52dbdfed8043 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -36,13 +36,13 @@
# define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31))
#endif /* CONFIG_X86_64 */

-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_X86_MEMORY_PROTECTION_KEYS
# define DISABLE_PKU 0
# define DISABLE_OSPKE 0
#else
# define DISABLE_PKU (1<<(X86_FEATURE_PKU & 31))
# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31))
-#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
+#endif /* CONFIG_X86_MEMORY_PROTECTION_KEYS */

#ifdef CONFIG_X86_5LEVEL
# define DISABLE_LA57 0

2020-05-11 23:36:21

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code

Both Intel and AMD support (MPK) Memory Protection Key feature.
Move the feature detection from VMX to the common code. It should
work for both the platforms now.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kvm/cpuid.c | 4 +++-
arch/x86/kvm/vmx/vmx.c | 4 ----
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 901cd1fdecd9..3da7d6ea7574 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
#ifdef CONFIG_X86_64
unsigned int f_gbpages = F(GBPAGES);
unsigned int f_lm = F(LM);
+ /* PKU is not yet implemented for shadow paging. */
+ unsigned int f_pku = tdp_enabled ? F(PKU) : 0;
#else
unsigned int f_gbpages = 0;
unsigned int f_lm = 0;
@@ -326,7 +328,7 @@ void kvm_set_cpu_caps(void)
);

kvm_cpu_cap_mask(CPUID_7_ECX,
- F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) |
+ F(AVX512VBMI) | F(LA57) | f_pku | 0 /*OSPKE*/ | F(RDPID) |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 46898a476ba7..d153732ed88f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7136,10 +7136,6 @@ static __init void vmx_set_cpu_caps(void)
if (vmx_pt_mode_is_host_guest())
kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);

- /* PKU is not yet implemented for shadow paging. */
- if (enable_ept && boot_cpu_has(X86_FEATURE_OSPKE))
- kvm_cpu_cap_check_and_set(X86_FEATURE_PKU);
-
if (vmx_umip_emulated())
kvm_cpu_cap_set(X86_FEATURE_UMIP);


2020-05-11 23:37:19

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 2/3] KVM: x86: Move pkru save/restore to x86.c

MPK feature is supported by both VMX and SVM. So we can
safely move pkru state save/restore to common code. Also
move all the pkru data structure to kvm_vcpu_arch.

Also fixes the problem Jim Mattson pointed and suggested below.

"Though rdpkru and wrpkru are contingent upon CR4.PKE, the PKRU
resource isn't. It can be read with XSAVE and written with XRSTOR.
So, if we don't set the guest PKRU value here(kvm_load_guest_xsave_state),
the guest can read the host value.

In case of kvm_load_host_xsave_state, guest with CR4.PKE clear could
potentially use XRSTOR to change the host PKRU value"

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx/vmx.c | 18 ------------------
arch/x86/kvm/x86.c | 17 +++++++++++++++++
3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 42a2d0d3984a..afd8f3780ae0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -578,6 +578,7 @@ struct kvm_vcpu_arch {
unsigned long cr4;
unsigned long cr4_guest_owned_bits;
unsigned long cr8;
+ u32 host_pkru;
u32 pkru;
u32 hflags;
u64 efer;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c2c6335a998c..46898a476ba7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1372,7 +1372,6 @@ void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)

vmx_vcpu_pi_load(vcpu, cpu);

- vmx->host_pkru = read_pkru();
vmx->host_debugctlmsr = get_debugctlmsr();
}

@@ -6577,11 +6576,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)

kvm_load_guest_xsave_state(vcpu);

- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
- vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vcpu->arch.pkru);
-
pt_guest_enter(vmx);

if (vcpu_to_pmu(vcpu)->version)
@@ -6671,18 +6665,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)

pt_guest_exit(vmx);

- /*
- * eager fpu is enabled if PKEY is supported and CR4 is switched
- * back on host, so it is safe to read guest PKRU from current
- * XSAVE.
- */
- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
- vcpu->arch.pkru = rdpkru();
- if (vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vmx->host_pkru);
- }
-
kvm_load_host_xsave_state(vcpu);

vmx->nested.nested_run_pending = 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5835f9cb9ad..98baeb74452c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -836,11 +836,25 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
vcpu->arch.ia32_xss != host_xss)
wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
}
+
+ if (static_cpu_has(X86_FEATURE_PKU) &&
+ (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
+ (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
+ vcpu->arch.pkru != vcpu->arch.host_pkru)
+ __write_pkru(vcpu->arch.pkru);
}
EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);

void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
{
+ if (static_cpu_has(X86_FEATURE_PKU) &&
+ (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
+ (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
+ vcpu->arch.pkru = rdpkru();
+ if (vcpu->arch.pkru != vcpu->arch.host_pkru)
+ __write_pkru(vcpu->arch.host_pkru);
+ }
+
if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {

if (vcpu->arch.xcr0 != host_xcr0)
@@ -3570,6 +3584,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)

kvm_x86_ops.vcpu_load(vcpu, cpu);

+ /* Save host pkru register if supported */
+ vcpu->arch.host_pkru = read_pkru();
+
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);

2020-05-11 23:46:10

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86

On 5/11/20 4:32 PM, Babu Moger wrote:
> AMD's next generation of EPYC processors support the MPK (Memory
> Protection Keys) feature.
>
> So, rename X86_INTEL_MEMORY_PROTECTION_KEYS to X86_MEMORY_PROTECTION_KEYS.
>
> No functional changes.
>
> AMD documentation for MPK feature is available at "AMD64 Architecture
> Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34,
> Section 5.6.6 Memory Protection Keys (MPK) Bit". Documentation can be
> obtained at the link below.

I was hoping to see at least *some* justification in this changelog. Do
you think having "INTEL_" will confuse users? Is there some technical
merit to this change?

The naming churn is an obviously bad, not technically necessary change.

> +config X86_MEMORY_PROTECTION_KEYS
> + # Note: This is an intermediate change to avoid config prompt to
> + # the users. Eventually, the option X86_INTEL_MEMORY_PROTECTION_KEYS
> + # should be changed to X86_MEMORY_PROTECTION_KEYS permanently after
> + # few kernel revisions.
> + def_bool X86_INTEL_MEMORY_PROTECTION_KEYS

"after a few kernel revisions" is code for "never". :)

Could we put an explicit date on this, please? One year seems roughly
right. Or, maybe "after the v5.10" release, so that this will approach
will make into at least one LTS kernel.

Maybe:

# Set the "INTEL_"-free option whenever the "INTEL_" one is set.
# The "INTEL_" one should be removed and replaced by this option after
# 5.10. This avoids exposing most 'oldconfig' users to this churn.

2020-05-11 23:53:57

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code

On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>
> Both Intel and AMD support (MPK) Memory Protection Key feature.
> Move the feature detection from VMX to the common code. It should
> work for both the platforms now.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 4 +++-
> arch/x86/kvm/vmx/vmx.c | 4 ----
> 2 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 901cd1fdecd9..3da7d6ea7574 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
> #ifdef CONFIG_X86_64
> unsigned int f_gbpages = F(GBPAGES);
> unsigned int f_lm = F(LM);
> + /* PKU is not yet implemented for shadow paging. */
> + unsigned int f_pku = tdp_enabled ? F(PKU) : 0;

I think we still want to require that OSPKE be set on the host before
exposing PKU to the guest.

2020-05-12 15:01:43

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86



On 5/11/20 6:44 PM, Dave Hansen wrote:
> On 5/11/20 4:32 PM, Babu Moger wrote:
>> AMD's next generation of EPYC processors support the MPK (Memory
>> Protection Keys) feature.
>>
>> So, rename X86_INTEL_MEMORY_PROTECTION_KEYS to X86_MEMORY_PROTECTION_KEYS.
>>
>> No functional changes.
>>
>> AMD documentation for MPK feature is available at "AMD64 Architecture
>> Programmer’s Manual Volume 2: System Programming, Pub. 24593 Rev. 3.34,
>> Section 5.6.6 Memory Protection Keys (MPK) Bit". Documentation can be
>> obtained at the link below.

I will remove this text. This is not too relevant here.

>
> I was hoping to see at least *some* justification in this changelog. Do
> you think having "INTEL_" will confuse users? Is there some technical
> merit to this change?
>
> The naming churn is an obviously bad, not technically necessary change.

Yes. Technically not necessary. But can cause some confusion on non-intel
platforms.

>
>> +config X86_MEMORY_PROTECTION_KEYS
>> + # Note: This is an intermediate change to avoid config prompt to
>> + # the users. Eventually, the option X86_INTEL_MEMORY_PROTECTION_KEYS
>> + # should be changed to X86_MEMORY_PROTECTION_KEYS permanently after
>> + # few kernel revisions.
>> + def_bool X86_INTEL_MEMORY_PROTECTION_KEYS
>
> "after a few kernel revisions" is code for "never". :)
>
> Could we put an explicit date on this, please? One year seems roughly
> right. Or, maybe "after the v5.10" release, so that this will approach
> will make into at least one LTS kernel.
>
> Maybe:
>
> # Set the "INTEL_"-free option whenever the "INTEL_" one is set.
> # The "INTEL_" one should be removed and replaced by this option after
> # 5.10. This avoids exposing most 'oldconfig' users to this churn.
>
Yes, this should work.

2020-05-12 15:15:04

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code



On 5/11/20 6:51 PM, Jim Mattson wrote:
> On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>>
>> Both Intel and AMD support (MPK) Memory Protection Key feature.
>> Move the feature detection from VMX to the common code. It should
>> work for both the platforms now.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kvm/cpuid.c | 4 +++-
>> arch/x86/kvm/vmx/vmx.c | 4 ----
>> 2 files changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 901cd1fdecd9..3da7d6ea7574 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
>> #ifdef CONFIG_X86_64
>> unsigned int f_gbpages = F(GBPAGES);
>> unsigned int f_lm = F(LM);
>> + /* PKU is not yet implemented for shadow paging. */
>> + unsigned int f_pku = tdp_enabled ? F(PKU) : 0;
>
> I think we still want to require that OSPKE be set on the host before
> exposing PKU to the guest.
>

Ok I can add this check.

+ unsigned int f_pku = tdp_enabled && F(OSPKE)? F(PKU) : 0;

2020-05-12 15:22:24

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86

On 5/12/20 7:57 AM, Babu Moger wrote:
>> I was hoping to see at least *some* justification in this changelog. Do
>> you think having "INTEL_" will confuse users? Is there some technical
>> merit to this change?
>>
>> The naming churn is an obviously bad, not technically necessary change.
> Yes. Technically not necessary. But can cause some confusion on non-intel
> platforms.

Seriously, guys, this is buried deep in kernel code. Who is this confusing?

To me, this is like anything else we rename in the kernel. It causes
churn, which makes patches harder to backport for instance. That's why
we don't rename things willy-nilly when we just don't like the names.

The naming has to cause some practical, real-world problem that we *FIX*
with the rename.

I'm just asking for a concrete, practical problem statement in the
changelog. If there isn't one, then please don't do the rename. The
Kconfig magic is still fine since it fixes a practical problem for end
users.

2020-05-12 15:47:14

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] arch/x86: Rename config X86_INTEL_MEMORY_PROTECTION_KEYS to generic x86



On 5/12/20 10:19 AM, Dave Hansen wrote:
> On 5/12/20 7:57 AM, Babu Moger wrote:
>>> I was hoping to see at least *some* justification in this changelog. Do
>>> you think having "INTEL_" will confuse users? Is there some technical
>>> merit to this change?
>>>
>>> The naming churn is an obviously bad, not technically necessary change.
>> Yes. Technically not necessary. But can cause some confusion on non-intel
>> platforms.
>
> Seriously, guys, this is buried deep in kernel code. Who is this confusing?
>
> To me, this is like anything else we rename in the kernel. It causes
> churn, which makes patches harder to backport for instance. That's why
> we don't rename things willy-nilly when we just don't like the names.
>
> The naming has to cause some practical, real-world problem that we *FIX*
> with the rename.
>
> I'm just asking for a concrete, practical problem statement in the
> changelog. If there isn't one, then please don't do the rename. The
> Kconfig magic is still fine since it fixes a practical problem for end
> users.
>

Alright. Alright. I will just keep Kconfig magic and update the
documentation(protection-keys.rst). Thanks

2020-05-12 16:44:17

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] KVM: x86: Move pkru save/restore to x86.c

On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>
> MPK feature is supported by both VMX and SVM. So we can
> safely move pkru state save/restore to common code. Also
> move all the pkru data structure to kvm_vcpu_arch.
>
> Also fixes the problem Jim Mattson pointed and suggested below.
>
> "Though rdpkru and wrpkru are contingent upon CR4.PKE, the PKRU
> resource isn't. It can be read with XSAVE and written with XRSTOR.
> So, if we don't set the guest PKRU value here(kvm_load_guest_xsave_state),
> the guest can read the host value.
>
> In case of kvm_load_host_xsave_state, guest with CR4.PKE clear could
> potentially use XRSTOR to change the host PKRU value"
>
> Signed-off-by: Babu Moger <[email protected]>

I would do the bugfix as a separate commit, to ease backporting it to
the stable branches.

2020-05-12 17:00:19

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code

On Tue, May 12, 2020 at 8:12 AM Babu Moger <[email protected]> wrote:
>
>
>
> On 5/11/20 6:51 PM, Jim Mattson wrote:
> > On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
> >>
> >> Both Intel and AMD support (MPK) Memory Protection Key feature.
> >> Move the feature detection from VMX to the common code. It should
> >> work for both the platforms now.
> >>
> >> Signed-off-by: Babu Moger <[email protected]>
> >> ---
> >> arch/x86/kvm/cpuid.c | 4 +++-
> >> arch/x86/kvm/vmx/vmx.c | 4 ----
> >> 2 files changed, 3 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >> index 901cd1fdecd9..3da7d6ea7574 100644
> >> --- a/arch/x86/kvm/cpuid.c
> >> +++ b/arch/x86/kvm/cpuid.c
> >> @@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
> >> #ifdef CONFIG_X86_64
> >> unsigned int f_gbpages = F(GBPAGES);
> >> unsigned int f_lm = F(LM);
> >> + /* PKU is not yet implemented for shadow paging. */
> >> + unsigned int f_pku = tdp_enabled ? F(PKU) : 0;
> >
> > I think we still want to require that OSPKE be set on the host before
> > exposing PKU to the guest.
> >
>
> Ok I can add this check.
>
> + unsigned int f_pku = tdp_enabled && F(OSPKE)? F(PKU) : 0;

That doesn't do what you think it does. F(OSPKE) is a non-zero
constant, so that conjunct is always true.

2020-05-12 17:20:01

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] KVM: x86: Move pkru save/restore to x86.c



On 5/12/20 11:39 AM, Jim Mattson wrote:
> On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>>
>> MPK feature is supported by both VMX and SVM. So we can
>> safely move pkru state save/restore to common code. Also
>> move all the pkru data structure to kvm_vcpu_arch.
>>
>> Also fixes the problem Jim Mattson pointed and suggested below.
>>
>> "Though rdpkru and wrpkru are contingent upon CR4.PKE, the PKRU
>> resource isn't. It can be read with XSAVE and written with XRSTOR.
>> So, if we don't set the guest PKRU value here(kvm_load_guest_xsave_state),
>> the guest can read the host value.
>>
>> In case of kvm_load_host_xsave_state, guest with CR4.PKE clear could
>> potentially use XRSTOR to change the host PKRU value"
>>
>> Signed-off-by: Babu Moger <[email protected]>
>
> I would do the bugfix as a separate commit, to ease backporting it to
> the stable branches.

Ok. Sure.

2020-05-12 17:30:12

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code

On Tue, May 12, 2020 at 09:58:19AM -0700, Jim Mattson wrote:
> On Tue, May 12, 2020 at 8:12 AM Babu Moger <[email protected]> wrote:
> >
> >
> >
> > On 5/11/20 6:51 PM, Jim Mattson wrote:
> > > On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
> > >>
> > >> Both Intel and AMD support (MPK) Memory Protection Key feature.
> > >> Move the feature detection from VMX to the common code. It should
> > >> work for both the platforms now.
> > >>
> > >> Signed-off-by: Babu Moger <[email protected]>
> > >> ---
> > >> arch/x86/kvm/cpuid.c | 4 +++-
> > >> arch/x86/kvm/vmx/vmx.c | 4 ----
> > >> 2 files changed, 3 insertions(+), 5 deletions(-)
> > >>
> > >> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > >> index 901cd1fdecd9..3da7d6ea7574 100644
> > >> --- a/arch/x86/kvm/cpuid.c
> > >> +++ b/arch/x86/kvm/cpuid.c
> > >> @@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
> > >> #ifdef CONFIG_X86_64
> > >> unsigned int f_gbpages = F(GBPAGES);
> > >> unsigned int f_lm = F(LM);
> > >> + /* PKU is not yet implemented for shadow paging. */
> > >> + unsigned int f_pku = tdp_enabled ? F(PKU) : 0;
> > >
> > > I think we still want to require that OSPKE be set on the host before
> > > exposing PKU to the guest.
> > >
> >
> > Ok I can add this check.
> >
> > + unsigned int f_pku = tdp_enabled && F(OSPKE)? F(PKU) : 0;
>
> That doesn't do what you think it does. F(OSPKE) is a non-zero
> constant, so that conjunct is always true.

My vote would be to omit f_pku and adjust the cap directly, e.g.

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6828be99b9083..998c902df9e57 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -326,7 +326,7 @@ void kvm_set_cpu_caps(void)
);

kvm_cpu_cap_mask(CPUID_7_ECX,
- F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) |
+ F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
@@ -334,6 +334,8 @@ void kvm_set_cpu_caps(void)
/* Set LA57 based on hardware capability. */
if (cpuid_ecx(7) & F(LA57))
kvm_cpu_cap_set(X86_FEATURE_LA57);
+ if (!tdp_enabled || !boot_cpu_has(OSPKE))
+ kvm_cpu_cap_clear(X86_FEATURE_PKU);

kvm_cpu_cap_mask(CPUID_7_EDX,
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |

2020-05-12 20:06:52

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86: Move MPK feature detection to common code



On 5/12/20 12:28 PM, Sean Christopherson wrote:
> On Tue, May 12, 2020 at 09:58:19AM -0700, Jim Mattson wrote:
>> On Tue, May 12, 2020 at 8:12 AM Babu Moger <[email protected]> wrote:
>>>
>>>
>>>
>>> On 5/11/20 6:51 PM, Jim Mattson wrote:
>>>> On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>>>>>
>>>>> Both Intel and AMD support (MPK) Memory Protection Key feature.
>>>>> Move the feature detection from VMX to the common code. It should
>>>>> work for both the platforms now.
>>>>>
>>>>> Signed-off-by: Babu Moger <[email protected]>
>>>>> ---
>>>>> arch/x86/kvm/cpuid.c | 4 +++-
>>>>> arch/x86/kvm/vmx/vmx.c | 4 ----
>>>>> 2 files changed, 3 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>>>> index 901cd1fdecd9..3da7d6ea7574 100644
>>>>> --- a/arch/x86/kvm/cpuid.c
>>>>> +++ b/arch/x86/kvm/cpuid.c
>>>>> @@ -278,6 +278,8 @@ void kvm_set_cpu_caps(void)
>>>>> #ifdef CONFIG_X86_64
>>>>> unsigned int f_gbpages = F(GBPAGES);
>>>>> unsigned int f_lm = F(LM);
>>>>> + /* PKU is not yet implemented for shadow paging. */
>>>>> + unsigned int f_pku = tdp_enabled ? F(PKU) : 0;
>>>>
>>>> I think we still want to require that OSPKE be set on the host before
>>>> exposing PKU to the guest.
>>>>
>>>
>>> Ok I can add this check.
>>>
>>> + unsigned int f_pku = tdp_enabled && F(OSPKE)? F(PKU) : 0;
>>
>> That doesn't do what you think it does. F(OSPKE) is a non-zero
>> constant, so that conjunct is always true.
>
> My vote would be to omit f_pku and adjust the cap directly, e.g.

Sure. I am fine with this. Thanks

>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6828be99b9083..998c902df9e57 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -326,7 +326,7 @@ void kvm_set_cpu_caps(void)
> );
>
> kvm_cpu_cap_mask(CPUID_7_ECX,
> - F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) |
> + F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
> @@ -334,6 +334,8 @@ void kvm_set_cpu_caps(void)
> /* Set LA57 based on hardware capability. */
> if (cpuid_ecx(7) & F(LA57))
> kvm_cpu_cap_set(X86_FEATURE_LA57);
> + if (!tdp_enabled || !boot_cpu_has(OSPKE))
> + kvm_cpu_cap_clear(X86_FEATURE_PKU);
>
> kvm_cpu_cap_mask(CPUID_7_EDX,
> F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
>

2020-05-13 06:51:23

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] KVM: x86: Move pkru save/restore to x86.c

On 12/05/20 19:17, Babu Moger wrote:
>
> On 5/12/20 11:39 AM, Jim Mattson wrote:
>> On Mon, May 11, 2020 at 4:33 PM Babu Moger <[email protected]> wrote:
>>> MPK feature is supported by both VMX and SVM. So we can
>>> safely move pkru state save/restore to common code. Also
>>> move all the pkru data structure to kvm_vcpu_arch.
>>>
>>> Also fixes the problem Jim Mattson pointed and suggested below.
>>>
>>> "Though rdpkru and wrpkru are contingent upon CR4.PKE, the PKRU
>>> resource isn't. It can be read with XSAVE and written with XRSTOR.
>>> So, if we don't set the guest PKRU value here(kvm_load_guest_xsave_state),
>>> the guest can read the host value.
>>>
>>> In case of kvm_load_host_xsave_state, guest with CR4.PKE clear could
>>> potentially use XRSTOR to change the host PKRU value"
>>>
>>> Signed-off-by: Babu Moger <[email protected]>
>> I would do the bugfix as a separate commit, to ease backporting it to
>> the stable branches.
> Ok. Sure.

I will take care of this for v4 (pick this patch up and put it in
5.7-rc, package everything as a topic branch, merge it to kvm/next).

Paolo