2017-03-20 08:17:19

by Kyle Huey

Subject: [PATCH v16 0/10] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction

rr (http://rr-project.org/), a userspace record-and-replay reverse-
execution debugger, would like to trap and emulate the CPUID instruction.
This would allow us to a) mask away certain hardware features that rr does
not support (e.g. RDRAND) and b) enable trace portability across machines
by providing constant results.

Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
CPL > 0. Expose this capability to userspace as a new pair of arch_prctls,
ARCH_GET_CPUID and ARCH_SET_CPUID.
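
As a rough illustration of point a), an emulation hook could clear the RDRAND feature bit before handing CPUID results back to the tracee. This is only a sketch and not rr's code: struct cpuid_regs and emulate_cpuid_filter() are hypothetical names. The one hardware fact it relies on is that CPUID leaf 1 reports RDRAND in ECX bit 30.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four CPUID output registers. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* CPUID.01H:ECX bit 30 advertises RDRAND. */
#define CPUID_1_ECX_RDRAND (1u << 30)

/*
 * Hypothetical emulation hook: given the leaf the tracee requested and the
 * values about to be reported, hide features the debugger cannot replay.
 * Leaves other than 1 are passed through unchanged.
 */
static void emulate_cpuid_filter(uint32_t leaf, struct cpuid_regs *regs)
{
	assert(regs != NULL);

	if (leaf == 1)
		regs->ecx &= ~CPUID_1_ECX_RDRAND;
}
```

A tracer would call such a hook from its handler for the fault raised by CPUID, then write the filtered registers back and advance the tracee past the instruction.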

Since v15:
All: Patch 1 is new, and patch 2 is what was previously patch 9. What were
patches 1-8 are now 3-10, in the same order.

Patch 1: (NEW) x86/msr: Rename MISC_FEATURE_ENABLES
- While fixing the conflicts with the ring3 mwait feature, I noticed it
introduced MSR_MISC_FEATURE_ENABLES, not MSR_MISC_FEATURES_ENABLES.
This is corrected.

Patch 2: x86/arch_prctl: Rename 'code' argument to 'option'
- Previously patch 9/9, now moved to precede all other arch_prctl patches.
- Fixed a stale file location comment in arch/um/include/shared/os.h.

Patch 7: x86/cpufeature: Detect CPUID faulting support
- Split cpuid faulting code into a separate init_cpuid_fault.
- Gate calling both init_cpuid_fault and probe_xeon_phi_r3mwait on
a successful rdmsrl_safe(MSR_MISC_FEATURES_ENABLES).

Patch 8: x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
- Fix the bug with ring 3 mwait interactions that tglx noted by
teaching probe_xeon_phi_r3mwait about the new MSR_MISC_FEATURES_ENABLES
shadow, making init_intel_misc_features responsible for all MSR writes.


2017-03-20 08:17:22

by Kyle Huey

Subject: [PATCH v16 01/10] x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES

This matches the only public Intel documentation of this MSR, in the
"Virtualization Technology FlexMigration Application Note"
(preserved at https://bugzilla.kernel.org/attachment.cgi?id=243991)

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/include/asm/msr-index.h | 6 +++---
arch/x86/kernel/cpu/intel.c | 8 ++++----
2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4c928f332f8f..f429b70ebaef 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -553,10 +553,10 @@
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT 39
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE (1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)

-/* MISC_FEATURE_ENABLES non-architectural features */
-#define MSR_MISC_FEATURE_ENABLES 0x00000140
+/* MISC_FEATURES_ENABLES non-architectural features */
+#define MSR_MISC_FEATURES_ENABLES 0x00000140

-#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT 1
+#define MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT 1

#define MSR_IA32_TSC_DEADLINE 0x000006E0

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 063197771b8d..e229318d7230 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -91,13 +91,13 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
}

if (ring3mwait_disabled) {
- msr_clear_bit(MSR_MISC_FEATURE_ENABLES,
- MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT);
+ msr_clear_bit(MSR_MISC_FEATURES_ENABLES,
+ MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);
return;
}

- msr_set_bit(MSR_MISC_FEATURE_ENABLES,
- MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT);
+ msr_set_bit(MSR_MISC_FEATURES_ENABLES,
+ MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

set_cpu_cap(c, X86_FEATURE_RING3MWAIT);

--
2.11.0

2017-03-20 08:17:53

by Kyle Huey

Subject: [PATCH v16 10/10] KVM: x86: virtualize cpuid faulting

Hardware support for faulting on the cpuid instruction is not required to
emulate it, because cpuid triggers a VM exit anyway. KVM handles the relevant
MSRs (MSR_PLATFORM_INFO and MSR_MISC_FEATURES_ENABLES) and upon a
cpuid-induced VM exit checks the cpuid faulting state and the CPL.
kvm_require_cpl is even kind enough to inject the GP fault for us.

Signed-off-by: Kyle Huey <[email protected]>
Reviewed-by: David Matlack <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/cpuid.c | 3 +++
arch/x86/kvm/cpuid.h | 11 +++++++++++
arch/x86/kvm/emulate.c | 7 +++++++
arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++++++
5 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 74ef58c8ff53..df0c2bd970a4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -612,6 +612,8 @@ struct kvm_vcpu_arch {
unsigned long dr7;
unsigned long eff_db[KVM_NR_DB_REGS];
unsigned long guest_debug_dr7;
+ u64 msr_platform_info;
+ u64 msr_misc_features_enables;

u64 mcg_cap;
u64 mcg_status;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index efde6cc50875..cb560a509041 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -876,6 +876,9 @@ int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
{
u32 eax, ebx, ecx, edx;

+ if (cpuid_fault_enabled(vcpu) && !kvm_require_cpl(vcpu, 0))
+ return 1;
+
eax = kvm_register_read(vcpu, VCPU_REGS_RAX);
ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx);
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 35058c2c0eea..a6fd40aade7c 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -205,4 +205,15 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
return x86_stepping(best->eax);
}

+static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
+}
+
+static inline bool cpuid_fault_enabled(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.msr_misc_features_enables &
+ MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
+}
+
#endif
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 45c7306c8780..6a2ea945d01f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3854,6 +3854,13 @@ static int em_sti(struct x86_emulate_ctxt *ctxt)
static int em_cpuid(struct x86_emulate_ctxt *ctxt)
{
u32 eax, ebx, ecx, edx;
+ u64 msr = 0;
+
+ ctxt->ops->get_msr(ctxt, MSR_MISC_FEATURES_ENABLES, &msr);
+ if (msr & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
+ ctxt->ops->cpl(ctxt)) {
+ return emulate_gp(ctxt, 0);
+ }

eax = reg_read(ctxt, VCPU_REGS_RAX);
ecx = reg_read(ctxt, VCPU_REGS_RCX);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1faf620a6fdc..16d2082d85fb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1008,6 +1008,8 @@ static u32 emulated_msrs[] = {
MSR_IA32_MCG_CTL,
MSR_IA32_MCG_EXT_CTL,
MSR_IA32_SMBASE,
+ MSR_PLATFORM_INFO,
+ MSR_MISC_FEATURES_ENABLES,
};

static unsigned num_emulated_msrs;
@@ -2331,6 +2333,21 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
vcpu->arch.osvw.status = data;
break;
+ case MSR_PLATFORM_INFO:
+ if (!msr_info->host_initiated ||
+ data & ~MSR_PLATFORM_INFO_CPUID_FAULT ||
+ (!(data & MSR_PLATFORM_INFO_CPUID_FAULT) &&
+ cpuid_fault_enabled(vcpu)))
+ return 1;
+ vcpu->arch.msr_platform_info = data;
+ break;
+ case MSR_MISC_FEATURES_ENABLES:
+ if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
+ (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
+ !supports_cpuid_fault(vcpu)))
+ return 1;
+ vcpu->arch.msr_misc_features_enables = data;
+ break;
default:
if (msr && (msr == vcpu->kvm->arch.xen_hvm_config.msr))
return xen_hvm_config(vcpu, data);
@@ -2545,6 +2562,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
msr_info->data = vcpu->arch.osvw.status;
break;
+ case MSR_PLATFORM_INFO:
+ msr_info->data = vcpu->arch.msr_platform_info;
+ break;
+ case MSR_MISC_FEATURES_ENABLES:
+ msr_info->data = vcpu->arch.msr_misc_features_enables;
+ break;
default:
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
return kvm_pmu_get_msr(vcpu, msr_info->index, &msr_info->data);
@@ -7724,6 +7747,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
if (!init_event) {
kvm_pmu_reset(vcpu);
vcpu->arch.smbase = 0x30000;
+
+ vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
+ vcpu->arch.msr_misc_features_enables = 0;
}

memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
--
2.11.0
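
The MSR write checks and the CPL test in this patch can be modelled outside the kernel. The following is a simplified userspace sketch, not the KVM code: struct vcpu_model and the model_* helpers are hypothetical names, and the bit positions (bit 31 of MSR_PLATFORM_INFO, bit 0 of MSR_MISC_FEATURES_ENABLES) are assumptions, since this patch does not show their definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; the real definitions live elsewhere in the series. */
#define MODEL_PLATFORM_INFO_CPUID_FAULT (1ULL << 31)
#define MODEL_MISC_CPUID_FAULT          (1ULL << 0)

/* Hypothetical stand-in for the per-vCPU state the patch adds. */
struct vcpu_model {
	uint64_t platform_info;
	uint64_t misc_features_enables;
	unsigned int cpl;
};

static bool model_supports_cpuid_fault(const struct vcpu_model *v)
{
	return v->platform_info & MODEL_PLATFORM_INFO_CPUID_FAULT;
}

static bool model_cpuid_fault_enabled(const struct vcpu_model *v)
{
	return v->misc_features_enables & MODEL_MISC_CPUID_FAULT;
}

/* Returns 0 on success and 1 on rejection, mirroring the patch. */
static int model_set_platform_info(struct vcpu_model *v, uint64_t data,
				   bool host_initiated)
{
	/* Host only, no unknown bits, and no dropping support while enabled. */
	if (!host_initiated ||
	    (data & ~MODEL_PLATFORM_INFO_CPUID_FAULT) ||
	    (!(data & MODEL_PLATFORM_INFO_CPUID_FAULT) &&
	     model_cpuid_fault_enabled(v)))
		return 1;
	v->platform_info = data;
	return 0;
}

static int model_set_misc_features_enables(struct vcpu_model *v, uint64_t data)
{
	/* No unknown bits, and no enabling a feature that is not advertised. */
	if ((data & ~MODEL_MISC_CPUID_FAULT) ||
	    ((data & MODEL_MISC_CPUID_FAULT) && !model_supports_cpuid_fault(v)))
		return 1;
	v->misc_features_enables = data;
	return 0;
}

/* True when a guest CPUID should raise #GP instead of being emulated. */
static bool model_cpuid_should_fault(const struct vcpu_model *v)
{
	return model_cpuid_fault_enabled(v) && v->cpl > 0;
}
```

The two setters keep the invariant that faulting is never enabled without being advertised, which is why the CPUID path only needs to consult the enable bit and the CPL.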

2017-03-20 15:24:06

by Kyle Huey

Subject: [PATCH v16 09/10] x86/arch_prctl: Selftest for ARCH_[GET|SET]_CPUID

Test disabling and reenabling the cpuid instruction via the new arch_prctl
ARCH_SET_CPUID, retrieving the current state via ARCH_GET_CPUID, and the
expected behaviors across fork() and exec().

Signed-off-by: Kyle Huey <[email protected]>
---
tools/testing/selftests/x86/Makefile | 2 +-
tools/testing/selftests/x86/cpuid_fault.c | 251 ++++++++++++++++++++++++++++++
2 files changed, 252 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/x86/cpuid_fault.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 38e0a9ca5d71..acda4e5fcf25 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -6,7 +6,7 @@ include ../lib.mk

TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \
check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test ioperm \
- protection_keys test_vdso
+ protection_keys test_vdso cpuid_fault
TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
test_FCMOV test_FCOMI test_FISTTP \
vdso_restorer
diff --git a/tools/testing/selftests/x86/cpuid_fault.c b/tools/testing/selftests/x86/cpuid_fault.c
new file mode 100644
index 000000000000..e3b93c28c655
--- /dev/null
+++ b/tools/testing/selftests/x86/cpuid_fault.c
@@ -0,0 +1,251 @@
+
+/*
+ * Tests for arch_prctl(ARCH_GET_CPUID, ...) / arch_prctl(ARCH_SET_CPUID, ...)
+ *
+ * Basic test to test behaviour of ARCH_GET_CPUID and ARCH_SET_CPUID
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <cpuid.h>
+#include <err.h>
+#include <errno.h>
+#include <sys/wait.h>
+
+#include <sys/prctl.h>
+#include <linux/prctl.h>
+
+/*
+#define ARCH_GET_CPUID 0x1005
+#define ARCH_SET_CPUID 0x1006
+#ifdef __x86_64__
+#define SYS_arch_prctl 158
+#else
+#define SYS_arch_prctl 384
+#endif
+*/
+
+const char *cpuid_names[] = {
+ [0] = "[cpuid disabled]",
+ [1] = "[cpuid enabled]",
+};
+
+int arch_prctl(int option, unsigned long arg2)
+{
+ return syscall(SYS_arch_prctl, option, arg2);
+}
+
+int cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
+ unsigned int *edx)
+{
+ return __get_cpuid(0, eax, ebx, ecx, edx);
+}
+
+int do_child_exec_test(int eax, int ebx, int ecx, int edx)
+{
+ int cpuid_val = 0, child = 0, status = 0;
+
+ printf("arch_prctl(ARCH_GET_CPUID); ");
+
+ cpuid_val = arch_prctl(ARCH_GET_CPUID, 0);
+ if (cpuid_val < 0)
+ errx(1, "ARCH_GET_CPUID fails now, but not before?");
+
+ printf("cpuid_val == %s\n", cpuid_names[cpuid_val]);
+ if (cpuid_val != 0)
+ errx(1, "How did cpuid get re-enabled on fork?");
+
+ child = fork();
+ if (child == 0) {
+ cpuid_val = arch_prctl(ARCH_GET_CPUID, 0);
+ if (cpuid_val < 0)
+ errx(1, "ARCH_GET_CPUID fails now, but not before?");
+
+ printf("cpuid_val == %s\n", cpuid_names[cpuid_val]);
+ if (cpuid_val != 0)
+ errx(1, "How did cpuid get re-enabled on fork?");
+
+ printf("exec\n");
+ execl("/proc/self/exe", "cpuid-fault", "-early-return", NULL);
+ }
+
+ if (child != waitpid(child, &status, 0))
+ errx(1, "waitpid failed!?");
+
+ if (WEXITSTATUS(status) != 0)
+ errx(1, "Execed child exited abnormally");
+
+ return 0;
+}
+
+int child_received_signal;
+
+void child_sigsegv_cb(int sig)
+{
+ int cpuid_val = 0;
+
+ child_received_signal = 1;
+ printf("[ SIG_SEGV ]\n");
+ printf("arch_prctl(ARCH_GET_CPUID); ");
+
+ cpuid_val = arch_prctl(ARCH_GET_CPUID, 0);
+ if (cpuid_val < 0)
+ errx(1, "ARCH_GET_CPUID fails now, but not before?");
+
+ printf("cpuid_val == %s\n", cpuid_names[cpuid_val]);
+ printf("arch_prctl(ARCH_SET_CPUID, 1)\n");
+ if (arch_prctl(ARCH_SET_CPUID, 1) != 0)
+ exit(errno);
+
+ printf("cpuid() == ");
+}
+
+int do_child_test(void)
+{
+ unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
+
+ signal(SIGSEGV, child_sigsegv_cb);
+
+ /* the child starts out with cpuid disabled, the signal handler
+ * attempts to enable and retry
+ */
+ printf("cpuid() == ");
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("{%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ return child_received_signal ? 0 : 42;
+}
+
+int signal_count;
+
+void sigsegv_cb(int sig)
+{
+ int cpuid_val = 0;
+
+ signal_count++;
+ printf("[ SIG_SEGV ]\n");
+ printf("arch_prctl(ARCH_GET_CPUID); ");
+
+ cpuid_val = arch_prctl(ARCH_GET_CPUID, 0);
+ if (cpuid_val < 0)
+ errx(1, "ARCH_GET_CPUID fails now, but not before?");
+
+ printf("cpuid_val == %s\n", cpuid_names[cpuid_val]);
+ printf("arch_prctl(ARCH_SET_CPUID, 1)\n");
+ if (arch_prctl(ARCH_SET_CPUID, 1) != 0)
+ errx(1, "ARCH_SET_CPUID failed!");
+
+ printf("cpuid() == ");
+}
+
+int main(int argc, char **argv)
+{
+ int cpuid_val = 0, child = 0, status = 0;
+ unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
+
+ signal(SIGSEGV, sigsegv_cb);
+ setvbuf(stdout, NULL, _IONBF, 0);
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("cpuid() == {%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ printf("arch_prctl(ARCH_GET_CPUID); ");
+
+ cpuid_val = arch_prctl(ARCH_GET_CPUID, 0);
+ if (cpuid_val < 0) {
+ if (errno == EINVAL) {
+ printf("ARCH_GET_CPUID is unsupported on this kernel.\n");
+ fflush(stdout);
+ exit(0); /* no ARCH_GET_CPUID on this system */
+ } else if (errno == ENODEV) {
+ printf("ARCH_GET_CPUID is unsupported on this hardware.\n");
+ fflush(stdout);
+ exit(0); /* no ARCH_GET_CPUID on this system */
+ } else {
+ errx(errno, "ARCH_GET_CPUID failed unexpectedly!");
+ }
+ }
+
+ printf("cpuid_val == %s\n", cpuid_names[cpuid_val]);
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("cpuid() == {%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ printf("arch_prctl(ARCH_SET_CPUID, 1)\n");
+
+ if (arch_prctl(ARCH_SET_CPUID, 1) != 0) {
+ if (errno == EINVAL) {
+ printf("ARCH_SET_CPUID is unsupported on this kernel.");
+ exit(0); /* no ARCH_SET_CPUID on this system */
+ } else if (errno == ENODEV) {
+ printf("ARCH_SET_CPUID is unsupported on this hardware.");
+ exit(0); /* no ARCH_SET_CPUID on this system */
+ } else {
+ errx(errno, "ARCH_SET_CPUID failed unexpectedly!");
+ }
+ }
+
+
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("cpuid() == {%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ printf("arch_prctl(ARCH_SET_CPUID, 0)\n");
+ fflush(stdout);
+
+ if (arch_prctl(ARCH_SET_CPUID, 0) == -1)
+ errx(1, "ARCH_SET_CPUID failed!");
+
+ printf("cpuid() == ");
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("{%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ printf("arch_prctl(ARCH_SET_CPUID, 0)\n");
+
+ if (signal_count != 1)
+ errx(1, "cpuid didn't fault!");
+
+ if (arch_prctl(ARCH_SET_CPUID, 0) == -1)
+ errx(1, "ARCH_SET_CPUID failed!");
+
+ if (argc > 1)
+ exit(0); /* Don't run the whole test again if we were execed */
+
+ printf("do_child_test\n");
+ child = fork();
+ if (child == 0)
+ return do_child_test();
+
+ if (child != waitpid(child, &status, 0))
+ errx(1, "waitpid failed!?");
+
+ if (WEXITSTATUS(status) != 0)
+ errx(1, "Child exited abnormally!");
+
+ /* The child enabling cpuid should not have affected us */
+ printf("cpuid() == ");
+ eax = ebx = ecx = edx = 0;
+ cpuid(&eax, &ebx, &ecx, &edx);
+ printf("{%x, %x, %x, %x}\n", eax, ebx, ecx, edx);
+ printf("arch_prctl(ARCH_SET_CPUID, 0)\n");
+
+ if (signal_count != 2)
+ errx(1, "cpuid didn't fault!");
+
+ if (arch_prctl(ARCH_SET_CPUID, 0) == -1)
+ errx(1, "ARCH_SET_CPUID failed!");
+
+ /* Our ARCH_CPUID_SIGSEGV should not propagate through exec */
+ printf("do_child_exec_test\n");
+ fflush(stdout);
+
+ child = fork();
+ if (child == 0)
+ return do_child_exec_test(eax, ebx, ecx, edx);
+
+ if (child != waitpid(child, &status, 0))
+ errx(1, "waitpid failed!?");
+
+ if (WEXITSTATUS(status) != 0)
+ errx(1, "Child exited abnormally!");
+
+ printf("All tests passed!\n");
+ exit(EXIT_SUCCESS);
+}
--
2.11.0
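
The selftest treats EINVAL as "kernel lacks the prctl" and ENODEV as "hardware lacks CPUID faulting", and skips in both cases. That decision can be pulled into a small pure helper, sketched below. The enum and function names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of an arch_prctl() result. */
enum cpuid_prctl_status {
	CPUID_PRCTL_OK,        /* call succeeded */
	CPUID_PRCTL_NO_KERNEL, /* EINVAL: option unknown to this kernel */
	CPUID_PRCTL_NO_HW,     /* ENODEV: CPU cannot fault on CPUID */
	CPUID_PRCTL_ERROR,     /* anything else is a real failure */
	CPUID_PRCTL_STATUS_COUNT
};

/* Map the syscall return value and the saved errno to a status. */
static enum cpuid_prctl_status classify_cpuid_prctl(long ret, int err)
{
	if (ret >= 0)
		return CPUID_PRCTL_OK;

	switch (err) {
	case EINVAL:
		return CPUID_PRCTL_NO_KERNEL;
	case ENODEV:
		return CPUID_PRCTL_NO_HW;
	default:
		return CPUID_PRCTL_ERROR;
	}
}

/* Human-readable label for a status, suitable for test output. */
static const char *cpuid_prctl_status_name(enum cpuid_prctl_status status)
{
	static const char *const names[CPUID_PRCTL_STATUS_COUNT] = {
		"ok",
		"unsupported by kernel",
		"unsupported by hardware",
		"unexpected error",
	};

	assert(status >= 0 && status < CPUID_PRCTL_STATUS_COUNT);
	return names[status];
}
```

A caller would save errno immediately after the syscall and pass both values in, so that intervening library calls cannot clobber the error code before it is inspected.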

2017-03-20 16:12:46

by Thomas Gleixner

Subject: Re: [PATCH v16 08/10] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

On Mon, 20 Mar 2017, Kyle Huey wrote:
> --- a/arch/x86/include/uapi/asm/prctl.h
> +++ b/arch/x86/include/uapi/asm/prctl.h
> @@ -6,8 +6,17 @@
> #define ARCH_GET_FS 0x1003
> #define ARCH_GET_GS 0x1004
>
> +#define ARCH_GET_CPUID 0x1005
> +#define ARCH_SET_CPUID 0x1006
> +
> #define ARCH_MAP_VDSO_X32 0x2001
> #define ARCH_MAP_VDSO_32 0x2002
> #define ARCH_MAP_VDSO_64 0x2003
>
> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +# define ARCH_MAP_VDSO_X32 0x2001
> +# define ARCH_MAP_VDSO_32 0x2002
> +# define ARCH_MAP_VDSO_64 0x2003
> +#endif

That hunk is bogus in two aspects:
- It's just a copy of the above wrapped in an ifdef

- The ifdef is broken, because the UAPI headers do not know about
that.

I dropped it.

Subject: [tip:x86/process] x86/arch_prctl: Rename 'code' argument to 'option'

Commit-ID: dd93938a92dc067aba70c401bdf2e50ed58083db
Gitweb: http://git.kernel.org/tip/dd93938a92dc067aba70c401bdf2e50ed58083db
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:20 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:32 +0100

x86/arch_prctl: Rename 'code' argument to 'option'

The x86 specific arch_prctl() arbitrarily changed prctl's 'option' to
'code'. Before adding new options, rename it.

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/um/include/shared/os.h | 4 ++--
arch/x86/include/asm/proto.h | 2 +-
arch/x86/kernel/process_64.c | 8 ++++----
arch/x86/um/asm/ptrace.h | 2 +-
arch/x86/um/os-Linux/prctl.c | 4 ++--
arch/x86/um/syscalls_64.c | 13 +++++++------
6 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index de5d572..32e41c4 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -302,8 +302,8 @@ extern int ignore_sigio_fd(int fd);
extern void maybe_sigio_broken(int fd, int read);
extern void sigio_broken(int fd, int read);

-/* sys-x86_64/prctl.c */
-extern int os_arch_prctl(int pid, int code, unsigned long *addr);
+/* prctl.c */
+extern int os_arch_prctl(int pid, int option, unsigned long *addr);

/* tty.c */
extern int get_pty(void);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 9b9b30b..91675a9 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -30,6 +30,6 @@ void x86_report_nx(void);

extern int reboot_force;

-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr);
+long do_arch_prctl(struct task_struct *task, int option, unsigned long addr);

#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d6b784a..4377cfe 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -547,13 +547,13 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
}
#endif

-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
+long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
{
int ret = 0;
int doit = task == current;
int cpu;

- switch (code) {
+ switch (option) {
case ARCH_SET_GS:
if (addr >= TASK_SIZE_MAX)
return -EPERM;
@@ -621,9 +621,9 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
return ret;
}

-long sys_arch_prctl(int code, unsigned long addr)
+long sys_arch_prctl(int option, unsigned long addr)
{
- return do_arch_prctl(current, code, addr);
+ return do_arch_prctl(current, option, addr);
}

unsigned long KSTK_ESP(struct task_struct *task)
diff --git a/arch/x86/um/asm/ptrace.h b/arch/x86/um/asm/ptrace.h
index e59eef2..b291ca5 100644
--- a/arch/x86/um/asm/ptrace.h
+++ b/arch/x86/um/asm/ptrace.h
@@ -78,7 +78,7 @@ static inline int ptrace_set_thread_area(struct task_struct *child, int idx,
return -ENOSYS;
}

-extern long arch_prctl(struct task_struct *task, int code,
+extern long arch_prctl(struct task_struct *task, int option,
unsigned long __user *addr);

#endif
diff --git a/arch/x86/um/os-Linux/prctl.c b/arch/x86/um/os-Linux/prctl.c
index 96eb2bd..0a6e16a 100644
--- a/arch/x86/um/os-Linux/prctl.c
+++ b/arch/x86/um/os-Linux/prctl.c
@@ -6,7 +6,7 @@
#include <sys/ptrace.h>
#include <asm/ptrace.h>

-int os_arch_prctl(int pid, int code, unsigned long *addr)
+int os_arch_prctl(int pid, int option, unsigned long *addr)
{
- return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, code);
+ return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, option);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 10d9070..3c2dd87 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -11,7 +11,8 @@
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>

-long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
+long arch_prctl(struct task_struct *task, int option,
+ unsigned long __user *addr)
{
unsigned long *ptr = addr, tmp;
long ret;
@@ -30,7 +31,7 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
* arch_prctl is run on the host, then the registers are read
* back.
*/
- switch (code) {
+ switch (option) {
case ARCH_SET_FS:
case ARCH_SET_GS:
ret = restore_registers(pid, &current->thread.regs.regs);
@@ -50,11 +51,11 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
ptr = &tmp;
}

- ret = os_arch_prctl(pid, code, ptr);
+ ret = os_arch_prctl(pid, option, ptr);
if (ret)
return ret;

- switch (code) {
+ switch (option) {
case ARCH_SET_FS:
current->thread.arch.fs = (unsigned long) ptr;
ret = save_registers(pid, &current->thread.regs.regs);
@@ -73,9 +74,9 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
return ret;
}

-long sys_arch_prctl(int code, unsigned long addr)
+long sys_arch_prctl(int option, unsigned long addr)
{
- return arch_prctl(current, code, (unsigned long __user *) addr);
+ return arch_prctl(current, option, (unsigned long __user *) addr);
}

void arch_switch_to(struct task_struct *to)

Subject: [tip:x86/process] x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64()

Commit-ID: 17a6e1b8e8e8539f89156643f8c3073f09ec446a
Gitweb: http://git.kernel.org/tip/17a6e1b8e8e8539f89156643f8c3073f09ec446a
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:22 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:32 +0100

x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64()

In order to introduce new arch_prctls that are not 64 bit only, rename the
existing 64 bit implementation to do_arch_prctl_64(). Also rename the
second argument of that function from 'addr' to 'arg2', because it will no
longer always be an address.

Signed-off-by: Kyle Huey <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/um/include/shared/os.h | 2 +-
arch/x86/include/asm/proto.h | 3 +--
arch/x86/kernel/process_64.c | 32 +++++++++++++++++---------------
arch/x86/kernel/ptrace.c | 8 ++++----
arch/x86/um/os-Linux/prctl.c | 4 ++--
arch/x86/um/syscalls_64.c | 14 +++++++-------
6 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index 32e41c4..cd1fa97 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -303,7 +303,7 @@ extern void maybe_sigio_broken(int fd, int read);
extern void sigio_broken(int fd, int read);

/* prctl.c */
-extern int os_arch_prctl(int pid, int option, unsigned long *addr);
+extern int os_arch_prctl(int pid, int option, unsigned long *arg2);

/* tty.c */
extern int get_pty(void);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 91675a9..4e276f6 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -9,6 +9,7 @@ void syscall_init(void);

#ifdef CONFIG_X86_64
void entry_SYSCALL_64(void);
+long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2);
#endif

#ifdef CONFIG_X86_32
@@ -30,6 +31,4 @@ void x86_report_nx(void);

extern int reboot_force;

-long do_arch_prctl(struct task_struct *task, int option, unsigned long addr);
-
#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index bf9d7b6..e37f764 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -205,7 +205,7 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
(struct user_desc __user *)tls, 0);
else
#endif
- err = do_arch_prctl(p, ARCH_SET_FS, tls);
+ err = do_arch_prctl_64(p, ARCH_SET_FS, tls);
if (err)
goto out;
}
@@ -548,7 +548,7 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
}
#endif

-long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
+long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
{
int ret = 0;
int doit = task == current;
@@ -556,62 +556,64 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)

switch (option) {
case ARCH_SET_GS:
- if (addr >= TASK_SIZE_MAX)
+ if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.gsindex = 0;
- task->thread.gsbase = addr;
+ task->thread.gsbase = arg2;
if (doit) {
load_gs_index(0);
- ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, addr);
+ ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, arg2);
}
put_cpu();
break;
case ARCH_SET_FS:
/* Not strictly needed for fs, but do it for symmetry
with gs */
- if (addr >= TASK_SIZE_MAX)
+ if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.fsindex = 0;
- task->thread.fsbase = addr;
+ task->thread.fsbase = arg2;
if (doit) {
/* set the selector to 0 to not confuse __switch_to */
loadsegment(fs, 0);
- ret = wrmsrl_safe(MSR_FS_BASE, addr);
+ ret = wrmsrl_safe(MSR_FS_BASE, arg2);
}
put_cpu();
break;
case ARCH_GET_FS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_FS_BASE, base);
else
base = task->thread.fsbase;
- ret = put_user(base, (unsigned long __user *)addr);
+ ret = put_user(base, (unsigned long __user *)arg2);
break;
}
case ARCH_GET_GS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_KERNEL_GS_BASE, base);
else
base = task->thread.gsbase;
- ret = put_user(base, (unsigned long __user *)addr);
+ ret = put_user(base, (unsigned long __user *)arg2);
break;
}

#ifdef CONFIG_CHECKPOINT_RESTORE
# ifdef CONFIG_X86_X32_ABI
case ARCH_MAP_VDSO_X32:
- return prctl_map_vdso(&vdso_image_x32, addr);
+ return prctl_map_vdso(&vdso_image_x32, arg2);
# endif
# if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
case ARCH_MAP_VDSO_32:
- return prctl_map_vdso(&vdso_image_32, addr);
+ return prctl_map_vdso(&vdso_image_32, arg2);
# endif
case ARCH_MAP_VDSO_64:
- return prctl_map_vdso(&vdso_image_64, addr);
+ return prctl_map_vdso(&vdso_image_64, arg2);
#endif

default:
@@ -622,9 +624,9 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
return ret;
}

-SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return do_arch_prctl(current, option, addr);
+ return do_arch_prctl_64(current, option, arg2);
}

unsigned long KSTK_ESP(struct task_struct *task)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 2364b23..f37d181 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -396,12 +396,12 @@ static int putreg(struct task_struct *child,
if (value >= TASK_SIZE_MAX)
return -EIO;
/*
- * When changing the segment base, use do_arch_prctl
+ * When changing the segment base, use do_arch_prctl_64
* to set either thread.fs or thread.fsindex and the
* corresponding GDT slot.
*/
if (child->thread.fsbase != value)
- return do_arch_prctl(child, ARCH_SET_FS, value);
+ return do_arch_prctl_64(child, ARCH_SET_FS, value);
return 0;
case offsetof(struct user_regs_struct,gs_base):
/*
@@ -410,7 +410,7 @@ static int putreg(struct task_struct *child,
if (value >= TASK_SIZE_MAX)
return -EIO;
if (child->thread.gsbase != value)
- return do_arch_prctl(child, ARCH_SET_GS, value);
+ return do_arch_prctl_64(child, ARCH_SET_GS, value);
return 0;
#endif
}
@@ -869,7 +869,7 @@ long arch_ptrace(struct task_struct *child, long request,
Works just like arch_prctl, except that the arguments
are reversed. */
case PTRACE_ARCH_PRCTL:
- ret = do_arch_prctl(child, data, addr);
+ ret = do_arch_prctl_64(child, data, addr);
break;
#endif

diff --git a/arch/x86/um/os-Linux/prctl.c b/arch/x86/um/os-Linux/prctl.c
index 0a6e16a..8431e87 100644
--- a/arch/x86/um/os-Linux/prctl.c
+++ b/arch/x86/um/os-Linux/prctl.c
@@ -6,7 +6,7 @@
#include <sys/ptrace.h>
#include <asm/ptrace.h>

-int os_arch_prctl(int pid, int option, unsigned long *addr)
+int os_arch_prctl(int pid, int option, unsigned long *arg2)
{
- return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, option);
+ return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) arg2, option);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 42369fa..81b9fe1 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -12,10 +12,10 @@
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>

-long arch_prctl(struct task_struct *task, int option,
- unsigned long __user *addr)
+long arch_prctl(struct task_struct *task, int option,
+ unsigned long __user *arg2)
{
- unsigned long *ptr = addr, tmp;
+ unsigned long *ptr = arg2, tmp;
long ret;
int pid = task->mm->context.id.u.pid;

@@ -65,19 +65,19 @@ long arch_prctl(struct task_struct *task, int option
ret = save_registers(pid, &current->thread.regs.regs);
break;
case ARCH_GET_FS:
- ret = put_user(tmp, addr);
+ ret = put_user(tmp, arg2);
break;
case ARCH_GET_GS:
- ret = put_user(tmp, addr);
+ ret = put_user(tmp, arg2);
break;
}

return ret;
}

-SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return arch_prctl(current, option, (unsigned long __user *) addr);
+ return arch_prctl(current, option, (unsigned long __user *) arg2);
}

void arch_switch_to(struct task_struct *to)

Subject: [tip:x86/process] x86/arch_prctl: Add do_arch_prctl_common()

Commit-ID: b0b9b014016d16ca7a192da986aa8ebae21bb995
Gitweb: http://git.kernel.org/tip/b0b9b014016d16ca7a192da986aa8ebae21bb995
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:23 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:33 +0100

x86/arch_prctl: Add do_arch_prctl_common()

Add do_arch_prctl_common() to handle arch_prctls that are not specific to 64
bit mode. Call it from the syscall entry point, but not any of the other
callsites in the kernel, which all want one of the existing 64 bit only
arch_prctls.

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/proto.h | 3 +++
arch/x86/kernel/process.c | 6 ++++++
arch/x86/kernel/process_64.c | 8 +++++++-
3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 4e276f6..8d3964f 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -31,4 +31,7 @@ void x86_report_nx(void);

extern int reboot_force;

+long do_arch_prctl_common(struct task_struct *task, int option,
+ unsigned long cpuid_enabled);
+
#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 366db77..b12e95e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -545,3 +545,9 @@ out:
put_task_stack(p);
return ret;
}
+
+long do_arch_prctl_common(struct task_struct *task, int option,
+ unsigned long cpuid_enabled)
+{
+ return -EINVAL;
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e37f764..d81b0a6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -626,7 +626,13 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)

SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return do_arch_prctl_64(current, option, arg2);
+ long ret;
+
+ ret = do_arch_prctl_64(current, option, arg2);
+ if (ret == -EINVAL)
+ ret = do_arch_prctl_common(current, option, arg2);
+
+ return ret;
}

unsigned long KSTK_ESP(struct task_struct *task)
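
The fallback in the syscall entry point above can be sketched in plain userspace C. This is only an illustration of the dispatch pattern, not kernel code: handle_64(), handle_common(), dispatch() and the OPT_* values are hypothetical stand-ins for do_arch_prctl_64(), do_arch_prctl_common(), the syscall body and real option numbers.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical option numbers, used by this sketch only. */
enum { OPT_64_ONLY = 1, OPT_COMMON = 2 };

/* Stand-in for do_arch_prctl_64(): recognises only the 64-bit-only option. */
static long handle_64(int option, unsigned long arg2)
{
	(void)arg2;
	if (option == OPT_64_ONLY)
		return 0;
	return -EINVAL;
}

/* Stand-in for do_arch_prctl_common(): recognises only the shared option
 * and, for the sake of the example, echoes its argument back. */
static long handle_common(int option, unsigned long arg2)
{
	if (option == OPT_COMMON)
		return (long)arg2;
	return -EINVAL;
}

/* Mirrors the syscall body: try the 64-bit handler first and consult the
 * common handler only if the first one reported -EINVAL. */
static long dispatch(int option, unsigned long arg2)
{
	long ret = handle_64(option, arg2);

	if (ret == -EINVAL)
		ret = handle_common(option, arg2);

	return ret;
}
```

One property of this pattern worth keeping in mind: any -EINVAL from the first handler, not just "unknown option", causes the second handler to be tried.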

Subject: [tip:x86/process] x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES

Commit-ID: ab6d9468631a6e56e4c071c6ce6710956485fe08
Gitweb: http://git.kernel.org/tip/ab6d9468631a6e56e4c071c6ce6710956485fe08
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:19 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:32 +0100

x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES

This matches the only public Intel documentation of this MSR, in the
"Virtualization Technology FlexMigration Application Note"
(preserved at https://bugzilla.kernel.org/attachment.cgi?id=243991)

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/msr-index.h | 6 +++---
arch/x86/kernel/cpu/intel.c | 8 ++++----
2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4c928f3..f429b70 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -553,10 +553,10 @@
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT 39
#define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE (1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)

-/* MISC_FEATURE_ENABLES non-architectural features */
-#define MSR_MISC_FEATURE_ENABLES 0x00000140
+/* MISC_FEATURES_ENABLES non-architectural features */
+#define MSR_MISC_FEATURES_ENABLES 0x00000140

-#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT 1
+#define MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT 1

#define MSR_IA32_TSC_DEADLINE 0x000006E0

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 0631977..e229318 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -91,13 +91,13 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
}

if (ring3mwait_disabled) {
- msr_clear_bit(MSR_MISC_FEATURE_ENABLES,
- MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT);
+ msr_clear_bit(MSR_MISC_FEATURES_ENABLES,
+ MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);
return;
}

- msr_set_bit(MSR_MISC_FEATURE_ENABLES,
- MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT);
+ msr_set_bit(MSR_MISC_FEATURES_ENABLES,
+ MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

set_cpu_cap(c, X86_FEATURE_RING3MWAIT);


Subject: [tip:x86/process] x86/syscalls/32: Wire up arch_prctl on x86-32

Commit-ID: 79170fda313ed5be2394f87aa2a00d597f8ed4a1
Gitweb: http://git.kernel.org/tip/79170fda313ed5be2394f87aa2a00d597f8ed4a1
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:24 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:33 +0100

x86/syscalls/32: Wire up arch_prctl on x86-32

Hook up arch_prctl to call do_arch_prctl_common() on x86-32, and in 32 bit compat
mode on x86-64. This makes it possible to have arch_prctls that are not specific
to 64 bits.

On UML, simply stub out this syscall.

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/kernel/process_32.c | 7 +++++++
arch/x86/kernel/process_64.c | 7 +++++++
arch/x86/um/Makefile | 2 +-
arch/x86/um/syscalls_32.c | 7 +++++++
include/linux/compat.h | 2 ++
6 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 9ba050f..0af59fa 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -390,3 +390,4 @@
381 i386 pkey_alloc sys_pkey_alloc
382 i386 pkey_free sys_pkey_free
383 i386 statx sys_statx
+384 i386 arch_prctl sys_arch_prctl compat_sys_arch_prctl
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4c818f8..ff40e74 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/kdebug.h>
+#include <linux/syscalls.h>

#include <asm/pgtable.h>
#include <asm/ldt.h>
@@ -56,6 +57,7 @@
#include <asm/switch_to.h>
#include <asm/vm86.h>
#include <asm/intel_rdt.h>
+#include <asm/proto.h>

void __show_regs(struct pt_regs *regs, int all)
{
@@ -304,3 +306,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)

return prev_p;
}
+
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return do_arch_prctl_common(current, option, arg2);
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d81b0a6..ea1a618 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -635,6 +635,13 @@ SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
return ret;
}

+#ifdef CONFIG_IA32_EMULATION
+COMPAT_SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return do_arch_prctl_common(current, option, arg2);
+}
+#endif
+
unsigned long KSTK_ESP(struct task_struct *task)
{
return task_pt_regs(task)->sp;
diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index e7e7055..69f0827 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -16,7 +16,7 @@ obj-y = bug.o bugs_$(BITS).o delay.o fault.o ldt.o \

ifeq ($(CONFIG_X86_32),y)

-obj-y += checksum_32.o
+obj-y += checksum_32.o syscalls_32.o
obj-$(CONFIG_ELF_CORE) += elfcore.o

subarch-y = ../lib/string_32.o ../lib/atomic64_32.o ../lib/atomic64_cx8_32.o
diff --git a/arch/x86/um/syscalls_32.c b/arch/x86/um/syscalls_32.c
new file mode 100644
index 0000000..627d688
--- /dev/null
+++ b/arch/x86/um/syscalls_32.c
@@ -0,0 +1,7 @@
+#include <linux/syscalls.h>
+#include <os.h>
+
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return -EINVAL;
+}
diff --git a/include/linux/compat.h b/include/linux/compat.h
index aef47be..af9dbc4 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -723,6 +723,8 @@ asmlinkage long compat_sys_sched_rr_get_interval(compat_pid_t pid,
asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
int, const char __user *);

+asmlinkage long compat_sys_arch_prctl(int option, unsigned long arg2);
+
/*
* For most but not all architectures, "am I in a compat syscall?" and
* "am I a compat task?" are the same question. For architectures on which

Subject: [tip:x86/process] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl()

Commit-ID: ff3f097eef30151f5ee250859e0fe8a0ec02c160
Gitweb: http://git.kernel.org/tip/ff3f097eef30151f5ee250859e0fe8a0ec02c160
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:21 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:32 +0100

x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl()

Use the SYSCALL_DEFINE2 macro instead of manually defining it.

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/kernel/process_64.c | 3 ++-
arch/x86/um/syscalls_64.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4377cfe..bf9d7b6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/ftrace.h>
+#include <linux/syscalls.h>

#include <asm/pgtable.h>
#include <asm/processor.h>
@@ -621,7 +622,7 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
return ret;
}

-long sys_arch_prctl(int option, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
{
return do_arch_prctl(current, option, addr);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 3c2dd87..42369fa 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -7,6 +7,7 @@

#include <linux/sched.h>
#include <linux/sched/mm.h>
+#include <linux/syscalls.h>
#include <linux/uaccess.h>
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>
@@ -74,7 +75,7 @@ long arch_prctl(struct task_struct *task, int option
return ret;
}

-long sys_arch_prctl(int option, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
{
return arch_prctl(current, option, (unsigned long __user *) addr);
}

Subject: [tip:x86/process] x86/cpufeature: Detect CPUID faulting support

Commit-ID: 90218ac77d0582eaf2d0872d8d900cbd5bf1f205
Gitweb: http://git.kernel.org/tip/90218ac77d0582eaf2d0872d8d900cbd5bf1f205
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:25 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:34 +0100

x86/cpufeature: Detect CPUID faulting support

Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. This will allow a ptracer to emulate the CPUID
instruction.

Bit 31 of MSR_PLATFORM_INFO advertises support for this feature. It is
documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991

Detect support for this feature and expose it as X86_FEATURE_CPUID_FAULT.

Signed-off-by: Kyle Huey <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/intel.c | 24 +++++++++++++++++++++++-
3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b04bb6d..0fe0044 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -187,6 +187,7 @@
* Reuse free bits when adding new feature flags!
*/
#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
+#define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */
#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f429b70..b1f75da 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -45,6 +45,8 @@
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
#define MSR_PLATFORM_INFO 0x000000ce
+#define MSR_PLATFORM_INFO_CPUID_FAULT_BIT 31
+#define MSR_PLATFORM_INFO_CPUID_FAULT BIT_ULL(MSR_PLATFORM_INFO_CPUID_FAULT_BIT)

#define MSR_PKG_CST_CONFIG_CONTROL 0x000000e2
#define NHM_C3_AUTO_DEMOTE (1UL << 25)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index e229318..a07f829 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -488,6 +488,28 @@ static void intel_bsp_resume(struct cpuinfo_x86 *c)
init_intel_energy_perf(c);
}

+static void init_cpuid_fault(struct cpuinfo_x86 *c)
+{
+ u64 msr;
+
+ if (!rdmsrl_safe(MSR_PLATFORM_INFO, &msr)) {
+ if (msr & MSR_PLATFORM_INFO_CPUID_FAULT)
+ set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
+ }
+}
+
+static void init_intel_misc_features(struct cpuinfo_x86 *c)
+{
+ u64 msr;
+
+ if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
+ return;
+
+ /* Check features and update capabilities */
+ init_cpuid_fault(c);
+ probe_xeon_phi_r3mwait(c);
+}
+
static void init_intel(struct cpuinfo_x86 *c)
{
unsigned int l2 = 0;
@@ -602,7 +624,7 @@ static void init_intel(struct cpuinfo_x86 *c)

init_intel_energy_perf(c);

- probe_xeon_phi_r3mwait(c);
+ init_intel_misc_features(c);
}

#ifdef CONFIG_X86_32
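
The detection step in init_cpuid_fault() boils down to testing one bit of the value read from MSR_PLATFORM_INFO. A minimal userspace sketch of that test follows. The function name and the PLATFORM_INFO_* macros are made up for the example, and the MSR value is passed in as a plain integer rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position taken from the patch above; the macro names are local to
 * this sketch. */
#define PLATFORM_INFO_CPUID_FAULT_BIT	31
#define PLATFORM_INFO_CPUID_FAULT \
	(UINT64_C(1) << PLATFORM_INFO_CPUID_FAULT_BIT)

/* The mask must be exactly bit 31 of a 64-bit value. */
static_assert(PLATFORM_INFO_CPUID_FAULT == UINT64_C(0x80000000),
	      "CPUID fault mask should be bit 31");

/* Returns true if the given MSR_PLATFORM_INFO value advertises CPUID
 * faulting. */
static bool platform_info_has_cpuid_fault(uint64_t platform_info)
{
	return (platform_info & PLATFORM_INFO_CPUID_FAULT) != 0;
}
```

The mask is built from a 64-bit unsigned constant, as BIT_ULL() does in the patch, because shifting a plain int 1 left by 31 overflows a 32-bit signed int.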

Subject: [tip:x86/process] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

Commit-ID: e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
Gitweb: http://git.kernel.org/tip/e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
Author: Kyle Huey <[email protected]>
AuthorDate: Mon, 20 Mar 2017 01:16:26 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Mar 2017 16:10:34 +0100

x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. Exposing this feature to userspace will allow a
ptracer to trap and emulate the CPUID instruction.

When supported, this feature is controlled by toggling bit 0 of
MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991

Implement a new pair of arch_prctls, available on both x86-32 and x86-64.

ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting
is enabled (and thus the CPUID instruction is not available) or 1 if
CPUID faulting is not enabled.

ARCH_SET_CPUID: Set the CPUID state to the second argument. If
cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will
be deactivated. Returns ENODEV if CPUID faulting is not supported on
this system.

The state of the CPUID faulting flag is propagated across forks, but reset
upon exec.

Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/include/asm/processor.h | 2 +
arch/x86/include/asm/thread_info.h | 6 ++-
arch/x86/include/uapi/asm/prctl.h | 11 ++++--
arch/x86/kernel/cpu/intel.c | 18 +++++----
arch/x86/kernel/process.c | 78 ++++++++++++++++++++++++++++++++++++++
fs/exec.c | 1 +
include/linux/thread_info.h | 4 ++
8 files changed, 109 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b1f75da..673f9ac 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -558,6 +558,8 @@
/* MISC_FEATURES_ENABLES non-architectural features */
#define MSR_MISC_FEATURES_ENABLES 0x00000140

+#define MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT 0
+#define MSR_MISC_FEATURES_ENABLES_CPUID_FAULT BIT_ULL(MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT)
#define MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT 1

#define MSR_IA32_TSC_DEADLINE 0x000006E0
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca..a80c1b3 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -884,6 +884,8 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
extern int get_tsc_mode(unsigned long adr);
extern int set_tsc_mode(unsigned int val);

+DECLARE_PER_CPU(u64, msr_misc_features_shadow);
+
/* Register/unregister a process' MPX related resource */
#define MPX_ENABLE_MANAGEMENT() mpx_enable_management()
#define MPX_DISABLE_MANAGEMENT() mpx_disable_management()
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index ad6f5eb0..9fc44b9 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,6 +87,7 @@ struct thread_info {
#define TIF_SECCOMP 8 /* secure computing */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
+#define TIF_NOCPUID 15 /* CPUID is not accessible in userland */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_NOHZ 19 /* in adaptive nohz mode */
@@ -110,6 +111,7 @@ struct thread_info {
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
+#define _TIF_NOCPUID (1 << TIF_NOCPUID)
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_NOHZ (1 << TIF_NOHZ)
@@ -138,7 +140,7 @@ struct thread_info {

/* flags to check in __switch_to() */
#define _TIF_WORK_CTXSW \
- (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP)
+ (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP)

#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
@@ -239,6 +241,8 @@ static inline int arch_within_stack_frames(const void * const stack,
extern void arch_task_cache_init(void);
extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
extern void arch_release_task_struct(struct task_struct *tsk);
+extern void arch_setup_new_exec(void);
+#define arch_setup_new_exec arch_setup_new_exec
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 835aa51..c457655 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -1,10 +1,13 @@
#ifndef _ASM_X86_PRCTL_H
#define _ASM_X86_PRCTL_H

-#define ARCH_SET_GS 0x1001
-#define ARCH_SET_FS 0x1002
-#define ARCH_GET_FS 0x1003
-#define ARCH_GET_GS 0x1004
+#define ARCH_SET_GS 0x1001
+#define ARCH_SET_FS 0x1002
+#define ARCH_GET_FS 0x1003
+#define ARCH_GET_GS 0x1004
+
+#define ARCH_GET_CPUID 0x1011
+#define ARCH_SET_CPUID 0x1012

#define ARCH_MAP_VDSO_X32 0x2001
#define ARCH_MAP_VDSO_32 0x2002
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index a07f829..dfa90a3 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -90,16 +90,12 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
return;
}

- if (ring3mwait_disabled) {
- msr_clear_bit(MSR_MISC_FEATURES_ENABLES,
- MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);
+ if (ring3mwait_disabled)
return;
- }
-
- msr_set_bit(MSR_MISC_FEATURES_ENABLES,
- MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

set_cpu_cap(c, X86_FEATURE_RING3MWAIT);
+ this_cpu_or(msr_misc_features_shadow,
+ 1UL << MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

if (c == &boot_cpu_data)
ELF_HWCAP2 |= HWCAP2_RING3MWAIT;
@@ -505,9 +501,15 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
return;

- /* Check features and update capabilities */
+ /* Clear all MISC features */
+ this_cpu_write(msr_misc_features_shadow, 0);
+
+ /* Check features and update capabilities and shadow control bits */
init_cpuid_fault(c);
probe_xeon_phi_r3mwait(c);
+
+ msr = this_cpu_read(msr_misc_features_shadow);
+ wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
}

static void init_intel(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b12e95e..0bb8842 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,6 +37,7 @@
#include <asm/vm86.h>
#include <asm/switch_to.h>
#include <asm/desc.h>
+#include <asm/prctl.h>

/*
* per-CPU TSS segments. Threads are completely 'soft' on Linux,
@@ -172,6 +173,73 @@ int set_tsc_mode(unsigned int val)
return 0;
}

+DEFINE_PER_CPU(u64, msr_misc_features_shadow);
+
+static void set_cpuid_faulting(bool on)
+{
+ u64 msrval;
+
+ msrval = this_cpu_read(msr_misc_features_shadow);
+ msrval &= ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
+ msrval |= (on << MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT);
+ this_cpu_write(msr_misc_features_shadow, msrval);
+ wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
+}
+
+static void disable_cpuid(void)
+{
+ preempt_disable();
+ if (!test_and_set_thread_flag(TIF_NOCPUID)) {
+ /*
+ * Must flip the CPU state synchronously with
+ * TIF_NOCPUID in the current running context.
+ */
+ set_cpuid_faulting(true);
+ }
+ preempt_enable();
+}
+
+static void enable_cpuid(void)
+{
+ preempt_disable();
+ if (test_and_clear_thread_flag(TIF_NOCPUID)) {
+ /*
+ * Must flip the CPU state synchronously with
+ * TIF_NOCPUID in the current running context.
+ */
+ set_cpuid_faulting(false);
+ }
+ preempt_enable();
+}
+
+static int get_cpuid_mode(void)
+{
+ return !test_thread_flag(TIF_NOCPUID);
+}
+
+static int set_cpuid_mode(struct task_struct *task, unsigned long cpuid_enabled)
+{
+ if (!static_cpu_has(X86_FEATURE_CPUID_FAULT))
+ return -ENODEV;
+
+ if (cpuid_enabled)
+ enable_cpuid();
+ else
+ disable_cpuid();
+
+ return 0;
+}
+
+/*
+ * Called immediately after a successful exec.
+ */
+void arch_setup_new_exec(void)
+{
+ /* If cpuid was previously disabled for this task, re-enable it. */
+ if (test_thread_flag(TIF_NOCPUID))
+ enable_cpuid();
+}
+
static inline void switch_to_bitmap(struct tss_struct *tss,
struct thread_struct *prev,
struct thread_struct *next,
@@ -225,6 +293,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,

if ((tifp ^ tifn) & _TIF_NOTSC)
cr4_toggle_bits(X86_CR4_TSD);
+
+ if ((tifp ^ tifn) & _TIF_NOCPUID)
+ set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
}

/*
@@ -549,5 +620,12 @@ out:
long do_arch_prctl_common(struct task_struct *task, int option,
unsigned long cpuid_enabled)
{
+ switch (option) {
+ case ARCH_GET_CPUID:
+ return get_cpuid_mode();
+ case ARCH_SET_CPUID:
+ return set_cpuid_mode(task, cpuid_enabled);
+ }
+
return -EINVAL;
}
diff --git a/fs/exec.c b/fs/exec.c
index 65145a3..72934df 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1320,6 +1320,7 @@ void setup_new_exec(struct linux_binprm * bprm)
else
set_dumpable(current->mm, suid_dumpable);

+ arch_setup_new_exec();
perf_event_exec();
__set_task_comm(current, kbasename(bprm->filename), true);

diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 5837387..55125d6 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -101,6 +101,10 @@ static inline void check_object_size(const void *ptr, unsigned long n,
{ }
#endif /* CONFIG_HARDENED_USERCOPY */

+#ifndef arch_setup_new_exec
+static inline void arch_setup_new_exec(void) { }
+#endif
+
#endif /* __KERNEL__ */

#endif /* _LINUX_THREAD_INFO_H */
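
From userspace, the new arch_prctls would be reached through the raw syscall. The sketch below assumes the option values from the uapi hunk in this commit (0x1011 and 0x1012); cpuid_query() and cpuid_control() are hypothetical helper names, not part of any library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Option values as in the uapi prctl.h hunk of this commit, defined
 * locally because installed headers may not carry them. */
#ifndef ARCH_GET_CPUID
#define ARCH_GET_CPUID 0x1011
#endif
#ifndef ARCH_SET_CPUID
#define ARCH_SET_CPUID 0x1012
#endif

/*
 * Hypothetical helper: returns 1 if CPUID executes normally, 0 if CPUID
 * faulting is active, or -1 with errno set (for example EINVAL on a
 * kernel that does not know the option).
 */
static long cpuid_query(void)
{
#ifdef SYS_arch_prctl
	return syscall(SYS_arch_prctl, ARCH_GET_CPUID, 0UL);
#else
	errno = ENOSYS;	/* no arch_prctl syscall on this architecture */
	return -1;
#endif
}

/*
 * Hypothetical helper: enabled == 0 turns CPUID faulting on for the
 * calling thread, any other value turns it off. Returns 0 on success or
 * -1 with errno set (ENODEV if the CPU lacks CPUID faulting).
 */
static long cpuid_control(unsigned long enabled)
{
#ifdef SYS_arch_prctl
	return syscall(SYS_arch_prctl, ARCH_SET_CPUID, enabled);
#else
	(void)enabled;
	errno = ENOSYS;
	return -1;
#endif
}
```

A tracer would presumably call cpuid_control(0) in the tracee and then catch the signal raised when the tracee executes CPUID. Per the commit message, the setting survives fork but is reset on exec, so it has to be reapplied after each exec.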

2017-03-20 16:48:42

by Kyle Huey

Subject: [PATCH v16 08/10] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. Exposing this feature to userspace will allow a
ptracer to trap and emulate the CPUID instruction.

When supported, this feature is controlled by toggling bit 0 of
MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991

Implement a new pair of arch_prctls, available on both x86-32 and x86-64.

ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting
is enabled (and thus the CPUID instruction is not available) or 1 if
CPUID faulting is not enabled.

ARCH_SET_CPUID: Set the CPUID state to the second argument. If
cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will
be deactivated. Returns ENODEV if CPUID faulting is not supported on
this system.

The state of the CPUID faulting flag is propagated across forks, but reset
upon exec.

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/include/asm/processor.h | 2 +
arch/x86/include/asm/thread_info.h | 6 ++-
arch/x86/include/uapi/asm/prctl.h | 9 +++++
arch/x86/kernel/cpu/intel.c | 18 +++++----
arch/x86/kernel/process.c | 78 ++++++++++++++++++++++++++++++++++++++
fs/exec.c | 1 +
include/linux/thread_info.h | 4 ++
8 files changed, 111 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b1f75daca34b..673f9ac50f6d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -558,6 +558,8 @@
/* MISC_FEATURES_ENABLES non-architectural features */
#define MSR_MISC_FEATURES_ENABLES 0x00000140

+#define MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT 0
+#define MSR_MISC_FEATURES_ENABLES_CPUID_FAULT BIT_ULL(MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT)
#define MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT 1

#define MSR_IA32_TSC_DEADLINE 0x000006E0
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca5407a..a80c1b3997ed 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -884,6 +884,8 @@ extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
extern int get_tsc_mode(unsigned long adr);
extern int set_tsc_mode(unsigned int val);

+DECLARE_PER_CPU(u64, msr_misc_features_shadow);
+
/* Register/unregister a process' MPX related resource */
#define MPX_ENABLE_MANAGEMENT() mpx_enable_management()
#define MPX_DISABLE_MANAGEMENT() mpx_disable_management()
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index ad6f5eb07a95..9fc44b95f7cb 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,6 +87,7 @@ struct thread_info {
#define TIF_SECCOMP 8 /* secure computing */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
+#define TIF_NOCPUID 15 /* CPUID is not accessible in userland */
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* IA32 compatibility process */
#define TIF_NOHZ 19 /* in adaptive nohz mode */
@@ -110,6 +111,7 @@ struct thread_info {
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
+#define _TIF_NOCPUID (1 << TIF_NOCPUID)
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_NOHZ (1 << TIF_NOHZ)
@@ -138,7 +140,7 @@ struct thread_info {

/* flags to check in __switch_to() */
#define _TIF_WORK_CTXSW \
- (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP)
+ (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP)

#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
@@ -239,6 +241,8 @@ static inline int arch_within_stack_frames(const void * const stack,
extern void arch_task_cache_init(void);
extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
extern void arch_release_task_struct(struct task_struct *tsk);
+extern void arch_setup_new_exec(void);
+#define arch_setup_new_exec arch_setup_new_exec
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_THREAD_INFO_H */
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 835aa51c7f6e..5dbbd1be6b97 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -6,8 +6,17 @@
#define ARCH_GET_FS 0x1003
#define ARCH_GET_GS 0x1004

+#define ARCH_GET_CPUID 0x1005
+#define ARCH_SET_CPUID 0x1006
+
#define ARCH_MAP_VDSO_X32 0x2001
#define ARCH_MAP_VDSO_32 0x2002
#define ARCH_MAP_VDSO_64 0x2003

+#ifdef CONFIG_CHECKPOINT_RESTORE
+# define ARCH_MAP_VDSO_X32 0x2001
+# define ARCH_MAP_VDSO_32 0x2002
+# define ARCH_MAP_VDSO_64 0x2003
+#endif
+
#endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index a07f8295c9ed..dfa90a3a5145 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -90,16 +90,12 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
return;
}

- if (ring3mwait_disabled) {
- msr_clear_bit(MSR_MISC_FEATURES_ENABLES,
- MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);
+ if (ring3mwait_disabled)
return;
- }
-
- msr_set_bit(MSR_MISC_FEATURES_ENABLES,
- MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

set_cpu_cap(c, X86_FEATURE_RING3MWAIT);
+ this_cpu_or(msr_misc_features_shadow,
+ 1UL << MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);

if (c == &boot_cpu_data)
ELF_HWCAP2 |= HWCAP2_RING3MWAIT;
@@ -505,9 +501,15 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
return;

- /* Check features and update capabilities */
+ /* Clear all MISC features */
+ this_cpu_write(msr_misc_features_shadow, 0);
+
+ /* Check features and update capabilities and shadow control bits */
init_cpuid_fault(c);
probe_xeon_phi_r3mwait(c);
+
+ msr = this_cpu_read(msr_misc_features_shadow);
+ wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
}

static void init_intel(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b12e95eceb83..0bb88428cbf2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -37,6 +37,7 @@
#include <asm/vm86.h>
#include <asm/switch_to.h>
#include <asm/desc.h>
+#include <asm/prctl.h>

/*
* per-CPU TSS segments. Threads are completely 'soft' on Linux,
@@ -172,6 +173,73 @@ int set_tsc_mode(unsigned int val)
return 0;
}

+DEFINE_PER_CPU(u64, msr_misc_features_shadow);
+
+static void set_cpuid_faulting(bool on)
+{
+ u64 msrval;
+
+ msrval = this_cpu_read(msr_misc_features_shadow);
+ msrval &= ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
+ msrval |= (on << MSR_MISC_FEATURES_ENABLES_CPUID_FAULT_BIT);
+ this_cpu_write(msr_misc_features_shadow, msrval);
+ wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
+}
+
+static void disable_cpuid(void)
+{
+ preempt_disable();
+ if (!test_and_set_thread_flag(TIF_NOCPUID)) {
+ /*
+ * Must flip the CPU state synchronously with
+ * TIF_NOCPUID in the current running context.
+ */
+ set_cpuid_faulting(true);
+ }
+ preempt_enable();
+}
+
+static void enable_cpuid(void)
+{
+ preempt_disable();
+ if (test_and_clear_thread_flag(TIF_NOCPUID)) {
+ /*
+ * Must flip the CPU state synchronously with
+ * TIF_NOCPUID in the current running context.
+ */
+ set_cpuid_faulting(false);
+ }
+ preempt_enable();
+}
+
+static int get_cpuid_mode(void)
+{
+ return !test_thread_flag(TIF_NOCPUID);
+}
+
+static int set_cpuid_mode(struct task_struct *task, unsigned long cpuid_enabled)
+{
+ if (!static_cpu_has(X86_FEATURE_CPUID_FAULT))
+ return -ENODEV;
+
+ if (cpuid_enabled)
+ enable_cpuid();
+ else
+ disable_cpuid();
+
+ return 0;
+}
+
+/*
+ * Called immediately after a successful exec.
+ */
+void arch_setup_new_exec(void)
+{
+ /* If cpuid was previously disabled for this task, re-enable it. */
+ if (test_thread_flag(TIF_NOCPUID))
+ enable_cpuid();
+}
+
static inline void switch_to_bitmap(struct tss_struct *tss,
struct thread_struct *prev,
struct thread_struct *next,
@@ -225,6 +293,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,

if ((tifp ^ tifn) & _TIF_NOTSC)
cr4_toggle_bits(X86_CR4_TSD);
+
+ if ((tifp ^ tifn) & _TIF_NOCPUID)
+ set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
}

/*
@@ -549,5 +620,12 @@ unsigned long get_wchan(struct task_struct *p)
long do_arch_prctl_common(struct task_struct *task, int option,
unsigned long cpuid_enabled)
{
+ switch (option) {
+ case ARCH_GET_CPUID:
+ return get_cpuid_mode();
+ case ARCH_SET_CPUID:
+ return set_cpuid_mode(task, cpuid_enabled);
+ }
+
return -EINVAL;
}
diff --git a/fs/exec.c b/fs/exec.c
index 65145a3df065..72934df68471 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1320,6 +1320,7 @@ void setup_new_exec(struct linux_binprm * bprm)
else
set_dumpable(current->mm, suid_dumpable);

+ arch_setup_new_exec();
perf_event_exec();
__set_task_comm(current, kbasename(bprm->filename), true);

diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 58373875e8ee..1539e08757f3 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -101,6 +101,10 @@ static inline void check_object_size(const void *ptr, unsigned long n,
{ }
#endif /* CONFIG_HARDENED_USERCOPY */

+#ifndef arch_setup_new_exec
+static inline void arch_setup_new_exec(void) {}
+#endif
+
#endif /* __KERNEL__ */

#endif /* _LINUX_THREAD_INFO_H */
--
2.11.0
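
The shadow handling in the patch above can be modelled in plain C. This
is only a sketch under stated assumptions: the two static variables
stand in for the per-CPU msr_misc_features_shadow and the hardware MSR,
and the model_* names are hypothetical. It shows that toggling CPUID
faulting leaves the ring 3 MWAIT bit alone, and that a context switch
only writes the MSR when the TIF_NOCPUID state of the two tasks differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the msr-index.h hunk of the patch. */
#define CPUID_FAULT_BIT	0
#define RING3MWAIT_BIT	1

/* Stand-ins for the per-CPU shadow and the hardware MSR (model only). */
static uint64_t shadow;
static uint64_t hw_msr;

/* Mirrors set_cpuid_faulting(): touch only the CPUID fault bit. */
static void model_set_cpuid_faulting(bool on)
{
	uint64_t val = shadow;

	val &= ~(UINT64_C(1) << CPUID_FAULT_BIT);
	val |= (uint64_t)on << CPUID_FAULT_BIT;
	shadow = val;
	hw_msr = val;		/* the kernel does wrmsrl() here */
}

/*
 * Mirrors the __switch_to_xtra() check: write only when the flag
 * differs between the outgoing and the incoming task. Returns true if
 * an MSR write happened.
 */
static bool model_switch(bool prev_nocpuid, bool next_nocpuid)
{
	if (prev_nocpuid == next_nocpuid)
		return false;
	model_set_cpuid_faulting(next_nocpuid);
	return true;
}
```

Keeping a shadow avoids a rdmsr on every toggle. That only works if
every writer of MSR_MISC_FEATURES_ENABLES goes through the shadow, which
appears to be why probe_xeon_phi_r3mwait() no longer writes the MSR
directly.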

2017-03-20 16:49:20

by Kyle Huey

Subject: [PATCH v16 07/10] x86/cpufeature: Detect CPUID faulting support

Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. This will allow a ptracer to emulate the CPUID
instruction.

Bit 31 of MSR_PLATFORM_INFO advertises support for this feature. It is
documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991

Detect support for this feature and expose it as X86_FEATURE_CPUID_FAULT.
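
As a small worked example of the detection step, the check on the
MSR_PLATFORM_INFO value amounts to testing bit 31 of a 64-bit quantity.
The sketch below runs in userspace on an already-read value; the
function name is hypothetical and reading the MSR itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PLATFORM_INFO_CPUID_FAULT_BIT	31

/* Returns true if the given MSR_PLATFORM_INFO value advertises CPUID faulting. */
static bool platform_info_has_cpuid_fault(uint64_t platform_info)
{
	return (platform_info >> PLATFORM_INFO_CPUID_FAULT_BIT) & 1;
}
```

The patch builds its mask with BIT_ULL() rather than a plain
(1 << 31), which keeps the mask a 64-bit unsigned value and avoids
shifting into the sign bit of an int.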

Signed-off-by: Kyle Huey <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/intel.c | 24 +++++++++++++++++++++++-
3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b04bb6dfed7f..0fe00446f9ca 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -187,6 +187,7 @@
* Reuse free bits when adding new feature flags!
*/
#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
+#define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */
#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f429b70ebaef..b1f75daca34b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -45,6 +45,8 @@
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
#define MSR_PLATFORM_INFO 0x000000ce
+#define MSR_PLATFORM_INFO_CPUID_FAULT_BIT 31
+#define MSR_PLATFORM_INFO_CPUID_FAULT BIT_ULL(MSR_PLATFORM_INFO_CPUID_FAULT_BIT)

#define MSR_PKG_CST_CONFIG_CONTROL 0x000000e2
#define NHM_C3_AUTO_DEMOTE (1UL << 25)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index e229318d7230..a07f8295c9ed 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -488,6 +488,28 @@ static void intel_bsp_resume(struct cpuinfo_x86 *c)
init_intel_energy_perf(c);
}

+static void init_cpuid_fault(struct cpuinfo_x86 *c)
+{
+ u64 msr;
+
+ if (!rdmsrl_safe(MSR_PLATFORM_INFO, &msr)) {
+ if (msr & MSR_PLATFORM_INFO_CPUID_FAULT)
+ set_cpu_cap(c, X86_FEATURE_CPUID_FAULT);
+ }
+}
+
+static void init_intel_misc_features(struct cpuinfo_x86 *c)
+{
+ u64 msr;
+
+ if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &msr))
+ return;
+
+ /* Check features and update capabilities */
+ init_cpuid_fault(c);
+ probe_xeon_phi_r3mwait(c);
+}
+
static void init_intel(struct cpuinfo_x86 *c)
{
unsigned int l2 = 0;
@@ -602,7 +624,7 @@ static void init_intel(struct cpuinfo_x86 *c)

init_intel_energy_perf(c);

- probe_xeon_phi_r3mwait(c);
+ init_intel_misc_features(c);
}

#ifdef CONFIG_X86_32
--
2.11.0

2017-03-20 16:49:35

by Kyle Huey

Subject: [PATCH v16 05/10] x86/arch_prctl: Add do_arch_prctl_common

Add do_arch_prctl_common() to handle arch_prctls that are not specific to 64
bit mode. Call it from the syscall entry point, but not any of the other
callsites in the kernel, which all want one of the existing 64 bit only
arch_prctls.
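
The fallback at the syscall entry point can be sketched as a standalone
model. The option numbers and model_* handlers below are hypothetical
stand-ins, not kernel code; the point is only the dispatch order, with
-EINVAL from the 64 bit handler meaning "option not recognised here".

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical option numbers, for the model only. */
enum { OPT_64BIT_ONLY = 1, OPT_COMMON = 2, OPT_UNKNOWN = 3 };

/* Stand-in for do_arch_prctl_64(): knows only the 64 bit option. */
static long model_prctl_64(int option)
{
	return option == OPT_64BIT_ONLY ? 0 : -EINVAL;
}

/* Stand-in for do_arch_prctl_common(): knows only the common option. */
static long model_prctl_common(int option)
{
	return option == OPT_COMMON ? 0 : -EINVAL;
}

/* Mirrors the 64 bit syscall entry point: try 64 bit first, then common. */
static long model_sys_arch_prctl(int option)
{
	long ret = model_prctl_64(option);

	if (ret == -EINVAL)
		ret = model_prctl_common(option);
	return ret;
}
```

One consequence of using -EINVAL as the sentinel is that a 64 bit
handler returning -EINVAL for any other reason would also fall through
to the common handler. That is harmless as long as the two handlers
recognise disjoint sets of options.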

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/include/asm/proto.h | 3 +++
arch/x86/kernel/process.c | 6 ++++++
arch/x86/kernel/process_64.c | 8 +++++++-
3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 4e276f6cb1c1..8d3964fc5f91 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -31,4 +31,7 @@ void x86_report_nx(void);

extern int reboot_force;

+long do_arch_prctl_common(struct task_struct *task, int option,
+ unsigned long cpuid_enabled);
+
#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 366db7782fc6..b12e95eceb83 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -545,3 +545,9 @@ unsigned long get_wchan(struct task_struct *p)
put_task_stack(p);
return ret;
}
+
+long do_arch_prctl_common(struct task_struct *task, int option,
+ unsigned long cpuid_enabled)
+{
+ return -EINVAL;
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e37f764c11cc..d81b0a60a45c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -626,7 +626,13 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)

SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return do_arch_prctl_64(current, option, arg2);
+ long ret;
+
+ ret = do_arch_prctl_64(current, option, arg2);
+ if (ret == -EINVAL)
+ ret = do_arch_prctl_common(current, option, arg2);
+
+ return ret;
}

unsigned long KSTK_ESP(struct task_struct *task)
--
2.11.0

2017-03-20 16:49:49

by Kyle Huey

Subject: [PATCH v16 03/10] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl

Use the SYSCALL_DEFINE2 macro instead of manually defining it.

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/kernel/process_64.c | 3 ++-
arch/x86/um/syscalls_64.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4377cfe8e449..bf9d7b6c0223 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/ftrace.h>
+#include <linux/syscalls.h>

#include <asm/pgtable.h>
#include <asm/processor.h>
@@ -621,7 +622,7 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
return ret;
}

-long sys_arch_prctl(int option, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
{
return do_arch_prctl(current, option, addr);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 3c2dd8768992..42369fa5421f 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -7,6 +7,7 @@

#include <linux/sched.h>
#include <linux/sched/mm.h>
+#include <linux/syscalls.h>
#include <linux/uaccess.h>
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>
@@ -74,7 +75,7 @@ long arch_prctl(struct task_struct *task, int option
return ret;
}

-long sys_arch_prctl(int option, unsigned long addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
{
return arch_prctl(current, option, (unsigned long __user *) addr);
}
--
2.11.0

2017-03-20 16:49:39

by Kyle Huey

Subject: [PATCH v16 04/10] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64

In order to introduce new arch_prctls that are not 64 bit only, rename the
existing 64 bit implementation to do_arch_prctl_64(). Also rename the second
argument to arch_prctl(), which will no longer always be an address.

Signed-off-by: Kyle Huey <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
---
arch/um/include/shared/os.h | 2 +-
arch/x86/include/asm/proto.h | 3 +--
arch/x86/kernel/process_64.c | 32 +++++++++++++++++---------------
arch/x86/kernel/ptrace.c | 8 ++++----
arch/x86/um/os-Linux/prctl.c | 4 ++--
arch/x86/um/syscalls_64.c | 14 +++++++-------
6 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index 32e41c4ef6d3..cd1fa97776c3 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -303,7 +303,7 @@ extern void maybe_sigio_broken(int fd, int read);
extern void sigio_broken(int fd, int read);

/* prctl.c */
-extern int os_arch_prctl(int pid, int option, unsigned long *addr);
+extern int os_arch_prctl(int pid, int option, unsigned long *arg2);

/* tty.c */
extern int get_pty(void);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 91675a960391..4e276f6cb1c1 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -9,6 +9,7 @@ void syscall_init(void);

#ifdef CONFIG_X86_64
void entry_SYSCALL_64(void);
+long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2);
#endif

#ifdef CONFIG_X86_32
@@ -30,6 +31,4 @@ void x86_report_nx(void);

extern int reboot_force;

-long do_arch_prctl(struct task_struct *task, int option, unsigned long addr);
-
#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index bf9d7b6c0223..e37f764c11cc 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -205,7 +205,7 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
(struct user_desc __user *)tls, 0);
else
#endif
- err = do_arch_prctl(p, ARCH_SET_FS, tls);
+ err = do_arch_prctl_64(p, ARCH_SET_FS, tls);
if (err)
goto out;
}
@@ -548,7 +548,7 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
}
#endif

-long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
+long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
{
int ret = 0;
int doit = task == current;
@@ -556,62 +556,64 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)

switch (option) {
case ARCH_SET_GS:
- if (addr >= TASK_SIZE_MAX)
+ if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.gsindex = 0;
- task->thread.gsbase = addr;
+ task->thread.gsbase = arg2;
if (doit) {
load_gs_index(0);
- ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, addr);
+ ret = wrmsrl_safe(MSR_KERNEL_GS_BASE, arg2);
}
put_cpu();
break;
case ARCH_SET_FS:
/* Not strictly needed for fs, but do it for symmetry
with gs */
- if (addr >= TASK_SIZE_MAX)
+ if (arg2 >= TASK_SIZE_MAX)
return -EPERM;
cpu = get_cpu();
task->thread.fsindex = 0;
- task->thread.fsbase = addr;
+ task->thread.fsbase = arg2;
if (doit) {
/* set the selector to 0 to not confuse __switch_to */
loadsegment(fs, 0);
- ret = wrmsrl_safe(MSR_FS_BASE, addr);
+ ret = wrmsrl_safe(MSR_FS_BASE, arg2);
}
put_cpu();
break;
case ARCH_GET_FS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_FS_BASE, base);
else
base = task->thread.fsbase;
- ret = put_user(base, (unsigned long __user *)addr);
+ ret = put_user(base, (unsigned long __user *)arg2);
break;
}
case ARCH_GET_GS: {
unsigned long base;
+
if (doit)
rdmsrl(MSR_KERNEL_GS_BASE, base);
else
base = task->thread.gsbase;
- ret = put_user(base, (unsigned long __user *)addr);
+ ret = put_user(base, (unsigned long __user *)arg2);
break;
}

#ifdef CONFIG_CHECKPOINT_RESTORE
# ifdef CONFIG_X86_X32_ABI
case ARCH_MAP_VDSO_X32:
- return prctl_map_vdso(&vdso_image_x32, addr);
+ return prctl_map_vdso(&vdso_image_x32, arg2);
# endif
# if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
case ARCH_MAP_VDSO_32:
- return prctl_map_vdso(&vdso_image_32, addr);
+ return prctl_map_vdso(&vdso_image_32, arg2);
# endif
case ARCH_MAP_VDSO_64:
- return prctl_map_vdso(&vdso_image_64, addr);
+ return prctl_map_vdso(&vdso_image_64, arg2);
#endif

default:
@@ -622,9 +624,9 @@ long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
return ret;
}

-SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return do_arch_prctl(current, option, addr);
+ return do_arch_prctl_64(current, option, arg2);
}

unsigned long KSTK_ESP(struct task_struct *task)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 2364b23ea3e5..f37d18124648 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -396,12 +396,12 @@ static int putreg(struct task_struct *child,
if (value >= TASK_SIZE_MAX)
return -EIO;
/*
- * When changing the segment base, use do_arch_prctl
+ * When changing the segment base, use do_arch_prctl_64
* to set either thread.fs or thread.fsindex and the
* corresponding GDT slot.
*/
if (child->thread.fsbase != value)
- return do_arch_prctl(child, ARCH_SET_FS, value);
+ return do_arch_prctl_64(child, ARCH_SET_FS, value);
return 0;
case offsetof(struct user_regs_struct,gs_base):
/*
@@ -410,7 +410,7 @@ static int putreg(struct task_struct *child,
if (value >= TASK_SIZE_MAX)
return -EIO;
if (child->thread.gsbase != value)
- return do_arch_prctl(child, ARCH_SET_GS, value);
+ return do_arch_prctl_64(child, ARCH_SET_GS, value);
return 0;
#endif
}
@@ -869,7 +869,7 @@ long arch_ptrace(struct task_struct *child, long request,
Works just like arch_prctl, except that the arguments
are reversed. */
case PTRACE_ARCH_PRCTL:
- ret = do_arch_prctl(child, data, addr);
+ ret = do_arch_prctl_64(child, data, addr);
break;
#endif

diff --git a/arch/x86/um/os-Linux/prctl.c b/arch/x86/um/os-Linux/prctl.c
index 0a6e16a35b77..8431e87ac333 100644
--- a/arch/x86/um/os-Linux/prctl.c
+++ b/arch/x86/um/os-Linux/prctl.c
@@ -6,7 +6,7 @@
#include <sys/ptrace.h>
#include <asm/ptrace.h>

-int os_arch_prctl(int pid, int option, unsigned long *addr)
+int os_arch_prctl(int pid, int option, unsigned long *arg2)
{
- return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, option);
+ return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) arg2, option);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 42369fa5421f..81b9fe100f7c 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -12,10 +12,10 @@
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>

-long arch_prctl(struct task_struct *task, int option
- unsigned long __user *addr)
+long arch_prctl(struct task_struct *task, int option)
+ unsigned long __user *arg2)
{
- unsigned long *ptr = addr, tmp;
+ unsigned long *ptr = arg2, tmp;
long ret;
int pid = task->mm->context.id.u.pid;

@@ -65,19 +65,19 @@ long arch_prctl(struct task_struct *task, int option
ret = save_registers(pid, &current->thread.regs.regs);
break;
case ARCH_GET_FS:
- ret = put_user(tmp, addr);
+ ret = put_user(tmp, arg2);
break;
case ARCH_GET_GS:
- ret = put_user(tmp, addr);
+ ret = put_user(tmp, arg2);
break;
}

return ret;
}

-SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, addr)
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
{
- return arch_prctl(current, option, (unsigned long __user *) addr);
+ return arch_prctl(current, option, (unsigned long __user *) arg2);
}

void arch_switch_to(struct task_struct *to)
--
2.11.0

2017-03-20 16:49:44

by Kyle Huey

Subject: [PATCH v16 02/10] x86/arch_prctl: Rename 'code' argument to 'option'

arch_prctl arbitrarily changed prctl's 'option' to 'code'. Now that we're
adding additional options, fix that.

Signed-off-by: Kyle Huey <[email protected]>
---
arch/um/include/shared/os.h | 4 ++--
arch/x86/include/asm/proto.h | 2 +-
arch/x86/kernel/process_64.c | 8 ++++----
arch/x86/um/asm/ptrace.h | 2 +-
arch/x86/um/os-Linux/prctl.c | 4 ++--
arch/x86/um/syscalls_64.c | 13 +++++++------
6 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index de5d572225f3..32e41c4ef6d3 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -302,8 +302,8 @@ extern int ignore_sigio_fd(int fd);
extern void maybe_sigio_broken(int fd, int read);
extern void sigio_broken(int fd, int read);

-/* sys-x86_64/prctl.c */
-extern int os_arch_prctl(int pid, int code, unsigned long *addr);
+/* prctl.c */
+extern int os_arch_prctl(int pid, int option, unsigned long *addr);

/* tty.c */
extern int get_pty(void);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 9b9b30b19441..91675a960391 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -30,6 +30,6 @@ void x86_report_nx(void);

extern int reboot_force;

-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr);
+long do_arch_prctl(struct task_struct *task, int option, unsigned long addr);

#endif /* _ASM_X86_PROTO_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d6b784a5520d..4377cfe8e449 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -547,13 +547,13 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
}
#endif

-long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
+long do_arch_prctl(struct task_struct *task, int option, unsigned long addr)
{
int ret = 0;
int doit = task == current;
int cpu;

- switch (code) {
+ switch (option) {
case ARCH_SET_GS:
if (addr >= TASK_SIZE_MAX)
return -EPERM;
@@ -621,9 +621,9 @@ long do_arch_prctl(struct task_struct *task, int code, unsigned long addr)
return ret;
}

-long sys_arch_prctl(int code, unsigned long addr)
+long sys_arch_prctl(int option, unsigned long addr)
{
- return do_arch_prctl(current, code, addr);
+ return do_arch_prctl(current, option, addr);
}

unsigned long KSTK_ESP(struct task_struct *task)
diff --git a/arch/x86/um/asm/ptrace.h b/arch/x86/um/asm/ptrace.h
index e59eef20647b..b291ca5cf66b 100644
--- a/arch/x86/um/asm/ptrace.h
+++ b/arch/x86/um/asm/ptrace.h
@@ -78,7 +78,7 @@ static inline int ptrace_set_thread_area(struct task_struct *child, int idx,
return -ENOSYS;
}

-extern long arch_prctl(struct task_struct *task, int code,
+extern long arch_prctl(struct task_struct *task, int option,
unsigned long __user *addr);

#endif
diff --git a/arch/x86/um/os-Linux/prctl.c b/arch/x86/um/os-Linux/prctl.c
index 96eb2bd28832..0a6e16a35b77 100644
--- a/arch/x86/um/os-Linux/prctl.c
+++ b/arch/x86/um/os-Linux/prctl.c
@@ -6,7 +6,7 @@
#include <sys/ptrace.h>
#include <asm/ptrace.h>

-int os_arch_prctl(int pid, int code, unsigned long *addr)
+int os_arch_prctl(int pid, int option, unsigned long *addr)
{
- return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, code);
+ return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) addr, option);
}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 10d907098c26..3c2dd8768992 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -11,7 +11,8 @@
#include <asm/prctl.h> /* XXX This should get the constants from libc */
#include <os.h>

-long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
+long arch_prctl(struct task_struct *task, int option
+ unsigned long __user *addr)
{
unsigned long *ptr = addr, tmp;
long ret;
@@ -30,7 +31,7 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
* arch_prctl is run on the host, then the registers are read
* back.
*/
- switch (code) {
+ switch (option) {
case ARCH_SET_FS:
case ARCH_SET_GS:
ret = restore_registers(pid, &current->thread.regs.regs);
@@ -50,11 +51,11 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
ptr = &tmp;
}

- ret = os_arch_prctl(pid, code, ptr);
+ ret = os_arch_prctl(pid, option, ptr);
if (ret)
return ret;

- switch (code) {
+ switch (option) {
case ARCH_SET_FS:
current->thread.arch.fs = (unsigned long) ptr;
ret = save_registers(pid, &current->thread.regs.regs);
@@ -73,9 +74,9 @@ long arch_prctl(struct task_struct *task, int code, unsigned long __user *addr)
return ret;
}

-long sys_arch_prctl(int code, unsigned long addr)
+long sys_arch_prctl(int option, unsigned long addr)
{
- return arch_prctl(current, code, (unsigned long __user *) addr);
+ return arch_prctl(current, option, (unsigned long __user *) addr);
}

void arch_switch_to(struct task_struct *to)
--
2.11.0

2017-03-20 16:49:27

by Kyle Huey

Subject: [PATCH v16 06/10] x86/syscalls/32: Wire up arch_prctl on x86-32

Hook up arch_prctl to call do_arch_prctl_common() on x86-32, and in 32 bit
compat mode on x86-64. This allows us to have arch_prctls that are not
specific to 64 bits.

On UML, simply stub out this syscall.

Signed-off-by: Kyle Huey <[email protected]>
---
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/kernel/process_32.c | 7 +++++++
arch/x86/kernel/process_64.c | 7 +++++++
arch/x86/um/Makefile | 2 +-
arch/x86/um/syscalls_32.c | 7 +++++++
include/linux/compat.h | 2 ++
6 files changed, 25 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/um/syscalls_32.c

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 9ba050fe47f3..0af59fa789ea 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -390,3 +390,4 @@
381 i386 pkey_alloc sys_pkey_alloc
382 i386 pkey_free sys_pkey_free
383 i386 statx sys_statx
+384 i386 arch_prctl sys_arch_prctl compat_sys_arch_prctl
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4c818f8bc135..ff40e74c9181 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/kdebug.h>
+#include <linux/syscalls.h>

#include <asm/pgtable.h>
#include <asm/ldt.h>
@@ -56,6 +57,7 @@
#include <asm/switch_to.h>
#include <asm/vm86.h>
#include <asm/intel_rdt.h>
+#include <asm/proto.h>

void __show_regs(struct pt_regs *regs, int all)
{
@@ -304,3 +306,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)

return prev_p;
}
+
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return do_arch_prctl_common(current, option, arg2);
+}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d81b0a60a45c..ea1a6180bf39 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -635,6 +635,13 @@ SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
return ret;
}

+#ifdef CONFIG_IA32_EMULATION
+COMPAT_SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return do_arch_prctl_common(current, option, arg2);
+}
+#endif
+
unsigned long KSTK_ESP(struct task_struct *task)
{
return task_pt_regs(task)->sp;
diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index e7e7055a8658..69f0827d5f53 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -16,7 +16,7 @@ obj-y = bug.o bugs_$(BITS).o delay.o fault.o ldt.o \

ifeq ($(CONFIG_X86_32),y)

-obj-y += checksum_32.o
+obj-y += checksum_32.o syscalls_32.o
obj-$(CONFIG_ELF_CORE) += elfcore.o

subarch-y = ../lib/string_32.o ../lib/atomic64_32.o ../lib/atomic64_cx8_32.o
diff --git a/arch/x86/um/syscalls_32.c b/arch/x86/um/syscalls_32.c
new file mode 100644
index 000000000000..627d68836b16
--- /dev/null
+++ b/arch/x86/um/syscalls_32.c
@@ -0,0 +1,7 @@
+#include <linux/syscalls.h>
+#include <os.h>
+
+SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
+{
+ return -EINVAL;
+}
diff --git a/include/linux/compat.h b/include/linux/compat.h
index aef47be2a5c1..af9dbc44fd92 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -723,6 +723,8 @@ asmlinkage long compat_sys_sched_rr_get_interval(compat_pid_t pid,
asmlinkage long compat_sys_fanotify_mark(int, unsigned int, __u32, __u32,
int, const char __user *);

+asmlinkage long compat_sys_arch_prctl(int option, unsigned long arg2);
+
/*
* For most but not all architectures, "am I in a compat syscall?" and
* "am I a compat task?" are the same question. For architectures on which
--
2.11.0

2017-03-20 16:52:16

by Kyle Huey

Subject: Re: [PATCH v16 08/10] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

On Mon, Mar 20, 2017 at 8:00 AM, Thomas Gleixner <[email protected]> wrote:
> On Mon, 20 Mar 2017, Kyle Huey wrote:
>> --- a/arch/x86/include/uapi/asm/prctl.h
>> +++ b/arch/x86/include/uapi/asm/prctl.h
>> @@ -6,8 +6,17 @@
>> #define ARCH_GET_FS 0x1003
>> #define ARCH_GET_GS 0x1004
>>
>> +#define ARCH_GET_CPUID 0x1005
>> +#define ARCH_SET_CPUID 0x1006
>> +
>> #define ARCH_MAP_VDSO_X32 0x2001
>> #define ARCH_MAP_VDSO_32 0x2002
>> #define ARCH_MAP_VDSO_64 0x2003
>>
>> +#ifdef CONFIG_CHECKPOINT_RESTORE
>> +# define ARCH_MAP_VDSO_X32 0x2001
>> +# define ARCH_MAP_VDSO_32 0x2002
>> +# define ARCH_MAP_VDSO_64 0x2003
>> +#endif
>
> That hunk is bogus in two aspects:
> - It's just a copy of the above wrapped in a ifdef
>
> - The ifdef is broken, because the UAPI headers do not know about
> that.
>
> I dropped it.

Looks like that was introduced when I rebased over a01aa6c9f40f for
v14 and I didn't catch it manually. Thanks for finding and fixing
that.

- Kyle

2017-03-21 08:35:15

by Ingo Molnar

Subject: Re: [tip:x86/process] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID


* tip-bot for Kyle Huey <[email protected]> wrote:

> Commit-ID: e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
> Gitweb: http://git.kernel.org/tip/e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
> Author: Kyle Huey <[email protected]>
> AuthorDate: Mon, 20 Mar 2017 01:16:26 -0700
> Committer: Thomas Gleixner <[email protected]>
> CommitDate: Mon, 20 Mar 2017 16:10:34 +0100
>
> x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

Note that this series broke the UML build:

/home/mingo/tip/arch/x86/um/syscalls_64.c:15:6: error: conflicting types for ‘arch_prctl’
...

etc.

Thanks,

Ingo

2017-03-21 18:41:35

by Kyle Huey

Subject: Re: [tip:x86/process] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID

On Tue, Mar 21, 2017 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
>
> * tip-bot for Kyle Huey <[email protected]> wrote:
>
>> Commit-ID: e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
>> Gitweb: http://git.kernel.org/tip/e9ea1e7f53b852147cbd568b0568c7ad97ec21a3
>> Author: Kyle Huey <[email protected]>
>> AuthorDate: Mon, 20 Mar 2017 01:16:26 -0700
>> Committer: Thomas Gleixner <[email protected]>
>> CommitDate: Mon, 20 Mar 2017 16:10:34 +0100
>>
>> x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
>
> Note that this series broke the UML build:
>
> /home/mingo/tip/arch/x86/um/syscalls_64.c:15:6: error: conflicting types for ‘arch_prctl’
> ...
>
> etc.
>
> Thanks,
>
> Ingo

Yes, I sent another patch ("um: fix build error due to typo")
yesterday and tglx applied it to x86/process.

- Kyle

2017-04-21 09:58:20

by Paolo Bonzini

Subject: Re: [PATCH v16 10/10] KVM: x86: virtualize cpuid faulting



On 20/03/2017 09:16, Kyle Huey wrote:
> Hardware support for faulting on the cpuid instruction is not required to
> emulate it, because cpuid triggers a VM exit anyways. KVM handles the relevant
> MSRs (MSR_PLATFORM_INFO and MSR_MISC_FEATURES_ENABLES) and upon a
> cpuid-induced VM exit checks the cpuid faulting state and the CPL.
> kvm_require_cpl is even kind enough to inject the GP fault for us.
>
> Signed-off-by: Kyle Huey <[email protected]>
> Reviewed-by: David Matlack <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 2 ++
> arch/x86/kvm/cpuid.c | 3 +++
> arch/x86/kvm/cpuid.h | 11 +++++++++++
> arch/x86/kvm/emulate.c | 7 +++++++
> arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++++++
> 5 files changed, 49 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 74ef58c8ff53..df0c2bd970a4 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -612,6 +612,8 @@ struct kvm_vcpu_arch {
> unsigned long dr7;
> unsigned long eff_db[KVM_NR_DB_REGS];
> unsigned long guest_debug_dr7;
> + u64 msr_platform_info;
> + u64 msr_misc_features_enables;
>
> u64 mcg_cap;
> u64 mcg_status;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index efde6cc50875..cb560a509041 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -876,6 +876,9 @@ int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
> {
> u32 eax, ebx, ecx, edx;
>
> + if (cpuid_fault_enabled(vcpu) && !kvm_require_cpl(vcpu, 0))
> + return 1;
> +
> eax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
> kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx);
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 35058c2c0eea..a6fd40aade7c 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -205,4 +205,15 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
> return x86_stepping(best->eax);
> }
>
> +static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
> +{
> + return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
> +}
> +
> +static inline bool cpuid_fault_enabled(struct kvm_vcpu *vcpu)
> +{
> + return vcpu->arch.msr_misc_features_enables &
> + MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;
> +}
> +
> #endif
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 45c7306c8780..6a2ea945d01f 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3854,6 +3854,13 @@ static int em_sti(struct x86_emulate_ctxt *ctxt)
> static int em_cpuid(struct x86_emulate_ctxt *ctxt)
> {
> u32 eax, ebx, ecx, edx;
> + u64 msr = 0;
> +
> + ctxt->ops->get_msr(ctxt, MSR_MISC_FEATURES_ENABLES, &msr);
> + if (msr & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
> + ctxt->ops->cpl(ctxt)) {
> + return emulate_gp(ctxt, 0);
> + }
>
> eax = reg_read(ctxt, VCPU_REGS_RAX);
> ecx = reg_read(ctxt, VCPU_REGS_RCX);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1faf620a6fdc..16d2082d85fb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1008,6 +1008,8 @@ static u32 emulated_msrs[] = {
> MSR_IA32_MCG_CTL,
> MSR_IA32_MCG_EXT_CTL,
> MSR_IA32_SMBASE,
> + MSR_PLATFORM_INFO,
> + MSR_MISC_FEATURES_ENABLES,
> };
>
> static unsigned num_emulated_msrs;
> @@ -2331,6 +2333,21 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> return 1;
> vcpu->arch.osvw.status = data;
> break;
> + case MSR_PLATFORM_INFO:
> + if (!msr_info->host_initiated ||
> + data & ~MSR_PLATFORM_INFO_CPUID_FAULT ||
> + (!(data & MSR_PLATFORM_INFO_CPUID_FAULT) &&
> + cpuid_fault_enabled(vcpu)))
> + return 1;
> + vcpu->arch.msr_platform_info = data;
> + break;
> + case MSR_MISC_FEATURES_ENABLES:
> + if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
> + (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
> + !supports_cpuid_fault(vcpu)))
> + return 1;
> + vcpu->arch.msr_misc_features_enables = data;
> + break;
> default:
> if (msr && (msr == vcpu->kvm->arch.xen_hvm_config.msr))
> return xen_hvm_config(vcpu, data);
> @@ -2545,6 +2562,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> return 1;
> msr_info->data = vcpu->arch.osvw.status;
> break;
> + case MSR_PLATFORM_INFO:
> + msr_info->data = vcpu->arch.msr_platform_info;
> + break;
> + case MSR_MISC_FEATURES_ENABLES:
> + msr_info->data = vcpu->arch.msr_misc_features_enables;
> + break;
> default:
> if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
> return kvm_pmu_get_msr(vcpu, msr_info->index, &msr_info->data);
> @@ -7724,6 +7747,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> if (!init_event) {
> kvm_pmu_reset(vcpu);
> vcpu->arch.smbase = 0x30000;
> +
> + vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
> + vcpu->arch.msr_misc_features_enables = 0;
> }
>
> memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
>

Patch 10 applied, thanks.

Paolo
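
The MSR rules in the x86.c hunks above can be modelled outside the kernel, which makes the ordering constraints easier to see. The following is only a sketch: struct vcpu_model and the function names are hypothetical stand-ins for the kvm_vcpu_arch fields and the switch cases in the patch, and the bit positions (bit 31 of MSR_PLATFORM_INFO, bit 0 of MSR_MISC_FEATURES_ENABLES) are assumed to match the kernel's msr-index.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's definitions in msr-index.h. */
#define PLATFORM_INFO_CPUID_FAULT         (1ULL << 31)
#define MISC_FEATURES_ENABLES_CPUID_FAULT (1ULL << 0)

/* Hypothetical stand-in for the two fields added to kvm_vcpu_arch. */
struct vcpu_model {
	uint64_t msr_platform_info;
	uint64_t msr_misc_features_enables;
};

/* Mirrors the kvm_vcpu_reset() hunk: capability advertised, faulting off. */
static void vcpu_model_reset(struct vcpu_model *v)
{
	v->msr_platform_info = PLATFORM_INFO_CPUID_FAULT;
	v->msr_misc_features_enables = 0;
}

/*
 * Returns 0 on success and 1 when the write is rejected, following the
 * convention of kvm_set_msr_common(). Only the host may write this MSR,
 * only the CPUID-fault bit is accepted, and the capability cannot be
 * withdrawn while faulting is currently enabled.
 */
static int set_platform_info(struct vcpu_model *v, uint64_t data,
			     bool host_initiated)
{
	bool fault_enabled = v->msr_misc_features_enables &
			     MISC_FEATURES_ENABLES_CPUID_FAULT;

	if (!host_initiated ||
	    (data & ~PLATFORM_INFO_CPUID_FAULT) ||
	    (!(data & PLATFORM_INFO_CPUID_FAULT) && fault_enabled))
		return 1;
	v->msr_platform_info = data;
	return 0;
}

/* Faulting may only be enabled if the capability bit is advertised. */
static int set_misc_features_enables(struct vcpu_model *v, uint64_t data)
{
	bool supported = v->msr_platform_info & PLATFORM_INFO_CPUID_FAULT;

	if ((data & ~MISC_FEATURES_ENABLES_CPUID_FAULT) ||
	    ((data & MISC_FEATURES_ENABLES_CPUID_FAULT) && !supported))
		return 1;
	v->msr_misc_features_enables = data;
	return 0;
}

/* CPUID takes #GP when faulting is enabled and the guest runs at CPL > 0. */
static bool cpuid_should_fault(const struct vcpu_model *v, unsigned int cpl)
{
	return (v->msr_misc_features_enables &
		MISC_FEATURES_ENABLES_CPUID_FAULT) && cpl > 0;
}
```

The two setters are deliberately symmetric: each one refuses a write that would leave the enable bit set without the capability bit, so the pair of MSRs can never reach an inconsistent state regardless of the order in which they are written.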

2018-07-27 17:20:01

by Jim Mattson

Subject: Re: [PATCH v16 01/10] x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES

On Mon, Mar 20, 2017 at 1:16 AM, Kyle Huey <[email protected]> wrote:
> This matches the only public Intel documentation of this MSR, in the
> "Virtualization Technology FlexMigration Application Note"
> (preserved at https://bugzilla.kernel.org/attachment.cgi?id=243991)
>
> Signed-off-by: Kyle Huey <[email protected]>

The old spelling matched volume 4 of the SDM, Table 2-43. "Selected
MSRs Supported by Intel Xeon Phi Processors with
DisplayFamily_DisplayModel Signatures 06_57H and 06_85H."