2020-03-28 16:44:58

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 00/10] Support XSAVES supervisor states

Changes in v3:

This version addresses the overhead of an extra XSAVES in
__fpu__restore_sig() with additional patches:

Patch #8: Introduce copy_supervisor_to_kernel(), which does a
supervisor-state-only XSAVES and then relocates the contents
to the standard location.
Patch #9: Preserve supervisor states for __fpu__restore_sig() slow path.

Other small changes are noted in each patch's commit log.

----

Changes in v2:

- Split out small changes to:
https://lkml.kernel.org/r/[email protected]/

- Fix an issue in patch #4, where fpu__clear_user_states() drops
supervisor xstates.

- Add three patches:
Patch #6: Update sanitize_restored_xstate() for supervisor xstates.
Patch #7: Update copy_kernel_to_xregs_err() to use XRSTORS when
supervisor xstates are present.
Patch #8: Update __fpu__restore_sig() to preserve supervisor xstates.

Also make some small changes in response to comments. More details are in
each patch's commit log.

----

There are two types of XSAVE-managed states (xstates): user and supervisor.
This series introduces the supervisor xstate support in preparation for new
features that will make use of supervisor xstates.

This series has been separated for ease of review from the series that add
supervisor xstate features [3].

In current and near future generations of Intel processors there are three
classes of objects that can be managed as supervisor xstates:

- Processor Trace (PT):
Linux already supports PT, but PT xstates are not saved to the FPU
context and not context-switched by the kernel. There are no plans to
integrate PT into XSAVES supervisor states.

- ENQCMD Process Address Space ID (MSR_IA32_PASID):
ENQCMD is a new instruction and will be introduced shortly in a
separate series [2].

- Control-flow Enforcement Technology (CET):
CET is being reviewed on the LKML [1] [3].

Supervisor xstates can be accessed only from the kernel (PL-0) with XSAVES/
XRSTORS instructions. They cannot be accessed with other XSAVE*/XRSTOR*
instructions. MSR_IA32_XSS sets enabled supervisor xstates, while XCR0
sets enabled user xstates.

This series separates the two xstate types by declaring new macros for each
type. The kernel finds all available features during system initialization
and stores them in xfeatures_mask_all. It then retrieves perspective
xstate type with xfeatures_mask_supervisor()/xfeatures_mask_user() for
handling signals and PTRACE.

[1] Detailed information on supervisor xstates can be found in "Intel 64
and IA-32 Architectures Software Developer's Manual":

https://software.intel.com/en-us/download/intel-64-and-ia-32-
architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4

[2] Detailed information on the ENQCMD instruction and MSR_IA32_PASID can
be found in "Intel Architecture Instruction Set Extensions and Future
Features Programming Reference":

https://software.intel.com/sites/default/files/managed/c5/15/
architecture-instruction-set-extensions-programming-reference.pdf

[3] CET patches:

https://lkml.kernel.org/r/[email protected]/
https://lkml.kernel.org/r/[email protected]/

Fenghua Yu (3):
x86/fpu/xstate: Rename validate_xstate_header() to
validate_user_xstate_header()
x86/fpu/xstate: Define new macros for supervisor and user xstates
x86/fpu/xstate: Define new functions for clearing fpregs and xstates

Yu-cheng Yu (7):
x86/fpu/xstate: Separate user and supervisor xfeatures mask
x86/fpu/xstate: Introduce XSAVES supervisor states
x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor
xstates
x86/fpu/xstate: Update copy_kernel_to_xregs_err() for XSAVES
supervisor states
x86/fpu: Introduce copy_supervisor_to_kernel()
x86/fpu/xstate: Preserve supervisor states for slow path of
__fpu__restore_sig()
x86/fpu/xstate: Restore supervisor states for signal return

arch/x86/include/asm/fpu/internal.h | 10 +-
arch/x86/include/asm/fpu/xstate.h | 52 +++++---
arch/x86/kernel/fpu/core.c | 49 ++++---
arch/x86/kernel/fpu/init.c | 3 +-
arch/x86/kernel/fpu/regset.c | 2 +-
arch/x86/kernel/fpu/signal.c | 127 +++++++++++-------
arch/x86/kernel/fpu/xstate.c | 199 +++++++++++++++++++++-------
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
9 files changed, 317 insertions(+), 129 deletions(-)

--
2.21.0


2020-03-28 16:44:59

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 10/10] x86/fpu/xstate: Restore supervisor states for signal return

As described in the previous patch, the signal return fast path directly
restores user states from the user buffer. Once that succeeds, restore
supervisor states (but only when they are not yet restored).

For the slow path, save supervisor states to preserve them across context
switches, and restore after the user states are restored.

The previous version has the overhead of an XSAVES in both the fast and the
slow paths. It is addressed as the following:

- In the fast path, only do an XRSTORS.
- In the slow path, do a supervisor-state-only XSAVES, and relocate the
buffer contents.

Some thoughts in the implementation:

- In the slow path, can any supervisor state become stale between
save/restore?

Answer: set_thread_flag(TIF_NEED_FPU_LOAD) protects the xstate buffer.

- In the slow path, can any code reference a stale supervisor state
register between save/restore?

Answer: In the current lazy-restore scheme, any reference to xstate
registers needs fpregs_lock()/fpregs_unlock() and __fpregs_load_activate().

- Are there other options?

One other option is eagerly restoring all supervisor states.

Currently, CET user-mode states and ENQCMD's PASID do not need to be
eagerly restored. The upcoming CET kernel-mode states (24 bytes) need
to be eagerly restored. To me, eagerly restoring all supervisor states
adds more overhead then benefit at this point.

v3:
- Change copy_xregs_to_kernel() to copy_supervisor_to_kernel(), which is
introduced in a previous patch.
- Update commit log.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 545ca4314096..4dad5afc938d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -347,6 +347,11 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
+ /* Restore supervisor states */
+ if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
fpregs_mark_activate();
fpregs_unlock();
return 0;
@@ -364,14 +369,21 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}

/*
- * The current state of the FPU registers does not matter. By setting
- * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
- * is not modified on context switch and that the xstate is considered
+ * Supervisor states are not modified by user space input. Save
+ * current supervisor states first.
+ * By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
+ * not modified on context switch and that the xstate is considered
* to be loaded again on return to userland (overriding last_cpu avoids
* the optimisation).
*/
- set_thread_flag(TIF_NEED_FPU_LOAD);
+ fpregs_lock();
+ if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+ if (xfeatures_mask_supervisor())
+ copy_supervisor_to_kernel(&fpu->state.xsave);
+ set_thread_flag(TIF_NEED_FPU_LOAD);
+ }
__fpu_invalidate_fpregs_state(fpu);
+ fpregs_unlock();

if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
@@ -393,7 +405,12 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures);
+ /*
+ * Restore previously saved supervisor xstates along with
+ * copied-in user xstates.
+ */
+ ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
+ user_xfeatures | xfeatures_mask_supervisor());

} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
--
2.21.0

2020-03-28 16:45:13

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 02/10] x86/fpu/xstate: Define new macros for supervisor and user xstates

From: Fenghua Yu <[email protected]>

XCNTXT_MASK is 'all supported xfeatures' before introducing supervisor
xstates. Rename it to XFEATURE_MASK_USER_SUPPORTED to make clear that
these are user xstates.

XFEATURE_MASK_SUPERVISOR is replaced with the following:
- XFEATURE_MASK_SUPERVISOR_SUPPORTED: Currently nothing. ENQCMD and
Control-flow Enforcement Technology (CET) will be introduced in separate
series.
- XFEATURE_MASK_SUPERVISOR_UNSUPPORTED: Currently only Processor Trace.
- XFEATURE_MASK_SUPERVISOR_ALL: the combination of above.

v3:
- Change SUPPORTED_XFEATURES_*, UNSUPPORTED_XFEATURES_*, ALL_XFEATURES_* to
XFEATURE_MASK_*.

Signed-off-by: Fenghua Yu <[email protected]>
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/fpu/xstate.h | 36 ++++++++++++++++++++-----------
arch/x86/kernel/fpu/init.c | 3 ++-
arch/x86/kernel/fpu/xstate.c | 26 +++++++++++-----------
3 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index fc4db51f3b53..b08fa823425f 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -21,19 +21,29 @@
#define XSAVE_YMM_SIZE 256
#define XSAVE_YMM_OFFSET (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)

-/* Supervisor features */
-#define XFEATURE_MASK_SUPERVISOR (XFEATURE_MASK_PT)
-
-/* All currently supported features */
-#define XCNTXT_MASK (XFEATURE_MASK_FP | \
- XFEATURE_MASK_SSE | \
- XFEATURE_MASK_YMM | \
- XFEATURE_MASK_OPMASK | \
- XFEATURE_MASK_ZMM_Hi256 | \
- XFEATURE_MASK_Hi16_ZMM | \
- XFEATURE_MASK_PKRU | \
- XFEATURE_MASK_BNDREGS | \
- XFEATURE_MASK_BNDCSR)
+/* All currently supported user features */
+#define XFEATURE_MASK_USER_SUPPORTED (XFEATURE_MASK_FP | \
+ XFEATURE_MASK_SSE | \
+ XFEATURE_MASK_YMM | \
+ XFEATURE_MASK_OPMASK | \
+ XFEATURE_MASK_ZMM_Hi256 | \
+ XFEATURE_MASK_Hi16_ZMM | \
+ XFEATURE_MASK_PKRU | \
+ XFEATURE_MASK_BNDREGS | \
+ XFEATURE_MASK_BNDCSR)
+
+/* All currently supported supervisor features */
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
+
+/*
+ * Unsupported supervisor features. When a supervisor feature in this mask is
+ * supported in the future, move it to the supported supervisor feature mask.
+ */
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)
+
+/* All supervisor states including supported and unsupported states. */
+#define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
+ XFEATURE_MASK_SUPERVISOR_UNSUPPORTED)

#ifdef CONFIG_X86_64
#define REX_PREFIX "0x48, "
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6ce7e0a23268..61ddc3a5e5c2 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -224,7 +224,8 @@ static void __init fpu__init_system_xstate_size_legacy(void)
*/
u64 __init fpu__get_supported_xfeatures_mask(void)
{
- return XCNTXT_MASK;
+ return XFEATURE_MASK_USER_SUPPORTED |
+ XFEATURE_MASK_SUPERVISOR_SUPPORTED;
}

/* Legacy code to initialize eager fpu mode. */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5a580e403044..497d43335073 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -208,14 +208,13 @@ void fpu__init_cpu_xstate(void)
if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask)
return;
/*
- * Make it clear that XSAVES supervisor states are not yet
- * implemented should anyone expect it to work by changing
- * bits in XFEATURE_MASK_* macros and XCR0.
+ * Unsupported supervisor xstates should not be found in
+ * the xfeatures mask.
*/
- WARN_ONCE((xfeatures_mask & XFEATURE_MASK_SUPERVISOR),
- "x86/fpu: XSAVES supervisor states are not yet implemented.\n");
+ WARN_ONCE((xfeatures_mask & XFEATURE_MASK_SUPERVISOR_UNSUPPORTED),
+ "x86/fpu: Found unsupported supervisor xstates.\n");

- xfeatures_mask &= ~XFEATURE_MASK_SUPERVISOR;
+ xfeatures_mask &= ~XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;

cr4_set_bits(X86_CR4_OSXSAVE);
xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
@@ -438,7 +437,7 @@ static int xfeature_uncompacted_offset(int xfeature_nr)
* format. Checking a supervisor state's uncompacted offset is
* an error.
*/
- if (XFEATURE_MASK_SUPERVISOR & BIT_ULL(xfeature_nr)) {
+ if (XFEATURE_MASK_SUPERVISOR_ALL & BIT_ULL(xfeature_nr)) {
WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
return -1;
}
@@ -475,7 +474,7 @@ int using_compacted_format(void)
int validate_user_xstate_header(const struct xstate_header *hdr)
{
/* No unknown or supervisor features may be set */
- if (hdr->xfeatures & (~xfeatures_mask | XFEATURE_MASK_SUPERVISOR))
+ if (hdr->xfeatures & ~(xfeatures_mask & XFEATURE_MASK_USER_SUPPORTED))
return -EINVAL;

/* Userspace must use the uncompacted format */
@@ -768,7 +767,8 @@ void __init fpu__init_system_xstate(void)
* Update info used for ptrace frames; use standard-format size and no
* supervisor xstates:
*/
- update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask & ~XFEATURE_MASK_SUPERVISOR);
+ update_regset_xstate_info(fpu_user_xstate_size,
+ xfeatures_mask & XFEATURE_MASK_USER_SUPPORTED);

fpu__init_prepare_fx_sw_frame();
setup_init_fpu_buf();
@@ -991,7 +991,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
*/
memset(&header, 0, sizeof(header));
header.xfeatures = xsave->header.xfeatures;
- header.xfeatures &= ~XFEATURE_MASK_SUPERVISOR;
+ header.xfeatures &= XFEATURE_MASK_USER_SUPPORTED;

/*
* Copy xregs_state->header:
@@ -1075,7 +1075,7 @@ int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned i
*/
memset(&header, 0, sizeof(header));
header.xfeatures = xsave->header.xfeatures;
- header.xfeatures &= ~XFEATURE_MASK_SUPERVISOR;
+ header.xfeatures &= XFEATURE_MASK_USER_SUPPORTED;

/*
* Copy xregs_state->header:
@@ -1168,7 +1168,7 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
* The state that came in from userspace was user-state only.
* Mask all the user states out of 'xfeatures':
*/
- xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR;
+ xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;

/*
* Add back in the features that came in from userspace:
@@ -1224,7 +1224,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
* The state that came in from userspace was user-state only.
* Mask all the user states out of 'xfeatures':
*/
- xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR;
+ xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;

/*
* Add back in the features that came in from userspace:
--
2.21.0

2020-03-28 16:45:30

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 04/10] x86/fpu/xstate: Introduce XSAVES supervisor states

Enable XSAVES supervisor states by setting MSR_IA32_XSS bits according to
CPUID enumeration results. Also revise comments at various places.

v2:
- Remove printing of supervisor xstates from fpu__init_system_xstate().

Co-developed-by: Fenghua Yu <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/kernel/fpu/xstate.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index bb8b53223925..28e8229b23a7 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -228,13 +228,14 @@ void fpu__init_cpu_xstate(void)
* states can be set here.
*/
xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+
+ /*
+ * MSR_IA32_XSS sets supervisor states managed by XSAVES.
+ */
+ if (boot_cpu_has(X86_FEATURE_XSAVES))
+ wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor());
}

-/*
- * Note that in the future we will likely need a pair of
- * functions here: one for user xstates and the other for
- * system xstates. For now, they are the same.
- */
static bool xfeature_enabled(enum xfeature xfeature)
{
return xfeatures_mask_all & BIT_ULL(xfeature);
@@ -625,9 +626,6 @@ static void do_extra_xstate_size_checks(void)
* the size of the *user* states. If we use it to size a buffer
* that we use 'XSAVES' on, we could potentially overflow the
* buffer because 'XSAVES' saves system states too.
- *
- * Note that we do not currently set any bits on IA32_XSS so
- * 'XCR0 | IA32_XSS == XCR0' for now.
*/
static unsigned int __init get_xsaves_size(void)
{
@@ -750,7 +748,12 @@ void __init fpu__init_system_xstate(void)
cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
xfeatures_mask_all = eax + ((u64)edx << 32);

- /* Place supervisor features in xfeatures_mask_all here */
+ /*
+ * Find supervisor xstates supported by the processor.
+ */
+ cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
+ xfeatures_mask_all |= ecx + ((u64)edx << 32);
+
if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
/*
* This indicates that something really unexpected happened
@@ -810,6 +813,13 @@ void fpu__resume_cpu(void)
*/
if (boot_cpu_has(X86_FEATURE_XSAVE))
xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+
+ /*
+ * Restore IA32_XSS. The same CPUID bit enumerates support
+ * of XSAVES and MSR_IA32_XSS.
+ */
+ if (boot_cpu_has(X86_FEATURE_XSAVES))
+ wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor());
}

/*
--
2.21.0

2020-03-28 16:45:34

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 08/10] x86/fpu: Introduce copy_supervisor_to_kernel()

The XSAVES instruction takes a mask and saves only the features specified
in that mask. The kernel normally specifies that all features be saved.

XSAVES also unconditionally uses the "compacted format" which means that
all specified features are saved next to each other in memory. If a
feature is removed from the mask, all the features after it will "move
up" into earlier locations in the buffer.

Introduce copy_supervisor_to_kernel(), which saves only supervisor states
and then moves those states into the standard location where they are
normally found.

Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/include/asm/fpu/xstate.h | 1 +
arch/x86/kernel/fpu/xstate.c | 84 +++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 92104b298d77..422d8369012a 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -75,6 +75,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+void copy_supervisor_to_kernel(struct xregs_state *xsave);

/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
int validate_user_xstate_header(const struct xstate_header *hdr);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 28e8229b23a7..7d0a9f878b26 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -62,6 +62,7 @@ u64 xfeatures_mask_all __read_mostly;
static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
static unsigned int xstate_sizes[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};

/*
* The XSAVE area of kernel can be in standard or compacted format;
@@ -392,6 +393,33 @@ static void __init setup_xstate_comp_offsets(void)
}
}

+/*
+ * Setup offsets of a supervisor-state-only XSAVES buffer:
+ *
+ * The offsets stored in xstate_comp_offsets[] only work for one specific
+ * value of the Requested Feature BitMap (RFBM). In cases where a different
+ * RFBM value is used, a different set of offsets is required. This set of
+ * offsets is for when RFBM=xfeatures_mask_supervisor().
+ */
+static void __init setup_supervisor_only_offsets(void)
+{
+ unsigned int next_offset;
+ int i;
+
+ next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+
+ for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
+ if (!xfeature_enabled(i) || !xfeature_is_supervisor(i))
+ continue;
+
+ if (xfeature_is_aligned(i))
+ next_offset = ALIGN(next_offset, 64);
+
+ xstate_supervisor_only_offsets[i] = next_offset;
+ next_offset += xstate_sizes[i];
+ }
+}
+
/*
* Print out xstate component offsets and sizes
*/
@@ -790,6 +818,7 @@ void __init fpu__init_system_xstate(void)
fpu__init_prepare_fx_sw_frame();
setup_init_fpu_buf();
setup_xstate_comp_offsets();
+ setup_supervisor_only_offsets();
print_xstate_offset_size();

pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
@@ -1257,6 +1286,61 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
return 0;
}

+/*
+ * Save only supervisor states to the kernel buffer. This blows away all
+ * old states, and is intended to be used only in __fpu__restore_sig(), where
+ * user states are restored from the user buffer.
+ */
+void copy_supervisor_to_kernel(struct xregs_state *xstate)
+{
+ struct xstate_header *header;
+ u64 max_bit, min_bit;
+ u32 lmask, hmask;
+ int err, i;
+
+ if (WARN_ON(!boot_cpu_has(X86_FEATURE_XSAVES)))
+ return;
+
+ if (!xfeatures_mask_supervisor())
+ return;
+
+ max_bit = __fls(xfeatures_mask_supervisor());
+ min_bit = __ffs(xfeatures_mask_supervisor());
+
+ lmask = xfeatures_mask_supervisor();
+ hmask = xfeatures_mask_supervisor() >> 32;
+ XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
+
+ /* We should never fault when copying to a kernel buffer: */
+ if (WARN_ON_FPU(err))
+ return;
+
+ /*
+ * At this point, the buffer has only supervisor states and must be
+ * converted back to normal kernel format.
+ */
+ header = &xstate->header;
+ header->xcomp_bv |= xfeatures_mask_all;
+
+ /*
+ * This only moves states up in the buffer. Start with
+ * the last state and move backwards so that states are
+ * not overwritten until after they are moved. Note:
+ * memmove() allows overlapping src/dst buffers.
+ */
+ for (i = max_bit; i >= min_bit; i--) {
+ u8 *xbuf = (u8 *)xstate;
+
+ if (!((header->xfeatures >> i) & 1))
+ continue;
+
+ /* Move xfeature 'i' into its normal location */
+ memmove(xbuf + xstate_comp_offsets[i],
+ xbuf + xstate_supervisor_only_offsets[i],
+ xstate_sizes[i]);
+ }
+}
+
#ifdef CONFIG_PROC_PID_ARCH_STATUS
/*
* Report the amount of time elapsed in millisecond since last AVX512
--
2.21.0

2020-03-28 16:45:38

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 03/10] x86/fpu/xstate: Separate user and supervisor xfeatures mask

Before the introduction of XSAVES supervisor states, 'xfeatures_mask' is
used at various places to determine XSAVE buffer components and XCR0 bits.
It contains only user xstates. To support supervisor xstates, it is
necessary to separate user and supervisor xstates:

- First, change 'xfeatures_mask' to 'xfeatures_mask_all', which represents
the full set of bits that should ever be set in a kernel XSAVE buffer.
- Introduce xfeatures_mask_supervisor() and xfeatures_mask_user() to
extract relevant xfeatures from xfeatures_mask_all.

v3:
- Change xfeature_enabled() type from static int to static bool while at
it.

v2:
- Fix typo in commit log.
- Move xfeatures_mask_supervisor() from xstate.c to xstate.h.
- Remove printing of user xstates from fpu__init_system_xstate().

Co-developed-by: Fenghua Yu <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 2 +-
arch/x86/include/asm/fpu/xstate.h | 13 ++++-
arch/x86/kernel/fpu/signal.c | 16 +++++--
arch/x86/kernel/fpu/xstate.c | 73 +++++++++++++++++------------
4 files changed, 67 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 44c48e34d799..ccb1bb32ad7d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -92,7 +92,7 @@ static inline void fpstate_init_xstate(struct xregs_state *xsave)
* XRSTORS requires these bits set in xcomp_bv, or it will
* trigger #GP:
*/
- xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask;
+ xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
}

static inline void fpstate_init_fxstate(struct fxregs_state *fx)
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index b08fa823425f..92104b298d77 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -51,7 +51,18 @@
#define REX_PREFIX
#endif

-extern u64 xfeatures_mask;
+extern u64 xfeatures_mask_all;
+
+static inline u64 xfeatures_mask_supervisor(void)
+{
+ return xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+}
+
+static inline u64 xfeatures_mask_user(void)
+{
+ return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
+}
+
extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];

extern void __init update_regset_xstate_info(unsigned int size,
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 585e3651b98f..3df0cfae535f 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -252,13 +252,17 @@ sanitize_restored_xstate(union fpregs_state *state,
*/
static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
{
+ u64 init_bv;
+
if (use_xsave()) {
if (fx_only) {
- u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+ init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
return copy_user_to_fxregs(buf);
} else {
- u64 init_bv = xfeatures_mask & ~xbv;
+ init_bv = xfeatures_mask_user() & ~xbv;
+
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
return copy_user_to_xregs(buf, xbv);
@@ -358,7 +362,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)


if (use_xsave() && !fx_only) {
- u64 init_bv = xfeatures_mask & ~xfeatures;
+ u64 init_bv = xfeatures_mask_user() & ~xfeatures;

if (using_compacted_format()) {
ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
@@ -389,7 +393,9 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

fpregs_lock();
if (use_xsave()) {
- u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+ u64 init_bv;
+
+ init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
}

@@ -465,7 +471,7 @@ void fpu__init_prepare_fx_sw_frame(void)

fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
fx_sw_reserved.extended_size = size;
- fx_sw_reserved.xfeatures = xfeatures_mask;
+ fx_sw_reserved.xfeatures = xfeatures_mask_user();
fx_sw_reserved.xstate_size = fpu_user_xstate_size;

if (IS_ENABLED(CONFIG_IA32_EMULATION) ||
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 497d43335073..bb8b53223925 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -54,9 +54,10 @@ static short xsave_cpuid_features[] __initdata = {
};

/*
- * Mask of xstate features supported by the CPU and the kernel:
+ * This represents the full set of bits that should ever be set in a kernel
+ * XSAVE buffer, both supervisor and user xstates.
*/
-u64 xfeatures_mask __read_mostly;
+u64 xfeatures_mask_all __read_mostly;

static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
static unsigned int xstate_sizes[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
@@ -76,7 +77,7 @@ unsigned int fpu_user_xstate_size;
*/
int cpu_has_xfeatures(u64 xfeatures_needed, const char **feature_name)
{
- u64 xfeatures_missing = xfeatures_needed & ~xfeatures_mask;
+ u64 xfeatures_missing = xfeatures_needed & ~xfeatures_mask_all;

if (unlikely(feature_name)) {
long xfeature_idx, max_idx;
@@ -150,7 +151,7 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
* None of the feature bits are in init state. So nothing else
* to do for us, as the memory layout is up to date.
*/
- if ((xfeatures & xfeatures_mask) == xfeatures_mask)
+ if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
return;

/*
@@ -177,7 +178,7 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
* in a special way already:
*/
feature_bit = 0x2;
- xfeatures = (xfeatures_mask & ~xfeatures) >> 2;
+ xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;

/*
* Update all the remaining memory layouts according to their
@@ -205,19 +206,28 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
*/
void fpu__init_cpu_xstate(void)
{
- if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask)
+ u64 unsup_bits;
+
+ if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask_all)
return;
/*
* Unsupported supervisor xstates should not be found in
* the xfeatures mask.
*/
- WARN_ONCE((xfeatures_mask & XFEATURE_MASK_SUPERVISOR_UNSUPPORTED),
- "x86/fpu: Found unsupported supervisor xstates.\n");
+ unsup_bits = xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
+ WARN_ONCE(unsup_bits, "x86/fpu: Found unsupported supervisor xstates: 0x%llx\n",
+ unsup_bits);

- xfeatures_mask &= ~XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
+ xfeatures_mask_all &= ~XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;

cr4_set_bits(X86_CR4_OSXSAVE);
- xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
+
+ /*
+ * XCR_XFEATURE_ENABLED_MASK (aka. XCR0) sets user features
+ * managed by XSAVE{C, OPT, S} and XRSTOR{S}. Only XSAVE user
+ * states can be set here.
+ */
+ xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
}

/*
@@ -225,9 +235,9 @@ void fpu__init_cpu_xstate(void)
* functions here: one for user xstates and the other for
* system xstates. For now, they are the same.
*/
-static int xfeature_enabled(enum xfeature xfeature)
+static bool xfeature_enabled(enum xfeature xfeature)
{
- return !!(xfeatures_mask & (1UL << xfeature));
+ return xfeatures_mask_all & BIT_ULL(xfeature);
}

/*
@@ -414,7 +424,7 @@ static void __init setup_init_fpu_buf(void)

if (boot_cpu_has(X86_FEATURE_XSAVES))
init_fpstate.xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
- xfeatures_mask;
+ xfeatures_mask_all;

/*
* Init all the features state with header.xfeatures being 0x0
@@ -474,7 +484,7 @@ int using_compacted_format(void)
int validate_user_xstate_header(const struct xstate_header *hdr)
{
/* No unknown or supervisor features may be set */
- if (hdr->xfeatures & ~(xfeatures_mask & XFEATURE_MASK_USER_SUPPORTED))
+ if (hdr->xfeatures & ~xfeatures_mask_user())
return -EINVAL;

/* Userspace must use the uncompacted format */
@@ -609,7 +619,7 @@ static void do_extra_xstate_size_checks(void)


/*
- * Get total size of enabled xstates in XCR0/xfeatures_mask.
+ * Get total size of enabled xstates in XCR0 | IA32_XSS.
*
* Note the SDM's wording here. "sub-function 0" only enumerates
* the size of the *user* states. If we use it to size a buffer
@@ -699,7 +709,7 @@ static int __init init_xstate_size(void)
*/
static void fpu__init_disable_system_xstate(void)
{
- xfeatures_mask = 0;
+ xfeatures_mask_all = 0;
cr4_clear_bits(X86_CR4_OSXSAVE);
setup_clear_cpu_cap(X86_FEATURE_XSAVE);
}
@@ -734,16 +744,21 @@ void __init fpu__init_system_xstate(void)
return;
}

+ /*
+ * Find user xstates supported by the processor.
+ */
cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
- xfeatures_mask = eax + ((u64)edx << 32);
+ xfeatures_mask_all = eax + ((u64)edx << 32);

- if ((xfeatures_mask & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+ /* Place supervisor features in xfeatures_mask_all here */
+ if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
/*
* This indicates that something really unexpected happened
* with the enumeration. Disable XSAVE and try to continue
* booting without it. This is too early to BUG().
*/
- pr_err("x86/fpu: FP/SSE not present amongst the CPU's xstate features: 0x%llx.\n", xfeatures_mask);
+ pr_err("x86/fpu: FP/SSE not present amongst the CPU's xstate features: 0x%llx.\n",
+ xfeatures_mask_all);
goto out_disable;
}

@@ -752,10 +767,10 @@ void __init fpu__init_system_xstate(void)
*/
for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
if (!boot_cpu_has(xsave_cpuid_features[i]))
- xfeatures_mask &= ~BIT(i);
+ xfeatures_mask_all &= ~BIT_ULL(i);
}

- xfeatures_mask &= fpu__get_supported_xfeatures_mask();
+ xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();

/* Enable xstate instructions to be able to continue with initialization: */
fpu__init_cpu_xstate();
@@ -767,8 +782,7 @@ void __init fpu__init_system_xstate(void)
* Update info used for ptrace frames; use standard-format size and no
* supervisor xstates:
*/
- update_regset_xstate_info(fpu_user_xstate_size,
- xfeatures_mask & XFEATURE_MASK_USER_SUPPORTED);
+ update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user());

fpu__init_prepare_fx_sw_frame();
setup_init_fpu_buf();
@@ -776,7 +790,7 @@ void __init fpu__init_system_xstate(void)
print_xstate_offset_size();

pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
- xfeatures_mask,
+ xfeatures_mask_all,
fpu_kernel_xstate_size,
boot_cpu_has(X86_FEATURE_XSAVES) ? "compacted" : "standard");
return;
@@ -795,7 +809,7 @@ void fpu__resume_cpu(void)
* Restore XCR0 on xsave capable CPUs:
*/
if (boot_cpu_has(X86_FEATURE_XSAVE))
- xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
+ xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
}

/*
@@ -840,10 +854,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)

/*
* We should not ever be requesting features that we
- * have not enabled. Remember that xfeatures_mask is
- * what we write to the XCR0 register.
+ * have not enabled.
*/
- WARN_ONCE(!(xfeatures_mask & BIT_ULL(xfeature_nr)),
+ WARN_ONCE(!(xfeatures_mask_all & BIT_ULL(xfeature_nr)),
"get of unsupported state");
/*
* This assumes the last 'xsave*' instruction to
@@ -991,7 +1004,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
*/
memset(&header, 0, sizeof(header));
header.xfeatures = xsave->header.xfeatures;
- header.xfeatures &= XFEATURE_MASK_USER_SUPPORTED;
+ header.xfeatures &= xfeatures_mask_user();

/*
* Copy xregs_state->header:
@@ -1075,7 +1088,7 @@ int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned i
*/
memset(&header, 0, sizeof(header));
header.xfeatures = xsave->header.xfeatures;
- header.xfeatures &= XFEATURE_MASK_USER_SUPPORTED;
+ header.xfeatures &= xfeatures_mask_user();

/*
* Copy xregs_state->header:
--
2.21.0

2020-03-28 16:45:48

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 01/10] x86/fpu/xstate: Rename validate_xstate_header() to validate_user_xstate_header()

From: Fenghua Yu <[email protected]>

The function validate_xstate_header() validates an xstate header coming
from userspace (PTRACE or sigreturn). To make it clear, rename it to
validate_user_xstate_header().

v3:
- Change validate_xstate_header_from_user() to validate_user_xstate_header().

Suggested-by: Dave Hansen <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/fpu/xstate.h | 2 +-
arch/x86/kernel/fpu/regset.c | 2 +-
arch/x86/kernel/fpu/signal.c | 2 +-
arch/x86/kernel/fpu/xstate.c | 6 +++---
4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index c6136d79f8c0..fc4db51f3b53 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -56,6 +56,6 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);

/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-extern int validate_xstate_header(const struct xstate_header *hdr);
+int validate_user_xstate_header(const struct xstate_header *hdr);

#endif
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index d652b939ccfb..bd1d0649f8ce 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -139,7 +139,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
} else {
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
if (!ret)
- ret = validate_xstate_header(&xsave->header);
+ ret = validate_user_xstate_header(&xsave->header);
}

/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 400a05e1c1c5..585e3651b98f 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -366,7 +366,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);

if (!ret && state_size > offsetof(struct xregs_state, header))
- ret = validate_xstate_header(&fpu->state.xsave.header);
+ ret = validate_user_xstate_header(&fpu->state.xsave.header);
}
if (ret)
goto err_out;
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 73fe5979629c..5a580e403044 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -472,7 +472,7 @@ int using_compacted_format(void)
}

/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_xstate_header(const struct xstate_header *hdr)
+int validate_user_xstate_header(const struct xstate_header *hdr)
{
/* No unknown or supervisor features may be set */
if (hdr->xfeatures & (~xfeatures_mask | XFEATURE_MASK_SUPERVISOR))
@@ -1142,7 +1142,7 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)

memcpy(&hdr, kbuf + offset, size);

- if (validate_xstate_header(&hdr))
+ if (validate_user_xstate_header(&hdr))
return -EINVAL;

for (i = 0; i < XFEATURE_MAX; i++) {
@@ -1196,7 +1196,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
if (__copy_from_user(&hdr, ubuf + offset, size))
return -EFAULT;

- if (validate_xstate_header(&hdr))
+ if (validate_user_xstate_header(&hdr))
return -EINVAL;

for (i = 0; i < XFEATURE_MAX; i++) {
--
2.21.0

2020-03-28 16:45:58

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 06/10] x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates

The function sanitize_restored_xstate() sanitizes user xstates of an XSAVE
buffer by setting the buffer's header->xfeatures to the input 'xfeatures',
effectively resetting features not in 'xfeatures' back to the init state.

When supervisor xstates are introduced, it is necessary to make sure only
user xstates are sanitized. Ensure supervisor bits in header->xfeatures
stay set and supervisor states are not modified.

To make names clear, also:

- Rename the function to sanitize_restored_user_xstate().
- Rename input parameter 'xfeatures' to 'xfeatures_from_user'.
- In __fpu__restore_sig(), rename 'xfeatures' to 'user_xfeatures'.

v3:
- Change xfeatures_user to user_xfeatures.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 37 +++++++++++++++++++++++-------------
1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index cd6eafba12da..d09d72334a12 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -211,9 +211,9 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
}

static inline void
-sanitize_restored_xstate(union fpregs_state *state,
- struct user_i387_ia32_struct *ia32_env,
- u64 xfeatures, int fx_only)
+sanitize_restored_user_xstate(union fpregs_state *state,
+ struct user_i387_ia32_struct *ia32_env,
+ u64 xfeatures_from_user, int fx_only)
{
struct xregs_state *xsave = &state->xsave;
struct xstate_header *header = &xsave->header;
@@ -226,13 +226,22 @@ sanitize_restored_xstate(union fpregs_state *state,
*/

/*
- * Init the state that is not present in the memory
- * layout and not enabled by the OS.
+ * 'xfeatures_from_user' might have bits clear which are
+ * set in header->xfeatures. This represents features that
+ * were in init state prior to a signal delivery, and need
+ * to be reset back to the init state. Clear any user
+ * feature bits which are set in the kernel buffer to get
+ * them back to the init state.
+ *
+ * Supervisor state is unchanged by input from userspace.
+ * Ensure supervisor state bits stay set and supervisor
+ * state is not modified.
*/
if (fx_only)
header->xfeatures = XFEATURE_MASK_FPSSE;
else
- header->xfeatures &= xfeatures;
+ header->xfeatures &= xfeatures_from_user |
+ xfeatures_mask_supervisor();
}

if (use_fxsr()) {
@@ -281,7 +290,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
struct task_struct *tsk = current;
struct fpu *fpu = &tsk->thread.fpu;
struct user_i387_ia32_struct env;
- u64 xfeatures = 0;
+ u64 user_xfeatures = 0;
int fx_only = 0;
int ret = 0;

@@ -314,7 +323,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
trace_x86_fpu_xstate_check_failed(fpu);
} else {
state_size = fx_sw_user.xstate_size;
- xfeatures = fx_sw_user.xfeatures;
+ user_xfeatures = fx_sw_user.xfeatures;
}
}

@@ -349,7 +358,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
*/
fpregs_lock();
pagefault_disable();
- ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
+ ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
fpregs_mark_activate();
@@ -362,7 +371,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)


if (use_xsave() && !fx_only) {
- u64 init_bv = xfeatures_mask_user() & ~xfeatures;
+ u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;

if (using_compacted_format()) {
ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
@@ -375,12 +384,13 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
if (ret)
goto err_out;

- sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
+ sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
+ fx_only);

fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- ret = copy_kernel_to_xregs_err(&fpu->state.xsave, xfeatures);
+ ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures);

} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
@@ -389,7 +399,8 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
goto err_out;
}

- sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
+ sanitize_restored_user_xstate(&fpu->state, envp,
+ user_xfeatures, fx_only);

fpregs_lock();
if (use_xsave()) {
--
2.21.0

2020-03-28 16:46:01

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

From: Fenghua Yu <[email protected]>

Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
supervisor states are introduced, supervisor settings (e.g. CET xstates)
must remain active for signals; It is necessary to have separate functions:

- Create fpu__clear_user_states(): clear only user settings for signals;
- Create fpu__clear_all(): clear both user and supervisor settings in
flush_thread().

Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
functions.

v3:
- Put common code into a static function fpu__clear(), with a parameter
clear_user_only.

v2:
- Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
- Revise commit log.

Signed-off-by: Fenghua Yu <[email protected]>
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 3 +-
arch/x86/kernel/fpu/core.c | 49 +++++++++++++++++++----------
arch/x86/kernel/fpu/signal.c | 4 +--
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
5 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index ccb1bb32ad7d..a42fcb4b690d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -31,7 +31,8 @@ extern void fpu__save(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
-extern void fpu__clear(struct fpu *fpu);
+extern void fpu__clear_user_states(struct fpu *fpu);
+extern void fpu__clear_all(struct fpu *fpu);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 12c70840980e..6ba3a8b78bf9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -294,12 +294,10 @@ void fpu__drop(struct fpu *fpu)
* Clear FPU registers by setting them up from
* the init fpstate:
*/
-static inline void copy_init_fpstate_to_fpregs(void)
+static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
{
- fpregs_lock();
-
if (use_xsave())
- copy_kernel_to_xregs(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
else if (static_cpu_has(X86_FEATURE_FXSR))
copy_kernel_to_fxregs(&init_fpstate.fxsave);
else
@@ -307,9 +305,6 @@ static inline void copy_init_fpstate_to_fpregs(void)

if (boot_cpu_has(X86_FEATURE_OSPKE))
copy_init_pkru_to_fpregs();
-
- fpregs_mark_activate();
- fpregs_unlock();
}

/*
@@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
* Called by sys_execve(), by the signal handler code and by various
* error paths.
*/
-void fpu__clear(struct fpu *fpu)
+static void fpu__clear(struct fpu *fpu, int clear_user_only)
{
- WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
+ if (static_cpu_has(X86_FEATURE_FPU)) {
+ fpregs_lock();
+
+ if (clear_user_only) {
+ if (!fpregs_state_valid(fpu, smp_processor_id()) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
+ copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+ } else {
+ copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+ }
+
+ fpregs_mark_activate();
+ fpregs_unlock();
+ return;
+ } else {
+ fpu__drop(fpu);
+ fpu__initialize(fpu);
+ }
+}

- fpu__drop(fpu);
+void fpu__clear_user_states(struct fpu *fpu)
+{
+ WARN_ON_FPU(fpu != &current->thread.fpu);
+ fpu__clear(fpu, 1);
+}

- /*
- * Make sure fpstate is cleared and initialized.
- */
- fpu__initialize(fpu);
- if (static_cpu_has(X86_FEATURE_FPU))
- copy_init_fpstate_to_fpregs();
+void fpu__clear_all(struct fpu *fpu)
+{
+ WARN_ON_FPU(fpu != &current->thread.fpu);
+ fpu__clear(fpu, 0);
}

/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3df0cfae535f..cd6eafba12da 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -289,7 +289,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
IS_ENABLED(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return 0;
}

@@ -416,7 +416,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

err_out:
if (ret)
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return ret;
}

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 3053c85e0e42..87de18c64cf5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -192,7 +192,7 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));

- fpu__clear(&tsk->thread.fpu);
+ fpu__clear_all(&tsk->thread.fpu);
}

void disable_TSC(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 8a29573851a3..35f878e9f91d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -761,7 +761,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
/*
* Ensure the signal handler starts with the new fpu state.
*/
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.21.0

2020-03-28 16:46:03

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 09/10] x86/fpu/xstate: Preserve supervisor states for slow path of __fpu__restore_sig()

The signal return code is responsible for taking an XSAVE buffer present
in user memory and loading it into the hardware registers. This
operation only affects user XSAVE state and never affects supervisor state.

The fast path through this code simply points XRSTOR directly at the
user buffer. However, due to page faults, this XRSTOR can fail. If it
fails, the signal return code falls back to a slow path which can
tolerate page faults.

That slow path copies the xfeatures one by one out of the user buffer
into the task's fpu state area. However, by being in a context where it
can handle page faults, the code can also schedule. That exposes us to
the whims of the lazy-fpu-load code. That code currently can not
comprehend valid fpregs (the supervisor state) mixed with the *not*
valid user fpregs. To do that, we would need to track whether
individual XSAVE components had valid fpregs or fpstate.

If we left the current code in place, the context-switch code would
think it has an up-to-date fpstate and would fail to save the supervisor
state when scheduling the task out. When scheduling back in, it would
likely restore stale supervisor state.

To fix that, we need to save the supervisor state before the slow path.
That way, the supervisor state is always up-to-date and the task can
survive being scheduled.

Modify copy_user_to_fpregs_zeroing() so that if it fails, fpregs are not
zeroed, and there is no need for fpregs_deactivate() and supervisor states
are preserved.

Move set_thread_flag(TIF_NEED_FPU_LOAD) to the slow path. Without doing
this, the fast path also needs supervisor states to be saved first.

Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 53 +++++++++++++++++++-----------------
1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d09d72334a12..545ca4314096 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -262,19 +262,23 @@ sanitize_restored_user_xstate(union fpregs_state *state,
static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
{
u64 init_bv;
+ int r;

if (use_xsave()) {
if (fx_only) {
init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;

- copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_fxregs(buf);
+ r = copy_user_to_fxregs(buf);
+ if (!r)
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return r;
} else {
init_bv = xfeatures_mask_user() & ~xbv;

- if (unlikely(init_bv))
+ r = copy_user_to_xregs(buf, xbv);
+ if (!r && unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_xregs(buf, xbv);
+ return r;
}
} else if (use_fxsr()) {
return copy_user_to_fxregs(buf);
@@ -327,28 +331,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}
}

- /*
- * The current state of the FPU registers does not matter. By setting
- * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
- * is not modified on context switch and that the xstate is considered
- * to be loaded again on return to userland (overriding last_cpu avoids
- * the optimisation).
- */
- set_thread_flag(TIF_NEED_FPU_LOAD);
- __fpu_invalidate_fpregs_state(fpu);
-
if ((unsigned long)buf_fx % 64)
fx_only = 1;
- /*
- * For 32-bit frames with fxstate, copy the fxstate so it can be
- * reconstructed later.
- */
- if (ia32_fxstate) {
- ret = __copy_from_user(&env, buf, sizeof(env));
- if (ret)
- goto err_out;
- envp = &env;
- } else {
+
+ if (!ia32_fxstate) {
/*
* Attempt to restore the FPU registers directly from user
* memory. For that to succeed, the user access cannot cause
@@ -365,10 +351,27 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_unlock();
return 0;
}
- fpregs_deactivate(fpu);
fpregs_unlock();
+ } else {
+ /*
+ * For 32-bit frames with fxstate, copy the fxstate so it can
+ * be reconstructed later.
+ */
+ ret = __copy_from_user(&env, buf, sizeof(env));
+ if (ret)
+ goto err_out;
+ envp = &env;
}

+ /*
+ * The current state of the FPU registers does not matter. By setting
+ * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
+ * is not modified on context switch and that the xstate is considered
+ * to be loaded again on return to userland (overriding last_cpu avoids
+ * the optimisation).
+ */
+ set_thread_flag(TIF_NEED_FPU_LOAD);
+ __fpu_invalidate_fpregs_state(fpu);

if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
--
2.21.0

2020-03-28 16:46:23

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 07/10] x86/fpu/xstate: Update copy_kernel_to_xregs_err() for XSAVES supervisor states

The function copy_kernel_to_xregs_err() uses XRSTOR, which can work with
standard or compacted format without supervisor xstates. However, when
supervisor xstates are present, XRSTORS must be used. Fix it by using
XRSTORS when XSAVES is enabled.

I also considered if there were additional cases where XRSTOR might be
mistakenly called instead of XRSTORS. There are only three XRSTOR sites
in kernel:

1. copy_kernel_to_xregs_booting(), already switches between XRSTOR and
XRSTORS based on X86_FEATURE_XSAVES.
2. copy_user_to_xregs(), which *needs* XRSTOR because it is copying from
userspace and must never copy supervisor state with XRSTORS.
3. copy_kernel_to_xregs_err() mistakenly used XRSTOR only. Fixed in
this patch.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a42fcb4b690d..42159f45bf9c 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -400,7 +400,10 @@ static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
u32 hmask = mask >> 32;
int err;

- XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
+ if (static_cpu_has(X86_FEATURE_XSAVES))
+ XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
+ else
+ XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);

return err;
}
--
2.21.0

2020-04-28 17:13:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 01/10] x86/fpu/xstate: Rename validate_xstate_header() to validate_user_xstate_header()

On Sat, Mar 28, 2020 at 09:42:58AM -0700, Yu-cheng Yu wrote:
> From: Fenghua Yu <[email protected]>
>
> The function validate_xstate_header() validates an xstate header coming
> from userspace (PTRACE or sigreturn). To make it clear, rename it to
> validate_user_xstate_header().

For the next time: patch version history like this here:

> v3:
> - Change validate_xstate_header_from_user() to validate_user_xstate_header().

... goes ...

>
> Suggested-by: Dave Hansen <[email protected]>
> Signed-off-by: Fenghua Yu <[email protected]>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Reviewed-by: Dave Hansen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> Reviewed-by: Borislav Petkov <[email protected]>
> ---

<--- ... here so that, when someone applies the patch, the version
history doesn't become part of the commit message.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-28 17:19:16

by Yu-cheng Yu

[permalink] [raw]
Subject: Re: [PATCH v3 01/10] x86/fpu/xstate: Rename validate_xstate_header() to validate_user_xstate_header()

On Tue, 2020-04-28 at 19:11 +0200, Borislav Petkov wrote:
> On Sat, Mar 28, 2020 at 09:42:58AM -0700, Yu-cheng Yu wrote:
> > From: Fenghua Yu <[email protected]>
> >
> > The function validate_xstate_header() validates an xstate header coming
> > from userspace (PTRACE or sigreturn). To make it clear, rename it to
> > validate_user_xstate_header().
>
> For the next time: patch version history like this here:
>
> > v3:
> > - Change validate_xstate_header_from_user() to validate_user_xstate_header().
>
> ... goes ...
>
> > Suggested-by: Dave Hansen <[email protected]>
> > Signed-off-by: Fenghua Yu <[email protected]>
> > Signed-off-by: Yu-cheng Yu <[email protected]>
> > Reviewed-by: Dave Hansen <[email protected]>
> > Reviewed-by: Tony Luck <[email protected]>
> > Reviewed-by: Borislav Petkov <[email protected]>
> > ---
>
> <--- ... here so that, when someone applies the patch, the version
> history doesn't become part of the commit message.
>

Yes, I will do that the next time. Thanks!

2020-04-29 09:30:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

On Sat, Mar 28, 2020 at 09:43:02AM -0700, Yu-cheng Yu wrote:
> @@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
> * Called by sys_execve(), by the signal handler code and by various
> * error paths.
> */
> -void fpu__clear(struct fpu *fpu)
> +static void fpu__clear(struct fpu *fpu, int clear_user_only)

I said:

"fpu__clear(struct fpu *fpu, bool user_only)"
^^^^^^^^^^^^^^

you made it

..., int clear_user_only)

Why?

If you agree with the review comment, then please do it as suggested. If
you don't, then say why you don't.

Why would you do something in-between?

> {
> - WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */

Why are you moving this into the callers when *both* do it?

> + if (static_cpu_has(X86_FEATURE_FPU)) {

Flip this logic:

if (!static_cpu_has(X86_FEATURE_FPU)) {
fpu__drop(fpu);
fpu__initialize(fpu);
return;
}

fpregs_lock();
...

to save an indentation level and make the important case more readable
and locking more prominent.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-29 16:10:07

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

From: Fenghua Yu <[email protected]>

Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
supervisor states are introduced, supervisor settings (e.g. CET xstates)
must remain active for signals; It is necessary to have separate functions:

- Create fpu__clear_user_states(): clear only user settings for signals;
- Create fpu__clear_all(): clear both user and supervisor settings in
flush_thread().

Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
functions.

Signed-off-by: Fenghua Yu <[email protected]>
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>

v3:
- Put common code into a static function fpu__clear(), with a parameter
user_only.

v2:
- Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
- Revise commit log.
---
arch/x86/include/asm/fpu/internal.h | 3 +-
arch/x86/kernel/fpu/core.c | 49 +++++++++++++++++++----------
arch/x86/kernel/fpu/signal.c | 4 +--
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
5 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index ccb1bb32ad7d..a42fcb4b690d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -31,7 +31,8 @@ extern void fpu__save(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
-extern void fpu__clear(struct fpu *fpu);
+extern void fpu__clear_user_states(struct fpu *fpu);
+extern void fpu__clear_all(struct fpu *fpu);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 12c70840980e..7fddd5d60443 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -294,12 +294,10 @@ void fpu__drop(struct fpu *fpu)
* Clear FPU registers by setting them up from
* the init fpstate:
*/
-static inline void copy_init_fpstate_to_fpregs(void)
+static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
{
- fpregs_lock();
-
if (use_xsave())
- copy_kernel_to_xregs(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
else if (static_cpu_has(X86_FEATURE_FXSR))
copy_kernel_to_fxregs(&init_fpstate.fxsave);
else
@@ -307,9 +305,6 @@ static inline void copy_init_fpstate_to_fpregs(void)

if (boot_cpu_has(X86_FEATURE_OSPKE))
copy_init_pkru_to_fpregs();
-
- fpregs_mark_activate();
- fpregs_unlock();
}

/*
@@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
* Called by sys_execve(), by the signal handler code and by various
* error paths.
*/
-void fpu__clear(struct fpu *fpu)
+static void fpu__clear(struct fpu *fpu, int user_only)
{
- WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
+ WARN_ON_FPU(fpu != &current->thread.fpu);

- fpu__drop(fpu);
+ if (!static_cpu_has(X86_FEATURE_FPU)) {
+ fpu__drop(fpu);
+ fpu__initialize(fpu);
+ return;
+ }

- /*
- * Make sure fpstate is cleared and initialized.
- */
- fpu__initialize(fpu);
- if (static_cpu_has(X86_FEATURE_FPU))
- copy_init_fpstate_to_fpregs();
+ fpregs_lock();
+
+ if (user_only) {
+ if (!fpregs_state_valid(fpu, smp_processor_id()) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
+ copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+ } else {
+ copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+ }
+
+ fpregs_mark_activate();
+ fpregs_unlock();
+}
+
+void fpu__clear_user_states(struct fpu *fpu)
+{
+ fpu__clear(fpu, 1);
+}
+
+void fpu__clear_all(struct fpu *fpu)
+{
+ fpu__clear(fpu, 0);
}

/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3df0cfae535f..cd6eafba12da 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -289,7 +289,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
IS_ENABLED(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return 0;
}

@@ -416,7 +416,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

err_out:
if (ret)
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return ret;
}

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9da70b279dad..de182b84723a 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -191,7 +191,7 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));

- fpu__clear(&tsk->thread.fpu);
+ fpu__clear_all(&tsk->thread.fpu);
}

void disable_TSC(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 83b74fb38c8f..0052bbe5dfd4 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -732,7 +732,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
/*
* Ensure the signal handler starts with the new fpu state.
*/
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.21.0

2020-04-29 16:12:04

by Yu-cheng Yu

[permalink] [raw]
Subject: Re: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

On Wed, 2020-04-29 at 11:27 +0200, Borislav Petkov wrote:
> On Sat, Mar 28, 2020 at 09:43:02AM -0700, Yu-cheng Yu wrote:
> > @@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
> > * Called by sys_execve(), by the signal handler code and by various
> > * error paths.
> > */
> > -void fpu__clear(struct fpu *fpu)
> > +static void fpu__clear(struct fpu *fpu, int clear_user_only)
>
> I said:
>
> "fpu__clear(struct fpu *fpu, bool user_only)"
> ^^^^^^^^^^^^^^
>
> you made it
>
> ..., int clear_user_only)
>
> Why?
>
> If you agree with the review comment, then please do it as suggested. If
> you don't, then say why you don't.
>
> Why would you do something in-between?
>
> > {
> > - WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
>
> Why are you moving this into the callers when *both* do it?
>
> > + if (static_cpu_has(X86_FEATURE_FPU)) {
>
> Flip this logic:
>
> if (!static_cpu_has(X86_FEATURE_FPU)) {
> fpu__drop(fpu);
> fpu__initialize(fpu);
> return;
> }
>
> fpregs_lock();
> ...
>
> to save an indentation level and make the important case more readable
> and locking more prominent.

All fixed.

Thanks,
Yu-cheng

2020-04-29 16:41:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

On Wed, Apr 29, 2020 at 09:06:44AM -0700, Yu-cheng Yu wrote:
> From: Fenghua Yu <[email protected]>
>
> Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
> supervisor states are introduced, supervisor settings (e.g. CET xstates)
> must remain active for signals; It is necessary to have separate functions:
>
> - Create fpu__clear_user_states(): clear only user settings for signals;
> - Create fpu__clear_all(): clear both user and supervisor settings in
> flush_thread().
>
> Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
> functions.
>
> Signed-off-by: Fenghua Yu <[email protected]>
> Co-developed-by: Yu-cheng Yu <[email protected]>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Reviewed-by: Dave Hansen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
>
> v3:
> - Put common code into a static function fpu__clear(), with a parameter
> user_only.
>
> v2:
> - Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
> - Revise commit log.

Try applying that patch from this mail yourself and see whether the
patch changelog will remain in the commit message or it will get
discarded.

> @@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
> * Called by sys_execve(), by the signal handler code and by various
> * error paths.
> */
> -void fpu__clear(struct fpu *fpu)
> +static void fpu__clear(struct fpu *fpu, int user_only)
> {
> - WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
> + WARN_ON_FPU(fpu != &current->thread.fpu);

Why did you remove the side comment?

Is it wrong?

Why do you do such arbitrary changes which are not needed instead of
concentrating on only the changes the patch should do?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-29 17:05:53

by Yu-cheng Yu

[permalink] [raw]
Subject: Re: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

On Wed, 2020-04-29 at 18:39 +0200, Borislav Petkov wrote:
> On Wed, Apr 29, 2020 at 09:06:44AM -0700, Yu-cheng Yu wrote:
> > From: Fenghua Yu <[email protected]>
> >
> > Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
> > supervisor states are introduced, supervisor settings (e.g. CET xstates)
> > must remain active for signals; It is necessary to have separate functions:
> >
> > - Create fpu__clear_user_states(): clear only user settings for signals;
> > - Create fpu__clear_all(): clear both user and supervisor settings in
> > flush_thread().
> >
> > Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
> > functions.
> >
> > Signed-off-by: Fenghua Yu <[email protected]>
> > Co-developed-by: Yu-cheng Yu <[email protected]>
> > Signed-off-by: Yu-cheng Yu <[email protected]>
> > Reviewed-by: Dave Hansen <[email protected]>
> > Reviewed-by: Tony Luck <[email protected]>
> >
> > v3:
> > - Put common code into a static function fpu__clear(), with a parameter
> > user_only.
> >
> > v2:
> > - Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
> > - Revise commit log.
>
> Try applying that patch from this mail yourself and see whether the
> patch changelog will remain in the commit message or it will get
> discarded.

My mistake! I will fix it.

>
> > @@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
> > * Called by sys_execve(), by the signal handler code and by various
> > * error paths.
> > */
> > -void fpu__clear(struct fpu *fpu)
> > +static void fpu__clear(struct fpu *fpu, int user_only)
> > {
> > - WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
> > + WARN_ON_FPU(fpu != &current->thread.fpu);
>
> Why did you remove the side comment?
>
> Is it wrong?
>
> Why do you do such arbitrary changes which are not needed instead of
> concentrating on only the changes the patch should do?

It has been some time since Thomas commented on this tail comment.
https://lore.kernel.org/lkml/[email protected]/

I think why not fixing it while at it.

Yu-cheng

2020-04-29 17:35:44

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

On Wed, Apr 29, 2020 at 10:02:46AM -0700, Yu-cheng Yu wrote:
> It has been some time since Thomas commented on this tail comment.
> https://lore.kernel.org/lkml/[email protected]/
>
> I think why not fixing it while at it.

So "fixing it" means removing it or putting it *over* the line?

If you think the comment is obvious and superfluous, then say so in the
commit message "remove obvious side-comment in fpu__clear(), while at
it." to let readers know *why* you've done it, especially since it is a
side-change, not really related to what the patch is trying to do.

Otherwise, readers like me wonder: why is he doing that? What else is
he doing in that patch that doesn't belong here? Why isn't that patch
straightforward doing one thing and one thing only and doing other
stuff?

This certainly doesn't make review easier.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-04-29 17:45:14

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

From: Fenghua Yu <[email protected]>

Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
supervisor states are introduced, supervisor settings (e.g. CET xstates)
must remain active for signals; It is necessary to have separate functions:

- Create fpu__clear_user_states(): clear only user settings for signals;
- Create fpu__clear_all(): clear both user and supervisor settings in
flush_thread().

Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
functions.

Signed-off-by: Fenghua Yu <[email protected]>
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
v3:
- Put common code into a static function fpu__clear(), with a parameter
user_only.

v2:
- Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
- Revise commit log.

arch/x86/include/asm/fpu/internal.h | 3 +-
arch/x86/kernel/fpu/core.c | 49 +++++++++++++++++++----------
arch/x86/kernel/fpu/signal.c | 4 +--
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
5 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index ccb1bb32ad7d..a42fcb4b690d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -31,7 +31,8 @@ extern void fpu__save(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
-extern void fpu__clear(struct fpu *fpu);
+extern void fpu__clear_user_states(struct fpu *fpu);
+extern void fpu__clear_all(struct fpu *fpu);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 12c70840980e..7fddd5d60443 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -294,12 +294,10 @@ void fpu__drop(struct fpu *fpu)
* Clear FPU registers by setting them up from
* the init fpstate:
*/
-static inline void copy_init_fpstate_to_fpregs(void)
+static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
{
- fpregs_lock();
-
if (use_xsave())
- copy_kernel_to_xregs(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
else if (static_cpu_has(X86_FEATURE_FXSR))
copy_kernel_to_fxregs(&init_fpstate.fxsave);
else
@@ -307,9 +305,6 @@ static inline void copy_init_fpstate_to_fpregs(void)

if (boot_cpu_has(X86_FEATURE_OSPKE))
copy_init_pkru_to_fpregs();
-
- fpregs_mark_activate();
- fpregs_unlock();
}

/*
@@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
* Called by sys_execve(), by the signal handler code and by various
* error paths.
*/
-void fpu__clear(struct fpu *fpu)
+static void fpu__clear(struct fpu *fpu, int user_only)
{
- WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
+ WARN_ON_FPU(fpu != &current->thread.fpu);

- fpu__drop(fpu);
+ if (!static_cpu_has(X86_FEATURE_FPU)) {
+ fpu__drop(fpu);
+ fpu__initialize(fpu);
+ return;
+ }

- /*
- * Make sure fpstate is cleared and initialized.
- */
- fpu__initialize(fpu);
- if (static_cpu_has(X86_FEATURE_FPU))
- copy_init_fpstate_to_fpregs();
+ fpregs_lock();
+
+ if (user_only) {
+ if (!fpregs_state_valid(fpu, smp_processor_id()) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
+ copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+ } else {
+ copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+ }
+
+ fpregs_mark_activate();
+ fpregs_unlock();
+}
+
+void fpu__clear_user_states(struct fpu *fpu)
+{
+ fpu__clear(fpu, 1);
+}
+
+void fpu__clear_all(struct fpu *fpu)
+{
+ fpu__clear(fpu, 0);
}

/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3df0cfae535f..cd6eafba12da 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -289,7 +289,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
IS_ENABLED(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return 0;
}

@@ -416,7 +416,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

err_out:
if (ret)
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return ret;
}

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9da70b279dad..de182b84723a 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -191,7 +191,7 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));

- fpu__clear(&tsk->thread.fpu);
+ fpu__clear_all(&tsk->thread.fpu);
}

void disable_TSC(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 83b74fb38c8f..0052bbe5dfd4 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -732,7 +732,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
/*
* Ensure the signal handler starts with the new fpu state.
*/
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.21.0

2020-04-29 19:12:33

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 05/10] x86/fpu/xstate: Define new functions for clearing fpregs and xstates

From: Fenghua Yu <[email protected]>

Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES
supervisor states are introduced, supervisor settings (e.g. CET xstates)
must remain active for signals; It is necessary to have separate functions:

- Create fpu__clear_user_states(): clear only user settings for signals;
- Create fpu__clear_all(): clear both user and supervisor settings in
flush_thread().

Also modify copy_init_fpstate_to_fpregs() to take a mask from above two
functions.

Remove obvious side-comment in fpu__clear(), while at it.

Signed-off-by: Fenghua Yu <[email protected]>
Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
v3:
- Put common code into a static function fpu__clear(), with a parameter
user_only.

v2:
- Fixed an issue where fpu__clear_user_states() drops supervisor xstates.
- Revise commit log.

arch/x86/include/asm/fpu/internal.h | 3 +-
arch/x86/kernel/fpu/core.c | 49 +++++++++++++++++++----------
arch/x86/kernel/fpu/signal.c | 4 +--
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
5 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index ccb1bb32ad7d..a42fcb4b690d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -31,7 +31,8 @@ extern void fpu__save(struct fpu *fpu);
extern int fpu__restore_sig(void __user *buf, int ia32_frame);
extern void fpu__drop(struct fpu *fpu);
extern int fpu__copy(struct task_struct *dst, struct task_struct *src);
-extern void fpu__clear(struct fpu *fpu);
+extern void fpu__clear_user_states(struct fpu *fpu);
+extern void fpu__clear_all(struct fpu *fpu);
extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 12c70840980e..7fddd5d60443 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -294,12 +294,10 @@ void fpu__drop(struct fpu *fpu)
* Clear FPU registers by setting them up from
* the init fpstate:
*/
-static inline void copy_init_fpstate_to_fpregs(void)
+static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
{
- fpregs_lock();
-
if (use_xsave())
- copy_kernel_to_xregs(&init_fpstate.xsave, -1);
+ copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
else if (static_cpu_has(X86_FEATURE_FXSR))
copy_kernel_to_fxregs(&init_fpstate.fxsave);
else
@@ -307,9 +305,6 @@ static inline void copy_init_fpstate_to_fpregs(void)

if (boot_cpu_has(X86_FEATURE_OSPKE))
copy_init_pkru_to_fpregs();
-
- fpregs_mark_activate();
- fpregs_unlock();
}

/*
@@ -318,18 +313,40 @@ static inline void copy_init_fpstate_to_fpregs(void)
* Called by sys_execve(), by the signal handler code and by various
* error paths.
*/
-void fpu__clear(struct fpu *fpu)
+static void fpu__clear(struct fpu *fpu, int user_only)
{
- WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
+ WARN_ON_FPU(fpu != &current->thread.fpu);

- fpu__drop(fpu);
+ if (!static_cpu_has(X86_FEATURE_FPU)) {
+ fpu__drop(fpu);
+ fpu__initialize(fpu);
+ return;
+ }

- /*
- * Make sure fpstate is cleared and initialized.
- */
- fpu__initialize(fpu);
- if (static_cpu_has(X86_FEATURE_FPU))
- copy_init_fpstate_to_fpregs();
+ fpregs_lock();
+
+ if (user_only) {
+ if (!fpregs_state_valid(fpu, smp_processor_id()) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
+ copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+ } else {
+ copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+ }
+
+ fpregs_mark_activate();
+ fpregs_unlock();
+}
+
+void fpu__clear_user_states(struct fpu *fpu)
+{
+ fpu__clear(fpu, 1);
+}
+
+void fpu__clear_all(struct fpu *fpu)
+{
+ fpu__clear(fpu, 0);
}

/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3df0cfae535f..cd6eafba12da 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -289,7 +289,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
IS_ENABLED(CONFIG_IA32_EMULATION));

if (!buf) {
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return 0;
}

@@ -416,7 +416,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)

err_out:
if (ret)
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
return ret;
}

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9da70b279dad..de182b84723a 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -191,7 +191,7 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));

- fpu__clear(&tsk->thread.fpu);
+ fpu__clear_all(&tsk->thread.fpu);
}

void disable_TSC(void)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 83b74fb38c8f..0052bbe5dfd4 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -732,7 +732,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
/*
* Ensure the signal handler starts with the new fpu state.
*/
- fpu__clear(fpu);
+ fpu__clear_user_states(fpu);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.21.0

2020-05-07 13:14:11

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 06/10] x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates

On Sat, Mar 28, 2020 at 09:43:03AM -0700, Yu-cheng Yu wrote:
> The function sanitize_restored_xstate() sanitizes user xstates of an XSAVE
> buffer by setting the buffer's header->xfeatures to the input 'xfeatures',

It doesn't set them to the input xfeatures. I see a "&=" there.

> effectively resetting features not in 'xfeatures' back to the init state.
>
> When supervisor xstates are introduced, it is necessary to make sure only
> user xstates are sanitized. Ensure supervisor bits in header->xfeatures
> stay set and supervisor states are not modified.
>
> To make names clear, also:
>
> - Rename the function to sanitize_restored_user_xstate().
> - Rename input parameter 'xfeatures' to 'xfeatures_from_user'.
> - In __fpu__restore_sig(), rename 'xfeatures' to 'user_xfeatures'.
>
> v3:
> - Change xfeatures_user to user_xfeatures.
>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Reviewed-by: Dave Hansen <[email protected]>
> ---
> arch/x86/kernel/fpu/signal.c | 37 +++++++++++++++++++++++-------------
> 1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index cd6eafba12da..d09d72334a12 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -211,9 +211,9 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
> }
>
> static inline void
> -sanitize_restored_xstate(union fpregs_state *state,
> - struct user_i387_ia32_struct *ia32_env,
> - u64 xfeatures, int fx_only)
> +sanitize_restored_user_xstate(union fpregs_state *state,
> + struct user_i387_ia32_struct *ia32_env,
> + u64 xfeatures_from_user, int fx_only)

Name those user_xfeatures too.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-05-07 13:32:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 07/10] x86/fpu/xstate: Update copy_kernel_to_xregs_err() for XSAVES supervisor states

On Sat, Mar 28, 2020 at 09:43:04AM -0700, Yu-cheng Yu wrote:
> The function copy_kernel_to_xregs_err() uses XRSTOR, which can work with
> standard or compacted format without supervisor xstates. However, when
> supervisor xstates are present, XRSTORS must be used. Fix it by using
> XRSTORS when XSAVES is enabled.
>
> I also considered if there were additional cases where XRSTOR might be
> mistakenly called instead of XRSTORS. There are only three XRSTOR sites
> in kernel:
>
> 1. copy_kernel_to_xregs_booting(), already switches between XRSTOR and
> XRSTORS based on X86_FEATURE_XSAVES.
> 2. copy_user_to_xregs(), which *needs* XRSTOR because it is copying from
> userspace and must never copy supervisor state with XRSTORS.
> 3. copy_kernel_to_xregs_err() mistakenly used XRSTOR only. Fixed in
> this patch.

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-05-07 15:58:00

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 06/10] x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates

The function sanitize_restored_xstate() sanitizes user xstates of an XSAVE
buffer by clearing bits not in the input 'xfeatures' from the buffer's
header->xfeatures, effectively resetting those features back to the init
state.

When supervisor xstates are introduced, it is necessary to make sure only
user xstates are sanitized. Ensure supervisor bits in header->xfeatures
stay set and supervisor states are not modified.

To make names clear, also:

- Rename the function to sanitize_restored_user_xstate().
- Rename input parameter 'xfeatures' to 'user_xfeatures'.
- In __fpu__restore_sig(), rename 'xfeatures' to 'user_xfeatures'.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
v3:
- Change xfeatures_user to user_xfeatures.

arch/x86/kernel/fpu/signal.c | 37 +++++++++++++++++++++++-------------
1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index cd6eafba12da..d09d72334a12 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -211,9 +211,9 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
}

static inline void
-sanitize_restored_xstate(union fpregs_state *state,
- struct user_i387_ia32_struct *ia32_env,
- u64 xfeatures, int fx_only)
+sanitize_restored_user_xstate(union fpregs_state *state,
+ struct user_i387_ia32_struct *ia32_env,
+ u64 user_xfeatures, int fx_only)
{
struct xregs_state *xsave = &state->xsave;
struct xstate_header *header = &xsave->header;
@@ -226,13 +226,22 @@ sanitize_restored_xstate(union fpregs_state *state,
*/

/*
- * Init the state that is not present in the memory
- * layout and not enabled by the OS.
+ * 'user_xfeatures' might have bits clear which are
+ * set in header->xfeatures. This represents features that
+ * were in init state prior to a signal delivery, and need
+ * to be reset back to the init state. Clear any user
+ * feature bits which are set in the kernel buffer to get
+ * them back to the init state.
+ *
+ * Supervisor state is unchanged by input from userspace.
+ * Ensure supervisor state bits stay set and supervisor
+ * state is not modified.
*/
if (fx_only)
header->xfeatures = XFEATURE_MASK_FPSSE;
else
- header->xfeatures &= xfeatures;
+ header->xfeatures &= user_xfeatures |
+ xfeatures_mask_supervisor();
}

if (use_fxsr()) {
@@ -281,7 +290,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
struct task_struct *tsk = current;
struct fpu *fpu = &tsk->thread.fpu;
struct user_i387_ia32_struct env;
- u64 xfeatures = 0;
+ u64 user_xfeatures = 0;
int fx_only = 0;
int ret = 0;

@@ -314,7 +323,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
trace_x86_fpu_xstate_check_failed(fpu);
} else {
state_size = fx_sw_user.xstate_size;
- xfeatures = fx_sw_user.xfeatures;
+ user_xfeatures = fx_sw_user.xfeatures;
}
}

@@ -349,7 +358,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
*/
fpregs_lock();
pagefault_disable();
- ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
+ ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
fpregs_mark_activate();
@@ -362,7 +371,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)


if (use_xsave() && !fx_only) {
- u64 init_bv = xfeatures_mask_user() & ~xfeatures;
+ u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;

if (using_compacted_format()) {
ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
@@ -375,12 +384,13 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
if (ret)
goto err_out;

- sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
+ sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
+ fx_only);

fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- ret = copy_kernel_to_xregs_err(&fpu->state.xsave, xfeatures);
+ ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures);

} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
@@ -389,7 +399,8 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
goto err_out;
}

- sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
+ sanitize_restored_user_xstate(&fpu->state, envp,
+ user_xfeatures, fx_only);

fpregs_lock();
if (use_xsave()) {
--
2.21.0

2020-05-07 16:01:25

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 07/10] x86/fpu/xstate: Update copy_kernel_to_xregs_err() for XSAVES supervisor states

The function copy_kernel_to_xregs_err() uses XRSTOR, which can work with
standard or compacted format without supervisor xstates. However, when
supervisor xstates are present, XRSTORS must be used. Fix it by using
XRSTORS when XSAVES is enabled.

I also considered if there were additional cases where XRSTOR might be
mistakenly called instead of XRSTORS. There are only three XRSTOR sites
in kernel:

1. copy_kernel_to_xregs_booting(), already switches between XRSTOR and
XRSTORS based on X86_FEATURE_XSAVES.
2. copy_user_to_xregs(), which *needs* XRSTOR because it is copying from
userspace and must never copy supervisor state with XRSTORS.
3. copy_kernel_to_xregs_err() mistakenly used XRSTOR only. Fixed it.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
arch/x86/include/asm/fpu/internal.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a42fcb4b690d..42159f45bf9c 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -400,7 +400,10 @@ static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
u32 hmask = mask >> 32;
int err;

- XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
+ if (static_cpu_has(X86_FEATURE_XSAVES))
+ XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
+ else
+ XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);

return err;
}
--
2.21.0

2020-05-10 08:48:41

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 09/10] x86/fpu/xstate: Preserve supervisor states for slow path of __fpu__restore_sig()

On Sat, Mar 28, 2020 at 09:43:06AM -0700, Yu-cheng Yu wrote:
> The signal return code is responsible for taking an XSAVE buffer present
> in user memory and loading it into the hardware registers. This
> operation only affects user XSAVE state and never affects supervisor state.
>
> The fast path through this code simply points XRSTOR directly at the
> user buffer. However, due to page faults, this XRSTOR can fail. If it
^^^^^^^^^^^^^^^^^^

"However, since user memory is not guaranteed to be always mapped, ... "

> fails, the signal return code falls back to a slow path which can
> tolerate page faults.
>
> That slow path copies the xfeatures one by one out of the user buffer
> into the task's fpu state area. However, by being in a context where it
> can handle page faults, the code can also schedule. That exposes us to
^^
Pls use passive voice in your commit message: no "we" or "I", etc, and
describe your changes in imperative mood.

Fix this in all your commit messages pls.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-05-10 08:51:13

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 10/10] x86/fpu/xstate: Restore supervisor states for signal return

On Sat, Mar 28, 2020 at 09:43:07AM -0700, Yu-cheng Yu wrote:
> v3:
> - Change copy_xregs_to_kernel() to copy_supervisor_to_kernel(), which is
> introduced in a previous patch.
> - Update commit log.
>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Reviewed-by: Dave Hansen <[email protected]>
> ---
> arch/x86/kernel/fpu/signal.c | 27 ++++++++++++++++++++++-----
> 1 file changed, 22 insertions(+), 5 deletions(-)

Took Sebastian and me a while to grok the whole situation so pls add the
comments below to that patch when sending it in your next revision.

Thx.

---
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 4dad5afc938d..06e88d6ebb07 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -347,7 +347,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
- /* Restore supervisor states */
+
+ /*
+ * Restore supervisor states: previous context switch
+ * etc has done XSAVES and saved the supervisor states
+ * in the kernel buffer from which they can be restored
+ * now.
+ *
+ * We cannot do a single XRSTORS here - which would
+ * be nice - because the rest of the FPU registers are
+ * being restored from a user buffer directly. The
+ * single XRSTORS happens below, when the user buffer
+ * has been copied to the kernel one.
+ */
if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
xfeatures_mask_supervisor())
copy_kernel_to_xregs(&fpu->state.xsave,
@@ -369,15 +381,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}

/*
- * Supervisor states are not modified by user space input. Save
- * current supervisor states first.
* By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
* not modified on context switch and that the xstate is considered
* to be loaded again on return to userland (overriding last_cpu avoids
* the optimisation).
*/
fpregs_lock();
+
if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+
+ /*
+ * Supervisor states are not modified by user space input. Save
+ * current supervisor states first and invalidate the FPU regs.
+ */
if (xfeatures_mask_supervisor())
copy_supervisor_to_kernel(&fpu->state.xsave);
set_thread_flag(TIF_NEED_FPU_LOAD);
@@ -405,6 +421,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+
/*
* Restore previously saved supervisor xstates along with
* copied-in user xstates.


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-05-11 20:21:30

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 09/10] x86/fpu/xstate: Preserve supervisor states for slow path of __fpu__restore_sig()

The signal return code is responsible for taking an XSAVE buffer present
in user memory and loading it into the hardware registers. This
operation only affects user XSAVE state and never affects supervisor state.

The fast path through this code simply points XRSTOR directly at the
user buffer. However, since user memory is not guaranteed to be always
mapped, this XRSTOR can fail. If it fails, the signal return code falls
back to a slow path which can tolerate page faults.

That slow path copies the xfeatures one by one out of the user buffer
into the task's fpu state area. However, by being in a context where it
can handle page faults, the code can also schedule. The lazy-fpu-load code
would think it has an up-to-date fpstate and would fail to save the
supervisor state when scheduling the task out. When scheduling back in, it
would likely restore stale supervisor state.

To fix that, preserve supervisor state before the slow path. Modify
copy_user_to_fpregs_zeroing() so that if it fails, fpregs are not zeroed,
and there is no need for fpregs_deactivate() and supervisor states are
preserved.

Move set_thread_flag(TIF_NEED_FPU_LOAD) to the slow path. Without doing
this, the fast path also needs supervisor states to be saved first.

Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 53 +++++++++++++++++++-----------------
1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d09d72334a12..545ca4314096 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -262,19 +262,23 @@ sanitize_restored_user_xstate(union fpregs_state *state,
static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
{
u64 init_bv;
+ int r;

if (use_xsave()) {
if (fx_only) {
init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;

- copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_fxregs(buf);
+ r = copy_user_to_fxregs(buf);
+ if (!r)
+ copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+ return r;
} else {
init_bv = xfeatures_mask_user() & ~xbv;

- if (unlikely(init_bv))
+ r = copy_user_to_xregs(buf, xbv);
+ if (!r && unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- return copy_user_to_xregs(buf, xbv);
+ return r;
}
} else if (use_fxsr()) {
return copy_user_to_fxregs(buf);
@@ -327,28 +331,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}
}

- /*
- * The current state of the FPU registers does not matter. By setting
- * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
- * is not modified on context switch and that the xstate is considered
- * to be loaded again on return to userland (overriding last_cpu avoids
- * the optimisation).
- */
- set_thread_flag(TIF_NEED_FPU_LOAD);
- __fpu_invalidate_fpregs_state(fpu);
-
if ((unsigned long)buf_fx % 64)
fx_only = 1;
- /*
- * For 32-bit frames with fxstate, copy the fxstate so it can be
- * reconstructed later.
- */
- if (ia32_fxstate) {
- ret = __copy_from_user(&env, buf, sizeof(env));
- if (ret)
- goto err_out;
- envp = &env;
- } else {
+
+ if (!ia32_fxstate) {
/*
* Attempt to restore the FPU registers directly from user
* memory. For that to succeed, the user access cannot cause
@@ -365,10 +351,27 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_unlock();
return 0;
}
- fpregs_deactivate(fpu);
fpregs_unlock();
+ } else {
+ /*
+ * For 32-bit frames with fxstate, copy the fxstate so it can
+ * be reconstructed later.
+ */
+ ret = __copy_from_user(&env, buf, sizeof(env));
+ if (ret)
+ goto err_out;
+ envp = &env;
}

+ /*
+ * The current state of the FPU registers does not matter. By setting
+ * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
+ * is not modified on context switch and that the xstate is considered
+ * to be loaded again on return to userland (overriding last_cpu avoids
+ * the optimisation).
+ */
+ set_thread_flag(TIF_NEED_FPU_LOAD);
+ __fpu_invalidate_fpregs_state(fpu);

if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
--
2.21.0

2020-05-11 20:23:21

by Yu-cheng Yu

[permalink] [raw]
Subject: [PATCH v3 10/10] x86/fpu/xstate: Restore supervisor states for signal return

As described in the previous patch, the signal return fast path directly
restores user states from the user buffer. Once that succeeds, restore
supervisor states (but only when they are not yet restored).

For the slow path, save supervisor states to preserve them across context
switches, and restore after the user states are restored.

The previous version has the overhead of an XSAVES in both the fast and the
slow paths. It is addressed as the following:

- In the fast path, only do an XRSTORS.
- In the slow path, do a supervisor-state-only XSAVES, and relocate the
buffer contents.

Some thoughts in the implementation:

- In the slow path, can any supervisor state become stale between
save/restore?

Answer: set_thread_flag(TIF_NEED_FPU_LOAD) protects the xstate buffer.

- In the slow path, can any code reference a stale supervisor state
register between save/restore?

Answer: In the current lazy-restore scheme, any reference to xstate
registers needs fpregs_lock()/fpregs_unlock() and __fpregs_load_activate().

- Are there other options?

One other option is eagerly restoring all supervisor states.

Currently, CET user-mode states and ENQCMD's PASID do not need to be
eagerly restored. The upcoming CET kernel-mode states (24 bytes) need
to be eagerly restored. To me, eagerly restoring all supervisor states
adds more overhead then benefit at this point.

Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
---
v3:
- Change copy_xregs_to_kernel() to copy_supervisor_to_kernel(), which is
introduced in a previous patch.
- Update commit log.

arch/x86/kernel/fpu/signal.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 545ca4314096..4dad5afc938d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -347,6 +347,23 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
+
+ /*
+ * Restore supervisor states: previous context switch
+ * etc has done XSAVES and saved the supervisor states
+ * in the kernel buffer from which they can be restored
+ * now.
+ *
+ * We cannot do a single XRSTORS here - which would
+ * be nice - because the rest of the FPU registers are
+ * being restored from a user buffer directly. The
+ * single XRSTORS happens below, when the user buffer
+ * has been copied to the kernel one.
+ */
+ if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
fpregs_mark_activate();
fpregs_unlock();
return 0;
@@ -364,14 +381,25 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}

/*
- * The current state of the FPU registers does not matter. By setting
- * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
- * is not modified on context switch and that the xstate is considered
+ * By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
+ * not modified on context switch and that the xstate is considered
* to be loaded again on return to userland (overriding last_cpu avoids
* the optimisation).
*/
- set_thread_flag(TIF_NEED_FPU_LOAD);
+ fpregs_lock();
+
+ if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+
+ /*
+ * Supervisor states are not modified by user space input. Save
+ * current supervisor states first and invalidate the FPU regs.
+ */
+ if (xfeatures_mask_supervisor())
+ copy_supervisor_to_kernel(&fpu->state.xsave);
+ set_thread_flag(TIF_NEED_FPU_LOAD);
+ }
__fpu_invalidate_fpregs_state(fpu);
+ fpregs_unlock();

if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
@@ -393,7 +421,13 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures);
+
+ /*
+ * Restore previously saved supervisor xstates along with
+ * copied-in user xstates.
+ */
+ ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
+ user_xfeatures | xfeatures_mask_supervisor());

} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
--
2.21.0

2020-05-11 20:31:21

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 00/10] Support XSAVES supervisor states

On Sat, Mar 28, 2020 at 09:42:57AM -0700, Yu-cheng Yu wrote:
> Changes in v3:

Please send a clean v4 with all the review comments addressed.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette