2022-04-05 02:14:23

by Thomas Gleixner

Subject: [patch 2/3] x86/fpu/xsave: Prepare for optimized compaction

XSAVES/XSAVEC allow for optimized compaction by saving only the active
component states and compacting the storage so that it is linear without
holes, which is better for prefetching and the dTLB.
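
As an illustration only, the offset of a component in such a compacted
buffer is simply the sum of the sizes of the lower numbered components
present in the mask (ignoring the 64 byte alignment which some
components require). A simplified sketch, roughly what
xfeature_get_offset() does below, not the actual implementation:

	/* Sketch: offset of feature 'nr' in a compacted XSAVE buffer */
	static unsigned int compacted_offset(u64 xcomp_bv, int nr)
	{
		unsigned int offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
		int i;

		for (i = FIRST_EXTENDED_XFEATURE; i < nr; i++) {
			if (xcomp_bv & BIT_ULL(i))
				offset += xstate_sizes[i];
		}
		return offset;
	}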

This needs preparation in the copy to/from UABI functions. They have to be
aware of the optimized layout.

The change for __copy_xstate_to_uabi_buf() is trivial. It just has to avoid
calculating the offset in the XSTATE buffer when a component is in init
state. That's an improvement in general.

For copy_uabi_to_xstate() it's more effort when supervisor states are
supported and active. If the user space buffer has active states which are
not in the set of current states of the XSTATE buffer, then the buffer
layout will change which means the supervisor states might be overwritten.

Provide a function which moves them out of the way, and invoke it before
any of the extended features is copied from the user space buffer.

Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/fpu/xstate.c | 77 ++++++++++++++++++++++++++++++++++++++++---
arch/x86/kernel/fpu/xstate.h | 14 ++++++-
2 files changed, 84 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1147,10 +1147,10 @@ void __copy_xstate_to_uabi_buf(struct me
pkru.pkru = pkru_val;
membuf_write(&to, &pkru, sizeof(pkru));
} else {
- copy_feature(header.xfeatures & BIT_ULL(i), &to,
- __raw_xsave_addr(xsave, i),
- __raw_xsave_addr(xinit, i),
- xstate_sizes[i]);
+ bool active = header.xfeatures & BIT_ULL(i);
+ void *from = __raw_xsave_addr(active ? xsave : xinit, i);
+
+ membuf_write(&to, from, xstate_sizes[i]);
}
/*
* Keep track of the last copied state in the non-compacted
@@ -1195,6 +1195,73 @@ static int copy_from_buffer(void *dst, u
return 0;
}

+/*
+ * Prepare the kernel XSTATE buffer for saving the user supplied XSTATE. If
+ * the XGETBV1 based optimization is in use then the kernel buffer might
+ * not have the user supplied active features set and an eventual active
+ * supervisor state has to be moved out of the way. With optimized
+ * compaction the features which are to be stored need to be set in the
+ * XCOMP_BV field of the XSTATE header.
+ */
+static void xsave_adjust_xcomp(struct fpstate *fpstate, u64 xuser)
+{
+ struct xregs_state *xsave = &fpstate->regs.xsave;
+ u64 xtmp, xall, xbv, xcur = xsave->header.xfeatures;
+ int i;
+
+ /* Nothing to do if optimized compaction is not in use */
+ if (!xsave_use_xgetbv1())
+ return;
+
+ /*
+ * Check whether the current xstate buffer contains supervisor
+ * state. If not, just set the user supplied features.
+ */
+ if (!(xcur & XFEATURE_MASK_SUPERVISOR_ALL)) {
+ __xstate_init_xcomp_bv(xsave, xuser);
+ return;
+ }
+
+ /*
+ * Set legacy features. They are at a fixed offset and do not
+ * affect the layout.
+ */
+ xbv = xsave->header.xcomp_bv;
+ xbv |= xuser & (XFEATURE_MASK_FP | XFEATURE_MASK_SSE);
+
+ /*
+ * Check whether there is new extended state in the user supplied
+ * buffer. If not, then nothing has to be moved around.
+ */
+ if (!(xuser & ~xbv)) {
+ __xstate_init_xcomp_bv(xsave, xbv);
+ return;
+ }
+
+ /*
+ * No more optimizations. Set the user features and move the
+ * supervisor state(s). If the new user feature is past
+ * the supervisor state, then the loop is moving nothing.
+ */
+ xtmp = xbv & XFEATURE_MASK_SUPERVISOR_ALL;
+ xall = xbv | xuser;
+
+ for (i = fls64(xtmp) - 1; i >= FIRST_EXTENDED_XFEATURE;
+ i = fls64(xtmp) - 1) {
+ unsigned int to, from;
+
+ from = xfeature_get_offset(xbv, i);
+ to = xfeature_get_offset(xall, i);
+ if (from < to) {
+ memmove((void *)xsave + to, (void *)xsave + from,
+ xstate_sizes[i]);
+ } else {
+ WARN_ON_ONCE(to < from);
+ }
+ xtmp &= ~BIT_ULL(i);
+ }
+ __xstate_init_xcomp_bv(xsave, xall);
+}

static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
const void __user *ubuf)
@@ -1232,6 +1299,8 @@ static int copy_uabi_to_xstate(struct fp
}
}

+ xsave_adjust_xcomp(fpstate, hdr.xfeatures);
+
for (i = 0; i < XFEATURE_MAX; i++) {
u64 mask = ((u64)1 << i);

--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -10,14 +10,22 @@
DECLARE_PER_CPU(u64, xfd_state);
#endif

-static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
+static inline bool xsave_use_xgetbv1(void) { return false; }
+
+static inline void __xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
{
/*
* XRSTORS requires these bits set in xcomp_bv, or it will
- * trigger #GP:
+ * trigger #GP. It's also required for XRSTOR when the buffer
+ * was saved with XSAVEC in compacted format.
*/
+ xsave->header.xcomp_bv = mask | XCOMP_BV_COMPACTED_FORMAT;
+}
+
+static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
+{
if (cpu_feature_enabled(X86_FEATURE_XCOMPACTED))
- xsave->header.xcomp_bv = mask | XCOMP_BV_COMPACTED_FORMAT;
+ __xstate_init_xcomp_bv(xsave, mask);
}

static inline u64 xstate_get_group_perm(bool guest)


2022-04-16 02:21:39

by Dave Hansen

Subject: Re: [patch 2/3] x86/fpu/xsave: Prepare for optimized compaction

On 4/4/22 05:11, Thomas Gleixner wrote:
...
> For copy_uabi_to_xstate() it's more effort when supervisor states are
> supported and active. If the user space buffer has active states which are
> not in the set of current states of the XSTATE buffer, then the buffer
> layout will change which means the supervisor states might be overwritten.
>
> Provide a function which moves them out of the way, and invoke it before
> any of the extended features is copied from the user space buffer.

This took me a minute to sort through. There are three kinds of state
in the kernel buffer in copy_uabi_to_xstate().

1. User state, set in hdr.xfeatures, which is copied in from the uabi buffer.
2. User state, clear in hdr.xfeatures. The feature will be set to its init
state because its bit in xsave->header.xfeatures is cleared.
3. Supervisor state, which is preserved.

I was confused for a *moment* because the space for a #2 might be
overwritten by a #1 copy-in from userspace. But, that only happens for
states that will end up being set to init. No harm, no foul.
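
In (purely illustrative) pseudo code, not the actual kernel code, the
per-feature loop ends up treating the three cases roughly like this:

	for (i = 0; i < XFEATURE_MAX; i++) {
		u64 mask = BIT_ULL(i);

		if (hdr.xfeatures & mask) {
			/* #1: copy the state in from the uabi buffer */
		} else if (mask & XFEATURE_MASK_USER_SUPPORTED) {
			/* #2: bit clear in xfeatures -> init state */
		} else if (mask & XFEATURE_MASK_SUPERVISOR_ALL) {
			/* #3: supervisor state, preserved as is */
		}
	}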

Any interest in doing something like the attached to make
copy_uabi_to_xstate() easier to read?

> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1147,10 +1147,10 @@ void __copy_xstate_to_uabi_buf(struct me
> pkru.pkru = pkru_val;
> membuf_write(&to, &pkru, sizeof(pkru));
> } else {
> - copy_feature(header.xfeatures & BIT_ULL(i), &to,
> - __raw_xsave_addr(xsave, i),
> - __raw_xsave_addr(xinit, i),
> - xstate_sizes[i]);
> + bool active = header.xfeatures & BIT_ULL(i);
> + void *from = __raw_xsave_addr(active ? xsave : xinit, i);
> +
> + membuf_write(&to, from, xstate_sizes[i]);
> }
> /*
> * Keep track of the last copied state in the non-compacted
> @@ -1195,6 +1195,73 @@ static int copy_from_buffer(void *dst, u
> return 0;
> }
>
> +/*
> + * Prepare the kernel XSTATE buffer for saving the user supplied XSTATE. If
> + * the XGETBV1 based optimization is in use then the kernel buffer might
> + * not have the user supplied active features set and an eventual active
> + * supervisor state has to be moved out of the way. With optimized
> + * compaction the features which are to be stored need to be set in the
> + * XCOMP_BV field of the XSTATE header.
> + */

What does "eventual active supervisor state" mean? Just that the state
needs to be preserved since it needs to be restored eventually? I found
that phrase a bit confusing.

I was thinking of a comment more along these lines:

/*
* Adjust 'fpstate' so that it can additionally store all the xfeatures
* set in 'xuser' when optimized compaction is enabled. All bits in
* 'xuser' will be set in XCOMP_BV. Supervisor state will be preserved
* and may be moved to make room for new 'xuser' states. User state may
* be destroyed.
*/

> +static void xsave_adjust_xcomp(struct fpstate *fpstate, u64 xuser)
> +{
> + struct xregs_state *xsave = &fpstate->regs.xsave;
> + u64 xtmp, xall, xbv, xcur = xsave->header.xfeatures;
> + int i;
> +
> + /* Nothing to do if optimized compaction is not in use */
> + if (!xsave_use_xgetbv1())
> + return;

The comment makes me wonder if we need a more descriptive name for
xsave_use_xgetbv1(), like:

if (!xsave_do_optimized_compaction())
return;

> + /*
> + * Check whether the current xstate buffer contains supervisor
> + * state. If not, just set the user supplied features.
> + */
> + if (!(xcur & XFEATURE_MASK_SUPERVISOR_ALL)) {
> + __xstate_init_xcomp_bv(xsave, xuser);
> + return;
> + }
> +
> + /*
> + * Set legacy features. They are at a fixed offset and do not
> + * affect the layout.
> + */
> + xbv = xsave->header.xcomp_bv;
> + xbv |= xuser & (XFEATURE_MASK_FP | XFEATURE_MASK_SSE);
> +
> + /*
> + * Check whether there is new extended state in the user supplied
> + * buffer. If not, then nothing has to be moved around.
> + */
> + if (!(xuser & ~xbv)) {
> + __xstate_init_xcomp_bv(xsave, xbv);
> + return;
> + }
> +
> + /*
> + * No more optimizations. Set the user features and move the
> + * supervisor state(s). If the new user feature is past
> + * the supervisor state, then the loop is moving nothing.
> + */
> + xtmp = xbv & XFEATURE_MASK_SUPERVISOR_ALL;
> + xall = xbv | xuser;


I'd probably at least comment why the loop is backwards:

/*
* Features are only moved up in the buffer. Start with the
* high features to avoid overwriting them with lower ones.
*/

I know this is a very typical way to implement non-destructive moves,
but my stupid brain seems to default to assuming that for loops only go
forward.
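
A toy example of the generic "shift everything up" pattern, nothing
XSTATE specific, just to illustrate why the walk has to be backwards:

	#include <string.h>

	/*
	 * Shift 'nr' elements of 'size' bytes up by one slot. 'buf' must
	 * have room for nr + 1 elements.
	 */
	static void shift_up(char *buf, int nr, size_t size)
	{
		int i;

		/*
		 * Highest element first: its destination slot is free and
		 * every later move lands in the slot just vacated. A forward
		 * walk would overwrite element i + 1 before it was moved.
		 */
		for (i = nr - 1; i >= 0; i--)
			memmove(buf + (i + 1) * size, buf + i * size, size);
	}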

> + for (i = fls64(xtmp) - 1; i >= FIRST_EXTENDED_XFEATURE;
> + i = fls64(xtmp) - 1) {
> + unsigned int to, from;

Is it worth a check here like:

/* Do not move features in their init state: */
if (!(xcur & BIT_ULL(i))) {
xtmp &= ~BIT_ULL(i);
continue;
}

(probably also moving the &= around instead of copying it)

> + from = xfeature_get_offset(xbv, i);
> + to = xfeature_get_offset(xall, i);
> + if (from < to) {
> + memmove((void *)xsave + to, (void *)xsave + from,
> + xstate_sizes[i]);
> + } else {
> + WARN_ON_ONCE(to < from);
> + }
> + xtmp &= ~BIT_ULL(i);
> + }
> + __xstate_init_xcomp_bv(xsave, xall);
> +}
>
> static int copy_uabi_to_xstate(struct fpstate *fpstate, const void *kbuf,
> const void __user *ubuf)
> @@ -1232,6 +1299,8 @@ static int copy_uabi_to_xstate(struct fp
> }
> }
>
> + xsave_adjust_xcomp(fpstate, hdr.xfeatures);
> +
> for (i = 0; i < XFEATURE_MAX; i++) {
> u64 mask = ((u64)1 << i);
>
> --- a/arch/x86/kernel/fpu/xstate.h
> +++ b/arch/x86/kernel/fpu/xstate.h
> @@ -10,14 +10,22 @@
> DECLARE_PER_CPU(u64, xfd_state);
> #endif
>
> -static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
> +static inline bool xsave_use_xgetbv1(void) { return false; }
> +
> +static inline void __xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
> {
> /*
> * XRSTORS requires these bits set in xcomp_bv, or it will
> - * trigger #GP:
> + * trigger #GP. It's also required for XRSTOR when the buffer
> + * was saved with XSAVEC in compacted format.
> */
> + xsave->header.xcomp_bv = mask | XCOMP_BV_COMPACTED_FORMAT;
> +}
> +
> +static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
> +{
> if (cpu_feature_enabled(X86_FEATURE_XCOMPACTED))
> - xsave->header.xcomp_bv = mask | XCOMP_BV_COMPACTED_FORMAT;
> + __xstate_init_xcomp_bv(xsave, mask);
> }
>
> static inline u64 xstate_get_group_perm(bool guest)


Attachments:
copy_uabi_to_xstate.1.patch (2.03 kB)

2022-04-20 17:57:39

by Thomas Gleixner

Subject: Re: [patch 2/3] x86/fpu/xsave: Prepare for optimized compaction

On Tue, Apr 19 2022 at 14:39, Thomas Gleixner wrote:
> On Thu, Apr 14 2022 at 08:46, Dave Hansen wrote:
>>> + for (i = fls64(xtmp) - 1; i >= FIRST_EXTENDED_XFEATURE;
>>> + i = fls64(xtmp) - 1) {
>>> + unsigned int to, from;
>>
>> Is it worth a check here like:
>>
>> /* Do not move features in their init state: */
>> if (!(xcur & BIT_ULL(i))) {
>> xtmp &= ~BIT_ULL(i);
>> continue;
>> }
>
> That would also require clearing the bit in xall, but we can't do that
> in the loop as that affects offsets. Let me think about that.

Duh, misread it. Yes, it's possible to do that. OTOH, with the optimized
compaction there won't be a bit which is set in xbv but not in xfeatures,
except for the initial state when a task starts.

Thanks,

tglx

2022-04-22 22:36:23

by Thomas Gleixner

Subject: Re: [patch 2/3] x86/fpu/xsave: Prepare for optimized compaction

On Thu, Apr 14 2022 at 08:46, Dave Hansen wrote:
> On 4/4/22 05:11, Thomas Gleixner wrote:
> Any interest in doing something like the attached to make
> copy_uabi_to_xstate() easier to read?

Yeah. I've picked it up.

>> +static void xsave_adjust_xcomp(struct fpstate *fpstate, u64 xuser)
>> +{
>> + struct xregs_state *xsave = &fpstate->regs.xsave;
>> + u64 xtmp, xall, xbv, xcur = xsave->header.xfeatures;
>> + int i;
>> +
>> + /* Nothing to do if optimized compaction is not in use */
>> + if (!xsave_use_xgetbv1())
>> + return;
>
> The comment makes me wonder if we need a more descriptive name for
> xsave_use_xgetbv1(), like:
>
> if (!xsave_do_optimized_compaction())
> return;

Makes sense.

>> + /*
>> + * No more optimizations. Set the user features and move the
>> + * supervisor state(s). If the new user feature is past
>> + * the supervisor state, then the loop is moving nothing.
>> + */
>> + xtmp = xbv & XFEATURE_MASK_SUPERVISOR_ALL;
>> + xall = xbv | xuser;
>
>
> I'd probably at least comment why the loop is backwards:
>
> /*
> * Features are only moved up in the buffer. Start with the
> * high features to avoid overwriting them with lower ones.
> */
>
> I know this is a very typical way to implement non-destructive moves,
> but my stupid brain seems to default to assuming that for loops only go
> forward.

:)

>> + for (i = fls64(xtmp) - 1; i >= FIRST_EXTENDED_XFEATURE;
>> + i = fls64(xtmp) - 1) {
>> + unsigned int to, from;
>
> Is it worth a check here like:
>
> /* Do not move features in their init state: */
> if (!(xcur & BIT_ULL(i))) {
> xtmp &= ~BIT_ULL(i);
> continue;
> }

That would also require clearing the bit in xall, but we can't do that
in the loop as that affects offsets. Let me think about that.

Thanks,

tglx