2020-11-19 23:40:14

by Chang S. Bae

[permalink] [raw]
Subject: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

Rename XFEATURE_MASK_USER_SUPPORTED to XFEATURE_MASK_USER_ENABLED to
literally align with new boot-parameters.

"xstate.disable=0x60000" will disable AMX on a system that has AMX compiled
into XFEATURE_MASK_USER_ENABLED.

"xstate.enable=0x60000" will enable AMX on a system that does NOT have AMX
compiled into XFEATURE_MASK_USER_ENABLED (assuming the kernel is new enough
to support this feature).

While this cmdline is currently enabled only for AMX, it is intended to be
easily enabled to be useful for future XSAVE-enabled features.

Signed-off-by: Chang S. Bae <[email protected]>
Reviewed-by: Len Brown <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
Changes from v1:
* Renamed the user state mask define (Andy Lutomirski and Dave Hansen)
* Changed the error message (Dave Hansen)
* Fixed xfeatures_mask_user()
* Rebased the upstream kernel (5.10) -- revived the param parse function
---
.../admin-guide/kernel-parameters.txt | 15 ++++
arch/x86/include/asm/fpu/types.h | 6 ++
arch/x86/include/asm/fpu/xstate.h | 24 +++---
arch/x86/kernel/fpu/init.c | 73 +++++++++++++++++--
4 files changed, 101 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 526d65d8573a..c41528cfe39f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5995,6 +5995,21 @@
which allow the hypervisor to 'idle' the guest on lock
contention.

+ xstate.enable= [X86-64]
+ xstate.disable= [X86-64]
+ The kernel is compiled with a default xstate bitmask --
+ enabling it to use the XSAVE hardware to efficiently
+ save and restore thread states on context switch.
+ xstate.enable allows adding to that default mask at
+ boot-time without recompiling the kernel just to support
+ the new thread state. (Note that the kernel will ignore
+ any bits in the mask that do not correspond to features
+ that are actually available in CPUID) xstate.disable
+ allows clearing bits in the default mask, forcing the
+ kernel to forget that it supports the specified thread
+ state. When a bit set for both, the kernel takes
+ xstate.disable in a priority.
+
xirc2ps_cs= [NET,PCMCIA]
Format:
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 18eb50fc95e8..ababb748cc8e 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -149,6 +149,12 @@ enum xfeature {
#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILE_DATA \
| XFEATURE_MASK_XTILE_CFG)

+#define XFEATURE_REGION_MASK(max_bit, min_bit) \
+ ((BIT_ULL((max_bit) - (min_bit) + 1) - 1) << (min_bit))
+
+#define XFEATURE_MASK_CONFIGURABLE \
+ XFEATURE_REGION_MASK(XFEATURE_XTILE_DATA, XFEATURE_XTILE_CFG)
+
#define FIRST_EXTENDED_XFEATURE XFEATURE_YMM

struct reg_128_bit {
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 1544a874b748..683a8503c1c6 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -25,17 +25,17 @@

#define XSAVE_ALIGNMENT 64

-/* All currently supported user features */
-#define XFEATURE_MASK_USER_SUPPORTED (XFEATURE_MASK_FP | \
- XFEATURE_MASK_SSE | \
- XFEATURE_MASK_YMM | \
- XFEATURE_MASK_OPMASK | \
- XFEATURE_MASK_ZMM_Hi256 | \
- XFEATURE_MASK_Hi16_ZMM | \
- XFEATURE_MASK_PKRU | \
- XFEATURE_MASK_BNDREGS | \
- XFEATURE_MASK_BNDCSR | \
- XFEATURE_MASK_XTILE)
+/* All currently enabled user features */
+#define XFEATURE_MASK_USER_ENABLED (XFEATURE_MASK_FP | \
+ XFEATURE_MASK_SSE | \
+ XFEATURE_MASK_YMM | \
+ XFEATURE_MASK_OPMASK | \
+ XFEATURE_MASK_ZMM_Hi256 | \
+ XFEATURE_MASK_Hi16_ZMM | \
+ XFEATURE_MASK_PKRU | \
+ XFEATURE_MASK_BNDREGS | \
+ XFEATURE_MASK_BNDCSR | \
+ XFEATURE_MASK_XTILE)

/* All currently supported supervisor features */
#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
@@ -87,7 +87,7 @@ static inline u64 xfeatures_mask_supervisor(void)

static inline u64 xfeatures_mask_user(void)
{
- return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
+ return xfeatures_mask_all & ~(XFEATURE_MASK_SUPERVISOR_ALL);
}

static inline u64 xfeatures_mask_supervisor_dynamic(void)
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 954ac4f0f761..f1cdac3321c8 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -5,6 +5,7 @@
#include <asm/fpu/internal.h>
#include <asm/tlbflush.h>
#include <asm/setup.h>
+#include <asm/cmdline.h>

#include <linux/sched.h>
#include <linux/sched/task.h>
@@ -224,14 +225,44 @@ static void __init fpu__init_system_xstate_size_legacy(void)
/*
* Find supported xfeatures based on cpu features and command-line input.
* This must be called after fpu__init_parse_early_param() is called and
- * xfeatures_mask is enumerated.
+ * xfeatures_mask_all is enumerated.
*/
+
+static u64 xstate_enable;
+static u64 xstate_disable;
+
u64 __init fpu__get_supported_xfeatures_mask(void)
{
- u64 mask = XFEATURE_MASK_USER_SUPPORTED | XFEATURE_MASK_SUPERVISOR_SUPPORTED;
-
- if (!IS_ENABLED(CONFIG_X86_64))
- mask &= ~(XFEATURE_MASK_XTILE);
+ u64 mask = XFEATURE_MASK_USER_ENABLED | XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+
+ if (!IS_ENABLED(CONFIG_X86_64)) {
+ mask &= ~(XFEATURE_MASK_XTILE);
+ } else if (xstate_enable || xstate_disable) {
+ u64 custom = mask;
+ u64 unknown;
+
+ custom |= xstate_enable;
+ custom &= ~xstate_disable;
+
+ unknown = custom & ~mask;
+ if (unknown) {
+ /*
+ * User should fully understand the result of using undocumented
+ * xstate component.
+ */
+ pr_warn("x86/fpu: Attempt to enable unknown xstate features 0x%llx\n",
+ unknown);
+ WARN_ON_FPU(1);
+ }
+
+ if ((custom & XFEATURE_MASK_XTILE) != XFEATURE_MASK_XTILE) {
+ pr_warn("x86/fpu: Error in xstate.disable. Additionally disabling 0x%x components.\n",
+ XFEATURE_MASK_XTILE);
+ custom &= ~(XFEATURE_MASK_XTILE);
+ }
+
+ mask = custom;
+ }

return mask;
}
@@ -245,12 +276,44 @@ static void __init fpu__init_system_ctx_switch(void)
on_boot_cpu = 0;
}

+#define HEXA_BASE 16
+/*
+ * Longest parameter of 'xstate.enable=' is 16 hexadecimal characters with '0x' prefix and
+ * an extra '\0' for termination.
+ */
+#define MAX_XSTATE_MASK_CHARS 19
+
+/*
+ * We parse xstate parameters early because fpu__init_system() is executed
+ * before parse_early_param().
+ */
+static void __init fpu__init_parse_early_param(void)
+{
+ char arg[MAX_XSTATE_MASK_CHARS];
+ u64 mask;
+
+ if (cmdline_find_option(boot_command_line, "xstate.enable", arg,
+ sizeof(arg)) &&
+ !kstrtoull(arg, HEXA_BASE, &mask))
+ xstate_enable = mask & XFEATURE_MASK_CONFIGURABLE;
+ else
+ xstate_enable = 0;
+
+ if (cmdline_find_option(boot_command_line, "xstate.disable", arg,
+ sizeof(arg)) &&
+ !kstrtoull(arg, HEXA_BASE, &mask))
+ xstate_disable = mask & XFEATURE_MASK_CONFIGURABLE;
+ else
+ xstate_disable = 0;
+}
+
/*
* Called on the boot CPU once per system bootup, to set up the initial
* FPU state that is later cloned into all processes:
*/
void __init fpu__init_system(struct cpuinfo_x86 *c)
{
+ fpu__init_parse_early_param();
fpu__init_system_early_generic(c);

/*
--
2.17.1


2020-11-20 05:05:48

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

On Thu, Nov 19, 2020 at 3:37 PM Chang S. Bae <[email protected]> wrote:
>
> Rename XFEATURE_MASK_USER_SUPPORTED to XFEATURE_MASK_USER_ENABLED to
> literally align with new boot-parameters.
>
> "xstate.disable=0x60000" will disable AMX on a system that has AMX compiled
> into XFEATURE_MASK_USER_ENABLED.
>
> "xstate.enable=0x60000" will enable AMX on a system that does NOT have AMX
> compiled into XFEATURE_MASK_USER_ENABLED (assuming the kernel is new enough
> to support this feature).
>

What's the purpose of xstate.enable? I can't really imagine it's
useful for AMX. I suppose it could be useful for hypothetical
post-AMX features, but that sounds extremely dangerous. Intel has
changed its strategy so many times on XSTATE extensibility that I find
it quite hard to believe that supporting unknown states is wise.

2020-11-24 18:53:53

by Len Brown

[permalink] [raw]
Subject: Re: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

On Fri, Nov 20, 2020 at 12:03 AM Andy Lutomirski <[email protected]> wrote:
>
> On Thu, Nov 19, 2020 at 3:37 PM Chang S. Bae <[email protected]> wrote:
> > "xstate.enable=0x60000" will enable AMX on a system that does NOT have AMX
> > compiled into XFEATURE_MASK_USER_ENABLED (assuming the kernel is new enough
> > to support this feature).
> >
>
> What's the purpose of xstate.enable? I can't really imagine it's
> useful for AMX. I suppose it could be useful for hypothetical
> post-AMX features, but that sounds extremely dangerous. Intel has
> changed its strategy so many times on XSTATE extensibility that I find
> it quite hard to believe that supporting unknown states is wise.

Not hypothetical -- there are subsequent hardware features coming that
will use the same
exact XSTATE support that this series puts in place for AMX.

We know that when those features ship in new hardware, there will be
a set of customers who want to exercise those features immediately,
but their kernel binary has not yet been re-compiled to see those
features by-default.

The purpose of "xstate.enable" is to empower those users to be able to
explicitly enable support using their existing binary.

You are right -- the feature isn't needed to enable AMX, unless somebody went to
the trouble of building a kernel with the AMX source update, but chose
to disable
AMX-specific recognition, by-default.

thanks,
Len Brown, Intel Open Source Technology Center

2020-11-25 00:06:06

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

> On Nov 24, 2020, at 10:51 AM, Len Brown <[email protected]> wrote:
>
> On Fri, Nov 20, 2020 at 12:03 AM Andy Lutomirski <[email protected]> wrote:
>>
>>> On Thu, Nov 19, 2020 at 3:37 PM Chang S. Bae <[email protected]> wrote:
>>> "xstate.enable=0x60000" will enable AMX on a system that does NOT have AMX
>>> compiled into XFEATURE_MASK_USER_ENABLED (assuming the kernel is new enough
>>> to support this feature).
>>>
>>
>> What's the purpose of xstate.enable? I can't really imagine it's
>> useful for AMX. I suppose it could be useful for hypothetical
>> post-AMX features, but that sounds extremely dangerous. Intel has
>> changed its strategy so many times on XSTATE extensibility that I find
>> it quite hard to believe that supporting unknown states is wise.
>
> Not hypothetical -- there are subsequent hardware features coming that
> will use the same
> exact XSTATE support that this series puts in place for AMX.
>
> We know that when those features ship in new hardware, there will be
> a set of customers who want to exercise those features immediately,
> but their kernel binary has not yet been re-compiled to see those
> features by-default.
>
> The purpose of "xstate.enable" is to empower those users to be able to
> explicitly enable support using their existing binary.
>
> You are right -- the feature isn't needed to enable AMX, unless somebody went to
> the trouble of building a kernel with the AMX source update, but chose
> to disable
> AMX-specific recognition, by-default.
>
>

We may want to taint the kernel if one of these flags is used because,
frankly, Intel’s track record is poor. Suppose we get a new feature
with PKRU-like semantics -- switching it blindly using
XSAVE(C,S,OPT,whatever) would simply incorrect. And XFD itself has
problems — supposedly it’s difficult or impossible to virtualize. It
wouldn’t utterly shock me if Intel were to drop IA32_XFD_ERR and
replace it with a new mechanism that’s less janky.

So this whole thing makes me quite nervous.

(I'm not a virtualization expert, but AIUI IA32_XFD_ERR has some
issues. If it's too late to fix those issues, Intel could probably
get away with completely dropping IA32_XFD_ERR from the spec -- OSes
can handle AMX just fine without it. Then the next XFD-able feature
could introduce a new improved way of reporting which feature
triggered #NM.)

2020-12-01 22:38:33

by Len Brown

[permalink] [raw]
Subject: Re: [PATCH v2 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

> > On Fri, Nov 20, 2020 at 12:03 AM Andy Lutomirski <[email protected]> wrote:

> We may want to taint the kernel if one of these flags is used

I agree with Andy.
I can imagine a distro would want to see the taint bit set if a user
overrides the default that they ship and support.

Chang,
Please taint when the enable flag is used -- Thanks!

> because,
> frankly, Intel’s track record is poor. Suppose we get a new feature
> with PKRU-like semantics -- switching it blindly using
> XSAVE(C,S,OPT,whatever) would simply incorrect.

A valid concern.
To mitigate, there is a new CPUID bit on the way to tell us explicitly
which features are simple (require just XSAVE), versus which
features are not simple and require more complex support.n t
When that bit is locked down, we will add that sanity check here.

> And XFD itself has
> problems — supposedly it’s difficult or impossible to virtualize. It
> wouldn’t utterly shock me if Intel were to drop IA32_XFD_ERR and
> replace it with a new mechanism that’s less janky.
>
> So this whole thing makes me quite nervous.
>
> (I'm not a virtualization expert, but AIUI IA32_XFD_ERR has some
> issues. If it's too late to fix those issues, Intel could probably
> get away with completely dropping IA32_XFD_ERR from the spec -- OSes
> can handle AMX just fine without it. Then the next XFD-able feature
> could introduce a new improved way of reporting which feature
> triggered #NM.)

I don't follow the XFD concern.

There are a couple of cases.

If the hypervisor isn't aware of XFD, it neither uses it, or shows it
to the guests.

If the hypervisor is aware of XFD and supports TMUL, it allocates its
own context
switch buffers at boot time, and does not use the XFD optimization to
save space inside the hypervisor. It can expose XFD to the guests,
(and context switch it when guests switch on/off physical CPUs)

The guest sees two cases, either XFD exists or it doesn't, just like
on real hardware.

thanks,
Len Brown, Intel Open Source Technology Center