2022-03-10 01:54:11

by Chang S. Bae

[permalink] [raw]
Subject: [PATCH v2 0/2] x86/fpu: Make AMX state ready for CPU idle

AMX state is a large state (at least 8KB or more). Entering CPU idle with
this non-initialized large state may result in shallow states while a
deeper low-power state is available.

We can confirm this behavior is implementation-specific. Section 3.3 in [1]
will be updated to clarify this. Sorry for the delay in this conclusion
and the update here.

Changes from v1 [2]:
* Drop the code to call TILERELEASE from arch_cpu_idle_enter(). Instead,
do it from the intel-idle driver, as the behavior is *not* architectural.
* Simplify the driver code change (Rui).
* Improve the dynamic state flag check in the helper function.
* Drop the new C-state table as it is done by another patchset [3].
* Drop the opcode map changes as tip's x86/misc [4] has merged it already.

With the last two changes, the series depends on these:
* The C-state table update for Sapphire Rapids in linux-pm [3].
* The update of the x86 opcode map in the tip's branch [4].

The series on top of the above is available here:
git://github.com/intel/amx-linux.git tilerelease

Thanks,
Chang

[1]: Intel Architecture Instruction Set Extension Programming Reference
May 2021, https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf
[2]: https://lore.kernel.org/lkml/[email protected]/
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9edf3c0ffef0ec1bed8300315852b5c6a0997130
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/misc&id=9dd94df75b30eca03ed2151dd5bbc152a6f19abf

Chang S. Bae (2):
x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
intel_idle: Add a new flag to initialize the AMX state

arch/x86/include/asm/fpu/api.h | 2 ++
arch/x86/include/asm/special_insns.h | 9 +++++++++
arch/x86/kernel/fpu/core.c | 15 +++++++++++++++
drivers/idle/intel_idle.c | 18 ++++++++++++++++--
4 files changed, 42 insertions(+), 2 deletions(-)

--
2.17.1


2022-03-10 07:07:30

by Chang S. Bae

[permalink] [raw]
Subject: [PATCH v2 1/2] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle

When a CPU enters an idle state, non-initialized states left in large
registers may be the cause of preventing deeper low-power states.

The new helper ensures the AMX state is initialized to make the CPU
ready for low-power states. It will be used by the intel idle driver.

Signed-off-by: Chang S. Bae <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
Changes from v1:
* Check the dynamic state flag first, to avoid #UD with XGETBV(1).

Dependency on the opcode map update:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/misc&id=9dd94df75b30eca03ed2151dd5bbc152a6f19abf
---
arch/x86/include/asm/fpu/api.h | 2 ++
arch/x86/include/asm/special_insns.h | 9 +++++++++
arch/x86/kernel/fpu/core.c | 15 +++++++++++++++
3 files changed, 26 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index c83b3020350a..df48912fd1c8 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -165,4 +165,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu)
struct task_struct;
extern long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2);

+extern void fpu_idle_fpregs(void);
+
#endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 68c257a3de0d..d434fbaeb3ff 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -294,6 +294,15 @@ static inline int enqcmds(void __iomem *dst, const void *src)
return 0;
}

+static inline void tile_release(void)
+{
+ /*
+ * Instruction opcode for TILERELEASE; supported in binutils
+ * version >= 2.36.
+ */
+ asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0");
+}
+
#endif /* __KERNEL__ */

#endif /* _ASM_X86_SPECIAL_INSNS_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8dea01ffc5c1..572dfa962b7b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -847,3 +847,18 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr)
*/
return 0;
}
+
+/*
+ * Initialize register state that may prevent from entering low-power idle.
+ * This function will be invoked from the cpuidle driver only when needed.
+ */
+void fpu_idle_fpregs(void)
+{
+ if (!fpu_state_size_dynamic())
+ return;
+
+ if (xfeatures_in_use() & XFEATURE_MASK_XTILE) {
+ tile_release();
+ fpregs_deactivate(&current->thread.fpu);
+ }
+}
--
2.17.1

2022-03-10 14:34:09

by Chang S. Bae

[permalink] [raw]
Subject: [PATCH v2 2/2] intel_idle: Add a new flag to initialize the AMX state

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. This low-power idle state may improve power savings and thus result
in a higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Chang S. Bae <[email protected]>
Tested-by : Zhang Rui <[email protected]>
Cc: Artem Bityutskiy <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
Changes from v1:
* Simplify the code with a new flag (Rui).
* Rebase on Artem's patches for SPR intel_idle.
* Massage the changelog.

Dependency on the new C-state table for SPR:
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9edf3c0ffef0ec1bed8300315852b5c6a0997130
---
drivers/idle/intel_idle.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 1c7c25909e54..6ecbeffdf785 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
#include <asm/intel-family.h>
#include <asm/mwait.h>
#include <asm/msr.h>
+#include <asm/fpu/api.h>

#define INTEL_IDLE_VERSION "0.5.1"

@@ -99,6 +100,11 @@ static unsigned int mwait_substates __initdata;
*/
#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)

+/*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE BIT(16)
+
/*
* MWAIT takes an 8-bit "hint" in EAX "suggesting"
* the C-state (top nibble) and sub-state (bottom nibble)
@@ -136,6 +142,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
local_irq_enable();

+ if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+ fpu_idle_fpregs();
+
mwait_idle_with_hints(eax, ecx);

return index;
@@ -156,8 +165,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
- unsigned long eax = flg2MWAIT(drv->states[index].flags);
unsigned long ecx = 1; /* break on interrupt flag */
+ struct cpuidle_state *state = &drv->states[index];
+ unsigned long eax = flg2MWAIT(state->flags);
+
+ if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+ fpu_idle_fpregs();

mwait_idle_with_hints(eax, ecx);

@@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
{
.name = "C6",
.desc = "MWAIT 0x20",
- .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \
+ CPUIDLE_FLAG_INIT_XSTATE,
.exit_latency = 290,
.target_residency = 800,
.enter = &intel_idle,
--
2.17.1

2022-03-10 16:01:14

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle

On 3/9/22 14:34, Chang S. Bae wrote:
> +/*
> + * Initialize register state that may prevent from entering low-power idle.
> + * This function will be invoked from the cpuidle driver only when needed.
> + */
> +void fpu_idle_fpregs(void)
> +{
> + if (!fpu_state_size_dynamic())
> + return;

Is this check just an optimization? I'm having trouble imagining a
situation where we would have:

(xfeatures_in_use() & XFEATURE_MASK_XTILE) == true
but
fpu_state_size_dynamic() == false

> + if (xfeatures_in_use() & XFEATURE_MASK_XTILE) {
> + tile_release();
> + fpregs_deactivate(&current->thread.fpu);
> + }
> +}

xfeatures_in_use() isn't exactly expensive either.

2022-03-11 22:00:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] intel_idle: Add a new flag to initialize the AMX state

On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <[email protected]> wrote:
>
> The non-initialized AMX state can be the cause of C-state demotion from C6
> to C1E. This low-power idle state may improve power savings and thus result
> in a higher available turbo frequency budget.
>
> This behavior is implementation-specific. Initialize the state for the C6
> entrance of Sapphire Rapids as needed.
>
> Suggested-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Chang S. Bae <[email protected]>
> Tested-by : Zhang Rui <[email protected]>
> Cc: Artem Bityutskiy <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> Changes from v1:
> * Simplify the code with a new flag (Rui).
> * Rebase on Artem's patches for SPR intel_idle.
> * Massage the changelog.
>
> Dependency on the new C-state table for SPR:
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9edf3c0ffef0ec1bed8300315852b5c6a0997130
> ---
> drivers/idle/intel_idle.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 1c7c25909e54..6ecbeffdf785 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -54,6 +54,7 @@
> #include <asm/intel-family.h>
> #include <asm/mwait.h>
> #include <asm/msr.h>
> +#include <asm/fpu/api.h>
>
> #define INTEL_IDLE_VERSION "0.5.1"
>
> @@ -99,6 +100,11 @@ static unsigned int mwait_substates __initdata;
> */
> #define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)
>
> +/*
> + * Initialize large xstate for the C6-state entrance.
> + */
> +#define CPUIDLE_FLAG_INIT_XSTATE BIT(16)
> +
> /*
> * MWAIT takes an 8-bit "hint" in EAX "suggesting"
> * the C-state (top nibble) and sub-state (bottom nibble)
> @@ -136,6 +142,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
> if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> local_irq_enable();
>
> + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> + fpu_idle_fpregs();
> +
> mwait_idle_with_hints(eax, ecx);
>
> return index;
> @@ -156,8 +165,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
> static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
> struct cpuidle_driver *drv, int index)
> {
> - unsigned long eax = flg2MWAIT(drv->states[index].flags);
> unsigned long ecx = 1; /* break on interrupt flag */
> + struct cpuidle_state *state = &drv->states[index];
> + unsigned long eax = flg2MWAIT(state->flags);
> +
> + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> + fpu_idle_fpregs();
>
> mwait_idle_with_hints(eax, ecx);
>
> @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
> {
> .name = "C6",
> .desc = "MWAIT 0x20",
> - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \

Why is the backslash at the end of the line needed?

> + CPUIDLE_FLAG_INIT_XSTATE,
> .exit_latency = 290,
> .target_residency = 800,
> .enter = &intel_idle,
> --

2022-03-11 22:03:07

by Chang S. Bae

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] intel_idle: Add a new flag to initialize the AMX state

On 3/10/2022 10:34 AM, Rafael J. Wysocki wrote:
> On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <[email protected]> wrote:
>>
[...]
>> @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
>> {
>> .name = "C6",
>> .desc = "MWAIT 0x20",
>> - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
>> + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \
>
> Why is the backslash at the end of the line needed?

No, it is not needed.

Sorry, I think I was mindlessly following the style in this new c-state
table:
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/idle/intel_idle.c?h=linux-next#n787

Thanks,
Chang

2022-03-11 22:07:03

by Artem Bityutskiy

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] intel_idle: Add a new flag to initialize the AMX state

On Thu, 2022-03-10 at 10:50 -0800, Chang S. Bae wrote:
> On 3/10/2022 10:34 AM, Rafael J. Wysocki wrote:
> > On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <[email protected]>
> > wrote:
> > >
> [...]
> > > @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata =
> > > {
> > >          {
> > >                  .name = "C6",
> > >                  .desc = "MWAIT 0x20",
> > > -               .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> > > +               .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \
> >
> > Why is the backslash at the end of the line needed?
>
> No, it is not needed.
>
> Sorry, I think I was mindlessly following the style in this new c-state
> table:
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-
> pm.git/tree/drivers/idle/intel_idle.c?h=linux-next#n787
>
> Thanks,
> Chang
>

Sorry, too much python programming lately, so I automatically added the back-
slash. Let me know if you would remove that backslash at the same time, or I can
submit a patch.

Artem.

2022-03-22 07:30:01

by Chang S. Bae

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] intel_idle: Add a new flag to initialize the AMX state

On 3/10/2022 11:33 PM, Artem Bityutskiy wrote:
>
> Let me know if you would remove that backslash at the same time, or I can
> submit a patch.

Looks like the table was merged without the backslash. Rafael thankfully
removed it via:
https://lore.kernel.org/linux-pm/4731792.31r3eYUQgx@kreacher/

Thanks,
Chang