2020-02-26 19:12:01

by Kyung Min Park

Subject: [PATCH 2/2] x86/asm/delay: Introduce TPAUSE delay

TPAUSE instructs the processor to enter an implementation-dependent
optimized state. The instruction execution wakes up when the time-stamp
counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
The instruction execution also wakes up due to the expiration of
the operating system time-limit or by an external interrupt
or exceptions such as a debug exception or a machine check exception.

TPAUSE offers a choice of two lower power states:
1. Light-weight power/performance optimized state C0.1
2. Improved power/performance optimized state C0.2
Compared to a spinloop-based delay, this saves power while keeping
wake-up latency low. The selection between the two states is governed
by the input register.

TPAUSE is available on processors with X86_FEATURE_WAITPKG.

Reviewed-by: Tony Luck <[email protected]>
Co-developed-by: Fenghua Yu <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Kyung Min Park <[email protected]>
---
arch/x86/include/asm/mwait.h | 17 +++++++++++++++++
arch/x86/lib/delay.c | 26 +++++++++++++++++++++++++-
2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 9d5252c..2067501 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -22,6 +22,8 @@
#define MWAITX_ECX_TIMER_ENABLE BIT(1)
#define MWAITX_MAX_LOOPS ((u32)-1)
#define MWAITX_DISABLE_CSTATES 0xf0
+#define TPAUSE_C01_STATE 1
+#define TPAUSE_C02_STATE 0

static inline void __monitor(const void *eax, unsigned long ecx,
unsigned long edx)
@@ -120,4 +122,19 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
current_clr_polling();
}

+/*
+ * Caller can specify whether to enter C0.1 (low latency, less
+ * power saving) or C0.2 state (saves more power, but longer wakeup
+ * latency). This may be overridden by the IA32_UMWAIT_CONTROL MSR
+ * which can force requests for C0.2 to be downgraded to C0.1.
+ */
+static inline void __tpause(unsigned int ecx, unsigned int edx,
+			    unsigned int eax)
+{
+	/* "tpause %ecx, %edx, %eax;" */
+	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
+		     :
+		     : "c"(ecx), "d"(edx), "a"(eax));
+}
+
#endif /* _ASM_X86_MWAIT_H */
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 6be29cf..3553150 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -86,6 +86,26 @@ static void delay_tsc(unsigned long __loops)
}

/*
+ * On Intel the TPAUSE instruction waits until any of:
+ * 1) the TSC counter reaches or exceeds the value provided in EDX:EAX
+ * 2) global timeout in IA32_UMWAIT_CONTROL is exceeded
+ * 3) an external interrupt occurs
+ */
+static void tpause(u64 start, u64 cycles)
+{
+	u64 until = start + cycles;
+	unsigned int eax, edx;
+
+	eax = (unsigned int)(until & 0xffffffff);
+	edx = (unsigned int)(until >> 32);
+
+	/* Hard code the deeper (C0.2) sleep state because exit latency is
+	 * small compared to the "microseconds" that udelay() will delay.
+	 */
+	__tpause(TPAUSE_C02_STATE, edx, eax);
+}
+
+/*
* On some AMD platforms, MWAITX has a configurable 32-bit timer, that
* counts with TSC frequency. The input value is the loop of the
* counter, it will exit when the timer expires.
@@ -153,8 +173,12 @@ static void (*delay_platform)(unsigned long) = delay_loop;

void use_tsc_delay(void)
{
-	if (delay_platform == delay_loop)
+	if (static_cpu_has(X86_FEATURE_WAITPKG)) {
+		wait_func = tpause;
+		delay_platform = delay_iterate;
+	} else if (delay_platform == delay_loop) {
 		delay_platform = delay_tsc;
+	}
}

void use_mwaitx_delay(void)
--
2.7.4
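
For readers following the deadline handling in tpause() above, here is a minimal userspace sketch of how an absolute TSC deadline is split into the EDX:EAX halves the instruction expects. The struct tsc_deadline type and the split_deadline() helper are hypothetical names used only for illustration; they are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two register halves; not a kernel type. */
struct tsc_deadline {
	uint32_t eax;	/* low 32 bits of the deadline */
	uint32_t edx;	/* high 32 bits of the deadline */
};

/* Split an absolute TSC deadline (start + cycles) into EDX:EAX halves. */
static struct tsc_deadline split_deadline(uint64_t start, uint64_t cycles)
{
	uint64_t until = start + cycles;	/* unsigned add wraps modulo 2^64 */
	struct tsc_deadline d = {
		.eax = (uint32_t)(until & 0xffffffffu),
		.edx = (uint32_t)(until >> 32),
	};

	/* Round trip: the halves must reassemble into the same deadline. */
	assert((((uint64_t)d.edx << 32) | d.eax) == until);
	return d;
}
```

With start = 0xfffffff0 and cycles = 0x20 the sum crosses the 32-bit boundary, so the low half carries into the high half: EAX holds 0x10 and EDX holds 0x1.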


2020-02-26 21:12:42

by Andi Kleen

Subject: Re: [PATCH 2/2] x86/asm/delay: Introduce TPAUSE delay

On Wed, Feb 26, 2020 at 11:10:58AM -0800, Kyung Min Park wrote:
> TPAUSE instructs the processor to enter an implementation-dependent
> optimized state. The instruction execution wakes up when the time-stamp
> counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
> The instruction execution also wakes up due to the expiration of
> the operating system time-limit or by an external interrupt

This is actually a behavior change. Today's udelay() will continue
after processing the interrupt. Your patches don't.

I don't think it's a problem though. The interrupt will cause
a long enough delay that it exceeds any reasonable udelay() requirement.

There would be a difference if someone did really long udelay()s, much
longer than typical interrupts; in that case you might end up
with a truncated udelay, but such long udelays are not something that we
would encourage.

I don't think you need to change anything in the code, but should
probably document this behavior.

-Andi

2020-02-26 21:21:39

by Luck, Tony

Subject: Re: [PATCH 2/2] x86/asm/delay: Introduce TPAUSE delay

On Wed, Feb 26, 2020 at 01:10:40PM -0800, Andi Kleen wrote:
> On Wed, Feb 26, 2020 at 11:10:58AM -0800, Kyung Min Park wrote:
> > TPAUSE instructs the processor to enter an implementation-dependent
> > optimized state. The instruction execution wakes up when the time-stamp
> > counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
> > The instruction execution also wakes up due to the expiration of
> > the operating system time-limit or by an external interrupt
>
> This is actually a behavior change. Today's udelay() will continue
> after processing the interrupt. Your patches don't.

The instruction-level TPAUSE is called inside delay_wait(),
which checks whether we were interrupted early and loops to issue
another TPAUSE if needed.

-Tony

2020-02-26 21:22:58

by Fenghua Yu

Subject: Re: [PATCH 2/2] x86/asm/delay: Introduce TPAUSE delay

On Wed, Feb 26, 2020 at 01:10:40PM -0800, Andi Kleen wrote:
> On Wed, Feb 26, 2020 at 11:10:58AM -0800, Kyung Min Park wrote:
> > TPAUSE instructs the processor to enter an implementation-dependent
> > optimized state. The instruction execution wakes up when the time-stamp
> > counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
> > The instruction execution also wakes up due to the expiration of
> > the operating system time-limit or by an external interrupt
>
> This is actually a behavior change. Today's udelay() will continue
> after processing the interrupt. Your patches don't.
>
> I don't think it's a problem though. The interrupt will cause
> a long enough delay that it exceeds any reasonable udelay() requirement.
>
> There would be a difference if someone did really long udelay()s, much
> longer than typical interrupts; in that case you might end up
> with a truncated udelay, but such long udelays are not something that we
> would encourage.

TPAUSE sits in a loop that checks whether the udelay() deadline has
passed. Coming back from an interrupt, the loop checks the deadline,
finds there is still time left to delay, and udelay() goes back to TPAUSE.
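
The re-check described above can be sketched in plain C roughly as follows. This is a userspace illustration under stated assumptions, not the kernel's code: fake_tsc, read_tsc(), wait_until() and delay_until_elapsed() are hypothetical names, and the wait callback deliberately returns early (as if an interrupt had woken it) so that the loop has to reissue the wait.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated time-stamp counter; a stand-in for reading the real TSC. */
static uint64_t fake_tsc;
static unsigned int wait_calls;

static uint64_t read_tsc(void)
{
	return fake_tsc;
}

/*
 * Stand-in for a TPAUSE-style wait. It advances the simulated TSC by at
 * most 40 cycles per call, so longer waits return early, much like a
 * wait that was cut short by an interrupt.
 */
static void wait_until(uint64_t start, uint64_t cycles)
{
	uint64_t step = cycles < 40 ? cycles : 40;

	fake_tsc = start + step;
	wait_calls++;
}

/* Keep waiting until at least 'cycles' TSC ticks have elapsed. */
static void delay_until_elapsed(uint64_t cycles,
				void (*wait)(uint64_t start, uint64_t cycles))
{
	uint64_t start = read_tsc();

	/* A missing wait callback would be a programming error. */
	assert(wait != 0);

	for (;;) {
		uint64_t now, elapsed;

		wait(start, cycles);

		now = read_tsc();
		elapsed = now - start;
		if (elapsed >= cycles)
			break;		/* the full delay has passed */

		/* Woken early: wait again for the remaining cycles. */
		cycles -= elapsed;
		start = now;
	}
}
```

Starting from fake_tsc = 1000, a call to delay_until_elapsed(100, wait_until) is woken early twice and issues three waits (40, 40 and 20 cycles) before returning with the counter at 1100, so the total delay is not truncated.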

Thanks.

-Fenghua

2020-02-26 22:00:14

by Andi Kleen

Subject: Re: [PATCH 2/2] x86/asm/delay: Introduce TPAUSE delay

On Wed, Feb 26, 2020 at 01:20:34PM -0800, Luck, Tony wrote:
> On Wed, Feb 26, 2020 at 01:10:40PM -0800, Andi Kleen wrote:
> > On Wed, Feb 26, 2020 at 11:10:58AM -0800, Kyung Min Park wrote:
> > > TPAUSE instructs the processor to enter an implementation-dependent
> > > optimized state. The instruction execution wakes up when the time-stamp
> > > counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
> > > The instruction execution also wakes up due to the expiration of
> > > the operating system time-limit or by an external interrupt
> >
> > This is actually a behavior change. Today's udelay() will continue
> > after processing the interrupt. Your patches don't.
>
> The instruction-level TPAUSE is called inside delay_wait(),
> which checks whether we were interrupted early and loops to issue
> another TPAUSE if needed.

Ah right. It was already solved for mwaitx. Great.

-Andi