Add architectural support for the cpuidle-haltpoll driver by defining
arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
an optimized polling mechanism via smp_cond_load*().
Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
cpuidle-haltpoll to be selected.
Note that we limit cpuidle-haltpoll support to when the event-stream is
available. This is necessary because polling via smp_cond_load_relaxed()
uses WFE to wait for a store which might not happen for an prolonged
period of time. So, ensure the event-stream is around to provide a
terminating condition.
Signed-off-by: Ankur Arora <[email protected]>
---
arch/arm64/Kconfig | 10 ++++++++++
arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
2 files changed, 31 insertions(+)
create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b11c98b3e84..6f2df162b10e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -34,6 +34,7 @@ config ARM64
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ select ARCH_HAS_OPTIMIZED_POLL
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
@@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
config ARCH_SUSPEND_POSSIBLE
def_bool y
+config ARCH_CPUIDLE_HALTPOLL
+ bool "Enable selection of the cpuidle-haltpoll driver"
+ default n
+ help
+ cpuidle-haltpoll allows for adaptive polling based on
+ current load before entering the idle state.
+
+ Some virtualized workloads benefit from using it.
+
endmenu # "Power management options"
menu "CPU Power Management"
diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
new file mode 100644
index 000000000000..a79bdec7f516
--- /dev/null
+++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_HALTPOLL_H
+#define _ASM_HALTPOLL_H
+
+static inline void arch_haltpoll_enable(unsigned int cpu)
+{
+}
+
+static inline void arch_haltpoll_disable(unsigned int cpu)
+{
+}
+
+static inline bool arch_haltpoll_supported(void)
+{
+ /*
+ * Ensure the event stream is available to provide a terminating
+ * condition to the WFE in the poll loop.
+ */
+ return arch_timer_evtstrm_available();
+}
+#endif
--
2.39.3
On Tue, 2024-04-30 at 11:37 -0700, Ankur Arora wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> Add architectural support for the cpuidle-haltpoll driver by defining
> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
> an optimized polling mechanism via smp_cond_load*().
>
> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
> cpuidle-haltpoll to be selected.
>
> Note that we limit cpuidle-haltpoll support to when the event-stream is
> available. This is necessary because polling via smp_cond_load_relaxed()
> uses WFE to wait for a store which might not happen for an prolonged
> period of time. So, ensure the event-stream is around to provide a
> terminating condition.
>
> Signed-off-by: Ankur Arora <[email protected]>
> ---
> arch/arm64/Kconfig | 10 ++++++++++
> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
> 2 files changed, 31 insertions(+)
> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 7b11c98b3e84..6f2df162b10e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -34,6 +34,7 @@ config ARM64
> select ARCH_HAS_MEMBARRIER_SYNC_CORE
> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> + select ARCH_HAS_OPTIMIZED_POLL
> select ARCH_HAS_PTE_DEVMAP
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_HW_PTE_YOUNG
> @@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
> config ARCH_SUSPEND_POSSIBLE
> def_bool y
>
> +config ARCH_CPUIDLE_HALTPOLL
> + bool "Enable selection of the cpuidle-haltpoll driver"
> + default n
> + help
> + cpuidle-haltpoll allows for adaptive polling based on
> + current load before entering the idle state.
> +
> + Some virtualized workloads benefit from using it.
> +
> endmenu # "Power management options"
>
> menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..a79bdec7f516
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_HALTPOLL_H
> +#define _ASM_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu)
> +{
> +}
> +
> +static inline void arch_haltpoll_disable(unsigned int cpu)
> +{
> +}
> +
> +static inline bool arch_haltpoll_supported(void)
> +{
> + /*
> + * Ensure the event stream is available to provide a terminating
> + * condition to the WFE in the poll loop.
> + */
> + return arch_timer_evtstrm_available();
Note this fails build when CONFIG_HALTPOLL_CPUIDLE=m (module):
ERROR: modpost: "arch_cpu_idle" [drivers/cpuidle/cpuidle-haltpoll.ko]
undefined!
ERROR: modpost: "arch_timer_evtstrm_available"
[drivers/cpuidle/cpuidle-haltpoll.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
make[1]: *** [/home/ubuntu/linux/Makefile:1886: modpost] Error 2
make: *** [Makefile:240: __sub-make] Error 2
You could add EXPORT_SYMBOL_*()'s on the above helpers or restrict
HALTPOLL_CPUIDLE module to built-in (remove "tristate" Kconfig).
Otherwise, everything worked for me when built-in (=y) atop 6.10.0
(4a4be1a). I see similar performance gains in `perf bench` on AWS
Graviton3 c7g.16xlarge.
Regards,
Haris Okanovic
> +}
> +#endif
> --
> 2.39.3
>
Okanovic, Haris <[email protected]> writes:
> On Tue, 2024-04-30 at 11:37 -0700, Ankur Arora wrote:
>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>
>>
>>
>> Add architectural support for the cpuidle-haltpoll driver by defining
>> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
>> an optimized polling mechanism via smp_cond_load*().
>>
>> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
>> cpuidle-haltpoll to be selected.
>>
>> Note that we limit cpuidle-haltpoll support to when the event-stream is
>> available. This is necessary because polling via smp_cond_load_relaxed()
>> uses WFE to wait for a store which might not happen for an prolonged
>> period of time. So, ensure the event-stream is around to provide a
>> terminating condition.
>>
>> Signed-off-by: Ankur Arora <[email protected]>
>> ---
>> arch/arm64/Kconfig | 10 ++++++++++
>> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>> 2 files changed, 31 insertions(+)
>> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 7b11c98b3e84..6f2df162b10e 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -34,6 +34,7 @@ config ARM64
>> select ARCH_HAS_MEMBARRIER_SYNC_CORE
>> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> + select ARCH_HAS_OPTIMIZED_POLL
>> select ARCH_HAS_PTE_DEVMAP
>> select ARCH_HAS_PTE_SPECIAL
>> select ARCH_HAS_HW_PTE_YOUNG
>> @@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
>> config ARCH_SUSPEND_POSSIBLE
>> def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> + bool "Enable selection of the cpuidle-haltpoll driver"
>> + default n
>> + help
>> + cpuidle-haltpoll allows for adaptive polling based on
>> + current load before entering the idle state.
>> +
>> + Some virtualized workloads benefit from using it.
>> +
>> endmenu # "Power management options"
>>
>> menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..a79bdec7f516
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,21 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_HALTPOLL_H
>> +#define _ASM_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline void arch_haltpoll_disable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline bool arch_haltpoll_supported(void)
>> +{
>> + /*
>> + * Ensure the event stream is available to provide a terminating
>> + * condition to the WFE in the poll loop.
>> + */
>> + return arch_timer_evtstrm_available();
>
> Note this fails build when CONFIG_HALTPOLL_CPUIDLE=m (module):
>
> ERROR: modpost: "arch_cpu_idle" [drivers/cpuidle/cpuidle-haltpoll.ko]
> undefined!
> ERROR: modpost: "arch_timer_evtstrm_available"
> [drivers/cpuidle/cpuidle-haltpoll.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
> make[1]: *** [/home/ubuntu/linux/Makefile:1886: modpost] Error 2
> make: *** [Makefile:240: __sub-make] Error 2
Thanks for trying it out. Missed that.
> You could add EXPORT_SYMBOL_*()'s on the above helpers or restrict
> HALTPOLL_CPUIDLE module to built-in (remove "tristate" Kconfig).
Yeah AFAICT this is the only cpuidle driver providing the module
option. Unfortunately can't remove the tristate thing. People might
already be using it as a module on x86.
I think the arch_cpu_idle() makes sense to export. For
arch_timer_evtstrm_available(), eventually the arch_haltpoll_*()
in any case need to move out of a header file. I'll just do that
now.
> Otherwise, everything worked for me when built-in (=y) atop 6.10.0
> (4a4be1a). I see similar performance gains in `perf bench` on AWS
> Graviton3 c7g.16xlarge.
Excellent. Thanks for checking.
--
ankur