When the kernel is booted with idle=nomwait, do not use MWAIT as the
default idle state.

If the user boots the kernel with idle=nomwait, it is a clear
instruction not to use MWAIT as the default idle state.
However, the current code does not take this into consideration
while selecting the default idle state on x86.

This patch fixes it by checking for the idle=nomwait boot option in
prefer_mwait_c1_over_halt().

Also update the documentation around idle=nomwait appropriately.
Signed-off-by: Wyes Karny <[email protected]>
---
Changes in v4:
- Update documentation around idle=nomwait
- Rename patch subject
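
For illustration only, the check described in the commit message can be
modelled in user space roughly as follows. This is a minimal sketch, not
the kernel code: struct cpu_model, enum cpu_vendor and the has_mwait flag
are hypothetical stand-ins for struct cpuinfo_x86 and the remaining
feature checks, and the enum values are assumed rather than copied from
the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's definitions; the values are illustrative. */
enum idle_boot_override { IDLE_NO_OVERRIDE, IDLE_HALT, IDLE_NOMWAIT, IDLE_POLL };
enum cpu_vendor { VENDOR_INTEL, VENDOR_OTHER };

/* Hypothetical, reduced replacement for struct cpuinfo_x86. */
struct cpu_model {
    enum cpu_vendor vendor;
    int has_mwait; /* assumed flag standing in for the remaining checks */
};

static enum idle_boot_override boot_option_idle_override = IDLE_NO_OVERRIDE;

/* Returns 1 if MWAIT C1 should be preferred over HLT, 0 otherwise. */
static int prefer_mwait_c1_over_halt(const struct cpu_model *c)
{
    assert(c != NULL);

    /* Check added by the patch: the user has disallowed MWAIT. */
    if (boot_option_idle_override == IDLE_NOMWAIT)
        return 0;

    /* Check visible in the hunk: only Intel CPUs prefer MWAIT C1. */
    if (c->vendor != VENDOR_INTEL)
        return 0;

    return c->has_mwait;
}
```

With boot_option_idle_override set to IDLE_NOMWAIT, the model returns 0
for every CPU, so the caller would fall back to HLT.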
Documentation/admin-guide/pm/cpuidle.rst | 15 +++++++++------
arch/x86/kernel/process.c | 6 +++++-
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
index aec2cd2aaea7..19754beb5a4e 100644
--- a/Documentation/admin-guide/pm/cpuidle.rst
+++ b/Documentation/admin-guide/pm/cpuidle.rst
@@ -612,8 +612,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
by default this way, for example.
The other kernel command line parameters controlling CPU idle time management
-described below are only relevant for the *x86* architecture and some of
-them affect Intel processors only.
+described below are only relevant for the *x86* architecture and references
+to ``intel_idle`` affect Intel processors only.
The *x86* architecture support code recognizes three kernel command line
options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
@@ -635,10 +635,13 @@ idle, so it very well may hurt single-thread computations performance as well as
energy-efficiency. Thus using it for performance reasons may not be a good idea
at all.]
-The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
-``acpi_idle`` to be used (as long as all of the information needed by it is
-there in the system's ACPI tables), but it is not allowed to use the
-``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
+The ``idle=nomwait`` option prevents the use of the ``MWAIT`` instruction of
+the CPU to enter idle states. When this option is used, the ``acpi_idle``
+driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
+running Intel processors, this option disables the ``intel_idle`` driver
+and forces the use of the ``acpi_idle`` driver instead. Note that in either
+case, the ``acpi_idle`` driver will function only if all the information
+needed by it is in the system's ACPI tables.
In addition to the architecture-level kernel command line options affecting CPU
idle time management, there are parameters affecting individual ``CPUIdle``
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b370767f5b19..4e0178b066c5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -824,6 +824,10 @@ static void amd_e400_idle(void)
*/
static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
{
+ /* User has disallowed the use of MWAIT. Fall back to HALT */
+ if (boot_option_idle_override == IDLE_NOMWAIT)
+ return 0;
+
if (c->x86_vendor != X86_VENDOR_INTEL)
return 0;
@@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
} else if (!strcmp(str, "nomwait")) {
/*
* If the boot option of "idle=nomwait" is added,
- * it means that mwait will be disabled for CPU C2/C3
+ * it means that mwait will be disabled for CPU C1/C2/C3
* states. In such case it won't touch the variable
* of boot_option_idle_override.
*/
--
git-series 0.9.1
On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
> [...]
> } else if (!strcmp(str, "nomwait")) {
> /*
> * If the boot option of "idle=nomwait" is added,
> - * it means that mwait will be disabled for CPU C2/C3
> + * it means that mwait will be disabled for CPU
> C1/C2/C3
> * states. In such case it won't touch the variable
> * of boot_option_idle_override.
the code didn't change boot_option_idle_override when it was
introduced, but this has changed since commit d18960494f65 ("ACPI,
intel_idle: Cleanup idle= internal variables")
thanks,
rui
Hi Rui,
On 5/25/2022 1:36 PM, Zhang Rui wrote:
> On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
>> [...]
>> @@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
>> } else if (!strcmp(str, "nomwait")) {
>> /*
>> * If the boot option of "idle=nomwait" is added,
>> - * it means that mwait will be disabled for CPU C2/C3
>> + * it means that mwait will be disabled for CPU
>> C1/C2/C3
>> * states. In such case it won't touch the variable
>> * of boot_option_idle_override.
>
> the code didn't change boot_option_idle_override when it was
> introduced, but this has changed since commit d18960494f65 ("ACPI,
> intel_idle: Cleanup idle= internal variables")
Could you please clarify a bit more why the commit you mentioned is
related to this patch?
>
> thanks,
> rui
>
On Thu, 2022-06-02 at 21:11 +0530, Wyes Karny wrote:
> >
>
> Hi Rui,
>
> On 5/25/2022 1:36 PM, Zhang Rui wrote:
> > On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
> > > [...]
> > > @@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
> > > } else if (!strcmp(str, "nomwait")) {
> > > /*
> > > * If the boot option of "idle=nomwait" is added,
> > > - * it means that mwait will be disabled for CPU C2/C3
> > > + * it means that mwait will be disabled for CPU
> > > C1/C2/C3
> > > * states. In such case it won't touch the variable
> > > * of boot_option_idle_override.
> >
> > the code didn't change boot_option_idle_override when it was
> > introduced, but this has changed since commit d18960494f65 ("ACPI,
> > intel_idle: Cleanup idle= internal variables")
>
> Could you please clarify a bit more why the commit you mentioned is
> related to this patch?
>
The comment "In such case it won't touch the variable of
boot_option_idle_override." has been broken for some time; it is not
related to this patch. But given that this patch is described as "Also
update the documentation around idle=nomwait appropriately", my
suggestion is to fix the comment as well, by deleting its last sentence.
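
As a rough sketch of that suggestion, the "nomwait" branch with the stale
sentence removed could look like the self-contained user-space model
below. The function name idle_setup_model() is hypothetical, the "poll"
and "halt" branches are simplified assumptions, and the real idle_setup()
does more than this.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's enum; the values are illustrative. */
enum idle_boot_override { IDLE_NO_OVERRIDE, IDLE_HALT, IDLE_NOMWAIT, IDLE_POLL };

static enum idle_boot_override boot_option_idle_override = IDLE_NO_OVERRIDE;

/* Simplified model of idle= parsing: returns 0 on success, -1 otherwise. */
static int idle_setup_model(const char *str)
{
    if (str == NULL)
        return -1;

    if (!strcmp(str, "poll")) {
        boot_option_idle_override = IDLE_POLL;
    } else if (!strcmp(str, "halt")) {
        boot_option_idle_override = IDLE_HALT;
    } else if (!strcmp(str, "nomwait")) {
        /*
         * If the boot option of "idle=nomwait" is added,
         * it means that mwait will be disabled for CPU C1/C2/C3
         * states.
         */
        boot_option_idle_override = IDLE_NOMWAIT;
    } else {
        /* Unknown value: leave the override untouched. */
        return -1;
    }

    return 0;
}
```

The point of the sketch is only that the branch does set
boot_option_idle_override, which is why the old last sentence of the
comment no longer holds.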
thanks,
rui
Hello Rui,
On 6/5/2022 6:02 PM, Zhang Rui wrote:
> On Thu, 2022-06-02 at 21:11 +0530, Wyes Karny wrote:
>>>
>>
>> Hi Rui,
>>
>> On 5/25/2022 1:36 PM, Zhang Rui wrote:
>>> On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
>>>> [...]
>>>> @@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
>>>> } else if (!strcmp(str, "nomwait")) {
>>>> /*
>>>> * If the boot option of "idle=nomwait" is added,
>>>> - * it means that mwait will be disabled for CPU C2/C3
>>>> + * it means that mwait will be disabled for CPU
>>>> C1/C2/C3
>>>> * states. In such case it won't touch the variable
>>>> * of boot_option_idle_override.
>>>
>>> the code didn't change boot_option_idle_override when it was
>>> introduced, but this has changed since commit d18960494f65 ("ACPI,
>>> intel_idle: Cleanup idle= internal variables")
>>
>> Could you please clarify a bit more why the commit you mentioned is
>> related to this patch?
>>
>
> The comment "In such case it won't touch the variable of
> boot_option_idle_override." has been broken for some time; it is not
> related to this patch. But given that this patch is described as "Also
> update the documentation around idle=nomwait appropriately", my
> suggestion is to fix the comment as well, by deleting its last sentence.
Sure, will do. Thanks!
>
> thanks,
> rui
>