2022-12-02 04:27:26

by Li RongQing

[permalink] [raw]
Subject: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver

From: Li RongQing <[email protected]>

x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will
cause performance degradation, so override prefer_mwait_c1_over_halt
to a new value, aviod loading cpuidle-haltpoll driver

Signed-off-by: Li RongQing <[email protected]>
---
arch/x86/include/asm/processor.h | 2 +-
arch/x86/kernel/process.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 67c9d73..6bc94fd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -658,7 +658,7 @@ extern void amd_e400_c1e_apic_setup(void);
extern unsigned long boot_option_idle_override;

enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT,
- IDLE_POLL};
+ IDLE_POLL, IDLE_PREF_MWAIT};

extern void enable_sep_cpu(void);
extern int sysenter_setup(void);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c21b734..a16985c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
} else if (prefer_mwait_c1_over_halt(c)) {
pr_info("using mwait in idle threads\n");
x86_idle = mwait_idle;
+ boot_option_idle_override = IDLE_PREF_MWAIT;
} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
pr_info("using TDX aware idle routine\n");
x86_idle = tdx_safe_halt;
--
2.9.4


2022-12-02 19:03:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver

Li!

On Fri, Dec 02 2022 at 11:37, [email protected] wrote:
> From: Li RongQing <[email protected]>
>
> x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will
> cause performance degradation, so override prefer_mwait_c1_over_halt
> to a new value, aviod loading cpuidle-haltpoll driver

Neither the subject line nor the above makes any sense to me.

prefer_mwait_c1_over_halt() is a function which is invoked and when it
returns true then the execution ends up in the code path you are
patching.

> @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
> } else if (prefer_mwait_c1_over_halt(c)) {
> pr_info("using mwait in idle threads\n");
> x86_idle = mwait_idle;
> + boot_option_idle_override = IDLE_PREF_MWAIT;

What you do is setting boot_option_idle_override to a new value, but
that has nothing to do with prefer_mwait_c1_over_halt() at all.

So how are you overriding that function to a new value?

But that's just a word smithing problem.

The real and way worse problem is that you pick a variable, which has
the purpose to capture the idle override on the kernel command line, and
modify it as you see fit, just to prevent that driver from loading.

select_idle_routine() reads it to check whether there was a command line
override or not. But it is not supposed to write it. Why?

Have you checked what else evaluates that variable? Obviously not,
because a simple grep would have told you:

drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE)
drivers/idle/intel_idle.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE)

Congratulations!

Your patch breaks the default setup of every recent Intel system on the
planet because it not only prevents the cpuidle-haltpoll, but also the
intel_idle driver from loading.

Seriously. It's not too much asked to do at least a simple grep and look
at all _nine_ places which evaluate boot_option_idle_override.

Thanks,

tglx




2022-12-04 12:21:49

by Li RongQing

[permalink] [raw]
Subject: RE: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid loading cpuidle-haltpoll driver



> -----Original Message-----
> From: Thomas Gleixner <[email protected]>
> Sent: Saturday, December 3, 2022 2:48 AM
> To: Li,Rongqing <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid
> loading cpuidle-haltpoll driver
>
> Li!
>
> On Fri, Dec 02 2022 at 11:37, [email protected] wrote:
> > From: Li RongQing <[email protected]>
> >
> > x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will
> > cause performance degradation, so override prefer_mwait_c1_over_halt
> > to a new value, aviod loading cpuidle-haltpoll driver
>
> Neither the subject line nor the above makes any sense to me.
>
> prefer_mwait_c1_over_halt() is a function which is invoked and when it returns
> true then the execution ends up in the code path you are patching.
>
> > @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
> > } else if (prefer_mwait_c1_over_halt(c)) {
> > pr_info("using mwait in idle threads\n");
> > x86_idle = mwait_idle;
> > + boot_option_idle_override = IDLE_PREF_MWAIT;
>
> What you do is setting boot_option_idle_override to a new value, but that has
> nothing to do with prefer_mwait_c1_over_halt() at all.
>
> So how are you overriding that function to a new value?
>
> But that's just a word smithing problem.
>
> The real and way worse problem is that you pick a variable, which has the
> purpose to capture the idle override on the kernel command line, and modify it
> as you see fit, just to prevent that driver from loading.
>
> select_idle_routine() reads it to check whether there was a command line
> override or not. But it is not supposed to write it. Why?
>
> Have you checked what else evaluates that variable? Obviously not, because a
> simple grep would have told you:
>
> drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override !=
> IDLE_NO_OVERRIDE)
> drivers/idle/intel_idle.c: if (boot_option_idle_override !=
> IDLE_NO_OVERRIDE)
>
> Congratulations!
>
> Your patch breaks the default setup of every recent Intel system on the planet
> because it not only prevents the cpuidle-haltpoll, but also the intel_idle driver
> from loading.
>
> Seriously. It's not too much asked to do at least a simple grep and look at all
> _nine_ places which evaluate boot_option_idle_override.
>

Sorry for the careless

Thanks for the review, I will send a new version, which export a function to tell haltpoll driver whether or not mwait is used, like below


diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 67c9d73..159ef33 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -862,4 +862,6 @@ bool arch_is_platform_page(u64 paddr);
#define arch_is_platform_page arch_is_platform_page
#endif

+bool is_mwait_idle(void);
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c21b734..330972c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -896,6 +896,12 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
x86_idle = default_idle;
}

+bool is_mwait_idle(void)
+{
+ return x86_idle == mwait_idle;
+}
+EXPORT_SYMBOL_GPL(is_mwait_idle);
+
void amd_e400_c1e_apic_setup(void)
{
if (boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c
index 3a39a7f..8cf1ddf 100644
--- a/drivers/cpuidle/cpuidle-haltpoll.c
+++ b/drivers/cpuidle/cpuidle-haltpoll.c
@@ -17,6 +17,7 @@
#include <linux/sched/idle.h>
#include <linux/kvm_para.h>
#include <linux/cpuidle_haltpoll.h>
+#include <linux/processor.h>

static bool force __read_mostly;
module_param(force, bool, 0444);
@@ -111,6 +112,9 @@ static int __init haltpoll_init(void)
if (!kvm_para_available() || !haltpoll_want())
return -ENODEV;

+ if (is_mwait_idle())
+ return -ENODEV;
+
cpuidle_poll_state_init(drv);

ret = cpuidle_register_driver(drv);



Thanks

-Li