2019-11-06 11:57:07

by Zhenzhong Duan

Subject: [PATCH RESEND v2 0/4] misc fixes on halt-poll code for both KVM and guest

This patchset tries to fix the issues below:

1. An admin can set halt_poll_ns to 0 at runtime to disable polling, in which
case the kernel behaves just like the generic halt driver. If
guest_halt_poll_grow_start is then set to 0 and guest_halt_poll_ns is later
set to a nonzero value, cpu_halt_poll_us will never grow beyond 0. The first
two patches fix this issue on both the KVM and guest sides.

2. guest_halt_poll_grow_start and guest_halt_poll_ns can be adjusted at
runtime by the admin, which opens a window in which cpu_halt_poll_us lies
outside the boundaries they define. The window can be long in some cases
(e.g. when guest_halt_poll_grow_start is bumped while cpu_halt_poll_us is
shrinking). The last two patches fix this issue on both the KVM and guest
sides.

3. The 4th patch also simplifies the branch-check code.

v2:
Rewrite the patches and drop unnecessary changes

Zhenzhong Duan (4):
cpuidle-haltpoll: ensure grow start value is nonzero
KVM: ensure grow start value is nonzero
cpuidle-haltpoll: ensure cpu_halt_poll_us in right scope
KVM: ensure vCPU halt_poll_us in right scope

drivers/cpuidle/governors/haltpoll.c | 50 ++++++++++++++++++++++++-----------
virt/kvm/kvm_main.c | 51 ++++++++++++++++++++++++------------
2 files changed, 68 insertions(+), 33 deletions(-)

--
1.8.3.1


2019-11-06 11:58:57

by Zhenzhong Duan

Subject: [PATCH RESEND v2 2/4] KVM: ensure grow start value is nonzero

vcpu->halt_poll_ns could be zeroed in certain cases (e.g. by
halt_poll_ns = 0). If halt_poll_grow_start is zero,
vcpu->halt_poll_ns will never be bigger than zero.

Use param callback to avoid writing zero to halt_poll_grow_start.

Signed-off-by: Zhenzhong Duan <[email protected]>
---
virt/kvm/kvm_main.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d6f0696..359516b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -69,6 +69,26 @@
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");

+static int grow_start_set(const char *val, const struct kernel_param *kp)
+{
+ int ret;
+ unsigned int n;
+
+ if (!val)
+ return -EINVAL;
+
+ ret = kstrtouint(val, 0, &n);
+ if (ret || !n)
+ return -EINVAL;
+
+ return param_set_uint(val, kp);
+}
+
+static const struct kernel_param_ops grow_start_ops = {
+ .set = grow_start_set,
+ .get = param_get_uint,
+};
+
/* Architectures should define their poll value according to the halt latency */
unsigned int halt_poll_ns = KVM_HALT_POLL_NS_DEFAULT;
module_param(halt_poll_ns, uint, 0644);
@@ -81,7 +101,7 @@

/* The start value to grow halt_poll_ns from */
unsigned int halt_poll_ns_grow_start = 10000; /* 10us */
-module_param(halt_poll_ns_grow_start, uint, 0644);
+module_param_cb(halt_poll_ns_grow_start, &grow_start_ops, &halt_poll_ns_grow_start, 0644);
EXPORT_SYMBOL_GPL(halt_poll_ns_grow_start);

/* Default resets per-vcpu halt_poll_ns . */
--
1.8.3.1
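
For readers who want to try the validation outside the kernel, the check done
by the set callback in the patch above can be approximated in user space. This
is only a sketch: parse_nonzero_uint() is a hypothetical helper, strtoul()
stands in for kstrtouint(), and the parsing is slightly stricter than the
kernel's (the string must start with a digit).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * User-space analogue of the validation in grow_start_set().
 * Hypothetical helper, not a kernel API.
 * Returns 0 and stores the value on success, -EINVAL otherwise.
 */
static int parse_nonzero_uint(const char *val, unsigned int *out)
{
	char *end;
	unsigned long n;

	assert(out != NULL);
	if (!val || !isdigit((unsigned char)val[0]))
		return -EINVAL;		/* NULL, empty, signed or non-numeric */

	errno = 0;
	n = strtoul(val, &end, 0);	/* base 0: decimal, hex or octal */
	if (errno)
		return -EINVAL;		/* out of range for unsigned long */
	if (*end == '\n')		/* sysfs writes usually end in '\n' */
		end++;
	if (*end != '\0' || n == 0 || n > UINT_MAX)
		return -EINVAL;		/* trailing junk, zero, or too large */

	*out = (unsigned int)n;
	return 0;
}
```

As in the patch, a value of zero is rejected with -EINVAL and the stored
parameter is left untouched.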

2019-11-06 12:01:03

by Zhenzhong Duan

Subject: [PATCH RESEND v2 3/4] cpuidle-haltpoll: ensure cpu_halt_poll_us in right scope

A user can adjust guest_halt_poll_grow_start and guest_halt_poll_ns at
runtime, which can leave cpu_halt_poll_us outside the two boundaries. This
patch ensures that cpu_halt_poll_us stays within that range.

If guest_halt_poll_shrink is 0, shrink cpu_halt_poll_us to
guest_halt_poll_grow_start instead of 0. To disable polling, set
guest_halt_poll_ns to 0.

If the user wrongly sets guest_halt_poll_grow_start > guest_halt_poll_ns > 0,
guest_halt_poll_ns takes precedence and the poll time is fixed at
guest_halt_poll_ns.

Signed-off-by: Zhenzhong Duan <[email protected]>
---
drivers/cpuidle/governors/haltpoll.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
index 660859d..4a39df4 100644
--- a/drivers/cpuidle/governors/haltpoll.c
+++ b/drivers/cpuidle/governors/haltpoll.c
@@ -97,32 +97,30 @@ static int haltpoll_select(struct cpuidle_driver *drv,

static void adjust_poll_limit(struct cpuidle_device *dev, unsigned int block_us)
{
- unsigned int val;
+ unsigned int val = dev->poll_limit_ns;
u64 block_ns = block_us*NSEC_PER_USEC;

/* Grow cpu_halt_poll_us if
- * cpu_halt_poll_us < block_ns < guest_halt_poll_us
+ * cpu_halt_poll_us < block_ns <= guest_halt_poll_us
*/
- if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns) {
+ if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns &&
+ guest_halt_poll_grow)
val = dev->poll_limit_ns * guest_halt_poll_grow;
-
- if (val < guest_halt_poll_grow_start)
- val = guest_halt_poll_grow_start;
- if (val > guest_halt_poll_ns)
- val = guest_halt_poll_ns;
-
- dev->poll_limit_ns = val;
- } else if (block_ns > guest_halt_poll_ns &&
- guest_halt_poll_allow_shrink) {
+ else if (block_ns > guest_halt_poll_ns &&
+ guest_halt_poll_allow_shrink) {
unsigned int shrink = guest_halt_poll_shrink;

- val = dev->poll_limit_ns;
if (shrink == 0)
- val = 0;
+ val = guest_halt_poll_grow_start;
else
val /= shrink;
- dev->poll_limit_ns = val;
}
+ if (val < guest_halt_poll_grow_start)
+ val = guest_halt_poll_grow_start;
+ if (val > guest_halt_poll_ns)
+ val = guest_halt_poll_ns;
+
+ dev->poll_limit_ns = val;
}

/**
--
1.8.3.1

2019-11-11 20:17:08

by Sean Christopherson

Subject: Re: [PATCH RESEND v2 2/4] KVM: ensure grow start value is nonzero

On Wed, Nov 06, 2019 at 07:55:00PM +0800, Zhenzhong Duan wrote:
> vcpu->halt_poll_ns could be zeroed in certain cases (e.g. by
> halt_poll_ns = 0). If halt_poll_grow_start is zero,
> vcpu->halt_poll_ns will never be bigger than zero.
>
> Use param callback to avoid writing zero to halt_poll_grow_start.

This doesn't explain why allowing an admin to disable halt polling by
writing halt_poll_grow_start=0 is a bad thing. Paolo had the same
question in v1, here[1] and in the guest driver[2].

[1] https://lkml.kernel.org/r/[email protected]
[2] https://lkml.kernel.org/r/[email protected]

>
> Signed-off-by: Zhenzhong Duan <[email protected]>
> ---
> virt/kvm/kvm_main.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d6f0696..359516b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -69,6 +69,26 @@
> MODULE_AUTHOR("Qumranet");
> MODULE_LICENSE("GPL");
>
> +static int grow_start_set(const char *val, const struct kernel_param *kp)
> +{
> + int ret;
> + unsigned int n;
> +
> + if (!val)
> + return -EINVAL;
> +
> + ret = kstrtouint(val, 0, &n);
> + if (ret || !n)
> + return -EINVAL;
> +
> + return param_set_uint(val, kp);
> +}
> +
> +static const struct kernel_param_ops grow_start_ops = {
> + .set = grow_start_set,
> + .get = param_get_uint,
> +};
> +
> /* Architectures should define their poll value according to the halt latency */
> unsigned int halt_poll_ns = KVM_HALT_POLL_NS_DEFAULT;
> module_param(halt_poll_ns, uint, 0644);
> @@ -81,7 +101,7 @@
>
> /* The start value to grow halt_poll_ns from */
> unsigned int halt_poll_ns_grow_start = 10000; /* 10us */
> -module_param(halt_poll_ns_grow_start, uint, 0644);
> +module_param_cb(halt_poll_ns_grow_start, &grow_start_ops, &halt_poll_ns_grow_start, 0644);
> EXPORT_SYMBOL_GPL(halt_poll_ns_grow_start);
>
> /* Default resets per-vcpu halt_poll_ns . */
> --
> 1.8.3.1
>

2019-11-12 12:20:38

by Zhenzhong Duan

Subject: Re: [PATCH RESEND v2 2/4] KVM: ensure grow start value is nonzero

On 2019/11/12 4:13, Sean Christopherson wrote:
> On Wed, Nov 06, 2019 at 07:55:00PM +0800, Zhenzhong Duan wrote:
>> vcpu->halt_poll_ns could be zeroed in certain cases (e.g. by
>> halt_poll_ns = 0). If halt_poll_grow_start is zero,
>> vcpu->halt_poll_ns will never be bigger than zero.
>>
>> Use param callback to avoid writing zero to halt_poll_grow_start.
> This doesn't explain why allowing an admin to disable halt polling by
> writing halt_poll_grow_start=0 is a bad thing. Paolo had the same
> question in v1, here[1] and in the guest driver[2].
>
> [1]https://lkml.kernel.org/r/[email protected]
> [2]https://lkml.kernel.org/r/[email protected]

OK, let me answer all the questions about grow_start=0 here.

The vCPU halt polling time may be nonzero even if grow_start=0, for example in the situation below:
0 = grow_start < block_ns < vcpu->halt_poll_ns < halt_poll_ns

grow_start=0 has the effect you mention only in the sequence below:
1. Set halt_poll_ns=0 to disable halt polling (this leads to vcpu->halt_poll_ns=0).
2. Set grow_start=0.
3. Set halt_poll_ns to a nonzero value.
4. The admin expects the halt polling time to auto-adjust in the range [0, nonzero], but the polling time sticks at 0.

So I think we should use halt_poll_ns=0 to disable halt polling instead of grow_start=0.

Zhenzhong

2019-11-15 10:46:34

by Rafael J. Wysocki

Subject: Re: [PATCH RESEND v2 3/4] cpuidle-haltpoll: ensure cpu_halt_poll_us in right scope

On Wednesday, November 6, 2019 12:55:01 PM CET Zhenzhong Duan wrote:
> As user can adjust guest_halt_poll_grow_start and guest_halt_poll_ns
> which leads to cpu_halt_poll_us beyond the two boundaries. This patch
> ensures cpu_halt_poll_us in that scope.
>
> If guest_halt_poll_shrink is 0, shrink the cpu_halt_poll_us to
> guest_halt_poll_grow_start instead of 0. To disable poll we can set
> guest_halt_poll_ns to 0.
>
> If user wrongly set guest_halt_poll_grow_start > guest_halt_poll_ns > 0,
> guest_halt_poll_ns take precedency and poll time is a fixed value of
> guest_halt_poll_ns.
>
> Signed-off-by: Zhenzhong Duan <[email protected]>
> ---
> drivers/cpuidle/governors/haltpoll.c | 28 +++++++++++++---------------
> 1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
> index 660859d..4a39df4 100644
> --- a/drivers/cpuidle/governors/haltpoll.c
> +++ b/drivers/cpuidle/governors/haltpoll.c
> @@ -97,32 +97,30 @@ static int haltpoll_select(struct cpuidle_driver *drv,
>
> static void adjust_poll_limit(struct cpuidle_device *dev, unsigned int block_us)
> {
> - unsigned int val;
> + unsigned int val = dev->poll_limit_ns;

Not necessary to initialize it here.

> u64 block_ns = block_us*NSEC_PER_USEC;
>
> /* Grow cpu_halt_poll_us if
> - * cpu_halt_poll_us < block_ns < guest_halt_poll_us
> + * cpu_halt_poll_us < block_ns <= guest_halt_poll_us

You could update the comment to say "dev->poll_limit_ns" instead of
"cpu_halt_poll_us" while at it.

> */
> - if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns) {
> + if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns &&
> + guest_halt_poll_grow)

The "{" brace is still needed as per the coding style and I'm not sure why
to avoid guest_halt_poll_grow equal to zero here?

> val = dev->poll_limit_ns * guest_halt_poll_grow;
> -
> - if (val < guest_halt_poll_grow_start)
> - val = guest_halt_poll_grow_start;
> - if (val > guest_halt_poll_ns)
> - val = guest_halt_poll_ns;
> -
> - dev->poll_limit_ns = val;
> - } else if (block_ns > guest_halt_poll_ns &&
> - guest_halt_poll_allow_shrink) {
> + else if (block_ns > guest_halt_poll_ns &&
> + guest_halt_poll_allow_shrink) {
> unsigned int shrink = guest_halt_poll_shrink;
>
> - val = dev->poll_limit_ns;
> if (shrink == 0)
> - val = 0;
> + val = guest_halt_poll_grow_start;

That's going to be corrected below, so the original code would be fine.

> else
> val /= shrink;

Here you can do

val = dev->poll_limit_ns / shrink;

> - dev->poll_limit_ns = val;
> }
> + if (val < guest_halt_poll_grow_start)
> + val = guest_halt_poll_grow_start;

Note that guest_halt_poll_grow_start is in us (as per the comment next to its
definition and the initial value). That is a bug in the original code too,
but anyway.

> + if (val > guest_halt_poll_ns)
> + val = guest_halt_poll_ns;
> +
> + dev->poll_limit_ns = val;
> }
>
> /**
>




2019-11-17 09:02:15

by Zhenzhong Duan

Subject: Re: [PATCH RESEND v2 3/4] cpuidle-haltpoll: ensure cpu_halt_poll_us in right scope

On 2019/11/15 18:45, Rafael J. Wysocki wrote:

> On Wednesday, November 6, 2019 12:55:01 PM CET Zhenzhong Duan wrote:
>> As user can adjust guest_halt_poll_grow_start and guest_halt_poll_ns
>> which leads to cpu_halt_poll_us beyond the two boundaries. This patch
>> ensures cpu_halt_poll_us in that scope.
>>
>> If guest_halt_poll_shrink is 0, shrink the cpu_halt_poll_us to
>> guest_halt_poll_grow_start instead of 0. To disable poll we can set
>> guest_halt_poll_ns to 0.
>>
>> If user wrongly set guest_halt_poll_grow_start > guest_halt_poll_ns > 0,
>> guest_halt_poll_ns take precedency and poll time is a fixed value of
>> guest_halt_poll_ns.
>>
>> Signed-off-by: Zhenzhong Duan <[email protected]>
>> ---
>> drivers/cpuidle/governors/haltpoll.c | 28 +++++++++++++---------------
>> 1 file changed, 13 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
>> index 660859d..4a39df4 100644
>> --- a/drivers/cpuidle/governors/haltpoll.c
>> +++ b/drivers/cpuidle/governors/haltpoll.c
>> @@ -97,32 +97,30 @@ static int haltpoll_select(struct cpuidle_driver *drv,
>>
>> static void adjust_poll_limit(struct cpuidle_device *dev, unsigned int block_us)
>> {
>> - unsigned int val;
>> + unsigned int val = dev->poll_limit_ns;
> Not necessary to initialize it here.

Then a random val may bypass all the checks and get assigned to dev->poll_limit_ns
if guest_halt_poll_grow_start < block_ns < uninitialized val < guest_halt_poll_ns.

With my change, dev->poll_limit_ns will not be changed in that case; the logic is the same as in the original code.

>
>> u64 block_ns = block_us*NSEC_PER_USEC;
>>
>> /* Grow cpu_halt_poll_us if
>> - * cpu_halt_poll_us < block_ns < guest_halt_poll_us
>> + * cpu_halt_poll_us < block_ns <= guest_halt_poll_us
> You could update the comment to say "dev->poll_limit_ns" instead of
> "cpu_halt_poll_us" while at it.

Will do, and I will also change guest_halt_poll_us to guest_halt_poll_ns.

>
>> */
>> - if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns) {
>> + if (block_ns > dev->poll_limit_ns && block_ns <= guest_halt_poll_ns &&
>> + guest_halt_poll_grow)
> The "{" brace is still needed as per the coding style and I'm not sure why
> to avoid guest_halt_poll_grow equal to zero here?

Will add "{}" and remove the guest_halt_poll_grow check. My initial thought was to prevent
dev->poll_limit_ns from being shrunk when guest_halt_poll_grow=0.

>
>> val = dev->poll_limit_ns * guest_halt_poll_grow;
>> -
>> - if (val < guest_halt_poll_grow_start)
>> - val = guest_halt_poll_grow_start;
>> - if (val > guest_halt_poll_ns)
>> - val = guest_halt_poll_ns;
>> -
>> - dev->poll_limit_ns = val;
>> - } else if (block_ns > guest_halt_poll_ns &&
>> - guest_halt_poll_allow_shrink) {
>> + else if (block_ns > guest_halt_poll_ns &&
>> + guest_halt_poll_allow_shrink) {
>> unsigned int shrink = guest_halt_poll_shrink;
>>
>> - val = dev->poll_limit_ns;
>> if (shrink == 0)
>> - val = 0;
>> + val = guest_halt_poll_grow_start;
> That's going to be corrected below, so the original code would be fine.

With 'val = 0', val is assigned twice, while with my change it is assigned only once. Isn't that slightly more optimal?

>
>> else
>> val /= shrink;
> Here you can do
>
> val = dev->poll_limit_ns / shrink;

Any special reason? It looks no different to me.

>
>> - dev->poll_limit_ns = val;
>> }
>> + if (val < guest_halt_poll_grow_start)
>> + val = guest_halt_poll_grow_start;
> Note that guest_halt_poll_grow_start is in us (as per the comment next to its
> definition and the initial value). That is a bug in the original code too,
> but anyway.

Good catch! I will fix the comment. Given the default of 50000 (50000ns vs. 50000us), it looks like the author meant ns.
guest_halt_poll_ns defaults to 200000, which also hints at ns for guest_halt_poll_grow_start.

Thanks

Zhenzhong