2019-03-05 07:57:36

by Dongli Zhang

[permalink] [raw]
Subject: [PATCH v4.9 1/1] jiffies: use jiffies64_to_nsecs() to fix 100% steal usage for xen vcpu hotplug

[ Not relevant upstream, therefore no upstream commit. ]

To fix, use jiffies64_to_nsecs() directly instead of deriving the result
according to jiffies_to_usecs().

As the return type of jiffies_to_usecs() is 'unsigned int', when the return
value is more than the size of 'unsigned int', the leading 32 bits would be
discarded.

Suppose USEC_PER_SEC=1000000L and HZ=1000, below are the expected and
actual incorrect result of jiffies_to_usecs(0x7770ef70):

- expected : jiffies_to_usecs(0x7770ef70) = 0x000001d291274d80
- incorrect : jiffies_to_usecs(0x7770ef70) = 0x0000000091274d80

The leading 0x000001d200000000 is discarded.

After xen vcpu hotplug and when the new vcpu steal clock is calculated for
the first time, the result of this_rq()->prev_steal_time in
steal_account_process_tick() would be far smaller than the expected
value, due to that jiffies_to_usecs() discards the leading 32 bits.

As a result, the diff between current steal and this_rq()->prev_steal_time
is always very large. Steal usage would become 100% when the initial steal
clock obtained from xen hypervisor is very large during xen vcpu hotplug,
that is, when the guest is already up for a long time.

The bug can be detected by doing the following:

* Boot xen guest with vcpus=2 and maxvcpus=4
* Leave the guest running for a month so that the initial steal clock for
the new vcpu would be very large
* Hotplug 2 extra vcpus
* The steal time of new vcpus in /proc/stat would increase abnormally and
sometimes steal usage in top can become 100%

This was incidentally fixed in the patch set starting by
commit 93825f2ec736 ("jiffies: Reuse TICK_NSEC instead of NSEC_PER_JIFFY")
and ended with
commit b672592f0221 ("sched/cputime: Remove generic asm headers").

This version applies to the v4.9 series.

Link: https://lkml.org/lkml/2019/2/28/1373
Suggested-by: Juergen Gross <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
---
include/linux/jiffies.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 734377a..94aff43 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -287,13 +287,13 @@ extern unsigned long preset_lpj;
extern unsigned int jiffies_to_msecs(const unsigned long j);
extern unsigned int jiffies_to_usecs(const unsigned long j);

+extern u64 jiffies64_to_nsecs(u64 j);
+
static inline u64 jiffies_to_nsecs(const unsigned long j)
{
- return (u64)jiffies_to_usecs(j) * NSEC_PER_USEC;
+ return jiffies64_to_nsecs(j);
}

-extern u64 jiffies64_to_nsecs(u64 j);
-
extern unsigned long __msecs_to_jiffies(const unsigned int m);
#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
/*
--
2.7.4



2019-03-06 08:18:42

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4.9 1/1] jiffies: use jiffies64_to_nsecs() to fix 100% steal usage for xen vcpu hotplug

Thanks to Joe Jin's reminding, this patch is applicable to mainline linux
kernel, although there is no issue due to this kind of bug in mainline kernel.

Therefore, can I first submit this patch to mainline kernel and then backport it
to stable linux with more detailed explanation how the issue is reproduced on xen?

This would help synchronize stable with mainline better.

Thank you very much!

Dongli Zhang

On 3/5/19 3:59 PM, Dongli Zhang wrote:
> [ Not relevant upstream, therefore no upstream commit. ]
>
> To fix, use jiffies64_to_nsecs() directly instead of deriving the result
> according to jiffies_to_usecs().
>
> As the return type of jiffies_to_usecs() is 'unsigned int', when the return
> value is more than the size of 'unsigned int', the leading 32 bits would be
> discarded.
>
> Suppose USEC_PER_SEC=1000000L and HZ=1000, below are the expected and
> actual incorrect result of jiffies_to_usecs(0x7770ef70):
>
> - expected : jiffies_to_usecs(0x7770ef70) = 0x000001d291274d80
> - incorrect : jiffies_to_usecs(0x7770ef70) = 0x0000000091274d80
>
> The leading 0x000001d200000000 is discarded.
>
> After xen vcpu hotplug and when the new vcpu steal clock is calculated for
> the first time, the result of this_rq()->prev_steal_time in
> steal_account_process_tick() would be far smaller than the expected
> value, due to that jiffies_to_usecs() discards the leading 32 bits.
>
> As a result, the diff between current steal and this_rq()->prev_steal_time
> is always very large. Steal usage would become 100% when the initial steal
> clock obtained from xen hypervisor is very large during xen vcpu hotplug,
> that is, when the guest is already up for a long time.
>
> The bug can be detected by doing the following:
>
> * Boot xen guest with vcpus=2 and maxvcpus=4
> * Leave the guest running for a month so that the initial steal clock for
> the new vcpu would be very large
> * Hotplug 2 extra vcpus
> * The steal time of new vcpus in /proc/stat would increase abnormally and
> sometimes steal usage in top can become 100%
>
> This was incidentally fixed in the patch set starting by
> commit 93825f2ec736 ("jiffies: Reuse TICK_NSEC instead of NSEC_PER_JIFFY")
> and ended with
> commit b672592f0221 ("sched/cputime: Remove generic asm headers").
>
> This version applies to the v4.9 series.
>
> Link: https://lkml.org/lkml/2019/2/28/1373
> Suggested-by: Juergen Gross <[email protected]>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> include/linux/jiffies.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
> index 734377a..94aff43 100644
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -287,13 +287,13 @@ extern unsigned long preset_lpj;
> extern unsigned int jiffies_to_msecs(const unsigned long j);
> extern unsigned int jiffies_to_usecs(const unsigned long j);
>
> +extern u64 jiffies64_to_nsecs(u64 j);
> +
> static inline u64 jiffies_to_nsecs(const unsigned long j)
> {
> - return (u64)jiffies_to_usecs(j) * NSEC_PER_USEC;
> + return jiffies64_to_nsecs(j);
> }
>
> -extern u64 jiffies64_to_nsecs(u64 j);
> -
> extern unsigned long __msecs_to_jiffies(const unsigned int m);
> #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
> /*
>

2019-03-12 12:56:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4.9 1/1] jiffies: use jiffies64_to_nsecs() to fix 100% steal usage for xen vcpu hotplug

On Wed, Mar 06, 2019 at 03:35:40PM +0800, Dongli Zhang wrote:
> Thanks to Joe Jin's reminding, this patch is applicable to mainline linux
> kernel, although there is no issue due to this kind of bug in mainline kernel.
>
> Therefore, can I first submit this patch to mainline kernel and then backport it
> to stable linux with more detailed explanation how the issue is reproduced on xen?

Yes, please do that.

thanks,

greg k-h

2019-03-12 23:42:01

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH v4.9 1/1] jiffies: use jiffies64_to_nsecs() to fix 100% steal usage for xen vcpu hotplug



On 3/12/19 8:56 PM, Greg KH wrote:
> On Wed, Mar 06, 2019 at 03:35:40PM +0800, Dongli Zhang wrote:
>> Thanks to Joe Jin's reminding, this patch is applicable to mainline linux
>> kernel, although there is no issue due to this kind of bug in mainline kernel.
>>
>> Therefore, can I first submit this patch to mainline kernel and then backport it
>> to stable linux with more detailed explanation how the issue is reproduced on xen?
>
> Yes, please do that.

I am working on a patch to reform the xen steal clock formula by xen specific
interfaces in linux kernel. In that way, we will not need to change the
signature of existing jiffies relevant functions which expect only delta values
as input.

The new formula will not generate such large input value.

Dongli Zhang

>
> thanks,
>
> greg k-h
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>