Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753872AbcCENTQ (ORCPT ); Sat, 5 Mar 2016 08:19:16 -0500 Received: from mail-wm0-f52.google.com ([74.125.82.52]:33469 "EHLO mail-wm0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751874AbcCENTF (ORCPT ); Sat, 5 Mar 2016 08:19:05 -0500 Date: Sat, 5 Mar 2016 14:19:00 +0100 From: Frederic Weisbecker To: Thomas Gleixner Cc: Chris Friesen , John Stultz , Daniel Lezcano , lkml , Peter Zijlstra , Ingo Molnar , Rik van Riel Subject: Re: [PATCH] steal_account_process_tick() should return jiffies Message-ID: <20160305131856.GA4441@lerouge> References: <56DA1339.5030601@mail.usask.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3461 Lines: 91 On Sat, Mar 05, 2016 at 11:27:01AM +0100, Thomas Gleixner wrote: > Chris, > > On Fri, 4 Mar 2016, Chris Friesen wrote: > > First of all the subject line should contain a subsystem prefix, > i.e. "sched/cputime:" > > > The callers of steal_account_process_tick() expect it to return whether > > the last jiffy was stolen or not. > > > > Currently the return value of steal_account_process_tick() is in units > > of cputime, which vary between either jiffies or nsecs depending on > > CONFIG_VIRT_CPU_ACCOUNTING_GEN. > > Sure, but what is the actual problem? The return value is boolean and tells > whether there was stolen time accounted or not. > > > The fix is to change steal_account_process_tick() to always return > > jiffies. If CONFIG_VIRT_CPU_ACCOUNTING_GEN is not enabled then this > > is a no-op. > > What does that fix? > > > As far as I can tell this bug has been present since commit dee08a72. > > Which bug? > > > Signed-off-by: Chris Friesen > > --- > > > > kernel/sched/cputime.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > > index b2ab2ff..e724496 100644 > > --- a/kernel/sched/cputime.c > > +++ b/kernel/sched/cputime.c > > @@ -276,7 +276,7 @@ static __always_inline bool steal_account_process_tick(void) > > this_rq()->prev_steal_time += cputime_to_nsecs(steal_ct); > > > > account_steal_time(steal_ct); > > - return steal_ct; > > + return cputime_to_jiffies(steal_ct); > > So if steal time is close to a jiffie, then cputime_to_jiffies will return 0 > and you account a full jiffie to user/system/whatever. > > Without a proper explanation of the problem and the resulting "bug" I really > cannot figure out why we want that change. Indeed the changelog should better explain the problem. So I think the issue is that if the cputime has nsecs granularity and we have a tiny stolen time to account (lets say a few nanosecs, in fact anything that is below a jiffy), we are not going to account the tick on user/system. But the fix doesn't look right to me because we are still accounting the steal time if it is lower than a jiffy and that steal time will never be substracted to user/system time if it never reach a jiffy. Instead the fix should accumulate the steal time and account it only once it's worth a jiffy and then substract it from system/user time accordingly. Something like that: diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index b2ab2ff..d38e25f 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -262,7 +262,7 @@ static __always_inline bool steal_account_process_tick(void) #ifdef CONFIG_PARAVIRT if (static_key_false(¶virt_steal_enabled)) { u64 steal; - cputime_t steal_ct; + unsigned long steal_jiffies; steal = paravirt_steal_clock(smp_processor_id()); steal -= this_rq()->prev_steal_time; @@ -272,11 +272,11 @@ static __always_inline bool steal_account_process_tick(void) * based on jiffies). Lets cast the result to cputime * granularity and account the rest on the next rounds. */ - steal_ct = nsecs_to_cputime(steal); - this_rq()->prev_steal_time += cputime_to_nsecs(steal_ct); + steal_jiffies = nsecs_to_jiffies(steal); + this_rq()->prev_steal_time += jiffies_to_nsecs(steal_jiffies); account_steal_time(steal_ct); - return steal_ct; + return steal_jiffies; } #endif return false;