Date: Thu, 30 Mar 2017 14:40:28 +0200
From: Frederic Weisbecker
To: Wanpeng Li
Cc: Rik van Riel, Luiz Capitulino, linux-kernel@vger.kernel.org
Subject: Re: [BUG nohz]: wrong user and system time accounting
Message-ID: <20170330124026.GA3626@lerouge>
References: <20170323165512.60945ac6@redhat.com>
 <1490636129.8850.76.camel@redhat.com>
 <20170328132406.7d23579c@redhat.com>
 <20170329131656.1d6cb743@redhat.com>
 <1490818125.28917.11.camel@redhat.com>

On Thu, Mar 30, 2017 at 09:58:44AM +0800, Wanpeng Li wrote:
> 2017-03-30 4:08 GMT+08:00 Rik van Riel:
> >
> > In other words, the tick on cpu0 is aligned
> > with the tick on the nohz_full cpus, and
> > jiffies is advanced while the nohz_full cpus
> > with an active tick happen to be in kernel
> > mode?
> >
> > Frederic, can you think of any reason why
> > the tick on nohz_full CPUs would end up aligned
> > with the tick on cpu0, instead of running at some
> > random offset?
> >
> > A random offset, or better yet a somewhat randomized
> > tick length to make sure that simultaneous ticks are
> > fairly rare and the vtime sampling does not end up
> > "in phase" with the jiffies incrementing, could make
> > the accounting work right again.
> >
> > Of course, that assumes the above hypothesis is correct :)
>
> There is such a feature skew_tick currently, refer to commit
> 5307c9556bc (tick: add tick skew boot option), w/ skew_tick=1 boot
> parameter, the bug disappear, however, the commit also mentioned that
> it will hurt power consumption.

Oh, I completely missed that!

> I will try Frederic's proposal which
> is similar to my original idea "how bad would it be to revert to
> sched_clock() instead of jiffies in vtime_delta()? We could use
> nanosecond granularity to check deltas but only perform an actual
> cputime update when that delta >= TICK_NSEC."

Thanks! I hope sched_clock() won't introduce too much overhead. Otherwise we
may want to pick up the skew_tick solution.
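
For illustration, the quoted idea (check deltas at nanosecond granularity,
but only update cputime once the delta reaches TICK_NSEC) might look roughly
like the user-space sketch below. This is not the kernel's vtime code:
vtime_sketch, vtime_sketch_update() and SKETCH_TICK_NSEC are hypothetical
names, the tick length assumes HZ=1000, and the caller passes in a
nanosecond timestamp that stands in for sched_clock().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's TICK_NSEC; assumes HZ=1000. */
#define SKETCH_TICK_NSEC 1000000ULL

/* Hypothetical per-task state; not the kernel's actual vtime structure. */
struct vtime_sketch {
	uint64_t last_ns;      /* timestamp of the last cputime update */
	uint64_t accounted_ns; /* total time accounted so far */
};

/*
 * Compare timestamps at nanosecond granularity, but only account the
 * elapsed time once at least one tick's worth has accumulated.
 * Returns the number of nanoseconds accounted, or 0 if the delta is
 * still below the threshold (in which case it is left pending).
 */
static uint64_t vtime_sketch_update(struct vtime_sketch *v, uint64_t now_ns)
{
	uint64_t delta;

	/* The clock source is assumed to be monotonic. */
	assert(now_ns >= v->last_ns);

	delta = now_ns - v->last_ns;
	if (delta < SKETCH_TICK_NSEC)
		return 0;

	v->last_ns = now_ns;
	v->accounted_ns += delta;
	return delta;
}
```

Because the pending delta is carried over rather than dropped, short
intervals are not lost: they are accounted together once their sum crosses
one tick. Whether reading the clock on every user/kernel transition is cheap
enough is exactly the overhead concern raised above.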