Date: Tue, 12 Apr 2016 03:41:41 +0800
From: Yuyang Du
To: Vincent Guittot
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Benjamin Segall, Paul Turner, Morten Rasmussen, Dietmar Eggemann, Juri Lelli
Subject: Re: [PATCH 2/4] sched/fair: Drop out incomplete current period when sched averages accrue
Message-ID: <20160411194141.GH8697@intel.com>
References: <1460327765-18024-1-git-send-email-yuyang.du@intel.com> <1460327765-18024-3-git-send-email-yuyang.du@intel.com>

Hi Vincent,

On Mon, Apr 11, 2016 at 11:08:04AM +0200, Vincent Guittot wrote:
> > @@ -2704,11 +2694,14 @@ static __always_inline int
> > __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> >                   unsigned long weight, int running, struct cfs_rq *cfs_rq)
> > {
> > -	u64 delta, scaled_delta, periods;
> > -	u32 contrib;
> > -	unsigned int delta_w, scaled_delta_w, decayed = 0;
> > +	u64 delta;
> > +	u32 contrib, periods;
> > 	unsigned long scale_freq, scale_cpu;
> >
> > +	/*
> > +	 * now rolls down to a period boundary
> > +	 */
> > +	now = now & (u64)(~0xFFFFF);
> > 	delta = now - sa->last_update_time;
> > 	/*
> > 	 * This should only happen when time goes backwards, which it
> > @@ -2720,89 +2713,56 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> > 	}
> >
> > 	/*
> > -	 * Use 1024ns as the unit of measurement since it's a reasonable
> > -	 * approximation of 1us and fast to compute.
> > +	 * Use 1024*1024ns as an approximation of 1ms period, pretty close.
> > 	 */
> > -	delta >>= 10;
> > -	if (!delta)
> > +	periods = delta >> 20;
> > +	if (!periods)
> > 		return 0;
> > 	sa->last_update_time = now;
>
> The optimization looks quite interesting but I see one potential issue
> with migration as we will lose the part of the ongoing period that is
> now not saved anymore. This lost part can be quite significant for a
> short task that ping pongs between CPUs.

Yes, basically we lose precision: updates happen at ~1ms granularity
instead of ~1us. But as I wrote, any single update may either lose or
gain a sub-1ms remainder. Statistically, these errors should even out,
since the load/util averages are built from a large number of samples,
and wanting many samples is the whole point of averaging.

Plus, as you also said, the incomplete current period already plays a
(somewhat) negative role here.

In addition, even leaving aside the flat hierarchical util patch, we
gain quite some efficiency.
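
To make the rounding concrete, below is a minimal userspace sketch of
the arithmetic the patch relies on. It is not the kernel code itself:
PERIOD_SHIFT, PERIOD_MASK, and the sample timestamps are made up for
illustration. It shows how now is rounded down to a 1024*1024 ns
(~1.048 ms) period boundary, how the period count is taken from
delta >> 20, and how large the dropped sub-period remainder can be.

#include <stdio.h>
#include <stdint.h>

/* Period length used by the patch: 1024*1024 ns ~= 1.048 ms. */
#define PERIOD_SHIFT 20
#define PERIOD_MASK  (~(((uint64_t)1 << PERIOD_SHIFT) - 1))

int main(void)
{
	/* Hypothetical timestamps in ns, chosen for illustration only. */
	uint64_t last_update_time = (uint64_t)3 << PERIOD_SHIFT; /* on a boundary */
	uint64_t now = ((uint64_t)5 << PERIOD_SHIFT) + 700000;   /* 700us into a period */

	/*
	 * Round now down to the current period boundary, as the patch does
	 * with now = now & (u64)(~0xFFFFF). The 700000 ns remainder is the
	 * sub-period part that this update drops; a later update that lands
	 * just past a boundary gains a comparable amount back.
	 */
	uint64_t rounded = now & PERIOD_MASK;
	uint64_t delta = rounded - last_update_time;
	uint32_t periods = (uint32_t)(delta >> PERIOD_SHIFT);

	printf("now=%llu rounded=%llu periods=%u dropped=%llu ns\n",
	       (unsigned long long)now, (unsigned long long)rounded,
	       periods, (unsigned long long)(now - rounded));
	return 0;
}

With these made-up numbers, the sketch reports 2 elapsed periods and a
dropped remainder of 700000 ns, which is the worst-case-per-update error
the discussion above argues should cancel out over many samples.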