Date: Thu, 14 Apr 2016 02:20:25 +0800
From: Yuyang Du
To: Vincent Guittot
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Benjamin Segall, Paul Turner, Morten Rasmussen, Dietmar Eggemann, Juri Lelli
Subject: Re: [PATCH 4/4] sched/fair: Implement flat hierarchical structure for util_avg
Message-ID: <20160413182025.GN8697@intel.com>

Hi Vincent,

On Wed, Apr 13, 2016 at 01:27:21PM +0200, Vincent Guittot wrote:
> >> Why not using the sched_avg of the rq->cfs in order to track the
> >> utilization of the root cfs_rq instead of adding a new sched_avg into
> >> the rq ? Then you call update_cfs_rq_load_avg(rq->cfs) when you want
> >> to update/sync the utilization of the rq->cfs and for one call you
> >> will update both the load_avg and the util_avg of the root cfs instead
> >> of duplicating the sequence in _update_load_avg
> >
> > This is the approach taken by Dietmar in his patch, a fairly easy approach.
> > The problem is though, if so, we update the root cfs_rq only when it is
> > the root cfs_rq to update. A simple contrived case will make it never
> > updated except in update_blocked_averages(). My impression is that this
> > might be too much precision lost.
> >
> > And thus we take this alternative approach, and thus I revisited
> > __update_load_avg() to optimize it.
> >
> > [snip]
> >
> >> > -  if (atomic_long_read(&cfs_rq->removed_util_avg)) {
> >> > -          long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
> >> > -          sa->util_avg = max_t(long, sa->util_avg - r, 0);
> >> > -          sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
> >> > +  if (atomic_long_read(&rq->removed_util_avg)) {
> >> > +          long r = atomic_long_xchg(&rq->removed_util_avg, 0);
> >> > +          rq->avg.util_avg = max_t(long, rq->avg.util_avg - r, 0);
> >> > +          rq->avg.util_sum = max_t(s32, rq->avg.util_sum - r * LOAD_AVG_MAX, 0);
> >>
> >> I see one potential issue here because the rq->util_avg may (surely)
> >> have been already updated and decayed during the update of a
> >> sched_entity but before we subtract the removed_util_avg
> >
> > This is the same now, because cfs_rq will be regularly updated in
> > update_blocked_averages(), which basically means cfs_rq will be newer
> > than task for sure, although task tries to catch up when removed.
>
> I don't agree on that part. At now, we check and subtract
> removed_util_avg before calling __update_load_avg for a cfs_rq, so it
> will be removed before changing last_update_time.

Despite the cross CPU issue, you are right.

> With your patch, we update rq->avg.util_avg and last_update_time
> without checking removed_util_avg.

But, yes, we do.
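
To make the ordering point concrete, here is a toy sketch in plain C. It is
not kernel code and not taken from the patch: every name in it is
hypothetical, and the decay is simplified to one halving per elapsed
half-life, whereas real PELT decays geometrically per period. It only shows
why subtracting a removed contribution before the decay and subtracting the
same (now stale) value after the decay give different results.

```c
#include <assert.h>

/* Toy model only, not kernel code. All names below are hypothetical. */

/* Clamp at zero, in the spirit of max_t(long, avg - r, 0) in the hunk above. */
static long toy_sub_clamped(long avg, long removed)
{
	long diff = avg - removed;

	return diff > 0 ? diff : 0;
}

/* Simplified decay: halve the average once per elapsed half-life. */
static long toy_decay(long avg, unsigned int half_lives)
{
	assert(avg >= 0 && half_lives < 32);
	return avg >> half_lives;
}

/* Remove the departed contribution first, then decay what is left. */
static long toy_subtract_then_decay(long avg, long removed,
				    unsigned int half_lives)
{
	return toy_decay(toy_sub_clamped(avg, removed), half_lives);
}

/*
 * Decay first, then subtract the removed value as it was recorded
 * before the decay. The stale value is too large relative to the
 * decayed average, so more is removed than the entity still contributes.
 */
static long toy_decay_then_subtract(long avg, long removed,
				    unsigned int half_lives)
{
	return toy_sub_clamped(toy_decay(avg, half_lives), removed);
}
```

With an average of 800, a removed contribution of 300 and one half-life
elapsed, toy_subtract_then_decay() gives 250 while
toy_decay_then_subtract() gives 100. The clamp only hides the worst case
(a negative result); it does not correct the over-subtraction.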