Date: Fri, 15 Apr 2016 04:05:23 +0800
From: Yuyang Du
To: Dietmar Eggemann
Cc: Vincent Guittot, Peter Zijlstra, Ingo Molnar, linux-kernel, Benjamin Segall, Paul Turner, Morten Rasmussen, Juri Lelli
Subject: Re: [PATCH 2/4] sched/fair: Drop out incomplete current period when sched averages accrue
Message-ID: <20160414200523.GP8697@intel.com>
References: <1460327765-18024-1-git-send-email-yuyang.du@intel.com>
 <1460327765-18024-3-git-send-email-yuyang.du@intel.com>
 <570E61FE.4060000@arm.com>
 <20160413184412.GO8697@intel.com>
 <570F929B.1070805@arm.com>
In-Reply-To: <570F929B.1070805@arm.com>

Hi Dietmar,

On Thu, Apr 14, 2016 at 01:52:43PM +0100, Dietmar Eggemann wrote:
> On 13/04/16 19:44, Yuyang Du wrote:
> > On Wed, Apr 13, 2016 at 05:28:18PM +0200, Vincent Guittot wrote:
>
> [...]
>
> > By "bailing out", you mean return without update because the delta is less
> > than 1ms?
>
> yes.
>
> >
> >>> Examples of 1 periodic task pinned to a cpu on an ARM64 system, HZ=250
> >>> in steady state:
> >>>
> >>> (1) task runtime = 100us period = 200us
> >>>
> >>>   pelt load/util signal
> >>>
> >>>   1us: 488-491
> >>>
> >>>   1ms: 483-534
> >
> > 100us/200us = 50%, so the util should center around 512, it seems in this
> > regard, it is better, but the variance is undesirable.
>
> I see. You mentioned the over-decay thing in the patch header. Is this
> also why you change the contribution of the most recent period from 1002
> (1024*y) to 1024?

Yes, it is because that (most recent) period is the "current" period.

> This variance gets worse if the ratio runtime/period is further reduced
> (e.g. 25us/1000us).
>
> You can even create tasks which go stealth mode (e.g. 25us/1048us).

Hmm... this is a good case to beat it.

> It shows periods of 0 load/util (~1.55s) and than massive spikes (~700 for
> ~300ms). The short runtime and the task period synced to 1024*1024ns
> allow that we hit consecutive enqueues or dequeues for a long time even
> the task might drift relative to the pelt window.

But whenever we pass 1ms, we will update. And I am curious, how does the
current 1us granularity work in this case? Anyway, I will reproduce it myself.
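
For reference, the two numbers discussed above (1002 = 1024*y, and a
center of 512 for a 50% task) can be checked with a small sketch. This is
only an idealized model, not the kernel code: it assumes the usual PELT
constants (1024us periods, y^32 = 0.5) and decays continuously every
microsecond instead of at period boundaries. The names
recent_period_weight() and steady_state_util() are made up for this
illustration. It will not reproduce the measured 488-491 or 483-534
ranges; it only shows where the signal should center.

```python
# Idealized PELT-style model. Assumptions (not taken from the kernel source):
# continuous per-microsecond decay, floating point, task runs at the start
# of each of its periods.
PERIOD_US = 1024           # one PELT period is 1024 us
HALF_LIFE_PERIODS = 32     # y**32 == 0.5

y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)   # per-period decay factor
y_us = y ** (1.0 / PERIOD_US)          # per-microsecond decay (model assumption)


def recent_period_weight():
    """Contribution of one full period after it has been decayed once."""
    return int(PERIOD_US * y)


def steady_state_util(runtime_us, task_period_us, settle_us=20 * 32 * 1024):
    """Return (min, max) of the modelled util over one task period,
    measured after settle_us microseconds (about 20 half-lives)."""
    scale = 1024.0 * (1.0 - y_us)   # an always-running task converges to 1024
    util = 0.0
    lo, hi = float("inf"), 0.0
    for t in range(settle_us + task_period_us):
        running = 1.0 if (t % task_period_us) < runtime_us else 0.0
        util = util * y_us + scale * running
        if t >= settle_us:
            lo, hi = min(lo, util), max(hi, util)
    return lo, hi


if __name__ == "__main__":
    print("1024*y       =", recent_period_weight())
    print("100us/200us  =", steady_state_util(100, 200))
```

In this model the 100us/200us task stays within about one unit of 512, so
any wider spread in the real signal comes from how and when the update is
done, not from the geometric series itself.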