Date: Mon, 7 Aug 2017 15:40:22 +0200
From: Peter Zijlstra
To: Joel Fernandes
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com, Vincent Guittot,
    Juri Lelli, Brendan Jackman, Dietmar Eggemann, Ingo Molnar
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate
Message-ID: <20170807134022.bvql3s5lrfgh3e5q@hirez.programming.kicks-ass.net>
In-Reply-To: <20170804154023.26874-1-joelaf@google.com>

On Fri, Aug 04, 2017 at 08:40:23AM -0700, Joel Fernandes wrote:
> The PELT signals (sa->load_avg and sa->util_avg) are not updated if
> the amount accumulated during a single update doesn't cross a period
> boundary.
>
> This is fine in cases where the amount accrued is much smaller than
> the size of a single PELT window (1ms); however, if the amount
> accrued is large, the relative error (calculated against what the
> actual signal would be had we updated the averages) can be quite
> high - as much as 3-6% in my testing.

The maximum accumulation we can have without crossing a boundary is
1023*1024 ns, at which point we get a divisor of LOAD_AVG_MAX - 1024 +
1023. So for util_sum we'd have an increase of 1023*1024/(47742-1) =
~22, which on the total signal for util (1024) is ~2.1%.

Where does the 3-6% come from?

> In order to fix this, this patch does the average update by also
> checking how much time has elapsed since the last update, and updates
> the averages if it has been long enough (as a threshold I chose
> 128us).

This of course does the divisions more often; any numbers on the
performance impact?
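
For concreteness, here is a minimal standalone sketch (plain C, not
kernel code) of the worst-case arithmetic above. LOAD_AVG_MAX = 47742
and the 1024us PELT period are the kernel's actual PELT constants;
the program itself is purely illustrative:

#include <stdio.h>

#define LOAD_AVG_MAX	47742	/* max value of a saturated PELT sum */
#define PELT_PERIOD_US	1024	/* one PELT period, ~1ms */

int main(void)
{
	/*
	 * Largest contribution that does not cross a period boundary:
	 * 1023us of running time, scaled by 1024, i.e. 1023*1024.
	 */
	double max_contrib = (PELT_PERIOD_US - 1) * 1024.0;

	/* Divisor from the reply: LOAD_AVG_MAX - 1024 + 1023 = 47741. */
	double divisor = LOAD_AVG_MAX - 1024 + (PELT_PERIOD_US - 1);

	double util_delta = max_contrib / divisor;	/* ~21.9 */
	double rel_err = 100.0 * util_delta / 1024.0;	/* ~2.1% */

	printf("stale util contribution: ~%.1f, i.e. ~%.1f%% of 1024\n",
	       util_delta, rel_err);
	return 0;
}

Running this prints a delta of ~21.9 on the 1024-scale util signal,
i.e. the ~2.1% bound quoted above, which is why the reported 3-6%
needs explaining.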