From: Joel Fernandes
Date: Thu, 10 Aug 2017 16:11:25 -0700
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate
To: Peter Zijlstra
Cc: LKML, kernel-team@android.com, Vincent Guittot, Juri Lelli, Brendan Jackman, Dietmar Eggeman, Ingo Molnar

Hi Peter,

On Mon, Aug 7, 2017 at 6:40 AM, Peter Zijlstra wrote:
> On Fri, Aug 04, 2017 at 08:40:23AM -0700, Joel Fernandes wrote:
>> The PELT signals (sa->load_avg and sa->util_avg) are not updated if
>> the amount accumulated during a single update doesn't cross a period
>> boundary.
>
>> This is fine in cases where the amount accrued is much smaller than
>> the size of a single PELT window (1ms); however, if the amount
>> accrued is high, then the relative error (calculated against what the
>> actual signal would be had we updated the averages) can be quite
>> high - as much as 3-6% in my testing.
>
> The maximum we can accumulate without crossing a boundary is
> 1023*1024 ns, at which point we get a divisor of
> LOAD_AVG_MAX - 1024 + 1023.
>
> So for util_sum we'd have an increase of 1023*1024/(47742-1) = ~22,
> which on the total signal for util (1024) is ~2.1%.
>
> Where does the 3-6% come from?
>
>> In order to fix this, this patch also checks how much time has
>> elapsed since the last update and updates the averages if it has been
>> long enough (as a threshold I chose 128us).
>
> This of course does the divisions more often; anything on performance
> impact?

I ran hackbench and I don't see any degradation in performance.

# while [ 1 ]; do hackbench 5 thread 500; done
Running with 5*40 (== 200) tasks.

without:
Time: 0.742
Time: 0.770
Time: 0.857
Time: 0.809
Time: 0.721
Time: 0.725
Time: 0.717
Time: 0.699

with:
Time: 0.787
Time: 0.816
Time: 0.744
Time: 0.832
Time: 0.798
Time: 0.785
Time: 0.714
Time: 0.721

If there's any other benchmark you'd like me to run, or anything you'd
like changed in this test, let me know. Thanks.

Regards,
Joel
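
For reference, a minimal standalone sketch (plain userspace C, not kernel
code) of the ~2.1% bound Peter computes above. The constants
LOAD_AVG_MAX = 47742, the 1024us PELT period, and the full-scale util
value of 1024 are taken from the thread itself, not verified against any
particular kernel tree.

/* pelt_error.c - rough estimate of the util_sum error when up to one
 * period's worth of time goes unaccounted between updates. */
#include <stdio.h>

int main(void)
{
	const double load_avg_max = 47742.0;	/* max PELT sum, per the thread */

	/* Largest amount we can accrue without crossing a period boundary,
	 * in ns (one full period is 1024 * 1024 ns). */
	const double max_unaccounted_ns = 1023.0 * 1024.0;

	/* Divisor used when folding the sum into the average, as quoted
	 * above: LOAD_AVG_MAX - 1024 + 1023. */
	const double divisor = load_avg_max - 1024.0 + 1023.0;

	/* Contribution that stays missing from util_sum, and its size
	 * relative to the full-scale util signal of 1024. */
	double missing = max_unaccounted_ns / divisor;

	printf("missing util: ~%.1f of 1024 (%.1f%%)\n",
	       missing, 100.0 * missing / 1024.0);
	return 0;
}

Compiled with gcc and run, this prints roughly 21.9 of 1024 (2.1%),
matching the "~22" and "~2.1%" figures in the quoted text.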