From: Joel Fernandes
Date: Wed, 9 Aug 2017 10:51:26 -0700
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate
To: Vincent Guittot
Cc: linux-kernel, kernel-team@android.com, Peter Zijlstra, Juri Lelli,
    Brendan Jackman, Dietmar Eggeman, Ingo Molnar
References: <20170804154023.26874-1-joelaf@google.com>

Hi Vincent,

On Wed, Aug 9, 2017 at 3:23 AM, Vincent Guittot wrote:
>>
>> Yes, this is true; however, since I'm using the 'delta' instead of
>> period_contrib, it only does the update every 128us, and if several
>> updates fall within a 128us boundary then those will be rate limited.
>> So say we have a flood of updates: the updates would have to be
>> spaced every 128us to reach the maximum number of divisions. I don't
>> know whether that is a likely situation or would happen very often. I
>> am planning to run some benchmarks and check that there is no
>> regression, as Peter also mentioned regarding the performance aspect.
>>
>>>> In order to compare the signals with/without the patch I created a synthetic
>>>> test (20ms runtime, 100ms period) and analyzed the signals and created a report
>>>> on the analysis data/plots both with and without the fix:
>>>> http://www.linuxinternals.org/misc/pelt-error.pdf
>>>
>>> The glitch described on page 2 shows a decrease of the util_avg which
>>> is not linked to the accuracy of the calculation but is due to the use
>>> of the wrong range when computing util_avg.
>>
>> Yes, and I corrected the graphs this time to show what it looks like
>> after your patch, and to confirm that there is STILL a glitch. You are
>> right that there isn't a reduction after your patch; however, in my
>> updated graphs there is a glitch, and it's not a downward peak but a
>> stall in the update. The error is still quite high and can reach the
>> absolute 2% maximum; in my updated graphs I show an example where it's
>> ~1.8% (18 / 1024).
>>
>> Could you please take a look at my updated document? I have included
>> new graphs and traces there and color coded them so it's easy to
>> correlate the trace lines to the error in the graph. Here's the
>> updated link:
>> https://github.com/joelagnel/joelagnel.github.io/blob/master/misc/pelt-error-rev2.pdf
>
> I see strange behavior in your rev2 document:
> At timestamp 9.235635, we have util_avg=199 acc_util_avg=199
> util_err=0. Everything looks fine, but I don't see this point on the
> graph.
> Then, at 9.235636 (which is the red colored faulty trace), we have
> util_avg=182 acc_util_avg=200 util_err=18.
> Firstly, this means that util_avg has been updated (199 -> 182), so the
> error is not a problem of util_avg not being updated often enough :-)
> Then, util_avg decreases (199 -> 182) whereas it must increase because
> the task is still running. This should not happen, and it is exactly
> what commit 625ed2bf049d should fix. So either the patch is not
> applied or it doesn't completely fix the problem.

I think you are looking at the wrong trace lines. The graph is
generated for rq util only (cfs_rq == 1), so the lines in the traces
you should look at are the ones with cfs_rq == 1; only those lines
were used to generate the graphs. In these you will see the rq
util_avg change as follows: 165 -> 182 -> 182 (a missed update, which
causes the error). This is also reflected in the graph, where you see
the flat green line.
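To put a number on how large such a stalled update can get, here is a
quick userspace model of the effect (simplified floating-point math,
NOT kernel code; the only PELT specifics are the y^32 = 1/2 decay and
a geometric max_sum close to the kernel's LOAD_AVG_MAX of 47742). It
compares a util_avg frozen at the last 1024us period boundary against
one refreshed 1023us into the current period, for a task that starts
running from zero:

/*
 * Userspace sketch, NOT kernel code: size of the error caused by a
 * util_avg that stalls at the last 1024us boundary while the task
 * keeps running. Build with: gcc pelt-err.c -o pelt-err -lm
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
	const double y = pow(0.5, 1.0 / 32.0);		/* PELT decay: y^32 = 1/2 */
	const double max_sum = 1024.0 / (1.0 - y);	/* ~47788 */
	double sum = 0.0;	/* decayed contribution of past periods */
	int r;

	for (r = 0; r <= 64; r++) {
		/* what a stalled update keeps reporting */
		double frozen = 1024.0 * sum / (max_sum - 1024.0);
		/* what a refresh 1023us into the period would report */
		double fresh = 1024.0 * (sum + 1023.0) /
			       (max_sum - 1024.0 + 1023.0);

		if (r % 16 == 0)
			printf("period %2d: frozen=%7.2f fresh=%7.2f err=%5.2f\n",
			       r, frozen, fresh, fresh - frozen);
		sum = (sum + 1024.0) * y;	/* cross a period boundary */
	}
	return 0;
}

The gap starts out at the ~22/1024 theoretical maximum mentioned
further down in this thread and decays to zero as the signal
saturates; around util_avg ~199 this model gives an error of roughly
18, which is in line with the rev2 traces.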
> It would be interesting to also display the last_update_time of the
> sched_avg.
>
>>
>>> commit 625ed2bf049d "sched/cfs: Make util/load_avg more stable" fixes
>>> this glitch.
>>> And the lower peak value in page 3 is probably linked to the inaccuracy.
>>
>> This is not true. The reduction in peak in my tests, which happens even
>> after your patch, is because of the dequeue that happens just before
>> the period boundary is hit. Could you please take a look at the
>> updated document in the link above? There, in the second example, I
>> show a trace that corresponds to the reduction in peak during the
>> dequeue, which is caused by the delay in the update. These errors go
>> away with my patch.
>
> There is the same strange behavior there:
> When the reduction in peak happens, the util_avg is updated, whereas
> your concern is that util_avg is not updated often enough.
> At timestamp 10.656683, we have util_avg=389 acc_util_avg=389 util_err=0.
> At timestamp 10.657420, we have util_avg=396 acc_util_avg=396
> util_err=0. I don't see this point on the graph.
> At timestamp 10.657422, we have util_avg=389 acc_util_avg=399
> util_err=10. This is the colored faulty trace, but util_avg has been
> updated from 369 to 389.

Yeah, same thing here: you should look at the lines with cfs_rq == 1.
The util changes as: 363 -> 376 -> 389 -> 389 (missed update).
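(As an aside, for anyone wanting to reproduce the plots: the synthetic
test referred to earlier is essentially just a periodic 20ms-busy /
80ms-idle loop. The following is an illustrative reconstruction, not
the exact harness that was used:)

/*
 * Minimal stand-in for the synthetic workload (20ms runtime, 100ms
 * period) used for the analysis; a reconstruction for illustration
 * only. Build with: gcc periodic.c -o periodic
 */
#include <stdint.h>
#include <time.h>

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

int main(void)
{
	const uint64_t run_ns = 20ULL * 1000 * 1000;		/* 20ms busy */
	const struct timespec idle = { 0, 80 * 1000 * 1000 };	/* 80ms idle */

	for (;;) {
		uint64_t start = now_ns();

		while (now_ns() - start < run_ns)
			;	/* burn CPU for the 20ms runtime */
		nanosleep(&idle, NULL);	/* sleep out the rest of the period */
	}
	return 0;
}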
thanks,

-Joel

> Regards,
> Vincent
>
>>
>>> I agree that there is an inaccuracy (the max absolute value of 22), but
>>> that's in favor of less overhead. Have you seen wrong behavior because
>>> of this inaccuracy?
>>
>> I haven't tried to tie this to a specific wrong behavior; however,
>> since other patches have been posted to fix inaccuracy, and I do see
>> that we reach the theoretical maximum error on quite a few occasions,
>> I think the fix is justifiable. Also, the overhead is minimal if
>> updates aren't happening several times in a window at 128us intervals,
>> and on the few occasions the update does happen, the division is
>> performed only at those times. So in the cases where it does fix the
>> error, it does so with minimal overhead. I do agree with the overhead
>> concern and I'm planning to run more tests with hackbench to confirm
>> the overhead is minimal. I'll post some updates about it soon.
>>
>> Thanks!
>>
>> -Joel
>>
>>>
>>>>
>>>> With the patch, the error in the signal is significantly reduced, and
>>>> is non-existent beyond a small negligible amount.
>>>>
>>>> Cc: Vincent Guittot
>>>> Cc: Peter Zijlstra
>>>> Cc: Juri Lelli
>>>> Cc: Brendan Jackman
>>>> Cc: Dietmar Eggeman
>>>> Signed-off-by: Joel Fernandes
>>>> ---
>>>>  kernel/sched/fair.c | 8 ++++++--
>>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index 4f1825d60937..1347643737f3 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -2882,6 +2882,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>>>>  		  unsigned long weight, int running, struct cfs_rq *cfs_rq)
>>>>  {
>>>>  	u64 delta;
>>>> +	int periods;
>>>>
>>>>  	delta = now - sa->last_update_time;
>>>>  	/*
>>>> @@ -2908,9 +2909,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>>>>  	 * accrues by two steps:
>>>>  	 *
>>>>  	 * Step 1: accumulate *_sum since last_update_time. If we haven't
>>>> -	 * crossed period boundaries, finish.
>>>> +	 * crossed period boundaries and the time since last update is small
>>>> +	 * enough, we're done.
>>>>  	 */
>>>> -	if (!accumulate_sum(delta, cpu, sa, weight, running, cfs_rq))
>>>> +	periods = accumulate_sum(delta, cpu, sa, weight, running, cfs_rq);
>>>> +
>>>> +	if (!periods && delta < 128)
>>>>  		return 0;
>>>>
>>>>  	/*
>>>> --
>>>> 2.14.0.rc1.383.gd1ce394fe2-goog
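PS: to make the intended behavior change concrete, the early-return
decision in ___update_load_avg() before and after the patch reduces to
the following (illustrative userspace model only; accumulate_sum() is
collapsed to the number of 1024us periods it reports as crossed):

/*
 * Illustrative model of the patched early-return logic, NOT kernel
 * code: does a given update go on to refresh util_avg?
 */
#include <stdio.h>
#include <stdbool.h>

static bool refreshed_before(int periods)
{
	return periods != 0;	/* only on period-boundary crossings */
}

static bool refreshed_after(int periods, unsigned int delta_us)
{
	return periods != 0 || delta_us >= 128;	/* also every >= 128us */
}

int main(void)
{
	/* a dequeue 900us into a period: no boundary crossed */
	printf("periods=0 delta=900us: before=%d after=%d\n",
	       refreshed_before(0), refreshed_after(0, 900));
	/* back-to-back updates 50us apart: both throttled, no refresh */
	printf("periods=0 delta=50us:  before=%d after=%d\n",
	       refreshed_before(0), refreshed_after(0, 50));
	return 0;
}

So a dequeue just before a period boundary, which previously left
util_avg stale, now triggers a refresh, while floods of sub-128us
updates stay throttled and never pay for the extra division.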