Date: Mon, 7 Aug 2017 15:40:22 +0200
From: Peter Zijlstra
To: Joel Fernandes
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com, Vincent Guittot,
    Juri Lelli, Brendan Jackman, Dietmar Eggemann, Ingo Molnar
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate
Message-ID: <20170807134022.bvql3s5lrfgh3e5q@hirez.programming.kicks-ass.net>
In-Reply-To: <20170804154023.26874-1-joelaf@google.com>

On Fri, Aug 04, 2017 at 08:40:23AM -0700, Joel Fernandes wrote:
> The PELT signals (sa->load_avg and sa->util_avg) are not updated if
> the amount accumulated during a single update doesn't cross a period
> boundary.
>
> This is fine in cases where the amount accrued is much smaller than
> the size of a single PELT window (1ms); however, if the amount
> accrued is large, the relative error (calculated against what the
> actual signal would be had we updated the averages) can be quite
> high - as much as 3-6% in my testing.

The maximum accumulation we can have without crossing a boundary is
1023*1024 ns, at which point we get a divisor of LOAD_AVG_MAX - 1024 +
1023. So for util_sum we'd have an increase of 1023*1024/(47742-1) =
~22, which on the total signal for util (1024) is ~2.1%.

Where does the 3-6% come from?

> In order to fix this, this patch does the average update by also
> checking how much time has elapsed since the last update, and updates
> the averages if it has been long enough (as a threshold I chose
> 128us).

This of course does the divisions more often; any numbers on the
performance impact?
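
For concreteness, here is a minimal standalone sketch (plain C, not
kernel code) of the worst-case arithmetic above. LOAD_AVG_MAX = 47742
and the 1024us PELT period are the kernel's actual PELT constants;
the program itself is purely illustrative:

#include <stdio.h>

#define LOAD_AVG_MAX	47742	/* max value of a saturated PELT sum */
#define PELT_PERIOD_US	1024	/* one PELT period, ~1ms */

int main(void)
{
	/*
	 * Largest contribution that does not cross a period boundary:
	 * 1023us of running time, scaled by 1024, i.e. 1023*1024.
	 */
	double max_contrib = (PELT_PERIOD_US - 1) * 1024.0;

	/* Divisor from the reply: LOAD_AVG_MAX - 1024 + 1023 = 47741. */
	double divisor = LOAD_AVG_MAX - 1024 + (PELT_PERIOD_US - 1);

	double util_delta = max_contrib / divisor;	/* ~21.9 */
	double rel_err = 100.0 * util_delta / 1024.0;	/* ~2.1% */

	printf("stale util contribution: ~%.1f, i.e. ~%.1f%% of 1024\n",
	       util_delta, rel_err);
	return 0;
}

Running this prints a delta of ~21.9 on the 1024-scale util signal,
i.e. the ~2.1% bound quoted above, which is why the reported 3-6%
needs explaining.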