From: Vincent Guittot
Date: Wed, 26 Apr 2017 12:21:52 +0200
Subject: Re: [PATCH 2/2] sched/fair: Always propagate runnable_load_avg
To: Tejun Heo
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds,
    Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com

On 25 April 2017 at 23:08, Tejun Heo wrote:
> On Tue, Apr 25, 2017 at 11:49:41AM -0700, Tejun Heo wrote:
>> > I have run a quick test with your patches and schbench on my platform.
>> > I haven't been able to reproduce your regression but my platform is
>> > quite different from yours (only 8 cores without SMT).
>> > But most importantly, the parent cfs_rq->runnable_load_avg never
>> > reaches 0 (or almost 0) when it is idle. Instead, it still has a
>> > runnable_load_avg (this is not due to rounding in the computation)
>> > whereas runnable_load_avg should be 0.
>>
>> Heh, let me try that out. Probably a silly mistake somewhere.
>
> This is from the follow-up patch. I was confused. Because we don't
> propagate decays, we still should decay the runnable_load_avg;
> otherwise, we end up accumulating errors in the counter. I'll drop
> the last patch.

OK, the runnable_load_avg goes back to 0 when I drop patch 3. But I
still see runnable_load_avg sometimes significantly higher than
load_avg, which should normally not be possible, since load_avg =
runnable_load_avg + the sleeping tasks' load_avg.

Then, I see the opposite behavior on my platform: an increase of the
latency at p99 with your patches. My platform is a hikey (2x4 ARM
cores) and I have used "schbench -m 2 -t 4 -s 10000 -c 15000 -r 30",
so I have 1 worker thread per CPU, which is similar to what you are
doing on your platform.

With v4.11-rc8, I have run the test 10 times and get consistent results:

schbench -m 2 -t 4 -s 10000 -c 15000 -r 30
Latency percentiles (usec)
        50.0000th: 255
        75.0000th: 350
        90.0000th: 454
        95.0000th: 489
        *99.0000th: 539
        99.5000th: 585
        99.9000th: 10224
        min=0, max=13567

With your patches, I see an increase of the latency at p99. I have
also run the test 10 times, and half of the runs show a latency
increase like below:

schbench$ ./schbench -m 2 -t 4 -s 10000 -c 15000 -r 30
Latency percentiles (usec)
        50.0000th: 216
        75.0000th: 295
        90.0000th: 395
        95.0000th: 444
        *99.0000th: 2034
        99.5000th: 5960
        99.9000th: 12240
        min=0, max=14744

>
> Thanks.
>
> --
> tejun
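
To make the load_avg vs runnable_load_avg relation above concrete, here is
a minimal sketch of the sanity check I have in mind, written against the
v4.11-era cfs_rq fields and meant to sit in kernel/sched/fair.c. The helper
name check_cfs_rq_load_avg_invariant() is made up for illustration (it does
not exist in the tree); SCHED_WARN_ON(), cfs_rq->avg.load_avg and
cfs_rq->runnable_load_avg are the existing names:

/*
 * Sketch only: load_avg tracks runnable + blocked (sleeping) entities,
 * while runnable_load_avg only tracks the runnable ones, so apart from
 * decay rounding we always expect
 *
 *      cfs_rq->avg.load_avg >= cfs_rq->runnable_load_avg
 *
 * The helper name is hypothetical, used here just to show the check.
 */
static inline void check_cfs_rq_load_avg_invariant(struct cfs_rq *cfs_rq)
{
        SCHED_WARN_ON(cfs_rq->runnable_load_avg > cfs_rq->avg.load_avg);
}

Calling something like this from update_cfs_rq_load_avg() (it only fires
with CONFIG_SCHED_DEBUG) would be enough to catch the overshoot described
above when it happens.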