Date: Tue, 25 Apr 2017 13:49:07 -0700
From: Tejun Heo
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 2/2] sched/fair: Always propagate runnable_load_avg
Message-ID: <20170425204907.GA20255@wtj.duckdns.org>
References: <20170424201344.GA14169@wtj.duckdns.org> <20170424201444.GC14169@wtj.duckdns.org> <20170425184941.GB15593@wtj.duckdns.org>
In-Reply-To: <20170425184941.GB15593@wtj.duckdns.org>

On Tue, Apr 25, 2017 at 11:49:41AM -0700, Tejun Heo wrote:
> Will try that too.  I can't see why HT would change it because I see
> single-CPU queues misevaluated.  Just in case, you need to tune the
> test params so that it doesn't load the machine too much and so that
> there are some non-CPU-intensive workloads going on to perturb things
> a bit.  Anyways, I'm gonna try disabling HT.

It's finickier, but after changing the duty cycle a bit, it reproduces
with HT off.  I think the trick is setting the number of threads to the
number of logical CPUs and tuning -s/-c so that p99 starts climbing.
The following is from the root cgroup.
# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 51
	75.0000th: 62
	90.0000th: 67
	95.0000th: 70
	*99.0000th: 1482
	99.5000th: 5048
	99.9000th: 9008
	min=0, max=10066

And the following is from a first-level cgroup with maximum CPU weight.

# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 51
	75.0000th: 62
	90.0000th: 71
	95.0000th: 84
	*99.0000th: 10064
	99.5000th: 10064
	99.9000th: 10064
	min=0, max=10089

It's interesting that p99 ends up aligned with the CPU burn duration.
It looks like some threads end up waiting for full durations.  The
following is with the patches applied, in the same cgroup setup.

# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 64
	75.0000th: 73
	90.0000th: 102
	95.0000th: 111
	*99.0000th: 1954
	99.5000th: 5432
	99.9000th: 9520
	min=0, max=10012

The numbers fluctuate quite a bit between runs, but the pattern is
still very clear - e.g. a 10ms p99 never shows up in the root cgroup
or on the patched kernel.

Thanks.

-- 
tejun
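P.S. A quick sanity check of the claim that p99 lands on the CPU burn
duration.  This assumes -c is the per-message CPU burn time in usec (the
numbers are copied from the unpatched cgroup run above):

```python
# schbench -c 10000 burns 10000 usec of CPU per message.  If a waking
# thread has to sit out another thread's entire burn before it runs,
# its wake-up latency is roughly one full -c interval.
burn_usec = 10000   # -c argument used in all three runs
p99_usec = 10064    # observed p99 in the first-level cgroup run

slack = abs(p99_usec - burn_usec) / burn_usec
print(f"p99 is within {slack:.2%} of one full burn")
```

The observed p99 sits well within 1% of one full burn, which is
consistent with the "waiting for full durations" reading.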
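P.P.S. In case it helps with reproduction, here's a sketch of how I'd
derive the thread count from the logical CPU count.  The -m 2 split and
the other flags are the ones used above; the script itself is just an
illustration, not part of schbench:

```shell
# schbench spawns (-m messengers) x (-t threads) workers, so with
# -m 2 the per-messenger thread count should be nproc/2 to match
# the worker count to the number of logical CPUs.
NCPUS=$(nproc)
THREADS=$((NCPUS / 2))
echo "# ~/schbench -m 2 -t $THREADS -s 15000 -c 10000 -r 30"
```

On a 16-logical-CPU box this prints the -t 8 invocation used in the
runs above; from there, -s/-c get tuned by hand until p99 starts
climbing.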