Date: Tue, 25 Apr 2017 13:49:07 -0700
From: Tejun Heo
To: Vincent Guittot
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Linus Torvalds,
	Mike Galbraith, Paul Turner, Chris Mason, kernel-team@fb.com
Subject: Re: [PATCH 2/2] sched/fair: Always propagate runnable_load_avg
Message-ID: <20170425204907.GA20255@wtj.duckdns.org>
References: <20170424201344.GA14169@wtj.duckdns.org> <20170424201444.GC14169@wtj.duckdns.org> <20170425184941.GB15593@wtj.duckdns.org>
In-Reply-To: <20170425184941.GB15593@wtj.duckdns.org>

On Tue, Apr 25, 2017 at 11:49:41AM -0700, Tejun Heo wrote:
> Will try that too.  I can't see why HT would change it because I see
> single-CPU queues misevaluated.  Just in case, you need to tune the
> test params so that it doesn't load the machine too much and so that
> there are some non-CPU-intensive workloads going on to perturb things
> a bit.  Anyways, I'm gonna try disabling HT.

It's finickier, but after changing the duty cycle a bit, it reproduces
with HT off.  I think the trick is setting the number of threads to the
number of logical CPUs and tuning -s/-c so that p99 starts climbing.
The following is from the root cgroup.
# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 51
	75.0000th: 62
	90.0000th: 67
	95.0000th: 70
	*99.0000th: 1482
	99.5000th: 5048
	99.9000th: 9008
	min=0, max=10066

And the following is from a first-level cgroup with maximum CPU weight.

# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 51
	75.0000th: 62
	90.0000th: 71
	95.0000th: 84
	*99.0000th: 10064
	99.5000th: 10064
	99.9000th: 10064
	min=0, max=10089

It's interesting that p99 ends up aligned with the CPU burn duration.
It looks like some threads end up waiting for full durations.  The
following is with the patches applied, in the same cgroup setup.

# ~/schbench -m 2 -t 8 -s 15000 -c 10000 -r 30
Latency percentiles (usec)
	50.0000th: 64
	75.0000th: 73
	90.0000th: 102
	95.0000th: 111
	*99.0000th: 1954
	99.5000th: 5432
	99.9000th: 9520
	min=0, max=10012

The numbers fluctuate quite a bit between runs, but the pattern is
still very clear - e.g. a 10ms p99 never shows up in the root cgroup
or on the patched kernel.

Thanks.

-- 
tejun
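P.S. A quick sanity check of the claim that p99 lands on the CPU burn
duration.  This assumes -c is the per-message CPU burn time in usec (the
numbers are copied from the unpatched cgroup run above):

```python
# schbench -c 10000 burns 10000 usec of CPU per message.  If a waking
# thread has to sit out another thread's entire burn before it runs,
# its wake-up latency is roughly one full -c interval.
burn_usec = 10000   # -c argument used in all three runs
p99_usec = 10064    # observed p99 in the first-level cgroup run

slack = abs(p99_usec - burn_usec) / burn_usec
print(f"p99 is within {slack:.2%} of one full burn")
```

The observed p99 sits well within 1% of one full burn, which is
consistent with the "waiting for full durations" reading.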
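P.P.S. In case it helps with reproduction, here's a sketch of how I'd
derive the thread count from the logical CPU count.  The -m 2 split and
the other flags are the ones used above; the script itself is just an
illustration, not part of schbench:

```shell
# schbench spawns (-m messengers) x (-t threads) workers, so with
# -m 2 the per-messenger thread count should be nproc/2 to match
# the worker count to the number of logical CPUs.
NCPUS=$(nproc)
THREADS=$((NCPUS / 2))
echo "# ~/schbench -m 2 -t $THREADS -s 15000 -c 10000 -r 30"
```

On a 16-logical-CPU box this prints the -t 8 invocation used in the
runs above; from there, -s/-c get tuned by hand until p99 starts
climbing.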