Message-ID: <1436207790.2940.30.camel@gmail.com>
Subject: Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
From: Mike Galbraith
To: Josef Bacik
Cc: Peter Zijlstra, riel@redhat.com, mingo@redhat.com, linux-kernel@vger.kernel.org, morten.rasmussen@arm.com, kernel-team
Date: Mon, 06 Jul 2015 20:36:30 +0200
In-Reply-To: <559A91F4.7000903@fb.com>

On Mon, 2015-07-06 at 10:34 -0400, Josef Bacik wrote:
> On 07/06/2015 01:13 AM, Mike Galbraith wrote:
> > Hm.  Piddling with pgbench, which doesn't seem to collapse into a
> > quivering heap when load exceeds cores these days, deltas weren't all
> > that impressive, but it does appreciate the extra effort a bit, and a
> > bit more when clients receive it as well.
> >
> > If you test, and have time to piddle, you could try letting wake_wide()
> > return 1 + sched_feat(WAKE_WIDE_IDLE) instead of adding only if wakee is
> > the dispatcher.
> >
> > Numbers from my little desktop box.
> >
> > NO_WAKE_WIDE_IDLE
> > postgres@homer:~> pgbench.sh
> > clients 8       tps = 116697.697662
> > clients 12      tps = 115160.230523
> > clients 16      tps = 115569.804548
> > clients 20      tps = 117879.230514
> > clients 24      tps = 118281.753040
> > clients 28      tps = 116974.796627
> > clients 32      tps = 119082.163998  avg   117092.239    1.000
> >
> > WAKE_WIDE_IDLE
> > postgres@homer:~> pgbench.sh
> > clients 8       tps = 124351.735754
> > clients 12      tps = 124419.673135
> > clients 16      tps = 125050.716498
> > clients 20      tps = 124813.042352
> > clients 24      tps = 126047.442307
> > clients 28      tps = 125373.719401
> > clients 32      tps = 126711.243383  avg   125252.510    1.069   1.000
> >
> > WAKE_WIDE_IDLE (clients as well as server)
> > postgres@homer:~> pgbench.sh
> > clients 8       tps = 130539.795246
> > clients 12      tps = 128984.648554
> > clients 16      tps = 130564.386447
> > clients 20      tps = 129149.693118
> > clients 24      tps = 130211.119780
> > clients 28      tps = 130325.355433
> > clients 32      tps = 129585.656963  avg   129908.665    1.109   1.037

I had a typo in my script, so those desktop box numbers were all doing
the same number of clients.  It doesn't invalidate anything, but the
individual deltas are just run to run variance.. not to mention that
single cache box is not all that interesting for this anyway.  That
happens when interconnect becomes a player.

> I have time for twiddling, we're carrying ye olde WAKE_IDLE until we get
> this solved upstream and then I'll rip out the old and put in the new,
> I'm happy to screw around until we're all happy.  I'll throw this in a
> kernel this morning and run stuff today.  Barring any issues with the
> testing infrastructure I should have results today.  Thanks,

I'll be interested in your results.
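
For anyone wanting to re-check the summary columns in the quoted pgbench
numbers, here is a minimal Python sketch. It assumes the "avg" column is the
plain arithmetic mean of the seven runs, and that the trailing columns are
that mean relative to the NO_WAKE_WIDE_IDLE average and to the WAKE_WIDE_IDLE
average respectively. The names RUNS, AVG and relative_to are made up for
this illustration and are not part of pgbench.sh. Given the script typo noted
above, only the averages are compared, not the per-client lines.

```python
from statistics import mean

# tps values copied from the quoted runs, in the order they were posted.
RUNS = {
    "NO_WAKE_WIDE_IDLE": [
        116697.697662, 115160.230523, 115569.804548, 117879.230514,
        118281.753040, 116974.796627, 119082.163998,
    ],
    "WAKE_WIDE_IDLE": [
        124351.735754, 124419.673135, 125050.716498, 124813.042352,
        126047.442307, 125373.719401, 126711.243383,
    ],
    "WAKE_WIDE_IDLE (clients as well as server)": [
        130539.795246, 128984.648554, 130564.386447, 129149.693118,
        130211.119780, 130325.355433, 129585.656963,
    ],
}

# Assumption: "avg" in the mail is the arithmetic mean of each block.
AVG = {label: mean(tps) for label, tps in RUNS.items()}


def relative_to(reference):
    """Return each configuration's average divided by the reference average."""
    return {label: avg / AVG[reference] for label, avg in AVG.items()}


VS_BASE = relative_to("NO_WAKE_WIDE_IDLE")
VS_WAKE_WIDE_IDLE = relative_to("WAKE_WIDE_IDLE")

for label, avg in AVG.items():
    print(f"{label:45s} avg {avg:12.3f} "
          f"{VS_BASE[label]:6.3f} {VS_WAKE_WIDE_IDLE[label]:6.3f}")
```

The printed averages should agree with the quoted ones to within rounding of
the last digit; the ratios are only as meaningful as the run-to-run variance
allows.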

Taking pgbench to a little NUMA box, I'm seeing _nada_ outside of
variance with master (crap).  I have a way to win significantly for
_older_ kernels, and that win over master _may_ provide some useful
insight, but I don't trust postgres/pgbench as far as I can toss the
planet, so don't have a warm fuzzy about trying to use it to
approximate your real world load.

BTW, what's your topology look like (numactl --hardware).

	-Mike