Message-ID: <1433393548.3651.45.camel@gmail.com>
Subject: Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
From: Mike Galbraith
To: Josef Bacik
Cc: Peter Zijlstra, Rik van Riel, mingo@redhat.com, linux-kernel@vger.kernel.org, morten.rasmussen@arm.com, kernel-team
Date: Thu, 04 Jun 2015 06:52:28 +0200
In-Reply-To: <556F64B9.1050603@fb.com>

On Wed, 2015-06-03 at 16:34 -0400, Josef Bacik wrote:
> On 06/03/2015 01:43 PM, Mike Galbraith wrote:
> > There are also other loads like your server where waking to an idle cpu
> > dominates all else, pgbench is one of those. In that case, you've got a
> > 1:N waker/wakee relationship, and what matters above ALL else is when
> > the mother of all work (the single server thread) wants a CPU, it had
> > better get it NOW, else the load stalls.
> > Likewise, 'mom' being preempted hurts truckloads. Perhaps your server
> > has a similar thing going on, keeping wakees the hell away from the
> > waker rules all.
> >
>
> Yeah our server has two waker threads (one per numa node) and then the N
> number of wakee threads. I'll run tbench and pgbench with the new
> patches and see if there's a degradation. Thanks,

If you look for wake_wide(), it could perhaps be used to select wider
search for only the right flavor load component when BALANCE_WAKE is
set. That would let the cache lovers in your box continue to perform
while improving the 1:N component. That wider search still needs to
become cheaper though, low-hanging fruit being to stop searching when
you find load = 0... but you may meet the energy-efficient folks, who
iirc want to make it even more expensive.

wake_wide() inadvertently helped another sore spot btw - a gaggle of
pretty light tasks being awakened from an interrupt source tended to
cluster around that source, preventing such loads from being all they
can be in a very similar manner. Xen (shudder;) showed that nicely in
older kernels, due to the way its weird dom0 gizmo works.

	-Mike
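
For illustration only, the two ideas above (using a wake_wide()-style
check to pick out the 1:N component, and stopping the wider search as
soon as a CPU with load 0 turns up) might be sketched in userspace C
roughly as below. This is not kernel code and not a copy of the kernel's
wake_wide(): the struct, the function names, the flip-count heuristic
and the llc_size parameter are all assumptions made up for the sketch.

```c
/* Userspace sketch only, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task counter: how often a task changes wakeup partner. */
struct task_sketch {
	unsigned int wakee_flips;
};

/*
 * Return 1 when the pair looks like a 1:N waker/wakee relationship
 * (the waker switches partners far more often than the wakee), else 0.
 * llc_size stands in for the number of CPUs sharing a last-level cache.
 */
static int wake_wide_sketch(const struct task_sketch *waker,
			    const struct task_sketch *wakee,
			    unsigned int llc_size)
{
	assert(waker != NULL && wakee != NULL);

	if (wakee->wakee_flips > llc_size &&
	    waker->wakee_flips > llc_size * wakee->wakee_flips)
		return 1;
	return 0;
}

/*
 * Scan per-CPU loads and return the index of the least loaded CPU,
 * stopping early once a CPU with load 0 is found. Returns -1 when
 * ncpus is 0. If seen is non-NULL, it receives the number of CPUs
 * actually examined, which shows how much the early exit saves.
 */
static int find_least_loaded_sketch(const unsigned long *loads, size_t ncpus,
				    size_t *seen)
{
	unsigned long best_load = 0;
	size_t examined = 0;
	size_t i;
	int best = -1;

	assert(loads != NULL || ncpus == 0);

	for (i = 0; i < ncpus; i++) {
		examined++;
		if (best < 0 || loads[i] < best_load) {
			best = (int)i;
			best_load = loads[i];
		}
		if (best_load == 0)
			break;	/* cannot do better than idle, stop here */
	}

	if (seen != NULL)
		*seen = examined;
	return best;
}
```

A caller would run the wider (and more expensive) scan only when
wake_wide_sketch() returns 1, and leave everything else on the cheap
cache-affine path, so the cache lovers are not disturbed.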