Subject: Re: [RFC] sched: Limit idle_balance() when it is being used too frequently
From: Jason Low
To: Peter Zijlstra
Cc: Rik van Riel, Ingo Molnar, LKML, Mike Galbraith, Thomas Gleixner,
 Paul Turner, Alex Shi, Preeti U Murthy, Vincent Guittot,
 Morten Rasmussen, Namhyung Kim, Andrew Morton, Kees Cook, Mel Gorman,
 aswin@hp.com, scott.norton@hp.com, chegu_vinod@hp.com
Date: Fri, 19 Jul 2013 12:15:18 -0700

On Fri, 2013-07-19 at 20:37 +0200, Peter Zijlstra wrote:
> On Thu, Jul 18, 2013 at 12:06:39PM -0700, Jason Low wrote:
>
> > N = 1
> > -----
> > 19.21%  reaim    [k] __read_lock_failed
> > 14.79%  reaim    [k] mspin_lock
> > 12.19%  reaim    [k] __write_lock_failed
> >  7.87%  reaim    [k] _raw_spin_lock
> >  2.03%  reaim    [k] start_this_handle
> >  1.98%  reaim    [k] update_sd_lb_stats
> >  1.92%  reaim    [k] mutex_spin_on_owner
> >  1.86%  reaim    [k] update_cfs_rq_blocked_load
> >  1.14%  swapper  [k] intel_idle
> >  1.10%  reaim    [.] add_long
> >  1.09%  reaim    [.] add_int
> >  1.08%  reaim    [k] load_balance
>
> But but but but.. wth is causing this? The only thing we do more of with
> N=1 is idle_balance(); where would that cause __{read,write}_lock_failed
> and/or mspin_lock() contention like that.
>
> There shouldn't be a rwlock_t in the entire scheduler; those things suck
> worse than quicksand.
>
> If, as Rik thought, we'd have more rq->lock contention, then I'd have
> expected _raw_spin_lock to be up highest.

For this particular fserver workload, the contended mutex (the
mspin_lock/mutex_spin_on_owner samples above) is acquired in the calls
made from ext4_orphan_add() and ext4_orphan_del(), and the failing read
and write lock acquisitions are in start_this_handle(). Although none of
these functions are called from the idle_balance() code path,
update_sd_lb_stats(), tg_load_down(), idle_cpu(), spin_lock(), etc.
increase the time spent in the kernel, and that appears to indirectly
cause more time to be spent acquiring those other kernel locks.

Thanks,
Jason
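
For readers without the patch in front of them: the "N" above comes from
the RFC under discussion, which is not quoted in this message. As a rough
model only, and assuming N acts as a ratio bounding how much of a CPU's
recent idle time may be consumed by idle_balance(), the decision could
look like the user-space sketch below. The struct, field, and function
names (cpu_idle_stats, should_idle_balance, and so on) are illustrative
assumptions, not identifiers from the RFC.

/*
 * Hypothetical user-space model of rate-limiting idle_balance():
 * skip the balance when the CPU has recently spent more than 1/N of
 * its idle time inside idle_balance() itself.  Not the actual patch.
 */
#include <stdio.h>
#include <stdbool.h>

struct cpu_idle_stats {
	unsigned long long idle_time_ns;	/* time recently spent idle */
	unsigned long long balance_time_ns;	/* time recently spent in idle_balance() */
};

/* N = 1 allows idle_balance() to consume up to all of the idle time. */
static bool should_idle_balance(const struct cpu_idle_stats *s, unsigned int n)
{
	/* No idle-time history yet: allow balancing. */
	if (s->idle_time_ns == 0)
		return true;

	/* Balance only while its cost stays within 1/N of the idle time. */
	return s->balance_time_ns * n <= s->idle_time_ns;
}

int main(void)
{
	/* Example: 30% of the recent idle time went to idle_balance(). */
	struct cpu_idle_stats s = {
		.idle_time_ns    = 1000000,
		.balance_time_ns =  300000,
	};

	for (unsigned int n = 1; n <= 5; n++)
		printf("N=%u: %s\n", n,
		       should_idle_balance(&s, n) ?
		       "run idle_balance()" : "skip idle_balance()");
	return 0;
}

With the numbers in main(), N=1 through N=3 still allow balancing while
N=4 and N=5 skip it, which matches the way the thread treats N=1 as the
configuration that does the most idle balancing.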