Subject: Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()
From: Mike Galbraith
To: Yuyang Du
Cc: Morten Rasmussen, Peter Zijlstra, Rabin Vincent, mingo@redhat.com,
    linux-kernel@vger.kernel.org, Paul Turner, Ben Segall
Date: Fri, 03 Jul 2015 06:42:32 +0200
Message-ID: <1435898552.6418.14.camel@gmail.com>
In-Reply-To: <20150702184234.GC5197@intel.com>

On Fri, 2015-07-03 at 02:42 +0800, Yuyang Du wrote:
> But still, I think, even with the above, in idle balancing, pulling until the source
> rq's nr_running == 1 is not just "a short term fix", but should be there permanently
> acting like a last guard with no overhead, why not.

Yeah, seems so.  Searching for steal-all samples... (this is all with autogroup)

load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 3   imb: 23  det_tasks: 2 det_load: 3   zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 0   imb: 32  det_tasks: 2 det_load: 0   zeros: 2
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 1   imb: 17  det_tasks: 2 det_load: 1   zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 37  imb: 22  det_tasks: 2 det_load: 37  zeros: 1
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 0   imb: 102 det_tasks: 2 det_load: 0   zeros: 2
load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 93  imb: 47  det_tasks: 1 det_load: 93  zeros: 0
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 202 imb: 125 det_tasks: 2 det_load: 202 zeros: 0
load_balance: idle - s_run: 0 d_run: 2 s_load: 0 d_load: 243 imb: 188 det_tasks: 2 det_load: 243 zeros: 0
load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 145 imb: 73  det_tasks: 1 det_load: 145 zeros: 0
load_balance: idle - s_run: 0 d_run: 1 s_load: 0 d_load: 46  imb: 24  det_tasks: 1 det_load: 46  zeros: 0

Both varieties of total pilferage (w/wo 0 load tasks involved) seem to
happen only during idle balance, never periodic (yet).

Oddity: make -j8 occasionally stacks/pulls piles of load=dinky.

homer:/sys/kernel/debug/tracing # for i in `seq 1 10`; do cat trace|grep "s_run: 1.*det_tasks: $i.*zeros: 0"|wc -l; done
71634
1567
79
15
1
3
0
2
3
0
homer:/sys/kernel/debug/tracing # cat trace|grep "s_run: 1.*det_tasks: 8.*zeros: 0"
          <idle>-0     [002] dNs.   594.973783: load_balance: norm - s_run: 1 d_run: 9 s_load: 67 d_load: 1110 imb: 86 det_tasks: 8 det_load: 86 zeros: 0
           <...>-10367 [007] d...  1456.477281: load_balance: idle - s_run: 1 d_run: 8 s_load: 805 d_load: 22 imb: 45 det_tasks: 8 det_load: 22 zeros: 0
homer:/sys/kernel/debug/tracing # cat trace|grep "s_run: 1.*det_tasks: 9.*zeros: 0"
           <...>-23317 [004] d...   486.677925: load_balance: idle - s_run: 1 d_run: 9 s_load: 888 d_load: 27 imb: 47 det_tasks: 9 det_load: 27 zeros: 0
           <...>-11485 [002] d...   573.411095: load_balance: idle - s_run: 1 d_run: 9 s_load: 124 d_load: 78 imb: 82 det_tasks: 9 det_load: 78 zeros: 0
           <...>-23286 [000] d...  1510.378740: load_balance: idle - s_run: 1 d_run: 9 s_load: 102 d_load: 58 imb: 63 det_tasks: 9 det_load: 58 zeros: 0
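
For the record, the debug patch that emits the load_balance: lines isn't
included in this mail.  The format corresponds to instrumentation along
roughly these lines, placed in load_balance() after detach_tasks() returns;
det_tasks/det_load/zeros here are hypothetical counters accumulated while
detaching, and the idle/norm mapping and the runnable_load_avg load source
are assumptions, not the actual patch:

	/*
	 * Sketch only, not the actual debug patch: emit one line per
	 * load_balance() invocation in the format seen above.  det_tasks,
	 * det_load and zeros are assumed to be accumulated in the detach
	 * loop; loads are read from the 4.1-era cfs_rq->runnable_load_avg.
	 */
	trace_printk("load_balance: %s - s_run: %u d_run: %u s_load: %lu d_load: %lu imb: %ld det_tasks: %d det_load: %lu zeros: %d\n",
		     env.idle == CPU_NEWLY_IDLE ? "idle" : "norm",
		     env.src_rq->nr_running, env.dst_rq->nr_running,
		     env.src_rq->cfs.runnable_load_avg,
		     env.dst_rq->cfs.runnable_load_avg,
		     env.imbalance, det_tasks, det_load, zeros);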
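
And the permanent "last guard" Yuyang describes would amount to something
like the below near the top of detach_tasks()'s pull loop in
kernel/sched/fair.c -- a sketch against the 4.1-era lb_env, not a tested
patch:

	/* In detach_tasks(): tasks == &env->src_rq->cfs_tasks, as in mainline. */
	while (!list_empty(tasks)) {
		/*
		 * Sketch of the guard: when pulling on behalf of an idle
		 * CPU, never take the last runnable task.  Leaving one task
		 * on the source rq costs nothing and rules out the
		 * steal-everything samples above (and with them the
		 * idle_balance() livelock).
		 */
		if (env->idle != CPU_NOT_IDLE && env->src_rq->nr_running <= 1)
			break;

		p = list_first_entry(tasks, struct task_struct, se.group_node);

		/* ... existing can_migrate_task() / detach_task() logic ... */
	}

That leaves the env->imbalance accounting untouched and only ever fires in
the cases the traces above show draining the source rq to zero.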