Date: Fri, 3 Jul 2015 03:37:02 +0800
From: Yuyang Du
To: Morten Rasmussen
Cc: Mike Galbraith, Peter Zijlstra, Rabin Vincent, "mingo@redhat.com",
    "linux-kernel@vger.kernel.org", Paul Turner, Ben Segall
Subject: Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()
Message-ID: <20150702193702.GD5197@intel.com>

Hi Morten,

On Thu, Jul 02, 2015 at 12:40:32PM +0100, Morten Rasmussen wrote:
> detach_tasks() will attempt to pull 62 based on the tasks' task_h_load(),
> but the task_h_load() sum is only 5 + 10 + 0 and hence detach_tasks() will
> empty the src_rq.
>
> IOW, since task groups include blocked load in the load_avg_contrib (see
> __update_group_entity_contrib() and __update_cfs_rq_tg_load_contrib()) the
> imbalance includes blocked load and hence env->imbalance >=
> sum(task_h_load(p)) for all tasks p on the rq. Which leads to
> detach_tasks() emptying the rq completely in the reported scenario where
> blocked load > runnable load.

Whenever I want to know how the load average works out for a task group, I
need to walk through the whole code again, and I would rather not do that
this time. But it is not as simple as saying "the 118 comes from the
blocked load".

Anyway, with blocked load, yes, we definitely can't move (or even find)
some amount of the imbalance if we only look at the tasks on the queue.
But this may or may not be a problem.

First, the question is whether we want blocked load anywhere at all. This
is just a "now vs. average" question.

Second, if we stick with the average, we just need to treat blocked load
consistently: it should not be that the group SE includes it but the task
SE does not, or that some places include it and others do not.

Thanks,
Yuyang

> Whether emptying the src_rq is the right thing to do depends on your
> point of view. Does balanced load (runnable+blocked) take priority over
> keeping cpus busy or not? For idle_balance() it seems intuitively
> correct to not empty the rq and hence you could consider env->imbalance
> to be too big.
>
> I think we will see more of this kind of problem if we include
> weighted_cpuload() as well. Parts of the imbalance calculation code are
> quite old and could use some attention first.
>
> A short term fix could be what Yuyang proposes: stop pulling tasks when
> there is only one left in detach_tasks(). It won't affect active load
> balance, where we may want to migrate the last task, as active load
> balance doesn't use detach_tasks().
>
> Morten
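
As a rough illustration of the numbers in the quoted scenario, here is a toy user-space C model. It is not kernel code: struct toy_rq, toy_detach_tasks() and the keep_last flag are made-up names, and the loop assumes that each pulled task simply has its task_h_load() subtracted from the imbalance, ignoring whatever other checks the real detach_tasks() performs. It only shows why an imbalance larger than the sum of the queued loads drains the queue, and how a "stop when only one task is left" guard would change that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not a kernel symbol. */
struct toy_rq {
    const long *h_load;  /* per-task load, a stand-in for task_h_load(p) */
    size_t nr_running;   /* tasks still queued on the source rq */
};

/*
 * Pull tasks from the front of src while some imbalance remains.
 * With keep_last set, stop as soon as only one task is left on src.
 * Returns the number of tasks detached.
 */
static size_t toy_detach_tasks(struct toy_rq *src, long imbalance,
                               bool keep_last)
{
    size_t detached = 0;

    assert(src != NULL);

    while (src->nr_running > 0 && imbalance > 0) {
        /* Guard in the spirit of the short-term fix discussed above. */
        if (keep_last && src->nr_running <= 1)
            break;

        /* Only queued (runnable) load can ever be subtracted here. */
        imbalance -= src->h_load[detached];
        detached++;
        src->nr_running--;
    }
    return detached;
}
```

With queued loads of 5, 10 and 0 and an imbalance of 62, calling toy_detach_tasks() without the guard detaches all three tasks, because the queued loads only add up to 15 and the imbalance never reaches zero. With keep_last set, it detaches two tasks and leaves one on the source queue.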