Date: Fri, 11 Dec 2015 14:53:59 -0500
From: Steven Rostedt <rostedt@goodmis.org>
To: Luca Abeni
Cc: Peter Zijlstra, Thomas Gleixner, Juri Lelli, Ingo Molnar,
    linux-kernel@vger.kernel.org
Subject: Re: SCHED_RR vs push-pull
Message-ID: <20151211145359.12aac9bd@gandalf.local.home>
In-Reply-To: <20151211203918.3e9d4c9a@luca-1225C>
References: <20151211141028.GH6357@twins.programming.kicks-ass.net>
    <20151211203918.3e9d4c9a@luca-1225C>

On Fri, 11 Dec 2015 20:39:18 +0100
Luca Abeni wrote:

> Hi Peter,
>
> On Fri, 11 Dec 2015 15:10:28 +0100
> Peter Zijlstra wrote:
> [...]
> > Thomas just reported a 'fun' problem with our rt 'load-balancer'.
> I suspect the root of the problem is that rt push/pull do not
> implement a load balancer, but just make sure that the M highest
> priority tasks (where M is the number of CPUs) are scheduled for
> execution.
> The difference from a "real" load balancer can be seen when there
> are multiple tasks with the same priority :)

Yep.

> > The problem is 2 cpus, 4 equal prio RR tasks.
> > Suppose an unequal distribution of these tasks among the CPUs;
> > eg 1:3.
> >
> > Now one would expect; barring other constraints; that each CPU
> > would get 2 of the tasks and they would RR on their prio level.
> >
> > This does not happen.
> >
> > The push-pull thing only acts when there's idle or new tasks, and
> > in the above scenario, the CPU with only the single RR task will
> > happily continue running that task, while the other CPU will have
> > to RR between the remaining 3.
> I might be wrong, but I think this is due to the
> 	if (lowest_rq->rt.highest_prio.curr <= task->prio) {
> in rt.c::find_lock_lowest_rq().
> I suspect that changing "<=" to "<" might fix the issue, at the cost
> of generating a lot of useless task migrations.

I'm against this. Unless we only do this when current and the task we
want to move are both RR. It might help.
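
Something like the below sketch is what I have in mind. Completely
untested, written against current-ish kernel/sched/rt.c; the fields it
touches are real, but rr_may_push() itself is a made-up helper, not
code that exists today:

	/*
	 * Relax the "<=" to "<" only when both the task being pushed
	 * and current on the target runqueue are SCHED_RR, so an
	 * equal-prio FIFO current never loses its CPU to a migration.
	 * (A lower ->prio value means a higher priority.)
	 */
	static bool rr_may_push(struct rq *lowest_rq, struct task_struct *task)
	{
		/* Target's best is strictly lower prio: push, as today. */
		if (lowest_rq->rt.highest_prio.curr > task->prio)
			return true;

		/* Equal prio: only when both sides are round-robin. */
		if (lowest_rq->rt.highest_prio.curr == task->prio &&
		    task->policy == SCHED_RR &&
		    lowest_rq->curr->policy == SCHED_RR)
			return true;

		return false;
	}

That keeps today's behavior for FIFO and only opens the equal-prio
case for RR-on-RR.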
> > Now my initial thoughts were to define a global RR order using a
> > virtual timeline and you'll get something like EEVDF on a per RR
> > prio level with push-pull state between that.
> >
> > Which might be a tad over engineered.
> I suspect this issue can be fixed in a simpler way, by changing the
> check I mentioned above.

What happens when current is FIFO, and we just moved an RR task over
that will now never run?

> If you want to balance SCHED_RR tasks with the same priority, I
> think the "lowest_rq->rt.highest_prio.curr <= task->prio" check
> should be extended to do the migration if:
> - the local task has a higher priority than the highest priority
>   task on lowest_rq (this is what's currently done)
> - the local task has the same priority as the highest priority task
>   on lowest_rq, both are SCHED_RR, and the number of tasks with
>   task->prio on the local RQ is larger than the number of tasks
>   with lowest_rq->rt.highest_prio.curr on lowest_rq + 1.

Well, the number of tasks may not be good enough. We need to only
look at RR tasks. Perhaps if current is RR and the waiter on the
other CPU is RR, we can do a scan to see if a balance should be done
(a rough sketch of such a scan is at the end of this mail).

> I think this could work, but I just looked at the code, without any
> real test. If you provide a simple program implementing a testcase,
> I can do some experiments next week.
>
> The alternative (of course I have to mention it :) would be to use
> SCHED_DEADLINE instead of SCHED_RR.

Hmm, I wonder if we could have a wrapper around SCHED_DEADLINE to
implement SCHED_RR. Probably not, because SCHED_RR has hard coded
priorities and SCHED_DEADLINE is more dynamic (and still higher than
SCHED_FIFO).

> > Happy thinking ;-)

Heh, I originally thought Peter said "Happy Thanksgiving".

-- Steve
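
P.S. For concreteness, a rough sketch of what such a count-based scan
could look like. Untested; it ignores RT group scheduling (it only
walks the root rt_rq), and rr_nr_at_prio()/rr_balance_helps() are
hypothetical helpers, not anything in rt.c today:

	/* Count SCHED_RR tasks queued at @prio on @rq's root rt_rq. */
	static unsigned int rr_nr_at_prio(struct rq *rq, int prio)
	{
		struct list_head *queue = rq->rt.active.queue + prio;
		struct sched_rt_entity *rt_se;
		unsigned int nr = 0;

		list_for_each_entry(rt_se, queue, run_list)
			if (rt_task_of(rt_se)->policy == SCHED_RR)
				nr++;

		return nr;
	}

	/*
	 * Migrating an equal-prio RR task only helps when it evens out
	 * the split, i.e. the local RR tasks at this prio outnumber
	 * the remote ones by more than one (so the 1:3 case above
	 * becomes 2:2 and then stops migrating).
	 */
	static bool rr_balance_helps(struct rq *rq, struct rq *lowest_rq,
				     struct task_struct *task)
	{
		int prio = task->prio;

		if (prio != lowest_rq->rt.highest_prio.curr)
			return false;

		return rr_nr_at_prio(rq, prio) >
		       rr_nr_at_prio(lowest_rq, prio) + 1;
	}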