Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752900AbbLKTjh (ORCPT ); Fri, 11 Dec 2015 14:39:37 -0500 Received: from mail-wm0-f54.google.com ([74.125.82.54]:34735 "EHLO mail-wm0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751494AbbLKTjf (ORCPT ); Fri, 11 Dec 2015 14:39:35 -0500 Date: Fri, 11 Dec 2015 20:39:18 +0100 From: Luca Abeni To: Peter Zijlstra Cc: Steven Rostedt , Thomas Gleixner , Juri Lelli , Ingo Molnar , linux-kernel@vger.kernel.org Subject: Re: SCHED_RR vs push-pull Message-ID: <20151211203918.3e9d4c9a@luca-1225C> In-Reply-To: <20151211141028.GH6357@twins.programming.kicks-ass.net> References: <20151211141028.GH6357@twins.programming.kicks-ass.net> X-Mailer: Claws Mail 3.8.0 (GTK+ 2.24.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2797 Lines: 71 Hi Peter, On Fri, 11 Dec 2015 15:10:28 +0100 Peter Zijlstra wrote: [...] > Thomas just reported a 'fun' problem with our rt 'load-balancer'. I suspect the root of the proble is that rt push/pull do not implement a load balancer, but just make sure that the M high priority tasks (where M is the number of CPUs) are scheduled for execution. The difference with a "real" load balancer can be seen when there are multiple tasks with the same priority :) > The problem is 2 cpus 4 equal prio RR tasks. > Suppose an unequal distribution of these tasks among the CPUs; eg 1:3. > > Now one would expect; barring other constraints; that each CPU would > get 2 of the tasks and they would RR on their prio level. > > This does not happen. > > The push-pull thing only acts when there's idle or new tasks, and in > the above scenario, the CPU with only the single RR task will happily > continue running that task, while the other CPU will have to RR > between the remaining 3. I might be wrong, but I think this is due to the if (lowest_rq->rt.highest_prio.curr <= task->prio) { in rt.c::find_lock_lowest_rq(). I suspect that changing "<=" in "<" might fix the issue, at the cost of generating a lot of useless tasks migrations. > Now my initial thoughts were to define a global RR order using a > virtual timeline and you'll get something like EEVDF on a per RR prio > level with push-pull state between that. > > Which might be a tad over engineered. I suspect this issue can be fixed in a simpler way, by changing the check I mentioned above. If you want to balance SCHED_RR tasks with the same priority, I think the "lowest_rq->rt.highest_prio.curr <= task->prio" should be extended to do the migration if: - the local task has a higher priority than the highest priority task on lowest_rq (this is what's currently done) - the local task has the same priority of the highest priority task on lowest_rq and they are SCHED_RR and the number of tasks with task->prio on the local RQ is larger than the number of tasks with lowest_rq->rt.highest_prio.curr on lowest_rq + 1. I think this could work, but I just looked at the code, without any real test. If you provide a simple program implementing a testcase, I can do some experiments in next week. The alternative (of course I have to mention it :) would be to use SCHED_DEADLINE instead of SCHED_RR. Luca > > Is there a 'sane' solution to this problem? One that still is > deterministic, because this is after all, RT scheduling. > > Happy thinking ;-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/