Date: Sat, 14 Jan 2006 10:48:27 -0800
From: "Martin J. Bligh"
To: Peter Williams
Cc: Con Kolivas, Andrew Morton, linux-kernel@vger.kernel.org, Ingo Molnar, Andy Whitcroft
Subject: Re: -mm seems significanty slower than mainline on kernbench
In-Reply-To: <43C75178.80809@bigpond.net.au>
Message-ID: <43C9477B.8060709@google.com>

> Attached is a new patch to fix the excessive idle problem. This patch
> takes a new approach to the problem, as it was becoming obvious that
> trying to alter the load balancing code to cope with biased load was
> harder than it seemed.
>
> This approach reverts to the old load values but weights them
> according to tasks' bias_prio values. This means that any assumptions
> by the load balancing code that the load generated by a single task is
> SCHED_LOAD_SCALE will still hold. Then, in find_busiest_group(), the
> imbalance is scaled back up to bias_prio scale so that move_tasks()
> can move biased load rather than tasks.

OK, this one seems to fix the issue that I had, AFAICS. Congrats, and
thanks,

M.

> One advantage of this is that when there are no non-zero-niced tasks
> the processing will be mathematically the same as the original code.
>
> Kernbench results from a 2-CPU Celeron 550MHz system are:
>
> Average Optimal -j 8 Load Run:
> Elapsed Time      1056.16   (0.831102)
> User Time         1906.54   (1.38447)
> System Time        182.086  (0.973386)
> Percent CPU         197     (0)
> Context Switches  48727.2   (249.351)
> Sleeps            27623.4   (413.913)
>
> This indicates that, on average, 98.9% of the total available CPU was
> used by the build.
>
> Signed-off-by: Peter Williams
>
> BTW I think that we need to think about a slightly more complex nice
> to bias mapping function. The current one gives a nice==19 task 1/20 of
> the bias of a nice==0 task, but only gives nice==-20 tasks twice the
> bias of a nice==0 task. I don't think this is a big problem, as the
> majority of non-nice==0 tasks will have positive nice, but it should be
> looked at as a future enhancement.
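(As an aside, the 98.9% figure above is just
(1906.54 + 182.086) / (2 * 1056.16) ~= 0.989.)

To make the mapping's endpoints concrete, here is a minimal stand-alone
sketch. It assumes the mapping works out to roughly
NICE_TO_BIAS_PRIO(nice) == 20 - nice; that is my reading of the 1/20 and
2x figures quoted above, not necessarily the exact macro in the patch
series:

#include <stdio.h>

/* Assumed nice -> bias mapping; chosen only to match the ratios above. */
#define NICE_TO_BIAS_PRIO(nice) (20 - (nice))

int main(void)
{
        printf("nice  19 -> bias %d\n", NICE_TO_BIAS_PRIO(19));  /*  1: 1/20 of nice 0 */
        printf("nice   0 -> bias %d\n", NICE_TO_BIAS_PRIO(0));   /* 20: the reference  */
        printf("nice -20 -> bias %d\n", NICE_TO_BIAS_PRIO(-20)); /* 40: only 2x nice 0 */
        return 0;
}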
> Peter
>
>------------------------------------------------------------------------
>
>Index: MM-2.6.X/kernel/sched.c
>===================================================================
>--- MM-2.6.X.orig/kernel/sched.c	2006-01-13 14:53:34.000000000 +1100
>+++ MM-2.6.X/kernel/sched.c	2006-01-13 15:11:19.000000000 +1100
>@@ -1042,7 +1042,8 @@ void kick_process(task_t *p)
> static unsigned long source_load(int cpu, int type)
> {
> 	runqueue_t *rq = cpu_rq(cpu);
>-	unsigned long load_now = rq->prio_bias * SCHED_LOAD_SCALE;
>+	unsigned long load_now = (rq->prio_bias * SCHED_LOAD_SCALE) /
>+		NICE_TO_BIAS_PRIO(0);
>
> 	if (type == 0)
> 		return load_now;
>@@ -1056,7 +1057,8 @@ static unsigned long source_load(int cpu
> static inline unsigned long target_load(int cpu, int type)
> {
> 	runqueue_t *rq = cpu_rq(cpu);
>-	unsigned long load_now = rq->prio_bias * SCHED_LOAD_SCALE;
>+	unsigned long load_now = (rq->prio_bias * SCHED_LOAD_SCALE) /
>+		NICE_TO_BIAS_PRIO(0);
>
> 	if (type == 0)
> 		return load_now;
>@@ -1322,7 +1324,8 @@ static int try_to_wake_up(task_t *p, uns
> 	 * of the current CPU:
> 	 */
> 	if (sync)
>-		tl -= p->bias_prio * SCHED_LOAD_SCALE;
>+		tl -= (p->bias_prio * SCHED_LOAD_SCALE) /
>+			NICE_TO_BIAS_PRIO(0);
>
> 	if ((tl <= load &&
> 		tl + target_load(cpu, idx) <= SCHED_LOAD_SCALE) ||
>@@ -2159,7 +2162,7 @@ find_busiest_group(struct sched_domain *
> 	}
>
> 	/* Get rid of the scaling factor, rounding down as we divide */
>-	*imbalance = *imbalance / SCHED_LOAD_SCALE;
>+	*imbalance = (*imbalance * NICE_TO_BIAS_PRIO(0)) / SCHED_LOAD_SCALE;
> 	return busiest;
>
> out_balanced:
>@@ -2472,7 +2475,8 @@ static void rebalance_tick(int this_cpu,
> 	struct sched_domain *sd;
> 	int i;
>
>-	this_load = this_rq->prio_bias * SCHED_LOAD_SCALE;
>+	this_load = (this_rq->prio_bias * SCHED_LOAD_SCALE) /
>+		NICE_TO_BIAS_PRIO(0);
> 	/* Update our load */
> 	for (i = 0; i < 3; i++) {
> 		unsigned long new_load = this_load;
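The net effect of the hunks above: the per-runqueue biased load is
divided by NICE_TO_BIAS_PRIO(0) on the way out of source_load(),
target_load() and rebalance_tick(), so a runqueue holding a single
nice-0 task still reports exactly SCHED_LOAD_SCALE, and the imbalance
computed in find_busiest_group() is multiplied back by
NICE_TO_BIAS_PRIO(0) so that move_tasks() works in bias_prio units. A
small sketch of that arithmetic, using stand-in values
(SCHED_LOAD_SCALE == 128 and a nice-0 bias of 20 are assumptions here,
not taken from the patch):

#include <stdio.h>

#define SCHED_LOAD_SCALE     128UL  /* assumed value, for illustration only */
#define NICE_TO_BIAS_PRIO_0   20UL  /* assumed bias of one nice-0 task */

int main(void)
{
        /* A runqueue holding a single nice-0 task. */
        unsigned long prio_bias = NICE_TO_BIAS_PRIO_0;

        /* source_load()/target_load(): normalise the biased load so one
         * nice-0 task still looks like exactly SCHED_LOAD_SCALE. */
        unsigned long load_now = (prio_bias * SCHED_LOAD_SCALE) / NICE_TO_BIAS_PRIO_0;
        printf("load_now  = %lu\n", load_now);   /* 128 == SCHED_LOAD_SCALE */

        /* find_busiest_group(): scale an imbalance of two nice-0 tasks'
         * worth of load back up into bias_prio units for move_tasks(). */
        unsigned long imbalance = 2 * SCHED_LOAD_SCALE;
        imbalance = (imbalance * NICE_TO_BIAS_PRIO_0) / SCHED_LOAD_SCALE;
        printf("imbalance = %lu\n", imbalance);  /* 40 == two nice-0 tasks' bias */

        return 0;
}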