Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756217Ab1C3VwJ (ORCPT ); Wed, 30 Mar 2011 17:52:09 -0400 Received: from mga11.intel.com ([192.55.52.93]:26254 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933410Ab1C3VHP (ORCPT ); Wed, 30 Mar 2011 17:07:15 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.63,270,1299484800"; d="scan'208";a="673447488" From: Andi Kleen References: <20110330203.501921634@firstfloor.org> In-Reply-To: <20110330203.501921634@firstfloor.org> To: ncrao@google.com, a.p.zijlstra@chello.nl, ak@linux.intel.com, mingo@elte.hu, efault@gmx.de, gregkh@suse.de, linux-kernel@vger.kernel.org, stable@kernel.org, tim.bird@am.sony.com Subject: [PATCH] [94/275] sched: Set group_imb only a task can be pulled from the busiest cpu Message-Id: <20110330210532.980E43E1A05@tassilo.jf.intel.com> Date: Wed, 30 Mar 2011 14:05:32 -0700 (PDT) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2922 Lines: 77 2.6.35-longterm review patch. If anyone has any objections, please let me know. ------------------ Commit: 2582f0eba54066b5e98ff2b27ef0cfa833b59f54 upstream When cycling through sched groups to determine the busiest group, set group_imb only if the busiest cpu has more than 1 runnable task. This patch fixes the case where two cpus in a group have one runnable task each, but there is a large weight differential between these two tasks. The load balancer is unable to migrate any task from this group, and hence do not consider this group to be imbalanced. Signed-off-by: Nikhil Rao Signed-off-by: Peter Zijlstra Signed-off-by: Andi Kleen LKML-Reference: <1286996978-7007-3-git-send-email-ncrao@google.com> [ small code readability edits ] Signed-off-by: Ingo Molnar Signed-off-by: Mike Galbraith Acked-by: Peter Zijlstra Signed-off-by: Greg Kroah-Hartman --- kernel/sched_fair.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) Index: linux-2.6.35.y/kernel/sched_fair.c =================================================================== --- linux-2.6.35.y.orig/kernel/sched_fair.c 2011-03-29 23:03:00.047298961 -0700 +++ linux-2.6.35.y/kernel/sched_fair.c 2011-03-29 23:55:03.511377329 -0700 @@ -2352,7 +2352,7 @@ int local_group, const struct cpumask *cpus, int *balance, struct sg_lb_stats *sgs) { - unsigned long load, max_cpu_load, min_cpu_load; + unsigned long load, max_cpu_load, min_cpu_load, max_nr_running; int i; unsigned int balance_cpu = -1, first_idle_cpu = 0; unsigned long avg_load_per_task = 0; @@ -2363,6 +2363,7 @@ /* Tally up the load of all CPUs in the group */ max_cpu_load = 0; min_cpu_load = ~0UL; + max_nr_running = 0; for_each_cpu_and(i, sched_group_cpus(group), cpus) { struct rq *rq = cpu_rq(i); @@ -2380,8 +2381,10 @@ load = target_load(i, load_idx); } else { load = source_load(i, load_idx); - if (load > max_cpu_load) + if (load > max_cpu_load) { max_cpu_load = load; + max_nr_running = rq->nr_running; + } if (min_cpu_load > load) min_cpu_load = load; } @@ -2421,11 +2424,10 @@ if (sgs->sum_nr_running) avg_load_per_task = sgs->sum_weighted_load / sgs->sum_nr_running; - if ((max_cpu_load - min_cpu_load) > 2*avg_load_per_task) + if ((max_cpu_load - min_cpu_load) > 2*avg_load_per_task && max_nr_running > 1) sgs->group_imb = 1; - sgs->group_capacity = - DIV_ROUND_CLOSEST(group->cpu_power, SCHED_LOAD_SCALE); + sgs->group_capacity = DIV_ROUND_CLOSEST(group->cpu_power, SCHED_LOAD_SCALE); } /** -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/