From: Gregory Haskins
Subject: [PATCH v2] keep rd->online and cpu_online_map in sync
To: suresh.b.siddha@intel.com
Cc: ego@in.ibm.com, rjw@sisk.pl, akpm@linux-foundation.org, dmitry.adamushko@gmail.com, ego@in.ibm.com, mingo@elte.hu, oleg@sign.ru, yi.y.yang@intel.com, linux-kernel@vger.kernel.org, tglx@linutronix.de, ghaskins@novell.com
Date: Mon, 10 Mar 2008 17:59:11 -0400
Message-ID: <20080310215813.10968.960.stgit@novell1.haskins.net>
In-Reply-To: <20080310221014.GB27329@linux-os.sc.intel.com>
References: <20080310221014.GB27329@linux-os.sc.intel.com>

>>> On Mon, Mar 10, 2008 at 6:10 PM, in message
<20080310221014.GB27329@linux-os.sc.intel.com>, Suresh Siddha wrote:
> On Mon, Mar 10, 2008 at 04:00:28PM -0600, Gregory Haskins wrote:
>> >>> On Mon, Mar 10, 2008 at 6:03 PM, in message
>> <200803102303.28660.rjw@sisk.pl>,
>> "Rafael J. Wysocki" wrote:
>> > On Monday, 10 of March 2008, Suresh Siddha wrote:
>> >> >
>> >> > -	case CPU_DOWN_PREPARE:
>> >> > +	case CPU_DYING:
>> >>
>> >> Don't we need to take care of CPU_DYING_FROZEN aswell?
>> >
>> > Well, I'd say we do.
>>
>> Should I add that to the patch as well then?
>
> Yes please.

Here is v2 with the suggested improvement

-Greg

------------------------
keep rd->online and cpu_online_map in sync

It is possible to allow the root-domain cache of online cpus to
become out of sync with the global cpu_online_map.  This is because we
currently trigger removal of cpus too early in the notifier chain.
Other DOWN_PREPARE handlers may in fact run and reconfigure the
root-domain topology, thereby stomping on our own offline handling.

The end result is that rd->online may become out of sync with
cpu_online_map, which results in potential task misrouting.

So change the offline handling to be more tightly coupled with the
global offline process by triggering on CPU_DYING instead of
CPU_DOWN_PREPARE.

Signed-off-by: Gregory Haskins
Cc: Gautham R Shenoy
Cc: "Siddha, Suresh B"
Cc: Ingo Molnar
Cc: "Rafael J. Wysocki"
Cc: Andrew Morton
---

 kernel/sched.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 52b9867..1cb53fb 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5881,7 +5881,8 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		spin_unlock_irq(&rq->lock);
 		break;
 
-	case CPU_DOWN_PREPARE:
+	case CPU_DYING:
+	case CPU_DYING_FROZEN:
 		/* Update our root-domain */
 		rq = cpu_rq(cpu);
 		spin_lock_irqsave(&rq->lock, flags);