Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932141AbVISD2s (ORCPT ); Sun, 18 Sep 2005 23:28:48 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932147AbVISD2s (ORCPT ); Sun, 18 Sep 2005 23:28:48 -0400 Received: from b3162.static.pacific.net.au ([203.143.238.98]:19686 "EHLO cunningham.myip.net.au") by vger.kernel.org with ESMTP id S932141AbVISD2r (ORCPT ); Sun, 18 Sep 2005 23:28:47 -0400 Subject: PATCH: Fix race in cpu_down (hotplug cpu) From: Nigel Cunningham Reply-To: ncunningham@cyclades.com To: Andrew Morton , Linus Torvalds , Zwane Mwaikambo Cc: Linux Kernel Mailing List Content-Type: text/plain Organization: Cyclades Message-Id: <1127100518.9696.62.camel@localhost> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6-1mdk Date: Mon, 19 Sep 2005 13:28:38 +1000 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1486 Lines: 42 Hmm... managed to miss a word at the end of the first para and thus not make sense. Let's try again. ---------- Hi. There is a race condition in taking down a cpu (kernel/cpu.c::cpu_down). A cpu can already be idling when we clear its online flag, and we do not force the idle task to reschedule. This results in __cpu_die timing out. A simple fix is to force the idle task on the cpu going down to reschedule. Without the patch below, Suspend2 get into a deadlock at resume time when this issue occurs. I could not complete 20 cycles without seeing the issue. With the patch below, I have completed 75 cycles on the trot without problems. Please apply. Signed-off-by: Nigel Cunningham diff -ruNp 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 9910-hotplug-cpu-race.patch-new/kernel/cpu.c --- 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 2005-08-29 10:29:58.000000000 +1000 +++ 9910-hotplug-cpu-race.patch-new/kernel/cpu.c 2005-09-19 12:15:08.000000000 +1000 @@ -126,6 +126,9 @@ int cpu_down(unsigned int cpu) while (!idle_cpu(cpu)) yield(); + /* CPU may have idled before we set its offline flag. */ + set_tsk_need_resched(idle_task(cpu)); + /* This actually kills the CPU. */ __cpu_die(cpu); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/