Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964825Ab1FWSBy (ORCPT ); Thu, 23 Jun 2011 14:01:54 -0400 Received: from mail.windriver.com ([147.11.1.11]:34918 "EHLO mail.windriver.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933545Ab1FWRhY (ORCPT ); Thu, 23 Jun 2011 13:37:24 -0400 From: Paul Gortmaker To: stable@kernel.org, linux-kernel@vger.kernel.org Cc: stable-review@kernel.org, Matt Evans , Benjamin Herrenschmidt , Paul Gortmaker Subject: [34-longterm 161/247] powerpc/kexec: Fix orphaned offline CPUs across kexec Date: Thu, 23 Jun 2011 13:33:49 -0400 Message-Id: <1308850515-15242-102-git-send-email-paul.gortmaker@windriver.com> X-Mailer: git-send-email 1.7.4.4 In-Reply-To: <1308850515-15242-1-git-send-email-paul.gortmaker@windriver.com> References: <1308849690-14530-1-git-send-email-paul.gortmaker@windriver.com> <1308850515-15242-1-git-send-email-paul.gortmaker@windriver.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3098 Lines: 86 From: Matt Evans ------------------- This is a commit scheduled for the next v2.6.34 longterm release. If you see a problem with using this for longterm, please comment. ------------------- commit e8e5c2155b0035b6e04f29be67f6444bc914005b upstream. When CPU hotplug is used, some CPUs may be offline at the time a kexec is performed. The subsequent kernel may expect these CPUs to be already running, and will declare them stuck. On pseries, there's also a soft-offline (cede) state that CPUs may be in; this can also cause problems as the kexeced kernel may ask RTAS if they're online -- and RTAS would say they are. The CPU will either appear stuck, or will cause a crash as we replace its cede loop beneath it. This patch kicks each present offline CPU awake before the kexec, so that none are forever lost to these assumptions in the subsequent kernel. Now, the behaviour is that all available CPUs that were offlined are now online & usable after the kexec. This mimics the behaviour of a full reboot (on which all CPUs will be restarted). Signed-off-by: Matt Evans Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Paul Gortmaker --- arch/powerpc/kernel/machine_kexec_64.c | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c index 040bd1d..f61943d 100644 --- a/arch/powerpc/kernel/machine_kexec_64.c +++ b/arch/powerpc/kernel/machine_kexec_64.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include @@ -169,10 +170,34 @@ static void kexec_smp_down(void *arg) /* NOTREACHED */ } +/* + * We need to make sure each present CPU is online. The next kernel will scan + * the device tree and assume primary threads are online and query secondary + * threads via RTAS to online them if required. If we don't online primary + * threads, they will be stuck. However, we also online secondary threads as we + * may be using 'cede offline'. In this case RTAS doesn't see the secondary + * threads as offline -- and again, these CPUs will be stuck. + * + * So, we online all CPUs that should be running, including secondary threads. + */ +static void wake_offline_cpus(void) +{ + int cpu = 0; + + for_each_present_cpu(cpu) { + if (!cpu_online(cpu)) { + printk(KERN_INFO "kexec: Waking offline cpu %d.\n", + cpu); + cpu_up(cpu); + } + } +} + static void kexec_prepare_cpus(void) { int my_cpu, i, notified=-1; + wake_offline_cpus(); smp_call_function(kexec_smp_down, NULL, /* wait */0); my_cpu = get_cpu(); -- 1.7.4.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/