Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1759402AbZAGPbO (ORCPT );
        Wed, 7 Jan 2009 10:31:14 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1752244AbZAGPa4 (ORCPT );
        Wed, 7 Jan 2009 10:30:56 -0500
Received: from mail-bw0-f21.google.com ([209.85.218.21]:39045 "EHLO
        mail-bw0-f21.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751845AbZAGPaz (ORCPT );
        Wed, 7 Jan 2009 10:30:55 -0500
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=message-id:date:from:to:subject:cc:in-reply-to:mime-version
         :content-type:content-transfer-encoding:content-disposition
         :references;
        b=YCBObb6LS9jERaSaTx28LIwgaajbcNAENXoC+eUOae4HbRp7mrZR+pdzdBWwvefmIw
         NAzNPtJuME6d7f3XETJ6cWu0ncLD/QKyvYAKMO8Nv5L9kTAVHXPELgmKEoWJ3MwZ/3XZ
         ryMjTu/a0o+/zbugoh3YbAMKs3bB6aUqV08Ls=
Message-ID:
Date: Wed, 7 Jan 2009 16:30:52 +0100
From: "=?ISO-8859-1?Q?Fr=E9d=E9ric_Weisbecker?="
To: "Heiko Carstens"
Subject: Re: [PATCH] stop_machine/cpu hotplug: fix disable_nonboot_cpus
Cc: "Linus Torvalds" , "Andrew Morton" , "Rusty Russell" ,
        "Pekka Enberg" , "Justin P. Mattock" ,
        linux-kernel@vger.kernel.org, "Jeff Chua"
In-Reply-To: <20090107151946.GA25560@osiris.boeblingen.de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <4963F368.7080909@gmail.com>
        <84144f020901062248j5d406656wb21130d914c7749d@mail.gmail.com>
        <84144f020901070030k6fb888f6n84255078e4885d28@mail.gmail.com>
        <20090107091534.GA4633@osiris.boeblingen.de.ibm.com>
        <1231319946.14720.7.camel@penberg-laptop>
        <20090107122728.GB4633@osiris.boeblingen.de.ibm.com>
        <20090107151946.GA25560@osiris.boeblingen.de.ibm.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3383
Lines: 81

2009/1/7 Heiko Carstens :
> From: Heiko Carstens
>
> disable_nonboot_cpus calls _cpu_down.
> But _cpu_down requires that the
> caller already created the stop_machine workqueue (like cpu_down does).
> Otherwise a call to stop_machine will lead to accesses to random memory
> regions.
>
> When introducing this new interface (9ea09af3bd3090e8349ca2899ca2011bd94cda85
> "stop_machine: introduce stop_machine_create/destroy") I missed the second
> call site of _cpu_down.
> So add the missing stop_machine_create/destroy calls to disable_nonboot_cpus
> as well.
>
> Fixes suspend-to-ram/disk and also this bug:
>
> [ 286.547348] BUG: unable to handle kernel paging request at 6b6b6b6b
> [ 286.548940] IP: [] __stop_machine+0x88/0xe3
> [ 286.550598] Oops: 0002 [#1] SMP
> [ 286.560580] Pid: 3273, comm: halt Not tainted (2.6.28-06127-g238c6d5
> [ 286.560580] EIP: is at __stop_machine+0x88/0xe3
> [ 286.560580] Process halt (pid: 3273, ti=f1a28000 task=f4530f30
> [ 286.560580] Call Trace:
> [ 286.560580] [] ? _cpu_down+0x10f/0x234
> [ 286.560580] [] ? disable_nonboot_cpus+0x58/0xdc
> [ 286.560580] [] ? kernel_poweroff+0x22/0x39
> [ 286.560580] [] ? sys_reboot+0xde/0x14c
> [ 286.560580] [] ? complete_signal+0x179/0x191
> [ 286.560580] [] ? send_signal+0x1cc/0x1e1
> [ 286.560580] [] ? _spin_unlock_irqrestore+0x2d/0x3c
> [ 286.560580] [] ? group_send_signal_info+0x58/0x61
> [ 286.560580] [] ? kill_pid_info+0x30/0x3a
> [ 286.560580] [] ? sys_kill+0x75/0x13a
> [ 286.560580] [] ? mntput_no_expire+0x1f/0x101
> [ 286.560580] [] ? dput+0x1e/0x105
> [ 286.560580] [] ? __fput+0x150/0x158
> [ 286.560580] [] ? audit_syscall_entry+0x137/0x159
> [ 286.560580] [] ? sysenter_do_call+0x12/0x34
>
> Reported-by: "Justin P. Mattock"
> Reviewed-by: Pekka Enberg
> Signed-off-by: Heiko Carstens
> ---
>  kernel/cpu.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/kernel/cpu.c
> ===================================================================
> --- linux-2.6.orig/kernel/cpu.c
> +++ linux-2.6/kernel/cpu.c
> @@ -379,8 +379,11 @@ static cpumask_var_t frozen_cpus;
>
>  int disable_nonboot_cpus(void)
>  {
> -	int cpu, first_cpu, error = 0;
> +	int cpu, first_cpu, error;
>
> +	error = stop_machine_create();
> +	if (error)
> +		return error;
>  	cpu_maps_update_begin();
>  	first_cpu = cpumask_first(cpu_online_mask);
>  	/* We take down all of the non-boot CPUs in one shot to avoid races
> @@ -409,6 +412,7 @@ int disable_nonboot_cpus(void)
>  		printk(KERN_ERR "Non-boot CPUs are not disabled\n");
>  	}
>  	cpu_maps_update_done();
> +	stop_machine_destroy();
>  	return error;
>  }
>

That should explain why suspend to disk failed on my box yesterday at the
processors stage...

Thanks!