Date: Sat, 5 May 2012 11:24:55 -0400 (EDT)
From: Alan Stern
To: Peter Zijlstra
Cc: Nishanth Aravamudan, "Srivatsa S. Bhat"
Subject: Re: [PATCH v2 0/7] CPU hotplug, cpusets: Fix issues with cpusets handling upon CPU hotplug
In-Reply-To: <1336167852.6509.90.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 4 May 2012, Peter Zijlstra wrote:

> That said, the whole suspend/resume 'problem' does seem worth fixing and
> is a very special case where we absolutely know we're going to get back
> in the state we are in and userspace isn't actually running. So ideally
> we'd go with the bhat's patch that skips the sched_domain rebuilds
> entirely +- some bug-fixes ;-).

Just as an interesting side comment...  The USB subsystem faced this
same problem years ago.  The question was: When a USB device
(especially a mass-storage device) is unplugged and then reconnected,
is the new device instance the same as the old one?  Linus stepped in
and firmly assured us that it was not.

That's very much like the situation you're describing: If CPU 4 is
hot-unplugged and then a new CPU appears in slot 4, is it the same CPU
as before (and does it therefore belong to the same cpusets as before)?

But this led to problems during suspend, because not all systems could
maintain bus connectivity while the system was asleep, and almost none
can during hibernation.
As a result, mounted filesystems would become unavailable after resume
even though the USB storage device had been plugged in the whole time.
To the kernel, it appeared that the device had been unplugged during
suspend and then replugged during resume.

We ended up adopting a special-purpose solution just to handle that
case.  It's described in Documentation/usb/persist.txt if you want the
full details.  In brief, when the system resumes it checks to see if a
device appears to be present at the same port where a device used to
be.  If it is, and if its descriptors match the values remembered for
the former device, then we accept the new device as being the same as
the old one, even though the hardware indicates that the connection was
not maintained during the system sleep.

From my point of view, this suggests that CPU hot-unplug is not quite
the right tool to use during suspend.  The CPU doesn't actually go
away; it merely becomes unusable for a while.  In other words, this
approach applies an incorrect abstraction.

What's really needed is something a little different: a way to avoid
running any tasks on that CPU while not removing it from the system.
If this means some tasks can no longer run on any CPUs, so be it --
this happens only during suspend, after all.  Then during resume, when
the CPU is brought back up, tasks are allowed to run on it again.

Alan Stern
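
For readers who want the resume-time check from the message in concrete
form, here is a minimal sketch of the decision it describes: a device
found at a port after resume is accepted as the old one only if a device
was there before and the remembered descriptor values match.  This is
not the kernel's implementation.  The struct, its fields, and both
function names are hypothetical, chosen only for illustration, and the
set of descriptor fields compared is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of what was seen at one port; not a kernel type. */
struct port_snapshot {
    bool     present;         /* was a device detected at this port? */
    uint16_t vendor_id;       /* assumed subset of descriptor values */
    uint16_t product_id;
    uint16_t device_release;
    char     serial[32];      /* empty string if none was reported */
};

/* Compare the remembered descriptor values field by field. */
static bool descriptors_match(const struct port_snapshot *a,
                              const struct port_snapshot *b)
{
    return a->vendor_id == b->vendor_id &&
           a->product_id == b->product_id &&
           a->device_release == b->device_release &&
           strncmp(a->serial, b->serial, sizeof(a->serial)) == 0;
}

/*
 * Decide whether the device seen after resume should be treated as the
 * one that was at the same port before suspend.  Both snapshots are
 * assumed to describe the same port.
 */
static bool treat_as_same_device(const struct port_snapshot *before,
                                 const struct port_snapshot *after)
{
    assert(before != NULL && after != NULL);

    /* Nothing to persist if either side had no device at the port. */
    if (!before->present || !after->present)
        return false;

    return descriptors_match(before, after);
}
```

Note that the check is purely a comparison of remembered values; it
cannot prove the device was never swapped while the system slept, which
is why the message calls it a special-purpose solution.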