Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759974AbXK0WoP (ORCPT ); Tue, 27 Nov 2007 17:44:15 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751930AbXK0WoA (ORCPT ); Tue, 27 Nov 2007 17:44:00 -0500 Received: from ogre.sisk.pl ([217.79.144.158]:46057 "EHLO ogre.sisk.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752687AbXK0Wn7 (ORCPT ); Tue, 27 Nov 2007 17:43:59 -0500 From: "Rafael J. Wysocki" To: Kyle Moffett Subject: Re: freeze vs freezer Date: Wed, 28 Nov 2007 00:01:59 +0100 User-Agent: KMail/1.9.6 (enterprise 20070904.708012) Cc: Matthew Garrett , David Chinner , Jeremy Fitzhardinge , xfs-masters@oss.sgi.com, Linux Kernel Mailing List , Len Brown References: <4744FD87.7010301@goop.org> <200711271840.24825.rjw@sisk.pl> <8B00F353-983F-40E7-931B-EA73CCD32F0A@mac.com> In-Reply-To: <8B00F353-983F-40E7-931B-EA73CCD32F0A@mac.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200711280002.00755.rjw@sisk.pl> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3077 Lines: 66 On Tuesday, 27 of November 2007, Kyle Moffett wrote: > On Nov 27, 2007, at 12:40:24, Rafael J. Wysocki wrote: > > On Tuesday, 27 of November 2007, Matthew Garrett wrote: > >> On Mon, Nov 26, 2007 at 10:53:34PM +0100, Rafael J. Wysocki wrote: > >>> On Monday, 26 of November 2007, David Chinner wrote: > >>>> So how do you handle threads that are blocked on I/O or a lock > >>>> during the system freeze process, then? > >>> > >>> We wait until they can continue. > >> > >> So if I have a process blocked on an unavilable NFS mount, I can't > >> suspend? > > > > That's correct, you can't. > > > > [And I know what you're going to say. ;-)] > > Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE" > instead of a zero preempt_count()? Really what we should do is just > iterate over all of the actual physical devices and tell each one > "Block new IO requests preemptably, finish pending DMA, put the > hardware in low-power mode, and prepare for suspend/hibernate". As > long as each driver knows how to do those simple things we can have > an entirely consistent kernel image for both suspend and for > hibernation. Well, this is more-or-less how we all imagine that should be done eventually. The main problem is how to implement it without causing too much breakage. Also, there are some dirty details that need to be taken into consideration. > When all tasks are preemptable we can very trivially rely on the > drivers to enforce the "Stop new IO submission" with a dirt-simple > semaphore or waitqueue. The sleep itself will be > TASK_UNINTERRUPTIBLE, but it will be done from a preemptible context. If there are any drivers that make their devices available via mmap(), that won't be sufficient. Probably, we'll need a two iterations over devices to handle all corner cases. Moreover, for hibernation we need to resume at least some devices in order to save the image, which shouldn't result in unblocking the waiting tasks. > That way the system suspend time is the sum of the suspend times of > the devices on the system, and the suspend time of any given device > is the sum of its maximum non-preemptible critical section and the > time to flush all of its remaining pending DMA/etc. This is almost > completely independent of the load-level of the machine, and it does > not depend on things like NFS filesystems. The one gotcha is that it > does not flush dirty filesystem pages to disk first, although that > could be fixed with a few VFS and blockdev hooks which hierarchically > flush and "freeze" block devices and filesystems before actually > disabling devices much the way that device-mapper can pause a device > to take a snapshot and end up with a clean journal on the filesystem > afterwards. Yes, I generally agree. Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/