Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1161358AbXBUUxQ (ORCPT ); Wed, 21 Feb 2007 15:53:16 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S1161175AbXBUUxQ (ORCPT ); Wed, 21 Feb 2007 15:53:16 -0500
Received: from ogre.sisk.pl ([217.79.144.158]:40059 "EHLO ogre.sisk.pl"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1161358AbXBUUxP (ORCPT ); Wed, 21 Feb 2007 15:53:15 -0500
From: "Rafael J. Wysocki"
To: Oleg Nesterov
Subject: Re: freezer problems
Date: Wed, 21 Feb 2007 21:47:54 +0100
User-Agent: KMail/1.9.5
Cc: paulmck@linux.vnet.ibm.com, ego@in.ibm.com, akpm@osdl.org,
    paulmck@us.ibm.com, mingo@elte.hu, vatsa@in.ibm.com,
    dipankar@in.ibm.com, venkatesh.pallipadi@intel.com,
    linux-kernel@vger.kernel.org, Pavel Machek
References: <20070214144031.GA15257@in.ibm.com>
    <200702211913.41883.rjw@sisk.pl> <20070221200314.GA91@tv-sign.ru>
In-Reply-To: <20070221200314.GA91@tv-sign.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200702212147.55756.rjw@sisk.pl>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2593
Lines: 53

On Wednesday, 21 February 2007 21:03, Oleg Nesterov wrote:
> On 02/21, Rafael J. Wysocki wrote:
> >
> > On Wednesday, 21 February 2007 19:14, Paul E. McKenney wrote:
> > > On Tue, Feb 20, 2007 at 07:29:01PM +0100, Rafael J. Wysocki wrote:
> > > > On Tuesday, 20 February 2007 01:32, Rafael J. Wysocki wrote:
> > > > > On Tuesday, 20 February 2007 01:12, Oleg Nesterov wrote:
> > > > > Hm. In the case discussed above we have a task that's right before calling
> > > > > frozen_process(), so we can't thaw it, because it's not frozen. It will be
> > > > > frozen just in a while, but try_to_freeze_tasks() and thaw_tasks() have no
> > > > > way to check this.
> > > > >
> > > > > I think to close this race the refrigerator should check TIF_FREEZE and set
> > > > > PF_FROZEN _and_ reset TIF_FREEZE under a lock
>
> I personally think this is good. Not only this allows us to close the race,
> I think we can do more.
>
> > that would also have to be
> > > > > taken by try_to_freeze_tasks() in the beginning of the error path. This will
> > > > > ensure that all tasks either freeze themselves before the error path in
> > > > > try_to_freeze_tasks() is executed, or remain unfrozen.
>
> How about take this lock in thaw_tasks() instead/too ?

Good point. If we take it in thaw_tasks(), then the tasks that have
TIF_FREEZE set, but haven't entered the refrigerator yet, won't be able to
enter the refrigerator until thaw_tasks() releases the lock ...

>
> Currently we need a separate loop in thaw_tasks() to handle PF_FREEZER_SKIP. This
> means that PF_FREEZER_SKIP is not so generic: thaw_tasks() can't tolerate if such
> a task was woken in between. What if we change thaw_process() to clear TIF_FREEZE ?

... and then we can drop the PF_FREEZER_SKIP-handling loop and change
thaw_process() to clear TIF_FREEZE.

> Note also that we can use task_lock() instead of global refrigerator_lock. This
> means that thaw_process() should take it too, probably this is slowdown, but I
> think not too much because thaw_process() is going to write to p->flags anyway.
> In this case thaw_process() works perfectly as cancel_freezing_and_thaw() and
> can be used to fix exec/coredump in future.

Hm, I think we can try doing this too.

I'll try to prepare a patch later today.

Greetings,
Rafael
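
For illustration only, here is a rough user-space model of the scheme discussed above. It is not the patch mentioned in the message and not kernel code. It assumes a per-task lock: the refrigerator tests TIF_FREEZE, clears it and sets PF_FROZEN in one critical section, and thaw_process() takes the same lock and clears both flags, so it also cancels a freeze request that has not been acted on yet. The names request_freeze(), try_enter_refrigerator(), struct task and TASK_INIT are hypothetical, the flag values are arbitrary, both flags share one word here for brevity, and an atomic_flag spinlock merely stands in for the kernel's task_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in flag bits; the values are arbitrary, not the kernel's. */
#define TIF_FREEZE 0x1u	/* a freeze has been requested for the task */
#define PF_FROZEN  0x2u	/* the task sits in the refrigerator */

/* Hypothetical model of a task: just a lock and one flags word. */
struct task {
	atomic_flag lock;	/* stands in for task_lock(p) */
	unsigned int flags;
};

#define TASK_INIT { ATOMIC_FLAG_INIT, 0u }

static void task_lock(struct task *p)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;
}

static void task_unlock(struct task *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Freezer side: ask the task to freeze itself (hypothetical helper). */
static void request_freeze(struct task *p)
{
	task_lock(p);
	p->flags |= TIF_FREEZE;
	task_unlock(p);
}

/*
 * Task side: the check of TIF_FREEZE, its reset and the setting of
 * PF_FROZEN all happen under the lock, so there is no window in which
 * the task is "about to freeze" but invisible to the thawing side.
 * Returns true if the task froze.
 */
static bool try_enter_refrigerator(struct task *p)
{
	bool frozen = false;

	task_lock(p);
	if (p->flags & TIF_FREEZE) {
		p->flags &= ~TIF_FREEZE;
		p->flags |= PF_FROZEN;
		frozen = true;
	}
	task_unlock(p);
	return frozen;
}

/*
 * Thawing side: clearing TIF_FREEZE as well as PF_FROZEN under the same
 * lock cancels a pending request, so a task that has not reached the
 * refrigerator yet stays unfrozen. Returns true if the task was frozen.
 */
static bool thaw_process(struct task *p)
{
	bool was_frozen;

	task_lock(p);
	was_frozen = (p->flags & PF_FROZEN) != 0;
	p->flags &= ~(TIF_FREEZE | PF_FROZEN);
	assert((p->flags & (TIF_FREEZE | PF_FROZEN)) == 0);
	task_unlock(p);
	return was_frozen;
}
```

In this model, a thaw_process() call that arrives between request_freeze() and try_enter_refrigerator() returns false and leaves the task unfrozen, which is the "cancel freezing and thaw" behaviour suggested in the quoted text. Whether the real thaw_process() can afford the extra lock is exactly the open question raised there.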