Date: Sat, 6 Jan 2007 19:30:35 +0300
From: Oleg Nesterov
To: Srivatsa Vaddagiri
Cc: Andrew Morton, David Howells, Christoph Hellwig, Ingo Molnar,
    Linus Torvalds, linux-kernel@vger.kernel.org, Gautham shenoy
Subject: Re: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update
Message-ID: <20070106163035.GA2948@tv-sign.ru>
References: <20061217223416.GA6872@tv-sign.ru>
    <20061218162701.a3b5bfda.akpm@osdl.org>
    <20061219004319.GA821@tv-sign.ru>
    <20070104113214.GA30377@in.ibm.com>
    <20070104142936.GA179@tv-sign.ru>
    <20070104091850.c1feee76.akpm@osdl.org>
    <20070106151036.GA951@tv-sign.ru>
    <20070106154506.GC24274@in.ibm.com>
In-Reply-To: <20070106154506.GC24274@in.ibm.com>

On 01/06, Srivatsa Vaddagiri wrote:
>
> On Sat, Jan 06, 2007 at 06:10:36PM +0300, Oleg Nesterov wrote:
> > Increment hotplug_sequence earlier, under CPU_DOWN_PREPARE. We can't
> > miss the event, the task running flush_workqueue() will be re-scheduled
> > at least once before CPU actually disappears from cpu_online_map.
>
> Eww ..what happens if flush_workqueue() starts after CPU_DOWN_PREPARE?
                                          ^^^^^
Stupid me. Thanks.

> CPU_DOWN_PREPARE(8)
>     hotplug_sequence++ = 10
>
>                             flush_workqueue()
>                                 sequence = 10
>                                 flush cpus 1 ....7
>
> CPU_DEAD(8)
>     take_over_work(8->1)
>
>                             return not flushing dead cpu8 (=BUG)

I'll try to do something else tomorrow. Do you see a simple solution?

The current usage of workqueue_mutex (I mean stable kernel) is broken
and deadlockable. We really need to change it.

Oleg.
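
For readers following the trace above, here is a small userspace model of
that interleaving. It is only a sketch, not kernel code: the arrays, the
helper names other than those quoted in the trace, the starting counter
value of 9, and the one-item-per-CPU setup are all assumptions made for
illustration. It replays the steps in order and shows that the flusher's
sequence comparison sees no change, so the work moved off cpu8 is missed.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 9                 /* model CPUs 1..8; index 0 is unused */

static int  hotplug_sequence;     /* bumped by the hotplug notifier */
static bool cpu_online[NR_CPUS];  /* stand-in for cpu_online_map */
static int  pending[NR_CPUS];     /* hypothetical: queued items per CPU */

/* Proposed change under discussion: bump the counter at CPU_DOWN_PREPARE. */
static void cpu_down_prepare(int cpu)
{
	assert(cpu > 0 && cpu < NR_CPUS);
	hotplug_sequence++;
}

/* CPU_DEAD: the CPU goes away and its work migrates, as take_over_work(8->1). */
static void cpu_dead(int cpu, int target)
{
	assert(cpu_online[target]);
	cpu_online[cpu] = false;
	pending[target] += pending[cpu];
	pending[cpu] = 0;
}

/* Flushing an online CPU drains whatever is queued on it at that moment. */
static void flush_cpu(int cpu)
{
	if (cpu_online[cpu])
		pending[cpu] = 0;
}

/*
 * Replays the interleaving from the trace. Returns the number of items still
 * pending when the flusher believes it is done, or -1 if the flusher noticed
 * the hotplug event and would retry.
 */
static int simulate_race(void)
{
	int cpu, sequence, missed = 0;

	hotplug_sequence = 9;
	for (cpu = 1; cpu < NR_CPUS; cpu++) {
		cpu_online[cpu] = true;
		pending[cpu] = 1;
	}

	cpu_down_prepare(8);              /* hotplug_sequence becomes 10 */

	sequence = hotplug_sequence;      /* flusher starts late and reads 10 */
	for (cpu = 1; cpu <= 7; cpu++)
		flush_cpu(cpu);

	cpu_dead(8, 1);                   /* cpu8's item lands on flushed cpu1 */

	flush_cpu(8);                     /* no-op: cpu8 is already offline */

	if (sequence != hotplug_sequence)
		return -1;                /* retry path; not taken in this replay */

	for (cpu = 1; cpu < NR_CPUS; cpu++)
		missed += pending[cpu];
	return missed;
}
```

In this model simulate_race() returns 1: the snapshot and the counter are
both 10 at the final check, so no retry happens, and the item that was on
cpu8 is now sitting unflushed on cpu1.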