Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758160AbYG2Ms2 (ORCPT ); Tue, 29 Jul 2008 08:48:28 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751717AbYG2MsV (ORCPT ); Tue, 29 Jul 2008 08:48:21 -0400
Received: from x346.tv-sign.ru ([89.108.83.215]:42752 "EHLO mail.screens.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751398AbYG2MsU (ORCPT ); Tue, 29 Jul 2008 08:48:20 -0400
Date: Tue, 29 Jul 2008 16:52:01 +0400
From: Oleg Nesterov
To: Dmitry Adamushko
Cc: linux-kernel@vger.kernel.org, Ingo Molnar
Subject: Re: [patch, minor] workqueue: consistently use 'err' in __create_workqueue_key()
Message-ID: <20080729125201.GC177@tv-sign.ru>
References: <1217277694.20627.9.camel@earth> <20080729110250.GA177@tv-sign.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3190
Lines: 76

On 07/29, Dmitry Adamushko wrote:
>
> 2008/7/29 Oleg Nesterov :
> > On 07/28, Dmitry Adamushko wrote:
> >>
> >> I guess error handling is a bit illogical in __create_workqueue_key()
> >
> > Please see below,
> >
> >>         for_each_possible_cpu(cpu) {
> >>                 cwq = init_cpu_workqueue(wq, cpu);
> >> -               if (err || !cpu_online(cpu))
> >> +               if (!cpu_online(cpu))
> >>                         continue;
> >>                 err = create_workqueue_thread(cwq, cpu);
> >> +               if (err)
> >> +                       break;
> >
> > This was done on purpose. The code above does init_cpu_workqueue(cpu)
> > for each possible cpu, even if we fail to create cwq->thread for some
> > cpu. This way destroy_workqueue() (called below) shouldn't worry about
> > the partially initialized workqueues.
> >
> > The patch above should work, but it assumes that destroy_workqueue()
> > must do nothing with cwq if cwq->thread == NULL, this is not very
> > robust.
>
> Yes, I saw this test and that's why I decided that destroy_workqueue()
> is able (designed) to deal with partially-initialized objects.

No, no. cwq->thread == NULL just means that it has no ->thread and nothing
more. It does not mean cwq was not initialized, see below.

> Note, for the race scenario with cpu-hotplug (which I've overlooked
> indeed) which you describe below, we also seem to depend on the same
> "cwq->thread == NULL" test in cleanup_workqueue_thread() as follows:
>
> assume, cpu_down(cpu) -> CPU_POST_DEAD -> cleanup_workqueue_thread()
> gets called for a partially initialized workqueue for 'cpu' for which
> create_workqueue_thread() has previously failed in
> create_worqueue_key().

Well, it _is_ initialized, but yes, cwq->thread can be NULL.

> >
> > And, more importantly. Let's suppose __create_workqueue_key() does
> > "break" and drops cpu_add_remove_lock. Then we race with cpu-hotplug
> > which can hit the uninitialized cwq. This is fixable, but needs other
> > complication.
>
> And I'd say this behavior (of having a partially-created object
> visible to the outside world) is not that robust. e.g. the
> aforementioned race would be eliminated if we place a wq on the global
> list only when it's been successfully initialized.

Note that start_workqueue_thread() and cleanup_workqueue_thread() have to
check cwq->thread != NULL anyway: suppose that CPU_UP_PREPARE fails.

Yes, we can change __create_workqueue_key() to check err == 0 before
list_add(), but this just adds more checks without any gain.

Note also that it is in fact better to do start_workqueue_thread() even
if create_workqueue_thread() fails. This doesn't matter with the current
implementation, but start_workqueue_thread() ensures that cwq->thread
can be kthread_stop()'ed, and start_workqueue_thread() can be changed
so that it can fail even if kthread_create() succeeds.

Oleg.
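
For illustration only, here is a small user-space sketch of the pattern
defended above: every per-cpu entry is initialized even after a thread
creation failure, so the cleanup path only has to test "has a thread or
not". This is not the kernel code. All toy_* names, NR_CPUS and the
fail_cpu parameter are hypothetical stand-ins, and locking and cpu-hotplug
are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical number of possible cpus */

/* Toy stand-in for the per-cpu workqueue structure. */
struct toy_cwq {
	bool initialized;	/* set by toy_init_cwq() */
	bool has_thread;	/* models cwq->thread != NULL */
};

struct toy_wq {
	struct toy_cwq cwq[NR_CPUS];
};

/* Models init_cpu_workqueue(): always succeeds, no thread yet. */
static void toy_init_cwq(struct toy_wq *wq, int cpu)
{
	wq->cwq[cpu].initialized = true;
	wq->cwq[cpu].has_thread = false;
}

/* Models create_workqueue_thread(); fails for fail_cpu (assumption). */
static int toy_create_thread(struct toy_cwq *cwq, int cpu, int fail_cpu)
{
	if (cpu == fail_cpu)
		return -1;
	cwq->has_thread = true;
	return 0;
}

/*
 * Keep iterating after an error: later cpus get no thread, but every
 * entry is still initialized ("continue", not "break").
 */
static int toy_create_all(struct toy_wq *wq, const bool *online, int fail_cpu)
{
	int err = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		toy_init_cwq(wq, cpu);
		if (err || !online[cpu])
			continue;
		err = toy_create_thread(&wq->cwq[cpu], cpu, fail_cpu);
	}
	return err;
}

/* Cleanup may assume "initialized"; it only tests for a thread. */
static int toy_destroy(struct toy_wq *wq)
{
	int stopped = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct toy_cwq *cwq = &wq->cwq[cpu];

		assert(cwq->initialized);
		if (cwq->has_thread) {
			cwq->has_thread = false;
			stopped++;
		}
	}
	return stopped;
}
```

With four online cpus and a failure injected on cpu 1, toy_create_all()
returns an error, all four entries are initialized, and toy_destroy()
stops only the one thread that was actually created.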