Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754100AbaAGURS (ORCPT ); Tue, 7 Jan 2014 15:17:18 -0500 Received: from mx1.redhat.com ([209.132.183.28]:19886 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753737AbaAGURL (ORCPT ); Tue, 7 Jan 2014 15:17:11 -0500 Date: Tue, 7 Jan 2014 15:16:41 -0500 (EST) From: Mikulas Patocka X-X-Sender: mpatocka@file01.intranet.prod.int.rdu2.redhat.com To: Mike Snitzer cc: Linus Torvalds , Greg Kroah-Hartman , Bart Van Assche , Jeff Mahoney , Linux Kernel Mailing List , device-mapper development , Thomas Gleixner , Paul McKenney , Ingo Molnar Subject: Re: kobject: provide kobject_put_wait to fix module unload race In-Reply-To: <20140107191910.GA8884@redhat.com> Message-ID: References: <52C98BCC.9040900@acm.org> <20140105182640.GA2522@kroah.com> <20140106213111.GA2536@redhat.com> <20140107191910.GA8884@redhat.com> User-Agent: Alpine 2.02 (LRH 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 7 Jan 2014, Mike Snitzer wrote: > On Tue, Jan 07 2014 at 1:00pm -0500, > Mikulas Patocka wrote: > > > > > > > On Tue, 7 Jan 2014, Linus Torvalds wrote: > > > > > This looks completely broken to me. You do a "kobject_put()" and then > > > after you've dropped that last use, you wait for the completion of > > > something that may already have been free'd. > > > > > > Wtf? Am I missing something? > > > > > > Linus > > > > It is correct. The release method dm_kobject_release doesn't free the > > kobject. It just signals the completion and returns. > > > > This is the sequence of operations: > > * call kobject_put > > * wait until all users stop using the kobject, when it happens the release > > method is called > > * the release method signals the completion and returns > > * the unload code that waits on the completion continues > > * the unload code frees the mapped_device structure that contains the > > kobject > > > > Using kobject this way avoids the module unload race that was mentioned at > > the beginning of this thread. > > I've staged your patch in linux-next for 3.14, see: > http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=af7b1e5c767fc895788c971c8f4686402ac8344f Looking at this patch, I realize that it is buggy too. If module unload happens at this point (after the completion is signaled, but before the release function returns), it crashes. static void dm_kobject_release(struct kobject *kobj) { complete(dm_get_completion_from_kobject(kobj)); >========== module unload here ===============< } The patch that I sent initially in this thread doesn't have this bug because the completion is signaled in non-module code. That goes back to my initial claim - it is impossible to use the kobject interface correctly! But if Greg doesn't want a patch that fixes the kobject interface, I don't really know what to do about it. Maybe another possibility - maintain a list of all kobjects and walk them in the generic module unload code to check if their release method ponits to the module that is being unloaded? Greg, would you accept a patch like this? Mikulas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/