Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760255AbYFLW0S (ORCPT ); Thu, 12 Jun 2008 18:26:18 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754729AbYFLW0G (ORCPT ); Thu, 12 Jun 2008 18:26:06 -0400 Received: from bohort.kerlabs.com ([62.160.40.57]:59952 "EHLO bohort.kerlabs.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754612AbYFLW0F (ORCPT ); Thu, 12 Jun 2008 18:26:05 -0400 Date: Fri, 13 Jun 2008 00:25:58 +0200 From: Louis Rilling To: Joel Becker Cc: linux-kernel@vger.kernel.org, ocfs2-devel@oss.oracle.com Subject: Re: [PATCH 1/3][BUGFIX] configfs: Introduce configfs_dirent_lock Message-ID: <20080612222558.GA4012@localdomain> Reply-To: Louis.Rilling@kerlabs.com References: <20080612133126.335618468@kerlabs.com> <20080612134203.763166823@kerlabs.com> <20080612191348.GE5377@mail.oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline In-Reply-To: <20080612191348.GE5377@mail.oracle.com> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9483 Lines: 219 On Thu, Jun 12, 2008 at 12:13:48PM -0700, Joel Becker wrote: > On Thu, Jun 12, 2008 at 03:31:27PM +0200, Louis Rilling wrote: > > This patch introduces configfs_dirent_lock spinlock to protect configfs_dirent > > traversals against linkage mutations (add/del/move). This will allow > > configfs_detach_prep() to avoid locking i_mutexes. > > I like that you expanded the scope to cover all configfs_dirent > linkages. These are all protected by i_mutex in the current code, but > your rename fix removes that protection. > > > Locking rules for configfs_dirent linkage mutations are the same plus the > > requirement of taking configfs_dirent_lock. For configfs_dirent walking, one can > > either take appropriate i_mutex as before, or take configfs_dirent_lock. > > Nope, you *must* take configfs_dirent_lock now. You've removed > i_mutex protection in the last patch. Oh well. Do you mean because of CONFIGFS_USET_DROPPING being set without i_mutex locked? This is the only mutation (except in the s_links patch) done without i_mutex locked. I thought that actually either other configfs_dirent traversals like readdir() and dir_lseek() would prevent detach_prep() from succeeding because they add dirents before, or are done in places where detach_prep() cannot do harm because new_dirent() fails whenever it sees CONFIGFS_USET_DROPPING: detach_attrs() and detach_groups() must ignore CONFIGFS_USET_DROPPING, depend_prep() is protected since the whole path is locked from configfs root, lookup() can succeed since at worst its result will be invalidated when actually detaching the default groups. The only function for which I can not figure out is configfs_hash_and_remove(), but it is not used. I admit that the case of symlink() needs an extra check to ensure that the target is not about to be removed. The bug was already there though, right? Anyway, if it looks conceptually simpler to use configfs_dirent_lock (probably better a mutex in that case) wherever i_mutex are supposed to protect configfs_dirent traversals, I'm ok with it. > > > > The spinlock could actually be a mutex, but the critical sections are either > > O(1) or should not be too long (default groups walking in last patch). > > I'm wary of someone's reasonably deep groups. Discussing it > yesterday I'd been convinced that a mutex was good to start with. But > given your increased scope of the lock, let's try the spinlock and see > what happens. > > > +extern spinlock_t configfs_dirent_lock; > > Boy I wish this could be static to one file :-( > > > @@ -79,7 +84,9 @@ static struct configfs_dirent *configfs_ > > atomic_set(&sd->s_count, 1); > > INIT_LIST_HEAD(&sd->s_links); > > INIT_LIST_HEAD(&sd->s_children); > > + spin_lock(&configfs_dirent_lock); > > list_add(&sd->s_sibling, &parent_sd->s_children); > > + spin_unlock(&configfs_dirent_lock); > > sd->s_element = element; > > You need to set s_element either under the lock or before taking > the lock. Once you've dropped the lock, someone can reach this dirent > from the parent, and should see s_element. Will do. > > > @@ -173,7 +180,9 @@ static int create_dir(struct config_item > > } else { > > struct configfs_dirent *sd = d->d_fsdata; > > if (sd) { > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_put(sd); > > } > > } > > @@ -224,7 +233,9 @@ int configfs_create_link(struct configfs > > else { > > struct configfs_dirent *sd = dentry->d_fsdata; > > if (sd) { > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_put(sd); > > } > > } > > These strike me as wrong - I would think that one should either > see the old configfs_dirent tree or the new one. That is, one would > take the dirent lock at the beginning of configfs_mkdir() and release it > at the end - so any other code that looks at the dirent tree will see an > atomic change. Here, some other path could see the new dirent after > configfs_make_dirent() but then it disappears when configfs_create() > fails. > If you did that, though, it'd have to be a mutex. And we should not take other i_mutex in populate_groups() and populate_attrs(), otherwise deadlocks could happen. > Now, the only thing that sees this intermediate condition is > configfs itself. Everyone else is protected by i_mutex. I guess it's > OK - but can you comment that fact? i_mutex does *not* protect > traversal of the configfs_dirent tree, but it does prevent the outside > world from seeing the intermediate states. The only intermediate conditions that may hurt one's mind is that an mkdir() (resp. symlink()) racing with an rmdir() can successfuly call make_item()/make_group() (resp. allow_link()) and immediately fail when finalizing with attach_item()/attach_group() (resp. create_link()). So, from userspace and the VFS this seems like "mkdir foo/bar/baz" simply failed because of "rmdir foo", while at the same time from the subsystem point of view this seems like userspace did "mkdir foo/bar/baz; rmdir foo/bar/baz; rmdir foo". As I pointed out in the rename fix, this however can already happen when attach_item()/attach_group() (resp. create_link()) fails because of memory pressure for instance. For the other intermediate conditions, see above about locking rules. > > > @@ -238,7 +249,9 @@ static void remove_dir(struct dentry * d > > struct configfs_dirent * sd; > > > > sd = d->d_fsdata; > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_put(sd); > > if (d->d_inode) > > simple_rmdir(parent->d_inode,d); > > @@ -410,7 +423,9 @@ static void detach_attrs(struct config_i > > list_for_each_entry_safe(sd, tmp, &parent_sd->s_children, s_sibling) { > > if (!sd->s_element || !(sd->s_type & CONFIGFS_NOT_PINNED)) > > continue; > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_drop_dentry(sd, dentry); > > configfs_put(sd); > > } > > @@ -1268,7 +1283,9 @@ static int configfs_dir_close(struct ino > > struct configfs_dirent * cursor = file->private_data; > > > > mutex_lock(&dentry->d_inode->i_mutex); > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&cursor->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > mutex_unlock(&dentry->d_inode->i_mutex); > > > > release_configfs_dirent(cursor); > > @@ -1362,7 +1383,9 @@ static loff_t configfs_dir_lseek(struct > > struct list_head *p; > > loff_t n = file->f_pos - 2; > > > > + spin_lock(&configfs_dirent_lock); > > list_del(&cursor->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > p = sd->s_children.next; > > while (n && p != &sd->s_children) { > > struct configfs_dirent *next; > > @@ -1372,7 +1395,9 @@ static loff_t configfs_dir_lseek(struct > > n--; > > p = p->next; > > } > > + spin_lock(&configfs_dirent_lock); > > list_add_tail(&cursor->s_sibling, p); > > + spin_unlock(&configfs_dirent_lock); > > } > > } > > mutex_unlock(&dentry->d_inode->i_mutex); > > Index: b/fs/configfs/inode.c > > =================================================================== > > --- a/fs/configfs/inode.c 2008-06-12 13:44:27.000000000 +0200 > > +++ b/fs/configfs/inode.c 2008-06-12 13:44:34.000000000 +0200 > > @@ -247,7 +247,9 @@ void configfs_hash_and_remove(struct den > > if (!sd->s_element) > > continue; > > if (!strcmp(configfs_get_name(sd), name)) { > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_drop_dentry(sd, dir); > > configfs_put(sd); > > break; > > Index: b/fs/configfs/symlink.c > > =================================================================== > > --- a/fs/configfs/symlink.c 2008-06-12 13:44:27.000000000 +0200 > > +++ b/fs/configfs/symlink.c 2008-06-12 13:44:34.000000000 +0200 > > @@ -169,7 +169,9 @@ int configfs_unlink(struct inode *dir, s > > parent_item = configfs_get_config_item(dentry->d_parent); > > type = parent_item->ci_type; > > > > + spin_lock(&configfs_dirent_lock); > > list_del_init(&sd->s_sibling); > > + spin_unlock(&configfs_dirent_lock); > > configfs_drop_dentry(sd, dentry->d_parent); > > dput(dentry); > > configfs_put(sd); > > You do the lock,del(sibling),unlock so often, maybe it should be > a helper. Then you can make configfs_dirent_lock static to dir.c. > Well, you use dirent_lock in your s_links patch, so maybe not static. Yes I need it not static, or implement helpers for s_children/s_sibling and for s_links/sl_list. And if the scope should be extended to all cases of configfs_dirent traversals plus whatever would fix symlink()'s target not disappearing, it should definitely not be static. Louis -- Dr Louis Rilling Kerlabs - IRISA Skype: louis.rilling Campus Universitaire de Beaulieu Phone: (+33|0) 2 99 84 71 52 Avenue du General Leclerc Fax: (+33|0) 2 99 84 71 71 35042 Rennes CEDEX - France http://www.kerlabs.com/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/