Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756991Ab0F3SQT (ORCPT ); Wed, 30 Jun 2010 14:16:19 -0400 Received: from mail-bw0-f46.google.com ([209.85.214.46]:64046 "EHLO mail-bw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753041Ab0F3SQR convert rfc822-to-8bit (ORCPT ); Wed, 30 Jun 2010 14:16:17 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=wvuZzyfWZ35frxPzl6/54a52AiF/C2AluTOTRRoO25vYwq0NAVyQ1R5MfH2TgpPL4s sqNn+jXE9gAgPb9tYLZoTGXLPzRAjnl1GXUOqH+/XK2b5KZMXS0vZheIZbzx1vIW2lPd Ur5PgSU8LWnR5NB22EJY0u3zYHJjhvHxMOy84= MIME-Version: 1.0 In-Reply-To: <87mxuc91r2.fsf@linux.vnet.ibm.com> References: <20100608004102.GQ31073@ZenIV.linux.org.uk> <87pqzrexon.fsf@linux.vnet.ibm.com> <871vbwe6lp.fsf@linux.vnet.ibm.com> <87mxuc91r2.fsf@linux.vnet.ibm.com> Date: Wed, 30 Jun 2010 12:16:15 -0600 X-Google-Sender-Auth: Gj-8yBSascM7iuq1zuV6Cb_KTxY Message-ID: Subject: Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 From: Latchesar Ionkov To: "Aneesh Kumar K. V" Cc: Linus Torvalds , Eric Van Hensbergen , V9FS Developers , Al Viro , linux-kernel Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 13043 Lines: 335 I think that you need to use the s_vfs_rename_mutex in the super_block struct instead of introducing a new rename_lock in the v9fs session. Thanks, Lucho On Wed, Jun 30, 2010 at 5:52 AM, Aneesh Kumar K. V wrote: > On Tue, 29 Jun 2010 13:38:57 -0700, Linus Torvalds wrote: >> On Tue, Jun 29, 2010 at 1:18 PM, Eric Van Hensbergen wrote: >> > >> > Does this approach satisfy your concerns? ?We've been going over >> > several different options on how to proceed, but this seems to be the >> > best option. >> >> Using a p9fs rename lock does seem to be a reasonable option. >> >> That said, the patch itself seems to not be valid. You drop the lock >> too early in v9fs_fid_lookup() as far as I can tell. You then re-take >> it before doing that whole >> >> ? ?for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) >> >> loop with it held again, but that's totally bogus - because you >> dropped the lock, your 'n-1' count has absolutely no meaning any more, >> since a cross-directory rename might have changed the depth of the >> thing in the meantime. >> >> And if the depth changes, you aren't at all guaranteed to stay on the >> same p9fs filesystem, so now you're doing that d_parent access without >> the proper locking (sure: you hold the rename lock, but it's not at >> all guaranteed that the rename lock is the _right_ lock any more as >> you traverse the list down!) >> >> But I didn't look deeply at the patch. There might be some reason why >> it's safe (I doubt it, though), and there might be other places where >> you do the same. But in general, dropping and re-taking a lock is a >> bad idea. If you dropped the lock, you can't depend on anything you >> found out while having held it. > > You are correct. we cannot drop the rename lock in between. I also found > another issue in that we are using dentry->d_name.name directly. That > would imply we need to hold the rename_lock even during the > client_walk. ?How about the patch below ?. I updated the patch to hold > rename_lock during multiple path walk. Also the rename path is updated > to hold the lock during p9_client_rename operations. > > commit 79f6f20dbb70ad35db37b674957c95de20662a75 > Author: Aneesh Kumar K.V > Date: ? Wed Jun 30 15:37:58 2010 +0530 > > ? ?fs/9p: Prevent parallel rename when doing fid_lookup > > ? ?During fid lookup we need to make sure that the dentry->d_parent doesn't > ? ?change so that we can safely walk the parent dentries. To ensure that > ? ?we need to prevent cross directory rename during fid_lookup. Add a > ? ?per superblock rename_lock rwlock to prevent parallel fid lookup and rename. > > ? ?Signed-off-by: Aneesh Kumar K.V > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c > index 5d6cfcb..7b387fe 100644 > --- a/fs/9p/fid.c > +++ b/fs/9p/fid.c > @@ -97,6 +97,34 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, u32 uid, int any) > ? ? ? ?return ret; > ?} > > +/* > + * We need to hold v9ses->rename_lock as long as we hold references > + * to returned path array. Array element contain pointers to > + * dentry names. > + */ > +static int build_path_from_dentry(struct v9fs_session_info *v9ses, > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct dentry *dentry, char ***names) > +{ > + ? ? ? int n = 0, i; > + ? ? ? char **wnames; > + ? ? ? struct dentry *ds; > + > + ? ? ? for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) > + ? ? ? ? ? ? ? n++; > + > + ? ? ? wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); > + ? ? ? if (!wnames) > + ? ? ? ? ? ? ? goto err_out; > + > + ? ? ? for (ds = dentry, i = (n-1); i >= 0; i--, ds = ds->d_parent) > + ? ? ? ? ? ? ? wnames[i] = (char ?*)ds->d_name.name; > + > + ? ? ? *names = wnames; > + ? ? ? return n; > +err_out: > + ? ? ? return -ENOMEM; > +} > + > ?/** > ?* v9fs_fid_lookup - lookup for a fid, try to walk if not found > ?* @dentry: dentry to look for fid in > @@ -112,7 +140,7 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > ? ? ? ?int i, n, l, clone, any, access; > ? ? ? ?u32 uid; > ? ? ? ?struct p9_fid *fid, *old_fid = NULL; > - ? ? ? struct dentry *d, *ds; > + ? ? ? struct dentry *ds; > ? ? ? ?struct v9fs_session_info *v9ses; > ? ? ? ?char **wnames, *uname; > > @@ -139,50 +167,63 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > ? ? ? ?fid = v9fs_fid_find(dentry, uid, any); > ? ? ? ?if (fid) > ? ? ? ? ? ? ? ?return fid; > - > + ? ? ? /* > + ? ? ? ?* we don't have a matching fid. To do a TWALK we need > + ? ? ? ?* parent fid. We need to prevent rename when we want to > + ? ? ? ?* look at the parent. > + ? ? ? ?*/ > + ? ? ? read_lock(&v9ses->rename_lock); > ? ? ? ?ds = dentry->d_parent; > ? ? ? ?fid = v9fs_fid_find(ds, uid, any); > - ? ? ? if (!fid) { /* walk from the root */ > - ? ? ? ? ? ? ? n = 0; > - ? ? ? ? ? ? ? for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) > - ? ? ? ? ? ? ? ? ? ? ? n++; > - > - ? ? ? ? ? ? ? fid = v9fs_fid_find(ds, uid, any); > - ? ? ? ? ? ? ? if (!fid) { /* the user is not attached to the fs yet */ > - ? ? ? ? ? ? ? ? ? ? ? if (access == V9FS_ACCESS_SINGLE) > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return ERR_PTR(-EPERM); > - > - ? ? ? ? ? ? ? ? ? ? ? if (v9fs_proto_dotu(v9ses) || > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? v9fs_proto_dotl(v9ses)) > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? uname = NULL; > - ? ? ? ? ? ? ? ? ? ? ? else > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? uname = v9ses->uname; > + ? ? ? if (fid) { > + ? ? ? ? ? ? ? /* Found the parent fid do a lookup with that */ > + ? ? ? ? ? ? ? fid = p9_client_walk(fid, 1, (char **)&ds->d_name.name, 1); > + ? ? ? ? ? ? ? read_unlock(&v9ses->rename_lock); > + ? ? ? ? ? ? ? return fid; > + ? ? ? } > + ? ? ? read_unlock(&v9ses->rename_lock); > > - ? ? ? ? ? ? ? ? ? ? ? fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? v9ses->aname); > + ? ? ? /* start from the root and try to do a lookup */ > + ? ? ? fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any); > + ? ? ? if (!fid) { > + ? ? ? ? ? ? ? /* the user is not attached to the fs yet */ > + ? ? ? ? ? ? ? if (access == V9FS_ACCESS_SINGLE) > + ? ? ? ? ? ? ? ? ? ? ? return ERR_PTR(-EPERM); > > - ? ? ? ? ? ? ? ? ? ? ? if (IS_ERR(fid)) > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return fid; > + ? ? ? ? ? ? ? if (v9fs_proto_dotu(v9ses) || v9fs_proto_dotl(v9ses)) > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? uname = NULL; > + ? ? ? ? ? ? ? else > + ? ? ? ? ? ? ? ? ? ? ? uname = v9ses->uname; > > - ? ? ? ? ? ? ? ? ? ? ? v9fs_fid_add(ds, fid); > - ? ? ? ? ? ? ? } > - ? ? ? } else /* walk from the parent */ > - ? ? ? ? ? ? ? n = 1; > + ? ? ? ? ? ? ? fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?v9ses->aname); > + ? ? ? ? ? ? ? if (IS_ERR(fid)) > + ? ? ? ? ? ? ? ? ? ? ? return fid; > > - ? ? ? if (ds == dentry) > + ? ? ? ? ? ? ? v9fs_fid_add(dentry->d_sb->s_root, fid); > + ? ? ? } > + ? ? ? /* If we are root ourself just return that */ > + ? ? ? if (dentry->d_sb->s_root == dentry) > ? ? ? ? ? ? ? ?return fid; > - > - ? ? ? wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); > - ? ? ? if (!wnames) > - ? ? ? ? ? ? ? return ERR_PTR(-ENOMEM); > - > - ? ? ? for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) > - ? ? ? ? ? ? ? wnames[i] = (char *) d->d_name.name; > - > + ? ? ? /* > + ? ? ? ?* Do a multipath walk with attached root. > + ? ? ? ?* When walking parent we need to make sure we > + ? ? ? ?* don't have a parallel rename happening > + ? ? ? ?*/ > + ? ? ? read_lock(&v9ses->rename_lock); > + ? ? ? n ?= build_path_from_dentry(v9ses, dentry, &wnames); > + ? ? ? if (n < 0) { > + ? ? ? ? ? ? ? fid = ERR_CAST(ERR_PTR(n)); > + ? ? ? ? ? ? ? goto err_out; > + ? ? ? } > ? ? ? ?clone = 1; > ? ? ? ?i = 0; > ? ? ? ?while (i < n) { > ? ? ? ? ? ? ? ?l = min(n - i, P9_MAXWELEM); > + ? ? ? ? ? ? ? /* > + ? ? ? ? ? ? ? ?* We need to hold rename lock when doing a multipath > + ? ? ? ? ? ? ? ?* walk to ensure none of the patch component change > + ? ? ? ? ? ? ? ?*/ > ? ? ? ? ? ? ? ?fid = p9_client_walk(fid, l, &wnames[i], clone); > ? ? ? ? ? ? ? ?if (IS_ERR(fid)) { > ? ? ? ? ? ? ? ? ? ? ? ?if (old_fid) { > @@ -194,15 +235,16 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?p9_client_clunk(old_fid); > ? ? ? ? ? ? ? ? ? ? ? ?} > ? ? ? ? ? ? ? ? ? ? ? ?kfree(wnames); > - ? ? ? ? ? ? ? ? ? ? ? return fid; > + ? ? ? ? ? ? ? ? ? ? ? goto err_out; > ? ? ? ? ? ? ? ?} > ? ? ? ? ? ? ? ?old_fid = fid; > ? ? ? ? ? ? ? ?i += l; > ? ? ? ? ? ? ? ?clone = 0; > ? ? ? ?} > - > ? ? ? ?kfree(wnames); > ? ? ? ?v9fs_fid_add(dentry, fid); > +err_out: > + ? ? ? read_unlock(&v9ses->rename_lock); > ? ? ? ?return fid; > ?} > > diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c > index 3c49201..b41bcef 100644 > --- a/fs/9p/v9fs.c > +++ b/fs/9p/v9fs.c > @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, > ? ? ? ? ? ? ? ?__putname(v9ses->uname); > ? ? ? ? ? ? ? ?return ERR_PTR(-ENOMEM); > ? ? ? ?} > + ? ? ? rwlock_init(&v9ses->rename_lock); > > ? ? ? ?rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); > ? ? ? ?if (rc) { > diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h > index bec4d0b..dee4f26 100644 > --- a/fs/9p/v9fs.h > +++ b/fs/9p/v9fs.h > @@ -104,6 +104,7 @@ struct v9fs_session_info { > ? ? ? ?struct p9_client *clnt; /* 9p client */ > ? ? ? ?struct list_head slist; /* list of sessions registered with v9fs */ > ? ? ? ?struct backing_dev_info bdi; > + ? ? ? rwlock_t rename_lock; > ?}; > > ?struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, > diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c > index 503a6a2..eae89ad 100644 > --- a/fs/9p/vfs_inode.c > +++ b/fs/9p/vfs_inode.c > @@ -964,6 +964,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, > > ? ? ? ?sb = dir->i_sb; > ? ? ? ?v9ses = v9fs_inode2v9ses(dir); > + ? ? ? /* We can walk d_parent because we hold the dir->i_mutex */ > ? ? ? ?dfid = v9fs_fid_lookup(dentry->d_parent); > ? ? ? ?if (IS_ERR(dfid)) > ? ? ? ? ? ? ? ?return ERR_CAST(dfid); > @@ -1049,7 +1050,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > ? ? ? ?struct p9_fid *olddirfid; > ? ? ? ?struct p9_fid *newdirfid; > ? ? ? ?struct p9_wstat wstat; > - ? ? ? int retval; > + ? ? ? int retval, cross_dir_rename = 0; > > ? ? ? ?P9_DPRINTK(P9_DEBUG_VFS, "\n"); > ? ? ? ?retval = 0; > @@ -1070,6 +1071,9 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > ? ? ? ? ? ? ? ?retval = PTR_ERR(newdirfid); > ? ? ? ? ? ? ? ?goto clunk_olddir; > ? ? ? ?} > + ? ? ? cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); > + ? ? ? if (cross_dir_rename) > + ? ? ? ? ? ? ? write_lock(&v9ses->rename_lock); > > ? ? ? ?if (v9fs_proto_dotl(v9ses)) { > ? ? ? ? ? ? ? ?retval = p9_client_rename(oldfid, newdirfid, > @@ -1077,21 +1081,27 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > ? ? ? ? ? ? ? ?if (retval != -ENOSYS) > ? ? ? ? ? ? ? ? ? ? ? ?goto clunk_newdir; > ? ? ? ?} > + ? ? ? if (cross_dir_rename) { > + ? ? ? ? ? ? ? /* > + ? ? ? ? ? ? ? ?* 9P .u can only handle file rename in the same directory > + ? ? ? ? ? ? ? ?*/ > > - ? ? ? /* 9P can only handle file rename in the same directory */ > - ? ? ? if (memcmp(&olddirfid->qid, &newdirfid->qid, sizeof(newdirfid->qid))) { > ? ? ? ? ? ? ? ?P9_DPRINTK(P9_DEBUG_ERROR, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?"old dir and new dir are different\n"); > ? ? ? ? ? ? ? ?retval = -EXDEV; > ? ? ? ? ? ? ? ?goto clunk_newdir; > ? ? ? ?} > - > ? ? ? ?v9fs_blank_wstat(&wstat); > ? ? ? ?wstat.muid = v9ses->uname; > ? ? ? ?wstat.name = (char *) new_dentry->d_name.name; > ? ? ? ?retval = p9_client_wstat(oldfid, &wstat); > > ?clunk_newdir: > + ? ? ? if (!retval) > + ? ? ? ? ? ? ? /* successful rename */ > + ? ? ? ? ? ? ? d_move(old_dentry, new_dentry); > + ? ? ? if (cross_dir_rename) > + ? ? ? ? ? ? ? write_unlock(&v9ses->rename_lock); > ? ? ? ?p9_client_clunk(newdirfid); > > ?clunk_olddir: > diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c > index 0740675..3abc3ec 100644 > --- a/fs/9p/vfs_super.c > +++ b/fs/9p/vfs_super.c > @@ -287,4 +287,5 @@ struct file_system_type v9fs_fs_type = { > ? ? ? ?.get_sb = v9fs_get_sb, > ? ? ? ?.kill_sb = v9fs_kill_super, > ? ? ? ?.owner = THIS_MODULE, > + ? ? ? .fs_flags = FS_RENAME_DOES_D_MOVE, > ?}; > > ------------------------------------------------------------------------------ > This SF.net email is sponsored by Sprint > What will you do first with EVO, the first 4G phone? > Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first > _______________________________________________ > V9fs-developer mailing list > V9fs-developer@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/v9fs-developer > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/