Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752767AbcCGO40 (ORCPT ); Mon, 7 Mar 2016 09:56:26 -0500 Received: from zeniv.linux.org.uk ([195.92.253.2]:51193 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752350AbcCGO4S (ORCPT ); Mon, 7 Mar 2016 09:56:18 -0500 Date: Mon, 7 Mar 2016 14:56:16 +0000 From: Al Viro To: Sage Weil Cc: "Yan, Zheng" , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, ceph-devel@vger.kernel.org Subject: Re: [ceph] what's going on with d_rehash() in splice_dentry()? Message-ID: <20160307145616.GT17997@ZenIV.linux.org.uk> References: <20160226174250.GA17997@ZenIV.linux.org.uk> <20160307021639.GR17997@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2028 Lines: 36 On Mon, Mar 07, 2016 at 06:25:13AM -0500, Sage Weil wrote: > That looks okay, but changing d_rehash to d_add still means you're doing > te d_instantiate(dn, NULL) in the d_unhashed case; is there a reason you > changed that line? Is the dentry_rcuwalk_invalidate in __d_instantiate is > important before rehashing? d_add() gets uninlined in my queue, along with some locking massage to avoid extra bouncing of ->d_lock we do for no good reason. So that call is also going away. > > As an aside, tracking back to the originating fs method is > > painful as hell ;-/ I _think_ that rehash can be hit during ->lookup() > > returning a negative, but I wouldn't bet a dime on it not happening from > > other methods... AFAICS, the change should be OK regardless of what > > it's been called from, but... _ouch_. Is is documented anywhere public? > > It is a pain to follow, yes. FWIW this whole block is predicated in > req->r_locked_dir being non-NULL (i.e., VFS holds dir->i_mutex), which is > only true for lookup, create operations (mkdir/mknod/symlink/etc.), > atomic_open, and the .get_name export op. There's not much documentation > beyond a description of the meaning of fields (e.g. r_locked_dir) in > fs/ceph/mds_client.h ... Yes. Now consider the plans to make ->i_mutex an rwsem and weaken it for ->lookup() (i.e. held only shared when it's held for lookup alone). IOW, the aforementioned queue... I would rather avoid d_rehash() in ->lookup() instances; the tricky part is that we want to avoid parallel lookups on the *same* name, so we need an "in parallel lookup" state for dentries and mechanism allowing to wait for it to be finished. d_add()/d_splice_alias() on such a dentry would take it out of that state and I would rather avoid dealing with it in d_rehash()... I was going to post the series last week, got sidetracked on various fun things I'd found (ecryptfs and lustre). Hopefully will get all of that sorted out and the whole series posted for review tomorrow or on Wednesday...