Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751294AbbHAH0K (ORCPT ); Sat, 1 Aug 2015 03:26:10 -0400 Received: from zeniv.linux.org.uk ([195.92.253.2]:44084 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750991AbbHAH0I (ORCPT ); Sat, 1 Aug 2015 03:26:08 -0400 Date: Sat, 1 Aug 2015 08:26:03 +0100 From: Al Viro To: Linus Torvalds Cc: Dominique Martinet , Hugh Dickins , "J. Bruce Fields" , Dominique Martinet , Linux Kernel Mailing List , linux-fsdevel , David Howells Subject: Re: v4.2-rc dcache regression, probably 75a6f82a0d10 Message-ID: <20150801072603.GV17109@ZenIV.linux.org.uk> References: <20150731205036.GA3752@nautica> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2570 Lines: 70 On Fri, Jul 31, 2015 at 03:52:38PM -0700, Linus Torvalds wrote: > Is that correct? Maybe, I haven't checked. And maybe it's a big bad > bug. Regardless, it sure as hell isn't just changing the order of the > access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU" > clearing came from __d_instantiate(), but now it hits __d_obtain_alias > too. Actually, the shit had hit the fan earlier. Look: in commit b18825a7c8e37a7cf6abb97a12a6ad71af160de7 Author: David Howells Date: Thu Sep 12 19:22:53 2013 +0100 VFS: Put a small type field into struct dentry::d_flags we have this: @@ -1823,7 +1794,7 @@ static int link_path_walk(const char *name, struct nameidata *nd) if (err) return err; } - if (!can_lookup(nd->inode)) { + if (!d_is_directory(nd->path.dentry)) { err = -ENOTDIR; break; } And that has turned the check done to an inode that *was* ours at some point (i.e. fetching it had been followed by checking that ->d_seq had been still valid) into something completely unprotected. Suppose we are in lazy mode and somebody had evicted nd->path.dentry after we'd looked it up and before that check. Sure, its ->d_seq had been bumped by that, and we would've failed anyway. With ECHILD. Which, unlike ENOTDIR, is "repeat in non-lazy mode". AFAICS, that's where the problem is. It affects only RCU mode and only the places where dentry isn't pinned. That place in link_path_walk() is trivial - we just need to do if (unlikely(!d_can_lookup(nd->path.dentry))) { if (nd->flags & LOOKUP_RCU) { if (unlazy_walk(nd, NULL, 0)) return -ECHILD; } return -ENOTDIR; } there. AFAICS, other places of that sort are not a problem anymore. Folks, could you check if this fixes the problems you are seeing? diff --git a/fs/namei.c b/fs/namei.c index ae4e4c1..b16c3a7 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1954,7 +1954,11 @@ OK: continue; } } - if (unlikely(!d_can_lookup(nd->path.dentry))) + if (unlikely(!d_can_lookup(nd->path.dentry))) { + if (nd->flags & LOOKUP_RCU) { + if (unlazy_walk(nd, NULL, 0)) + return -ECHILD; + } return -ENOTDIR; } } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/