Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751255AbbHAF7F (ORCPT ); Sat, 1 Aug 2015 01:59:05 -0400 Received: from nautica.notk.org ([91.121.71.147]:53228 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750856AbbHAF7D (ORCPT ); Sat, 1 Aug 2015 01:59:03 -0400 Date: Sat, 1 Aug 2015 07:58:45 +0200 From: Dominique Martinet To: Hugh Dickins Cc: Linus Torvalds , "J. Bruce Fields" , Dominique Martinet , Al Viro , Linux Kernel Mailing List , linux-fsdevel , David Howells Subject: Re: v4.2-rc dcache regression, probably 75a6f82a0d10 Message-ID: <20150801055845.GA1860@nautica> References: <20150731205036.GA3752@nautica> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3201 Lines: 79 Hugh Dickins wrote on Fri, Jul 31, 2015: > On Fri, 31 Jul 2015, Linus Torvalds wrote: >> I'd be more suspicious about other effects. For example, iot's not at >> all obvious that the commit in question just changes the order of the >> flags/inode field accesses, there are potentialy bigger changes there. >> For example, this part (in __d_obtain_alias): >> >> - tmp->d_inode = inode; >> - tmp->d_flags |= add_flags; >> + __d_set_inode_and_type(tmp, inode, add_flags); >> >> looks a bit off, because it *used* to just add those flags, but now, >> through __d_set_inode_and_type, it does >> >> + dentry->d_inode = inode; >> + smp_wmb(); >> + flags = READ_ONCE(dentry->d_flags); >> + flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU); >> + flags |= type_flags; >> + WRITE_ONCE(dentry->d_flags, flags); >> >> so it clears DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU. >> >> Is that correct? Maybe, I haven't checked. And maybe it's a big bad >> bug. Regardless, it sure as hell isn't just changing the order of the >> access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU" >> clearing came from __d_instantiate(), but now it hits __d_obtain_alias >> too. Yeah, had spotted this and tried to fix just this bit, but it didn't seem to change much for my problem. Not saying it isn't a bug, but I have no idea what __d_obtain_alias does and nobody seemed to care about this bit in my previous thread. > Yes, the one which grabbed my attention is: > > @@ -311,7 +346,7 @@ static void dentry_iput(struct dentry * dentry) > { > struct inode *inode = dentry->d_inode; > if (inode) { > - dentry->d_inode = NULL; > + __d_clear_type_and_inode(dentry); > hlist_del_init(&dentry->d_u.d_alias); > spin_unlock(&dentry->d_lock); > spin_unlock(&inode->i_lock); Oh, missed that one... Running tests with just that over the weekend, it's a good candidate and the first 10 minutes of tests sound quite positive! I think it's wrong to call it a "fix" even if it stops the bug from reproducing though, the way I understand the intent of the patch, they want that checking d_flags be enough to take decisions without having to check d_inode as well - so now things rely on that, it's still going to be wrong on HEAD... I think? (I'm running tests at this commit, so I don't have the patches that rely on that yet) As I understand things, the fact that lookup managed to get a being-removed entry from rcu/wherever isn't changed, it's just that it won't fail as fast and maybe something later will notice the lack of inode and fallback graciously instead of that ENOTDIR I've been tracking -- so that commit makes it possible to hit the bug, but there's another issue about taking the dentry in the first place? I really need to spend some time to understand vfs/dcache better as a whole at some point, I've been looking at a small part of it without context for too long... Cheers, -- Dominique -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/