Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp637124pxk; Sun, 30 Aug 2020 17:35:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzw/lRXKQKKKAazEm4RAg0bJ+ibfqxNqK6m5w03FAdCy+qRkzOg1kohYc92b6mAQ6ClKOsO X-Received: by 2002:a17:906:255b:: with SMTP id j27mr9660272ejb.46.1598834145087; Sun, 30 Aug 2020 17:35:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1598834145; cv=none; d=google.com; s=arc-20160816; b=jQhc0L5F1i8GpjXi064k+rRAi8XO8qbDzoVsHJdl4xm5Gc6BQPvRXdF8lgkeI3HjU/ /dmDkk+IQNPIlx1tXqF977BVAcFmLeH3kLH3bLQNUvqCksSdIe3OYy9NyYFUz9kuImfj CAtTLewAy+hTJ8ukZzEI+Ma25xBdoj8VRW2+kjJo3Nvav3yzL4FO0FIR9FbTMZ9m+uYD GMIkfYtpQc0yD50xX9zHurEWaKIKBdFGdYdXuOLueb1JzqlEuynxTNrsHLCRvasI+Q1+ SgC4j6//Hi7LMs/KmZ+8Hi0ftluGfRBRmziNy1xpbCPu87crlY2Z0q8SwcXBKLJ4Cdph nFUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=Ne1foemu+r+iHCXpCwAATnPAqOY9bvwBR9IyWDe5iWs=; b=S/UuM92u5C+HjMRMg91lPeCQETcE8agMEUJKTkELY/pT/hLz8+hVF9EzA16JPRs1m7 xEsXCNht+f/k9PNj1xGVq+jOGitS4iztFOthcqm3M3QERF0pJHkOazefJyLTgSO3ybun ZFTq9U8XIOm5sRTxGjsXvWVajS+o7CKD+f3G4kBcMmViL7a0jKtGH5WRQ5KzyaRyo87M GbISEcRvDQMfILMRS2J+qW25iM+mXRBB/9/6D2R+esFH5asf61gm5gWFqip/Boina7Oy hxzFHqsiXziGH5c6q3eNJCKihQWDxPZUBBtb78n1wFdYjRJjnYp5DBTqTUhuBLh54AHI XGfQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id gv16si366784ejb.325.2020.08.30.17.35.21; Sun, 30 Aug 2020 17:35:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726447AbgHaAeO (ORCPT + 99 others); Sun, 30 Aug 2020 20:34:14 -0400 Received: from mail104.syd.optusnet.com.au ([211.29.132.246]:34185 "EHLO mail104.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726179AbgHaAeN (ORCPT ); Sun, 30 Aug 2020 20:34:13 -0400 Received: from dread.disaster.area (pa49-181-146-199.pa.nsw.optusnet.com.au [49.181.146.199]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 2B359824857; Mon, 31 Aug 2020 10:34:08 +1000 (AEST) Received: from dave by dread.disaster.area with local (Exim 4.92.3) (envelope-from ) id 1kCXlv-0000ei-2W; Mon, 31 Aug 2020 10:34:07 +1000 Date: Mon, 31 Aug 2020 10:34:07 +1000 From: Dave Chinner To: "Li, Hao" Cc: viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, y-goto@fujitsu.com Subject: Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set Message-ID: <20200831003407.GE12096@dread.disaster.area> References: <20200821015953.22956-1-lihao2018.fnst@cn.fujitsu.com> <20200827063748.GA12096@dread.disaster.area> <6b3b3439-2199-8f00-ceca-d65769e94fe0@cn.fujitsu.com> <20200828003541.GD12096@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.3 cv=IuRgj43g c=1 sm=1 tr=0 cx=a_idp_d a=GorAHYkI+xOargNMzM6qxQ==:117 a=GorAHYkI+xOargNMzM6qxQ==:17 a=kj9zAlcOel0A:10 a=y4yBn9ojGxQA:10 a=omOdbC7AAAAA:8 a=7-415B0cAAAA:8 a=Q-SpeKd3jp49hb1f-zoA:9 a=CjuIK1q_8ugA:10 a=baC4JDFNLZpnPwus_NF9:22 a=biEYGPWJfzWAr4FL6Ov7:22 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 28, 2020 at 05:04:14PM +0800, Li, Hao wrote: > On 2020/8/28 8:35, Dave Chinner wrote: > > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: > >> On 2020/8/27 14:37, Dave Chinner wrote: > >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > >>>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > >>>> set from being killed, so the corresponding inode can't be evicted. If > >>>> the DAX policy of an inode is changed, we can't make policy changing > >>>> take effects unless dropping caches manually. > >>>> > >>>> This patch fixes this problem and flushes the inode to disk to prepare > >>>> for evicting it. > >>>> > >>>> Signed-off-by: Hao Li > >>>> --- > >>>> fs/dcache.c | 3 ++- > >>>> fs/inode.c | 2 +- > >>>> 2 files changed, 3 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/fs/dcache.c b/fs/dcache.c > >>>> index ea0485861d93..486c7409dc82 100644 > >>>> --- a/fs/dcache.c > >>>> +++ b/fs/dcache.c > >>>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) > >>>> */ > >>>> smp_rmb(); > >>>> d_flags = READ_ONCE(dentry->d_flags); > >>>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > >>>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > >>>> + | DCACHE_DONTCACHE; > >>> Seems reasonable, but you need to update the comment above as to > >>> how this flag fits into this code.... > >> Yes. I will change it. Thanks. > >> > >>>> /* Nothing to do? Dropping the reference was all we needed? */ > >>>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) > >>>> diff --git a/fs/inode.c b/fs/inode.c > >>>> index 72c4c347afb7..5218a8aebd7f 100644 > >>>> --- a/fs/inode.c > >>>> +++ b/fs/inode.c > >>>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > >>>> } > >>>> > >>>> state = inode->i_state; > >>>> - if (!drop) { > >>>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > >>>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > >>>> spin_unlock(&inode->i_lock); > >>> What's this supposed to do? We'll only get here with drop set if the > >>> filesystem is mounting or unmounting. > >> The variable drop will also be set to True if I_DONTCACHE is set on > >> inode->i_state. > >> Although mounting/unmounting will set the drop variable, it won't set > >> I_DONTCACHE if I understand correctly. As a result, > >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > > So what does this have to do with changing the way the dcache > > treats DCACHE_DONTCACHE? > After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with > DCACHE_DONTCACHE set will be killed unconditionally even if > DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). > This inode has I_DONTCACHE flag, so op->drop_inode() will return true, > and the inode will be evicted _directly_ even though it has dirty pages. Yes. But this doesn't rely on DCACHE_DONTCACHE behaviour. Inodes that are looked up and cached by the filesystem without going through dentry cache pathwalks can have I_DONTCACHE set and then get evicted... i.e. we can get I_DONTCACHE set on inodes that do not have dentries attached to them. That's the original functionality that got pulled up from XFS - internal iteration of inodes, either via quotacheck or bulkstat.... > I think the kernel will run into error state because it doesn't writeback > dirty pages of this inode before evicting. This is why I write back this > inode here. > > According to my test, if I revert the second hunk of this patch, kernel > will hang after running the following command: > echo 123 > test.txt && xfs_io -c "chattr +x" test.txt > > The backtrace is: > > xfs_fs_destroy_inode+0x204/0x24d > destroy_inode+0x3b/0x65 > evict+0x150/0x1aa > iput+0x117/0x19a > dentry_unlink_inode+0x12b/0x12f > __dentry_kill+0xee/0x211 > dentry_kill+0x112/0x22f > dput+0x79/0x86 > __fput+0x200/0x23f > ____fput+0x25/0x28 > task_work_run+0x144/0x177 > do_exit+0x4fb/0xb94 > > This backtrace is printed with an ASSERT(0) statement in xfs_fs_destroy_inode(). Sure, that's your messenger. That doesn't mean the bug can't be triggered by other means. > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > > is unhashed, should we be writing it back? i.e. it might have been > > dropped for a different reason to I_DONTCACHE.... > This is a problem I didn't realize. You are right. If the inode has been > unlinked/unhashed and the I_DONTCACHE is also set, the appended condition > will lead to unnecessary writeback. > > I think I need to handle the inode writeback more carefully. Maybe I can > revert the second hunk and remove I_DONTCACHE from generic_drop_inode() > and then change > > if (!drop && (sb->s_flags & SB_ACTIVE)) > > to > > if (!drop && !(inode->i_state & I_DONTCACHE) && (sb->s_flags & SB_ACTIVE)) > > This approach would be more suitable. Yup, that's pretty much what I was thinking, but as a standalone patch and preceding the DCACHE_DONTCACHE behaviour change. Thanks! :) Cheers, Dave. -- Dave Chinner david@fromorbit.com