Date: Mon, 11 Jan 2021 15:46:00 +0100
From: Jan Kara
To: Eric Biggers
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	Theodore Ts'o, Christoph Hellwig, stable@vger.kernel.org, Jan Kara
Subject: Re: [PATCH v2 01/12] fs: fix lazytime expiration handling in __writeback_single_inode()
Message-ID: <20210111144600.GC808@quack2.suse.cz>
References: <20210109075903.208222-1-ebiggers@kernel.org>
	<20210109075903.208222-2-ebiggers@kernel.org>
In-Reply-To: <20210109075903.208222-2-ebiggers@kernel.org>
X-Mailing-List: linux-ext4@vger.kernel.org

On Fri 08-01-21 23:58:52, Eric Biggers wrote:
> From: Eric Biggers
>
> When lazytime is enabled and an inode is being written due to its
> in-memory updated timestamps having expired, either due to a sync() or
> syncfs() system call or due to dirtytime_expire_interval having elapsed,
> the VFS needs to inform the filesystem so that the filesystem can copy
> the inode's timestamps out to the on-disk data structures.
>
> This is done by __writeback_single_inode() calling
> mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
>
> However, this occurs after __writeback_single_inode() has already
> cleared the dirty flags from ->i_state. This causes two bugs:
>
> - mark_inode_dirty_sync() redirties the inode, causing it to remain
>   dirty. This wastefully causes the inode to be written twice. But
>   more importantly, it breaks cases where sync_filesystem() is expected
>   to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
>   ioctl (as reported at
>   https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
>   as possibly filesystem freezing (freeze_super()).
>
> - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
>   called from __writeback_single_inode() for lazytime expiration,
>   xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
>   lazytime expirations, and it assumes that i_state will contain
>   I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't persisted by
>   sync(), syncfs(), or dirtytime_expire_interval on XFS.
>
> Fix this by moving the call to mark_inode_dirty_sync() to earlier in
> __writeback_single_inode(), before the dirty flags are cleared from
> i_state. This makes filesystems be properly notified of the timestamp
> expiration, and it avoids incorrectly redirtying the inode.
>
> This fixes xfstest generic/580 (which tests
> FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
> enabled. It also fixes the new lazytime xfstest I've proposed, which
> reproduces the above-mentioned XFS bug
> (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
>
> Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
> due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
> right thing to do because mark_inode_dirty_sync() now knows not to move
> the inode to a writeback list if it is currently queued for sync.
>
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
> Cc: stable@vger.kernel.org
> Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
> Suggested-by: Jan Kara
> Signed-off-by: Eric Biggers

Thanks for writing this fix! It looks good to me. You can add:

Reviewed-by: Jan Kara

								Honza

> ---
>  fs/fs-writeback.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index acfb55834af23..c41cb887eb7d3 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  	}
>  
>  	/*
> -	 * Some filesystems may redirty the inode during the writeback
> -	 * due to delalloc, clear dirty metadata flags right before
> -	 * write_inode()
> +	 * If the inode has dirty timestamps and we need to write them, call
> +	 * mark_inode_dirty_sync() to notify the filesystem about it and to
> +	 * change I_DIRTY_TIME into I_DIRTY_SYNC.
>  	 */
> -	spin_lock(&inode->i_lock);
> -
> -	dirty = inode->i_state & I_DIRTY;
>  	if ((inode->i_state & I_DIRTY_TIME) &&
> -	    ((dirty & I_DIRTY_INODE) ||
> -	     wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
> +	    (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
>  	     time_after(jiffies, inode->dirtied_time_when +
>  			dirtytime_expire_interval * HZ))) {
> -		dirty |= I_DIRTY_TIME;
>  		trace_writeback_lazytime(inode);
> +		mark_inode_dirty_sync(inode);
>  	}
> +
> +	/*
> +	 * Some filesystems may redirty the inode during the writeback
> +	 * due to delalloc, clear dirty metadata flags right before
> +	 * write_inode()
> +	 */
> +	spin_lock(&inode->i_lock);
> +	dirty = inode->i_state & I_DIRTY;
>  	inode->i_state &= ~dirty;
>  
>  	/*
> @@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  
>  	spin_unlock(&inode->i_lock);
>  
> -	if (dirty & I_DIRTY_TIME)
> -		mark_inode_dirty_sync(inode);
>  	/* Don't write the inode if only I_DIRTY_PAGES was set */
>  	if (dirty & ~I_DIRTY_PAGES) {
>  		int err = write_inode(inode, wbc);
> -- 
> 2.30.0
>
-- 
Jan Kara
SUSE Labs, CR
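
For readers who want to see the ordering problem from the commit message in isolation, here is a minimal userspace sketch in C. It is not the kernel code: every toy_* and TOY_* name, the flag values, and the fs_saw_dirty_time field are hypothetical stand-ins, locking is omitted, and the lazytime expiration check is assumed to have already passed (as it would for sync()). The toy "filesystem" is assumed to behave like the XFS case described above, reacting only when I_DIRTY_TIME is still set at notification time.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the real ones live in the kernel headers. */
#define TOY_I_DIRTY_SYNC (1u << 0)
#define TOY_I_DIRTY_TIME (1u << 1)

/* Hypothetical stand-in for struct inode. */
struct toy_inode {
	unsigned int i_state;	/* models ->i_state */
	bool fs_saw_dirty_time;	/* set if the "filesystem" saw I_DIRTY_TIME */
};

/*
 * Models mark_inode_dirty_sync(): notify the filesystem, then change
 * I_DIRTY_TIME into I_DIRTY_SYNC.
 */
static void toy_mark_inode_dirty_sync(struct toy_inode *inode)
{
	if (inode->i_state & TOY_I_DIRTY_TIME)
		inode->fs_saw_dirty_time = true;
	inode->i_state &= ~TOY_I_DIRTY_TIME;
	inode->i_state |= TOY_I_DIRTY_SYNC;
}

/* Old ordering: clear the dirty flags first, notify afterwards. */
static void toy_writeback_old(struct toy_inode *inode)
{
	unsigned int dirty = inode->i_state &
			     (TOY_I_DIRTY_SYNC | TOY_I_DIRTY_TIME);

	inode->i_state &= ~dirty;
	if (dirty & TOY_I_DIRTY_TIME)
		toy_mark_inode_dirty_sync(inode);	/* redirties the inode */
}

/* New ordering: notify first, then clear the dirty flags. */
static void toy_writeback_new(struct toy_inode *inode)
{
	unsigned int dirty;

	if (inode->i_state & TOY_I_DIRTY_TIME)
		toy_mark_inode_dirty_sync(inode);
	dirty = inode->i_state & TOY_I_DIRTY_SYNC;
	inode->i_state &= ~dirty;
}

/* Small self-check contrasting the two orderings. */
static void toy_check_orderings(void)
{
	struct toy_inode a = { TOY_I_DIRTY_TIME, false };
	struct toy_inode b = { TOY_I_DIRTY_TIME, false };

	toy_writeback_old(&a);
	assert(a.i_state == TOY_I_DIRTY_SYNC);	/* still dirty afterwards */
	assert(!a.fs_saw_dirty_time);		/* notification ignored */

	toy_writeback_new(&b);
	assert(b.i_state == 0);			/* clean afterwards */
	assert(b.fs_saw_dirty_time);		/* notification seen */
}
```

Calling toy_check_orderings() exercises both paths on an inode that starts with only the timestamp-dirty bit set. In this model the old ordering shows both symptoms from the commit message (the inode stays dirty and the filesystem misses the expiration), while the new ordering shows neither.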