Message-ID: <1513250179.3504.2.camel@kernel.org>
Subject: Re: [PATCH 14/19] xfs: convert to new i_version API
From: Jeff Layton
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    hch@lst.de, neilb@suse.de, bfields@fieldses.org, amir73il@gmail.com,
    jack@suse.de, viro@zeniv.linux.org.uk
Date: Thu, 14 Dec 2017 06:16:19 -0500
In-Reply-To: <20171214021707.GG5858@dastard>
References: <20171213142017.23653-1-jlayton@kernel.org>
    <20171213142017.23653-15-jlayton@kernel.org>
    <20171213224837.GB5858@dastard>
    <20171213232537.GC4094@dastard>
    <1513210222.3498.66.camel@kernel.org>
    <20171214021707.GG5858@dastard>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2017-12-14 at 13:17 +1100, Dave Chinner wrote:
> On Wed, Dec 13, 2017 at 07:10:22PM -0500, Jeff Layton wrote:
> > On Thu, 2017-12-14 at 10:25 +1100, Dave Chinner wrote:
> > > So now I've looked at the last patch .....
> > >
> > > On Thu, Dec 14, 2017 at 09:48:37AM +1100, Dave Chinner wrote:
> > > > On Wed, Dec 13, 2017 at 09:20:12AM -0500, Jeff Layton wrote:
> > > > > From: Jeff Layton
> > > > >
> > > > > Signed-off-by: Jeff Layton
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_inode_buf.c | 5 +++--
> > > > >  fs/xfs/xfs_icache.c           | 4 ++--
> > > > >  fs/xfs/xfs_inode.c            | 2 +-
> > > > >  fs/xfs/xfs_inode_item.c       | 2 +-
> > > > >  fs/xfs/xfs_trans_inode.c      | 2 +-
> > > > >  5 files changed, 8 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > index 6b7989038d75..6b47de201391 100644
> > > > > --- a/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > > @@ -264,7 +264,8 @@ xfs_inode_from_disk(
> > > > >      to->di_flags = be16_to_cpu(from->di_flags);
> > > > >
> > > > >      if (to->di_version == 3) {
> > > > > -        inode->i_version = be64_to_cpu(from->di_changecount);
> > > > > +        inode_set_iversion_queried(inode,
> > > > > +                be64_to_cpu(from->di_changecount));
> > > > >
> > > > So we use the "kernel managed" (really not sure what that means)
> > > > set function here to read it off disk, but...
> > > >
> > > This stores the value from disk in the incore inode as "val << 1",
> > > then sets the lowest bit to indicate that it has been "queried"
> > > so that it will be incremented on the first modification.
> > >
> > > Why do we initialise values read from disk as "queried"? This means
> > > the i_version will change once every time it's brought into memory
> > > and modified, regardless of whether anyone is looking at it. What
> > > purpose does this serve?
> > >
> > >
> > I don't think we want to store the QUERIED bit.
> >
> > It's always possible that we crash at an inopportune time and a query
> > happened vs. this value before this thing hit the backing store.
> >
> > If we always set the queried bit when we load it from disk, then we know
> > that that scenario is harmless, at the negligible expense of having to
> > bump it on the first write.
>
> Reasonable. Needs documentation.
>

Will do.

FWIW, there's another reason to do it this way too: backward
compatibility. If we don't try to store the queried bit then we should
be able to go back and forth between legacy kernels and the ones with
the new i_version handling without any trouble. The older kernels will
just bump the count more frequently.

> > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > index 801274126648..be6d87980dd5 100644
> > > > > --- a/fs/xfs/xfs_inode.c
> > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > @@ -833,7 +833,7 @@ xfs_ialloc(
> > > > >      ip->i_d.di_flags = 0;
> > > > >
> > > > >      if (ip->i_d.di_version == 3) {
> > > > > -        inode->i_version = 1;
> > > > > +        inode_set_iversion(inode, 1);
> > > > >
> > > > But here you are using the "filesystem managed" mode to set the
> > > > new value. Why? How is this any different from reading the value
> > > > off disk and setting it?
> > > >
> > > Still don't understand why this is different to reading the inode
> > > from disk....
> > >
> > This is allocating a brand new, never before seen inode. There's no
> > way this i_version could have ever been seen, so there's no need to flag
> > it as queried.
> >
> More documentation. People are going to need to know this stuff to
> be able to implement/maintain this stuff in working order - it's no
> longer a simple, obvious "just increment the counter on
> modification" variable and that has potential ramifications for
> filesystems that store this on disk.
>
>

Definitely. I'm finding that documenting this has been the hardest part.

Thanks for the review so far!
-- 
Jeff Layton
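
To make the scheme discussed above concrete, here is a minimal userspace
sketch of the "val << 1 plus a queried flag in the lowest bit" idea. It is
only a model built from the description in this thread: the struct and the
model_* function names are hypothetical and are not the kernel's i_version
API, and the real helpers may differ in locking, atomicity and naming.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the scheme described in the thread.
 * None of these names are kernel API; they exist only for illustration.
 */
#define MODEL_QUERIED 1ULL  /* lowest bit: the value may have been observed */
#define MODEL_SHIFT   1     /* the counter lives in the bits above the flag */

struct model_inode {
	uint64_t raw;           /* (counter << 1) | queried flag */
};

/* Brand new inode: nobody can have seen the value, so the flag stays clear. */
static void model_set(struct model_inode *ip, uint64_t val)
{
	ip->raw = val << MODEL_SHIFT;
}

/*
 * Value read from disk: a query may have happened before a crash, so
 * conservatively mark it as already queried.
 */
static void model_set_queried(struct model_inode *ip, uint64_t val)
{
	ip->raw = (val << MODEL_SHIFT) | MODEL_QUERIED;
}

/* Reader: return the counter and record that it has been observed. */
static uint64_t model_query(struct model_inode *ip)
{
	ip->raw |= MODEL_QUERIED;
	return ip->raw >> MODEL_SHIFT;
}

/*
 * Writer: bump the counter only if someone has looked at it since the
 * last bump, and clear the flag. Returns true when a bump happened.
 */
static bool model_maybe_inc(struct model_inode *ip)
{
	if (!(ip->raw & MODEL_QUERIED))
		return false;
	ip->raw = ((ip->raw >> MODEL_SHIFT) + 1) << MODEL_SHIFT;
	return true;
}

/* On-disk form: the counter only; the queried flag is never stored. */
static uint64_t model_to_disk(const struct model_inode *ip)
{
	return ip->raw >> MODEL_SHIFT;
}
```

In this model an inode initialised with model_set() is not bumped by a
modification until model_query() has been called on it, while one
initialised with model_set_queried() is bumped on its first modification.
Because model_to_disk() drops the flag, an older kernel reading the stored
value would see a plain counter, which matches the backward compatibility
argument made above.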