From: William Koh
To: "Darrick J. Wong"
CC: Andreas Dilger, "Theodore Ts'o", linux-ext4, lkml, Kernel Team, "J. Bruce Fields", linux-fsdevel, Trond Myklebust, xfs
Subject: Re: [PATCH] fs: ext4: inode->i_generation not assigned 0.
Date: Thu, 29 Jun 2017 14:28:41 +0000
Message-ID: <0D8EA9C1-E2E7-4D63-8F12-4BDED555F18E@fb.com>
References: <20A40B3C-E179-432B-B56F-BDAAF0CD2E1F@dilger.ca> <7CD38230-D961-428F-B2E9-2C0E28CAF442@fb.com> <20170629045940.GB5865@birch.djwong.org>
In-Reply-To: <20170629045940.GB5865@birch.djwong.org>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/28/17, 9:59 PM, "Darrick J. Wong" wrote:

    [add linux-xfs to cc]

    On Thu, Jun 29, 2017 at 04:37:14AM +0000, William Koh wrote:
    > On 6/28/17, 7:32 PM, "Andreas Dilger" wrote:
    >
    >     On Jun 28, 2017, at 4:06 PM, Kyungchan Koh wrote:
    >     >
    >     > In fs/ext4/super.c, the function ext4_nfs_get_inode takes as input
    >     > "generation" that can be used to specify the generation of the inode to
    >     > be returned. When 0 is given as input, then inodes of any generation can
    >     > be returned. Therefore, generation 0 is a special case that should be
    >     > avoided when assigning generation to inodes.
    >
    >     I'd agree with this change to avoid assigning generation == 0 to real inodes.
    >
    >     Also, the separate question arises about whether we need to allow file handle
    >     lookup with generation == 0? That allows FID guessing easily, while requiring
    >     a non-zero generation makes that a lot harder.
    >
    >     What are the cases where generation == 0 is used?
    >
    > Honestly, I’m not too sure. I just noticed that generation 0 was a special
    > case from reading the code.
    >
    >     > A new inline function, ext4_inode_set_gen, will take care of the
    >     > problem. Now, inodes cannot have a generation of 0, so this patch fixes
    >     > the issue.
    >     >
    >     > Signed-off-by: Kyungchan Koh
    >     >
    >     > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
    >     > index 3219154..74c6677 100644
    >     > --- a/fs/ext4/ext4.h
    >     > +++ b/fs/ext4/ext4.h
    >     > @@ -1549,6 +1549,14 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
    >     >           ino <= le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count));
    >     >  }
    >     >
    >     > +static inline void ext4_inode_set_gen(struct inode *inode,
    >     > +                                      struct ext4_sb_info *sbi)
    >     > +{
    >     > +        inode->i_generation = sbi->s_next_generation++;
    >     > +        if (!inode->i_generation)
    >
    >     This should be marked "unlikely()" since it happens at most once every 4B
    >     file creations (though likely even less since it is unlikely that so many
    >     files will be created in a single mount).
    >
    > Got it.
    >
    >     > +                inode->i_generation = sbi->s_next_generation++;
    >     > +}
    >     > +
    >     >
    >     > diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
    >     > index 98ac2f1..d33f6f0 100644
    >     > --- a/fs/ext4/ialloc.c
    >     > +++ b/fs/ext4/ialloc.c
    >     > @@ -1072,7 +1072,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode }
    >     >          spin_lock(&sbi->s_next_gen_lock);
    >     > -        inode->i_generation = sbi->s_next_generation++;
    >     > +        ext4_inode_set_gen(inode, sbi);
    >     >          spin_unlock(&sbi->s_next_gen_lock);
    >     >
    >     > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
    >     > index 0c21e22..d52a467 100644
    >     > --- a/fs/ext4/ioctl.c
    >     > +++ b/fs/ext4/ioctl.c
    >     > @@ -160,8 +160,8 @@ static long swap_inode_boot_loader(struct super_block *sb,
    >     >
    >     >          spin_lock(&sbi->s_next_gen_lock);
    >     > -        inode->i_generation = sbi->s_next_generation++;
    >     > -        inode_bl->i_generation = sbi->s_next_generation++;
    >     > +        ext4_inode_set_gen(inode, sbi);
    >     > +        ext4_inode_set_gen(inode_bl, sbi);
    >     >          spin_unlock(&sbi->s_next_gen_lock);
    >     >
    >
    >     Cheers, Andreas
    >
    > This is applicable to many fs, including ext2, ext4, exofs, jfs, and f2fs.
    > Therefore, a shared helper in linux/fs.h will allow for easy changes
    > in all fs. Is there any reason that might be a bad idea?

    AFAICT, i_generation == 0 in XFS and btrfs is just as valid as any other
    number. There is no special casing of zero in either filesystem.

    So now, my curiosity piqued, I surveyed all the Linux filesystems that
    can export to NFS. I see that there are actually quite a few fs
    (ext[2-4], exofs, efs, fat, jfs, f2fs, isofs, nilfs2, reiserfs, udf,
    ufs) that treat zero as a special value meaning "ignore generation
    check"; others (xfs, btrfs, fuse, ntfs, ocfs2) that don't consider zero
    special and always require a match; and still others (affs, befs, ceph,
    gfs2, jffs2, squashfs) that don't check at all.

    That to me strongly suggests that more research is necessary to figure
    out why some of the filesystems that support i_generation reserve zero
    as a special value to disable generation checks and why others always
    require an exact match. Until we can recapture why things are the way
    they are, it doesn't make much sense to have a helper that only applies
    to half the filesystems.

    Granted, the contents of a file handle are generally left up to the
    individual filesystem, and the behaviors are very different, so I also
    don't see that much value in hoisting i_generation updates to the VFS
    level.

    I guess it wouldn't really matter if XFS stopped writing i_generation = 0
    onto disk, but I'm too curious about this odd difference in behavior to
    let it go just yet. :)

    --D

That makes sense. I’ll also look into this matter and send a newer patch
with the best fix for this issue.

-Kyungchan Koh

    >
    > Best,
    > Kyungchan Koh
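
For reference, the two ideas discussed in this thread (a generation counter
that never hands out 0, and the three ways filesystems treat the generation
in a file handle) can be sketched in userspace C. This is only an
illustration under stated assumptions, not kernel code: struct gen_counter,
next_generation(), enum gen_policy, and handle_matches() are hypothetical
names made up for this sketch, and locking is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-superblock s_next_generation counter. */
struct gen_counter {
	uint32_t next;
};

/*
 * Return the next generation number, skipping 0 when the counter wraps.
 * Same logic as the proposed ext4_inode_set_gen(); the caller is assumed
 * to hold whatever lock protects the counter.
 */
static uint32_t next_generation(struct gen_counter *c)
{
	assert(c != NULL);

	uint32_t gen = c->next++;
	if (gen == 0)		/* at most once per 2^32 allocations */
		gen = c->next++;
	return gen;
}

/* The three behaviors described in the survey (names are hypothetical). */
enum gen_policy {
	GEN_ZERO_WILDCARD,	/* handle generation 0 skips the check */
	GEN_EXACT,		/* always require an exact match */
	GEN_UNCHECKED		/* generation is not checked at all */
};

/* Decide whether a file handle's generation is accepted for an inode. */
static bool handle_matches(enum gen_policy policy, uint32_t inode_gen,
			   uint32_t handle_gen)
{
	switch (policy) {
	case GEN_ZERO_WILDCARD:
		return handle_gen == 0 || handle_gen == inode_gen;
	case GEN_EXACT:
		return handle_gen == inode_gen;
	case GEN_UNCHECKED:
		return true;
	}
	return false;
}
```

Under the zero-as-wildcard policy, a handle carrying generation 0 matches an
inode of any generation, which is the FID-guessing concern raised above.
Skipping 0 at allocation time only ensures that no real inode has generation
0; it does not change how such a handle is treated at lookup.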