Return-Path: linux-nfs-owner@vger.kernel.org Received: from cobra.newdream.net ([66.33.216.30]:58175 "EHLO cobra.newdream.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753500Ab3JBSrX (ORCPT ); Wed, 2 Oct 2013 14:47:23 -0400 Date: Wed, 2 Oct 2013 11:47:22 -0700 (PDT) From: Sage Weil To: "J. Bruce Fields" cc: Christoph Hellwig , Al Viro , linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, sandeen@redhat.com Subject: Re: why is i_ino unsigned long, anyway? In-Reply-To: <20131002175328.GF14808@fieldses.org> Message-ID: References: <20130912160324.GE1462@fieldses.org> <20130912193328.GP13318@ZenIV.linux.org.uk> <20130929115454.GA3953@infradead.org> <20131002142527.GD14808@fieldses.org> <20131002160527.GB23875@infradead.org> <20131002175328.GF14808@fieldses.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-nfs-owner@vger.kernel.org List-ID: On Wed, 2 Oct 2013, J. Bruce Fields wrote: > On Wed, Oct 02, 2013 at 09:05:27AM -0700, Christoph Hellwig wrote: > > On Wed, Oct 02, 2013 at 10:25:27AM -0400, J. Bruce Fields wrote: > > > If so then it's no huge code duplication to it by hand: > > > > > > if (inode->i_op->getattr) > > > inode->i_op->getattr(path->mnt, path->dentry, &stat); > > > else > > > generic_fillattr(inode, &stat); > > > > Maybe make that a vfs_getattr_nosec and let vfs_getattr call it? > > > > Including a proper kerneldoc comment explaining when to use it, please. > > Something like this? I'm late to this thread, but: getattr() is a comparatively expensive operation for just getting an (immutable) ino value on a distributed or clustered fs. It would be nice if there were a separate call that didn't try to populate all the other kstat fields with valid data. Something like this was part of the xstat series from forever ago (a bit mask passed to getattr indicating which fields were needed), so the approach below might be okay if we think we'll get there sometime soon, but my preference would be for another fix... (Ceph also uses 64-bit inos.) Thanks! sage > > --b. > > commit 8418a41b7192cf2f372ae091207adb29a088f9a0 > Author: J. Bruce Fields > Date: Tue Sep 10 11:41:12 2013 -0400 > > exportfs: fix 32-bit nfsd handling of 64-bit inode numbers > > Symptoms were spurious -ENOENTs on stat of an NFS filesystem from a > 32-bit NFS server exporting a very large XFS filesystem, when the > server's cache is cold (so the inodes in question are not in cache). > > Reported-by: Trevor Cordes > Signed-off-by: J. Bruce Fields > > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c > index 293bc2e..811831a 100644 > --- a/fs/exportfs/expfs.c > +++ b/fs/exportfs/expfs.c > @@ -215,7 +215,7 @@ struct getdents_callback { > struct dir_context ctx; > char *name; /* name that was found. It already points to a > buffer NAME_MAX+1 is size */ > - unsigned long ino; /* the inum we are looking for */ > + u64 ino; /* the inum we are looking for */ > int found; /* inode matched? */ > int sequence; /* sequence counter */ > }; > @@ -255,10 +255,10 @@ static int get_name(const struct path *path, char *name, struct dentry *child) > struct inode *dir = path->dentry->d_inode; > int error; > struct file *file; > + struct kstat stat; > struct getdents_callback buffer = { > .ctx.actor = filldir_one, > .name = name, > - .ino = child->d_inode->i_ino > }; > > error = -ENOTDIR; > @@ -268,6 +268,16 @@ static int get_name(const struct path *path, char *name, struct dentry *child) > if (!dir->i_fop) > goto out; > /* > + * inode->i_ino is unsigned long, kstat->ino is u64, so the > + * former would be insufficient on 32-bit hosts when the > + * filesystem supports 64-bit inode numbers. So we need to > + * actually call ->getattr, not just read i_ino: > + */ > + error = vfs_getattr_nosec(path, &stat); > + if (error) > + return error; > + buffer.ino = stat.ino; > + /* > * Open the directory ... > */ > file = dentry_open(path, O_RDONLY, cred); > diff --git a/fs/stat.c b/fs/stat.c > index 04ce1ac..71a39e8 100644 > --- a/fs/stat.c > +++ b/fs/stat.c > @@ -37,14 +37,21 @@ void generic_fillattr(struct inode *inode, struct kstat *stat) > > EXPORT_SYMBOL(generic_fillattr); > > -int vfs_getattr(struct path *path, struct kstat *stat) > +/** > + * vfs_getattr_nosec - getattr without security checks > + * @path: file to get attributes from > + * @stat: structure to return attributes in > + * > + * Get attributes without calling security_inode_getattr. > + * > + * Currently the only caller other than vfs_getattr is internal to the > + * filehandle lookup code, which uses only the inode number and returns > + * no attributes to any user. Any other code probably wants > + * vfs_getattr. > + */ > +int vfs_getattr_nosec(struct path *path, struct kstat *stat) > { > struct inode *inode = path->dentry->d_inode; > - int retval; > - > - retval = security_inode_getattr(path->mnt, path->dentry); > - if (retval) > - return retval; > > if (inode->i_op->getattr) > return inode->i_op->getattr(path->mnt, path->dentry, stat); > @@ -53,6 +60,18 @@ int vfs_getattr(struct path *path, struct kstat *stat) > return 0; > } > > +EXPORT_SYMBOL_GPL(vfs_getattr_nosec); > + > +int vfs_getattr(struct path *path, struct kstat *stat) > +{ > + int retval; > + > + retval = security_inode_getattr(path->mnt, path->dentry); > + if (retval) > + return retval; > + return vfs_getattr_nosec(path, stat); > +} > + > EXPORT_SYMBOL(vfs_getattr); > > int vfs_fstat(unsigned int fd, struct kstat *stat) > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 9818747..5a51faa 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -2500,6 +2500,7 @@ extern int page_symlink(struct inode *inode, const char *symname, int len); > extern const struct inode_operations page_symlink_inode_operations; > extern int generic_readlink(struct dentry *, char __user *, int); > extern void generic_fillattr(struct inode *, struct kstat *); > +int vfs_getattr_nosec(struct path *path, struct kstat *stat); > extern int vfs_getattr(struct path *, struct kstat *); > void __inode_add_bytes(struct inode *inode, loff_t bytes); > void inode_add_bytes(struct inode *inode, loff_t bytes); > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > >