Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753152AbcKQXlt (ORCPT ); Thu, 17 Nov 2016 18:41:49 -0500 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:35794 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753098AbcKQXlq (ORCPT ); Thu, 17 Nov 2016 18:41:46 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ask1AHw/Llh5LIq7EGdsb2JhbABeGgEBAQECAQEBAQgBAQEBgzcBAQEBAR+BWIJ7g3mcMwEBAQEBAQaBHIwfhjeEFYYbBAICgh1UAQIBAQEBAQIGAQEBAQEBAQE3RUISAYQTAQEBAwEnExwjBQsIAw4KCSUPBSUDBxoTGYhLB65TPYtYAQEBBwIBJB6FVIUkh3qCEh4FjmGLYpBigXuEdolAjVCEC4FDEQyFTyo0hS4RK4IPAQEB Date: Fri, 18 Nov 2016 10:40:47 +1100 From: Dave Chinner To: David Howells Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/4] statx: Add a system call to make enhanced file info available Message-ID: <20161117234047.GE28177@dastard> References: <147938969703.13574.10295364502230379833.stgit@warthog.procyon.org.uk> <147938970382.13574.11581172952175034619.stgit@warthog.procyon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <147938970382.13574.11581172952175034619.stgit@warthog.procyon.org.uk> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6262 Lines: 167 On Thu, Nov 17, 2016 at 01:35:03PM +0000, David Howells wrote: > Add a system call to make extended file information available, including > file creation, data version and some attribute flags where available > through the underlying filesystem. > > > ======== > OVERVIEW > ======== > > The idea was initially proposed as a set of xattrs that could be retrieved > with getxattr(), but the general preferance proved to be for a new syscall > with an extended stat structure. > > This has a number of uses: > .... > (5) Data version number: Could be used by userspace NFS servers [Aneesh > Kumar]. > > Can also be used to modify fill_post_wcc() in NFSD which retrieves > i_version directly, but has just called vfs_getattr(). It could get > it from the kstat struct if it used vfs_xgetattr() instead. This needs a much clearer name that stx_version as "version" is entirely ambiguous. e.g. Inodes have internal version numbers to disambiguate life cycles. and there are versioning filesystems which have multiple versions of the same file. So if stx_version this is intended to export the internal filesystem inode change counter (i.e. inode->i_version) then lets call it that: stx_modification_count. It's clear and unambiguous as to what it represents, especially as this counter is more than just a "data modification" counter - inode metadata modifications will also cause it to change.... .... > (13) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. > Note that the Linux IOC flags are a mess and filesystems such as Ext4 > define flags that aren't in linux/fs.h, so translation in the kernel > may be a necessity (or, possibly, we provide the filesystem type too). And we now also have FS_IOC_FSGETXATTR that extends the flags and information userspace can get from filesystems. It makes little sense to now add xstat() and not add everything this interface also exports... > The defined bits in request_mask and stx_mask are: > > STATX_TYPE Want/got stx_mode & S_IFMT > STATX_MODE Want/got stx_mode & ~S_IFMT > STATX_NLINK Want/got stx_nlink > STATX_UID Want/got stx_uid > STATX_GID Want/got stx_gid > STATX_ATIME Want/got stx_atime{,_ns} > STATX_MTIME Want/got stx_mtime{,_ns} > STATX_CTIME Want/got stx_ctime{,_ns} > STATX_INO Want/got stx_ino > STATX_SIZE Want/got stx_size > STATX_BLOCKS Want/got stx_blocks > STATX_BASIC_STATS [The stuff in the normal stat struct] > STATX_BTIME Want/got stx_btime{,_ns} > STATX_VERSION Want/got stx_version > STATX_ALL [All currently available stuff] What happens when an application uses STATX_ALL from a future kernel that defines more flags than are initially supported, and that application then is run on a kernel that onyl supports the initial fields? > stx_btime is the file creation time; stx_version is the data version number > (i_version); stx_mask is a bitmask indicating the data provided; and > __spares*[] are where as-yet undefined fields can be placed. > > Time fields are split into separate seconds and nanoseconds fields to make > packing easier and the granularities can be queried with the filesystem > info system call. Note that times will be negative if before 1970; in such > a case, the nanosecond fields will also be negative if not zero. So what happens in ten years time when we want to support femptosecond resolution in the timestamp interface? We've got to change everything to 64 bit? Shouldn't we just make everything timestamp related 64 bit? > > The bits defined in the stx_attributes field convey information about a > file, how it is accessed, where it is and what it does. The following > attributes map to FS_*_FL flags and are the same numerical value: Please isolate the new interface flags completely from the FS_*_FL values. We should not repeat the mistake of tying values derived from filesystem specific on-disk values to a user interface. > STATX_ATTR_COMPRESSED File is compressed by the fs > STATX_ATTR_IMMUTABLE File is marked immutable > STATX_ATTR_APPEND File is append-only > STATX_ATTR_NODUMP File is not to be dumped > STATX_ATTR_ENCRYPTED File requires key to decrypt in fs > > The supported flags are listed by: > > STATX_ATTR_FS_IOC_FLAGS Again, we have many more common and extended flags than this. NOATIME and SYNC are two that immediately come to mind as generic flags that should be in this... > > [Are any other IOC flags of sufficient general interest to be exposed > through this interface?] > > New flags include: > > STATX_ATTR_NONUNIX_OWNERSHIP File doesn't have Unixy ownership > STATX_ATTR_HAS_ACL File has an ACL So statx will require us to do ACL lookups? i.e. instead of just reading the inode to get the information, we'll also have to do extended attribute lookups? That's potentially very expensive if the extended attribute is not stored in the inode.... > STATX_ATTR_KERNEL_API File is kernel API (eg: procfs/sysfs) > STATX_ATTR_REMOTE File is remote and needs network > STATX_ATTR_FABRICATED File was made up by fs Every file is fabricated by a filesystem :P Perhaps you're wanting "virtual file" because it is has no physical presence? > Fields in struct statx come in a number of classes: > > (0) stx_dev_*, stx_blksize. > > These are local system information and are always available. What does stx_blksize actually mean? It's completely ambiguous in stat() because we don't actually report the physical block size here - we report the "minimum unit of efficient IO" that we expect applications to use. Please define :P > ======= > TESTING > ======= > > The following test program can be used to test the statx system call: > > samples/statx/test-statx.c > > Just compile and run, passing it paths to the files you want to examine. > The file is built automatically if CONFIG_SAMPLES is enabled. Can we get xfstests written to exercise and validate all this functionality, please? I'd suggest that adding xfs_io support for the statx syscall would be far more useful for xfstests than a standalone test program, too. We already have equivalent stat() functionality in xfs_io and that's used quite a bit in xfstests.... Cheers, Dave. -- Dave Chinner david@fromorbit.com