Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753007AbcKSWni (ORCPT ); Sat, 19 Nov 2016 17:43:38 -0500 Received: from ipmail04.adl6.internode.on.net ([150.101.137.141]:2486 "EHLO ipmail04.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752932AbcKSWnh (ORCPT ); Sat, 19 Nov 2016 17:43:37 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgI5AHHUMFh5LIq7EGdsb2JhbABeHAEBBAEBCgEBgyoOAQEBAQEfgViGdJw0AQEBAQEBBoEcjCWGO4QThhsCAgEBAoF8VAECAQEBAQECBgEBAQEBAQEBN0VCAQQNhBQBAQEDATocIwULCAMOCgklDwUlAwcaE4hlB61Wi0sBAQEBBgIBJB6FVIUkhCaDVIISHgWUZYVokGyQMo1fhAuBSBMMg0UfgXEqNIIbgmhSgjsBAQE Date: Sun, 20 Nov 2016 09:43:31 +1100 From: Dave Chinner To: David Howells Cc: Andreas Dilger , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/4] statx: Add a system call to make enhanced file info available Message-ID: <20161119224331.GE31101@dastard> References: <20161118220744.GC31101@dastard> <147938969703.13574.10295364502230379833.stgit@warthog.procyon.org.uk> <147938970382.13574.11581172952175034619.stgit@warthog.procyon.org.uk> <20161117234047.GE28177@dastard> <11317.1479509642@warthog.procyon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <11317.1479509642@warthog.procyon.org.uk> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3874 Lines: 107 On Fri, Nov 18, 2016 at 10:54:02PM +0000, David Howells wrote: > Dave Chinner wrote: > > > And when we start thinking in those timeframes, an > > increase in timestamp resoultion of at least another 10e-3 is > > likely.... > > Is it, though? To be useful, surely you have to be able to jam quite a few > instructions into a 1ns block, including memory accesses. > > Rather than providing: > > struct timestamp { > __s64 seconds; > __s64 femtoseconds; > }; > > which would require 64-bit divisions to get nanosecond timestamps that we do > actually use, I would lean towards: > > struct timestamp { > __s64 seconds; > __s32 nanoseconds; > __s32 femtoseconds; > }; No. Just provide a 64 bit high resoultion field, and define it to contain nanoseconds. When we need higher resolution to be exported to userspace, we use a /feature flag/ to indicate that is contains something like attoseconds or the like. This also gives userspace a method of choosing the resolution it wants..... > __s32 stx_btime_ns; /* File creation time (ns part) */ > __s32 stx_ctime_ns; /* Last attribute change time (ns part) */ > __s32 stx_mtime_ns; /* Last data modification time (ns part) */ > > and then add: > > __s32 stx_atime_fs; /* Last access time (fs part) */ > __s32 stx_btime_fs; /* File creation time (fs part) */ > __s32 stx_ctime_fs; /* Last attribute change time (fs part) */ > __s32 stx_mtime_fs; /* Last data modification time (fs part) */ > > later. Which burns a significant part of the spare space in the structure. > If we *really* do want to allow for atto- or femto- second resolution > timestamps (and you've still not entirely convinced me that it's going to be > necessary - the speed of signal propagation still has an ungetroundable > limit), then we could stick the space in now - but I think it's likely to > remain dead space. We don't really care if the strucuture 250 bytes or 300 bytes, it's not going to have any significant impact on memory usage or performance. We should just suck it up now and future proof the timestamp interface, as history has proven repeatedly that we suck as making our time/timestamp interfaces future proof.... > > > Using the existing FS_*_FL flags as initial values is not worse than > > > starting with any other arbitrary values for the flags. > > > > Except it starts with a sparse set of flags for no good reason. > > Actually, a very good reason. You can map those flags, on ext4 at least, with > a load, an AND and an OR. Three instructions[*]. If the bits don't > correspond, it gets more expensive (4-5 instructions per bit + 1). Two words: premature optimisation. Every other filesystem has to map their own internal flags to the user interface flags, so why should we make the user interface harder to understand and maintain just to allow a special snowflake optimisation for /only one filesystem/? There's a worse problem than that, though - we can't add certain flags to the userspace API because the /conflict with ext4 on-disk flags/ that aren't exported to userspace and so to work around that we have this shitty hack to define what parts of the flag space contain flags that the userspace API can interact with: #define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */ #define FS_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */ These mask out the ext2/3/4 flags that should not be visible to userspace and/or not be modifiable by userspace. Having to maintain crap like this is a direct result of exporting on-disk flag values to userspace APIs. *Let's not repeat the mistakes of the past* in new interfaces. Cheers, Dave. > > [*] Leastways, it *should* be three instructions, but gcc fails to optimise it > correctly. I have a bz logged for this. > > David > -- Dave Chinner david@fromorbit.com