From: Linus Torvalds
Date: Thu, 30 Mar 2017 11:07:56 -0700
Subject: Re: Apparent backward time travel in timestamps on file creation
To: David Howells
Cc: Thomas Gleixner, John Stultz, Linux Kernel Mailing List, linux-fsdevel
In-Reply-To: <22214.1490895007@warthog.procyon.org.uk>

On Thu, Mar 30, 2017 at 10:30 AM, David Howells wrote:
>
> I've been writing a testcase for xfstests to test statx.  However, it's turned
> up what I think is a bug in the kernel's time-tracking system.  If I do:
>
>         date +%s.%N
>         touch foo
>         dump-timestamps foo
>
> such that foo is created, sometimes the atime, mtime and ctime timestamps on
> foo will be *before* the time printed by 'date'.

I don't think our filesystem timestamps have ever been equivalent to
"gettimeofday()".

The "gettimeofday()" times are actually fairly expensive to calculate
(although we've optimized that code heavily, because it's a very
common system call in many loads).

We try to give gettimeofday() much higher precision than any other
time in the system: it not only participates in all the NTP stuff, we
also actually read the hardware time register (hopefully the TSC, but
it can be any time source) and interpolate that very carefully to give
a high-quality clock value that is *much* higher precision than the
timer tick.
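
The reproduction quoted above can be sketched in C roughly as follows. This is a minimal sketch, not David's actual xfstests case: the helper names `ts_to_ns` and `creation_lag_ns` are made up for illustration, and it assumes a Linux/POSIX system where `clock_gettime(CLOCK_REALTIME)` reports the same wall clock that gettimeofday() and 'date' read.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Convert a timespec to nanoseconds since the epoch. */
static int64_t ts_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: read the wall clock, then create `path` and read
 * back its mtime.  On success returns 0 and stores (clock - mtime) in
 * *lag_ns.  A positive value means the file's timestamp is earlier than
 * the clock reading that was taken before the file was created.
 */
static int creation_lag_ns(const char *path, int64_t *lag_ns)
{
    struct timespec before;
    struct stat st;
    int fd;

    unlink(path);                 /* make sure open() really creates it */

    if (clock_gettime(CLOCK_REALTIME, &before) != 0)
        return -1;

    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    close(fd);
    unlink(path);                 /* clean up the probe file */

    *lag_ns = ts_to_ns(before) - ts_to_ns(st.st_mtim);
    return 0;
}
```

Calling `creation_lag_ns()` in a loop from a small main() and printing the result should, on a kernel that behaves as described in this thread, sometimes show a positive lag. How large it gets depends on the timer tick and on the filesystem's timestamp granularity.
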

In contrast, the filesystem times are based on CURRENT_TIME (and the
modern variation: "current_time(inode)") that is a completely
different animal.

It truncates the time to the inode time granularity, yes, but there's
a much more fundamental thing going on: instead of using that exact
gettimeofday() thing, it just uses "current_kernel_time()".

And current_kernel_time() doesn't do *any* of the fancy "interpolate
high-quality hardware clock" stuff. No, it just uses "xtime" that is
updated by the timer interrupt (ok, that's slightly simplified,
'xtime' is no longer just a single global, we have a per-timekeeping
one these days, but it's historically and conceptually what we're
doing).

The difference can be quite noticeable - basically the
"gettimeofday()" time will interpolate within timer ticks, while
"xtime" is just the truncated "time at timer tick" value _without_ the
correction.

And none of this is new. It goes back to forever. Any Linux version
that had gettimeofday() with the finer-grained offset. Not 0.01 (I
checked).

[ Goes digging. Ok, the gettimeofday timer interpolation was added in
April, 1993 in version 0.99.9 ]

So it's been like this for about 24 years now.

                Linus
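
The gap between the interpolated clock and the tick-updated one can also be seen from userspace without touching a filesystem. The sketch below assumes that the Linux-specific `CLOCK_REALTIME_COARSE` clock is a fair stand-in for the tick-updated value described above. That is an assumption made for illustration, not something stated in this thread, and the helper names `ts_to_ns` and `coarse_lag_ns` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds since the epoch. */
static int64_t ts_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: read the interpolated wall clock first and the
 * coarse (tick-updated) wall clock second.  On success returns 0 and
 * stores (fine - coarse) in *lag_ns.  A positive value means the later
 * reading reports an earlier time, because the coarse clock does not
 * interpolate within the current timer tick.
 */
static int coarse_lag_ns(int64_t *lag_ns)
{
    struct timespec fine, coarse;

    if (clock_gettime(CLOCK_REALTIME, &fine) != 0)
        return -1;
    if (clock_gettime(CLOCK_REALTIME_COARSE, &coarse) != 0)
        return -1;

    *lag_ns = ts_to_ns(fine) - ts_to_ns(coarse);
    return 0;
}

/*
 * Report the resolution of the coarse clock in nanoseconds, or -1 on
 * error.  This is typically the length of one timer tick.
 */
static int64_t coarse_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME_COARSE, &res) != 0)
        return -1;
    return ts_to_ns(res);
}
```

If the assumption holds, `coarse_lag_ns()` should usually report a lag between zero and roughly one tick, as given by `coarse_resolution_ns()`. It can come out slightly negative when a timer tick lands between the two reads.
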