Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751574AbbHQUlO (ORCPT ); Mon, 17 Aug 2015 16:41:14 -0400 Received: from mail-pa0-f45.google.com ([209.85.220.45]:33485 "EHLO mail-pa0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751459AbbHQUlL (ORCPT ); Mon, 17 Aug 2015 16:41:11 -0400 From: John Stultz To: lkml Cc: Karsten Blees , Prarit Bhargava , Richard Cochran , Ingo Molnar , Thomas Gleixner , Karsten Blees , John Stultz Subject: [PATCH 2/9] time: Fix nanosecond file time rounding in timespec_trunc() Date: Mon, 17 Aug 2015 13:40:56 -0700 Message-Id: <1439844063-7957-3-git-send-email-john.stultz@linaro.org> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1439844063-7957-1-git-send-email-john.stultz@linaro.org> References: <1439844063-7957-1-git-send-email-john.stultz@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3776 Lines: 100 From: Karsten Blees timespec_trunc() avoids rounding if granularity <= nanoseconds-per-jiffie (or TICK_NSEC). This optimization assumes that: 1. current_kernel_time().tv_nsec is already rounded to TICK_NSEC (i.e. with HZ=1000 you'd get 1000000, 2000000, 3000000... but never 1000001). This is no longer true (probably since hrtimers introduced in 2.6.16). 2. TICK_NSEC is evenly divisible by all possible granularities. This may be true for HZ=100, 250, 1000, but obviously not for HZ=300 / TICK_NSEC=3333333 (introduced in 2.6.20). Thus, sub-second portions of in-core file times are not rounded to on-disk granularity. I.e. file times may change when the inode is re-read from disk or when the file system is remounted. This affects all file systems with file time granularities > 1 ns and < 1s, e.g. CEPH (1000 ns), UDF (1000 ns), CIFS (100 ns), NTFS (100 ns) and FUSE (configurable from user mode via struct fuse_init_out.time_gran). Steps to reproduce with e.g. UDF: $ dd if=/dev/zero of=udfdisk count=10000 && mkudffs udfdisk $ mkdir udf && mount udfdisk udf $ touch udf/test && stat -c %y udf/test 2015-06-09 10:22:56.130006767 +0200 $ umount udf && mount udfdisk udf $ stat -c %y udf/test 2015-06-09 10:22:56.130006000 +0200 Remounting truncates the mtime to 1 µs. Fix the rounding in timespec_trunc() and update the documentation. timespec_trunc() is exclusively used to calculate inode's [acm]time (mostly via current_fs_time()), and always with super_block.s_time_gran as second argument. So this can safely be changed without side effects. Note: This does _not_ fix the issue for FAT's 2 second mtime resolution, as super_block.s_time_gran isn't prepared to handle different ctime / mtime / atime resolutions nor resolutions > 1 second. Cc: Prarit Bhargava Cc: Richard Cochran Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Karsten Blees Signed-off-by: John Stultz --- kernel/time/time.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/kernel/time/time.c b/kernel/time/time.c index 85d5bb1..34dbd42 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -287,26 +287,20 @@ EXPORT_SYMBOL(jiffies_to_usecs); * @t: Timespec * @gran: Granularity in ns. * - * Truncate a timespec to a granularity. gran must be smaller than a second. - * Always rounds down. - * - * This function should be only used for timestamps returned by - * current_kernel_time() or CURRENT_TIME, not with do_gettimeofday() because - * it doesn't handle the better resolution of the latter. + * Truncate a timespec to a granularity. Always rounds down. gran must + * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns). */ struct timespec timespec_trunc(struct timespec t, unsigned gran) { - /* - * Division is pretty slow so avoid it for common cases. - * Currently current_kernel_time() never returns better than - * jiffies resolution. Exploit that. - */ - if (gran <= jiffies_to_usecs(1) * 1000) { + /* Avoid division in the common cases 1 ns and 1 s. */ + if (gran == 1) { /* nothing */ - } else if (gran == 1000000000) { + } else if (gran == NSEC_PER_SEC) { t.tv_nsec = 0; - } else { + } else if (gran > 1 && gran < NSEC_PER_SEC) { t.tv_nsec -= t.tv_nsec % gran; + } else { + WARN(1, "illegal file time granularity: %u", gran); } return t; } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/