2010-08-31 08:43:39

by Satoru Takeuchi

Subject: [RFC][PATCH] make file's timestamp more accurate

Hi,

Linux has supported nanosecond-resolution file timestamps since 2.5.48.
However, the current file timestamp is obtained from current_fs_time() and
is only updated once per tick, so it does not have true nanosecond accuracy.
In addition, a gettimeofday() call made before a file operation that updates
{a,c,m}time can return a time later than the file's timestamp, because
gettimeofday() and the file timestamp use different time sources.
Certain kinds of applications could be corrupted by this problem.
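
The ordering problem described above can be observed from userspace. The
following is a minimal sketch, not part of the patch: the helper names
sample_before_and_mtime() and timespec_after() are made up for illustration.
It reads gettimeofday(), then writes to a file and reads back st_mtim.
On a kernel with tick-granular file timestamps, the "before" value may
compare later than the mtime.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read gettimeofday(), then write to `path` so the
 * kernel updates its mtime, and hand both values back to the caller.
 * Returns 0 on success, -1 if any system call fails.
 */
static int sample_before_and_mtime(const char *path,
                                   struct timespec *before,
                                   struct timespec *mtime)
{
    struct timeval tv;
    struct stat st;
    int fd;

    assert(path && before && mtime);

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    before->tv_sec = tv.tv_sec;
    before->tv_nsec = tv.tv_usec * 1000L;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "x", 1) != 1 || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);

    *mtime = st.st_mtim;
    return 0;
}

/* Returns 1 if a is strictly later than b, 0 otherwise. */
static int timespec_after(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_nsec > b->tv_nsec;
}
```

If timespec_after(&before, &mtime) returns 1, the gettimeofday() reading
taken before the write is later than the resulting mtime. Whether that
happens on a given run depends on the kernel and on where the write falls
within the tick.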

I have attached the simplest possible patch that fixes this problem. However,
it has several problems and I am not saying it can be applied as is.
The two biggest problems are the following:

- It would cause a performance regression, especially on
systems without a usable TSC.
- Is gettimeofday()'s monotonicity reliable on all systems?
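
The performance concern can be approximated from userspace by timing the
fine-grained clock against the tick-granular one. This is only a rough proxy
for the in-kernel cost of getnstimeofday() versus current_kernel_time(), and
the helper name ns_per_call() is made up for this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: average cost in nanoseconds of one
 * clock_gettime(clk, ...) call, measured over `iterations` calls.
 * Returns a negative value if the measurement clock cannot be read.
 */
static double ns_per_call(clockid_t clk, long iterations)
{
    struct timespec start, end, scratch;
    long i;

    assert(iterations > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (i = 0; i < iterations; i++)
        clock_gettime(clk, &scratch);
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return ((double)(end.tv_sec - start.tv_sec) * 1e9 +
            (double)(end.tv_nsec - start.tv_nsec)) / (double)iterations;
}
```

Comparing ns_per_call(CLOCK_REALTIME, n) with
ns_per_call(CLOCK_REALTIME_COARSE, n) gives a feel for the difference. The
gap is expected to be larger when the clocksource is slow to read, which is
the non-TSC case raised above.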

The related discussion:
http://lkml.org/lkml/2010/7/13/443

Does anybody have a good idea? Should it be tunable, for example?

Thanks,
Satoru

Index: linux-2.6.36-rc3/kernel/time.c
===================================================================
--- linux-2.6.36-rc3.orig/kernel/time.c 2010-08-31 16:07:43.000000000 +0900
+++ linux-2.6.36-rc3/kernel/time.c 2010-08-31 16:08:11.000000000 +0900
@@ -227,7 +227,8 @@ SYSCALL_DEFINE1(adjtimex, struct timex _
  */
 struct timespec current_fs_time(struct super_block *sb)
 {
-	struct timespec now = current_kernel_time();
+	struct timespec now;
+	getnstimeofday(&now);
 	return timespec_trunc(now, sb->s_time_gran);
 }
 EXPORT_SYMBOL(current_fs_time);
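
Both versions of current_fs_time() finish by truncating the time to the
filesystem's granularity, sb->s_time_gran. The following userspace sketch
shows what that truncation does. It is modelled on the behaviour of
timespec_trunc() as I understand it and is not copied from the kernel
source. The name trunc_to_granularity() is made up.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical userspace model of truncating a timestamp to a granularity
 * given in nanoseconds (1 = keep full resolution, NSEC_PER_SEC = whole
 * seconds only).
 */
static struct timespec trunc_to_granularity(struct timespec t, long gran)
{
    assert(gran >= 1 && gran <= NSEC_PER_SEC);

    if (gran == NSEC_PER_SEC)
        t.tv_nsec = 0;                  /* one-second filesystems */
    else if (gran > 1)
        t.tv_nsec -= t.tv_nsec % gran;  /* round down to a multiple */
    /* gran == 1: nothing to do, full nanosecond resolution is kept */
    return t;
}
```

With a granularity of 1 the result is only as fine as the time source that
was read, which is why the patch changes the source and leaves the
truncation alone.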



2010-09-09 16:23:37

by john stultz

Subject: Re: [RFC][PATCH] make file's timestamp more accurate

On Tue, 2010-08-31 at 17:42 +0900, Satoru Takeuchi wrote:
> Linux has supported nanosecond-resolution file timestamps since 2.5.48.
> However, the current file timestamp is obtained from current_fs_time() and
> is only updated once per tick, so it does not have true nanosecond accuracy.
> In addition, a gettimeofday() call made before a file operation that updates
> {a,c,m}time can return a time later than the file's timestamp, because
> gettimeofday() and the file timestamp use different time sources.
> Certain kinds of applications could be corrupted by this problem.

Applications mixing gettimeofday() and filesystem timestamps can currently
avoid this issue by using clock_gettime(CLOCK_REALTIME_COARSE,...), which
returns tick-granular timestamps, the same as the filesystem timestamps.
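
A minimal sketch of that workaround, assuming a Linux system that provides
CLOCK_REALTIME_COARSE. The helper names coarse_now() and
coarse_resolution_ns() are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: read the tick-granular wall clock, which is meant
 * to be comparable with file timestamps. Returns 0 on success, -1 on error.
 */
static int coarse_now(struct timespec *ts)
{
    assert(ts);
    return clock_gettime(CLOCK_REALTIME_COARSE, ts);
}

/* Resolution of the coarse clock in nanoseconds, or -1 on failure. */
static long coarse_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME_COARSE, &res) != 0)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}
```

An application would call coarse_now() where it previously called
gettimeofday(), and compare that value with st_mtim. The price is that the
application's own timestamps become tick granular too.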

However, Patrick LoPresti (cc'ed) was working on a similar issue here
connected to nfs.

> I have attached the simplest possible patch that fixes this problem. However,
> it has several problems and I am not saying it can be applied as is.
> The two biggest problems are the following:
>
> - It would cause a performance regression, especially on
> systems without a usable TSC.
> - Is gettimeofday()'s monotonicity reliable on all systems?

It *should* be. But hardware issues can cause trouble here.

> The related discussion:
> http://lkml.org/lkml/2010/7/13/443
>
> Does anybody have a good idea? Should it be tunable, for example?

I think the discussion from earlier suggested that this be configurable
from a mount option so the performance/granularity trade-off can be
managed there.

A potential pothole on the road here, though: doing this on a per-mount
basis could make it difficult for apps that use CLOCK_REALTIME_COARSE to
function if the fs granularity is increased. Some sort of CLOCK_REALTIME_FS
could be introduced to map to whichever granularity is right, but that can
only be done on a global basis.
Hrm...

thanks
-john


2010-09-10 05:54:11

by Satoru Takeuchi

Subject: Re: [RFC][PATCH] make file's timestamp more accurate

Hi John,

(2010/09/10 1:23), john stultz wrote:
> On Tue, 2010-08-31 at 17:42 +0900, Satoru Takeuchi wrote:
>> Linux has supported nanosecond-resolution file timestamps since 2.5.48.
>> However, the current file timestamp is obtained from current_fs_time() and
>> is only updated once per tick, so it does not have true nanosecond accuracy.
>> In addition, a gettimeofday() call made before a file operation that updates
>> {a,c,m}time can return a time later than the file's timestamp, because
>> gettimeofday() and the file timestamp use different time sources.
>> Certain kinds of applications could be corrupted by this problem.
>
> Applications mixing gettimeofday() and filesystem timestamps can currently
> avoid this issue by using clock_gettime(CLOCK_REALTIME_COARSE,...), which
> returns tick-granular timestamps, the same as the filesystem timestamps.
>
> However, Patrick LoPresti (cc'ed) was working on a similar issue here
> connected to nfs.

Thank you for your comment.

Is it the following one? I overlooked it ;-(

http://lkml.org/lkml/2010/8/13/199

> Consider the following "revision 2" of my proposal:
>
> 1) Add a function pointer "current_fs_time" to struct super_block.
>
> 2) Replace all calls of the form:
>
> current_fs_time(sb);
>
> with
>
> sb->current_fs_time(sb);
>
> 3) Arrange for the default value to point to the current implementation.
>
> These first three could be one patch. They change no functionality;
> they just enable the next step.
>
> Finally:
>
> 4) Add a mount option to cause sb->current_fs_time(sb) to use the
> hi-res implementation.

I like Patrick's idea. Patrick, are you working on this patch now?
If so, I'll wait for you; if not, I'll try to implement it.
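
The four steps quoted above could look roughly like the following userspace
model. It is a sketch only: struct toy_super_block is a stand-in and not the
kernel's struct super_block, the toy_* names are made up, the clock_gettime()
calls stand in for current_kernel_time() and getnstimeofday(), and the
`hires` flag stands in for a mount option that does not exist yet.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Toy stand-in for the kernel's struct super_block; NOT the real layout. */
struct toy_super_block {
    long s_time_gran;  /* timestamp granularity in nanoseconds */
    /* Step 1: per-superblock time source. */
    struct timespec (*current_fs_time)(struct toy_super_block *sb);
};

/* Round tv_nsec down to a multiple of gran. */
static struct timespec toy_trunc(struct timespec t, long gran)
{
    if (gran > 1)
        t.tv_nsec -= t.tv_nsec % gran;
    return t;
}

/* Step 3: default, tick-granular source (models the current behaviour). */
static struct timespec toy_fs_time_coarse(struct toy_super_block *sb)
{
    struct timespec now;

    clock_gettime(CLOCK_REALTIME_COARSE, &now);
    return toy_trunc(now, sb->s_time_gran);
}

/* Step 4: hi-res source, selected by the hypothetical mount option. */
static struct timespec toy_fs_time_hires(struct toy_super_block *sb)
{
    struct timespec now;

    clock_gettime(CLOCK_REALTIME, &now);
    return toy_trunc(now, sb->s_time_gran);
}

/* Models mount-time setup; `hires` stands for the proposed mount option. */
static void toy_mount(struct toy_super_block *sb, long gran, bool hires)
{
    assert(sb && gran >= 1);
    sb->s_time_gran = gran;
    sb->current_fs_time = hires ? toy_fs_time_hires : toy_fs_time_coarse;
}
```

Callers then use sb->current_fs_time(sb), as in step 2, and only the mounts
that ask for the hi-res source pay for it.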

Thanks,
Satoru
