2009-12-14 21:28:23

by Petr Titěra

[permalink] [raw]
Subject: Wrong atime on recent kernels

Hello,

I see some strange file modification times recently. It seems to
me that in some situations, kernel allows to set nanoseconds part of
file access, modification or change time to 100000000 ns. Problem seems
to be in some generic part of kernel because I see it on several
different filesysytems (ext4 and nilf2). These is I've got during my
testing on kernel 2.6.32-tip-08309-gad8e75a.

File: `./Documentation/dvb/contributors.txt'
Size: 3035 Blocks: 8 IO Block: 4096 regular file
Device: fe04h/65028d Inode: 818 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-14 10:29:04.1000000000 +0100
Modify: 2009-12-14 10:29:04.1000000000 +0100
Change: 2009-12-14 10:29:04.1000000000 +0100

See that all times of that file ends with 1e6 nanoseconds.

Petr Titera


2009-12-14 21:41:45

by Andi Kleen

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

Petr Tit?ra <[email protected]> writes:

> Hello,
>
> I see some strange file modification times recently. It seems to
> me that in some situations, kernel allows to set nanoseconds part of
> file access, modification or change time to 100000000 ns. Problem
> seems to be in some generic part of kernel because I see it on several
> different filesysytems (ext4 and nilf2). These is I've got during my
> testing on kernel 2.6.32-tip-08309-gad8e75a.

Do you know which kernel was the last one to be not showing this?

Is 2.6.32 plain ok?

-Andi

--
[email protected] -- Speaking for myself only.

2009-12-14 21:45:51

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

2009/12/14 Petr Tit?ra <[email protected]>:
> Hello,
>
> ? ? ?I see some strange file modification times recently. It seems to me
> that in some situations, kernel allows to set nanoseconds part of ?file
> access, modification or change time ?to 100000000 ns. Problem seems to be in
> some generic part of kernel because I see it on several different
> filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
> ?2.6.32-tip-08309-gad8e75a.
>
> ?File: `./Documentation/dvb/contributors.txt'
> ?Size: 3035 ? ? ? ? ? ?Blocks: 8 ? ? ? ? ?IO Block: 4096 ? regular file
> Device: fe04h/65028d ? ?Inode: 818 ? ? ? ? Links: 1
> Access: (0644/-rw-r--r--) ?Uid: ( ? ?0/ ? ?root) ? Gid: ( ? ?0/ ? ?root)
> Access: 2009-12-14 10:29:04.1000000000 +0100
> Modify: 2009-12-14 10:29:04.1000000000 +0100
> Change: 2009-12-14 10:29:04.1000000000 +0100
>
> See that all times of that file ends with 1e6 nanoseconds.

Hrmmm. Does reverting this change solve it?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=patch;h=7bc7d637452383d56ba4368d4336b0dde1bb476d

thanks
-john

2009-12-14 22:00:24

by Petr Titěra

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

Andi Kleen napsal(a):
> Petr Tit?ra <[email protected]> writes:
>
>
>> Hello,
>>
>> I see some strange file modification times recently. It seems to
>> me that in some situations, kernel allows to set nanoseconds part of
>> file access, modification or change time to 100000000 ns. Problem
>> seems to be in some generic part of kernel because I see it on several
>> different filesysytems (ext4 and nilf2). These is I've got during my
>> testing on kernel 2.6.32-tip-08309-gad8e75a.
>>
>
> Do you know which kernel was the last one to be not showing this?
>
>
To Be honest I'm not sure. I remember that I've seen it for the first
time when I tried to install kernel-2.6.32-0.65.rc8.git5.fc13.i686
package from Fedora Rawhide and that was on 7th December. The only
kernels I could be running at that time are:
2.6.32-rc8-tip-02151-gf00b740, 2.6.32-tip-02531-gbbbe9f2 or
2.6.32-tip-02695-g4dc2ffc.

> Is 2.6.32 plain ok?
>
>
Will test.

> -Andi
>
>

Petr

2009-12-17 01:24:50

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

On Wed, 2009-12-16 at 21:55 +0100, Petr Titěra wrote:
> john stultz napsal(a):
> > 2009/12/14 Petr Titěra <[email protected]>:
> >
> > > Hello,
> > >
> > > I see some strange file modification times recently. It seems to me
> > > that in some situations, kernel allows to set nanoseconds part of file
> > > access, modification or change time to 100000000 ns. Problem seems to be in
> > > some generic part of kernel because I see it on several different
> > > filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
> > > 2.6.32-tip-08309-gad8e75a.
> > >
> > > File: `./Documentation/dvb/contributors.txt'
> > > Size: 3035 Blocks: 8 IO Block: 4096 regular file
> > > Device: fe04h/65028d Inode: 818 Links: 1
> > > Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> > > Access: 2009-12-14 10:29:04.1000000000 +0100
> > > Modify: 2009-12-14 10:29:04.1000000000 +0100
> > > Change: 2009-12-14 10:29:04.1000000000 +0100
> > >
> > > See that all times of that file ends with 1e6 nanoseconds.
> > >
> >
> > Hrmmm. Does reverting this change solve it?
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=patch;h=7bc7d637452383d56ba4368d4336b0dde1bb476d
> >
> > thanks
> > -john
> >
> Hello,
>
> I did not test reverting this patch yet, because I did not find
> reliable way how to reproduce these strange modify times. But as I
> read your description. Would it be possible that if there would be bug
> in your patch i would be observer on mostly quiet system? I'm asking
> because full day of testing of the system under load did not produce
> any result, but then when I tried to run "find / | xargs stat" on idle
> system I've got several new instances of wrong access time (filesystem
> is mounted without noatime)

Yes, its possible it would be more likely to be seen on an idle system.

Can you describe your system to me? Is it x86? x86_64? Could you send
your .config to me?

Please try to revert the patch, and I'll see if I can reproduce anything
similar on my systems.

thanks
-john

2009-12-17 03:26:36

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

On Wed, 2009-12-16 at 21:55 +0100, Petr Titěra wrote:
> john stultz napsal(a):
> > 2009/12/14 Petr Titěra <[email protected]>:
> >
> > > Hello,
> > >
> > > I see some strange file modification times recently. It seems to me
> > > that in some situations, kernel allows to set nanoseconds part of file
> > > access, modification or change time to 100000000 ns. Problem seems to be in
> > > some generic part of kernel because I see it on several different
> > > filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
> > > 2.6.32-tip-08309-gad8e75a.
> > >
> > > File: `./Documentation/dvb/contributors.txt'
> > > Size: 3035 Blocks: 8 IO Block: 4096 regular file
> > > Device: fe04h/65028d Inode: 818 Links: 1
> > > Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> > > Access: 2009-12-14 10:29:04.1000000000 +0100
> > > Modify: 2009-12-14 10:29:04.1000000000 +0100
> > > Change: 2009-12-14 10:29:04.1000000000 +0100
> > >
> > > See that all times of that file ends with 1e6 nanoseconds.
>
> I did not test reverting this patch yet, because I did not find
> reliable way how to reproduce these strange modify times. But as I
> read your description. Would it be possible that if there would be bug
> in your patch i would be observer on mostly quiet system? I'm asking
> because full day of testing of the system under load did not produce
> any result, but then when I tried to run "find / | xargs stat" on idle
> system I've got several new instances of wrong access time (filesystem
> is mounted without noatime)

Another quick question:

What is the normal behavior you see when this issue is not cropping up?

Do you normally see all 0's in the ns field? Or do you expect to see an
actual ns value?

I'm asking as all the filesystems I've played with have all zeros, so
I'm not sure if I need to try a different filesystem (I tried ext4, but
it was with a disk that was originally ext3), or if the issue is just
the stray 1sec value in the ns field.

thanks
-john

2009-12-17 11:05:01

by Petr Titěra

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

john stultz napsal(a):
> On Wed, 2009-12-16 at 21:55 +0100, Petr Titěra wrote:
>
>> john stultz napsal(a):
>>
>>> 2009/12/14 Petr Titěra <[email protected]>:
>>>
>>>
>>>> Hello,
>>>>
>>>> I see some strange file modification times recently. It seems to me
>>>> that in some situations, kernel allows to set nanoseconds part of file
>>>> access, modification or change time to 100000000 ns. Problem seems to be in
>>>> some generic part of kernel because I see it on several different
>>>> filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
>>>> 2.6.32-tip-08309-gad8e75a.
>>>>
>>>> File: `./Documentation/dvb/contributors.txt'
>>>> Size: 3035 Blocks: 8 IO Block: 4096 regular file
>>>> Device: fe04h/65028d Inode: 818 Links: 1
>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>> Access: 2009-12-14 10:29:04.1000000000 +0100
>>>> Modify: 2009-12-14 10:29:04.1000000000 +0100
>>>> Change: 2009-12-14 10:29:04.1000000000 +0100
>>>>
>>>> See that all times of that file ends with 1e6 nanoseconds.
>>>>
>> I did not test reverting this patch yet, because I did not find
>> reliable way how to reproduce these strange modify times. But as I
>> read your description. Would it be possible that if there would be bug
>> in your patch i would be observer on mostly quiet system? I'm asking
>> because full day of testing of the system under load did not produce
>> any result, but then when I tried to run "find / | xargs stat" on idle
>> system I've got several new instances of wrong access time (filesystem
>> is mounted without noatime)
>>
>
> Another quick question:
>
> What is the normal behavior you see when this issue is not cropping up?
>
> Do you normally see all 0's in the ns field? Or do you expect to see an
> actual ns value?
>
>
Sorry to reply again. Previous message did not get to list:

I see values which seems to be ns times there. My root filesystem is
ext4 too (recently I do not remeber if I formated it from scratch when I
reinstalled that system) but I see this happen on other filesystems too

Root filesystem (ext4 may be converted from ext3)

File: `/etc/sysconfig'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fe00h/65024d Inode: 65282 Links: 7
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-16 21:14:00.172000000 +0100
Modify: 2009-12-12 11:01:48.1000000000 +0100
Change: 2009-12-12 11:01:48.1000000000 +0100
File: `/etc/sysconfig/prelink'
Size: 1459 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 22706 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-14 10:27:46.912000002 +0100
Modify: 2004-11-23 11:43:08.000000000 +0100
Change: 2009-12-08 22:57:24.656000002 +0100
File: `/etc/sysconfig/i18n'
Size: 47 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 48962 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2010-08-27 18:07:21.500013018 +0200
Modify: 2009-06-22 23:33:43.113581313 +0200
Change: 2009-06-22 23:58:39.936318201 +0200

/home (nilfs2)

File: `/home/linux-2.6/include/linux/netfilter_ipv4/ipt_tos.h'
Size: 184 Blocks: 8 IO Block: 4096 regular file
Device: fe04h/65028d Inode: 20141 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-15 18:59:33.1000000000 +0100
Modify: 2009-12-15 18:59:33.1000000000 +0100
Change: 2009-12-15 18:59:33.1000000000 +0100
File: `/home/linux-2.6/include/linux/netfilter_ipv4/ipt_ttl.h'
Size: 350 Blocks: 8 IO Block: 4096 regular file
Device: fe04h/65028d Inode: 20547 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-15 00:23:58.135760901 +0100
Modify: 2009-12-15 00:23:58.135760901 +0100
Change: 2009-12-15 00:23:58.135760901 +0100


/sources (btrfs)

File:
`/sources/linux-2.6/.git/objects/pack/pack-9aea3a0847debb83ad688214f648799fc46af3d3.pack'
Size: 6255096 Blocks: 12224 IO Block: 4096 regular file
Device: 13h/19d Inode: 2129247 Links: 1
Access: (0444/-r--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-16 21:16:09.424000000 +0100
Modify: 2009-12-16 21:17:20.1000000000 +0100
Change: 2009-12-16 21:17:21.564000000 +0100
File:
`/sources/linux-2.6/.git/objects/pack/pack-9aea3a0847debb83ad688214f648799fc46af3d3.idx'
Size: 159552 Blocks: 312 IO Block: 4096 regular file
Device: 13h/19d Inode: 2129248 Links: 1
Access: (0444/-r--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-16 21:17:21.296000000 +0100
Modify: 2009-12-16 21:17:21.324000001 +0100
Change: 2009-12-16 21:17:21.592000001 +0100


Now when I'm looking through stat /stats.file I was able to find some
really old instances of this error from October:

File: `/mnt/data/linux-2.6/.git/refs/remotes/origin'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fe00h/65024d Inode: 130953 Links: 2
Access: (0775/drwxrwxr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-16 21:21:52.776000002 +0100
Modify: 2009-10-14 07:57:03.1000000000 +0200
Change: 2009-10-14 07:57:03.1000000000 +0200
File: `/mnt/data/linux-2.6/.git/refs/remotes/origin/master'
Size: 41 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 147522 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-10-14 07:57:04.040000000 +0200
Modify: 2009-10-14 07:57:03.970000000 +0200
Change: 2009-10-14 07:57:03.1000000000 +0200

So this happened before but only recently it started to happen in places
where it hurts. I found this trange behaviour because I was unable to
create initramfs of new kernels. mkinitrd command could not copy files
and preserve their times because of timestamp validity check in cp.

I will try to revert commit you told me and will test.

Petr

> I'm asking as all the filesystems I've played with have all zeros, so
> I'm not sure if I need to try a different filesystem (I tried ext4, but
> it was with a disk that was originally ext3), or if the issue is just
> the stray 1sec value in the ns field.
>
> thanks
> -john
>
>
>
>
> __________ Informace od ESET Smart Security, verze databaze 4694 (20091216) __________
>
> Tuto zpravu proveril ESET Smart Security.
>
> http://www.eset.cz
>
>
>



__________ Informace od ESET Smart Security, verze databaze 4694 (20091216) __________

Tuto zpravu proveril ESET Smart Security.

http://www.eset.cz

2009-12-17 21:19:20

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

On Thu, 2009-12-17 at 12:04 +0100, Petr Titěra wrote:
> Now when I'm looking through stat /stats.file I was able to find some
> really old instances of this error from October:
>
> File: `/mnt/data/linux-2.6/.git/refs/remotes/origin'
> Size: 4096 Blocks: 8 IO Block: 4096 directory
> Device: fe00h/65024d Inode: 130953 Links: 2
> Access: (0775/drwxrwxr-x) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-16 21:21:52.776000002 +0100
> Modify: 2009-10-14 07:57:03.1000000000 +0200
> Change: 2009-10-14 07:57:03.1000000000 +0200
> File: `/mnt/data/linux-2.6/.git/refs/remotes/origin/master'
> Size: 41 Blocks: 8 IO Block: 4096 regular file
> Device: fe00h/65024d Inode: 147522 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-10-14 07:57:04.040000000 +0200
> Modify: 2009-10-14 07:57:03.970000000 +0200
> Change: 2009-10-14 07:57:03.1000000000 +0200
>
> So this happened before but only recently it started to happen in places
> where it hurts. I found this trange behaviour because I was unable to
> create initramfs of new kernels. mkinitrd command could not copy files
> and preserve their times because of timestamp validity check in cp.

Huh. If its really the case, then its not the change I suspect.

Ted: Have you seen anything like this? Where the sub-second field holds
a full second of time?


> I will try to revert commit you told me and will test.

Thanks, that will help greatly narrow this down.

thanks again
-john

2009-12-18 03:13:49

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

On Thu, 2009-12-17 at 12:04 +0100, Petr Titěra wrote:
> john stultz napsal(a):
> > On Wed, 2009-12-16 at 21:55 +0100, Petr Titěra wrote:
> >
> >> john stultz napsal(a):
> >>
> >>> 2009/12/14 Petr Titěra <[email protected]>:
> >>>
> >>>
> >>>> Hello,
> >>>>
> >>>> I see some strange file modification times recently. It seems to me
> >>>> that in some situations, kernel allows to set nanoseconds part of file
> >>>> access, modification or change time to 100000000 ns. Problem seems to be in
> >>>> some generic part of kernel because I see it on several different
> >>>> filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
> >>>> 2.6.32-tip-08309-gad8e75a.
> >>>>
> >>>> File: `./Documentation/dvb/contributors.txt'
> >>>> Size: 3035 Blocks: 8 IO Block: 4096 regular file
> >>>> Device: fe04h/65028d Inode: 818 Links: 1
> >>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> >>>> Access: 2009-12-14 10:29:04.1000000000 +0100
> >>>> Modify: 2009-12-14 10:29:04.1000000000 +0100
> >>>> Change: 2009-12-14 10:29:04.1000000000 +0100
> >>>>
> >>>> See that all times of that file ends with 1e6 nanoseconds.
> >>>>
> >> I did not test reverting this patch yet, because I did not find
> >> reliable way how to reproduce these strange modify times. But as I
> >> read your description. Would it be possible that if there would be bug
> >> in your patch i would be observer on mostly quiet system? I'm asking
> >> because full day of testing of the system under load did not produce
> >> any result, but then when I tried to run "find / | xargs stat" on idle
> >> system I've got several new instances of wrong access time (filesystem
> >> is mounted without noatime)
> >>
> >
> > Another quick question:
> >
> > What is the normal behavior you see when this issue is not cropping up?
> >
> > Do you normally see all 0's in the ns field? Or do you expect to see an
> > actual ns value?
> >
> >
> Sorry to reply again. Previous message did not get to list:
>
> I see values which seems to be ns times there. My root filesystem is
> ext4 too (recently I do not remeber if I formated it from scratch when I
> reinstalled that system) but I see this happen on other filesystems too
>
> Root filesystem (ext4 may be converted from ext3)
>
> File: `/etc/sysconfig'
> Size: 4096 Blocks: 8 IO Block: 4096 directory
> Device: fe00h/65024d Inode: 65282 Links: 7
> Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-16 21:14:00.172000000 +0100
> Modify: 2009-12-12 11:01:48.1000000000 +0100
> Change: 2009-12-12 11:01:48.1000000000 +0100
> File: `/etc/sysconfig/prelink'
> Size: 1459 Blocks: 8 IO Block: 4096 regular file
> Device: fe00h/65024d Inode: 22706 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-14 10:27:46.912000002 +0100
> Modify: 2004-11-23 11:43:08.000000000 +0100
> Change: 2009-12-08 22:57:24.656000002 +0100
> File: `/etc/sysconfig/i18n'
> Size: 47 Blocks: 8 IO Block: 4096 regular file
> Device: fe00h/65024d Inode: 48962 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2010-08-27 18:07:21.500013018 +0200
> Modify: 2009-06-22 23:33:43.113581313 +0200
> Change: 2009-06-22 23:58:39.936318201 +0200

So I'm not reproducing this with 2.6.33-rc1 on a fresh ext4 partition on
x68_64.

File: `virt'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 804h/2052d Inode: 1868440 Links: 3
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-17 21:22:44.692710730 -0500
Modify: 2009-12-17 20:14:40.000000000 -0500
Change: 2009-12-17 21:20:21.001915208 -0500
File: `vmlinux'
Size: 21122497 Blocks: 24136 IO Block: 4096 regular file
Device: 804h/2052d Inode: 1874435 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-17 21:22:05.381691121 -0500
Modify: 2009-12-17 21:22:05.376691754 -0500
Change: 2009-12-17 21:22:05.376691754 -0500
File: `vmlinux.o'
Size: 16701780 Blocks: 32624 IO Block: 4096 regular file
Device: 804h/2052d Inode: 1874418 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-12-17 21:22:01.138228732 -0500
Modify: 2009-12-17 21:22:01.131229619 -0500
Change: 2009-12-17 21:22:01.131229619 -0500


Let me know if you find anything that helps narrow this down.

thanks
-john


2009-12-20 22:30:23

by Petr Titěra

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

john stultz napsal(a):
> On Thu, 2009-12-17 at 12:04 +0100, Petr Tit�ra wrote:
>
>> john stultz napsal(a):
>>
>>> On Wed, 2009-12-16 at 21:55 +0100, Petr Tit�ra wrote:
>>>
>>>
>>>> john stultz napsal(a):
>>>>
>>>>
>>>>> 2009/12/14 Petr TitÄ�ra <[email protected]>:
>>>>>
>>>>>
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I see some strange file modification times recently. It seems to me
>>>>>> that in some situations, kernel allows to set nanoseconds part of file
>>>>>> access, modification or change time to 100000000 ns. Problem seems to be in
>>>>>> some generic part of kernel because I see it on several different
>>>>>> filesysytems (ext4 and nilf2). These is I've got during my testing on kernel
>>>>>> 2.6.32-tip-08309-gad8e75a.
>>>>>>
>>>>>> File: `./Documentation/dvb/contributors.txt'
>>>>>> Size: 3035 Blocks: 8 IO Block: 4096 regular file
>>>>>> Device: fe04h/65028d Inode: 818 Links: 1
>>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>> Access: 2009-12-14 10:29:04.1000000000 +0100
>>>>>> Modify: 2009-12-14 10:29:04.1000000000 +0100
>>>>>> Change: 2009-12-14 10:29:04.1000000000 +0100
>>>>>>
>>>>>> See that all times of that file ends with 1e6 nanoseconds.
>>>>>>
>>>>>>
>>>> I did not test reverting this patch yet, because I did not find
>>>> reliable way how to reproduce these strange modify times. But as I
>>>> read your description. Would it be possible that if there would be bug
>>>> in your patch i would be observer on mostly quiet system? I'm asking
>>>> because full day of testing of the system under load did not produce
>>>> any result, but then when I tried to run "find / | xargs stat" on idle
>>>> system I've got several new instances of wrong access time (filesystem
>>>> is mounted without noatime)
>>>>
>>>>
>>> Another quick question:
>>>
>>> What is the normal behavior you see when this issue is not cropping up?
>>>
>>> Do you normally see all 0's in the ns field? Or do you expect to see an
>>> actual ns value?
>>>
>>>
>>>
>> Sorry to reply again. Previous message did not get to list:
>>
>> I see values which seems to be ns times there. My root filesystem is
>> ext4 too (recently I do not remeber if I formated it from scratch when I
>> reinstalled that system) but I see this happen on other filesystems too
>>
>> Root filesystem (ext4 may be converted from ext3)
>>
>> File: `/etc/sysconfig'
>> Size: 4096 Blocks: 8 IO Block: 4096 directory
>> Device: fe00h/65024d Inode: 65282 Links: 7
>> Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2009-12-16 21:14:00.172000000 +0100
>> Modify: 2009-12-12 11:01:48.1000000000 +0100
>> Change: 2009-12-12 11:01:48.1000000000 +0100
>> File: `/etc/sysconfig/prelink'
>> Size: 1459 Blocks: 8 IO Block: 4096 regular file
>> Device: fe00h/65024d Inode: 22706 Links: 1
>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2009-12-14 10:27:46.912000002 +0100
>> Modify: 2004-11-23 11:43:08.000000000 +0100
>> Change: 2009-12-08 22:57:24.656000002 +0100
>> File: `/etc/sysconfig/i18n'
>> Size: 47 Blocks: 8 IO Block: 4096 regular file
>> Device: fe00h/65024d Inode: 48962 Links: 1
>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2010-08-27 18:07:21.500013018 +0200
>> Modify: 2009-06-22 23:33:43.113581313 +0200
>> Change: 2009-06-22 23:58:39.936318201 +0200
>>
>
> So I'm not reproducing this with 2.6.33-rc1 on a fresh ext4 partition on
> x68_64.
>
> File: `virt'
> Size: 4096 Blocks: 8 IO Block: 4096 directory
> Device: 804h/2052d Inode: 1868440 Links: 3
> Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-17 21:22:44.692710730 -0500
> Modify: 2009-12-17 20:14:40.000000000 -0500
> Change: 2009-12-17 21:20:21.001915208 -0500
> File: `vmlinux'
> Size: 21122497 Blocks: 24136 IO Block: 4096 regular file
> Device: 804h/2052d Inode: 1874435 Links: 1
> Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-17 21:22:05.381691121 -0500
> Modify: 2009-12-17 21:22:05.376691754 -0500
> Change: 2009-12-17 21:22:05.376691754 -0500
> File: `vmlinux.o'
> Size: 16701780 Blocks: 32624 IO Block: 4096 regular file
> Device: 804h/2052d Inode: 1874418 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2009-12-17 21:22:01.138228732 -0500
> Modify: 2009-12-17 21:22:01.131229619 -0500
> Change: 2009-12-17 21:22:01.131229619 -0500
>
>
> Let me know if you find anything that helps narrow this down.
>
>
Hello,

I know its far fetched, but is there something what is preventing
xtime.tv_nsec to be exactly 999999999 near the end of update_wall_time
in kernel/time/timekeeping.c?

Petr


> thanks
> -john
>
>
>
>

2009-12-20 23:31:43

by Petr Titěra

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

Petr Titěra napsal(a):
> john stultz napsal(a):
>> On Thu, 2009-12-17 at 12:04 +0100, Petr Tit��ra wrote:
>>> john stultz napsal(a):
>>>> On Wed, 2009-12-16 at 21:55 +0100, Petr Tit��ra wrote:
>>>>> john stultz napsal(a):
>>>>>> 2009/12/14 Petr TitĂ�ďż˝ra <[email protected]>:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I see some strange file modification times recently. It seems to me
>>>>>>> that in some situations, kernel allows to set nanoseconds part
>>>>>>> of file
>>>>>>> access, modification or change time to 100000000 ns. Problem
>>>>>>> seems to be in
>>>>>>> some generic part of kernel because I see it on several different
>>>>>>> filesysytems (ext4 and nilf2). These is I've got during my
>>>>>>> testing on kernel
>>>>>>> 2.6.32-tip-08309-gad8e75a.
>>>>>>>
>>>>>>> File: `./Documentation/dvb/contributors.txt'
>>>>>>> Size: 3035 Blocks: 8 IO Block: 4096 regular file
>>>>>>> Device: fe04h/65028d Inode: 818 Links: 1
>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>> Access: 2009-12-14 10:29:04.1000000000 +0100
>>>>>>> Modify: 2009-12-14 10:29:04.1000000000 +0100
>>>>>>> Change: 2009-12-14 10:29:04.1000000000 +0100
>>>>>>>
>>>>>>> See that all times of that file ends with 1e6 nanoseconds.
>>>>> I did not test reverting this patch yet, because I did not find
>>>>> reliable way how to reproduce these strange modify times. But as I
>>>>> read your description. Would it be possible that if there would be
>>>>> bug
>>>>> in your patch i would be observer on mostly quiet system? I'm asking
>>>>> because full day of testing of the system under load did not produce
>>>>> any result, but then when I tried to run "find / | xargs stat" on
>>>>> idle
>>>>> system I've got several new instances of wrong access time
>>>>> (filesystem
>>>>> is mounted without noatime)
>>>> Another quick question:
>>>>
>>>> What is the normal behavior you see when this issue is not cropping
>>>> up?
>>>>
>>>> Do you normally see all 0's in the ns field? Or do you expect to
>>>> see an
>>>> actual ns value?
>>>>
>>> Sorry to reply again. Previous message did not get to list:
>>>
>>> I see values which seems to be ns times there. My root filesystem is
>>> ext4 too (recently I do not remeber if I formated it from scratch
>>> when I reinstalled that system) but I see this happen on other
>>> filesystems too
>>>
>>> Root filesystem (ext4 may be converted from ext3)
>>>
>>> File: `/etc/sysconfig'
>>> Size: 4096 Blocks: 8 IO Block: 4096 directory
>>> Device: fe00h/65024d Inode: 65282 Links: 7
>>> Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
>>> Access: 2009-12-16 21:14:00.172000000 +0100
>>> Modify: 2009-12-12 11:01:48.1000000000 +0100
>>> Change: 2009-12-12 11:01:48.1000000000 +0100
>>> File: `/etc/sysconfig/prelink'
>>> Size: 1459 Blocks: 8 IO Block: 4096 regular file
>>> Device: fe00h/65024d Inode: 22706 Links: 1
>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>> Access: 2009-12-14 10:27:46.912000002 +0100
>>> Modify: 2004-11-23 11:43:08.000000000 +0100
>>> Change: 2009-12-08 22:57:24.656000002 +0100
>>> File: `/etc/sysconfig/i18n'
>>> Size: 47 Blocks: 8 IO Block: 4096 regular file
>>> Device: fe00h/65024d Inode: 48962 Links: 1
>>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>>> Access: 2010-08-27 18:07:21.500013018 +0200
>>> Modify: 2009-06-22 23:33:43.113581313 +0200
>>> Change: 2009-06-22 23:58:39.936318201 +0200
>>
>> So I'm not reproducing this with 2.6.33-rc1 on a fresh ext4 partition on
>> x68_64.
>>
>> File: `virt'
>> Size: 4096 Blocks: 8 IO Block: 4096 directory
>> Device: 804h/2052d Inode: 1868440 Links: 3
>> Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2009-12-17 21:22:44.692710730 -0500
>> Modify: 2009-12-17 20:14:40.000000000 -0500
>> Change: 2009-12-17 21:20:21.001915208 -0500
>> File: `vmlinux'
>> Size: 21122497 Blocks: 24136 IO Block: 4096 regular file
>> Device: 804h/2052d Inode: 1874435 Links: 1
>> Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2009-12-17 21:22:05.381691121 -0500
>> Modify: 2009-12-17 21:22:05.376691754 -0500
>> Change: 2009-12-17 21:22:05.376691754 -0500
>> File: `vmlinux.o'
>> Size: 16701780 Blocks: 32624 IO Block: 4096 regular file
>> Device: 804h/2052d Inode: 1874418 Links: 1
>> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
>> Access: 2009-12-17 21:22:01.138228732 -0500
>> Modify: 2009-12-17 21:22:01.131229619 -0500
>> Change: 2009-12-17 21:22:01.131229619 -0500
>>
>>
>> Let me know if you find anything that helps narrow this down.
>>
> Hello,
>
> I know its far fetched, but is there something what is preventing
> xtime.tv_nsec to be exactly 999999999 near the end of update_wall_time
> in kernel/time/timekeeping.c?
>
> Petr
>

Just to follow up. I'm asking because I see a lot of files with access
and/or modify times near the top of thousanth of second (see
`/etc/sysconfig/prelink' in my example) and I thing that addition of 1
to xtime.tv_nsec ath the end of update_wall_time can 'owerflow' to whole
second.

Petr
>
>> thanks
>> -john
>>
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2009-12-21 21:16:46

by john stultz

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

On Mon, 2009-12-21 at 00:31 +0100, Petr Titěra wrote:
> Petr Titěra napsal(a):
> > john stultz napsal(a):
> >> Let me know if you find anything that helps narrow this down.
> >>
> >
> > I know its far fetched, but is there something what is preventing
> > xtime.tv_nsec to be exactly 999999999 near the end of update_wall_time
> > in kernel/time/timekeeping.c?
> >
> Just to follow up. I'm asking because I see a lot of files with access
> and/or modify times near the top of thousanth of second (see
> `/etc/sysconfig/prelink' in my example) and I thing that addition of 1
> to xtime.tv_nsec ath the end of update_wall_time can 'owerflow' to whole
> second.


Oof! Yikes.

Yea, the sub-nanosecond rounding fix we added quite awhile back indeed
opens a hole where xtime.tv_nsec could be exactly 1sec. Good eye!

Of course, most of the timekeeping accessors handle this properly by
normalizing the timespec before returning, so its likely just users of
current_kernel_time() and direct accessors of xtime that might be bitten
here.

And this probably was obscured before because the xtime_cache() was
normalized. Did you verify that reverting that patch I pointed you to
resolves the issue? If not, please do, so we can get this fixed up.

Now I'm a little baffled why you see it all the time on your boxes. For
this to trigger, you have to have an interrupt in the last ns of a
second, and then the window for these odd filesystem stamps is only open
for 1-10ms.

Sigh. Once we get the last of the non GENERIC_TIME arches converted to
arch_gettimeoffset, we can kill all of those rounding hacks and just
manage the sub-nanosecond portion sanely. I'm looking forward to that
day!


So again, Bravo on catching this!

thanks
-john

2009-12-22 15:50:24

by Petr Titěra

[permalink] [raw]
Subject: Re: Wrong atime on recent kernels

john stultz napsal(a):
> On Mon, 2009-12-21 at 00:31 +0100, Petr Titěra wrote:
>
>> Petr Titěra napsal(a):
>>
>>> john stultz napsal(a):
>>>
>>>> Let me know if you find anything that helps narrow this down.
>>>>
>>>>
>>> I know its far fetched, but is there something what is preventing
>>> xtime.tv_nsec to be exactly 999999999 near the end of update_wall_time
>>> in kernel/time/timekeeping.c?
>>>
>>>
>> Just to follow up. I'm asking because I see a lot of files with access
>> and/or modify times near the top of thousanth of second (see
>> `/etc/sysconfig/prelink' in my example) and I thing that addition of 1
>> to xtime.tv_nsec ath the end of update_wall_time can 'owerflow' to whole
>> second.
>>
>
>
> Oof! Yikes.
>
> Yea, the sub-nanosecond rounding fix we added quite awhile back indeed
> opens a hole where xtime.tv_nsec could be exactly 1sec. Good eye!
>
> Of course, most of the timekeeping accessors handle this properly by
> normalizing the timespec before returning, so its likely just users of
> current_kernel_time() and direct accessors of xtime that might be bitten
> here.
>
> And this probably was obscured before because the xtime_cache() was
> normalized. Did you verify that reverting that patch I pointed you to
> resolves the issue? If not, please do, so we can get this fixed up.
>
>
I can confirm that I was not able to see any of those error after I've
reverted that patch. But I was not able to repliace this at will.
Considering that first files with this kind of error started to appear
just about the time your patch went in I would propose that your
explanation is plausible.

> Now I'm a little baffled why you see it all the time on your boxes. For
> this to trigger, you have to have an interrupt in the last ns of a
> second, and then the window for these odd filesystem stamps is only open
> for 1-10ms.
>
>
I think my computer for some unknow reason had better chance of it. This
is snip from filtered and sorted stats of files on my disk:

Access: 2009-12-16 21:51:55.659999999 +0100
Access: 2009-12-16 21:51:55.632000000 +0100
Access: 2009-12-16 21:51:55.552000004 +0100
Access: 2009-12-16 21:51:55.512000003 +0100
Access: 2009-12-16 21:51:55.436000005 +0100
Access: 2009-12-16 21:51:55.432000009 +0100
Access: 2009-12-16 21:51:55.363999951 +0100
Access: 2009-12-16 21:51:55.295999930 +0100
Access: 2009-12-16 21:51:55.287999689 +0100
Access: 2009-12-16 21:51:54.703999875 +0100
Access: 2009-12-16 21:51:54.683999001 +0100
Access: 2009-12-16 21:48:32.844000001 +0100
Access: 2009-12-16 21:48:31.375999999 +0100
Access: 2009-12-16 21:48:31.344000000 +0100
Access: 2009-12-16 21:48:31.047999999 +0100
Access: 2009-12-16 21:48:31.028000002 +0100
Access: 2009-12-16 21:48:31.015999998 +0100
Access: 2009-12-16 21:48:31.015999998 +0100

You see that in my case nanosecond times are sometimes oscilating
withing edge of full milisecond. The sub millisecond part of time is
mostly farr off of it.

Petr

> Sigh. Once we get the last of the non GENERIC_TIME arches converted to
> arch_gettimeoffset, we can kill all of those rounding hacks and just
> manage the sub-nanosecond portion sanely. I'm looking forward to that
> day!
>
>
> So again, Bravo on catching this!
>
> thanks
> -john
>
>
>
> __________ Informace od ESET Smart Security, verze databaze 4709 (20091222) __________
>
> Tuto zpravu proveril ESET Smart Security.
>
> http://www.eset.cz
>
>
>



__________ Informace od ESET Smart Security, verze databaze 4709 (20091222) __________

Tuto zpravu proveril ESET Smart Security.

http://www.eset.cz