2008-03-07 06:01:58

by Adam Schrotenboer

[permalink] [raw]
Subject: Re: [opensuse] nfs_update_inode: inode X mode changed, Y to Z

Neil Brown wrote:
> On Wednesday March 5, [email protected] wrote:
>
>> Neil Brown wrote:
>>
>>> On Wednesday March 5, [email protected] wrote:
>>>
>>> Where :-)
>>>
>>> A excerpt from the logs to show actual error messages could be
>>> helpful.
>>>
>>> Thanks,
>>> NeilBrown
>>>
>> sorry, must have missed the paste
>>
>> Feb 28 17:27:58 koi kernel: nfs_update_inode: inode 3221350147 mode
>> changed, 0040755 to 0100644
>> Feb 28 21:57:22 koi kernel: nfs_update_inode: inode 2149056279 mode
>> changed, 0040755 to 0100755
>> Mar 5 01:43:21 koi kernel: nfs_update_inode: inode 3222473680 mode
>> changed, 0040755 to 0100644
>> Mar 5 08:52:19 koi kernel: nfs_update_inode: inode 3222473568 mode
>> changed, 0120777 to 0100644
>> Mar 5 15:00:01 koi kernel: nfs_update_inode: inode 3222473569 mode
>> changed, 0120777 to 0100644
>> Mar 5 15:00:31 koi kernel: nfs_update_inode: inode 3222473674 mode
>> changed, 0120777 to 0100644
>> Mar 5 15:00:31 koi kernel: nfs_update_inode: inode 3222473675 mode
>> changed, 0120777 to 0100644
>> Mar 5 15:00:31 koi kernel: nfs_update_inode: inode 3222473672 mode
>> changed, 0040755 to 0100644
>> Mar 5 15:00:31 koi kernel: nfs_update_inode: inode 3222473673 mode
>> changed, 0120777 to 0100644
>>
>>
>>
> Hmmm...
> Directories and symlinks "changing" to regular files.
>
> 32bit inode numbers with the high bit set.
>
> Random times, but a burst at 3pm one afternoon.
>
> The fact that it happens so rarely make is very hard to collect more
> data....
>
>
> What happens when a user experiences an error?
well, sometimes it shows up as a read/write error (that is the error
code returned by the application).
SVN tend to be more verbose (although I have no examples)
> What sort of file is
> the error reported against?
It does not to my knowledge only happen to certain classes of files,
although it does seem to happen most often when lots of files are being
created.
> Does the file remain inaccessible or does
> it start working again soon?
No data, maybe Thomas knows something.
> Is the same file accessible from another
> client?
>
No data, esp as it never happens to me personally.
> NeilBrown
>



Attachments:
signature.asc (250.00 B)
OpenPGP digital signature

2008-03-12 17:55:29

by Adam Schrotenboer

[permalink] [raw]
Subject: Re: [opensuse] nfs_update_inode: inode X mode changed, Y to Z


Another excerpt from last night.

Mar 12 00:53:16 wanda kernel: nfs: server 192.168.1.27 not responding,
still trying
Mar 12 00:53:18 wanda kernel: nfs: server 192.168.1.27 OK
Mar 12 00:54:29 wanda kernel: nfs_update_inode: inode 1075049148 mode
changed, 0040755 to 0100644
Mar 12 00:54:49 wanda kernel: nfs_update_inode: inode 1611920610 mode
changed, 0100644 to 0040755
Mar 12 00:54:49 wanda kernel: nfs_update_inode: inode 1611920612 mode
changed, 0040755 to 0100644
Mar 12 00:54:50 wanda kernel: nfs_update_inode: inode 2685631887 mode
changed, 0100644 to 0040755
Mar 12 00:54:50 wanda kernel: nfs_update_inode: inode 3222506021 mode
changed, 0100644 to 0040755
Mar 12 00:54:55 wanda kernel: nfs_update_inode: inode 1372131 mode
changed, 0100644 to 0040755
Mar 12 00:55:25 wanda kernel: nfs_update_inode: inode 3759413120 mode
changed, 0040755 to 0100644
Mar 12 00:56:05 wanda kernel: nfs_update_inode: inode 1075051607 mode
changed, 0100644 to 0040755
Mar 12 00:56:05 wanda kernel: nfs_update_inode: inode 1075051608 mode
changed, 0040755 to 0100644
Mar 12 00:56:05 wanda kernel: nfs_update_inode: inode 1075051612 mode
changed, 0100644 to 0040755
Mar 12 00:56:05 wanda kernel: nfs_update_inode: inode 1075051613 mode
changed, 0040755 to 0100644
Mar 12 00:56:06 wanda kernel: nfs_update_inode: inode 1075051616 mode
changed, 0100644 to 0040755
Mar 12 00:56:06 wanda kernel: nfs_update_inode: inode 1075051617 mode
changed, 0040755 to 0100644
Mar 12 00:56:09 wanda kernel: nfs_update_inode: inode 1075051624 mode
changed, 0100644 to 0040755
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926434 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926438 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926442 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926446 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926450 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926454 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926458 mode
changed, 0040755 to 0100644
Mar 12 01:00:33 wanda kernel: nfs_update_inode: inode 1611926462 mode
changed, 0040755 to 0100644
Mar 12 01:00:34 wanda kernel: nfs_update_inode: inode 1611926463 mode
changed, 0100644 to 0040755
Mar 12 09:36:46 wanda kernel: nfs_update_inode: inode 538186966 mode
changed, 0100644 to 0040755
Mar 12 09:36:46 wanda kernel: nfs_update_inode: inode 538186967 mode
changed, 0040755 to 0100644
Mar 12 09:36:46 wanda kernel: nfs_update_inode: inode 538186970 mode
changed, 0100644 to 0040755
Mar 12 09:36:46 wanda kernel: nfs_update_inode: inode 538186972 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186974 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186977 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186979 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186982 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186984 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186987 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186989 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186992 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186994 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186997 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538186999 mode
changed, 0100644 to 0040755
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538187002 mode
changed, 0040755 to 0100644
Mar 12 09:36:47 wanda kernel: nfs_update_inode: inode 538187004 mode
changed, 0100644 to 0040755

It keeps going...

Another machine, different times.

Mar 12 03:35:19 minnow kernel: nfs_update_inode: inode 2685628653 mode
changed, 0100644 to 0040755
Mar 12 03:38:01 minnow kernel: nfs_update_inode: inode 2685628658 mode
changed, 0100644 to 0040755
Mar 12 03:38:04 minnow kernel: nfs_update_inode: inode 2685628707 mode
changed, 0100644 to 0040755
Mar 12 03:38:04 minnow kernel: nfs_update_inode: inode 2685628715 mode
changed, 0100644 to 0040755
Mar 12 03:38:04 minnow kernel: nfs_update_inode: inode 2685628719 mode
changed, 0100644 to 0040755
Mar 12 03:38:04 minnow kernel: nfs_update_inode: inode 2685628731 mode
changed, 0100644 to 0040755
Mar 12 03:38:05 minnow kernel: nfs_update_inode: inode 3222503513 mode
changed, 0100644 to 0040755
Mar 12 03:38:05 minnow kernel: nfs_update_inode: inode 2685628818 mode
changed, 0100644 to 0040755
Mar 12 03:38:06 minnow kernel: nfs_update_inode: inode 2685628842 mode
changed, 0100644 to 0040755
Mar 12 03:38:06 minnow kernel: nfs_update_inode: inode 2685628846 mode
changed, 0100644 to 0040755
Mar 12 03:38:06 minnow kernel: nfs_update_inode: inode 2685628851 mode
changed, 0100644 to 0040755
Mar 12 03:38:06 minnow kernel: nfs_update_inode: inode 2685628856 mode
changed, 0100644 to 0040755
Mar 12 03:38:07 minnow kernel: nfs_update_inode: inode 3222503585 mode
changed, 0100644 to 0040755
Mar 12 03:38:09 minnow kernel: nfs_update_inode: inode 2685628923 mode
changed, 0100644 to 0040755
Mar 12 03:38:09 minnow kernel: nfs_update_inode: inode 2685628927 mode
changed, 0100644 to 0040755
Mar 12 04:00:48 minnow kernel: nfs_update_inode: inode 2149158555 mode
changed, 0100644 to 0040755
Mar 12 04:00:51 minnow kernel: nfs_update_inode: inode 1611926234 mode
changed, 0040755 to 0100644
Mar 12 04:01:04 minnow kernel: nfs_update_inode: inode 1611926235 mode
changed, 0100644 to 0040755
Mar 12 04:01:04 minnow kernel: nfs_update_inode: inode 1611926238 mode
changed, 0040755 to 0100644
Mar 12 04:01:04 minnow kernel: nfs_update_inode: inode 1611926240 mode
changed, 0100644 to 0040755
Mar 12 04:01:04 minnow kernel: nfs_update_inode: inode 1611926243 mode
changed, 0040755 to 0100644

I have other machines it has occurred to as well. The only usage pattern
I have is that it happened mostly while running Mentor's Precision, but
I am doubting that that is relevant.

This is the line from /proc/mounts on Minnow:

192.168.1.27:/mnt/storage0/users /home/users nfs
rw,v3,rsize=32768,wsize=32768,hard,lock,proto=tcp,addr=192.168.1.27 0 0

192.168.1.27 is our machine dolphin, which is running OpenSuSE 10.2
x86_64 on Intel(R) Xeon(R) CPU E5310 @ 1.60GHz with 4G of RAM, and is
an XFS file system running on a Dell PERC5/i
adam@dolphin:~$ df -h /mnt/storage0/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 2.8T 2.0T 807G 72% /mnt/storage0
from /etc/fstab
/dev/disk/by-label/storage0 /mnt/storage0 xfs rw,usrquota,grpquota 0 0
It is not running with inode64, but probably should be. Hasn't become a
problem yet. Otoh, I understand that NFS v3 doesn't support 64bit inode
numbers.
FYI almost all of our NFS clients are x86_64 Linux(except 2)

I have CC:d Fred Revenu, who was the person who brought it to my
attention this morning that it had happened again last night.


Attachments:
signature.asc (250.00 B)
OpenPGP digital signature

2008-03-12 18:08:25

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [opensuse] nfs_update_inode: inode X mode changed, Y to Z


On Wed, 2008-03-12 at 10:55 -0700, Adam Schrotenboer wrote:
> Another excerpt from last night.
>
> Mar 12 00:53:16 wanda kernel: nfs: server 192.168.1.27 not responding,
> still trying
> Mar 12 00:53:18 wanda kernel: nfs: server 192.168.1.27 OK
> Mar 12 00:54:29 wanda kernel: nfs_update_inode: inode 1075049148 mode
> changed, 0040755 to 0100644

Hang on. That does not look like an XID collision problem...

That code path basically means that the fileids/inode numbers match,
which would not be the case if we were talking about an XID collision
causing the reply cache to replay an old request...

That message rather symptomatic of a filehandle reuse problem. In other
words a NFS filehandle appears to be reused to label a regular file
after it has been used for a directory with the same fileid/inode
number. That is a definite server bug.

Are you sure that you are using the regular kernel nfs server?

--
Trond Myklebust
NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-03-12 18:16:25

by Adam Schrotenboer

[permalink] [raw]
Subject: Re: [opensuse] nfs_update_inode: inode X mode changed, Y to Z

Trond Myklebust wrote:
> On Wed, 2008-03-12 at 10:55 -0700, Adam Schrotenboer wrote:
>
>> Another excerpt from last night.
>>
>> Mar 12 00:53:16 wanda kernel: nfs: server 192.168.1.27 not responding,
>> still trying
>> Mar 12 00:53:18 wanda kernel: nfs: server 192.168.1.27 OK
>> Mar 12 00:54:29 wanda kernel: nfs_update_inode: inode 1075049148 mode
>> changed, 0040755 to 0100644
>>
>
> Hang on. That does not look like an XID collision problem...
>
> That code path basically means that the fileids/inode numbers match,
> which would not be the case if we were talking about an XID collision
> causing the reply cache to replay an old request...
>
> That message rather symptomatic of a filehandle reuse problem. In other
> words a NFS filehandle appears to be reused to label a regular file
> after it has been used for a directory with the same fileid/inode
> number. That is a definite server bug.
>
> Are you sure that you are using the regular kernel nfs server?
>
>
Quite. OpenSuSE 10.2 stock unmodified kernel.

dolphin:~ # uname -a
Linux dolphin 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006
x86_64 x86_64 x86_64 GNU/Linux

dolphin:~ # top -bn1 |grep nfsd


PID PPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3916 1 root 15 0 0 0 0 S 0 0.0 3:04.42 nfsd
3942 1 root 15 0 0 0 0 S 0 0.0 3:11.88 nfsd
(there's quite a bit more of these threads than this (32), but they all
look the same in top or ps)



Attachments:
signature.asc (250.00 B)
OpenPGP digital signature