2010-12-08 18:52:20

by David Flynn

[permalink] [raw]
Subject: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

Dear all;

On an NFS4 client (mounting a solaris ufs export), for a long time i've
been experiencing vim complaining when exiting:
E137: Viminfo file is not writable: /home/davidf/.viminfo

Further examination has revealed this to occur when vim has been used on
multiple systems using the same nfs filesystem. Vim tries to identify if it
has permission to modify .viminfo by stat'ing it, whereupon stat returns
an incorrect st_uid/st_gid and vim gives up.

Assuming two systems (A and B) with clean caches:
A> vim
:q
B> vim
:q
A> vim
:q
E137: ...
A> vim
:q
E137: ...

It should be noted that:
- (A) must be using nfs4, although sec=sys/sec=krb5 do not affect behaviour.
- (B) can be nfs4 or nfs3 (or possibly anything else)
- Dropping caches on (A) will clear the fault

This fault is very reproducible using kernel 2.6.35. I'm also able to
reproduce on 2.6.32.11 I have not been able to reproduce on a 2.6.26 system.

More succinctly, the following transcript reproduces the problem.
- vcfe0 corresponds to (A) above.
- lxg1 corresponds to (B) above.
- ~/test-stat-open-stat is a program that does:
stat(argv[1], &st);
printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
argv[1], st.st_uid, st.st_gid);

int fd = open(argv[1], O_RDONLY);
printf("open(\"%s\", O_RDONLY) = %d\n", argv[1], fd);
stat(argv[1], &st);
printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
argv[1], st.st_uid, st.st_gid);

close(fd);
printf("close(%d)\n", fd);
stat(argv[1], &st);
printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
argv[1], st.st_uid, st.st_gid);

vcfe0:~/z$ bash <<EOF
echo === truncate === ; \
echo foo >.file ; \
echo === remote unlink, create === ; \
ssh lxg1 "rm $PWD/.file; echo barbar >$PWD/.file" ; \
echo === 'stat(1)' === ; \
stat .file ; \
echo === stat,open,stat,close,stat === ; \
~/test-stat-open-stat .file ; \
echo === 'stat(1)' === ; \
stat .file
EOF

=== truncate ===
=== remote unlink, create ===
=== stat(1) ===
File: `.file'
Size: 4 Blocks: 0 IO Block: 1048576 regular file
Device: 18h/24d Inode: 5678963 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1179/ davidf) Gid: ( 1000/ rd)
Access: 2010-11-29 16:50:03.414109000 +0000
Modify: 2010-11-29 17:00:17.698839000 +0000
Change: 2010-11-29 17:00:17.698839000 +0000
=== stat,open,stat,close,stat ===
stat(".file"...), st_uid=1179, st_gid=1000
open(".file", O_RDONLY) = 3
stat(".file"...), st_uid=4294967294, st_gid=4294967294
close(3)
stat(".file"...), st_uid=4294967294, st_gid=4294967294
=== stat(1) ===
File: `.file'
Size: 7 Blocks: 2 IO Block: 1048576 regular file
Device: 18h/24d Inode: 5678906 Links: 1
Access: (0644/-rw-r--r--) Uid: (4294967294/ UNKNOWN) Gid: (4294967294/ UNKNOWN)
Access: 2010-11-29 17:00:18.059076000 +0000
Modify: 2010-11-29 17:00:18.064360000 +0000
Change: 2010-11-29 17:00:18.064360000 +0000

Kind regards,

..david


2010-12-08 20:20:54

by David Flynn

[permalink] [raw]
Subject: Re: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

* Trond Myklebust ([email protected]) wrote:
> You really should file a bug with the vim folks on this. Checking
> permissions using the above scheme is inherently racy (and quite
> frankly, wrong). See the caveats in the 'access()' manpage for details.

> If you want to find out if the file can be modified, then the only race
> free way to do so, is to try to open() it.

Yes, i intend to do this -- what is more, they already have the file
open (they do, open(), stat()). I havn't looked at the vim source yet
to spot quite why they think this is a good idea. (If they really are
trying to manually spot permissions, it misses so many other routes,
acls, etc).

> > === truncate ===
> > === remote unlink, create ===
> > === stat(1) ===
> > File: `.file'
> > Size: 4 Blocks: 0 IO Block: 1048576 regular file
> > Device: 18h/24d Inode: 5678963 Links: 1
> > Access: (0644/-rw-r--r--) Uid: ( 1179/ davidf) Gid: ( 1000/ rd)
> > Access: 2010-11-29 16:50:03.414109000 +0000
> > Modify: 2010-11-29 17:00:17.698839000 +0000
> > Change: 2010-11-29 17:00:17.698839000 +0000
> > === stat,open,stat,close,stat ===
> > stat(".file"...), st_uid=1179, st_gid=1000
> > open(".file", O_RDONLY) = 3
> > stat(".file"...), st_uid=4294967294, st_gid=4294967294
> > close(3)
> > stat(".file"...), st_uid=4294967294, st_gid=4294967294
> > === stat(1) ===
> > File: `.file'
> > Size: 7 Blocks: 2 IO Block: 1048576 regular file
> > Device: 18h/24d Inode: 5678906 Links: 1
> > Access: (0644/-rw-r--r--) Uid: (4294967294/ UNKNOWN) Gid: (4294967294/ UNKNOWN)
> > Access: 2010-11-29 17:00:18.059076000 +0000
> > Modify: 2010-11-29 17:00:18.064360000 +0000
> > Change: 2010-11-29 17:00:18.064360000 +0000
>
> Your idmapper appears to be mapping the user to 'nobody', which is
> usually a sign that it is misconfigured. It may, however, also mean that
> the username (or groupname) is missing in /etc/passwd (or /etc/groups).

I don't think this is the case. I performed a network trace of the
transaction and i can see that for all the calls with a user/group in
they refer to me ([email protected], [email protected]).

I also checked the local idmapper to see what it thought of things, and
it doesn't get an upcall to map this request, so values have been
previously cached.

Also, the thing about this "race" is that it is very very slow. its
utterly reliable. running vim on "host-B" does not have to occur while
it runs on "host-A", nor does it seem to need to happen soon
after/before "host-A" quits/starts vim.

> I've no idea how to check the Solaris idmapper configuration, but on
> Linux you want to look at /etc/idmapd.conf (specifically the 'Domain'
> line).

That is correct as far as i can tell. The first part of the transaction
is OK, however, it fails as soon as stat() is called after open().

I've just sanitized a network capture of the following. I've removed
anything sensitive to what i hope are just the calls related to
the above (please excuse any tcp lost segments due to this). It is
available at:
ftp://ftp.kw.bbc.co.uk/davidf/priv/nfs-staterror-20101208-clean.pcap

bash <<EOF
> ssh lxg1 "rm $PWD/.file; echo barbar >$PWD/.file" ; \
> sleep 10 ; \
> echo === 'stat(1)' === ; \
> stat .file ; \
> echo === stat,open,stat,close,stat === ; \
> ~/test-stat-open-stat .file ; \
> echo === 'stat(1)' === ; \
> stat .file
> EOF
=== truncate ===
=== remote unlink, create ===
=== stat(1) ===
File: `.file'
Size: 4 Blocks: 2 IO Block: 1048576 regular file
Device: 18h/24d Inode: 7111198 Links: 0
Access: (0644/-rw-r--r--) Uid: ( 1179/ davidf) Gid: ( 1000/ rd)
Access: 2010-12-08 19:38:16.289020000 +0000
Modify: 2010-12-08 19:38:16.312946000 +0000
Change: 2010-12-08 19:38:26.870908000 +0000
=== stat,open,stat,close,stat ===
stat("..file"...), st_uid=1179, st_gid=1000
open(".file", O_RDONLY) = 3
stat("..file"...), st_uid=4294967294, st_gid=4294967294
close(3)
stat("..file"...), st_uid=4294967294, st_gid=4294967294
=== stat(1) ===
File: `.file'
Size: 7 Blocks: 2 IO Block: 1048576 regular file
Device: 18h/24d Inode: 7111201 Links: 1
Access: (0644/-rw-r--r--) Uid: (4294967294/ UNKNOWN) Gid: (4294967294/ UNKNOWN)
Access: 2010-12-08 19:38:26.884045000 +0000
Modify: 2010-12-08 19:38:26.891030000 +0000
Change: 2010-12-08 19:38:26.891030000 +0000


Transcript of the network capture:

time = 0s [pkt1] ".file" is created with FH:
nfs.fhandle ==
01:54:00:1e:00:00:00:02:00:0a:00:00:00:6c:82:1e:57:b5:6a:e5:00:0a:00:00:00:00:00:02:69:96:a5:a0:00:00:00:00

"foo" is written to it, commited, closed
time = 0.0498s [pkt14]

sleep 10 seconds

ssh to remote system, "rm $PWD/.file; echo barbar >$PWD/.file"

sleep 10 seconds

time=20.617s [pkt15] /usr/bin/stat .file
call uses:
nfs.fhandle ==
01:54:00:1e:00:00:00:02:00:0a:00:00:00:6c:82:1e:57:b5:6a:e5:00:0a:00:00:00:00:00:02:69:96:a5:a0:00:00:00:00

time=20.639s [pkt redacted] (~/test-stat-open-stat loads via nfs)

time=20.730s [pkt17] ~/test-stat-open-stat open()'s ".file"
GETFH returns
nfs.fhandle ==
01:54:00:1e:00:00:00:02:00:0a:00:00:00:6c:82:21:26:3e:e3:78:00:0a:00:00:00:00:00:02:69:96:a5:a0:00:00:00:00

time=20.731s [pkt19] ~/test-stat-open-stat close()'s ".file"
time=20.732s [pkt21] /usr/bin/stat .file

I can't quite discriminate which calls after ~/test-stat-open-stat
(packets 17,19,21) were caused by which process (ie,
~/test-stat-open-stat or /usr/bin/stat), in earlier parts i could see
the PATH search for "stat" occurring.

I can't see any calls that don't use my textual owner / group. Can you
confirm this?

Kind regards,

..david

2010-12-09 13:11:36

by David Flynn

[permalink] [raw]
Subject: Re: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

* David Flynn ([email protected]) wrote:
> * Trond Myklebust ([email protected]) wrote:
> > While the open() call may indeed fail to set the uid/gid (because to do
> > so would require an upcall right in the middle of an asynchronous RPC
> > call), it should normally mark the inode as requiring revalidation if
> > this is the case.
>
> Where ought this occur (so we can validate that it is indeed the case)?
>
> > The ensuing stat() should then trigger a GETATTR call on the wire, which
> > will correct the uid/gid.

I've re-run the test with nfs client debugging turned on. I've also
modified the test case to write output to /dev/kmsg to allow easier
identification of what is going on. An annotated log is available
at:

ftp://ftp.kw.bbc.co.uk/davidf/priv/nfs-20101209/bad

NB, the system i've run this on has an nfsroot[1], and 1MiB kernel log
ring buffer. with nfs debugging, the test case would occasionally
do as expected, the transcript of this is available as:

ftp://ftp.kw.bbc.co.uk/davidf/priv/nfs-20101209/good

I've tried to keep the formatting similar, so that you can do:
$ diff -u <(sed -e 's/^\[.\+\]//' bad) <(sed -e 's/^\[.\+\]//' good)

This is available as:
ftp://ftp.kw.bbc.co.uk/davidf/priv/nfs-20101209/bad-to-good.diff

[1] although i can reproduce on systems without nfsroot, i don't have
root access to them.

..david

2010-12-08 20:45:31

by Myklebust, Trond

[permalink] [raw]
Subject: Re: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

On Wed, 2010-12-08 at 20:20 +0000, David Flynn wrote:
> I don't think this is the case. I performed a network trace of the
> transaction and i can see that for all the calls with a user/group in
> they refer to me ([email protected], [email protected]).

Yes. I saw your trace, and it looked good.

> I also checked the local idmapper to see what it thought of things, and
> it doesn't get an upcall to map this request, so values have been
> previously cached.

Have you tried running the idmapper with the '-f -v -v' flags so you can
log the translation upcalls? The default cache timeout is 10 minutes, so
you shouldn't have to wait too long.

> Also, the thing about this "race" is that it is very very slow. its
> utterly reliable. running vim on "host-B" does not have to occur while
> it runs on "host-A", nor does it seem to need to happen soon
> after/before "host-A" quits/starts vim.
>
> > I've no idea how to check the Solaris idmapper configuration, but on
> > Linux you want to look at /etc/idmapd.conf (specifically the 'Domain'
> > line).
>
> That is correct as far as i can tell. The first part of the transaction
> is OK, however, it fails as soon as stat() is called after open().

While the open() call may indeed fail to set the uid/gid (because to do
so would require an upcall right in the middle of an asynchronous RPC
call), it should normally mark the inode as requiring revalidation if
this is the case.
The ensuing stat() should then trigger a GETATTR call on the wire, which
will correct the uid/gid.

> I can't see any calls that don't use my textual owner / group. Can you
> confirm this?

Agreed.

Cheers
Trond

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2010-12-08 15:48:36

by David Flynn

[permalink] [raw]
Subject: Re: can't dump core on NFSv4 mounts

* David Flynn ([email protected]) wrote:
> On an NFS4 client (mounting a solaris ufs export), for a long time i've
> been experiencing vim complaining when exiting:
> E137: Viminfo file is not writable: /home/davidf/.viminfo

A similar and possibly related fault that I'm seeing is the inability
for a process to dump a core file when it segfaults on an nfs4 mount.

If the filesystem is mounted nfs3, then this issue does not arise.

using:
$ echo 'int main() {return *(int*)0;}' >segfault.c

on an nfs4 mount:
host-a$ ~/segfault
Segmentation fault
host-a$ ls -l
total 1
-rw------- 1 4294967294 4294967294 0 2010-12-08 15:34 core

a remote system shows:
host-b$ ls -l
total 1
-rw------- 1 davidf rd 0 Dec 8 15:34 core

waiting for the dust to settle:
host-a$ ls -l
total 1
-rw------- 1 davidf rd 0 2010-12-08 15:34 core

Iff, the core file already exists, then the core dump succeeds:
host-a$ touch core
host-a$ ls -l
total 1
-rw------- 1 davidf rd 0 2010-12-08 15:42 core
host-a$ ~/segfault
Segmentation fault (core dumped)

I can reliably reproduce using 2.6.32.11 and 2.6.35
I can not reproduce on 2.6.26

Kind regards,

..david

2010-12-08 21:19:35

by David Flynn

[permalink] [raw]
Subject: Re: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

* Trond Myklebust ([email protected]) wrote:
> Have you tried running the idmapper with the '-f -v -v' flags so you can
> log the translation upcalls? The default cache timeout is 10 minutes, so
> you shouldn't have to wait too long.

vc-r210-0:/# rpc.idmapd -f -v -v
rpc.idmapd: Expiration time is 600 seconds.
rpc.idmapd: Opened /proc/net/rpc/nfs4.nametoid/channel
rpc.idmapd: Opened /proc/net/rpc/nfs4.idtoname/channel
rpc.idmapd: New client: 0
rpc.idmapd: New client: 1
rpc.idmapd: New client: 10
rpc.idmapd: New client: 11
rpc.idmapd: New client: 12
rpc.idmapd: New client: 13
rpc.idmapd: New client: 14
rpc.idmapd: New client: 15
rpc.idmapd: New client: 2
rpc.idmapd: Opened /var/lib/nfs/rpc_pipefs/nfs/clnt2/idmap
rpc.idmapd: New client: 4
rpc.idmapd: New client: 5
rpc.idmapd: Opened /var/lib/nfs/rpc_pipefs/nfs/clnt5/idmap
rpc.idmapd: New client: 7
rpc.idmapd: New client: 8
rpc.idmapd: New client: 9
rpc.idmapd: New client: a
rpc.idmapd: New client: b
rpc.idmapd: New client: c
rpc.idmapd: New client: d
rpc.idmapd: New client: e
rpc.idmapd: New client: f
rpc.idmapd: Client 2: (user) name "[email protected]" -> id "0"
rpc.idmapd: Client 2: (group) name "[email protected]" -> id "20"
rpc.idmapd: Client 2: (user) name "[email protected]" -> id "1179"
rpc.idmapd: Client 2: (group) name "[email protected]" -> id "1000"

These four translations occured at the very start of test, ie,
when bash was executed / ".file" truncated. There were no further
translations.

> While the open() call may indeed fail to set the uid/gid (because to do
> so would require an upcall right in the middle of an asynchronous RPC
> call), it should normally mark the inode as requiring revalidation if
> this is the case.

Where ought this occur (so we can validate that it is indeed the case)?

> The ensuing stat() should then trigger a GETATTR call on the wire, which
> will correct the uid/gid.

Is this possibly related to:
http://article.gmane.org/gmane.linux.nfs/37224 ?

Kind regards,

..david

2010-12-08 19:20:21

by Myklebust, Trond

[permalink] [raw]
Subject: Re: NFSv4: "Viminfo file is not writable" -- stat(2) uid/gid incorrect

On Wed, 2010-12-08 at 15:33 +0000, David Flynn wrote:
> Dear all;
>
> On an NFS4 client (mounting a solaris ufs export), for a long time i've
> been experiencing vim complaining when exiting:
> E137: Viminfo file is not writable: /home/davidf/.viminfo
>
> Further examination has revealed this to occur when vim has been used on
> multiple systems using the same nfs filesystem. Vim tries to identify if it
> has permission to modify .viminfo by stat'ing it, whereupon stat returns
> an incorrect st_uid/st_gid and vim gives up.

You really should file a bug with the vim folks on this. Checking
permissions using the above scheme is inherently racy (and quite
frankly, wrong). See the caveats in the 'access()' manpage for details.

If you want to find out if the file can be modified, then the only race
free way to do so, is to try to open() it.

> Assuming two systems (A and B) with clean caches:
> A> vim
> :q
> B> vim
> :q
> A> vim
> :q
> E137: ...
> A> vim
> :q
> E137: ...
>
> It should be noted that:
> - (A) must be using nfs4, although sec=sys/sec=krb5 do not affect behaviour.
> - (B) can be nfs4 or nfs3 (or possibly anything else)
> - Dropping caches on (A) will clear the fault
>
> This fault is very reproducible using kernel 2.6.35. I'm also able to
> reproduce on 2.6.32.11 I have not been able to reproduce on a 2.6.26 system.
>
> More succinctly, the following transcript reproduces the problem.
> - vcfe0 corresponds to (A) above.
> - lxg1 corresponds to (B) above.
> - ~/test-stat-open-stat is a program that does:
> stat(argv[1], &st);
> printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
> argv[1], st.st_uid, st.st_gid);
>
> int fd = open(argv[1], O_RDONLY);
> printf("open(\"%s\", O_RDONLY) = %d\n", argv[1], fd);
> stat(argv[1], &st);
> printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
> argv[1], st.st_uid, st.st_gid);
>
> close(fd);
> printf("close(%d)\n", fd);
> stat(argv[1], &st);
> printf("stat(\".%s\"...), st_uid=%u, st_gid=%u\n",
> argv[1], st.st_uid, st.st_gid);
>
> vcfe0:~/z$ bash <<EOF
> echo === truncate === ; \
> echo foo >.file ; \
> echo === remote unlink, create === ; \
> ssh lxg1 "rm $PWD/.file; echo barbar >$PWD/.file" ; \
> echo === 'stat(1)' === ; \
> stat .file ; \
> echo === stat,open,stat,close,stat === ; \
> ~/test-stat-open-stat .file ; \
> echo === 'stat(1)' === ; \
> stat .file
> EOF
>
> === truncate ===
> === remote unlink, create ===
> === stat(1) ===
> File: `.file'
> Size: 4 Blocks: 0 IO Block: 1048576 regular file
> Device: 18h/24d Inode: 5678963 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 1179/ davidf) Gid: ( 1000/ rd)
> Access: 2010-11-29 16:50:03.414109000 +0000
> Modify: 2010-11-29 17:00:17.698839000 +0000
> Change: 2010-11-29 17:00:17.698839000 +0000
> === stat,open,stat,close,stat ===
> stat(".file"...), st_uid=1179, st_gid=1000
> open(".file", O_RDONLY) = 3
> stat(".file"...), st_uid=4294967294, st_gid=4294967294
> close(3)
> stat(".file"...), st_uid=4294967294, st_gid=4294967294
> === stat(1) ===
> File: `.file'
> Size: 7 Blocks: 2 IO Block: 1048576 regular file
> Device: 18h/24d Inode: 5678906 Links: 1
> Access: (0644/-rw-r--r--) Uid: (4294967294/ UNKNOWN) Gid: (4294967294/ UNKNOWN)
> Access: 2010-11-29 17:00:18.059076000 +0000
> Modify: 2010-11-29 17:00:18.064360000 +0000
> Change: 2010-11-29 17:00:18.064360000 +0000

Your idmapper appears to be mapping the user to 'nobody', which is
usually a sign that it is misconfigured. It may, however, also mean that
the username (or groupname) is missing in /etc/passwd (or /etc/groups).

I've no idea how to check the Solaris idmapper configuration, but on
Linux you want to look at /etc/idmapd.conf (specifically the 'Domain'
line).

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com