2008-01-29 17:22:49

by Chuck Lever III

Subject: Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

On Jan 29, 2008, at 8:04 AM, Gianluca Alberici wrote:
> Hello Chuck,
>
> I attach as you requested the two dumpfiles obtained by
>
> tcpdump -s0 -i lo -w /tmp/dump-(not-)working port 2049
>
>
>
> They contain the capture of the usual two attempts: in the first, the
> open() syscall creates the file; in the second, it tries to truncate
> the file to zero length.

The client is doing a SETATTR to truncate the file, and the server
returns NFSERR_INVAL (EINVAL), which is error code 22. This is not an
RPC decoding problem: the server is genuinely returning an error.
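
For anyone who wants to reproduce the sequence, here is a minimal sketch
of the test as described in the quoted message: open a file with
O_TRUNC, write "Hello", and repeat. The original test program is not
shown in this thread, so the function names, the file mode, and the use
of Python's os module in place of the C calls are all assumptions.

```python
import os


def open_trunc_and_write(path, payload=b"Hello"):
    """Open path with O_CREAT | O_TRUNC, write payload, return 0 or an errno value."""
    try:
        # The 0o644 mode is an assumption; the thread does not say what mode was used.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, payload)
        finally:
            os.close(fd)
    except OSError as exc:
        return exc.errno
    return 0


def run_twice(path):
    """Run the open/write step twice and return the result of each pass.

    The first pass creates the file. The second pass opens an existing file
    with O_TRUNC, which is the step reported to fail with EINVAL.
    """
    return [open_trunc_and_write(path) for _ in range(2)]
```

Calling run_twice() on a path inside the NFS mount should return 0 for
both passes on a healthy server. A second entry equal to errno.EINVAL
would match the failure reported here.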

If this were a client problem, we would see it with other servers as
well, but this is the only reported case I am aware of. The client's
SETATTR request looks valid in both cases (and the two requests are
very nearly identical), so the next step is to look closely at your
server to determine why the request fails in the "not working" case.

NFSERR_INVAL is quite uncommon, and in fact is not defined in the NFS
version 2 specification (RFC 1094). This suggests that the server
has encountered some kind of internal problem, or that it is simply a
broken implementation.
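
To make that concrete, the sketch below lists the status values that the
"stat" enumeration in RFC 1094 defines and checks a code against them.
The table is transcribed from the RFC; the helper name
describe_nfsv2_status is hypothetical and exists only for illustration.

```python
import errno

# NFS version 2 status codes, as enumerated by "stat" in RFC 1094.
NFSV2_STAT = {
    0: "NFS_OK",
    1: "NFSERR_PERM",
    2: "NFSERR_NOENT",
    5: "NFSERR_IO",
    6: "NFSERR_NXIO",
    13: "NFSERR_ACCES",
    17: "NFSERR_EXIST",
    19: "NFSERR_NODEV",
    20: "NFSERR_NOTDIR",
    21: "NFSERR_ISDIR",
    27: "NFSERR_FBIG",
    28: "NFSERR_NOSPC",
    30: "NFSERR_ROFS",
    63: "NFSERR_NAMETOOLONG",
    66: "NFSERR_NOTEMPTY",
    69: "NFSERR_DQUOT",
    70: "NFSERR_STALE",
    99: "NFSERR_WFLUSH",
}


def describe_nfsv2_status(code):
    """Return the NFSv2 name for code, or note that the protocol does not define it."""
    name = NFSV2_STAT.get(code)
    if name is not None:
        return name
    # Fall back to the local errno table, since servers often pass errno through.
    posix = errno.errorcode.get(code, "unknown")
    return f"undefined in NFSv2 (local errno name: {posix})"
```

Code 22 is absent from the table, so describe_nfsv2_status(22) reports
it as undefined in NFSv2 and falls back to the local errno name.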

I think you mentioned previously that the server is the Debian
user-space server. You should contact Debian and ask for their help to
diagnose the problem. (As far as I know there are no user-space
server developers on this list, but I could be incorrect.)

> Chuck Lever wrote:
>
>> Hi Gianluca-
>>
>> On Jan 27, 2008, at 7:08 AM, Gianluca Alberici wrote:
>>
>>> Hello Chuck,
>>>
>>> I have produced the output you requested using the code I showed
>>> you last time, which simply tries to open(... | O_TRUNC) a file on
>>> the NFS mount and writes "Hello" into it. I simply run it twice.
>>> The mount is a loopback mount on 127.0.0.1. From the second
>>> execution onward (the first one creates the file) you get EINVAL:
>>>
>>> FILE CREATION:
>>>
>>> hydra:~# tcpdump -s0 -i lo port 2049
>>> tcpdump: verbose output suppressed, use -v or -vv for full
>>> protocol decode
>>> listening on lo, link-type EN10MB (Ethernet), capture size 65535
>>> bytes
>>> 12:15:06.306619 IP localhost.251828621 > localhost.nfs: 120
>>> getattr fh Unknown/
>>> 47521E2B0223C100000000000000000000000000000000000000000000000000
>>> 12:15:06.306666 IP localhost.nfs > localhost.251828621: reply ok
>>> 96 getattr DIR 40777 ids 0/0 sz 4096
>>> 12:15:06.306705 IP localhost.268605837 > localhost.nfs: 128
>>> lookup fh Unknown/
>>> 47521E2B0223C100000000000000000000000000000000000000000000000000
>>> "test"
>>> 12:15:06.306752 IP localhost.nfs > localhost.268605837: reply ok
>>> 28 lookup ERROR: No such file or directory
>>> 12:15:06.306786 IP localhost.285383053 > localhost.nfs: 160
>>> create fh Unknown/
>>> 47521E2B0223C100000000000000000000000000000000000000000000000000
>>> "test"
>>> 12:15:06.306917 IP localhost.nfs > localhost.285383053: reply ok
>>> 128 create fh Unknown/
>>> 48521E2B0323C120000000000000000000000000000000000000000000000000
>>> 12:15:06.307179 IP localhost.302160269 > localhost.nfs: 144 write
>>> fh Unknown/
>>> 48521E2B0323C120000000000000000000000000000000000000000000000000
>>> 5 (5) bytes @ 0 (0)
>>> 12:15:06.307283 IP localhost.nfs > localhost.302160269: reply ok
>>> 96 write
>>
>>
>> We need to have the raw output of tcpdump. Please use "-w
>> dumpfile" and send the raw output.
>>
>>>> sudo tcpdump -s0 -w /tmp/dumpfile hostname-of-server
>>>
>>
>>
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
>
>
> <dump-working><dump-not-working>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2008-01-29 17:54:06

by Peter Åstrand

Subject: Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

On Tue, 29 Jan 2008, Chuck Lever wrote:

> I think you mentioned previously that the server is the Debian user-space
> server. You should contact Debian and ask for their help to diagnose the
> problem. (As far as I know there are no user-space server developers on this
> list, but I could be incorrect).

(I thought I had been advertising unfs3 almost too much :-))

If this problem shows up with unfs3, I can probably take a look at it.

Regards,
---
Peter Åstrand ThinLinc Chief Developer
Cendio AB http://www.cendio.se
Wallenbergs gata 4
583 30 Linköping Phone: +46-13-21 46 00