LinuxLists.cc - Patch for mapping EILSEQ into NFSERR

2013-12-02 20:28:53

Subject: Patch for mapping EILSEQ into NFSERR_INVAL

Hello,

NTFS file system and some other filesystems seem to return EILSEQ when
user passes badly formatted data. Current NFSv4 implementation does not
map EILSEQ into any known NFSv4 error code which causes problems in some
use cases. In my case I observed that if filesystem returns EILSEQ, the
NFS share will begin I/O erroring until it's able to recover (If there
are handles open to the files in the share, I/O errors will continue
until they are closed after which the share recovers after small period
of time).

Given that widely used ntfs-3g FUSE module also returns EILSEQ on the
same case (I tested this) I would argue that a fix should be done for
upstream especially since RFC5661 clearly defines that invalid UTF-8
sequence should map into NFSERR_INVAL, exact quote: "Where the client
sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL
(see Table 5)".

Other way of fixing this would be to change all EILSEQs to return
something else but I think this is a wrong way to fix the problem since
it takes the sane error message away (Invalid or incomplete multibyte or
wide character).

If we simply do a mapping here then isn't the fix straightforward
(attached)?

BR, Antti

Attachments:

proposed_mapping.patch (469.00 B)

2013-12-04 21:23:03

by Trond Myklebust

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Dec 4, 2013, at 16:03, Dr Fields James Bruce <[email protected]> wrote:

> On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti T?nkyr? wrote:
>
>>>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
>
> And I see something I'd overlooked before: the client is sending the
> later opens with the same open owner and sequence id. But NFS4ERR_IO is
> a seqid-mutating error. So now I think this probably *is* a client
> bug....

Umm? Yes and no. The client should be able to recover when it discovers that the seqid is out of sync.

That said, I see that we do

status = decode_op_hdr(xdr, OP_OPEN);
if (status != -EIO)
nfs_increment_open_seqid(status, res->seqid);

and since NFS4ERR_IO == EIO, that means we skip the seqid update when you send us NFS4ERR_IO.

Cheers,
Trond

2013-12-03 20:48:09

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Mon, Dec 02, 2013 at 10:52:50PM +0200, Antti Tönkyrä wrote:
> On 2013-12-02 22:45, Trond Myklebust wrote:
> >On Dec 2, 2013, at 15:21, Antti Tönkyrä <[email protected]> wrote:
> >
> >>Hello,
> >>
> >>NTFS file system and some other filesystems seem to return EILSEQ when user passes badly formatted data. Current NFSv4 implementation does not map EILSEQ into any known NFSv4 error code which causes problems in some use cases. In my case I observed that if filesystem returns EILSEQ, the NFS share will begin I/O erroring until it's able to recover (If there are handles open to the files in the share, I/O errors will continue until they are closed after which the share recovers after small period of time).

So how exactly does that happen? I'd expect an error on open or lookup
to prevent the client from ever getting a filehandle in the first place.

Looking back at https://bugzilla.kernel.org/attachment.cgi?id=116211 ...

OK, it makes sense that touching a file with a bad name would get an
error, but you're seeing that cause later creates of files on the same
filesystem fail. I can't figure out why that would happen.

Could we see a network trace showing that happen?

Would also be nice to confirm whether this is reproduceable with
something other than ZFS.

--b.

> >>
> >>Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
> >The NFS client will then happily map that straight into EINVAL for you...
> >
> >Cheers,
> > Trond
> Hello,
>
> But if we get a (somewhat) sane error back, we'd avoid the whole NFS
> share dying in I/O errors because someone decided to touch a file
> with his terminal having wrong charset (if I understood this
> correctly). The problem here is that we don't simply get one I/O
> error from the offending file, but rather that the whole share dies
> after EILSEQ is returned from the backing filesystem.

2013-12-04 11:15:46

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-04 10:31, Christoph Hellwig wrote:
> Please just fix ntfs to do the mapping for you, EILSEQ is not an
> error the VFS or applications expect either.
>
How would you propose the mapping to be done? Are you sure about EILSEQ
not being an error that VFS/applications expect, my userspace tools seem
to handle local filesystem returning EILSEQ just fine?

Moreover I think we really should see why nfs goes into permanent I/O
error state when this is triggered. I don't really mind seeing nfs
unhandled error codes in my kernel log and I/O error on the nfs client
when trying to write badly named files but the fact we cannot recover
from the situation (without closing all handles to the open share) is
alarming.

- Antti

2013-12-04 21:03:58

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti Tönkyrä wrote:
> Looks like I can strace it and I did an example strace, if you need
> something more specific please do tell. Command log has completion
> timestamps for associating what part of strace happened at what
> command.
>
> http://daedalus.pingtimeout.net/dbg/strace-ioerr.txt
> http://daedalus.pingtimeout.net/dbg/strace-commandlog.txt

Eh, not much use without knowing fuse's protocol, but:

> And just in case you missed my other mail, the I/O error doesn't
> kill the share if using NFSv3 (mount -t nfs -o vers=3,intr,hard
> ...). The initial I/O error happens but the share doesn't die even
> when there are file handles open there.

Oh, I overlooked that. So that merits another look at the network
trace:

> >>http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap

And I see something I'd overlooked before: the client is sending the
later opens with the same open owner and sequence id. But NFS4ERR_IO is
a seqid-mutating error. So now I think this probably *is* a client
bug....

What's the client kernel version?

--b.

2013-12-04 22:49:13

by Trond Myklebust

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, 2013-12-04 at 16:38 -0500, Dr Fields James Bruce wrote:
> On Wed, Dec 04, 2013 at 04:22:48PM -0500, Trond Myklebust wrote:
> >
> > On Dec 4, 2013, at 16:03, Dr Fields James Bruce <[email protected]> wrote:
> >
> > > On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti Tönkyrä wrote:
> > >
> > >>>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
> > >
> > > And I see something I'd overlooked before: the client is sending the
> > > later opens with the same open owner and sequence id. But NFS4ERR_IO is
> > > a seqid-mutating error. So now I think this probably *is* a client
> > > bug....
> >
> > Umm… Yes and no. The client should be able to recover when it discovers that the seqid is out of sync.
> >
> > That said, I see that we do
> >
> > status = decode_op_hdr(xdr, OP_OPEN);
> > if (status != -EIO)
> > nfs_increment_open_seqid(status, res->seqid);
> >
> > and since NFS4ERR_IO == EIO, that means we skip the seqid update when you send us NFS4ERR_IO.
>
> Oh, OK. Maybe decode_op_hdr could use -NFS4ERR_BADXDR for the two
> decoding errors it catches and eliminate the need for this special -EIO
> case?

That is sort of hackish. How about something like the following patch.

> I think NFS4ERR_IO is a legal error for these operations. (Even if the
> server should have returned something else in this case.)

It is legal for OPEN. It does not seem to be legal for OPEN_CONFIRM,
LOCK, LOCKU, CLOSE or OPEN_DOWNGRADE according to RFC3530bis.

Cheers
Trond

8<--------------------------------------------------------
>From 968a89bfaf176d70352bb1c0003bf496fdc180aa Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Wed, 4 Dec 2013 17:39:23 -0500
Subject: [PATCH] NFSv4: OPEN must handle the NFS4ERR_IO return code correctly

decode_op_hdr() cannot distinguish between an XDR decoding error and
the perfectly valid errorcode NFS4ERR_IO. This is normally not a
problem, but for the particular case of OPEN, we need to be able
to increment the NFSv4 open sequence id when the server returns
a valid response.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4xdr.c | 47 +++++++++++++++++++++++++++++++----------------
1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 5be2868c02f1..8c21d69a9dc1 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -3097,7 +3097,8 @@ out_overflow:
return -EIO;
}

-static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
+static bool __decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected,
+ int *nfs_retval)
{
__be32 *p;
uint32_t opnum;
@@ -3107,19 +3108,32 @@ static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
if (unlikely(!p))
goto out_overflow;
opnum = be32_to_cpup(p++);
- if (opnum != expected) {
- dprintk("nfs: Server returned operation"
- " %d but we issued a request for %d\n",
- opnum, expected);
- return -EIO;
- }
+ if (unlikely(opnum != expected))
+ goto out_bad_operation;
nfserr = be32_to_cpup(p);
- if (nfserr != NFS_OK)
- return nfs4_stat_to_errno(nfserr);
- return 0;
+ if (nfserr == NFS_OK)
+ *nfs_retval = 0;
+ else
+ *nfs_retval = nfs4_stat_to_errno(nfserr);
+ return true;
+out_bad_operation:
+ dprintk("nfs: Server returned operation"
+ " %d but we issued a request for %d\n",
+ opnum, expected);
+ *nfs_retval = -EREMOTEIO;
+ return false;
out_overflow:
print_overflow_msg(__func__, xdr);
- return -EIO;
+ *nfs_retval = -EIO;
+ return false;
+}
+
+static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
+{
+ int retval;
+
+ __decode_op_hdr(xdr, expected, &retval);
+ return retval;
}

/* Dummy routine */
@@ -5001,11 +5015,12 @@ static int decode_open(struct xdr_stream *xdr, struct nfs_openres *res)
uint32_t savewords, bmlen, i;
int status;

- status = decode_op_hdr(xdr, OP_OPEN);
- if (status != -EIO)
- nfs_increment_open_seqid(status, res->seqid);
- if (!status)
- status = decode_stateid(xdr, &res->stateid);
+ if (!__decode_op_hdr(xdr, OP_OPEN, &status))
+ return status;
+ nfs_increment_open_seqid(status, res->seqid);
+ if (status)
+ return status;
+ status = decode_stateid(xdr, &res->stateid);
if (unlikely(status))
return status;

--
1.8.3.1

2013-12-03 21:22:12

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
> OK, it makes sense that touching a file with a bad name would get an
> error, but you're seeing that cause later creates of files on the same
> filesystem fail. I can't figure out why that would happen.
...

So maybe there's some other problem here, but...

> > >>Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
> > >The NFS client will then happily map that straight into EINVAL for you...

This seems like a spec bug?

NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
wire all the time. But I don't know what other error would work.

I guess a client could map INVAL to EILSEQ on open or lookup (is there
any other reason a correct client should ever see INVAL on those ops?).
Or do that only if fs_charset is supported and has
FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.

--b.

2013-12-04 15:41:06

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, Dec 04, 2013 at 08:55:04AM +0200, Antti Tönkyrä wrote:
> On 2013-12-03 23:22, Dr Fields James Bruce wrote:
> >On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
> >>OK, it makes sense that touching a file with a bad name would get an
> >>error, but you're seeing that cause later creates of files on the same
> >>filesystem fail. I can't figure out why that would happen.
> >...
> >
> >So maybe there's some other problem here, but...
> >
> >>>>>Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
> >>>>The NFS client will then happily map that straight into EINVAL for you...
> >This seems like a spec bug?
> >
> >NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
> >wire all the time. But I don't know what other error would work.
> >
> >I guess a client could map INVAL to EILSEQ on open or lookup (is there
> >any other reason a correct client should ever see INVAL on those ops?).
> >Or do that only if fs_charset is supported and has
> >FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.
> >
> >--b.
> Hello,
>
> I don't really think it's an issue if we don't do any mapping here
> either, I/O error is perfectly acceptable to me but the whole share
> dying is of course not very desirable...
>
> I have been conducting my tests with ntfs-3g for now and after
> applying my patch to map EILSEQ into INVAL I didn't witness the
> share dying anymore. I captured network traffic from both cases
> (urls below). The tests were conducted so that after mounting the
> NFS share a file was opened (with w-mode) after which pcap was
> started. After that the following commands were executed:
>
> touch `echo -e $'\some_bad_sequence'` (I tested these with \377 and \375)
> <wait a bit>
> touch something
>
> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
> http://daedalus.pingtimeout.net/dbg/eilseq_mapped.pcap
>
> With mapping patch applied, the bad sequence touch returns invalid
> argument but doesn't kill the share as somewhat visible from the
> pcap.

The later creates show up as opens for "something", and the server's
resending with NFSERR_IO on the open.

So, thanks, that probably rules out any client-side bug.

Since you're testing a fuse filesystem--is there an easy way to get some
debugging info out of it? E.g. is it running as an ntfs-3g daemon that
you can strace?

--b.

2013-12-02 20:52:54

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-02 22:45, Trond Myklebust wrote:
> On Dec 2, 2013, at 15:21, Antti T?nkyr? <[email protected]> wrote:
>
>> Hello,
>>
>> NTFS file system and some other filesystems seem to return EILSEQ when user passes badly formatted data. Current NFSv4 implementation does not map EILSEQ into any known NFSv4 error code which causes problems in some use cases. In my case I observed that if filesystem returns EILSEQ, the NFS share will begin I/O erroring until it's able to recover (If there are handles open to the files in the share, I/O errors will continue until they are closed after which the share recovers after small period of time).
>>
>> Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
> The NFS client will then happily map that straight into EINVAL for you...
>
> Cheers,
> Trond
Hello,

But if we get a (somewhat) sane error back, we'd avoid the whole NFS
share dying in I/O errors because someone decided to touch a file with
his terminal having wrong charset (if I understood this correctly). The
problem here is that we don't simply get one I/O error from the
offending file, but rather that the whole share dies after EILSEQ is
returned from the backing filesystem.

BR, Antti

2013-12-04 08:31:22

by Christoph Hellwig

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

Please just fix ntfs to do the mapping for you, EILSEQ is not an
error the VFS or applications expect either.

2013-12-04 11:34:35

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-04 13:19, Christoph Hellwig wrote:
> On Wed, Dec 04, 2013 at 01:15:41PM +0200, Antti T?nkyr? wrote:
>> >On 2013-12-04 10:31, Christoph Hellwig wrote:
>>> > >Please just fix ntfs to do the mapping for you, EILSEQ is not an
>>> > >error the VFS or applications expect either.
>>> > >
>> >How would you propose the mapping to be done? Are you sure about
>> >EILSEQ not being an error that VFS/applications expect, my userspace
>> >tools seem to handle local filesystem returning EILSEQ just fine?
> EILSEQ doesn't have a specified meaning for VFS operations. Of course
> your app could handle it in some way, but that's your implementation
> specific way that has not base for it.
>
Okay, I understand your point. However, in this case I was referring to
basic GNU/Linux userspace tools which I can hardly call my own apps :)

All in all I guess we could keep the current mapping path which
eventually leads into I/O error. Next we should understand why the share
fails when I/O error is emitted...

- Antti

2013-12-04 20:44:50

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-04 17:41, Dr Fields James Bruce wrote:
> On Wed, Dec 04, 2013 at 08:55:04AM +0200, Antti Tönkyrä wrote:
>> On 2013-12-03 23:22, Dr Fields James Bruce wrote:
>>> On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
>>>> OK, it makes sense that touching a file with a bad name would get an
>>>> error, but you're seeing that cause later creates of files on the same
>>>> filesystem fail. I can't figure out why that would happen.
>>> ...
>>>
>>> So maybe there's some other problem here, but...
>>>
>>>>>>> Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
>>>>>> The NFS client will then happily map that straight into EINVAL for you...
>>> This seems like a spec bug?
>>>
>>> NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
>>> wire all the time. But I don't know what other error would work.
>>>
>>> I guess a client could map INVAL to EILSEQ on open or lookup (is there
>>> any other reason a correct client should ever see INVAL on those ops?).
>>> Or do that only if fs_charset is supported and has
>>> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.
>>>
>>> --b.
>> Hello,
>>
>> I don't really think it's an issue if we don't do any mapping here
>> either, I/O error is perfectly acceptable to me but the whole share
>> dying is of course not very desirable...
>>
>> I have been conducting my tests with ntfs-3g for now and after
>> applying my patch to map EILSEQ into INVAL I didn't witness the
>> share dying anymore. I captured network traffic from both cases
>> (urls below). The tests were conducted so that after mounting the
>> NFS share a file was opened (with w-mode) after which pcap was
>> started. After that the following commands were executed:
>>
>> touch `echo -e $'\some_bad_sequence'` (I tested these with \377 and \375)
>> <wait a bit>
>> touch something
>>
>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
>> http://daedalus.pingtimeout.net/dbg/eilseq_mapped.pcap
>>
>> With mapping patch applied, the bad sequence touch returns invalid
>> argument but doesn't kill the share as somewhat visible from the
>> pcap.
> The later creates show up as opens for "something", and the server's
> resending with NFSERR_IO on the open.
>
> So, thanks, that probably rules out any client-side bug.
>
> Since you're testing a fuse filesystem--is there an easy way to get some
> debugging info out of it? E.g. is it running as an ntfs-3g daemon that
> you can strace?
>
> --b.
Looks like I can strace it and I did an example strace, if you need
something more specific please do tell. Command log has completion
timestamps for associating what part of strace happened at what command.

http://daedalus.pingtimeout.net/dbg/strace-ioerr.txt
http://daedalus.pingtimeout.net/dbg/strace-commandlog.txt

And just in case you missed my other mail, the I/O error doesn't kill
the share if using NFSv3 (mount -t nfs -o vers=3,intr,hard ...). The
initial I/O error happens but the share doesn't die even when there are
file handles open there.

- Antti

2013-12-04 06:55:10

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-03 23:22, Dr Fields James Bruce wrote:
> On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
>> OK, it makes sense that touching a file with a bad name would get an
>> error, but you're seeing that cause later creates of files on the same
>> filesystem fail. I can't figure out why that would happen.
> ...
>
> So maybe there's some other problem here, but...
>
>>>>> Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
>>>> The NFS client will then happily map that straight into EINVAL for you...
> This seems like a spec bug?
>
> NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
> wire all the time. But I don't know what other error would work.
>
> I guess a client could map INVAL to EILSEQ on open or lookup (is there
> any other reason a correct client should ever see INVAL on those ops?).
> Or do that only if fs_charset is supported and has
> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.
>
> --b.
Hello,

I don't really think it's an issue if we don't do any mapping here
either, I/O error is perfectly acceptable to me but the whole share
dying is of course not very desirable...

I have been conducting my tests with ntfs-3g for now and after applying
my patch to map EILSEQ into INVAL I didn't witness the share dying
anymore. I captured network traffic from both cases (urls below). The
tests were conducted so that after mounting the NFS share a file was
opened (with w-mode) after which pcap was started. After that the
following commands were executed:

touch `echo -e $'\some_bad_sequence'` (I tested these with \377 and \375)
<wait a bit>
touch something

http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
http://daedalus.pingtimeout.net/dbg/eilseq_mapped.pcap

With mapping patch applied, the bad sequence touch returns invalid
argument but doesn't kill the share as somewhat visible from the pcap.

BR, Antti

2013-12-04 11:19:36

by Christoph Hellwig

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, Dec 04, 2013 at 01:15:41PM +0200, Antti T?nkyr? wrote:
> On 2013-12-04 10:31, Christoph Hellwig wrote:
> >Please just fix ntfs to do the mapping for you, EILSEQ is not an
> >error the VFS or applications expect either.
> >
> How would you propose the mapping to be done? Are you sure about
> EILSEQ not being an error that VFS/applications expect, my userspace
> tools seem to handle local filesystem returning EILSEQ just fine?

EILSEQ doesn't have a specified meaning for VFS operations. Of course
your app could handle it in some way, but that's your implementation
specific way that has not base for it.

> Moreover I think we really should see why nfs goes into permanent
> I/O error state when this is triggered. I don't really mind seeing
> nfs unhandled error codes in my kernel log and I/O error on the nfs
> client when trying to write badly named files but the fact we cannot
> recover from the situation (without closing all handles to the open
> share) is alarming.

No argument on that. Just saying that the mapping you proposed is not a
good idea.

2013-12-05 19:45:43

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, Dec 04, 2013 at 01:34:31PM +0200, Antti Tönkyrä wrote:
> On 2013-12-04 13:19, Christoph Hellwig wrote:
> >On Wed, Dec 04, 2013 at 01:15:41PM +0200, Antti T?nkyr? wrote:
> >>>On 2013-12-04 10:31, Christoph Hellwig wrote:
> >>>> >Please just fix ntfs to do the mapping for you, EILSEQ is not an
> >>>> >error the VFS or applications expect either.
> >>>> >
> >>>How would you propose the mapping to be done? Are you sure about
> >>>EILSEQ not being an error that VFS/applications expect, my userspace
> >>>tools seem to handle local filesystem returning EILSEQ just fine?
> >EILSEQ doesn't have a specified meaning for VFS operations. Of course
> >your app could handle it in some way, but that's your implementation
> >specific way that has not base for it.
> >
> Okay, I understand your point. However, in this case I was referring
> to basic GNU/Linux userspace tools which I can hardly call my own
> apps :)
>
> All in all I guess we could keep the current mapping path which
> eventually leads into I/O error. Next we should understand why the
> share fails when I/O error is emitted...

OK, that fixed....

I'm still confused about the question of what error to return.

I don't think man pages or posix allow open to fail because of a bad
name, but there are filesystems that do, so you're stuck returning
something.

>From a grep of EILSEQ in fs/ it looks like that's what ntfs and befs
return on lookups of bad utf-8.

I guess NFS will return EINVAL (if the server follows 3530).

--b.

2013-12-05 08:39:26

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-05 00:49, Trond Myklebust wrote:
> On Wed, 2013-12-04 at 16:38 -0500, Dr Fields James Bruce wrote:
>> On Wed, Dec 04, 2013 at 04:22:48PM -0500, Trond Myklebust wrote:
>>> On Dec 4, 2013, at 16:03, Dr Fields James Bruce <[email protected]> wrote:
>>>
>>>> On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti Tönkyrä wrote:
>>>>
>>>>>>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
>>>> And I see something I'd overlooked before: the client is sending the
>>>> later opens with the same open owner and sequence id. But NFS4ERR_IO is
>>>> a seqid-mutating error. So now I think this probably *is* a client
>>>> bug....
>>> Umm… Yes and no. The client should be able to recover when it discovers that the seqid is out of sync.
>>>
>>> That said, I see that we do
>>>
>>> status = decode_op_hdr(xdr, OP_OPEN);
>>> if (status != -EIO)
>>> nfs_increment_open_seqid(status, res->seqid);
>>>
>>> and since NFS4ERR_IO == EIO, that means we skip the seqid update when you send us NFS4ERR_IO.
>> Oh, OK. Maybe decode_op_hdr could use -NFS4ERR_BADXDR for the two
>> decoding errors it catches and eliminate the need for this special -EIO
>> case?
> That is sort of hackish. How about something like the following patch.
>
>> I think NFS4ERR_IO is a legal error for these operations. (Even if the
>> server should have returned something else in this case.)
> It is legal for OPEN. It does not seem to be legal for OPEN_CONFIRM,
> LOCK, LOCKU, CLOSE or OPEN_DOWNGRADE according to RFC3530bis.
>
> Cheers
> Trond
>
> 8<--------------------------------------------------------
> From 968a89bfaf176d70352bb1c0003bf496fdc180aa Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Wed, 4 Dec 2013 17:39:23 -0500
> Subject: [PATCH] NFSv4: OPEN must handle the NFS4ERR_IO return code correctly
>
> decode_op_hdr() cannot distinguish between an XDR decoding error and
> the perfectly valid errorcode NFS4ERR_IO. This is normally not a
> problem, but for the particular case of OPEN, we need to be able
> to increment the NFSv4 open sequence id when the server returns
> a valid response.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs4xdr.c | 47 +++++++++++++++++++++++++++++++----------------
> 1 file changed, 31 insertions(+), 16 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 5be2868c02f1..8c21d69a9dc1 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -3097,7 +3097,8 @@ out_overflow:
> return -EIO;
> }
>
> -static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
> +static bool __decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected,
> + int *nfs_retval)
> {
> __be32 *p;
> uint32_t opnum;
> @@ -3107,19 +3108,32 @@ static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
> if (unlikely(!p))
> goto out_overflow;
> opnum = be32_to_cpup(p++);
> - if (opnum != expected) {
> - dprintk("nfs: Server returned operation"
> - " %d but we issued a request for %d\n",
> - opnum, expected);
> - return -EIO;
> - }
> + if (unlikely(opnum != expected))
> + goto out_bad_operation;
> nfserr = be32_to_cpup(p);
> - if (nfserr != NFS_OK)
> - return nfs4_stat_to_errno(nfserr);
> - return 0;
> + if (nfserr == NFS_OK)
> + *nfs_retval = 0;
> + else
> + *nfs_retval = nfs4_stat_to_errno(nfserr);
> + return true;
> +out_bad_operation:
> + dprintk("nfs: Server returned operation"
> + " %d but we issued a request for %d\n",
> + opnum, expected);
> + *nfs_retval = -EREMOTEIO;
> + return false;
> out_overflow:
> print_overflow_msg(__func__, xdr);
> - return -EIO;
> + *nfs_retval = -EIO;
> + return false;
> +}
> +
> +static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
> +{
> + int retval;
> +
> + __decode_op_hdr(xdr, expected, &retval);
> + return retval;
> }
>
> /* Dummy routine */
> @@ -5001,11 +5015,12 @@ static int decode_open(struct xdr_stream *xdr, struct nfs_openres *res)
> uint32_t savewords, bmlen, i;
> int status;
>
> - status = decode_op_hdr(xdr, OP_OPEN);
> - if (status != -EIO)
> - nfs_increment_open_seqid(status, res->seqid);
> - if (!status)
> - status = decode_stateid(xdr, &res->stateid);
> + if (!__decode_op_hdr(xdr, OP_OPEN, &status))
> + return status;
> + nfs_increment_open_seqid(status, res->seqid);
> + if (status)
> + return status;
> + status = decode_stateid(xdr, &res->stateid);
> if (unlikely(status))
> return status;
>
I compiled 3.13-rc1 with that patch included.

Touching the file now returns I/O error just like before but the share
doesn't die anymore so the patch works.

- Antti

2013-12-04 21:08:15

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-04 23:03, Dr Fields James Bruce wrote:
> On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti Tönkyrä wrote:
>> Looks like I can strace it and I did an example strace, if you need
>> something more specific please do tell. Command log has completion
>> timestamps for associating what part of strace happened at what
>> command.
>>
>> http://daedalus.pingtimeout.net/dbg/strace-ioerr.txt
>> http://daedalus.pingtimeout.net/dbg/strace-commandlog.txt
> Eh, not much use without knowing fuse's protocol, but:
>
>> And just in case you missed my other mail, the I/O error doesn't
>> kill the share if using NFSv3 (mount -t nfs -o vers=3,intr,hard
>> ...). The initial I/O error happens but the share doesn't die even
>> when there are file handles open there.
> Oh, I overlooked that. So that merits another look at the network
> trace:
>
>>>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
> And I see something I'd overlooked before: the client is sending the
> later opens with the same open owner and sequence id. But NFS4ERR_IO is
> a seqid-mutating error. So now I think this probably *is* a client
> bug....
>
> What's the client kernel version?
>
> --b.
Tests were mostly conducted using 3.8.0-29 (ubuntu flavour). Original
bug was triggered with 3.8.10.

- Antti

2013-12-02 20:45:08

by Trond Myklebust

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Dec 2, 2013, at 15:21, Antti T?nkyr? <[email protected]> wrote:

> Hello,
>
> NTFS file system and some other filesystems seem to return EILSEQ when user passes badly formatted data. Current NFSv4 implementation does not map EILSEQ into any known NFSv4 error code which causes problems in some use cases. In my case I observed that if filesystem returns EILSEQ, the NFS share will begin I/O erroring until it's able to recover (If there are handles open to the files in the share, I/O errors will continue until they are closed after which the share recovers after small period of time).
>
> Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".

The NFS client will then happily map that straight into EINVAL for you...

Cheers,
Trond

2013-12-04 12:33:39

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-03 23:22, Dr Fields James Bruce wrote:
> On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
>> OK, it makes sense that touching a file with a bad name would get an
>> error, but you're seeing that cause later creates of files on the same
>> filesystem fail. I can't figure out why that would happen.
> ...
>
> So maybe there's some other problem here, but...
>
>>>>> Given that widely used ntfs-3g FUSE module also returns EILSEQ on the same case (I tested this) I would argue that a fix should be done for upstream especially since RFC5661 clearly defines that invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote: "Where the client sends an invalid UTF-8 string, the server should return NFS4ERR_INVAL (see Table 5)".
>>>> The NFS client will then happily map that straight into EINVAL for you...
> This seems like a spec bug?
>
> NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
> wire all the time. But I don't know what other error would work.
>
> I guess a client could map INVAL to EILSEQ on open or lookup (is there
> any other reason a correct client should ever see INVAL on those ops?).
> Or do that only if fs_charset is supported and has
> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.
>
> --b.
Ewh...

I did some further testing and managed to get stale NFS file handles
instead of I/O errors on one run. (Haven't been able to repeat this
behaviour)

After that I did another run (which errored with I/O errors) and found
out that I am able to touch existing files but not create any new files.
Touching an existing file changes the mtime as expected on both the NFS
share and backing FS so something is still being exchanged by the NFS
client and server. I also tested to write into a touchable file but that
returned an I/O error again.

Example below:
# ls -l p
-rwxrwxrwx 1 root root 0 Dec 4 14:26 p
# touch p
# ls -l p
-rwxrwxrwx 1 root root 0 Dec 4 14:31 p
# touch newfile
touch: cannot touch `newfile': Input/output error
# echo test > p
-su: p: Input/output error

- Antti

2013-12-04 12:40:13

by Antti Tönkyrä

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On 2013-12-04 14:33, Antti T?nkyr? wrote:
> On 2013-12-03 23:22, Dr Fields James Bruce wrote:
>> On Tue, Dec 03, 2013 at 03:48:06PM -0500, Dr Fields James Bruce wrote:
>>> OK, it makes sense that touching a file with a bad name would get an
>>> error, but you're seeing that cause later creates of files on the same
>>> filesystem fail. I can't figure out why that would happen.
>> ...
>>
>> So maybe there's some other problem here, but...
>>
>>>>>> Given that widely used ntfs-3g FUSE module also returns EILSEQ on
>>>>>> the same case (I tested this) I would argue that a fix should be
>>>>>> done for upstream especially since RFC5661 clearly defines that
>>>>>> invalid UTF-8 sequence should map into NFSERR_INVAL, exact quote:
>>>>>> "Where the client sends an invalid UTF-8 string, the server
>>>>>> should return NFS4ERR_INVAL (see Table 5)".
>>>>> The NFS client will then happily map that straight into EINVAL for
>>>>> you...
>> This seems like a spec bug?
>>
>> NFS4ERR_INVAL only makes sense if you could really mandate UTF-8 on the
>> wire all the time. But I don't know what other error would work.
>>
>> I guess a client could map INVAL to EILSEQ on open or lookup (is there
>> any other reason a correct client should ever see INVAL on those ops?).
>> Or do that only if fs_charset is supported and has
>> FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 set. Yuch.
>>
>> --b.
> Ewh...
>
> I did some further testing and managed to get stale NFS file handles
> instead of I/O errors on one run. (Haven't been able to repeat this
> behaviour)
>
> After that I did another run (which errored with I/O errors) and found
> out that I am able to touch existing files but not create any new
> files. Touching an existing file changes the mtime as expected on both
> the NFS share and backing FS so something is still being exchanged by
> the NFS client and server. I also tested to write into a touchable
> file but that returned an I/O error again.
>
> Example below:
> # ls -l p
> -rwxrwxrwx 1 root root 0 Dec 4 14:26 p
> # touch p
> # ls -l p
> -rwxrwxrwx 1 root root 0 Dec 4 14:31 p
> # touch newfile
> touch: cannot touch `newfile': Input/output error
> # echo test > p
> -su: p: Input/output error
>
> - Antti
>
Also might be worth noting that NFSv3 also emits the "nfsd: non-standard
errno: -84" and touch command on client says "I/O error" but in NFSv3
case the client doesn't start to I/O erroring on everything after that.

- Antti

2013-12-04 21:38:06

by J. Bruce Fields

[permalink] [raw]

Subject: Re: Patch for mapping EILSEQ into NFSERR_INVAL

On Wed, Dec 04, 2013 at 04:22:48PM -0500, Trond Myklebust wrote:
>
> On Dec 4, 2013, at 16:03, Dr Fields James Bruce <[email protected]> wrote:
>
> > On Wed, Dec 04, 2013 at 10:44:45PM +0200, Antti Tönkyrä wrote:
> >
> >>>> http://daedalus.pingtimeout.net/dbg/eilseq_ioerr.pcap
> >
> > And I see something I'd overlooked before: the client is sending the
> > later opens with the same open owner and sequence id. But NFS4ERR_IO is
> > a seqid-mutating error. So now I think this probably *is* a client
> > bug....
>
> Umm… Yes and no. The client should be able to recover when it discovers that the seqid is out of sync.
>
> That said, I see that we do
>
> status = decode_op_hdr(xdr, OP_OPEN);
> if (status != -EIO)
> nfs_increment_open_seqid(status, res->seqid);
>
> and since NFS4ERR_IO == EIO, that means we skip the seqid update when you send us NFS4ERR_IO.

Oh, OK. Maybe decode_op_hdr could use -NFS4ERR_BADXDR for the two
decoding errors it catches and eliminate the need for this special -EIO
case?

I think NFS4ERR_IO is a legal error for these operations. (Even if the
server should have returned something else in this case.)

--b.