2014-12-16 22:35:05

by Holger Hoffstätte

Subject: 3.18.1: broken directory with one file too many

(please CC: for followups)

I just spent two hours trying to untangle a *weird* bug that I have not
seen before. It might be new to 3.18.x but I don't know for sure.
Apologies in advance for the long prelude but I figured I need to
describe the problem scenario as precisely as possible.

All this is on freshly baked 3.18.1 with Gentoo userland; the exported
filesystem is ext4.

On my NFS server I work with a git repo:

holger>git clone ../work/kernel-patches.git
Cloning into 'kernel-patches'...
done.
holger>cd kernel-patches
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
holger>ll
total 92K
drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>

Looking fine!

On my NFS client this directory is automounted via NFS:

holger>mount | grep home
tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)

This has worked for ages and never caused any problems.

Let's see how my git repo is doing:

holger>cd /mnt/tux/holger/Projects/kernel-patches
holger>ll
total 92K
drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Untracked files:
(use "git add <file>..." to include in what will be committed)

3.14/btrfs-20

nothing added to commit but untracked files present (use "git add" to track)

..wait, what? There is no such file "btrfs-20" !

holger>ll 3.14 | head
ls: cannot access 3.14/btrfs-20: No such file or directory
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-????????? ? ? ? ? ? btrfs-20
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
holger>

There is a "rogue" file messing up the directory?!

This used to work until I added a specific file, so..

holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
Size: 2306 Blocks: 8 IO Block: 1048576 regular file
Device: 18h/24d Inode: 22544856 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ holger) Gid: ( 100/ users)
Access: 2014-12-16 22:41:36.515665610 +0100
Modify: 2014-12-16 22:41:36.515665610 +0100
Change: 2014-12-16 22:41:36.515665610 +0100
Birth: -

Looks fine..maybe try moving it to the parent?

holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
holger>ll
total 96K
drwxr-xr-x 2 holger users 72K Dec 16 22:44 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

deleted: 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

Untracked files:
(use "git add <file>..." to include in what will be committed)

btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

no changes added to commit (use "git add" and/or "git commit -a")

holger>ll 3.14 | head
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch

I can move it back into 3.14/ and the directory is messed up again.
All this is reproducible, in different export directories.

Any ideas what this might be? A direntry hash collision maybe?
There is a large number of files starting with btrfs-2014xxyy-.. but with
the typical kernel patch names (some quite long), so that would be pretty
bad. Also everything works locally on ext4 without problems, so I suspect
it's an isolated NFS problem.

Things that did not work - same results every time:
- dropping caches on client & server
- restarting NFS on client & server

Any suggestions welcome; I'll gladly test patches.

thanks
Holger



2014-12-17 21:22:00

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> g
> (please CC: for followups)
>
> I just spent two hours trying to untangle a *weird* bug that I have not
> seen before. It might be new to 3.18.x but I don't know for sure.
> Apologies in advance for the long prelude but I figured I need to
> describe the problem scenario as precisely as possible.
>
> All this is on freshly baked 3.18.1 with Gentoo userland; the exported
> filesystem is ext4.
>
> On my NFS server I work with a git repo:
>
> holger>git clone ../work/kernel-patches.git
> Cloning into 'kernel-patches'...
> done.
> holger>cd kernel-patches
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> nothing to commit, working directory clean
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>
>
> Looking fine!
>
> On my NFS client this directory is automounted via NFS:
>
> holger>mount | grep home
> tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)
>
> This has worked for ages and never caused any problems.
>
> Let's see how my git repo is doing:
>
> holger>cd /mnt/tux/holger/Projects/kernel-patches
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
>
> 3.14/btrfs-20
>
> nothing added to commit but untracked files present (use "git add" to track)
>
> ..wait, what? There is no such file "btrfs-20" !
>
> holger>ll 3.14 | head
> ls: cannot access 3.14/btrfs-20: No such file or directory
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -????????? ? ? ? ? ? btrfs-20
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> holger>
>
> There is a "rogue" file messing up the directory?!
>
> This used to work until I added a specific file, so..
>
> holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
> Size: 2306 Blocks: 8 IO Block: 1048576 regular file
> Device: 18h/24d Inode: 22544856 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 1000/ holger) Gid: ( 100/ users)
> Access: 2014-12-16 22:41:36.515665610 +0100
> Modify: 2014-12-16 22:41:36.515665610 +0100
> Change: 2014-12-16 22:41:36.515665610 +0100
> Birth: -
>
> Looks fine..maybe try moving it to the parent?
>
> holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
> holger>ll
> total 96K
> drwxr-xr-x 2 holger users 72K Dec 16 22:44 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Changes not staged for commit:
> (use "git add/rm <file>..." to update what will be committed)
> (use "git checkout -- <file>..." to discard changes in working directory)
>
> deleted: 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
>
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> no changes added to commit (use "git add" and/or "git commit -a")
>
> holger>ll 3.14 | head
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch
>
> I can move it back into 3.14/ and the directory is messed up again.
> All this is reproducible, in different export directories.
>
> Any ideas what this might be? A direntry hash collision maybe?
> There is a large number of files starting with btrfs-2014xxyy-.. but with
> the typical kernel patch names (some quite long), so that would be pretty
> bad. Also everything works locally on ext4 without problems, so I suspect
> it's an isolated NFS problem.

That doesn't sound familiar. A network trace showing the READDIR would
be really useful. Since this is so reproducible, I think that should be
possible. So do something like:

move the problem file into 3.14/
tcpdump -s0 -wtmp.pcap -i<relevant interface>
ls the directory on the client.
kill tcpdump
send us tmp.pcap and/or take a look at it with wireshark and see
what the READDIR response looks like.
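
For example, one concrete way to run that sequence (the interface name and the
port-2049 filter are assumptions about your setup, adjust as needed):

	tcpdump -s0 -i eth0 -w tmp.pcap port 2049 &
	ls /mnt/tux/holger/Projects/kernel-patches/3.14 > /dev/null
	kill %1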

--b.

2014-12-18 12:31:18

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On 12/17/14 22:22, J. Bruce Fields wrote:
> On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
>> (..oddly broken directory over NFS..)
> That doesn't sound familiar. A network trace showing the READDIR would
> be really useful. Since this is so reproducible, I think that should be
> possible. So do something like:
>
> move the problem file into 3.14/
> tcpdump -s0 -wtmp.pcap -i<relevant interface>
> ls the directory on the client.
> kill tcpdump
> send us tmp.pcap and/or take a look at it with wireshark and see
> what the READDIR response looks like.

Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be specifically the new file itself that is the cause.

I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpdump files here: http://hoho.duckdns.org/linux/ - let me know if that helps.

Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..

Thanks,
Holger


2014-12-18 12:52:11

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, 18 Dec 2014 13:22:40 +0100, Holger Hoffstätte wrote:

> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
> the client to vanilla 3.18.1 or 3.14.27 has not helped..

Rebooted the server into 3.14.27 (latest longterm stable), dropped caches
on the client and the problem is gone: listing the directory works fine.

-h


2014-12-18 12:59:54

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, 18 Dec 2014 12:51:56 +0000, Holger Hoffstätte wrote:

> On Thu, 18 Dec 2014 13:22:40 +0100, Holger Hoffstätte wrote:
>
>> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
>> the client to vanilla 3.18.1 or 3.14.27 has not helped..
>
> Rebooted the server into 3.14.27 (latest longterm stable), dropped
> caches on the client and the problem is gone: listing the directory
> works fine.

And just for good measure vanilla 3.18.0 also shows the problem, so it's
not a .0/.1 regression.

I'm afraid that's all I can do for now. :/

thanks
Holger


2014-12-18 14:48:58

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> On 12/17/14 22:22, J. Bruce Fields wrote:
> > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> >> (..oddly broken directory over NFS..)
> > That doesn't sound familiar. A network trace showing the READDIR would
> > be really useful. Since this is so reproducible, I think that should be
> > possible. So do something like:
> >
> > move the problem file into 3.14/
> > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > ls the directory on the client.
> > kill tcpdump
> > send us tmp.pcap and/or take a look at it with wireshark and see
> > what the READDIR response looks like.
>
> Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
>
> I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.

On a quick skim, the server's READDIR responses look correct. The entry
btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
is returned in frame 53 (with complete reassembled reply displayed by
wireshark in frame 63).

You could double-check for me--just run "wireshark nfs-server.pcap",
look for packets labeled "Reply ... READDIR", and expand out the READDIR
op and directory listing. I don't see anything obviously wrong.

It's interesting that there's only one LOOKUP in the trace, for btrfs-20
(returning, not surprisingly, NFS4ERR_NOENT). If the client failed to
parse that entry for some reason, then maybe in addition to getting the
filename wrong it also failed to get the attributes, triggering the
extra lookup/getattr.

> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..

I'm a little unclear: when you said "All this is on freshly baked
3.18.1", are you describing the client, or the server, or both?

--b.

2014-12-18 14:58:11

by Benjamin Coddington

Subject: Re: 3.18.1: broken directory with one file too many

Frame 36 of nfs-client.pcap has this interesting string:

0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.

...

Ben

On Thu, 18 Dec 2014, J. Bruce Fields wrote:

> On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> > On 12/17/14 22:22, J. Bruce Fields wrote:
> > > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> > >> (..oddly broken directory over NFS..)
> > > That doesn't sound familiar. A network trace showing the READDIR would
> > > be really useful. Since this is so reproducible, I think that should be
> > > possible. So do something like:
> > >
> > > move the problem file into 3.14/
> > > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > > ls the directory on the client.
> > > kill tcpdump
> > > send us tmp.pcap and/or take a look at it with wireshark and see
> > > what the READDIR response looks like.
> >
> > Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
> >
> > I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.
>
> On a quick skim, the server's READDIR responses look correct. The entry
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> is returned in frame 53 (with complete reassembled reply displayed by
> wireshark in frame 63).
>
> You could double-check for me--just run "wireshark nfs-server.pcap",
> look for packets labeled "Reply ... READDIR", and expand out the READDIR
> op and directory listing. I don't see anything obviously wrong.
>
> It's interesting that there's only one LOOKUP in the trace, for btrfs-20
> (returning, not suprisingly, NFS4ERR_NOENT). If the client failed to
> parse that entry for some reason, then maybe in addition to getting the
> filename wrong it also failed to get the attributes, triggering the
> extra lookup/getattr.
>
> > Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..
>
> I'm a little unclear: when you said "All this is on freshly baked
> 3.18.1", are you describing the client, or the server, or both?
>
> --b.

2014-12-18 15:19:16

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> Frame 36 of nfs-client.pcap has this interesting string:
>
> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.

Yes, that looks like the server messing up the encoding of the reply.

Holger, what's the difference between nfs-client.pcap and
nfs-server.pcap?

--b.

>
> ...
>
> Ben
>
> On Thu, 18 Dec 2014, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> > > On 12/17/14 22:22, J. Bruce Fields wrote:
> > > > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> > > >> (..oddly broken directory over NFS..)
> > > > That doesn't sound familiar. A network trace showing the READDIR would
> > > > be really useful. Since this is so reproducible, I think that should be
> > > > possible. So do something like:
> > > >
> > > > move the problem file into 3.14/
> > > > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > > > ls the directory on the client.
> > > > kill tcpdump
> > > > send us tmp.pcap and/or take a look at it with wireshark and see
> > > > what the READDIR response looks like.
> > >
> > > Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
> > >
> > > I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.
> >
> > On a quick skim, the server's READDIR responses look correct. The entry
> > btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> > is returned in frame 53 (with complete reassembled reply displayed by
> > wireshark in frame 63).
> >
> > You could double-check for me--just run "wireshark nfs-server.pcap",
> > look for packets labeled "Reply ... READDIR", and expand out the READDIR
> > op and directory listing. I don't see anything obviously wrong.
> >
> > It's interesting that there's only one LOOKUP in the trace, for btrfs-20
> > (returning, not suprisingly, NFS4ERR_NOENT). If the client failed to
> > parse that entry for some reason, then maybe in addition to getting the
> > filename wrong it also failed to get the attributes, triggering the
> > extra lookup/getattr.
> >
> > > Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..
> >
> > I'm a little unclear: when you said "All this is on freshly baked
> > 3.18.1", are you describing the client, or the server, or both?
> >
> > --b.


2014-12-18 15:43:15

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, 18 Dec 2014 10:19:14 -0500, J. Bruce Fields wrote:

> On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
>> Frame 36 of nfs-client.pcap has this interesting string:
>>
>> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
>> 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
>> 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
>> 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
>> 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.
>
> Yes, that looks like the server messing up the encoding of the reply.
>
> Holger, what's the difference between nfs-client.pcap and
> nfs-server.pcap?

One is the client, one is the server :-)

No, really. Both 3.18.1, 64-bit, same userland/compiler/etc. and even same
CFLAGS for both kernel and userland. I built tcpdump on both (from
packages) and ran them back to back literally 10 seconds apart.

What "encoding" are we talking about here? The NFS RPC encoding?
(everything I know & remember about NFS is 15+ years old..)

thanks,
Holger


2014-12-18 16:32:55

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 03:42:53PM +0000, Holger Hoffstätte wrote:
> On Thu, 18 Dec 2014 10:19:14 -0500, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> >> Frame 36 of nfs-client.pcap has this interesting string:
> >>
> >> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> >> 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> >> 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> >> 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> >> 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.
> >
> > Yes, that looks like the server messing up the encoding of the reply.
> >
> > Holger, what's the difference between nfs-client.pcap and
> > nfs-server.pcap?
>
> One is the client, one is the server :-)
>
> No, really. Both 3.18.1, 64it, same userland/compiler/etc. and even same
> CFLAGS for both kernel and userland. I built tcpdump on both (from
> packages) and ran them back to back literally 10 seconds apart.

So this is two separate tests, both failed, the only difference is that
the tcpdump was run on the client in one case and the server in the
other?

> What "encoding" are we talking about here? The NFS RPC encoding?
> (everything I know & remember about NFS is 15+ years old..)

Right, I suspect there's a bug in the server's encoding of the reply to
the READDIR rpc. Odd that the encoding would look correct from the
point of view of tcpdump on the server but not from the point of view of
tcpdump on the client.

--b.

2014-12-18 16:42:41

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On 12/18/14 17:32, J. Bruce Fields wrote:
> So this is two separate tests, both failed, the only difference is that
> the tcpdump was run on the client in one case and the server in the
> other?

Exactly: client first, then I realized you'd want to look at the server-side too.
That being said..

> Right, I suspect there's a bug in the server's encoding of the reply to
> the READDIR rpc. Odd that the encoding would look correct from the
> point of view of tcpdump on the server but not from the point of view of
> tcpdump on the client.

..in order to rule out a mistake on my part with the two separate runs (which prevents correlated analysis) I was just about to boot the server back into 3.18.1 and re-run both client- and server-side traces simultaneously. However I have to head out for a bit first; will post that later today.

Thanks so far!

Holger

PS: sorry for the mangled replies, that seems to be Gmane. Using email only now.


2014-12-18 17:06:55

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 05:42:36PM +0100, Holger Hoffstätte wrote:
> On 12/18/14 17:32, J. Bruce Fields wrote:
> > So this is two separate tests, both failed, the only difference is that
> > the tcpdump was run on the client in one case and the server in the
> > other?
>
> Exactly: client first, then I realized you'd want to look at the server-side too.
> That being said..

Whoops, now I see, the server-side trace has the same problem, I just
overlooked it the first time.

> > Right, I suspect there's a bug in the server's encoding of the reply to
> > the READDIR rpc. Odd that the encoding would look correct from the
> > point of view of tcpdump on the server but not from the point of view of
> > tcpdump on the client.
>
> ..in order to rule out a mistake on my part with the two separate runs (which prevents correlated analysis) I was just about to boot the server back into 3.18.1 and re-run both client- and server-side traces simultaneously. However I have to head out for a bit first; will post that later today.

So this might still be interesting, but it's not a high priority.

--b.

2014-12-18 19:44:41

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On 12/18/14 18:06, J. Bruce Fields wrote:
> Whoops, now I see, the server-side trace has the same problem, I
> just overlooked it the first time.

Excellent, so we know it's the server's fault. Really would have been odd to not have it in the server trace.

>> ..in order to rule out a mistake on my part with the two separate
>> runs (which prevents correlated analysis) I was just about to boot
>> the server back into 3.18.1 and re-run both client- and server-side
>> traces simultaneously. However I have to head out for a bit first;
>> will post that later today.
>
> So this might still be interesting, but it's not a high priority.

Then I guess I'd better keep my feet still and not muddy the waters further; looks like you found what you need. If you still need it, just holler.

Let me know if there's anything I can do to help/patch/test!

thanks
Holger

2014-12-20 18:02:44

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 08:44:37PM +0100, Holger Hoffstätte wrote:
> On 12/18/14 18:06, J. Bruce Fields wrote:
> > Whoops, now I see, the server-side trace has the same problem, I
> > just overlooked it the first time.
>
> Excellent, so we know it's the server's fault. Really would have been odd to not have it in the server trace.
>
> >> ..in order to rule out a mistake on my part with the two separate
> >> runs (which prevents correlated analysis) I was just about to boot
> >> the server back into 3.18.1 and re-run both client- and server-side
> >> traces simultaneously. However I have to head out for a bit first;
> >> will post that later today.
> >
> > So this might still be interesting, but it's not a high priority.
>
> Then I guess I'll better keep my feet still and don't muddle the waters further, looks like you found what you need. If you still need it just holler.
>
> Let me know if there's anything I can do to help/patch/test!

Gah. Does this fix it?

A struct xdr_stream at a page boundary might point to the end of one
page or the beginning of the next, and I'm guessing xdr_truncate_encode
wasn't prepared to handle the former.

This happens if the readdir entry that would have exceeded the client's
dircount/maxcount limit would have ended exactly on a 4k page boundary,
and inspection of the trace shows you're hitting exactly that case.
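
Outside the kernel the ambiguity looks roughly like this (a stand-alone
user-space sketch with malloc'd stand-in pages, not the actual sunrpc code):

#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096

int main(void)
{
	/* Two separately allocated "pages", standing in for buf->pages[]. */
	char *pages[2] = { malloc(PAGE_SIZE), malloc(PAGE_SIZE) };

	/*
	 * After encoding exactly PAGE_SIZE bytes the stream offset is
	 * PAGE_SIZE, but the write pointer can legally be expressed two ways:
	 */
	char *end_of_page0   = pages[0] + PAGE_SIZE;  /* where xdr->p may be left */
	char *start_of_page1 = pages[1];              /* where encoding continues  */

	/* Same logical offset, different pointers into non-contiguous pages. */
	printf("same pointer? %s\n",
	       end_of_page0 == start_of_page1 ? "yes" : "no");

	free(pages[0]);
	free(pages[1]);
	return 0;
}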

If this does the job then I'll go figure out how to make this logic less
ugly....

--b.

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 1cb61242e55e..32910b91d17c 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -630,6 +630,9 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
 
 	new = buf->page_base + buf->page_len;
 	old = new + fraglen;
+	/* XXX: HACK: */
+	if (xdr->p == page_address(*xdr->page_ptr) + PAGE_SIZE)
+		xdr->page_ptr++;
 	xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);
 
 	if (buf->page_len) {

2014-12-20 18:50:32

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On 12/20/14 19:02, J. Bruce Fields wrote:
> Gah. Does this fix it?

It does! Well done. :)

Reported-by: Holger Hoffstätte <[email protected]>
Tested-by: Holger Hoffstätte <[email protected]>

Thanks!
Holger

2014-12-18 17:18:52

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 10:19:14AM -0500, J. Bruce Fields wrote:
> On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> > Frame 36 of nfs-client.pcap has this interesting string:
> >
> > 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> > 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> > 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> > 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> > 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.
>
> Yes, that looks like the server messing up the encoding of the reply.

And indeed looking at the git repo that matches up with the entry:

btrfs-20........06-fix-deadlock-when-mounting-a-degraded-fs.patch.
btrfs-20140619-006-fix-deadlock-when-mounting-a-degraded-fs.patch

So a stray 8 bytes of zeroes got written there.

And the reply also ends with 000000017fffffff when I believe it should
end with two 4-byte 0's (no value follows, not EOF).

So I think those 0's got written to the wrong offset. That should be
enough information to find the bug....
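
For reference, each entry in a READDIR4 reply is encoded as a "value follows"
boolean, an 8-byte cookie, a length-prefixed name padded to a 4-byte boundary,
and the attributes; the list then terminates with a final boolean pair
(RFC 3530, entry4/dirlist4). Schematically (illustrative bytes, not copied
from the trace):

	00 00 00 01   value_follows = TRUE   (another entry4 follows)
	xx xx xx xx   cookie, 8 bytes
	xx xx xx xx
	00 00 00 41   name length (0x41 = 65 bytes)
	..            name, zero-padded to a multiple of 4
	..            fattr4 bitmap + attribute values
	...           (further entries)
	00 00 00 00   value_follows = FALSE  (no more entries)
	00 00 00 00   eof = FALSE            (directory not exhausted)

A correctly truncated reply should therefore end in that pair of zero words,
which is what the two 4-byte 0's above refer to.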

--b.

2014-12-18 15:36:00

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, 18 Dec 2014 09:48:56 -0500, J. Bruce Fields wrote:

> On a quick skim, the server's READDIR responses look correct. The entry
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> is returned in frame 53 (with complete reassembled reply displayed by
> wireshark in frame 63).
>
> You could double-check for me--just run "wireshark nfs-server.pcap",
> look for packets labeled "Reply ... READDIR", and expand out the READDIR
> op and directory listing. I don't see anything obviously wrong.

That's what I can see in Wireshark as well (#53 as part of the "20
reassembled segments"). As I said in my followup I don't think there is
anything wrong with that particular file since removing others "fixed"
the problem. That's why I suspected NIC/TCP buggery, and since my kernels
usually have a bunch of patches (the ones in that repo) I wanted to try
vanilla 3.18.0/1 as well as 3.14.27 first.

>> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
>> the client to vanilla 3.18.1 or 3.14.27 has not helped..
>
> I'm a little unclear: when you said "All this is on freshly baked
> 3.18.1", are you describing the client, or the server, or both?

That was on both. As I wrote in the followups I've now also tried to
first downgrade the clients (didn't help) and then finally found that
3.14.27 (both with and without my patches) on the server repeatably
works, regardless of client. Right now I have 3.18.1 as clients and
3.14.27 on the server, and that works fine.

I never noticed any other problems when first testing 3.18, which is why I
switched over all machines; it has been working really well so far.
No other networking problems, and I use NFS all day long. If there really
was NIC packet corruption, NFS dropped requests or general page cache
borkage then I think I would have noticed something much earlier.

Maybe you can try to reproduce? Try
git clone https://github.com/hhoffstaette/kernel-patches
and rewind to rev e7b720ef, after which I first noticed the problem.
Then look at the 3.14 directory over NFS.

Let me know if there is anything else I can try!

regards,
Holger


2014-12-18 16:30:26

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 03:35:45PM +0000, Holger Hoffstätte wrote:
> On Thu, 18 Dec 2014 09:48:56 -0500, J. Bruce Fields wrote:
>
> > On a quick skim, the server's READDIR responses look correct. The entry
> > btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> > is returned in frame 53 (with complete reassembled reply displayed by
> > wireshark in frame 63).
> >
> > You could double-check for me--just run "wireshark nfs-server.pcap",
> > look for packets labeled "Reply ... READDIR", and expand out the READDIR
> > op and directory listing. I don't see anything obviously wrong.
>
> That's what I can see in Wireshark as well (#53 as part of the "20
> reassembled segments"). As I said in my followup I don't think there is
> anything wrong with that particular file since removing others "fixed"
> the problem. That's why I suspected NIC/TCP buggery, and since my kernels
> usually have a bunch of patches (the ones in that repo) I wanted to try
> vanilla 3.18.0/1 as well as -3.14.27 first.
>
> >> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
> >> the client to vanilla 3.18.1 or 3.14.27 has not helped..
> >
> > I'm a little unclear: when you said "All this is on freshly baked
> > 3.18.1", are you describing the client, or the server, or both?
>
> That was on both. As I wrote in the followups I've now also tried to
> first downgrade the clients (didn't help) and then finally found that
> 3.14.27 (both with and without my patches) on the server repeatably
> works, regardless of client. Right now I have 3.18.1 as clients and
> 3.14.27 on the server, and that works fine.

The way that the server encodes readdir replies changed in 3.16, so I'd
guess 3.15 will also work for you and 3.16 won't.
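
If you want to see what changed there, something like the following should
list the candidate commits (the paths are just my guess at where to look):

	git log --oneline v3.15..v3.16 -- net/sunrpc/xdr.c fs/nfsd/nfs4xdr.c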

--b.

(PS You or your mail client seem to be trimming my address from to: and
cc: lines, I'd rather you didn't.)

2015-01-07 00:26:42

by Holger Hoffstätte

Subject: Re: 3.18.1: broken directory with one file too many

On Sat, 20 Dec 2014 13:02:43 -0500, J. Bruce Fields wrote:

> On Thu, Dec 18, 2014 at 08:44:37PM +0100, Holger Hoffstätte wrote:
>> On 12/18/14 18:06, J. Bruce Fields wrote:
>> > Whoops, now I see, the server-side trace has the same problem, I just
>> > overlooked it the first time.
>>
>> Excellent, so we know it's the server's fault. Really would have been
>> odd to not have it in the server trace.
>>
>> >> ..in order to rule out a mistake on my part with the two separate
>> >> runs (which prevents correlated analysis) I was just about to boot
>> >> the server back into 3.18.1 and re-run both client- and server-side
>> >> traces simultaneously. However I have to head out for a bit first;
>> >> will post that later today.
>> >
>> > So this might still be interesting, but it's not a high priority.
>>
>> Then I guess I'll better keep my feet still and don't muddle the waters
>> further, looks like you found what you need. If you still need it just
>> holler.
>>
>> Let me know if there's anything I can do to help/patch/test!
>
> Gah. Does this fix it?
>
> A struct xdr_stream at a page boundary might point to the end of one
> page or the beginning of the next, and I'm guessing xdr_truncate_encode
> wasn't prepared to handle the former.
>
> This happens if the readdir entry that would have exceeded the client's
> dircount/maxcount limit would have ended exactly on a 4k page boundary,
> and inspection of the trace shows you're hitting exactly that case.
>
> If this does the job then I'll go figure out how to make this logic less
> ugly....
>
> --b.
>
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 1cb61242e55e..32910b91d17c 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -630,6 +630,9 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
> 
>  	new = buf->page_base + buf->page_len;
>  	old = new + fraglen;
> +	/* XXX: HACK: */
> +	if (xdr->p == page_address(*xdr->page_ptr) + PAGE_SIZE)
> +		xdr->page_ptr++;
>  	xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);
> 
>  	if (buf->page_len) {

Any news on getting this upstream and into -stables? I ack'ed it on Dec.
20 and have been running it 24/7 since then with no problems.

Just making sure it doesn't disappear into the post-holiday couch
crack.. :)

thanks,
Holger


2015-01-07 18:21:42

by J. Bruce Fields

Subject: Re: 3.18.1: broken directory with one file too many

On Wed, Jan 07, 2015 at 12:25:06AM +0000, Holger Hoffstätte wrote:
> On Sat, 20 Dec 2014 13:02:43 -0500, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 08:44:37PM +0100, Holger Hoffstätte wrote:
> >> On 12/18/14 18:06, J. Bruce Fields wrote:
> >> > Whoops, now I see, the server-side trace has the same problem, I just
> >> > overlooked it the first time.
> >>
> >> Excellent, so we know it's the server's fault. Really would have been
> >> odd to not have it in the server trace.
> >>
> >> >> ..in order to rule out a mistake on my part with the two separate
> >> >> runs (which prevents correlated analysis) I was just about to boot
> >> >> the server back into 3.18.1 and re-run both client- and server-side
> >> >> traces simultaneously. However I have to head out for a bit first;
> >> >> will post that later today.
> >> >
> >> > So this might still be interesting, but it's not a high priority.
> >>
> >> Then I guess I'll better keep my feet still and don't muddle the waters
> >> further, looks like you found what you need. If you still need it just
> >> holler.
> >>
> >> Let me know if there's anything I can do to help/patch/test!
> >
> > Gah. Does this fix it?
> >
> > A struct xdr_stream at a page boundary might point to the end of one
> > page or the beginning of the next, and I'm guessing xdr_truncate_encode
> > wasn't prepared to handle the former.
> >
> > This happens if the readdir entry that would have exceeded the client's
> > dircount/maxcount limit would have ended exactly on a 4k page boundary,
> > and inspection of the trace shows you're hitting exactly that case.
> >
> > If this does the job then I'll go figure out how to make this logic less
> > ugly....
> >
> > --b.
> >
> > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> > index 1cb61242e55e..32910b91d17c 100644
> > --- a/net/sunrpc/xdr.c
> > +++ b/net/sunrpc/xdr.c
> > @@ -630,6 +630,9 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
> > 
> >  	new = buf->page_base + buf->page_len;
> >  	old = new + fraglen;
> > +	/* XXX: HACK: */
> > +	if (xdr->p == page_address(*xdr->page_ptr) + PAGE_SIZE)
> > +		xdr->page_ptr++;
> >  	xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);
> > 
> >  	if (buf->page_len) {
>
> Any news on getting this upstream and into -stables? I ack'ed it on Dec.
> 20 and have been running it 24/7 since then with no problems.
>
> Just making sure it doesn't disappear into the post-holiday couch
> crack.. :)

Yeah, understood, thanks for the reminder. I took another look and
found a simpler fix and wanted to reproduce. Unfortunately the
reproducer depends on a lot of details (your git directory won't
necessarily work on somebody else's setup) so all I found was a brute
force attempt to create directories with lots of various-sized entries
(excuse the sloppy shell):

j=74
k=100
while true; do
	ssh root@f20-3 '
		cd /exports/xfs/DIR
		rm -f 00*
		for ((i=0; i<'$k'; i++)); do
			touch $(printf "%0'$j'.0f" $i); done
		for ((i='$k'; i<100; i++)); do
			touch $(printf "%0'$((j+1))'.0f" $i); done
	'
	if ! diff <(ssh root@f20-3 ls /exports/xfs/DIR/) <(ls DIR); then
		echo "diff found at $j, $k"
		break
	fi
	echo $j, $((k--))
	if [ $k -eq 0 ]; then
		echo ---
		echo $((j+=4)), $((k=100))
	fi
done

Now that I think of it there might be other operations (read, readlink)
that could trigger the same problem more easily.

But anyway that reproducer worked and confirmed the following which I'll
probably pass along upstream today or tomorrow....

--b.

commit 9f5b5c33b2ee
Author: J. Bruce Fields <[email protected]>
Date: Mon Dec 22 16:14:51 2014 -0500

rpc: fix xdr_truncate_encode to handle buffer ending on page boundary

A struct xdr_stream at a page boundary might point to the end of one
page or the beginning of the next, but xdr_truncate_encode isn't
prepared to handle the former.

This can cause corruption of NFSv4 READDIR replies in the case that a
readdir entry that would have exceeded the client's dircount/maxcount
limit would have ended exactly on a 4k page boundary. You're more
likely to hit this case on large directories.

Other xdr_truncate_encode callers are probably also affected.

Reported-by: Holger Hoffstätte <[email protected]>
Tested-by: Holger Hoffstätte <[email protected]>
Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode"
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 1cb61242e55e..862307ac341f 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -629,8 +629,8 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)
 	buf->len -= fraglen;
 
 	new = buf->page_base + buf->page_len;
-	old = new + fraglen;
-	xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);
+
+	xdr->page_ptr = buf->pages + (new >> PAGE_SHIFT);
 
 	if (buf->page_len) {
 		xdr->p = page_address(*xdr->page_ptr);
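
To spell out the arithmetic on made-up numbers (4k pages assumed; a
stand-alone illustration, not kernel code):

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

int main(void)
{
	/* A pre-truncation end (old) that falls exactly on a page boundary. */
	unsigned long fraglen = 300;
	unsigned long old = 2 * PAGE_SIZE;   /* where encoding currently stands */
	unsigned long new = old - fraglen;   /* where it should stand afterwards */

	/*
	 * Old code: step page_ptr back by the number of page boundaries
	 * between old and new.  With old on a boundary, xdr->p (and hence
	 * page_ptr) may still refer to the end of the previous page, so
	 * stepping back by that difference lands one page too low.
	 */
	long relative = (long)(old >> PAGE_SHIFT) - (long)(new >> PAGE_SHIFT);

	/* New code: recompute the page index directly from new. */
	unsigned long absolute = new >> PAGE_SHIFT;

	printf("relative step back: %ld, absolute page index: %lu\n",
	       relative, absolute);
	return 0;
}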

2015-01-07 20:06:24

by J. Bruce Fields

Subject: [PATCH] nfsd4: tweak rd_dircount accounting

From: "J. Bruce Fields" <[email protected]>

RFC 3530 14.2.24 says

This value represents the length of the names of the directory
entries and the cookie value for these entries. This length
represents the XDR encoding of the data (names and cookies)...

The "xdr encoding" of the name should probably include the 4 bytes for
the length.

But this is all just a hint so not worth e.g. backporting to stable.

Also reshuffle some lines to more clearly group together the
dircount-related code.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4xdr.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

On Wed, Jan 07, 2015 at 01:21:41PM -0500, bfields wrote:
> But anyway that reproducer worked and confirmed the following which I'll
> probably pass along upstream today or tomorrow....

I also noticed this nit while I was there. I'm planning to queue this
up for 3.20.

--b.

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 15f7b73e0c0f..91f7a3644ffb 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2768,16 +2768,17 @@ nfsd4_encode_dirent(void *ccdv, const char *name, int namlen,
 	if (entry_bytes > cd->rd_maxcount)
 		goto fail;
 	cd->rd_maxcount -= entry_bytes;
-	if (!cd->rd_dircount)
-		goto fail;
 	/*
 	 * RFC 3530 14.2.24 describes rd_dircount as only a "hint", so
 	 * let's always let through the first entry, at least:
 	 */
-	name_and_cookie = 4 * XDR_QUADLEN(namlen) + 8;
+	if (!cd->rd_dircount)
+		goto fail;
+	name_and_cookie = 4 + 4 * XDR_QUADLEN(namlen) + 8;
 	if (name_and_cookie > cd->rd_dircount && cd->cookie_offset)
 		goto fail;
 	cd->rd_dircount -= min(cd->rd_dircount, name_and_cookie);
+
 	cd->cookie_offset = cookie_offset;
 skip_entry:
 	cd->common.err = nfs_ok;
--
1.9.3