2014-12-16 22:19:38

by Holger Hoffstätte

[permalink] [raw]
Subject: 3.18.1: broken directory with one file too many

g
(please CC: for followups)

I just spent two hours trying to untangle a *weird* bug that I have not
seen before. It might be new to 3.18.x but I don't know for sure.
Apologies in advance for the long prelude but I figured I need to
describe the problem scenario as precisely as possible.

All this is on freshly baked 3.18.1 with Gentoo userland; the exported
filesystem is ext4.

On my NFS server I work with a git repo:

holger>git clone ../work/kernel-patches.git
Cloning into 'kernel-patches'...
done.
holger>cd kernel-patches
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
holger>ll
total 92K
drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>

Looking fine!

On my NFS client this directory is automounted via NFS:

holger>mount | grep home
tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)

This has worked for ages and never caused any problems.

Let's see how my git repo is doing:

holger>cd /mnt/tux/holger/Projects/kernel-patches
holger>ll
total 92K
drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Untracked files:
(use "git add <file>..." to include in what will be committed)

3.14/btrfs-20

nothing added to commit but untracked files present (use "git add" to track)

..wait, what? There is no such file "btrfs-20" !

holger>ll 3.14 | head
ls: cannot access 3.14/btrfs-20: No such file or directory
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-????????? ? ? ? ? ? btrfs-20
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
holger>

There is a "rogue" file messing up the directory?!

This used to work until I added a specific file, so..

holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
Size: 2306 Blocks: 8 IO Block: 1048576 regular file
Device: 18h/24d Inode: 22544856 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ holger) Gid: ( 100/ users)
Access: 2014-12-16 22:41:36.515665610 +0100
Modify: 2014-12-16 22:41:36.515665610 +0100
Change: 2014-12-16 22:41:36.515665610 +0100
Birth: -

Looks fine..maybe try moving it to the parent?

holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
holger>ll
total 96K
drwxr-xr-x 2 holger users 72K Dec 16 22:44 3.14/
drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

deleted: 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

Untracked files:
(use "git add <file>..." to include in what will be committed)

btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

no changes added to commit (use "git add" and/or "git commit -a")

holger>ll 3.14 | head
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch

I can move it back into 3.14/ and the directory is messed up again.
All this is reproducible, in different export directories.

Any ideas what this might be? A direntry hash collision maybe?
There is a large number of files starting with btrfs-2014xxyy-.. but with
the typical kernel patch names (some quite long), so that would be pretty
bad. Also everything works locally on ext4 without problems, so I suspect
it's an isolated NFS problem.

Things that did not work - same results every time:
- dropping caches on client & server
- restarting NFS on client & server

Any suggestions welcome; I'll gladly test patches.

thanks
Holger


2014-12-17 21:22:03

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> g
> (please CC: for followups)
>
> I just spent two hours trying to untangle a *weird* bug that I have not
> seen before. It might be new to 3.18.x but I don't know for sure.
> Apologies in advance for the long prelude but I figured I need to
> describe the problem scenario as precisely as possible.
>
> All this is on freshly baked 3.18.1 with Gentoo userland; the exported
> filesystem is ext4.
>
> On my NFS server I work with a git repo:
>
> holger>git clone ../work/kernel-patches.git
> Cloning into 'kernel-patches'...
> done.
> holger>cd kernel-patches
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> nothing to commit, working directory clean
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>
>
> Looking fine!
>
> On my NFS client this directory is automounted via NFS:
>
> holger>mount | grep home
> tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)
>
> This has worked for ages and never caused any problems.
>
> Let's see how my git repo is doing:
>
> holger>cd /mnt/tux/holger/Projects/kernel-patches
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users 72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
>
> 3.14/btrfs-20
>
> nothing added to commit but untracked files present (use "git add" to track)
>
> ..wait, what? There is no such file "btrfs-20" !
>
> holger>ll 3.14 | head
> ls: cannot access 3.14/btrfs-20: No such file or directory
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -????????? ? ? ? ? ? btrfs-20
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> holger>
>
> There is a "rogue" file messing up the directory?!
>
> This used to work until I added a specific file, so..
>
> holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
> Size: 2306 Blocks: 8 IO Block: 1048576 regular file
> Device: 18h/24d Inode: 22544856 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 1000/ holger) Gid: ( 100/ users)
> Access: 2014-12-16 22:41:36.515665610 +0100
> Modify: 2014-12-16 22:41:36.515665610 +0100
> Change: 2014-12-16 22:41:36.515665610 +0100
> Birth: -
>
> Looks fine..maybe try moving it to the parent?
>
> holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
> holger>ll
> total 96K
> drwxr-xr-x 2 holger users 72K Dec 16 22:44 3.14/
> drwxr-xr-x 2 holger users 16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Changes not staged for commit:
> (use "git add/rm <file>..." to update what will be committed)
> (use "git checkout -- <file>..." to discard changes in working directory)
>
> deleted: 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
>
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>
> no changes added to commit (use "git add" and/or "git commit -a")
>
> holger>ll 3.14 | head
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users 41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users 13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch
>
> I can move it back into 3.14/ and the directory is messed up again.
> All this is reproducible, in different export directories.
>
> Any ideas what this might be? A direntry hash collision maybe?
> There is a large number of files starting with btrfs-2014xxyy-.. but with
> the typical kernel patch names (some quite long), so that would be pretty
> bad. Also everything works locally on ext4 without problems, so I suspect
> it's an isolated NFS problem.

That doesn't sound familiar. A network trace showing the READDIR would
be really useful. Since this is so reproducible, I think that should be
possible. So do something like:

move the problem file into 3.14/
tcpdump -s0 -wtmp.pcap -i<relevant interface>
ls the directory on the client.
kill tcpdump
send us tmp.pcap and/or take a look at it with wireshark and see
what the READDIR response looks like.

--b.

2014-12-18 12:31:19

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On 12/17/14 22:22, J. Bruce Fields wrote:
> On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
>> (..oddly broken directory over NFS..)
> That doesn't sound familiar. A network trace showing the READDIR would
> be really useful. Since this is so reproducible, I think that should be
> possible. So do something like:
>
> move the problem file into 3.14/
> tcpdump -s0 -wtmp.pcap -i<relevant interface>
> ls the directory on the client.
> kill tcpdump
> send us tmp.pcap and/or take a look at it with wireshark and see
> what the READDIR response looks like.

Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.

I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.

Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..

Thanks,
Holger

2014-12-18 14:48:59

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> On 12/17/14 22:22, J. Bruce Fields wrote:
> > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> >> (..oddly broken directory over NFS..)
> > That doesn't sound familiar. A network trace showing the READDIR would
> > be really useful. Since this is so reproducible, I think that should be
> > possible. So do something like:
> >
> > move the problem file into 3.14/
> > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > ls the directory on the client.
> > kill tcpdump
> > send us tmp.pcap and/or take a look at it with wireshark and see
> > what the READDIR response looks like.
>
> Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
>
> I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.

On a quick skim, the server's READDIR responses look correct. The entry
btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
is returned in frame 53 (with complete reassembled reply displayed by
wireshark in frame 63).

You could double-check for me--just run "wireshark nfs-server.pcap",
look for packets labeled "Reply ... READDIR", and expand out the READDIR
op and directory listing. I don't see anything obviously wrong.

It's interesting that there's only one LOOKUP in the trace, for btrfs-20
(returning, not suprisingly, NFS4ERR_NOENT). If the client failed to
parse that entry for some reason, then maybe in addition to getting the
filename wrong it also failed to get the attributes, triggering the
extra lookup/getattr.

> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..

I'm a little unclear: when you said "All this is on freshly baked
3.18.1", are you describing the client, or the server, or both?

--b.

2014-12-18 14:58:13

by Benjamin Coddington

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

Frame 36 of nfs-client.pcap has this interesting string:

0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.

...

Ben

On Thu, 18 Dec 2014, J. Bruce Fields wrote:

> On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> > On 12/17/14 22:22, J. Bruce Fields wrote:
> > > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> > >> (..oddly broken directory over NFS..)
> > > That doesn't sound familiar. A network trace showing the READDIR would
> > > be really useful. Since this is so reproducible, I think that should be
> > > possible. So do something like:
> > >
> > > move the problem file into 3.14/
> > > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > > ls the directory on the client.
> > > kill tcpdump
> > > send us tmp.pcap and/or take a look at it with wireshark and see
> > > what the READDIR response looks like.
> >
> > Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
> >
> > I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.
>
> On a quick skim, the server's READDIR responses look correct. The entry
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> is returned in frame 53 (with complete reassembled reply displayed by
> wireshark in frame 63).
>
> You could double-check for me--just run "wireshark nfs-server.pcap",
> look for packets labeled "Reply ... READDIR", and expand out the READDIR
> op and directory listing. I don't see anything obviously wrong.
>
> It's interesting that there's only one LOOKUP in the trace, for btrfs-20
> (returning, not suprisingly, NFS4ERR_NOENT). If the client failed to
> parse that entry for some reason, then maybe in addition to getting the
> filename wrong it also failed to get the attributes, triggering the
> extra lookup/getattr.
>
> > Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..
>
> I'm a little unclear: when you said "All this is on freshly baked
> 3.18.1", are you describing the client, or the server, or both?
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2014-12-18 15:19:17

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> Frame 36 of nfs-client.pcap has this interesting string:
>
> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.

Yes, that looks like the server messing up the encoding of the reply.

Holger, what's the difference between nfs-client.pcap and
nfs-server.pcap?

--b.

>
> ...
>
> Ben
>
> On Thu, 18 Dec 2014, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 01:22:40PM +0100, Holger Hoffstätte wrote:
> > > On 12/17/14 22:22, J. Bruce Fields wrote:
> > > > On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> > > >> (..oddly broken directory over NFS..)
> > > > That doesn't sound familiar. A network trace showing the READDIR would
> > > > be really useful. Since this is so reproducible, I think that should be
> > > > possible. So do something like:
> > > >
> > > > move the problem file into 3.14/
> > > > tcpdump -s0 -wtmp.pcap -i<relevant interface>
> > > > ls the directory on the client.
> > > > kill tcpdump
> > > > send us tmp.pcap and/or take a look at it with wireshark and see
> > > > what the READDIR response looks like.
> > >
> > > Thanks for your reply. I forgot to mention that removing other files seems to "fix" the problem, so it does not seem to be spefically the new file itself that is the cause.
> > >
> > > I captured the "ls 3.14 | head" sequence on both the client and the server, and put the tcpudmp files here: http://hoho.duckdns.org/linux/ - let me know if that helped.
> >
> > On a quick skim, the server's READDIR responses look correct. The entry
> > btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> > is returned in frame 53 (with complete reassembled reply displayed by
> > wireshark in frame 63).
> >
> > You could double-check for me--just run "wireshark nfs-server.pcap",
> > look for packets labeled "Reply ... READDIR", and expand out the READDIR
> > op and directory listing. I don't see anything obviously wrong.
> >
> > It's interesting that there's only one LOOKUP in the trace, for btrfs-20
> > (returning, not suprisingly, NFS4ERR_NOENT). If the client failed to
> > parse that entry for some reason, then maybe in addition to getting the
> > filename wrong it also failed to get the attributes, triggering the
> > extra lookup/getattr.
> >
> > > Meanwhile I'll try older/plain (unpatched) kernels. So far reverting the client to vanilla 3.18.1 or 3.14.27 has not helped..
> >
> > I'm a little unclear: when you said "All this is on freshly baked
> > 3.18.1", are you describing the client, or the server, or both?
> >
> > --b.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >

2014-12-18 16:30:28

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 03:35:45PM +0000, Holger Hoffstätte wrote:
> On Thu, 18 Dec 2014 09:48:56 -0500, J. Bruce Fields wrote:
>
> > On a quick skim, the server's READDIR responses look correct. The entry
> > btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> > is returned in frame 53 (with complete reassembled reply displayed by
> > wireshark in frame 63).
> >
> > You could double-check for me--just run "wireshark nfs-server.pcap",
> > look for packets labeled "Reply ... READDIR", and expand out the READDIR
> > op and directory listing. I don't see anything obviously wrong.
>
> That's what I can see in Wireshark as well (#53 as part of the "20
> reassembled segments"). As I said in my followup I don't think there is
> anything wrong with that particular file since removing others "fixed"
> the problem. That's why I suspected NIC/TCP buggery, and since my kernels
> usually have a bunch of patches (the ones in that repo) I wanted to try
> vanilla 3.18.0/1 as well as -3.14.27 first.
>
> >> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
> >> the client to vanilla 3.18.1 or 3.14.27 has not helped..
> >
> > I'm a little unclear: when you said "All this is on freshly baked
> > 3.18.1", are you describing the client, or the server, or both?
>
> That was on both. As I wrote in the followups I've now also tried to
> first downgrade the clients (didn't help) and then finally found that
> 3.14.27 (both with and without my patches) on the server repeatably
> works, regardless of client. Right now I have 3.18.1 as clients and
> 3.14.27 on the server, and that works fine.

The way that the server encodes readdir replies changed in 3.16, so I'd
guess 3.15 will also work for you and 3.16 won't.

--b.

(PS You or your mail client seem to be trimming my address from to: and
cc: lines, I'd rather you didn't.)

2014-12-18 16:32:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 03:42:53PM +0000, Holger Hoffstätte wrote:
> On Thu, 18 Dec 2014 10:19:14 -0500, J. Bruce Fields wrote:
>
> > On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> >> Frame 36 of nfs-client.pcap has this interesting string:
> >>
> >> 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74
> >> ..;...&..|...Abt 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36
> >> rfs-20........06 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68
> >> -fix-deadlock-wh 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64
> >> 65 en-mounting-a-de 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63
> >> 68 00 graded-fs.patch.
> >
> > Yes, that looks like the server messing up the encoding of the reply.
> >
> > Holger, what's the difference between nfs-client.pcap and
> > nfs-server.pcap?
>
> One is the client, one is the server :-)
>
> No, really. Both 3.18.1, 64it, same userland/compiler/etc. and even same
> CFLAGS for both kernel and userland. I built tcpdump on both (from
> packages) and ran them back to back literally 10 seconds apart.

So this is two separate tests, both failed, the only difference is that
the tcpdump was run on the client in one case and the server in the
other?

> What "encoding" are we talking about here? The NFS RPC encoding?
> (everything I know & remember about NFS is 15+ years old..)

Right, I suspect there's a bug in the server's encoding of the reply to
the READDIR rpc. Odd that the encoding would look correct from the
point of view of tcpdump on the server but not from the point of view of
tcpdump on the client.

--b.

2014-12-18 16:42:42

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On 12/18/14 17:32, J. Bruce Fields wrote:
> So this is two separate tests, both failed, the only difference is that
> the tcpdump was run on the client in one case and the server in the
> other?

Exactly: client first, then I realized you'd want to look at the server-side too.
That being said..

> Right, I suspect there's a bug in the server's encoding of the reply to
> the READDIR rpc. Odd that the encoding would look correct from the
> point of view of tcpdump on the server but not from the point of view of
> tcpdump on the client.

..in order to rule out a mistake on my part with the two separate runs (which prevents correlated analysis) I was just about to boot the server back into 3.18.1 and re-run both client- and server-side traces simultaneously. However I have to head out for a bit first; will post that later today.

Thanks so far!

Holger

PS: sorry for the mangled replies, that seems to be Gmane. Using email only now.

2014-12-18 17:06:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 05:42:36PM +0100, Holger Hoffstätte wrote:
> On 12/18/14 17:32, J. Bruce Fields wrote:
> > So this is two separate tests, both failed, the only difference is that
> > the tcpdump was run on the client in one case and the server in the
> > other?
>
> Exactly: client first, then I realized you'd want to look at the server-side too.
> That being said..

Whoops, now I see, the server-side trace has the same problem, I just
overlooked it the first time.

> > Right, I suspect there's a bug in the server's encoding of the reply to
> > the READDIR rpc. Odd that the encoding would look correct from the
> > point of view of tcpdump on the server but not from the point of view of
> > tcpdump on the client.
>
> ..in order to rule out a mistake on my part with the two separate runs (which prevents correlated analysis) I was just about to boot the server back into 3.18.1 and re-run both client- and server-side traces simultaneously. However I have to head out for a bit first; will post that later today.

So this might still be interesting, but it's not a high priority.

--b.

2014-12-18 17:18:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 10:19:14AM -0500, J. Bruce Fields wrote:
> On Thu, Dec 18, 2014 at 09:58:06AM -0500, Benjamin Coddington wrote:
> > Frame 36 of nfs-client.pcap has this interesting string:
> >
> > 0ff0 00 01 3b f6 fb b6 26 16 8f 7c 00 00 00 41 62 74 ..;...&..|...Abt
> > 1000 72 66 73 2d 32 30 00 00 00 00 00 00 00 00 30 36 rfs-20........06
> > 1010 2d 66 69 78 2d 64 65 61 64 6c 6f 63 6b 2d 77 68 -fix-deadlock-wh
> > 1020 65 6e 2d 6d 6f 75 6e 74 69 6e 67 2d 61 2d 64 65 en-mounting-a-de
> > 1030 67 72 61 64 65 64 2d 66 73 2e 70 61 74 63 68 00 graded-fs.patch.
>
> Yes, that looks like the server messing up the encoding of the reply.

And indeed looking at the git repo that matches up with the entry:

btrfs-20........06-fix-deadlock-when-mounting-a-degraded-fs.patch.
btrfs-20140619-006-fix-deadlock-when-mounting-a-degraded-fs.patch

So a stray 8 bytes of zeroes got written there.

And the reply also ends with 000000017fffffff when I believe it should
end with two 4-byte 0's (no value follows, not EOF).

So I think those 0's got written to the wrong offset. That should be
enough information to find the bug....

--b.

2014-12-18 19:44:43

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On 12/18/14 18:06, J. Bruce Fields wrote:
> Whoops, now I see, the server-side trace has the same problem, I
> just overlooked it the first time.

Excellent, so we know it's the server's fault. Really would have been odd to not have it in the server trace.

>> ..in order to rule out a mistake on my part with the two separate
>> runs (which prevents correlated analysis) I was just about to boot
>> the server back into 3.18.1 and re-run both client- and server-side
>> traces simultaneously. However I have to head out for a bit first;
>> will post that later today.
>
> So this might still be interesting, but it's not a high priority.

Then I guess I'll better keep my feet still and don't muddle the waters further, looks like you found what you need. If you still need it just holler.

Let me know if there's anything I can do to help/patch/test!

thanks
Holger

2014-12-20 18:02:47

by J. Bruce Fields

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On Thu, Dec 18, 2014 at 08:44:37PM +0100, Holger Hoffstätte wrote:
> On 12/18/14 18:06, J. Bruce Fields wrote:
> > Whoops, now I see, the server-side trace has the same problem, I
> > just overlooked it the first time.
>
> Excellent, so we know it's the server's fault. Really would have been odd to not have it in the server trace.
>
> >> ..in order to rule out a mistake on my part with the two separate
> >> runs (which prevents correlated analysis) I was just about to boot
> >> the server back into 3.18.1 and re-run both client- and server-side
> >> traces simultaneously. However I have to head out for a bit first;
> >> will post that later today.
> >
> > So this might still be interesting, but it's not a high priority.
>
> Then I guess I'll better keep my feet still and don't muddle the waters further, looks like you found what you need. If you still need it just holler.
>
> Let me know if there's anything I can do to help/patch/test!

Gah. Does this fix it?

A struct xdr_stream at a page boundary might point to the end of one
page or the beginning of the next, and I'm guessing xdr_truncate_encode
wasn't prepared to handle the former.

This happens if the readdir entry that would have exceeded the client's
dircount/maxcount limit would have ended exactly on a 4k page boundary,
and inspection of the trace shows you're hitting exactly that case.

If this does the job then I'll go figure out how to make this logic less
ugly....

--b.

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 1cb61242e55e..32910b91d17c 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -630,6 +630,9 @@ void xdr_truncate_encode(struct xdr_stream *xdr, size_t len)

new = buf->page_base + buf->page_len;
old = new + fraglen;
+ /* XXX: HACK: */
+ if (xdr->p == page_address(*xdr->page_ptr) + PAGE_SIZE)
+ xdr->page_ptr++;
xdr->page_ptr -= (old >> PAGE_SHIFT) - (new >> PAGE_SHIFT);

if (buf->page_len) {

2014-12-20 18:50:33

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 3.18.1: broken directory with one file too many

On 12/20/14 19:02, J. Bruce Fields wrote:
> Gah. Does this fix it?

It does! Well done. :)

Reported-by: Holger Hoffstätte <[email protected]>
Tested-by: Holger Hoffstätte <[email protected]>

Thanks!
Holger