2019-07-10 10:15:20

by John Dorminy

Subject: Request for help debugging readdirplus malfunction on NFS v3

Greetings;

In the lab for the group I'm in, we have three NFS servers, each
serving different parts of our shared filesystem. However, as of
kernel 5.1 or so on the clients, one mountpoint (the only one hosted
on its server) has stopped working: an 'ls' on any directory within
it fails to show any files.

The mount is:
nfs-02:/nbu1 on /p/not-backed-up type nfs
(rw,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.19.119.4,mountvers=3,mountport=4048,mountproto=udp,local_lock=none,addr=10.19.119.4)

Bisection on the client side indicates
be4c2d4723a4a637f0d1b4f7c66447141a4b3564 is the commit at which this
mountpoint ceases to work. `rpcdebug -m nfs -s all` results in the
following going to dmesg:
[10146.723030] NFS call readdirplus 162
[10146.724908] NFS reply readdirplus: -2
[10146.725429] NFS: readdir(/) returns -2

I'm somewhat out of ideas. Are there other tools I should be using to
hunt this down, short of adding print statements? And is this a known
bug already?

Thanks in advance!

John Dorminy
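
Note on the trace above: the kernel's NFS debug output generally reports
status as a negative errno value, so the -2 in the two readdirplus lines
most likely corresponds to ENOENT. The following is a minimal sketch for
decoding such lines, not part of the original report. It assumes the
dmesg lines look exactly like the three quoted above and that a status
is a negative integer at the end of the line; the helper name
decode_status is hypothetical.

```python
import errno
import os
import re

# Sample lines copied from the rpcdebug output quoted in the message above.
DMESG_LINES = [
    "[10146.723030] NFS call readdirplus 162",
    "[10146.724908] NFS reply readdirplus: -2",
    "[10146.725429] NFS: readdir(/) returns -2",
]

# Assumption: a status appears as a negative integer at the end of the line.
STATUS_RE = re.compile(r"(-\d+)\s*$")


def decode_status(line):
    """Return (code, errno name, message) for a trailing negative status, else None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = -int(match.group(1))
    return code, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


decoded = [decode_status(line) for line in DMESG_LINES]
for line, result in zip(DMESG_LINES, decoded):
    if result is not None:
        code, name, message = result
        print(f"{line}\n    -> -{code} = {name} ({message})")
```

The "NFS call" line carries no status and is skipped; only the reply and
return lines are decoded.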


2019-07-10 11:21:39

by Mkrtchyan, Tigran

Subject: Re: Request for help debugging readdirplus malfunction on NFS v3

Hi Dorminy,

There are fixes from Trond that possibly address your issue. Have you
tried them?

https://www.spinics.net/lists/linux-nfs/msg73754.html

Regards,
Tigran.


2019-07-10 12:06:02

by John Dorminy

Subject: Re: Request for help debugging readdirplus malfunction on NFS v3

Ah! I apologize; I've been working off of gregkh/staging.git and
haven't tried those very promising-looking patches. I'll give them a
shot and report back. Thanks!
