2021-04-13 19:04:17

by Charles Hedrick

Subject: safe versions of NFS

I am in charge of the computing infrastructure for a large computer science department. We have a variety of student and development users, so if there are problems, we’ll see them.

We use an Ubuntu 20 server, with NVMe storage.

I’ve just had to move our CentOS 7 and Ubuntu 18 clients to NFS 4.0. We had hangs with NFS 4.1 and 4.2: files would appear to be locked, although eventually the lock would time out. It’s too soon to be sure that moving back to NFS 4.0 will fix it. The next step is either NFS 3 or disabling delegations on the server.

Are there known versions of NFS that are safe to use in production for various kernel versions? The one we’re most interested in is Ubuntu 20, which can run any kernel from 5.4 to 5.8.



2021-04-13 19:10:46

by Patrick Goetz

Subject: Re: safe versions of NFS

I use NFS 4.2 with Ubuntu 18/20 workstations and Ubuntu 18/20 servers
and haven't had any problems.

Check your configuration files; the last time I experienced something
like this, it was because I had inadvertently used the same fsid on two
different exports. I also recommend exporting top-level directories only:
bind mount everything you want to export into /srv/nfs and export only
those directories. According to Bruce F., this doesn't buy you any
security (I still don't understand why), but it makes for a cleaner
system configuration.
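
A quick way to rule out the duplicate-fsid case is to scan the exports file. The following is a minimal Python sketch, not an official tool: the function name and the sample exports are hypothetical, and it assumes one export per line (no backslash continuations or quoted paths).

```python
import re
from collections import defaultdict


def find_duplicate_fsids(exports_text):
    """Return {fsid: [export paths]} for fsid values used by more than one export."""
    seen = defaultdict(list)
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path = line.split()[0]
        # An export line may repeat fsid= once per client spec; count it once per path.
        for fsid in sorted(set(re.findall(r"fsid=([^,)\s]+)", line))):
            seen[fsid].append(path)
    return {fsid: paths for fsid, paths in seen.items() if len(paths) > 1}


# Hypothetical exports file, for illustration only.
SAMPLE = """\
# bind mounts collected under /srv/nfs
/srv/nfs        *(rw,fsid=0,crossmnt)
/srv/nfs/home   *(rw,fsid=1)
/srv/nfs/proj   *(rw,fsid=1)
"""

if __name__ == "__main__":
    print(find_duplicate_fsids(SAMPLE))
```

On a real server one would pass in the contents of /etc/exports (and of any files under /etc/exports.d) instead of SAMPLE.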


2021-04-13 20:59:46

by Charles Hedrick

Subject: Re: safe versions of NFS

Many, though not all, of the problems are “lock reclaim failed”.


2021-04-13 21:14:31

by Benjamin Coddington

Subject: Re: safe versions of NFS

It would be interesting to know why your clients are failing to reclaim
their locks. Something is misconfigured. What server are you using,
and is there anything fancy on the server-side (like HA)? Is it
possible that you have clients with the same nfs4_unique_id?

Ben


2021-04-13 21:31:47

by Benjamin Coddington

Subject: Re: safe versions of NFS

(resending this as it bounced off the list - I accidentally embedded
HTML)

Yes, if you're pretty sure your hostnames are all different, the client_ids
should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
proc) and see the client_id in the kernel log in lines that look like: "NFS
call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
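
Once the rpcdebug output has been collected from several v4.0 clients, comparing the IDs can be scripted. This is a minimal Python sketch, assuming the kernel log line has exactly the format quoted above; the function names, host names, and ID strings are made up for illustration and are not real client IDs.

```python
import re
from collections import defaultdict

# Matches the debug line format quoted above: NFS call setclientid auth=%s, '%s'
SETCLIENTID = re.compile(r"NFS call\s+setclientid auth=(\S+), '(.*)'")


def client_ids(log_text):
    """Return the client ID strings found in one host's kernel log text."""
    return [m.group(2) for m in SETCLIENTID.finditer(log_text)]


def shared_client_ids(logs_by_host):
    """Map each client ID seen on more than one host to the sorted list of hosts."""
    hosts = defaultdict(set)
    for host, text in logs_by_host.items():
        for cid in client_ids(text):
            hosts[cid].add(host)
    return {cid: sorted(h) for cid, h in hosts.items() if len(h) > 1}


if __name__ == "__main__":
    # Hypothetical log excerpts keyed by host name.
    logs = {
        "hostA": "kernel: NFS call setclientid auth=UNIX, 'example-id-1'\n",
        "hostB": "kernel: NFS call setclientid auth=UNIX, 'example-id-1'\n",
        "hostC": "kernel: NFS call setclientid auth=UNIX, 'example-id-2'\n",
    }
    print(shared_client_ids(logs))
```

An empty result only says that no collision showed up in the logs that were collected; it does not cover v4.1 or v4.2 mounts, which do not emit this line.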

You can extract it via the crash utility, or via systemtap, or by doing a
wire capture, but nothing that's easily translated to running across a large
number of machines. There's probably other ways, perhaps we should tack
that string into the tracepoints for exchange_id and setclientid.

If you're interested in troubleshooting, wire capture's usually the most
informative. If the lockup events all happen at the same time, there
might be some network event that is triggering the issue.
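
To check whether lockups cluster in time across clients, the timestamps of the reported events can be grouped. A rough Python sketch follows; the function name, the 5-second window, and the sample data are all assumptions chosen for illustration, and the timestamps are taken to be seconds on a clock shared by all hosts.

```python
def correlated_events(events_by_host, window=5.0, min_hosts=2):
    """Group events that start within `window` seconds of each other.

    events_by_host maps a host name to a list of event times in seconds.
    Returns the groups (lists of (time, host) pairs) that involve at least
    `min_hosts` distinct hosts.
    """
    flat = sorted((t, host) for host, times in events_by_host.items() for t in times)
    groups, current = [], []
    for t, host in flat:
        # Start a new group once we are past the window opened by the group's first event.
        if current and t - current[0][0] > window:
            groups.append(current)
            current = []
        current.append((t, host))
    if current:
        groups.append(current)
    return [g for g in groups if len({host for _, host in g}) >= min_hosts]


if __name__ == "__main__":
    # Hypothetical lockup times per client.
    sample = {"a": [100.0, 500.0], "b": [102.5], "c": [900.0]}
    print(correlated_events(sample))
```

Groups that span several clients would point toward a shared cause such as a network or server event; isolated events point back at the individual client.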

You should expect NFSv4.1 to be rock-solid. It's rare we have reports
that it isn't, and I'd love to know why you're having these problems.

Ben

On 13 Apr 2021, at 11:38, [email protected] wrote:

> The server is Ubuntu 20, with a ZFS file system.
>
> I don’t set the unique ID. Documentation claims that it is set from
> the hostname. They will surely be unique, or the whole world would
> blow up. How can I check the actual unique ID being used? The kernel
> reports a blank one, but I think that just means to use the hostname.
> We could obviously set a unique one if that would be useful.



2021-04-13 22:07:45

by Chuck Lever

Subject: Re: safe versions of NFS



> On Apr 13, 2021, at 12:23 PM, Benjamin Coddington <[email protected]> wrote:
>
> (resending this as it bounced off the list - I accidentally embedded HTML)
>
> Yes, if you're pretty sure your hostnames are all different, the client_ids
> should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
> proc) and see the client_id in the kernel log in lines that look like: "NFS
> call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
> doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
>
> You can extract it via the crash utility, or via systemtap, or by doing a
> wire capture, but nothing that's easily translated to running across a large
> number of machines. There's probably other ways, perhaps we should tack
> that string into the tracepoints for exchange_id and setclientid.
>
> If you're interested in troubleshooting, wire capture's usually the most
> informative. If the lockup events all happen at the same time, there
> might be some network event that is triggering the issue.
>
> You should expect NFSv4.1 to be rock-solid. It's rare we have reports
> that it isn't, and I'd love to know why you're having these problems.

I echo that: the NFSv4.1 protocol and implementation are mature, so if
there are operational problems, they should be root-caused.

NFSv4.1 uses a uniform client ID. That should be the "good" one,
not the NFSv4.0 one that has a non-zero probability of collision.

Charles, please let us know if there are particular workloads that
trigger the lock reclaim failure. A narrow reproducer would help
get to the root issue quickly.



--
Chuck Lever



2021-04-13 22:13:50

by Charles Hedrick

Subject: Re: safe versions of NFS

The two oddities I’ve seen are:
* fairly common mount failures with “not exported”, caused by sssd problems
* a major failure when I inadvertently reinstalled sssd on the server. That caused lots of mounts and authentication to fail. That was on Apr 2, though, and most problems have been in the last week.

We’ve been starting to move file systems from our NetApp to the Linux-based server. I note that NetApp defaults to delegations off with NFS 4.1. They almost certainly wouldn’t see these problems. It’s also interesting that there’s been enough history of problems that GitLab recommends turning delegations off on Linux NFS servers, or using 4.0. I’ve seen another big package that makes a similar recommendation.
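
On a Linux NFS server, delegations are built on file leases, so the commonly documented way to turn them off is to set fs.leases-enable to 0 before nfsd starts. The Python sketch below only reads the current setting; it assumes the standard /proc/sys/fs/leases-enable path, and the helper name is hypothetical.

```python
from pathlib import Path


def leases_enabled(path="/proc/sys/fs/leases-enable"):
    """Return True if file leases are enabled, False if disabled, None if unreadable."""
    try:
        return Path(path).read_text().strip() != "0"
    except OSError:
        # Not Linux, or the file is not readable in this environment.
        return None


if __name__ == "__main__":
    state = leases_enabled()
    if state is None:
        print("could not read the leases setting")
    elif state:
        print("leases enabled: the server may hand out delegations")
    else:
        print("leases disabled: the server should not hand out delegations")
```

Note that the value read here reflects the current sysctl state only; whether a running nfsd honors a change made after it started is something to verify on the server in question.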

As soon as we can verify that our applications work, we’re going to upgrade the server that has shown the most problems to Linux 5.4, to see if that helps. So far our Ubuntu 20 systems (with 5.8) have been OK, though they get fewer users. We’ll be moving everything to Ubuntu 20 this summer. While Ubuntu 20 server uses 5.4, I’m inclined to install it with 5.8, since that’s the combination we’ve tested most.


2021-04-13 22:46:14

by Charles Hedrick

Subject: Re: safe versions of NFS

One other specific:

I’ve been talking about Ubuntu 18 and 20, but our first issue was with CentOS 7. We have a reproducible problem with Thunderbird: it simply won’t run over NFS 4.1 on CentOS 7. The problem is with SQLite. It can be fixed by setting a special parameter in Thunderbird and also Firefox, but we changed the mount to 4.0 instead.
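
Since the failure is tied to SQLite, a small probe that does nothing but create and write a SQLite database on the NFS mount might serve as a narrower reproducer than Thunderbird itself. This is a sketch under that assumption: the function name is hypothetical, and it is not known whether this minimal workload triggers the same failure Thunderbird hits.

```python
import os
import sqlite3
import sys
import tempfile


def sqlite_write_probe(directory, timeout=5.0):
    """Create a scratch SQLite database in `directory`, write a row, read it back.

    Returns True on success and False if SQLite raises an operational error
    (for example "database is locked" or a disk I/O error).
    """
    fd, path = tempfile.mkstemp(suffix=".sqlite", dir=directory)
    os.close(fd)  # SQLite treats the empty file as a new database
    try:
        conn = sqlite3.connect(path, timeout=timeout)
        try:
            conn.execute("CREATE TABLE probe (n INTEGER)")
            conn.execute("INSERT INTO probe VALUES (1)")
            conn.commit()  # the commit is where SQLite takes its file locks
            return conn.execute("SELECT n FROM probe").fetchone() == (1,)
        finally:
            conn.close()
    except sqlite3.OperationalError:
        return False
    finally:
        # Remove the scratch database and any leftover rollback journal.
        for leftover in (path, path + "-journal"):
            if os.path.exists(leftover):
                os.remove(leftover)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print("ok" if sqlite_write_probe(target) else "failed")
```

Running it against a directory on a 4.1 mount and again on a 4.0 mount of the same export would show whether the difference is visible outside Thunderbird. If the call hangs rather than returning, that is itself the symptom described above.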


2021-04-13 23:38:58

by Charles Hedrick

Subject: Re: safe versions of NFS

This is from CentOS 7.9, with all file systems mounted via NFS 4.0 (in theory; there may be 4.2 mounts that were unmounted with --lazy):

463 106.786384244 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39641 Ack=46777 Win=24576 Len=0 TSval=520277320 TSecr=1478982393
464 108.000270192 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
465 108.000361904 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
466 108.000476711 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 464) ACCESS, [Allowed: RD LU MD XT DL]
467 108.000495290 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8761 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
468 108.000591598 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 465) ACCESS, [Allowed: RD LU MD XT DL]
469 108.000608160 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8917 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
470 118.952127064 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
471 118.952356881 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 470) ACCESS, [Allowed: RD LU MD XT DL]
472 118.952372768 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9361 Ack=9073 Win=16605 Len=0 TSval=520289486 TSecr=1478994599
473 119.999835420 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0x94a968d5, [Check: RD LU MD XT DL]
474 120.000067817 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 473) ACCESS, [Allowed: RD LU MD XT DL]
475 120.000082882 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9533 Ack=9229 Win=16605 Len=0 TSval=520290533 TSecr=1478995646
476 140.000587688 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
477 140.000688677 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
478 140.000746915 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 476) ACCESS, [Allowed: RD LU MD XT DL]
479 140.000759241 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9385 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
480 140.000830146 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 477) ACCESS, [Allowed: RD LU MD XT DL]
481 140.000836443 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9541 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
482 148.442466129 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call RENEW CID: 0x04da
483 148.442650203 172.17.11.218 -> 172.17.141.150 NFS 182 V4 Reply (Call In 482) RENEW
484 148.442664846 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10037 Ack=9657 Win=16605 Len=0 TSval=520318976 TSecr=1479024089
485 149.953317362 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
486 149.953550872 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 485) ACCESS, [Allowed: RD LU MD XT DL]
487 149.953565993 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10209 Ack=9813 Win=16605 Len=0 TSval=520320487 TSecr=1479025600
488 162.000571296 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call ACCESS FH: 0xcd1903be, [Check: RD LU MD XT DL]
489 162.000794395 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 488) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
490 162.000825598 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10369 Ack=9969 Win=16605 Len=0 TSval=520332534 TSecr=1479037647
491 162.000998283 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xeabd4697, [Check: RD LU MD XT DL]
492 162.001218772 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 491) ACCESS, [Allowed: RD LU MD XT DL]
493 162.040415201 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10541 Ack=10125 Win=16605 Len=0 TSval=520332574 TSecr=1479037647
494 166.874398617 172.17.141.150 -> 172.17.11.218 TCP 66 [TCP Keep-Alive] nlogin > nfs [ACK] Seq=39640 Ack=46777 Win=24576 Len=0 TSval=520337408 TSecr=1478982393
495 166.874438892 172.17.141.150 -> 172.17.11.218 NFS 250 V4 Call SEQUENCE
496 166.874506845 172.17.11.218 -> 172.17.141.150 TCP 66 [TCP Dup ACK 462#1] nfs > nlogin [ACK] Seq=46777 Ack=39641 Win=24559 Len=0 TSval=1479042521 TSecr=520277320
497 166.874720218 172.17.11.218 -> 172.17.141.150 NFS 218 V4 Reply (Call In 495) SEQUENCE
498 166.874730215 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39825 Ack=46929 Win=24575 Len=0 TSval=520337408 TSecr=1479042521
499 166.874987010 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
500 166.875172744 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 499) TEST_STATEID
501 166.875309487 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x534735df/
502 166.875655661 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 501) OPEN StateID: 0xd7ae
503 166.875801366 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
504 166.876042044 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 503) TEST_STATEID
505 166.876210946 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
506 166.876485761 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 505) OPEN StateID: 0x9578
507 166.876607463 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
508 166.876820365 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 507) OPEN StateID: 0xfa83
509 166.876941430 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
510 166.877123487 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 509) TEST_STATEID
511 166.877205876 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x968ca393/
512 166.877464268 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 511) OPEN StateID: 0x25d5
513 166.877600104 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
514 166.877841822 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 513) TEST_STATEID
515 166.877997847 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde4187cf/
516 166.878265626 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 515) OPEN StateID: 0xd5ce
517 166.878393548 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
518 166.878603997 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 517) TEST_STATEID
519 166.878692334 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x9f6703e9/
520 166.878920958 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 519) OPEN StateID: 0x69a1
521 166.878964818 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
522 166.879156141 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 521) TEST_STATEID
523 166.879195140 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xe5c84183/
524 166.879435831 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 523) OPEN StateID: 0x6069
525 166.879518910 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
526 166.879709592 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 525) TEST_STATEID
527 166.879796731 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
528 166.880024682 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 527) OPEN StateID: 0xf420
529 166.880070944 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
530 166.880265884 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 529) OPEN StateID: 0xec63
531 166.880301034 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
532 166.880511051 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 531) TEST_STATEID
533 166.880575938 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x05761d23/
534 166.880798417 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 533) OPEN StateID: 0xb199
535 166.880840801 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
536 166.881008021 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 535) TEST_STATEID
537 166.881043797 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
538 166.881270127 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 537) OPEN StateID: 0x49df
539 166.881304710 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
540 166.881498628 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 539) OPEN StateID: 0xf0d5
541 166.881545126 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
542 166.881732646 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 541) TEST_STATEID
543 166.881775578 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0183cd1e/
544 166.881978864 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 543) OPEN StateID: 0x546d
545 166.882021595 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
546 166.882209030 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 545) TEST_STATEID
547 166.882252306 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
548 166.882484514 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 547) OPEN StateID: 0xa46d
549 166.882523043 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
550 166.882710061 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 549) OPEN StateID: 0xaa9a
551 166.882750420 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
552 166.882933338 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 551) TEST_STATEID
553 166.882961488 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3699764a/
554 166.883192776 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 553) OPEN StateID: 0x3c37
555 166.883223581 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
556 166.883407176 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 555) TEST_STATEID
557 166.883468198 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x94fd1187/
558 166.883679012 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 557) OPEN StateID: 0xbedf
559 166.883719911 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
560 166.883910224 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 559) TEST_STATEID
561 166.883937791 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xcd96c73b/
562 166.884165115 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 561) OPEN StateID: 0xbf6c
563 166.884194351 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
564 166.884378426 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 563) TEST_STATEID
565 166.884433848 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd24bb22d/
566 166.884661584 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 565) OPEN StateID: 0xaaf5
567 166.884719445 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
568 166.884904098 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 567) TEST_STATEID
569 166.884952255 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x5e594598/
570 166.885154240 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 569) OPEN StateID: 0x11cb
571 166.885206342 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
572 166.885389478 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 571) TEST_STATEID
573 166.885454506 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa4fd80c1/
574 166.885686638 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 573) OPEN StateID: 0x9363
575 166.885745762 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
576 166.885933232 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 575) TEST_STATEID
577 166.885980692 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x80e4743a/
578 166.886212637 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 577) OPEN StateID: 0x63af
579 166.886272585 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
580 166.886457823 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 579) TEST_STATEID
581 166.886514332 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb3e2d284/
582 166.886742488 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 581) OPEN StateID: 0x52d9
583 166.886803826 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
584 166.886989516 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 583) TEST_STATEID
585 166.887100351 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x95137efa/
586 166.887304377 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 585) OPEN StateID: 0xb26e
587 166.887395696 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
588 166.887587544 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 587) TEST_STATEID
589 166.887703373 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x1d70394e/
590 166.887919160 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 589) OPEN StateID: 0x753c
591 166.887999278 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
592 166.888190709 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 591) TEST_STATEID
593 166.888298867 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x680da7ff/
594 166.888530666 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 593) OPEN StateID: 0x097c
595 166.888605330 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
596 166.888795331 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 595) TEST_STATEID
597 166.888902566 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0a52a987/
598 166.889113308 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 597) OPEN StateID: 0x7781
599 166.889162100 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
600 166.889353078 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 599) TEST_STATEID
601 166.889461157 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x56462642/
602 166.889699242 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 601) OPEN StateID: 0xb209
603 166.889772552 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
604 166.889963172 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 603) TEST_STATEID
605 166.890070241 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x850c0567/
606 166.890272350 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 605) OPEN StateID: 0x8134
607 166.890335412 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
608 166.890542793 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 607) TEST_STATEID
609 166.890650208 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x31aa2390/
610 166.890886053 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 609) OPEN StateID: 0x1f30
611 166.890959992 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
612 166.891158969 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 611) TEST_STATEID
613 166.891264349 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2ae7f451/
614 166.891516937 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 613) OPEN StateID: 0x3fce
615 166.891591778 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
616 166.891788024 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 615) TEST_STATEID
617 166.891902329 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa5cb13d3/
618 166.892117369 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 617) OPEN StateID: 0x03d6
619 166.892175933 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
620 166.892398086 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 619) TEST_STATEID
621 166.892447518 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3f5cfcdb/
622 166.892671544 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 621) OPEN StateID: 0x8bff
623 166.892716533 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
624 166.892901971 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 623) TEST_STATEID
625 166.892940753 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x40b4d194/
626 166.893167930 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 625) OPEN StateID: 0xe3e8
627 166.893215973 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
628 166.893398652 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 627) TEST_STATEID
629 166.893445197 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x4643d6fc/
630 166.893682547 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 629) OPEN StateID: 0xe194
631 166.893731829 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
632 166.893915920 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 631) TEST_STATEID
633 166.893960986 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd246cbd3/
634 166.894191435 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 633) OPEN StateID: 0x22f5
635 166.894240889 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
636 166.894425035 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 635) TEST_STATEID
637 166.894470784 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde40a6e1/
638 166.894695618 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 637) OPEN StateID: 0x9ce2
639 166.894744788 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
640 166.894929372 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 639) TEST_STATEID
641 166.894967975 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0f26a4dc/
642 166.895197152 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 641) OPEN StateID: 0x879c
643 166.895245038 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
644 166.895429467 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 643) TEST_STATEID
645 166.895471234 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x82162062/
646 166.895694775 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 645) OPEN StateID: 0xab68
647 166.895744208 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
648 166.895929221 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 647) TEST_STATEID
649 166.895973162 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb8b3b57f/
650 166.896197535 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 649) OPEN StateID: 0xfb0b
651 166.896245960 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
652 166.896430271 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 651) TEST_STATEID
653 166.896475752 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2e6c2b31/
654 166.896705501 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 653) OPEN StateID: 0x8e00
655 166.896754419 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
656 166.896939911 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 655) TEST_STATEID
657 166.896983440 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x547cf5ea/
658 166.897209526 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 657) OPEN StateID: 0x0532
659 166.897258079 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
660 166.897443527 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 659) TEST_STATEID
661 166.897484081 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x33c176ce/
662 166.897693578 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 661) OPEN StateID: 0x3648
663 166.937386876 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=59461 Ack=70165 Win=24576 Len=0 TSval=520337471 TSecr=1479042544
^C663 packets captured
[hedrick@camaro ~]$
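
For anyone trying to read a capture like the one above, here is a minimal sketch (not part of the original report) that tallies the NFS operations in tshark one-line summaries and pairs each reply with its call to get a round-trip time. The names `LINE_RE`, `summarize`, `V41_ONLY_OPS` and `V40_ONLY_OPS` are made up for illustration. The two operation sets are an assumption drawn from the protocol specs: SEQUENCE and TEST_STATEID exist only in NFSv4.1 and later, while RENEW and SETCLIENTID are NFSv4.0 operations, so seeing both kinds in one capture would be consistent with a mix of 4.0 mounts and leftover 4.1/4.2 mounts.

```python
import re
from collections import Counter

# Matches tshark one-line summaries such as:
#   521 166.878964818 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
# An optional leading "> " is tolerated so quoted mail text can be fed in as well.
LINE_RE = re.compile(
    r"^\s*(?:>\s*)?(?P<num>\d+)\s+(?P<time>\d+\.\d+)\s+\S+\s+->\s+\S+\s+"
    r"NFS\s+\d+\s+V4\s+(?P<kind>Call|Reply)\s*"
    r"(?:\(Call In (?P<call>\d+)\)\s*)?(?P<op>[A-Z_]+)"
)

# Assumption: operations that only exist in one minor version of the protocol.
V41_ONLY_OPS = {"SEQUENCE", "TEST_STATEID", "EXCHANGE_ID", "CREATE_SESSION"}
V40_ONLY_OPS = {"RENEW", "SETCLIENTID", "SETCLIENTID_CONFIRM"}


def summarize(lines):
    """Return (calls per operation, round-trip times per operation, version hints)."""
    pending = {}           # frame number of a call -> (operation, timestamp)
    op_counts = Counter()  # number of calls seen per operation
    rtts = {}              # operation -> list of reply-minus-call times in seconds
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # TCP ACKs, shell prompts, blank lines
        num, stamp, op = int(m["num"]), float(m["time"]), m["op"]
        if m["kind"] == "Call":
            pending[num] = (op, stamp)
            op_counts[op] += 1
        elif m["call"] is not None and int(m["call"]) in pending:
            call_op, call_stamp = pending.pop(int(m["call"]))
            rtts.setdefault(call_op, []).append(stamp - call_stamp)
    hints = set()
    if set(op_counts) & V41_ONLY_OPS:
        hints.add("4.1+")
    if set(op_counts) & V40_ONLY_OPS:
        hints.add("4.0")
    return op_counts, rtts, hints
```

Feeding it the saved tshark output (for example `summarize(open("capture.txt"))`, where `capture.txt` is a hypothetical file name) gives a quick view of how many OPEN and TEST_STATEID calls were issued in a burst and how long the server took to answer each one.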

Here’s the corresponding section of /var/log/messages:

Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: nfs4_reclaim_open_state: 1 callbacks suppressed
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
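
To compare several clients, the messages above can be counted per host and per second. The following is only a sketch, assuming the traditional syslog line layout shown in this excerpt; `FAIL_RE`, `SUPPRESSED_RE` and `count_reclaim_failures` are hypothetical names. Note that a "callbacks suppressed" line normally means the kernel rate-limited the message, so the logged count is likely a lower bound.

```python
import re
from collections import Counter

# Assumes lines of the form:
#   Apr 13 15:36:05 host kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
STAMP_HOST = r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
FAIL_RE = re.compile(
    STAMP_HOST + r"NFS: nfs4_reclaim_open_state: Lock reclaim failed!"
)
SUPPRESSED_RE = re.compile(
    STAMP_HOST + r"nfs4_reclaim_open_state: (?P<n>\d+) callbacks suppressed"
)


def count_reclaim_failures(lines):
    """Return (logged failures per (host, timestamp), suppressed messages per host)."""
    logged = Counter()
    suppressed = Counter()
    for line in lines:
        m = FAIL_RE.match(line)
        if m:
            logged[(m["host"], m["stamp"])] += 1
            continue
        m = SUPPRESSED_RE.match(line)
        if m:
            # Messages dropped by rate limiting; they never reached the log.
            suppressed[m["host"]] += int(m["n"])
    return logged, suppressed
```

Running this over /var/log/messages from each client and lining up the timestamps would show whether the reclaim failures happen on all machines at the same moment, which is the kind of correlation a network event would produce.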




> On Apr 13, 2021, at 1:24 PM, Chuck Lever III <[email protected]> wrote:
>
>
>
>> On Apr 13, 2021, at 12:23 PM, Benjamin Coddington <[email protected]> wrote:
>>
>> (resending this as it bounced off the list - I accidentally embedded HTML)
>>
>> Yes, if you're pretty sure your hostnames are all different, the client_ids
>> should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
>> proc) and see the client_id in the kernel log in lines that look like: "NFS
>> call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
>> doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
>>
>> You can extract it via the crash utility, or via systemtap, or by doing a
>> wire capture, but nothing that's easily translated to running across a large
>> number of machines. There are probably other ways; perhaps we should tack
>> that string onto the tracepoints for exchange_id and setclientid.
>>
>> If you're interested in troubleshooting, wire capture's usually the most
>> informative. If the lockup events all happen at the same time, there
>> might be some network event that is triggering the issue.
>>
>> You should expect NFSv4.1 to be rock-solid. It's rare we have reports
>> that it isn't, and I'd love to know why you're having these problems.
>
> I echo that: the NFSv4.1 protocol and implementation are mature, so if
> there are operational problems, they should be root-caused.
>
> NFSv4.1 uses a uniform client ID. That should be the "good" one,
> not the NFSv4.0 one that has a non-zero probability of collision.
>
> Charles, please let us know if there are particular workloads that
> trigger the lock reclaim failure. A narrow reproducer would help
> get to the root issue quickly.
>
>
>> Ben
>>
>> On 13 Apr 2021, at 11:38, [email protected] wrote:
>>
>>> The server is Ubuntu 20, with a ZFS file system.
>>>
>>> I don’t set the unique ID. Documentation claims that it is set from the hostname. They will surely be unique, or the whole world would blow up. How can I check the actual unique ID being used? The kernel reports a blank one, but I think that just means to use the hostname. We could obviously set a unique one if that would be useful.
>>>
>>>> On Apr 13, 2021, at 11:35 AM, Benjamin Coddington <[email protected]> wrote:
>>>>
>>>> It would be interesting to know why your clients are failing to reclaim their locks. Something is misconfigured. What server are you using, and is there anything fancy on the server-side (like HA)? Is it possible that you have clients with the same nfs4_unique_id?
>>>>
>>>> Ben
>>>>
>>>> On 13 Apr 2021, at 11:17, [email protected] wrote:
>>>>
>>>>> Many, though not all, of the problems are “lock reclaim failed”.
>>>>>
>>>>>> On Apr 13, 2021, at 10:52 AM, Patrick Goetz <[email protected]> wrote:
>>>>>>
>>>>>> I use NFS 4.2 with Ubuntu 18/20 workstations and Ubuntu 18/20 servers and haven't had any problems.
>>>>>>
>>>>>> Check your configuration files; the last time I experienced something like this it was because I inadvertently used the same fsid on two different exports. Also recommend exporting top-level directories only. Bind mount everything you want to export into /srv/nfs and only export those directories. According to Bruce F. this doesn't buy you any security (I still don't understand why), but it makes for a cleaner system configuration.
>>>>>>
>>>>>> On 4/13/21 9:33 AM, [email protected] wrote:
>>>>>>> I am in charge of a large computer science dept computing infrastructure. We have a variety of student and development users. If there are problems we’ll see them.
>>>>>>> We use an Ubuntu 20 server, with NVMe storage.
>>>>>>> I’ve just had to move Centos 7 and Ubuntu 18 to use NFS 4.0. We had hangs with NFS 4.1 and 4.2. Files would appear to be locked, although eventually the lock would time out. It’s too soon to be sure that moving back to NFS 4.0 will fix it. Next is either NFS 3 or disabling delegations on the server.
>>>>>>> Are there known versions of NFS that are safe to use in production for various kernel versions? The one we’re most interested in is Ubuntu 20, which can be anything from 5.4 to 5.8.
>>>>
>>
>>
>>
>
> --
> Chuck Lever
>
>
>
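
Ben's suggestion quoted above (turn on rpcdebug for v4.0 and look for the setclientid lines in the kernel log) could be checked across many machines with something like the sketch below. It assumes the debug lines follow the format string he quotes, "NFS call setclientid auth=%s, '%s'", and that the kernel logs have already been collected per host; `SETCLIENTID_RE` and `find_duplicate_client_ids` are hypothetical names, and it does nothing for v4.1/v4.2, where, as he says, no such debug line exists.

```python
import re
from collections import defaultdict

# Matches the debug line quoted in the thread; whitespace after "call" is
# matched loosely in case the kernel pads it.
SETCLIENTID_RE = re.compile(
    r"NFS call\s+setclientid auth=(?P<auth>[^,]+), '(?P<client_id>.*)'"
)


def find_duplicate_client_ids(logs_by_host):
    """Map each client ID string that appears on more than one host to those hosts.

    logs_by_host: dict of hostname -> iterable of kernel log lines from that host.
    """
    hosts_by_id = defaultdict(set)
    for host, lines in logs_by_host.items():
        for line in lines:
            m = SETCLIENTID_RE.search(line)
            if m:
                hosts_by_id[m["client_id"]].add(host)
    return {
        client_id: sorted(hosts)
        for client_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }
```

An empty result would support the assumption that the client IDs derived from the hostnames really are distinct; any entry in the result names the machines that would be stepping on each other's state at the server.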

2021-04-13 23:42:11

by Charles Hedrick

[permalink] [raw]
Subject: Re: safe versions of NFS

I have log entries for two Ubuntu 18 systems, but there’s no way to get you a network trace for them without using a time machine.

> On Apr 13, 2021, at 3:40 PM, [email protected] wrote:
>
> This is from Centos 7.9, with all file systems mounted via NFS 4.0 (in theory; there may be 4.2 mounts that were unmounted with --lazy):
>
> 463 106.786384244 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39641 Ack=46777 Win=24576 Len=0 TSval=520277320 TSecr=1478982393
> 464 108.000270192 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
> 465 108.000361904 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
> 466 108.000476711 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 464) ACCESS, [Allowed: RD LU MD XT DL]
> 467 108.000495290 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8761 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
> 468 108.000591598 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 465) ACCESS, [Allowed: RD LU MD XT DL]
> 469 108.000608160 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8917 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
> 470 118.952127064 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
> 471 118.952356881 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 470) ACCESS, [Allowed: RD LU MD XT DL]
> 472 118.952372768 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9361 Ack=9073 Win=16605 Len=0 TSval=520289486 TSecr=1478994599
> 473 119.999835420 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0x94a968d5, [Check: RD LU MD XT DL]
> 474 120.000067817 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 473) ACCESS, [Allowed: RD LU MD XT DL]
> 475 120.000082882 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9533 Ack=9229 Win=16605 Len=0 TSval=520290533 TSecr=1478995646
> 476 140.000587688 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
> 477 140.000688677 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
> 478 140.000746915 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 476) ACCESS, [Allowed: RD LU MD XT DL]
> 479 140.000759241 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9385 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
> 480 140.000830146 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 477) ACCESS, [Allowed: RD LU MD XT DL]
> 481 140.000836443 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9541 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
> 482 148.442466129 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call RENEW CID: 0x04da
> 483 148.442650203 172.17.11.218 -> 172.17.141.150 NFS 182 V4 Reply (Call In 482) RENEW
> 484 148.442664846 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10037 Ack=9657 Win=16605 Len=0 TSval=520318976 TSecr=1479024089
> 485 149.953317362 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
> 486 149.953550872 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 485) ACCESS, [Allowed: RD LU MD XT DL]
> 487 149.953565993 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10209 Ack=9813 Win=16605 Len=0 TSval=520320487 TSecr=1479025600
> 488 162.000571296 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call ACCESS FH: 0xcd1903be, [Check: RD LU MD XT DL]
> 489 162.000794395 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 488) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
> 490 162.000825598 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10369 Ack=9969 Win=16605 Len=0 TSval=520332534 TSecr=1479037647
> 491 162.000998283 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xeabd4697, [Check: RD LU MD XT DL]
> 492 162.001218772 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 491) ACCESS, [Allowed: RD LU MD XT DL]
> 493 162.040415201 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10541 Ack=10125 Win=16605 Len=0 TSval=520332574 TSecr=1479037647
> 494 166.874398617 172.17.141.150 -> 172.17.11.218 TCP 66 [TCP Keep-Alive] nlogin > nfs [ACK] Seq=39640 Ack=46777 Win=24576 Len=0 TSval=520337408 TSecr=1478982393
> 495 166.874438892 172.17.141.150 -> 172.17.11.218 NFS 250 V4 Call SEQUENCE
> 496 166.874506845 172.17.11.218 -> 172.17.141.150 TCP 66 [TCP Dup ACK 462#1] nfs > nlogin [ACK] Seq=46777 Ack=39641 Win=24559 Len=0 TSval=1479042521 TSecr=520277320
> 497 166.874720218 172.17.11.218 -> 172.17.141.150 NFS 218 V4 Reply (Call In 495) SEQUENCE
> 498 166.874730215 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39825 Ack=46929 Win=24575 Len=0 TSval=520337408 TSecr=1479042521
> 499 166.874987010 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 500 166.875172744 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 499) TEST_STATEID
> 501 166.875309487 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x534735df/
> 502 166.875655661 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 501) OPEN StateID: 0xd7ae
> 503 166.875801366 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 504 166.876042044 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 503) TEST_STATEID
> 505 166.876210946 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
> 506 166.876485761 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 505) OPEN StateID: 0x9578
> 507 166.876607463 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
> 508 166.876820365 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 507) OPEN StateID: 0xfa83
> 509 166.876941430 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 510 166.877123487 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 509) TEST_STATEID
> 511 166.877205876 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x968ca393/
> 512 166.877464268 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 511) OPEN StateID: 0x25d5
> 513 166.877600104 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 514 166.877841822 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 513) TEST_STATEID
> 515 166.877997847 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde4187cf/
> 516 166.878265626 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 515) OPEN StateID: 0xd5ce
> 517 166.878393548 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 518 166.878603997 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 517) TEST_STATEID
> 519 166.878692334 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x9f6703e9/
> 520 166.878920958 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 519) OPEN StateID: 0x69a1
> 521 166.878964818 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 522 166.879156141 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 521) TEST_STATEID
> 523 166.879195140 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xe5c84183/
> 524 166.879435831 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 523) OPEN StateID: 0x6069
> 525 166.879518910 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 526 166.879709592 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 525) TEST_STATEID
> 527 166.879796731 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
> 528 166.880024682 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 527) OPEN StateID: 0xf420
> 529 166.880070944 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
> 530 166.880265884 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 529) OPEN StateID: 0xec63
> 531 166.880301034 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 532 166.880511051 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 531) TEST_STATEID
> 533 166.880575938 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x05761d23/
> 534 166.880798417 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 533) OPEN StateID: 0xb199
> 535 166.880840801 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 536 166.881008021 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 535) TEST_STATEID
> 537 166.881043797 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
> 538 166.881270127 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 537) OPEN StateID: 0x49df
> 539 166.881304710 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
> 540 166.881498628 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 539) OPEN StateID: 0xf0d5
> 541 166.881545126 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 542 166.881732646 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 541) TEST_STATEID
> 543 166.881775578 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0183cd1e/
> 544 166.881978864 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 543) OPEN StateID: 0x546d
> 545 166.882021595 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 546 166.882209030 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 545) TEST_STATEID
> 547 166.882252306 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
> 548 166.882484514 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 547) OPEN StateID: 0xa46d
> 549 166.882523043 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
> 550 166.882710061 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 549) OPEN StateID: 0xaa9a
> 551 166.882750420 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 552 166.882933338 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 551) TEST_STATEID
> 553 166.882961488 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3699764a/
> 554 166.883192776 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 553) OPEN StateID: 0x3c37
> 555 166.883223581 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 556 166.883407176 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 555) TEST_STATEID
> 557 166.883468198 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x94fd1187/
> 558 166.883679012 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 557) OPEN StateID: 0xbedf
> 559 166.883719911 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 560 166.883910224 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 559) TEST_STATEID
> 561 166.883937791 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xcd96c73b/
> 562 166.884165115 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 561) OPEN StateID: 0xbf6c
> 563 166.884194351 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 564 166.884378426 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 563) TEST_STATEID
> 565 166.884433848 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd24bb22d/
> 566 166.884661584 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 565) OPEN StateID: 0xaaf5
> 567 166.884719445 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 568 166.884904098 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 567) TEST_STATEID
> 569 166.884952255 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x5e594598/
> 570 166.885154240 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 569) OPEN StateID: 0x11cb
> 571 166.885206342 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 572 166.885389478 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 571) TEST_STATEID
> 573 166.885454506 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa4fd80c1/
> 574 166.885686638 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 573) OPEN StateID: 0x9363
> 575 166.885745762 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 576 166.885933232 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 575) TEST_STATEID
> 577 166.885980692 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x80e4743a/
> 578 166.886212637 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 577) OPEN StateID: 0x63af
> 579 166.886272585 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 580 166.886457823 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 579) TEST_STATEID
> 581 166.886514332 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb3e2d284/
> 582 166.886742488 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 581) OPEN StateID: 0x52d9
> 583 166.886803826 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 584 166.886989516 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 583) TEST_STATEID
> 585 166.887100351 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x95137efa/
> 586 166.887304377 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 585) OPEN StateID: 0xb26e
> 587 166.887395696 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 588 166.887587544 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 587) TEST_STATEID
> 589 166.887703373 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x1d70394e/
> 590 166.887919160 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 589) OPEN StateID: 0x753c
> 591 166.887999278 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 592 166.888190709 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 591) TEST_STATEID
> 593 166.888298867 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x680da7ff/
> 594 166.888530666 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 593) OPEN StateID: 0x097c
> 595 166.888605330 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 596 166.888795331 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 595) TEST_STATEID
> 597 166.888902566 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0a52a987/
> 598 166.889113308 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 597) OPEN StateID: 0x7781
> 599 166.889162100 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 600 166.889353078 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 599) TEST_STATEID
> 601 166.889461157 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x56462642/
> 602 166.889699242 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 601) OPEN StateID: 0xb209
> 603 166.889772552 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 604 166.889963172 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 603) TEST_STATEID
> 605 166.890070241 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x850c0567/
> 606 166.890272350 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 605) OPEN StateID: 0x8134
> 607 166.890335412 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 608 166.890542793 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 607) TEST_STATEID
> 609 166.890650208 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x31aa2390/
> 610 166.890886053 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 609) OPEN StateID: 0x1f30
> 611 166.890959992 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 612 166.891158969 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 611) TEST_STATEID
> 613 166.891264349 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2ae7f451/
> 614 166.891516937 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 613) OPEN StateID: 0x3fce
> 615 166.891591778 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 616 166.891788024 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 615) TEST_STATEID
> 617 166.891902329 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa5cb13d3/
> 618 166.892117369 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 617) OPEN StateID: 0x03d6
> 619 166.892175933 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 620 166.892398086 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 619) TEST_STATEID
> 621 166.892447518 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3f5cfcdb/
> 622 166.892671544 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 621) OPEN StateID: 0x8bff
> 623 166.892716533 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 624 166.892901971 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 623) TEST_STATEID
> 625 166.892940753 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x40b4d194/
> 626 166.893167930 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 625) OPEN StateID: 0xe3e8
> 627 166.893215973 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 628 166.893398652 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 627) TEST_STATEID
> 629 166.893445197 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x4643d6fc/
> 630 166.893682547 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 629) OPEN StateID: 0xe194
> 631 166.893731829 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 632 166.893915920 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 631) TEST_STATEID
> 633 166.893960986 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd246cbd3/
> 634 166.894191435 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 633) OPEN StateID: 0x22f5
> 635 166.894240889 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 636 166.894425035 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 635) TEST_STATEID
> 637 166.894470784 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde40a6e1/
> 638 166.894695618 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 637) OPEN StateID: 0x9ce2
> 639 166.894744788 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 640 166.894929372 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 639) TEST_STATEID
> 641 166.894967975 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0f26a4dc/
> 642 166.895197152 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 641) OPEN StateID: 0x879c
> 643 166.895245038 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 644 166.895429467 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 643) TEST_STATEID
> 645 166.895471234 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x82162062/
> 646 166.895694775 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 645) OPEN StateID: 0xab68
> 647 166.895744208 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 648 166.895929221 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 647) TEST_STATEID
> 649 166.895973162 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb8b3b57f/
> 650 166.896197535 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 649) OPEN StateID: 0xfb0b
> 651 166.896245960 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 652 166.896430271 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 651) TEST_STATEID
> 653 166.896475752 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2e6c2b31/
> 654 166.896705501 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 653) OPEN StateID: 0x8e00
> 655 166.896754419 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 656 166.896939911 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 655) TEST_STATEID
> 657 166.896983440 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x547cf5ea/
> 658 166.897209526 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 657) OPEN StateID: 0x0532
> 659 166.897258079 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 660 166.897443527 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 659) TEST_STATEID
> 661 166.897484081 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x33c176ce/
> 662 166.897693578 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 661) OPEN StateID: 0x3648
> 663 166.937386876 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=59461 Ack=70165 Win=24576 Len=0 TSval=520337471 TSecr=1479042544
> ^C663 packets captured
> [hedrick@camaro ~]$
>
> Here’s the corresponding section of /var/log/messages
>
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: nfs4_reclaim_open_state: 1 callbacks suppressed
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
>
>
>
>
>> On Apr 13, 2021, at 1:24 PM, Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>>> On Apr 13, 2021, at 12:23 PM, Benjamin Coddington <[email protected]> wrote:
>>>
>>> (resending this as it bounced off the list - I accidentally embedded HTML)
>>>
>>> Yes, if you're pretty sure your hostnames are all different, the client_ids
>>> should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
>>> proc) and see the client_id in the kernel log in lines that look like: "NFS
>>> call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
>>> doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
>>>
>>> You can extract it via the crash utility, or via systemtap, or by doing a
>>> wire capture, but nothing that's easily translated to running across a large
>>> number of machines. There are probably other ways; perhaps we should tack
>>> that string into the tracepoints for exchange_id and setclientid.
>>>
>>> If you're interested in troubleshooting, wire capture's usually the most
>>> informative. If the lockup events all happen at the same time, there
>>> might be some network event that is triggering the issue.
>>>
>>> You should expect NFSv4.1 to be rock-solid. It's rare we have reports
>>> that it isn't, and I'd love to know why you're having these problems.
>>
>> I echo that: NFSv4.1 protocol and implementation are mature, so if
>> there are operational problems, it should be root-caused.
>>
>> NFSv4.1 uses a uniform client ID. That should be the "good" one,
>> not the NFSv4.0 one that has a non-zero probability of collision.
>>
>> Charles, please let us know if there are particular workloads that
>> trigger the lock reclaim failure. A narrow reproducer would help
>> get to the root issue quickly.
>>
>>
>>> Ben
>>>
>>> On 13 Apr 2021, at 11:38, [email protected] wrote:
>>>
>>>> The server is ubuntu 20, with a ZFS file system.
>>>>
>>>> I don’t set the unique ID. Documentation claims that it is set from the hostname. They will surely be unique, or the whole world would blow up. How can I check the actual unique ID being used? The kernel reports a blank one, but I think that just means to use the hostname. We could obviously set a unique one if that would be useful.
>>>>
>>>>> On Apr 13, 2021, at 11:35 AM, Benjamin Coddington <[email protected]> wrote:
>>>>>
>>>>> It would be interesting to know why your clients are failing to reclaim their locks. Something is misconfigured. What server are you using, and is there anything fancy on the server-side (like HA)? Is it possible that you have clients with the same nfs4_unique_id?
>>>>>
>>>>> Ben
>>>>>
>>>>> On 13 Apr 2021, at 11:17, [email protected] wrote:
>>>>>
>>>>>> many, though not all, of the problems are “lock reclaim failed”.
>>>>>>
>>>>>>> On Apr 13, 2021, at 10:52 AM, Patrick Goetz <[email protected]> wrote:
>>>>>>>
>>>>>>> I use NFS 4.2 with Ubuntu 18/20 workstations and Ubuntu 18/20 servers and haven't had any problems.
>>>>>>>
>>>>>>> Check your configuration files; the last time I experienced something like this it's because I inadvertently used the same fsid on two different exports. Also recommend exporting top level directories only. Bind mount everything you want to export into /srv/nfs and only export those directories. According to Bruce F. this doesn't buy you any security (I still don't understand why), but it makes for a cleaner system configuration.
>>>>>>>
>>>>>>> On 4/13/21 9:33 AM, [email protected] wrote:
>>>>>>>> I am in charge of a large computer science dept computing infrastructure. We have a variety of student and development users. If there are problems we’ll see them.
>>>>>>>> We use an Ubuntu 20 server, with NVMe storage.
>>>>>>>> I’ve just had to move Centos 7 and Ubuntu 18 to use NFS 4.0. We had hangs with NFS 4.1 and 4.2. Files would appear to be locked, although eventually the lock would time out. It’s too soon to be sure that moving back to NFS 4.0 will fix it. Next is either NFS 3 or disabling delegations on the server.
>>>>>>>> Are there known versions of NFS that are safe to use in production for various kernel versions? The one we’re most interested in is Ubuntu 20, which can be anything from 5.4 to 5.8.
>>>>>
>>>
>>>
>>>
>>
>> --
>> Chuck Lever
>>
>>
>>
>
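
The tail of the capture quoted above is almost entirely alternating TEST_STATEID and OPEN calls packed into a single second, which is hard to see by eye in a listing this long. A minimal sketch for tallying it, assuming the capture is saved as text with lines shaped like the ones quoted; the names `LINE_RE`, `tally_calls` and `SAMPLE` are illustrative, not part of any tool:

```python
import re
from collections import Counter

# Matches one NFSv4 summary line shaped like those in the quoted capture,
# tolerating leading "> " mail quote markers. TCP-only lines do not match.
LINE_RE = re.compile(
    r"^(?:>\s*)*"
    r"(?P<frame>\d+)\s+(?P<time>\d+\.\d+)\s+"
    r"(?P<src>\S+)\s+->\s+(?P<dst>\S+)\s+"
    r"NFS\s+\d+\s+V4\s+(?P<kind>Call|Reply)\s+"
    r"(?:\(Call In \d+\)\s+)?(?P<op>[A-Z_]+)"
)


def tally_calls(lines):
    """Count NFS Call frames per (whole capture second, operation name)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("kind") == "Call":
            second = int(float(match.group("time")))
            counts[(second, match.group("op"))] += 1
    return counts


# A few lines copied from the quoted capture, used here only as sample input.
SAMPLE = """\
> 592 166.888190709 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 591) TEST_STATEID
> 593 166.888298867 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x680da7ff/
> 594 166.888530666 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 593) OPEN StateID: 0x097c
> 595 166.888605330 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 596 166.888795331 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 595) TEST_STATEID
> 597 166.888902566 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0a52a987/
> 663 166.937386876 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=59461 Ack=70165 Win=24576 Len=0
> ^C663 packets captured
"""

counts = tally_calls(SAMPLE.splitlines())
for (second, op), n in sorted(counts.items()):
    print(f"t={second}s {op}: {n}")
```

Run over the whole saved capture, a sudden jump in the per-second OPEN and TEST_STATEID counts would mark the moment to line up against the "Lock reclaim failed!" timestamps in /var/log/messages.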

2021-04-14 16:32:41

by Chuck Lever

[permalink] [raw]
Subject: Re: safe versions of NFS

Good morning Charles,

> On Apr 13, 2021, at 3:40 PM, [email protected] wrote:
>
> This is from Centos 7.9, with all file systems mounted via NFS 4.0 (in theory; there may be 4.2 mounts that were unmounted with --lazy):

There isn't much linux-nfs@ can do about problems in distributor
kernels, and it's very possible the issue is already fixed in a
more recent kernel. You did suggest that v5.8 seemed more solid,
for instance, and my 30-second Googling suggests that Ubuntu 18
is based on 4.15, which is ancient history for us upstream code
monkeys.

I recommend working with your Linux distributor to start root
cause analysis and to let them document the issue properly. If
they find that the problem is not already addressed in a newer
kernel, then bring it back here to linux-nfs@.
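
When documenting the issue for a distributor, it may help to record which NFS version each mount is actually using, since the note quoted above is unsure whether any 4.2 mounts remain. A minimal sketch, assuming the mount table is in the /proc/mounts layout and that NFS entries carry a `vers=` option there; the function name, server name and sample lines are hypothetical:

```python
def nfs_mount_versions(mount_table):
    """Map mount point -> NFS version string for nfs/nfs4 entries.

    Expects text in the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    versions = {}
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        # Take the value of the first vers= option, if one is listed.
        vers = next(
            (opt.split("=", 1)[1] for opt in options if opt.startswith("vers=")),
            "unknown",
        )
        versions[fields[1]] = vers
    return versions


# Hypothetical sample input; on a real client read /proc/mounts instead:
#     with open("/proc/mounts") as f:
#         print(nfs_mount_versions(f.read()))
SAMPLE_MOUNTS = """\
fileserver:/export/home /home nfs4 rw,relatime,vers=4.0,proto=tcp 0 0
fileserver:/export/proj /proj nfs4 rw,relatime,vers=4.2,proto=tcp 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

for mount_point, vers in sorted(nfs_mount_versions(SAMPLE_MOUNTS).items()):
    print(f"{mount_point}: NFS {vers}")
```

A lazily detached mount no longer appears in the mount table, so this only reports mounts that are still listed; it cannot confirm whether an old 4.2 mount is still held open by a process.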


Sidebar: I find it interesting that the "just disable delegation"
advice is still floating around the interwebs. And unfortunately
it is probably still effective in some cases. The community
should make an effort to nail those down and squash them, IMO.


> 463 106.786384244 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39641 Ack=46777 Win=24576 Len=0 TSval=520277320 TSecr=1478982393
> 464 108.000270192 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
> 465 108.000361904 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
> 466 108.000476711 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 464) ACCESS, [Allowed: RD LU MD XT DL]
> 467 108.000495290 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8761 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
> 468 108.000591598 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 465) ACCESS, [Allowed: RD LU MD XT DL]
> 469 108.000608160 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9189 Ack=8917 Win=16605 Len=0 TSval=520278534 TSecr=1478983647
> 470 118.952127064 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
> 471 118.952356881 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 470) ACCESS, [Allowed: RD LU MD XT DL]
> 472 118.952372768 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9361 Ack=9073 Win=16605 Len=0 TSval=520289486 TSecr=1478994599
> 473 119.999835420 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0x94a968d5, [Check: RD LU MD XT DL]
> 474 120.000067817 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 473) ACCESS, [Allowed: RD LU MD XT DL]
> 475 120.000082882 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9533 Ack=9229 Win=16605 Len=0 TSval=520290533 TSecr=1478995646
> 476 140.000587688 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xe696b554, [Check: RD LU MD XT DL]
> 477 140.000688677 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xd61aa475, [Check: RD LU MD XT DL]
> 478 140.000746915 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 476) ACCESS, [Allowed: RD LU MD XT DL]
> 479 140.000759241 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9385 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
> 480 140.000830146 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 477) ACCESS, [Allowed: RD LU MD XT DL]
> 481 140.000836443 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=9877 Ack=9541 Win=16605 Len=0 TSval=520310534 TSecr=1479015647
> 482 148.442466129 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call RENEW CID: 0x04da
> 483 148.442650203 172.17.11.218 -> 172.17.141.150 NFS 182 V4 Reply (Call In 482) RENEW
> 484 148.442664846 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10037 Ack=9657 Win=16605 Len=0 TSval=520318976 TSecr=1479024089
> 485 149.953317362 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xef7d152e, [Check: RD LU MD XT DL]
> 486 149.953550872 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 485) ACCESS, [Allowed: RD LU MD XT DL]
> 487 149.953565993 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10209 Ack=9813 Win=16605 Len=0 TSval=520320487 TSecr=1479025600
> 488 162.000571296 172.17.141.150 -> 172.17.11.218 NFS 226 V4 Call ACCESS FH: 0xcd1903be, [Check: RD LU MD XT DL]
> 489 162.000794395 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 488) ACCESS, [Access Denied: MD XT DL], [Allowed: RD LU]
> 490 162.000825598 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10369 Ack=9969 Win=16605 Len=0 TSval=520332534 TSecr=1479037647
> 491 162.000998283 172.17.141.150 -> 172.17.11.218 NFS 238 V4 Call ACCESS FH: 0xeabd4697, [Check: RD LU MD XT DL]
> 492 162.001218772 172.17.11.218 -> 172.17.141.150 NFS 222 V4 Reply (Call In 491) ACCESS, [Allowed: RD LU MD XT DL]
> 493 162.040415201 172.17.141.150 -> 172.17.11.218 TCP 66 rndc > nfs [ACK] Seq=10541 Ack=10125 Win=16605 Len=0 TSval=520332574 TSecr=1479037647
> 494 166.874398617 172.17.141.150 -> 172.17.11.218 TCP 66 [TCP Keep-Alive] nlogin > nfs [ACK] Seq=39640 Ack=46777 Win=24576 Len=0 TSval=520337408 TSecr=1478982393
> 495 166.874438892 172.17.141.150 -> 172.17.11.218 NFS 250 V4 Call SEQUENCE
> 496 166.874506845 172.17.11.218 -> 172.17.141.150 TCP 66 [TCP Dup ACK 462#1] nfs > nlogin [ACK] Seq=46777 Ack=39641 Win=24559 Len=0 TSval=1479042521 TSecr=520277320
> 497 166.874720218 172.17.11.218 -> 172.17.141.150 NFS 218 V4 Reply (Call In 495) SEQUENCE
> 498 166.874730215 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=39825 Ack=46929 Win=24575 Len=0 TSval=520337408 TSecr=1479042521
> 499 166.874987010 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 500 166.875172744 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 499) TEST_STATEID
> 501 166.875309487 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x534735df/
> 502 166.875655661 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 501) OPEN StateID: 0xd7ae
> 503 166.875801366 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 504 166.876042044 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 503) TEST_STATEID
> 505 166.876210946 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
> 506 166.876485761 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 505) OPEN StateID: 0x9578
> 507 166.876607463 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xdabfc399/
> 508 166.876820365 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 507) OPEN StateID: 0xfa83
> 509 166.876941430 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 510 166.877123487 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 509) TEST_STATEID
> 511 166.877205876 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x968ca393/
> 512 166.877464268 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 511) OPEN StateID: 0x25d5
> 513 166.877600104 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 514 166.877841822 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 513) TEST_STATEID
> 515 166.877997847 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde4187cf/
> 516 166.878265626 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 515) OPEN StateID: 0xd5ce
> 517 166.878393548 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 518 166.878603997 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 517) TEST_STATEID
> 519 166.878692334 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x9f6703e9/
> 520 166.878920958 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 519) OPEN StateID: 0x69a1
> 521 166.878964818 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 522 166.879156141 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 521) TEST_STATEID
> 523 166.879195140 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xe5c84183/
> 524 166.879435831 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 523) OPEN StateID: 0x6069
> 525 166.879518910 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 526 166.879709592 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 525) TEST_STATEID
> 527 166.879796731 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
> 528 166.880024682 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 527) OPEN StateID: 0xf420
> 529 166.880070944 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x927973ae/
> 530 166.880265884 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 529) OPEN StateID: 0xec63
> 531 166.880301034 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 532 166.880511051 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 531) TEST_STATEID
> 533 166.880575938 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x05761d23/
> 534 166.880798417 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 533) OPEN StateID: 0xb199
> 535 166.880840801 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 536 166.881008021 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 535) TEST_STATEID
> 537 166.881043797 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
> 538 166.881270127 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 537) OPEN StateID: 0x49df
> 539 166.881304710 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb6b205f7/
> 540 166.881498628 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 539) OPEN StateID: 0xf0d5
> 541 166.881545126 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 542 166.881732646 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 541) TEST_STATEID
> 543 166.881775578 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0183cd1e/
> 544 166.881978864 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 543) OPEN StateID: 0x546d
> 545 166.882021595 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 546 166.882209030 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 545) TEST_STATEID
> 547 166.882252306 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
> 548 166.882484514 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 547) OPEN StateID: 0xa46d
> 549 166.882523043 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x673ef4c3/
> 550 166.882710061 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 549) OPEN StateID: 0xaa9a
> 551 166.882750420 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 552 166.882933338 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 551) TEST_STATEID
> 553 166.882961488 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3699764a/
> 554 166.883192776 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 553) OPEN StateID: 0x3c37
> 555 166.883223581 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 556 166.883407176 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 555) TEST_STATEID
> 557 166.883468198 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x94fd1187/
> 558 166.883679012 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 557) OPEN StateID: 0xbedf
> 559 166.883719911 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 560 166.883910224 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 559) TEST_STATEID
> 561 166.883937791 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xcd96c73b/
> 562 166.884165115 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 561) OPEN StateID: 0xbf6c
> 563 166.884194351 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 564 166.884378426 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 563) TEST_STATEID
> 565 166.884433848 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd24bb22d/
> 566 166.884661584 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 565) OPEN StateID: 0xaaf5
> 567 166.884719445 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 568 166.884904098 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 567) TEST_STATEID
> 569 166.884952255 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x5e594598/
> 570 166.885154240 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 569) OPEN StateID: 0x11cb
> 571 166.885206342 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 572 166.885389478 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 571) TEST_STATEID
> 573 166.885454506 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa4fd80c1/
> 574 166.885686638 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 573) OPEN StateID: 0x9363
> 575 166.885745762 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 576 166.885933232 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 575) TEST_STATEID
> 577 166.885980692 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x80e4743a/
> 578 166.886212637 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 577) OPEN StateID: 0x63af
> 579 166.886272585 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 580 166.886457823 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 579) TEST_STATEID
> 581 166.886514332 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb3e2d284/
> 582 166.886742488 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 581) OPEN StateID: 0x52d9
> 583 166.886803826 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 584 166.886989516 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 583) TEST_STATEID
> 585 166.887100351 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x95137efa/
> 586 166.887304377 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 585) OPEN StateID: 0xb26e
> 587 166.887395696 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 588 166.887587544 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 587) TEST_STATEID
> 589 166.887703373 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x1d70394e/
> 590 166.887919160 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 589) OPEN StateID: 0x753c
> 591 166.887999278 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 592 166.888190709 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 591) TEST_STATEID
> 593 166.888298867 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x680da7ff/
> 594 166.888530666 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 593) OPEN StateID: 0x097c
> 595 166.888605330 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 596 166.888795331 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 595) TEST_STATEID
> 597 166.888902566 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0a52a987/
> 598 166.889113308 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 597) OPEN StateID: 0x7781
> 599 166.889162100 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 600 166.889353078 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 599) TEST_STATEID
> 601 166.889461157 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x56462642/
> 602 166.889699242 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 601) OPEN StateID: 0xb209
> 603 166.889772552 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 604 166.889963172 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 603) TEST_STATEID
> 605 166.890070241 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x850c0567/
> 606 166.890272350 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 605) OPEN StateID: 0x8134
> 607 166.890335412 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 608 166.890542793 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 607) TEST_STATEID
> 609 166.890650208 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x31aa2390/
> 610 166.890886053 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 609) OPEN StateID: 0x1f30
> 611 166.890959992 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 612 166.891158969 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 611) TEST_STATEID
> 613 166.891264349 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2ae7f451/
> 614 166.891516937 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 613) OPEN StateID: 0x3fce
> 615 166.891591778 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 616 166.891788024 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 615) TEST_STATEID
> 617 166.891902329 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xa5cb13d3/
> 618 166.892117369 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 617) OPEN StateID: 0x03d6
> 619 166.892175933 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 620 166.892398086 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 619) TEST_STATEID
> 621 166.892447518 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x3f5cfcdb/
> 622 166.892671544 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 621) OPEN StateID: 0x8bff
> 623 166.892716533 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 624 166.892901971 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 623) TEST_STATEID
> 625 166.892940753 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x40b4d194/
> 626 166.893167930 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 625) OPEN StateID: 0xe3e8
> 627 166.893215973 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 628 166.893398652 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 627) TEST_STATEID
> 629 166.893445197 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x4643d6fc/
> 630 166.893682547 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 629) OPEN StateID: 0xe194
> 631 166.893731829 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 632 166.893915920 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 631) TEST_STATEID
> 633 166.893960986 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xd246cbd3/
> 634 166.894191435 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 633) OPEN StateID: 0x22f5
> 635 166.894240889 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 636 166.894425035 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 635) TEST_STATEID
> 637 166.894470784 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xde40a6e1/
> 638 166.894695618 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 637) OPEN StateID: 0x9ce2
> 639 166.894744788 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 640 166.894929372 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 639) TEST_STATEID
> 641 166.894967975 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x0f26a4dc/
> 642 166.895197152 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 641) OPEN StateID: 0x879c
> 643 166.895245038 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 644 166.895429467 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 643) TEST_STATEID
> 645 166.895471234 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x82162062/
> 646 166.895694775 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 645) OPEN StateID: 0xab68
> 647 166.895744208 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 648 166.895929221 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 647) TEST_STATEID
> 649 166.895973162 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0xb8b3b57f/
> 650 166.896197535 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 649) OPEN StateID: 0xfb0b
> 651 166.896245960 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 652 166.896430271 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 651) TEST_STATEID
> 653 166.896475752 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x2e6c2b31/
> 654 166.896705501 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 653) OPEN StateID: 0x8e00
> 655 166.896754419 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 656 166.896939911 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 655) TEST_STATEID
> 657 166.896983440 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x547cf5ea/
> 658 166.897209526 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 657) OPEN StateID: 0x0532
> 659 166.897258079 172.17.141.150 -> 172.17.11.218 NFS 274 V4 Call TEST_STATEID
> 660 166.897443527 172.17.11.218 -> 172.17.141.150 NFS 234 V4 Reply (Call In 659) TEST_STATEID
> 661 166.897484081 172.17.141.150 -> 172.17.11.218 NFS 334 V4 Call OPEN DH: 0x33c176ce/
> 662 166.897693578 172.17.11.218 -> 172.17.141.150 NFS 454 V4 Reply (Call In 661) OPEN StateID: 0x3648
> 663 166.937386876 172.17.141.150 -> 172.17.11.218 TCP 66 nlogin > nfs [ACK] Seq=59461 Ack=70165 Win=24576 Len=0 TSval=520337471 TSecr=1479042544
> ^C663 packets captured
> [hedrick@camaro ~]$
>
> Here’s the corresponding section of /var/log/messages
>
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: nfs4_reclaim_open_state: 1 callbacks suppressed
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> Apr 13 15:36:05 camaro.lcsr.rutgers.edu kernel: NFS: nfs4_reclaim_open_state: Lock reclaim failed!
>
>
>
>
>> On Apr 13, 2021, at 1:24 PM, Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>>> On Apr 13, 2021, at 12:23 PM, Benjamin Coddington <[email protected]> wrote:
>>>
>>> (resending this as it bounced off the list - I accidentally embedded HTML)
>>>
>>> Yes, if you're pretty sure your hostnames are all different, the client_ids
>>> should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
>>> proc) and see the client_id in the kernel log in lines that look like: "NFS
>>> call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
>>> doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
>>>
>>> You can extract it via the crash utility, or via systemtap, or by doing a
>>> wire capture, but nothing that's easily translated to running across a large
>>> number of machines. There's probably other ways, perhaps we should tack
>>> that string into the tracepoints for exchange_id and setclientid.
>>>
>>> If you're interested in troubleshooting, wire capture's usually the most
>>> informative. If the lockup events all happen at the same time, there
>>> might be some network event that is triggering the issue.
>>>
>>> You should expect NFSv4.1 to be rock-solid. Its rare we have reports
>>> that it isn't, and I'd love to know why you're having these problems.
>>
>> I echo that: NFSv4.1 protocol and implementation are mature, so if
>> there are operational problems, it should be root-caused.
>>
>> NFSv4.1 uses a uniform client ID. That should be the "good" one,
>> not the NFSv4.0 one that has a non-zero probability of collision.
>>
>> Charles, please let us know if there are particular workloads that
>> trigger the lock reclaim failure. A narrow reproducer would help
>> get to the root issue quickly.
>>
>>
>>> Ben
>>>
>>> On 13 Apr 2021, at 11:38, [email protected] wrote:
>>>
>>>> The server is ubuntu 20, with a ZFS file system.
>>>>
>>>> I don’t set the unique ID. Documentation claims that it is set from the hostname. They will surely be unique, or the whole world would blow up. How can I check the actual unique ID being used? The kernel reports a blank one, but I think that just means to use the hostname. We could obviously set a unique one if that would be useful.
>>>>
>>>>> On Apr 13, 2021, at 11:35 AM, Benjamin Coddington <[email protected]> wrote:
>>>>>
>>>>> It would be interesting to know why your clients are failing to reclaim their locks. Something is misconfigured. What server are you using, and is there anything fancy on the server-side (like HA)? Is it possible that you have clients with the same nfs4_unique_id?
>>>>>
>>>>> Ben
>>>>>
>>>>> On 13 Apr 2021, at 11:17, [email protected] wrote:
>>>>>
>>>>>> many, though not all, of the problems are “lock reclaim failed”.
>>>>>>
>>>>>>> On Apr 13, 2021, at 10:52 AM, Patrick Goetz <[email protected]> wrote:
>>>>>>>
>>>>>>> I use NFS 4.2 with Ubuntu 18/20 workstations and Ubuntu 18/20 servers and haven't had any problems.
>>>>>>>
>>>>>>> Check your configuration files; the last time I experienced something like this it's because I inadvertently used the same fsid on two different exports. Also recommend exporting top level directories only. Bind mount everything you want to export into /srv/nfs and only export those directories. According to Bruce F. this doesn't buy you any security (I still don't understand why), but it makes for a cleaner system configuration.
>>>>>>>
>>>>>>> On 4/13/21 9:33 AM, [email protected] wrote:
>>>>>>>> I am in charge of a large computer science dept computing infrastructure. We have a variety of student and develo9pment users. If there are problems we’ll see them.
>>>>>>>> We use an Ubuntu 20 server, with NVMe storage.
>>>>>>>> I’ve just had to move Centos 7 and Ubuntu 18 to use NFS 4.0. We had hangs with NFS 4.1 and 4.2. Files would appear to be locked, although eventually the lock would time out. It’s too soon to be sure that moving back to NFS 4.0 will fix it. Next is either NFS 3 or disabling delegations on the server.
>>>>>>>> Are there known versions of NFS that are safe to use in production for various kernel versions? The one we’re most interested in is Ubuntu 20, which can be anything from 5.4 to 5.8.
>>>>>
>>>
>>>
>>>
>>
>> --
>> Chuck Lever
>>
>>
>>
>

--
Chuck Lever



2021-04-14 16:35:33

by Charles Hedrick

[permalink] [raw]
Subject: Re: safe versions of NFS

Sure. We’re hoping to move to a new enough kernel that it’s OK. I wasn’t expecting this list to fix problems, but I thought there might be information in the community about the status in various kernel releases, to guide us in what we need to do. If not, we’ll do our own testing.

After looking at the features and our usage pattern, I’m not interested in turning off delegation. If 4.0 doesn’t have problems (and so far it seems that it doesn’t), I think 4.0 with delegations is going to work better for us than 4.1 without delegations.
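To confirm which NFS version each active mount actually negotiated after forcing 4.0, something like the following could be run on each client. This is a minimal sketch, assuming the Linux /proc/mounts layout in which NFS mounts carry a vers= option; the function name nfs_mount_versions is made up for illustration, and it only sees mounts still listed in /proc/mounts.

```python
def nfs_mount_versions(mounts_text):
    """Map mount point -> NFS version parsed from /proc/mounts-style text.

    Assumes each line is: device mountpoint fstype options dump pass,
    and that NFS mounts expose the negotiated version as vers=X in options.
    """
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        vers = "unknown"
        for opt in fields[3].split(","):
            if opt.startswith("vers="):
                vers = opt[len("vers="):]
        versions[fields[1]] = vers
    return versions
```

Usage on a client would be along the lines of nfs_mount_versions(open("/proc/mounts").read()), then flagging any mount point whose value is not "4.0".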

From a practical point of view, getting a problem fixed is really difficult when it is intermittent, and the lead time for fixes to show up is measured in years. I think you’ll find that most sysadmins would rather find a workaround than fix the problem. I often don’t have vendor support, in part because when I do, difficult problems turn into an infinite discussion that almost never terminates with a fix.

> On Apr 14, 2021, at 10:15 AM, Chuck Lever III <[email protected]> wrote:
>
> Good morning Charles,
>
>> On Apr 13, 2021, at 3:40 PM, [email protected] wrote:
>>
>> This is from Centos 7.9, with all file systems mounted via NFS 4.0 (in theory — there may be 4.2 that were unmounted —lazy):
>
> There isn't much linux-nfs@ can do about problems in distributor
> kernels, and it's very possible the issue is already fixed in a
> more recent kernel. You did suggest that v5.8 seemed more solid,
> for instance, and my 30-second Googling suggests that Ubuntu 18
> is based on 4.15, which is ancient history for us upstream code
> monkeys.
>
> I recommend working with your Linux distributor to start root
> cause analysis and to let them document the issue properly. If
> they find that the problem is not already addressed in a newer
> kernel, then bring it back here to linux-nfs@.
>
>
> Sidebar: I find it interesting that the "just disable delegation"
> advice is still floating around the interwebs. And unfortunately
> it is probably still effective in some cases. The community
> should make an effort to nail those down and squash them, IMO.
>
>
>>> On Apr 13, 2021, at 1:24 PM, Chuck Lever III <[email protected]> wrote:
>>>
>>>
>>>
>>>> On Apr 13, 2021, at 12:23 PM, Benjamin Coddington <[email protected]> wrote:
>>>>
>>>> (resending this as it bounced off the list - I accidentally embedded HTML)
>>>>
>>>> Yes, if you're pretty sure your hostnames are all different, the client_ids
>>>> should be different. For v4.0 you can turn on debugging (rpcdebug -m nfs -s
>>>> proc) and see the client_id in the kernel log in lines that look like: "NFS
>>>> call setclientid auth=%s, '%s'\n", which will happen at mount time, but it
>>>> doesn't look like we have any debugging for v4.1 and v4.2 for EXCHANGE_ID.
>>>>
>>>> You can extract it via the crash utility, or via systemtap, or by doing a
>>>> wire capture, but nothing that's easily translated to running across a large
>>>> number of machines. There's probably other ways, perhaps we should tack
>>>> that string into the tracepoints for exchange_id and setclientid.
>>>>
>>>> If you're interested in troubleshooting, wire capture's usually the most
>>>> informative. If the lockup events all happen at the same time, there
>>>> might be some network event that is triggering the issue.
>>>>
>>>> You should expect NFSv4.1 to be rock-solid. Its rare we have reports
>>>> that it isn't, and I'd love to know why you're having these problems.
>>>
>>> I echo that: NFSv4.1 protocol and implementation are mature, so if
>>> there are operational problems, it should be root-caused.
>>>
>>> NFSv4.1 uses a uniform client ID. That should be the "good" one,
>>> not the NFSv4.0 one that has a non-zero probability of collision.
>>>
>>> Charles, please let us know if there are particular workloads that
>>> trigger the lock reclaim failure. A narrow reproducer would help
>>> get to the root issue quickly.
>>>
>>>
>>>> Ben
>>>>
>>>> On 13 Apr 2021, at 11:38, [email protected] wrote:
>>>>
>>>>> The server is ubuntu 20, with a ZFS file system.
>>>>>
>>>>> I don’t set the unique ID. Documentation claims that it is set from the hostname. They will surely be unique, or the whole world would blow up. How can I check the actual unique ID being used? The kernel reports a blank one, but I think that just means to use the hostname. We could obviously set a unique one if that would be useful.
>>>>>
>>>>>> On Apr 13, 2021, at 11:35 AM, Benjamin Coddington <[email protected]> wrote:
>>>>>>
>>>>>> It would be interesting to know why your clients are failing to reclaim their locks. Something is misconfigured. What server are you using, and is there anything fancy on the server-side (like HA)? Is it possible that you have clients with the same nfs4_unique_id?
>>>>>>
>>>>>> Ben
>>>>>>
>>>>>> On 13 Apr 2021, at 11:17, [email protected] wrote:
>>>>>>
>>>>>>> many, though not all, of the problems are “lock reclaim failed”.
>>>>>>>
>>>>>>>> On Apr 13, 2021, at 10:52 AM, Patrick Goetz <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I use NFS 4.2 with Ubuntu 18/20 workstations and Ubuntu 18/20 servers and haven't had any problems.
>>>>>>>>
>>>>>>>> Check your configuration files; the last time I experienced something like this it's because I inadvertently used the same fsid on two different exports. Also recommend exporting top level directories only. Bind mount everything you want to export into /srv/nfs and only export those directories. According to Bruce F. this doesn't buy you any security (I still don't understand why), but it makes for a cleaner system configuration.
>>>>>>>>
>>>>>>>> On 4/13/21 9:33 AM, [email protected] wrote:
>>>>>>>>> I am in charge of a large computer science dept computing infrastructure. We have a variety of student and develo9pment users. If there are problems we’ll see them.
>>>>>>>>> We use an Ubuntu 20 server, with NVMe storage.
>>>>>>>>> I’ve just had to move Centos 7 and Ubuntu 18 to use NFS 4.0. We had hangs with NFS 4.1 and 4.2. Files would appear to be locked, although eventually the lock would time out. It’s too soon to be sure that moving back to NFS 4.0 will fix it. Next is either NFS 3 or disabling delegations on the server.
>>>>>>>>> Are there known versions of NFS that are safe to use in production for various kernel versions? The one we’re most interested in is Ubuntu 20, which can be anything from 5.4 to 5.8.

2021-10-05 19:46:52

by Charles Hedrick

[permalink] [raw]
Subject: more problems with NFS. sort of repeatable problem with vmplayer

We just found a nearly repeatable problem. If you run vmplayer (a desktop VM system from VMware) with its VM storage on NFS, the system eventually locks up, at least some of the time. It happens consistently for one user, and I just saw it.

When we told the user for whom it is consistent to move his VMs to local storage, the problem went away.

I tried running vmplayer. Shortly after starting to create a new VM, vmplayer hung. I had another window with a shell. I went into the directory with the VM files and did “ls -ltrc”. It didn’t quite hang, but took about a minute to finish. I also saw log entries from VMware complaining that disk operations took several seconds.

We saw this problem consistently last semester, though I didn’t realize there was a connection with vmplayer (if there was one). We fixed it by forcing mounts to use NFS 4.0. Since delegations are now disabled on our server, I’m assuming that the problem is locking. We don’t normally use locking a lot, but I believe that VMware uses it extensively.

The problem occurs on Ubuntu 20.04 with both the normal (5.4) and HWE (5.11) kernels.

Any thoughts? At the moment I’m tempted to force 4.0, but I’d like to be able to use 4.2 at some point. Since it still happens with 5.11 it doesn’t look good. I’m willing to try a more recent kernel if it’s likely to help.

We’re probably an unusual installation. We’re a CS department, with researchers and also a large time-sharing environment for students (spread across many machines, with a graphical interface using Xrdb, etc). Our people use every piece of software under the sun.

Client and server are both Ubuntu 20.04. Server is on ZFS with NVMe storage.
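
For readers following along: forcing a version as described above is normally done with the vers= mount option (for example "mount -t nfs -o vers=4.0 server:/export /mnt", where the server name and paths are placeholders). To confirm which version each client mount actually negotiated, something like the sketch below can help. It assumes the usual /proc/mounts layout, where NFS mounts carry a vers= option; nfs_mount_versions is a hypothetical helper name, not part of nfs-utils.

```shell
#!/bin/sh
# Hypothetical helper: list each NFS mount point with the protocol
# version taken from its vers= mount option.
# Reads a mounts-format file; defaults to /proc/mounts (assumed layout:
# device, mount point, fstype, comma-separated options, ...).
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        v = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/)
                v = substr(opts[i], 6)   # text after "vers="
        print $2, v
    }' "${1:-/proc/mounts}"
}
```

Running nfs_mount_versions with no argument on a client prints one line per NFS mount, which makes it easy to spot a mount that is still on 4.2 after a change to fstab or the automounter maps.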

2021-10-11 14:47:11

by J. Bruce Fields

[permalink] [raw]
Subject: Re: more problems with NFS. sort of repeatable problem with vmplayer

On Tue, Oct 05, 2021 at 03:46:21PM -0400, Charles Hedrick wrote:
> We just found a nearly repeatable problem. If you run vmplayer (a desktop VM system from VMware) with its VM storage on NFS, the system eventually locks up, at least some of the time. It happens consistently for one user, and I just saw it.

Could you explain what you mean by "locks up"? Is one application
hanging, or is the whole machine unresponsive until rebooted?

> When we told the user for whom it is consistent to move his VMs to local storage, the problem went away.
>
> I tried running vmplayer. Shortly after starting to create a new VM, vmplayer hung. I had another window with a shell. I went into the directory with the VM files and did “ls -ltrc”. It didn’t quite hang, but took about a minute to finish. I also saw log entries from VMware complaining that disk operations took several seconds.

Any kernel messages in the log around that time?

> We saw this problem last semester consistently, though I didn’t realize
> a connection with vmplayer (if it existed). We fixed it by forcing
> mounts to use NFS 4.0. Since delegations are now disabled on our
> server, I’m assuming that the problem is locking. We don’t normally
> use locking a lot, but I believe that VMware uses it extensively.

So you're normally using NFSv4.2?

--b.

>
> The problem occurs on Ubuntu 20.04 with both the normal (5.4) and HWE
> (5.11) kernels.
>
> Any thoughts? At the moment I’m tempted to force 4.0, but I’d like to
> be able to use 4.2 at some point. Since it still happens with 5.11 it
> doesn’t look good. I’m willing to try a more recent kernel if it’s
> likely to help.
>
> We’re probably an unusual installation. We’re a CS department, with
> researchers and also a large time-sharing environment for students
> (spread across many machines, with a graphical interface using Xrdb,
> etc). Our people use every piece of software under the sun.
>
> Client and server are both Ubuntu 20.04. Server is on ZFS with NVMe
> storage.

2021-10-15 03:01:44

by NeilBrown

[permalink] [raw]
Subject: Re: more problems with NFS. sort of repeatable problem with vmplayer

On Wed, 06 Oct 2021, Charles Hedrick wrote:

>
> I tried running vmplayer. Shortly after starting to create a new VM,
> vmplayer hung. I had another window with a shell. I went into the
> directory with the vm files and did “ls -ltrc”. It didn’t quite hang,
> but took about a minute to finish. I also saw log entries from VMware
> complaining that disk operations took several seconds.

Useful information to provide when a process appears to hang on NFS
includes:

- cat /proc/$PID/stack

- rpcdebug -m nfs -s all; rpcdebug -m rpc -s all ; sleep 2 ;
  rpcdebug -m rpc -c all; rpcdebug -m nfs -c all
  then collect kernel logs

- tcpdump -w filename.pcap -s 0 -c 1000 port 2049
  and compress filename.pcap and put it somewhere we can find it.

- trace-cmd record -e 'nfs:*' sleep 2
  trace-cmd report > filename
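
The steps in the list above could be gathered into one small helper, sketched here under some assumptions: nfs_hang_commands is a hypothetical name, the output file names are placeholders, "collect kernel logs" is taken to mean saving dmesg output, and gzip is one arbitrary choice for the compression step. The function only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the diagnostic commands for
# one hung process. $1 is the PID of the hung process, $2 an output
# directory; both are supplied by the caller.
nfs_hang_commands() {
    pid=$1
    outdir=$2
    cat <<EOF
mkdir -p $outdir && cd $outdir
# kernel stack of the hung process
cat /proc/$pid/stack > stack.txt
# enable NFS and RPC debug messages for two seconds, then disable them
rpcdebug -m nfs -s all; rpcdebug -m rpc -s all; sleep 2; rpcdebug -m rpc -c all; rpcdebug -m nfs -c all
# assumption: dmesg is where the kernel log is collected from
dmesg > kernel.log
# capture 1000 packets of NFS traffic, then compress (gzip is an assumption)
tcpdump -w nfs.pcap -s 0 -c 1000 port 2049
gzip nfs.pcap
# trace NFS events for two seconds and save a text report
trace-cmd record -e 'nfs:*' sleep 2
trace-cmd report > trace.txt
EOF
}
```

For example, "nfs_hang_commands 1234 /tmp/nfsdebug" prints the command list for PID 1234, and piping that output into "sudo sh" would run it. Printing first keeps the debug switches and the packet capture under the operator's control.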

>
> We’re probably an unusual installation. We’re a CS department, with
> researchers and also a large time-sharing environment for students
> (spread across many machines, with a graphical interface using Xrdb,
> etc). Our people use every piece of software under the sun.

Probably not all that unusual. There certainly are lots of large and
varied NFS sites out there.

>
> Client and server are both Ubuntu 20.04. Server is on ZFS with NVMe storage.

If it is possible to reproduce without ZFS, that would provide useful
information.
I don't think it is *likely* that ZFS causes the problem, but neither
would I be surprised if it did.

NeilBrown

2021-10-15 06:02:01

by Charles Hedrick

[permalink] [raw]
Subject: Re: more problems with NFS. sort of repeatable problem with vmplayer

Thanks. We’re now into classes, so our priorities are supporting students. Unfortunately this isn’t the only issue we have to resolve.

I’ll try to get this info, just not today. We asked the course using vmplayer to put their VM in local storage, or to use kvm. So far no problems. The default kvm setup doesn’t use direct I/O.

Does direct I/O on the client get turned into direct I/O on the server? If so I should look into the state of direct I/O support in the version of ZFS we have.

> On Oct 14, 2021, at 6:28 PM, NeilBrown <[email protected]> wrote:
>
> On Wed, 06 Oct 2021, Charles Hedrick wrote:
>
>>
>> I tried running vmplayer. Shortly after starting to create a new VM,
>> vmplayer hung. I had another window with a shell. I went into the
>> directory with the vm files and did “ls -ltrc”. It didn’t quite hang,
>> but took about a minute to finish. I also saw log entries from VMware
>> complaining that disk operations took several seconds.
>
> Useful information to provide when a process appears to hang on NFS
> includes:
>
> - cat /proc/$PID/stack
>
> - rpcdebug -m nfs -s all; rpcdebug -m rpc -s all ; sleep 2 ;
>   rpcdebug -m rpc -c all; rpcdebug -m nfs -c all
>   then collect kernel logs
>
> - tcpdump -w filename.pcap -s 0 -c 1000 port 2049
>   and compress filename.pcap and put it somewhere we can find it.
>
> - trace-cmd record -e 'nfs:*' sleep 2
>   trace-cmd report > filename
>
>>
>> We’re probably an unusual installation. We’re a CS department, with
>> researchers and also a large time-sharing environment for students
>> (spread across many machines, with a graphical interface using Xrdb,
>> etc). Our people use every piece of software under the sun.
>
> Probably not all that unusual. There certainly are lots of large and
> varied NFS sites out there.
>
>>
>> Client and server are both Ubuntu 20.04. Server is on ZFS with NVMe storage.
>
> If it is possible to reproduce without ZFS, that would provide useful
> information.
> I don't think it is *likely* that ZFS causes the problem, but neither
> would I be surprised if it did.
>
> NeilBrown