2019-03-20 12:35:08

by James Pearson

Subject: NFSv3 mount hangs on CentOS 7.4

We have a number of specialized workstations running CentOS 7.4
(kernel 3.10.0-693.11.1.el7.x86_64) that are both NFSv3 clients and
servers, i.e. they can mount each other's exports via autofs.
Occasionally, however, the mount process hangs.

Each workstation has 4 exports. 2 of the exports are subdirectories of
the root disk and are exported with the 'sync' option; the other
2 exports are the mount points of separate file systems and are
exported with the 'async' option. All are XFS file systems.

The mount hang only happens with the exports from the root
disk. Once a mount hangs on one of these exports, a mount of the
other export from the root disk also hangs, but mounting the other 2
exports is fine.

The problem can occur on any of the workstations when mounting any of
the other workstations' exports.

We can temporarily fix the problem by running 'exportfs -f' on the server.
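Since 'exportfs -f' flushes the kernel's export table, it may be worth
recording what the server's caches contain while a mount is hung, before
flushing them. The following is only a sketch: it assumes the usual
cache files under /proc/net/rpc on the server, and the helper name
dump_nfsd_caches is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the NFS server's kernel caches so their state can be
# compared before and after 'exportfs -f'.
# Assumption: the caches are exposed under /proc/net/rpc; the directory
# can be overridden with the first argument.
dump_nfsd_caches() {
    rpc_dir="${1:-/proc/net/rpc}"
    for cache in auth.unix.ip nfsd.export nfsd.fh; do
        echo "== ${cache} =="
        if [ -r "${rpc_dir}/${cache}/content" ]; then
            cat "${rpc_dir}/${cache}/content"
        else
            # File missing or not readable (e.g. not running as root)
            echo "(not readable)"
        fi
    done
}
```

Run as root on the server, the output of dump_nfsd_caches could be
saved to a file while the hang is present and again after
'exportfs -f', and the two files compared with diff.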

Running tcpdump on the client (or server) while attempting a mount
in this state shows that the client sends an FSINFO call but doesn't
get a reply. It then retransmits the FSINFO call and sends another
FSINFO call about 18 seconds later, again with no replies, which I'm
guessing is significant?
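For anyone wanting to reproduce the capture, a command along these
lines should work. This is a sketch rather than the exact command used
here: the helper name nfs_capture_cmd is hypothetical, and it assumes
the NFS traffic of interest is on the standard port 2049.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command line that captures full
# packets of NFS traffic (port 2049) between this host and one peer.
# Arguments: network interface, peer host, output pcap file.
nfs_capture_cmd() {
    iface="$1"
    peer="$2"
    outfile="$3"
    printf 'tcpdump -i %s -s 0 -w %s host %s and port 2049\n' \
        "$iface" "$outfile" "$peer"
}
```

For example, 'nfs_capture_cmd eth0 ws02 /tmp/nfs.pcap' (interface and
host names are placeholders) prints a command that can be run as root
while the mount is attempted; the resulting file can then be opened in
wireshark to inspect the FSINFO calls.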

Unfortunately, it is not straightforward to upgrade the kernel on
these workstations, so it is difficult to test whether a newer kernel
would fix the issue ...

Is anyone able to suggest any other debugging steps we can take to
find out what the issue might be?

Thanks

James Pearson