Hello NFS developers,
I've written to this list before [1],[2] concerning uninterruptible hung
tasks in clients using NFSv4.0 with Kerberos. I have also written
scripts (which can be cloned from [3]) that help reproduce the hangs by
configuring two virtual machines with the required setup, together with
a test program that triggers the hangs rather quickly (see [2] for
details).
Meanwhile, I have been able to do some bisecting of kernel sources to
find a commit which exposes the hangs. It seems that since commit
2aca5b869ace67a63aab895659e5dc14c33a4d6e
SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
(introduced with v3.18-rc1) the uninterruptible hangs occur. When I
revert this commit, then I do not observe the uninterruptible hangs.
I've tested this on Ubuntu 16.04's 4.4 kernel and Debian 9's 4.9 kernel
and several stock kernels.
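
For context, my (possibly imperfect) reading of that commit is that it
makes the RPC client actually honor the RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
creation flag (and propagate it to cloned clients), so that over TCP the
client waits for the server's reply instead of hitting a retransmit
timeout. Roughly paraphrased as a sketch; the helper name below is made
up, the real change sits directly in the RPC client setup code in
net/sunrpc/clnt.c:

/*
 * Paraphrased gist of commit 2aca5b869ace (SUNRPC); not a literal diff.
 * apply_no_retrans_timeout() is a made-up helper name for illustration.
 */
static void apply_no_retrans_timeout(struct rpc_clnt *clnt,
                                     const struct rpc_create_args *args)
{
        /*
         * Before the commit this flag was passed by callers (e.g. the
         * NFSv4 client) but not acted on in this path, so requests could
         * still run into retransmit timeouts.  With the commit the flag
         * takes effect: the client keeps waiting for the server's reply
         * (or a connection break) instead of timing out and retransmitting.
         */
        if (args->flags & RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT)
                clnt->cl_noretranstimeo = 1;
}

If that reading is right, it would also explain why reverting the commit
only hides the issue: with the flag ignored, a request eventually times
out and is retransmitted, whereas with the flag honored the task waits
(uninterruptibly) for a reply that never arrives.
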
In our group at the university, we have about 15 desktop machines and 70
nodes in a SLURM cluster. Without reverting the commit, we saw on
average one machine per day lock up with uninterruptible hung tasks
reported by the kernel. For about six weeks now, we have been running
only kernels with the commit reverted (i.e., Debian's/Ubuntu's kernels
recompiled after reverting the patch), and we have not had any
NFS-related machine lockups so far.
I'm not claiming that the mentioned commit is the cause of the problem;
I think it exposes the problem. The problem is also present in current
kernels. Unfortunately, there seems to be another problem which can be
triggered by my test program from [3]. Since commit
9b30889c548a4d45bfe6226e58de32504c1d682f
SUNRPC: Ensure we always close the socket after a connection shuts down
(introduced with v4.16-rc1) it is very likely that the system dies due
to an out-of-memory condition, i.e., at some point the kernel consumes
all the memory and the OOM killer kills all user processes. When this
commit is reverted, I can observe the uninterruptible hung tasks again
(with kernels up to 4.18-rc2).
Since I have no expertise in the NFS client implementation, I'm still
hoping that experts on this list have an idea how to fix the NFS
client's behavior.
Regards,
Armin
[1] https://marc.info/?l=linux-nfs&m=150620442017672
[2] https://marc.info/?l=linux-nfs&m=152396752525579
[3] https://gitlab.infosun.fim.uni-passau.de/groessli/nfs-krb5-vms

On Sun, 2018-06-24 at 22:30 +0200, Armin Größlinger wrote:
> Meanwhile, I have been able to do some bisecting of kernel sources to
> find a commit which exposes the hangs. It seems that since commit
>
> 2aca5b869ace67a63aab895659e5dc14c33a4d6e
> SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
>
> (introduced with v3.18-rc1) the uninterruptible hangs occur. When I
> revert this commit, then I do not observe the uninterruptible hangs.
> I've tested this on Ubuntu 16.04's 4.4 kernel and Debian 9's 4.9
> kernel and several stock kernels.

That's the patch that implements this part of the NFSv4 spec:
https://tools.ietf.org/html/rfc7530#section-3.1.1

So are you seeing the connection break when these hangs occur? If the
connection hasn't broken, then the problem is more likely to be the
server silently dropping requests, and hence failing to meet the
obligation to reply to the client's RPC call (as spelled out in the
above section of the spec).

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

On Sun, 2018-06-24 at 22:56, Trond Myklebust wrote:
> On Sun, 2018-06-24 at 22:30 +0200, Armin Größlinger wrote:
>> Meanwhile, I have been able to do some bisecting of kernel sources to
>> find a commit which exposes the hangs. It seems that since commit
>>
>> 2aca5b869ace67a63aab895659e5dc14c33a4d6e
>> SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
>>
>> (introduced with v3.18-rc1) the uninterruptible hangs occur. When I
>> revert this commit, then I do not observe the uninterruptible hangs.
>> I've tested this on Ubuntu 16.04's 4.4 kernel and Debian 9's 4.9
>> kernel
>> and several stock kernels.
>
> That's the patch that implements this part of the NFSv4 spec:
> https://tools.ietf.org/html/rfc7530#section-3.1.1
I don't think the commit I referred to is the problem,
I think it exposes the underlying problem.
> So are you seeing the connection break when these hangs occur?
Sometimes the server (with Debian's 4.9.88 kernel) logs
[ 194.473842] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
but not always when the hang occurs. Is there another way to check if
the connection is "broken"?
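
One crude client-side check I can do myself is to watch the state of
the TCP connection to port 2049 while a task hangs, e.g. by polling
/proc/net/tcp with a small helper like the sketch below (IPv4 only,
default NFS port 2049 assumed, and the address decoding assumes a
little-endian host, since /proc/net/tcp prints the raw __be32 address):

/*
 * nfs_conn_state.c - rough sketch, not a polished tool: list this host's
 * TCP connections whose remote port is 2049 (the default NFS port) and
 * print their state, by parsing /proc/net/tcp.
 *
 * Build: cc -o nfs_conn_state nfs_conn_state.c
 */
#include <stdio.h>

static const char *tcp_state(unsigned int st)
{
        switch (st) {
        case 0x01: return "ESTABLISHED";
        case 0x02: return "SYN_SENT";
        case 0x03: return "SYN_RECV";
        case 0x04: return "FIN_WAIT1";
        case 0x05: return "FIN_WAIT2";
        case 0x06: return "TIME_WAIT";
        case 0x07: return "CLOSE";
        case 0x08: return "CLOSE_WAIT";
        case 0x09: return "LAST_ACK";
        case 0x0a: return "LISTEN";
        case 0x0b: return "CLOSING";
        default:   return "UNKNOWN";
        }
}

static void print_addr(unsigned int addr, unsigned int port)
{
        /* e.g. 0100007F -> 127.0.0.1 on a little-endian machine */
        printf("%u.%u.%u.%u:%u", addr & 0xff, (addr >> 8) & 0xff,
               (addr >> 16) & 0xff, (addr >> 24) & 0xff, port);
}

int main(void)
{
        FILE *f = fopen("/proc/net/tcp", "r");
        char line[512];

        if (!f) {
                perror("/proc/net/tcp");
                return 1;
        }
        fgets(line, sizeof(line), f);   /* skip the header line */
        while (fgets(line, sizeof(line), f)) {
                unsigned int lip, lport, rip, rport, st;

                /* fields: "sl local_address rem_address st ..." (hex) */
                if (sscanf(line, " %*d: %8X:%4X %8X:%4X %2X",
                           &lip, &lport, &rip, &rport, &st) != 5)
                        continue;
                if (rport != 2049)      /* keep only the NFS connection(s) */
                        continue;
                print_addr(lip, lport);
                printf(" -> ");
                print_addr(rip, rport);
                printf("  %s\n", tcp_state(st));
        }
        fclose(f);
        return 0;
}

Running this in a loop (e.g. once per second) while the test program is
stuck should show whether the connection leaves ESTABLISHED or
disappears, which, if I understand correctly, is the distinction that
matters here.
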
> If the
> connection hasn't broken, then the problem is more likely to be the
> server silently dropping requests, and hence failing to meet the
> obligation to reply to the client's RPC call (as spelled out in the
> above section of the spec).
Initially (in our group at the university) we observed the problem with
a Nexenta NFS server. I could not reproduce the problem with a FreeBSD
server. In addition, the problem seems to be very timing-sensitive: it
occurs less often when our Nexenta server is under heavier load, and I
cannot reproduce it with my test VMs when I disable KVM acceleration
(so the VMs run 2-5 times slower).
I have now also tried Linux 4.18-rc2 as the NFS server (instead of
4.9.88 from Debian Stretch). In that case I could not observe the
hanging tasks on the client, but the test program seems to "pause" for
30-120 seconds every few iterations (and continues after the pause).
After 2.5 hours, the client's 2 GB of RAM were almost completely
consumed by the kernel (i.e., commands on the shell failed with "cannot
fork: Cannot allocate memory"), so there seems to be a memory leak?
With 4.18-rc2 as NFS client, I still see the OOM killer killing all
processes a few seconds after starting my test program (as mentioned in
my previous email).
With 4.18-rc2 as NFS server, I see many messages like
[ 1098.832570] rpc-srv/tcp: nfsd: got error -104 when sending 232 bytes
- shutting down socket
[ 1137.164829] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1211.284693] rpc-srv/tcp: nfsd: got error -104 when sending 232 bytes
- shutting down socket
[ 1236.512956] rpc-srv/tcp: nfsd: got error -104 when sending 232 bytes
- shutting down socket
[ 1258.140792] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1299.744482] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1372.608731] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1376.272594] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1376.412361] rpc-srv/tcp: nfsd: sent only 204 when sending 232 bytes -
shutting down socket
[ 1386.340604] rpc-srv/tcp: nfsd: got error -104 when sending 232 bytes
- shutting down socket
[ 1406.828262] rpc-srv/tcp: nfsd: got error -32 when sending 232 bytes -
shutting down socket
on the server (but the client keeps running, with the 30-120 second
pauses mentioned above), and the port of the NFS connection changes
frequently (every few seconds), so the connection apparently keeps
getting re-established.
I'm not sure what to try next and whether to blame the server or the
client for the misbehavior.
Regards,
Armin