2014-06-11 16:44:03

by Mkrtchyan, Tigran

Subject: pNFS client behavior confirmation



Dear pnfs developers,

I would like to confirm a client behavior that has been hurting us for years.

Previously, if the client was unable to connect to a DS for some reason, that DS was
blacklisted, and the only way to whitelist it again was to reboot the client. With RHEL7
(and, I believe, with the upstream kernel), the client 'forgets' about the bad DS after
2 minutes and uses it again. This behavior was tested and confirmed during the June bakeathon.
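
To make the behavior described above concrete, here is a minimal sketch of a blacklist whose entries expire. It is only an illustration of the observed behavior, not the kernel's implementation: the class name, method names, and the injectable clock are all hypothetical, and the 120-second value is simply the 2 minutes observed at the bakeathon.

```python
import time

# Assumption: 120 s mirrors the ~2 minutes observed above; it is not
# taken from any kernel source or tunable.
RETRY_AFTER_SECONDS = 120


class DsBlacklist:
    """Hypothetical model of a DS blacklist whose entries expire."""

    def __init__(self, retry_after=RETRY_AFTER_SECONDS, clock=time.monotonic):
        self._retry_after = retry_after
        self._clock = clock          # injectable so the timeout can be tested
        self._failed_at = {}         # DS identifier -> time of last failure

    def mark_unavailable(self, ds):
        """Record that connecting to this DS failed just now."""
        self._failed_at[ds] = self._clock()

    def is_usable(self, ds):
        """Return True if the DS was never marked bad or its entry has expired."""
        failed_at = self._failed_at.get(ds)
        if failed_at is None:
            return True
        if self._clock() - failed_at >= self._retry_after:
            # The client 'forgets' about the bad DS and may try it again.
            del self._failed_at[ds]
            return True
        return False
```

In this model the old behavior corresponds to an entry that never expires, so only a restart (a fresh, empty blacklist) makes the DS usable again.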

Missing bits:

i) it takes the client 6 minutes to fall back to the MDS, i.e. the time between LAYOUTGET and the first WRITE to the MDS
ii) there are no log entries explaining why the client has decided to use the MDS

Thanks,
Tigran.