2012-03-27 15:54:05

by DENIEL Philippe

Subject: Client says "Stale NFS file handle" but server does not return NFS3ERR_STALE

Hi,

I have the following issue:
The client does a classic "mount -o vers=3,lock server:/path /mnt". The
server is my nfs-ganesha user-space server.
Then I start a long-running "dd if=/dev/zero of=./foo..." inside a
directory under the mount point. The other dd parameters (such as bs= or
count=) make no difference. While dd is running, I kill the daemon and
restart it a couple of seconds later. Then I kill the dd (CTRL-C from the
console). The dd command returns an error (which is logical; it is an I/O
error or Bad File Descriptor), but I see something else that is quite
strange:
- if I run ls from the current directory (where I ran 'dd'), I get the
message "ls: cannot open directory .: Stale NFS file handle"
- in wireshark, I see no NFS3ERR_STALE
The wireshark capture shows that the server shutdown happened between
a WRITE reply and the related COMMIT call (I received the COMMIT call as
the server restarted).
Apparently, the client itself decided to return "Stale NFS file handle" to
the application; the server returns no error, and all replies are NFS3_OK.
What should I be looking for to fix this bug? (It is probably on my
side.)

Regards

Philippe




2012-03-27 16:45:16

by Myklebust, Trond

Subject: Re: Client says "Stale NFS file handle" but server does not return NFS3ERR_STALE

Are you perhaps returning blatantly wrong attributes? I would expect
that returning the wrong fileid, or an incorrect file type in either a
GETATTR or a READDIR might cause a situation such as what you describe.

Cheers
  Trond

On Tue, 2012-03-27 at 18:28 +0200, DENIEL Philippe wrote:
> More information:
> if I do "echo 32767 > /proc/sys/sunrpc/nfs_debug", I can see this in syslog:
>     Mar 27 18:11:15 aury63 kernel: [32430.065930] NFS: nfs_lookup_revalidate(/a) is invalid
>
> Any Idea ?
>
>     Philippe

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
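
A client can decide on its own that a handle is stale when the attributes in
an otherwise successful (NFS3_OK) reply contradict what it has cached, for
example a changed fileid or file type for the same file handle. The sketch
below is a simplified model of that idea, not the Linux client code: the
Attrs class and the revalidate() function are hypothetical names used only
for illustration, and the real client checks more than these two fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attrs:
    """Subset of NFSv3 fattr3 fields relevant to this illustration."""
    fileid: int  # fattr3.fileid
    ftype: str   # fattr3.type, e.g. "NF3REG" or "NF3DIR"


def revalidate(cached: Attrs, fresh: Attrs) -> str:
    """Compare cached attributes with those from a new NFS3_OK reply.

    Returns "ESTALE" if the reply describes a different object than the
    one the client cached for this file handle, otherwise "OK".
    (Hypothetical helper; a model of the behaviour, not kernel code.)
    """
    if cached.fileid != fresh.fileid:
        return "ESTALE"  # same handle now names another fileid
    if cached.ftype != fresh.ftype:
        return "ESTALE"  # same handle changed file type
    return "OK"


if __name__ == "__main__":
    before_restart = Attrs(fileid=1234, ftype="NF3DIR")
    after_restart = Attrs(fileid=9999, ftype="NF3DIR")
    print(revalidate(before_restart, after_restart))
```

On the server side, this suggests comparing the fileid and type returned for
the same directory handle before and after the daemon restart; if they
differ, no NFS3ERR_STALE needs to appear on the wire for the client to
report a stale handle.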

2012-03-27 16:28:13

by DENIEL Philippe

Subject: Re: Client says "Stale NFS file handle" but server does not return NFS3ERR_STALE

More information:
if I do "echo 32767 > /proc/sys/sunrpc/nfs_debug", I can see this in syslog:
Mar 27 18:11:15 aury63 kernel: [32430.065930] NFS: nfs_lookup_revalidate(/a) is invalid

Any idea?

Philippe
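
With nfs_debug enabled as above, the client log can be filtered for failed
lookup revalidations to see which paths the client considers invalid. The
following is a minimal sketch that assumes each kernel message sits on a
single syslog line in the format shown above; invalid_lookups() is a
hypothetical helper name, not part of any NFS tooling.

```python
import re

# Matches the debug message quoted above, e.g.
# "NFS: nfs_lookup_revalidate(/a) is invalid"
PATTERN = re.compile(r"nfs_lookup_revalidate\((?P<path>[^)]*)\) is invalid")


def invalid_lookups(lines):
    """Return the paths named in 'is invalid' revalidation messages."""
    paths = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    sample = [
        "Mar 27 18:11:15 aury63 kernel: [32430.065930] NFS: "
        "nfs_lookup_revalidate(/a) is invalid",
        "Mar 27 18:11:16 aury63 kernel: [32431.000000] unrelated message",
    ]
    print(invalid_lookups(sample))
```

Reading lines from the actual syslog file instead of the sample list gives
the set of dentries that failed revalidation around the time of the restart.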
