2008-03-31 22:40:21

by Anirban Sinha

Subject: nfsd restart failures without /proc/fs/nfsd filesystem mounted

Hi:

I am using a system where we do not use the /proc/fs/nfsd filesystem
(for several reasons). I understand that without this filesystem,
nfs-utils does not use the "new cache" mechanism. Even so, an nfsd
restart should still work. However, when I try doing this manually, I
get the following error:

root:my_node:/etc/rc.d/init.d# /sbin/service nfs restart
Shutting down NFS mountd:                                  [  OK  ]
Shutting down NFS daemon:                                  [FAILED]
Shutting down NFS services:                                [  OK  ]
Starting NFS services:                                     [  OK  ]
Starting NFS daemon:                                       [FAILED]
Starting NFS mountd:                                       [  OK  ]

And the kernel log says:
[13:37:37.766844] nfsd: Could not allocate memory read-ahead cache.

This issue is happening on an Intel platform. However, the same
operation succeeds on our MIPS platform (without the nfsd filesystem).
What is also interesting is that once I turn on the /proc/fs/nfsd
filesystem on Intel, the issue seems to go away.

Several weeks back, I reported an issue regarding exportfs -a failure on
an already exported filesystem when the nfsd FS was not used. It turned
out to be a bug in nfs-utils. Did we bump into another bug here? Is
there no other option but to turn on the nfsd filesystem?

Thanks to whoever responds to this ...

Cheers,

Ani





2008-04-01 21:06:29

by J. Bruce Fields

Subject: Re: nfsd restart failures without /proc/fs/nfsd filesystem mounted

On Mon, Mar 31, 2008 at 03:39:04PM -0700, Anirban Sinha wrote:
> Hi:
>
> I am using a system where we do not use the /proc/fs/nfsd filesystem
> (for several reasons). I understand that without this filesystem,
> nfs-utils does not use the "new cache" mechanism. Even so, an nfsd
> restart should still work. However, when I try doing this manually, I
> get the following error:
>
> root:my_node:/etc/rc.d/init.d# /sbin/service nfs restart
> Shutting down NFS mountd:                                  [  OK  ]
> Shutting down NFS daemon:                                  [FAILED]
> Shutting down NFS services:                                [  OK  ]
> Starting NFS services:                                     [  OK  ]
> Starting NFS daemon:                                       [FAILED]
> Starting NFS mountd:                                       [  OK  ]
>
> And the kernel log says:
> [13:37:37.766844] nfsd: Could not allocate memory read-ahead cache.

What kernel version is this?  In the latest
fs/nfsd/vfs.c:nfsd_racache_init(int cache_size):

	raparml = kcalloc(cache_size, sizeof(struct raparms), GFP_KERNEL);

	if (!raparml) {
		printk(KERN_WARNING
			"nfsd: Could not allocate memory read-ahead cache.\n");
		return -ENOMEM;
	}

which is called from fs/nfsd/nfssvc.c:nfsd_svc() as:

	error = nfsd_racache_init(2*nrservs);

where nrservs is the number of server threads.  How many server threads
are you trying to start, and how much memory do you have?

--b.


2008-04-01 21:19:36

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted

Hi:

Thanks for responding.


>> And the kernel log says:
>> [13:37:37.766844] nfsd: Could not allocate memory read-ahead cache.
>
>What kernel version is this?

I am using kernel version 2.6.17.7.


> In the latest
>fs/nfsd/vfs.c:nfsd_racache_init(int cache_size):
>
>	raparml = kcalloc(cache_size, sizeof(struct raparms), GFP_KERNEL);
>
>	if (!raparml) {
>		printk(KERN_WARNING
>			"nfsd: Could not allocate memory read-ahead cache.\n");
>		return -ENOMEM;
>	}
>
>which is called from fs/nfsd/nfssvc.c:nfsd_svc() as:
>
>	error = nfsd_racache_init(2*nrservs);
>
>where nrservs is the number of server threads.  How many server threads
>are you trying to start, and how much memory do you have?


Yea, I have seen that codebase. The configuration file /etc/init.d/nfs creates 8 nfs threads:

root:Zeugma:/etc/init.d# ps -A |grep nfs
 2202 ?        00:00:00 nfsd
 2203 ?        00:00:00 nfsd
 2204 ?        00:00:00 nfsd
 2205 ?        00:00:00 nfsd
 2206 ?        00:00:00 nfsd
 2207 ?        00:00:00 nfsd
 2208 ?        00:00:00 nfsd
 2209 ?        00:00:00 nfsd
root:Zeugma:/etc/init.d# free
             total       used       free     shared    buffers     cached
Mem:        255372      34352     221020          0       2564      18720
-/+ buffers/cache:      13068     242304
Swap:            0          0          0


The funny thing is that the moment I enable nfsd filesystem, the problem seems to go away. Can you try and reproduce the problem by disabling nfsd filesystem in your system?

#> umount /proc/fs/nfsd

Ani

2008-04-01 21:24:45

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted

I tried creating a single nfsd thread but I get the same error from the kernel.

Ani

2008-04-01 22:13:29

by J. Bruce Fields

Subject: Re: nfsd restart failures without /proc/fs/nfsd filesystem mounted

On Tue, Apr 01, 2008 at 02:19:04PM -0700, Anirban Sinha wrote:
> Hi:
>
> Thanks for responding.
>
>
> >> And the kernel log says:
> >> [13:37:37.766844] nfsd: Could not allocate memory read-ahead cache.
> >
> >What kernel version is this?
>
> I am using kernel version 2.6.17.7.
>
>
> > In the latest
> >fs/nfsd/vfs.c:nfsd_racache_init(int cache_size):
> >
> >	raparml = kcalloc(cache_size, sizeof(struct raparms), GFP_KERNEL);
> >
> >	if (!raparml) {
> >		printk(KERN_WARNING
> >			"nfsd: Could not allocate memory read-ahead cache.\n");
> >		return -ENOMEM;
> >	}
> >
> >which is called from fs/nfsd/nfssvc.c:nfsd_svc() as:
> >
> >	error = nfsd_racache_init(2*nrservs);
> >
> >where nrservs is the number of server threads.  How many server threads
> >are you trying to start, and how much memory do you have?
>
>
> Yea, I have seen that codebase. The configuration file /etc/init.d/nfs creates 8 nfs threads:
>
> root:Zeugma:/etc/init.d# ps -A |grep nfs
>  2202 ?        00:00:00 nfsd
>  2203 ?        00:00:00 nfsd
>  2204 ?        00:00:00 nfsd
>  2205 ?        00:00:00 nfsd
>  2206 ?        00:00:00 nfsd
>  2207 ?        00:00:00 nfsd
>  2208 ?        00:00:00 nfsd
>  2209 ?        00:00:00 nfsd
> root:Zeugma:/etc/init.d# free
>              total       used       free     shared    buffers     cached
> Mem:        255372      34352     221020          0       2564      18720
> -/+ buffers/cache:      13068     242304
> Swap:            0          0          0
>
>
> The funny thing is that the moment I enable nfsd filesystem, the problem
> seems to go away.

OK, so write_svc() (hence sys_nfsservctl()) is getting garbage. Hm.
The structure that's passed in to the kernel is:

	struct nfsctl_svc {
		unsigned short		svc_port;
		int			svc_nthreads;
	};

Is it at all possible that userspace and the kernel could disagree about
the layout of that structure?

Can you play with strace or insert some printk's to figure out what's
going on?

> Can you try and reproduce the problem by disabling nfsd filesystem in your
> system?

I haven't tried yet.

--b.

>
> #> umount /proc/fs/nfsd
>
> Ani

2008-04-01 22:54:47

by Steve Dickson

Subject: Re: nfsd restart failures without /proc/fs/nfsd filesystem mounted



Anirban Sinha wrote:
> Hi:
>
> I am using a system where we do not use the /proc/fs/nfsd filesystem
> (for several reasons). I understand that without this filesystem,
> nfs-utils does not use the "new cache" mechanism. Even so, an nfsd
> restart should still work. However, when I try doing this manually, I
> get the following error:
>
> root:my_node:/etc/rc.d/init.d# /sbin/service nfs restart
> Shutting down NFS mountd:                                  [  OK  ]
> Shutting down NFS daemon:                                  [FAILED]
> Shutting down NFS services:                                [  OK  ]
> Starting NFS services:                                     [  OK  ]
> Starting NFS daemon:                                       [FAILED]
> Starting NFS mountd:                                       [  OK  ]

This seems to work with both a 2.6.18 kernel (using nfs-utils-1.0.9) and
a 2.6.25 kernel (using nfs-utils-1.1.2-1).
What version of nfs-utils are you using?

> I am using kernel version 2.6.17.7.
This is a pretty old kernel... any chance of upgrading it?

steved.

2008-04-01 22:57:25

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted


>This seems to work with both a 2.6.18 kernel (using nfs-utils-1.0.9)
>and a 2.6.25 kernel (using nfs-utils-1.1.2-1).
>What version of nfs-utils are you using?
>
>> I am using kernel version 2.6.17.7.
>This is a pretty old kernel... any chance of upgrading it?

No. These days a new kernel comes out every 2-3 months. It's pretty
difficult for us to keep up with every latest and greatest release.

Ani



>
>steved.

2008-04-02 19:29:11

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted



>-----Original Message-----
>From: J. Bruce Fields [mailto:[email protected]]
>Sent: Tuesday, April 01, 2008 3:13 PM
>
>> >which is called from fs/nfsd/nfssvc.c:nfsd_svc() as:
>> >
>> > error = nfsd_racache_init(2*nrservs);
>> >
>> >where nrservs is the number of server threads.  How many server threads
>> >are you trying to start, and how much memory do you have?
>>
>>
>> Yea, I have seen that codebase. The configuration file /etc/init.d/nfs
>> creates 8 nfs threads:
>>
>> root:Zeugma:/etc/init.d# ps -A |grep nfs
>>  2202 ?        00:00:00 nfsd
>>  2203 ?        00:00:00 nfsd
>>  2204 ?        00:00:00 nfsd
>>  2205 ?        00:00:00 nfsd
>>  2206 ?        00:00:00 nfsd
>>  2207 ?        00:00:00 nfsd
>>  2208 ?        00:00:00 nfsd
>>  2209 ?        00:00:00 nfsd
>> root:Zeugma:/etc/init.d# free
>>              total       used       free     shared    buffers     cached
>> Mem:        255372      34352     221020          0       2564      18720
>> -/+ buffers/cache:      13068     242304
>> Swap:            0          0          0
>>
>>
>> The funny thing is that the moment I enable nfsd filesystem, the
>> problem seems to go away.
>
>OK, so write_svc() (hence sys_nfsservctl()) is getting garbage. Hm.


Yea, indeed. I did some digging with printks and found that the number
of server threads is indeed not being passed to the kernel correctly. I
attributed this, as you suggested, to some kind of memory corruption.



>The structure that's passed in to the kernel is:
>
>	struct nfsctl_svc {
>		unsigned short		svc_port;
>		int			svc_nthreads;
>	};
>
>Is it at all possible that userspace and the kernel could disagree
>about the layout of that structure?


I looked at the userspace definition of nfsctl_arg in
support/include/nfs/nfs.h:

struct nfsctl_arg {
	int			ca_version;	/* safeguard */
	union {
		struct nfsctl_svc	u_svc;
		struct nfsctl_client	u_client;
		struct nfsctl_export	u_export;
		struct nfsctl_uidmap	u_umap;
		struct nfsctl_fhparm	u_getfh;
		struct nfsctl_fdparm	u_getfd;
		struct nfsctl_fsparm	u_getfs;
		void			*u_ptr;
	} u;
#define ca_svc		u.u_svc
#define ca_client	u.u_client
#define ca_export	u.u_export
#define ca_umap		u.u_umap
#define ca_getfh	u.u_getfh
#define ca_getfd	u.u_getfd
#define ca_getfs	u.u_getfs
#define ca_authd	u.u_authd
};

As you can see, we have an extra u_ptr member in the union (the same as
in the kernel: include/linux/nfsd/syscall.h).

As an experiment, I removed this member, recompiled nfs-utils and voilà!
The kernel now gets the correct server thread count.

I am a bit puzzled by this, since the u_umap member already contains a
char * and I think adding a void * does not change the alignment of the
union. In the kernel, its presence is important since the kernel does
not have a u_umap member in its nfsctl_arg declaration:

struct nfsctl_arg {
	int			ca_version;	/* safeguard */
	union {
		struct nfsctl_svc	u_svc;
		struct nfsctl_client	u_client;
		struct nfsctl_export	u_export;
		struct nfsctl_fdparm	u_getfd;
		struct nfsctl_fsparm	u_getfs;
		/*
		 * The following dummy member is needed to preserve binary
		 * compatibility on platforms where alignof(void*)>alignof(int).
		 * It's needed because this union used to contain a member
		 * (u_umap) which contained a pointer.
		 */
		void			*u_ptr;
	} u;
#define ca_svc		u.u_svc
#define ca_client	u.u_client
#define ca_export	u.u_export
#define ca_getfd	u.u_getfd
#define ca_getfs	u.u_getfs
};

Ani



2008-04-02 21:25:44

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted

>As you can see, we have an extra u_ptr member in the union (the same as
>in the kernel: include/linux/nfsd/syscall.h).
>
>As an experiment, I removed this member, recompiled nfs-utils and voilà!
>The kernel now gets the correct server thread count.
>
>I am a bit puzzled by this, since the u_umap member already contains a
>char * and I think adding a void * does not change the alignment of the
>union.


Indeed. Actually, by mistake, I was using the new 1.1.2 nfs-utils
binaries and making changes in the old 1.1.0 nfs-utils source. The
removal of the member does not change anything. What is happening is
that an upgrade to the new 1.1.2 nfs-utils binaries solves the problem.
I think you can reproduce the problem at your end simply by unmounting
nfsd filesystem and using 1.1.0 nfs-utils binary.

Cheers,

Ani





2008-04-02 22:26:15

by J. Bruce Fields

Subject: Re: nfsd restart failures without /proc/fs/nfsd filesystem mounted

On Wed, Apr 02, 2008 at 02:25:10PM -0700, Anirban Sinha wrote:
> >As you can see, we have an extra u_ptr member in the union (the same as
> >in the kernel: include/linux/nfsd/syscall.h).
> >
> >As an experiment, I removed this member, recompiled nfs-utils and voilà!
> >The kernel now gets the correct server thread count.
> >
> >I am a bit puzzled by this, since the u_umap member already contains a
> >char * and I think adding a void * does not change the alignment of the
> >union.
>
>
> Indeed. Actually, by mistake, I was using the new 1.1.2 nfs-utils
> binaries and making changes in the old 1.1.0 nfs-utils source. The
> removal of the member does not change anything. What is happening is
> that an upgrade to the new 1.1.2 nfs-utils binaries solves the problem.
> I think you can reproduce the problem at your end simply by unmounting
> nfsd filesystem and using 1.1.0 nfs-utils binary.

OK! Hm, do you know which patch actually fixed this? (Is it actually
just the same problem you reported before?) On a quick skim through the
git history I'm not seeing it.

--b.

2008-04-02 23:06:54

by Anirban Sinha

Subject: RE: nfsd restart failures without /proc/fs/nfsd filesystem mounted

>>
>>
>> Indeed. Actually, by mistake, I was using the new 1.1.2 nfs-utils
>> binaries and making changes in the old 1.1.0 nfs-utils source. The
>> removal of the member does not change anything. What is happening is
>> that an upgrade to the new 1.1.2 nfs-utils binaries solves the problem.
>> I think you can reproduce the problem at your end simply by unmounting
>> nfsd filesystem and using 1.1.0 nfs-utils binary.
>
>OK! Hm, do you know which patch actually fixed this? (Is it actually
>just the same problem you reported before?)

No, it can't be the same patch. After I reported the issue and Neil gave
me the patch, I applied it to our own nfs-utils source, and our build
process has been using the patched nfs-utils binaries ever since.
Interestingly, though, the patch that was committed to the git repo
handles the issue slightly differently (though I think it should not
make any difference).

The patch Neil gave me is as follows:

diff --git a/support/export/client.c b/support/export/client.c
index 1cb242f..e96f5e0 100644
--- a/support/export/client.c
+++ b/support/export/client.c
@@ -462,5 +462,5 @@ client_gettype(char *ident)
 	sp++; if(!isdigit(*sp) || strtoul(sp, &sp, 10) > 255 || *sp != '.') return MCL_FQDN;
 	sp++; if(!isdigit(*sp) || strtoul(sp, &sp, 10) > 255 || *sp != '\0') return MCL_FQDN;
 	/* we lie here a bit. but technically N.N.N.N == N.N.N.N/32 :) */
-	return MCL_SUBNETWORK;
+	return MCL_FQDN;
 }


>On a quick skim through the
>git history I'm not seeing it.

Hmm, I also did a quick skim through the commit logs and didn't see
anything that looks relevant. However, I retested with 1.1.2 and
again it *did* solve the issue. Interesting!


A


>
>--b.