2003-08-06 15:08:27

by Marc Schmitt

Subject: rpc.mountd: getfh failed: Operation not permitted

Hi (again)

I don't get any rest with my Linux NFS server... After you helped me
with the mountd RPC issues, another problem occurred today; this is the
third time it has happened.

All of a sudden, the server gets into a state where it fills the
logs with the error message in the subject. Clients with active mounts
continue working fine, but some mount requests (only some!) are
denied with that error. Restarting the NFS daemon helps, although
afterwards some users reported that their home directories were hanging
with a stale handle.

After googling around, I found various suggestions:

- upgrade to kernel >= 2.4.20 (which is something I have to do tonight
anyway, the server currently suffers from the quota bug in 2.4.18-27
that was solved in 2.4.20-18, too: "kernel: VFS: find_free_dqentry():
Data block full but it shouldn't. kernel: VFS: Error -5 occured while
creating quota.")

- increase the number of open file handles
Is this really related? file-max is currently set to 399763, which
should be enough...

- locking issues
Samba is running on the server, too. Could this potentially lead to this
error? Or could several clients accessing the same file over NFS?
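
On the file handle suggestion, one way to check whether the limit is actually being approached is to compare the counters in /proc/sys/fs/file-nr against file-max. The following is only a minimal Python sketch: it assumes the 2.4-era layout of that file (allocated handles, free handles, maximum), and the function names are hypothetical.

```python
def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr into three integers.

    Assumed field order: allocated handles, allocated-but-free handles,
    and the maximum (the same value as /proc/sys/fs/file-max).
    """
    allocated, free, maximum = (int(field) for field in text.split())
    return allocated, free, maximum


def handle_usage(text):
    """Return the fraction of the file-max limit that is currently in use."""
    allocated, free, maximum = parse_file_nr(text)
    return (allocated - free) / maximum
```

Passing the contents of /proc/sys/fs/file-nr to handle_usage() around the time the getfh errors start would show whether the server is anywhere near the 399763 limit; a value far below 1.0 would suggest that file-max is not the cause.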

Hopefully, the problem will go away with the kernel upgrade.

The underlying filesystem is ext3, mounted with usrquota and
data=journal, btw.

Please let me know if you see other potential reasons for this problem,
thanks.

Greetings
Marc





_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-08-19 14:51:01

by Marc Schmitt

Subject: Re: rpc.mountd: getfh failed: Operation not permitted

Marc Schmitt wrote:

> - upgrade to kernel >= 2.4.20 (which is something I have to do tonight
> anyway, the server currently suffers from the quota bug in 2.4.18-27
> that was solved in 2.4.20-18, too: "kernel: VFS: find_free_dqentry():
> Data block full but it shouldn't. kernel: VFS: Error -5 occured while
> creating quota.")

:(
The upgrade to kernel 2.4.20-19 did not help: today I again had to
restart the NFS service because of the "getfh failed: Operation not
permitted" error.
I've compared the logs (Aug 4th and today). Interestingly, the VFS quota
error appeared again today, too. The quota bug first occurred at 6am and
recurred about once an hour until 11:45am. 45 minutes later, the first
getfh error appeared in the logs. Shortly after that, I restarted the
NFS service.
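
For anyone who wants to run the same comparison on their own logs, here is a rough Python sketch of how the gap between the last quota error and the first getfh error could be measured. It assumes classic syslog lines that begin with a "Mon DD HH:MM:SS" timestamp without a year; the function names and the default year are illustrative assumptions, and the two marker strings are taken from the error messages quoted in this thread.

```python
from datetime import datetime

QUOTA_MARKER = "find_free_dqentry()"   # from the VFS quota error message
GETFH_MARKER = "getfh failed"          # from the rpc.mountd error message


def stamp(line, year=2003):
    """Parse the leading syslog timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def quota_to_getfh_gap(lines, year=2003):
    """Return the time between the last quota error and the first getfh error.

    Returns None if no getfh error is found, or if no quota error precedes it.
    """
    last_quota = None
    for line in lines:
        if GETFH_MARKER in line:
            if last_quota is None:
                return None
            return stamp(line, year) - last_quota
        if QUOTA_MARKER in line:
            last_quota = stamp(line, year)
    return None
```

Feeding it the lines of the system log (for example, an open file object) returns a datetime.timedelta. A consistently similar gap across incidents would support the idea that the two errors are related, though it would not prove it.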

I'd love to help debug this problem, but I'd appreciate some pointers
on where to start. How can I provide debugging information while still
keeping the machine in production?

I may also try running vanilla kernel 2.4.22 (once it's out); I saw
that there are improvements for machines with heavy I/O load.

TIA

Regards,

Marc


