From: Jan Bruvoll
Subject: Performance problems with high number of clients
Date: Thu, 22 Jul 2004 21:53:42 +0100
To: NFS mailing list <nfs@lists.sourceforge.net>
Message-ID: <41002956.2080103@bruvoll.com>

Dear List,

I am having problems with a site served by an NFS server cluster/pair, and I hope you could have a quick look at this and give me a few pointers on what to look for. I'm not very experienced when it comes to NFS tuning, and frankly I don't quite know where to begin.

Specs: 2 servers (dual Xeon 3 GHz, 2 GB RAM, 3ware RAID5, Gentoo kernel 2.4.26-r6), connected with DRBD and Heartbeat for automatic fail-over.

Clients: 10 relatively heavily loaded web servers (1 million page views total per day), using a simple file-based locking mechanism for newly generated content, which leads to a steady flow of file creations and deletions.

The problem only occurs when the most active NFS mount is added (the one where the locking mechanism is used): the server load rises to about 7 and response goes down the drain, yet the CPUs are apparently 98-100% idle. In this situation I'm not able to mount/unmount anything -> RPC timeouts. If only the other mounts are set up, i.e. ~60 relatively inactive mounts, there are no problems.

I've done some crude transfer speed testing: I created a 500 MB file which I copy with rsync for load testing, i.e. all numbers below were measured with simultaneous reads and writes:

hdparm -Tt /dev/sda:                    39 MB/s
Direct to file system (read + write):   12 MB/s sustained
Onto DRBD file system (read + write):   10 MB/s sustained
Over NFS (read + write):                3.6 MB/s

The network between test client and server is 100 Mbit/s. I guess the raw throughput is OK, but I can't quite understand why everything hangs when I add the final mount. Any help would be greatly appreciated.

PS. I should of course also add that the current main NFS server, an old single PIII/650 with kernel 2.4.17, has no problems -at all- with the same load.

Best regards
Jan
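[The post does not show the actual locking code, but a minimal sketch of one common file-based scheme over NFS might look like the following -- the path and file names are purely illustrative. `set -C` (noclobber) makes the `>` redirection fail if the lock file already exists, so file creation doubles as the lock attempt and deletion releases it, which matches the steady create/delete flow described above.]

```shell
#!/bin/sh
# Hypothetical sketch of a file-based lock on an NFS mount; the real
# mechanism in the post is not shown.  Creating the lock file claims
# the lock, removing it releases the lock.
LOCK=/tmp/page1234.lock   # on the real system this would live on the NFS mount

if ( set -C; echo "$$" > "$LOCK" ) 2>/dev/null; then
    # we hold the lock: regenerate the content here ...
    rm -f "$LOCK"         # release by deleting the file
else
    echo "lock already held by PID $(cat "$LOCK" 2>/dev/null)"
fi
```

[Worth noting for the failure being described: on NFSv2/v3 an exclusive create is not guaranteed atomic, which is why many such schemes use hard links instead -- and either way, each lock cycle costs extra round-trips and cache invalidations on the server.]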
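[For anyone wanting to reproduce the crude throughput test above, this is roughly what it amounts to -- paths are examples, and the exact `dd`/`rsync` invocations are my reconstruction, not quoted from the original setup:]

```shell
#!/bin/sh
# Crude simultaneous read+write test: create a 500 MB file, then copy
# it with rsync so the same file system is read and written at once.
dd if=/dev/zero of=/mnt/test/file500m bs=1M count=500
time rsync /mnt/test/file500m /mnt/test/file500m.copy
# Sustained throughput ~ 500 MB / elapsed seconds; repeat with
# /mnt/test pointing at local disk, the DRBD device, and the NFS mount.
```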