From: "J. Bruce Fields"
Subject: Re: [NFS] problems with lockd in 2.6.22.6
Date: Fri, 7 Sep 2007 12:19:45 -0400
Message-ID: <20070907161945.GI24638@fieldses.org>
References: <200709071749.55760.wolfgang.walter@studentenwerk.mhn.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: neilb@suse.de, netdev@vger.kernel.org, nfs@lists.sourceforge.net
To: Wolfgang Walter
In-Reply-To: <200709071749.55760.wolfgang.walter@studentenwerk.mhn.de>
Sender: netdev-owner@vger.kernel.org

On Fri, Sep 07, 2007 at 05:49:55PM +0200, Wolfgang Walter wrote:
> Hello,
>
> we upgraded the kernel of a nfs-server from 2.6.17.11 to 2.6.22.6.
> Since then we get the message
>
> lockd: too many open TCP sockets, consider increasing the number of nfsd threads
> lockd: last TCP connect from ^\\236^\É^D
>
> 1) These random characters in the second line are caused by a bug in
> svc_tcp_accept.
> I already posted this patch on netdev@vger.kernel.org:

Thanks, I've applied that.

(The bug is a little subtle: there are actually two previous
__svc_print_addr() calls which might have initialized "buf" correctly,
and it's not obvious that the second isn't always called, since it's
in a dprintk, which is a macro that expands into a printk inside a
conditional.)

> with this patch applied one gets something like
>
> lockd: too many open TCP sockets, consider increasing the number of nfsd threads
> lockd: last TCP connect from 10.11.0.12, port=784
>
>
> 2) The number of nfsd threads we are running on the machine is 1024.
> So this is not the problem. It seems, though, that in the case of
> lockd svc_tcp_accept does not check the number of nfsd threads but the
> number of lockd threads, which is one. As soon as the number of open
> lockd sockets surpasses 80 this message gets logged. This usually
> happens every evening when a lot of people shut down their
> workstations.

So to be clear: there's not an actual problem here other than that the
logs are getting spammed? (Not that that isn't a problem in itself.)

> 3) For unknown reasons these sockets then remain open. In the morning
> when people start their workstations again we therefore not only get a
> lot of these messages again but often the nfs-server does not work
> properly any more. Restarting the nfs-daemon is a workaround.

Hm, thanks.

--b.
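
The dprintk subtlety mentioned above (a call placed in the arguments of a macro that expands into a print inside a conditional) can be shown in isolation. What follows is only a user-space sketch, not the kernel code: debug_enabled, debug_printf(), format_addr() and buf_filled_by_debug_call() are hypothetical stand-ins invented for illustration, and the buffer is zeroed up front so the demonstration has defined behaviour, whereas the point of the real bug was that the buffer was not initialized.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's debug switch. */
static int debug_enabled = 0;

/* Hypothetical stand-in for dprintk: the arguments live inside the
 * conditional, so they are evaluated only when debugging is on. */
#define debug_printf(...) \
	do { if (debug_enabled) printf(__VA_ARGS__); } while (0)

/* Hypothetical stand-in for __svc_print_addr(): fills buf, returns it. */
static char *format_addr(char *buf, size_t len, const char *host, int port)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%s, port=%d", host, port);
	return buf;
}

/* Returns 1 if the debug statement filled in buf, 0 if it never ran. */
static int buf_filled_by_debug_call(int debug_on)
{
	char buf[64];

	/* Zeroed only so this sketch is well defined. */
	memset(buf, 0, sizeof(buf));
	debug_enabled = debug_on;

	/* Looks like an unconditional call to format_addr(), but is not. */
	debug_printf("connect from %s\n",
		     format_addr(buf, sizeof(buf), "10.11.0.12", 784));

	/* Code here that prints buf sees it filled only if debugging was on. */
	return buf[0] != '\0';
}
```

With debugging off, buf_filled_by_debug_call(0) returns 0 because format_addr() is never evaluated; with debugging on, buf_filled_by_debug_call(1) returns 1. A later unconditional print of the buffer therefore cannot rely on the debug statement having filled it.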