Return-Path: linux-nfs-owner@vger.kernel.org
Received: from xes-mad.com ([216.165.139.218]:51573 "EHLO xes-mad.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1754780AbaCRVuo convert rfc822-to-8bit (ORCPT );
    Tue, 18 Mar 2014 17:50:44 -0400
Date: Tue, 18 Mar 2014 16:50:29 -0500 (CDT)
From: Andrew Martin
To: Trond Myklebust
Cc: Jim Rees, bhawley@luminex.com, Brown Neil,
    linux-nfs-owner@vger.kernel.org, linux-nfs@vger.kernel.org
Message-ID: <693414378.60415.1395179429651.JavaMail.zimbra@xes-inc.com>
In-Reply-To: <40C20DD8-9E8D-4625-B98C-A1E61D00AC17@primarydata.com>
References: <1696396609.119284.1394040541217.JavaMail.zimbra@xes-inc.com>
    <20140306173632.GA18545@umich.edu>
    <1397912955.101159.1394130906695.JavaMail.zimbra@xes-inc.com>
    <2043391310.134091.1394135196565.JavaMail.zimbra@xes-inc.com>
    <76B038DA-3E86-4C46-BFB6-928BFB8202D8@primarydata.com>
    <521763040.159828.1394138758307.JavaMail.zimbra@xes-inc.com>
    <40C20DD8-9E8D-4625-B98C-A1E61D00AC17@primarydata.com>
Subject: Re: Optimal NFS mount options to safely allow interrupts and timeouts on newer kernels
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

----- Original Message -----
> From: "Trond Myklebust"
> To: "Andrew Martin"
> Cc: "Jim Rees", bhawley@luminex.com, "Brown Neil", linux-nfs-owner@vger.kernel.org,
> linux-nfs@vger.kernel.org
> Sent: Thursday, March 6, 2014 3:01:03 PM
> Subject: Re: Optimal NFS mount options to safely allow interrupts and timeouts on newer kernels
>
>
> On Mar 6, 2014, at 15:45, Andrew Martin wrote:
>
> > ----- Original Message -----
> >> From: "Trond Myklebust"
> >>> I attempted to get a backtrace from one of the uninterruptable apache
> >>> processes:
> >>> echo w > /proc/sysrq-trigger
> >>>
> >>> Here's one example:
> >>> [1227348.003904] apache2 D 0000000000000000 0 10175 1773 0x00000004
> >>> [1227348.003906] ffff8802813178c8 0000000000000082 0000000000015e00 0000000000015e00
> >>> [1227348.003908] ffff8801d88f03d0 ffff880281317fd8 0000000000015e00 ffff8801d88f0000
> >>> [1227348.003910] 0000000000015e00 ffff880281317fd8 0000000000015e00 ffff8801d88f03d0
> >>> [1227348.003912] Call Trace:
> >>> [1227348.003918] [] ? rpc_wait_bit_killable+0x0/0x40 [sunrpc]
> >>> [1227348.003923] [] rpc_wait_bit_killable+0x24/0x40 [sunrpc]
> >>> [1227348.003925] [] __wait_on_bit+0x5f/0x90
> >>> [1227348.003930] [] ? rpc_wait_bit_killable+0x0/0x40 [sunrpc]
> >>> [1227348.003932] [] out_of_line_wait_on_bit+0x78/0x90
> >>> [1227348.003934] [] ? wake_bit_function+0x0/0x40
> >>> [1227348.003939] [] __rpc_execute+0x191/0x2a0 [sunrpc]
> >>> [1227348.003945] [] rpc_execute+0x26/0x30 [sunrpc]
> >>
> >> That basically means that the process is hanging in the RPC layer, somewhere
> >> in the state machine. ‘echo 0 >/proc/sys/sunrpc/rpc_debug’ as the ‘root’
> >> user should give us a dump of which state these RPC calls are in. Can you
> >> please try that?
> > Yes I will definitely run that the next time it happens, but since it occurs
> > sporadically (and I have not yet found a way to reproduce it on demand), it
> > could be days before it occurs again. I'll also run "netstat -tn" to check the
> > TCP connections the next time this happens.
>
> If you are comfortable applying patches and compiling your own kernels, then
> you might want to try applying the fix for a certain out-of-socket-buffer
> race that Neil reported, and that I suspect you may be hitting.
> The patch has been sent to the ‘stable kernel’ series, and so should appear soon in
> Debian’s own kernels, but if this is bothering you now, then go for it…
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=06ea0bfe6e6043cb56a78935a19f6f8ebc636226
>

Trond,

This problem has recurred, and I have captured the debug output that you
requested:

echo 0 >/proc/sys/sunrpc/rpc_debug:
http://pastebin.com/9juDs2TW

echo w > /proc/sysrq-trigger ; dmesg:
http://pastebin.com/1vDx9bNf

netstat -tn:
http://pastebin.com/mjxqjmuL

One suggestion for debugging was to run "umount -f /path/to/mountpoint"
repeatedly in an attempt to send SIGKILL back up to the application. This
always returned "Device or resource busy", and I was unable to unmount the
filesystem until I used "umount -l".

I was able to kill -9 all but two of the processes that were blocked in
uninterruptible sleep. Note that I was able to get lsof output on these
processes this time, and they all appeared to be blocked on access to a single
file on the NFS share. If I tried to cat that file from this client, my
terminal would block:
open("/path/to/file", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=42385, ...}) = 0
mmap(NULL, 1056768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb00f0dc000
read(3,

However, I could cat the file just fine from another NFS client. Does this
additional information shed any light on the source of this problem?

Thanks,

Andrew
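
P.S. Since the hang is sporadic, the three captures above could be gathered in
one step the next time it happens. What follows is only a rough sketch under
stated assumptions, not something that was run for the pastebins above: the
function names list_dstate and capture_nfs_hang and the default output
directory are hypothetical, and the only commands used are the ones already
quoted in this thread plus ps, awk and dmesg. It has to run as root.

```shell
#!/bin/sh
# Hypothetical helper: read "PID STAT COMMAND" lines on stdin and print the
# PID and command of every task whose state starts with D (uninterruptible
# sleep). Only the first word of the command name is printed.
list_dstate() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# Hypothetical capture routine (run as root). $1 is an output directory;
# the default name below is an assumption made for this sketch.
capture_nfs_hang() {
    out=${1:-/tmp/nfs-hang.$$}
    mkdir -p "$out" || return 1

    # Record which tasks are currently stuck in D state.
    ps -eo pid=,stat=,comm= | list_dstate > "$out/dstate.txt"

    # Per the thread: writing to rpc_debug should dump the state of the
    # pending RPC calls into the kernel log.
    echo 0 > /proc/sys/sunrpc/rpc_debug

    # SysRq 'w' logs backtraces of blocked tasks.
    echo w > /proc/sysrq-trigger

    # Save the kernel log and the TCP connection table.
    dmesg > "$out/dmesg.txt"
    netstat -tn > "$out/netstat.txt"
}
```

After sourcing the file, "capture_nfs_hang /tmp/hang1" would leave dstate.txt,
dmesg.txt and netstat.txt in /tmp/hang1 for posting to the list.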