From: Trond Myklebust Subject: Re: NFS silent death Date: Mon, 27 Apr 2009 17:04:47 -0400 Message-ID: <1240866287.8818.89.camel@heimdal.trondhjem.org> References: <143DD9C8-CD7D-4D18-8295-D9ADBD8A0972@gmail.com> Mime-Version: 1.0 Content-Type: text/plain Cc: linux-nfs@vger.kernel.org To: Anton Starikov Return-path: Received: from mail-out1.uio.no ([129.240.10.57]:51278 "EHLO mail-out1.uio.no" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760891AbZD0VEu (ORCPT ); Mon, 27 Apr 2009 17:04:50 -0400 In-Reply-To: <143DD9C8-CD7D-4D18-8295-D9ADBD8A0972@gmail.com> Sender: linux-nfs-owner@vger.kernel.org List-ID: On Mon, 2009-04-27 at 22:13 +0200, Anton Starikov wrote: > Hello, > > I got stuck around NFS problem. > > There is a server which serve /home via NFSv4 and root via NFSv3. > There are number of diskless clients. > > At some points some of clients hang. > Nothing in the logs and in console most of time. > > I tried 2.6.27-2.6.30 kernels on both sides. It looks like generally > NFSv4 stable enough, if I just mount /home to diskNess host - no hangs > during couple of months. But if I start mix both of them - then > problems come. And, there is a problem to have nfs-root on NFSv4, at > least all my attempts failed. After some time it always endup with > broken id-mapping. > And with 2.6.27-2.6.29 nfs3 used to be absolutely non-usable, if you > need writing. At least all my games with different mount|export > options used end up with the same problem with writing: > echo "test" > test_file > when file doesn't preexist, fail with "Invalid argument" > So, I had to mix them. > > So far I come to 2.6.30 kernel and finally got something in the logs, > but only for one of clients, rest is still die silent. > > > [ 9363.096013] INFO: task rpciod/0:745 blocked for more than 120 > seconds. > [ 9363.102730] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 9363.110905] rpciod/0 D ffffc20000010e00 0 745 2 > [ 9363.117049] ffff8805244a9b10 0000000000000046 ffff8805244a9aa0 > 0000000000010e00 > [ 9363.124876] 0000000000010e00 0000000000010e00 0000000000010e00 > 0000000000010e00 > [ 9363.132689] 0000000000010e00 0000000000010e00 ffff880524a667c0 > ffff880124192840 > [ 9363.140506] Call Trace: > [ 9363.143138] [] schedule+0x1c/0x44 > [ 9363.148303] [] nfs_idmap_id+0x1ed/0x287 [nfs] > [ 9363.154567] [] nfs_map_group_to_gid+0x39/0x4f > [nfs] > [ 9363.161332] [] decode_attr_group+0x110/0x1af [nfs] > [ 9363.168011] [] decode_getfattr+0x45e/0x960 [nfs] > [ 9363.174509] [] nfs4_xdr_dec_open+0xa3/0xef [nfs] > [ 9363.181020] [] rpcauth_unwrap_resp+0x89/0xac > [sunrpc] > [ 9363.187935] [] call_decode+0x14e/0x1d3 [sunrpc] > [ 9363.194326] [] __rpc_execute+0x93/0x278 [sunrpc] > [ 9363.200809] [] rpc_async_schedule+0x23/0x39 > [sunrpc] > [ 9363.207624] [] run_workqueue+0xc9/0x189 > [ 9363.213300] [] worker_thread+0xea/0x10f > [ 9363.218972] [] kthread+0x69/0xac > [ 9363.224022] [] child_rip+0xa/0x20 > [ 9363.229174] INFO: task ntpd:4161 blocked for more than 120 seconds. > [ 9363.235620] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 9363.243787] ntpd D ffffc20000010e00 0 4161 1 > [ 9363.249931] ffff880125831bf8 0000000000000082 ffff880126b49ad0 > 0000000000010e00 > [ 9363.257758] 0000000000010e00 0000000000010e00 0000000000010e00 > 0000000000010e00 > [ 9363.265583] 0000000000010e00 0000000000010e00 ffff880122c380c0 > ffff8801264184c0 > [ 9363.273419] Call Trace: > [ 9363.276035] [] schedule+0x1c/0x44 > [ 9363.281174] [] io_schedule+0x76/0xd0 > [ 9363.286581] [] sync_page+0x54/0x6c > [ 9363.291815] [] __lock_page+0x82/0xb8 > [ 9363.297224] [] find_lock_page+0x48/0x82 > [ 9363.302892] [] filemap_fault+0x183/0x346 > [ 9363.308647] [] __do_fault+0x77/0x449 > [ 9363.314059] [] handle_mm_fault+0x1fe/0x31b > [ 9363.319986] [] do_page_fault+0x273/0x29e > [ 9363.325741] [] page_fault+0x25/0x30 > [ 9363.331068] [<00007fdd26f110e0>] 0x7fdd26f110e0 > [ 9363.335865] INFO: task bash:4261 blocked for more than 120 seconds. > [ 9363.342311] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 9363.350480] bash D ffffc20000810e00 0 4261 4259 > [ 9363.356611] ffff880925175468 0000000000000082 0000000000000088 > 0000000000010e00 > [ 9363.364441] 0000000000010e00 0000000000010e00 0000000000010e00 > 0000000000010e00 > [ 9363.372253] 0000000000010e00 0000000000010e00 ffff88092649e780 > ffff880d26522040 > [ 9363.380078] Call Trace: > [ 9363.382709] [] __mutex_lock_common+0x159/0x1fc > [ 9363.388985] [] __mutex_lock_slowpath+0x27/0x3d > [ 9363.395262] [] mutex_lock+0x25/0x53 > [ 9363.400600] [] nfs_idmap_id+0x96/0x287 [nfs] > [ 9363.406772] [] nfs_map_name_to_uid+0x39/0x4f [nfs] > [ 9363.413462] [] decode_attr_owner+0x110/0x1af [nfs] > [ 9363.420140] [] decode_getfattr+0x8e6/0x960 [nfs] > [ 9363.426642] [] nfs4_xdr_dec_access+0xfd/0x11e > [nfs] > [ 9363.433408] [] rpcauth_unwrap_resp+0x89/0xac > [sunrpc] > [ 9363.440324] [] call_decode+0x14e/0x1d3 [sunrpc] Hmm... My guess is that you are deadlocked because rpciod appears to be hanging on an idmapper upcall due to an NFSv4 open() call. At the same time, since /etc/passwd and /etc/group are on an NFSv3 partition, the idmapper will need to use rpciod in order to read them. We therefore apparently need to move the idmapper upcall out of the rpciod part of the open() call. I'll look into that... Trond