Return-Path:
Received: from mail-bw0-f46.google.com ([209.85.214.46]:37876 "EHLO
        mail-bw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751834Ab0HYGdw (ORCPT );
        Wed, 25 Aug 2010 02:33:52 -0400
Received: by bwz11 with SMTP id 11so348158bwz.19 for ;
        Tue, 24 Aug 2010 23:33:50 -0700 (PDT)
Subject: Re: hang in writeback code on nfsv4 mount
From: Artem Bityutskiy
Reply-To: dedekind1@gmail.com
To: "J. Bruce Fields"
Cc: "linux-nfs@vger.kernel.org" , Trond Myklebust , Christoph Hellwig ,
        Jens Axboe
In-Reply-To: <20100825023425.GA24591@fieldses.org>
References: <20100825023425.GA24591@fieldses.org>
Content-Type: text/plain; charset="UTF-8"
Date: Wed, 25 Aug 2010 09:32:25 +0300
Message-ID: <1282717945.24044.187.camel@localhost>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On Wed, 2010-08-25 at 04:34 +0200, ext J. Bruce Fields wrote:
> As of 253c34e9b10c30d3064be654b5b78fbc1a8b1896 "writeback: prevent
> unnecessary bdi threads wakeups", any nfs mount hangs for me. Is this a
> known issue?
>
> --b.
>
> INFO: task mount.nfs4:3812 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mount.nfs4 D 0000000000000000 2880 3812 3811 0x00000000
> ffff88001ed25a28 0000000000000046 ffff88001ed25fd8 ffff88001ed25fd8
> ffff88001ed24000 ffff88001ed24000 ffff88001ed24000 ffff88001f9503a0
> ffff88001ed25fd8 ffff88001f9503a8 ffff88001ed24000 ffff88001ed25fd8
> Call Trace:
> [] schedule_timeout+0x1cd/0x2e0
> [] ? mark_held_locks+0x6c/0xa0
> [] ? _raw_spin_unlock_irq+0x30/0x60
> [] ? trace_hardirqs_on_caller+0x14d/0x190
> [] ? sub_preempt_count+0xe/0xd0
> [] wait_for_common+0x120/0x190
> [] ? default_wake_function+0x0/0x20
> [] wait_for_completion+0x1d/0x20
> [] kthread_stop+0x4a/0x150
> [] ? thaw_process+0x70/0x80
> [] bdi_unregister+0x10a/0x1a0
> [] nfs_put_super+0x19/0x20
> [] generic_shutdown_super+0x54/0xe0
> [] kill_anon_super+0x16/0x60
> [] nfs4_kill_super+0x39/0x90
> [] deactivate_locked_super+0x45/0x60
> [] deactivate_super+0x49/0x70
> [] mntput_no_expire+0x84/0xe0
> [] release_mounts+0x9f/0xc0
> [] put_mnt_ns+0x65/0x80
> [] nfs_follow_remote_path+0x1e6/0x420
> [] nfs4_try_mount+0x6f/0xd0
> [] nfs4_get_sb+0xa2/0x360
> [] vfs_kern_mount+0x88/0x1f0
> [] do_kern_mount+0x52/0x130
> [] ? _lock_kernel+0x6a/0x170
> [] do_mount+0x26e/0x7f0
> [] ? copy_mount_options+0xea/0x190
> [] sys_mount+0x98/0xf0
> [] system_call_fastpath+0x16/0x1b
> 1 lock held by mount.nfs4/3812:
> #0: (&type->s_umount_key#24){+.+...}, at: [] deactivate_super+0x41/0x70

I've tried both v2.6.36-rc2 and commit
253c34e9b10c30d3064be654b5b78fbc1a8b1896 [1], and I can mount an NFS
share with no problems:

sudo mount -t nfs4 sauron:/home/dedekind/ /mnt/sauron_home/

works fine. Any hints about how to reproduce this are welcome. I'll try
to look at the code and figure out why this could happen.

So, does the mount succeed at some point? Or is it blocked forever? The
sysrq-t output would be useful to look at as well.

Also, it is strange that 'sys_mount()' involves 'nfs4_kill_super()' - is
this normal, or is this an error path?

[1]: the kernel tree does not compile at this commit, so I applied a
patch on top to fix the compilation issue:
387ac089361fbe5ef287e6950c5c40f6b18e5c55 "block: fix missing export of
blk_types.h"

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)