Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mail-ig0-f181.google.com ([209.85.213.181]:41662 "EHLO mail-ig0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754280AbaHLCwm convert rfc822-to-8bit (ORCPT ); Mon, 11 Aug 2014 22:52:42 -0400
Received: by mail-ig0-f181.google.com with SMTP id h3so5381731igd.8 for ; Mon, 11 Aug 2014 19:52:41 -0700 (PDT)
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
Subject: Re: [nfs] BUG: sleeping function called from invalid context at include/linux/wait.h:976
From: Weston Andros Adamson
In-Reply-To:
Date: Mon, 11 Aug 2014 22:52:38 -0400
Cc: Fengguang Wu, Peter Zijlstra, linux-nfs list, Jet Chen, Su Tao, Yuanhan Liu, LKP, "linux-kernel@vger.kernel.org"
Message-Id: <7EA40233-6820-4F0E-A2B7-687DC01BDD78@primarydata.com>
References: <20140805140307.GB5593@localhost>
To: Nick Krause
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Hi,

I posted a 5-patch series to the nfs list last week with the cover letter
titled "nfs_page_group_lock cleanup", but neglected to mail the wider list.

The might_sleep check was being hit because nfs_page_group_lock with
wait=false called wait_on_bit_lock (which might sleep).

I also had to be careful to ignore the nonblock argument in
nfs_lock_and_join_requests and always make the call nonblocking, because
the inode lock is held. Blocking is handled by dropping the inode lock,
calling wait_on_bit(), then trying again.

The cover letter for the patchset is below.

Thanks!
-dros

These patches clean up some issues surrounding nfs_page_group_lock:

 - normalize wait/nonblock argument
 - make nonblocking calls really nonblocking
 - handle errors
 - ensure that we don't call blocking nfs_page_group_lock when holding
   the inode spinlock

This cleanup was inspired by Fengguang Wu's report that we were sleeping
with locks held in nfs_lock_and_join_requests.
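The "drop the inode lock, wait, then try again" approach described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch series: `demo_group`, `inode_lock`, `head_locked`, and every `demo_*` function are hypothetical stand-ins, with a C11 atomic spinlock playing the inode spinlock and a second atomic flag playing the page group lock bit.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel code: inode_lock plays the inode
 * spinlock, head_locked plays the page group lock bit. */
struct demo_group {
	atomic_bool inode_lock;   /* spinlock: the holder must not wait */
	atomic_bool head_locked;  /* page group bit lock */
};

void demo_init(struct demo_group *g)
{
	atomic_init(&g->inode_lock, false);
	atomic_init(&g->head_locked, false);
}

void demo_spin_lock(struct demo_group *g)
{
	while (atomic_exchange(&g->inode_lock, true))
		; /* busy-wait, as a spinlock would */
}

void demo_spin_unlock(struct demo_group *g)
{
	assert(atomic_load(&g->inode_lock)); /* must be held */
	atomic_store(&g->inode_lock, false);
}

/* Nonblocking attempt: returns true on success and never waits. */
bool demo_group_trylock(struct demo_group *g)
{
	return !atomic_exchange(&g->head_locked, true);
}

void demo_group_unlock(struct demo_group *g)
{
	atomic_store(&g->head_locked, false);
}

/* Wait for the group bit to clear without taking it. This is the role
 * wait_on_bit() plays in the description above. */
void demo_group_wait(struct demo_group *g)
{
	while (atomic_load(&g->head_locked))
		sched_yield();
}

/* Takes both locks and returns the number of retries that were needed.
 * The group lock is only ever tried, never waited on, while the
 * spinlock is held. */
int demo_lock_and_join(struct demo_group *g)
{
	int retries = 0;

	for (;;) {
		demo_spin_lock(g);
		if (demo_group_trylock(g))
			return retries;
		/* Contended: drop the spinlock before waiting. */
		demo_spin_unlock(g);
		demo_group_wait(g);
		retries++;
	}
}
```

A caller would pair `demo_lock_and_join()` with `demo_group_unlock()` and `demo_spin_unlock()`. The point of the structure is that the only operation performed under the spinlock is a try-lock, so the waiting always happens with the spinlock released.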
Weston Andros Adamson (5):
  nfs: change nfs_page_group_lock argument
  nfs: fix nonblocking calls to nfs_page_group_lock
  nfs: use blocking page_group_lock in add_request
  nfs: fix error handling in lock_and_join_requests
  nfs: don't sleep with inode lock in lock_and_join_requests

 fs/nfs/pagelist.c        | 59 ++++++++++++++++++++++++++++++------------------
 fs/nfs/write.c           | 21 +++++++++++++----
 include/linux/nfs_page.h |  1 +
 3 files changed, 55 insertions(+), 26 deletions(-)

On Aug 9, 2014, at 12:11 AM, Nick Krause wrote:

> On Tue, Aug 5, 2014 at 11:08 AM, Weston Andros Adamson
> wrote:
>> Thanks, I'll investigate.
>>
>> -dros
>>
>>
>> On Aug 5, 2014, at 10:03 AM, Fengguang Wu wrote:
>>
>>> Greetings,
>>>
>>> Here is an NFS error triggered by this debug check.
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/wait
>>> commit b87699e5fa31f451987a992b9cbda22d29ebcb46
>>> Author: Peter Zijlstra
>>> AuthorDate: Mon Aug 4 11:12:21 2014 +0200
>>> Commit: Peter Zijlstra
>>> CommitDate: Mon Aug 4 13:29:57 2014 +0200
>>>
>>> wait: Add might_sleep()
>>>
>>> Add more might_sleep() checks, suppose someone put a wait_event() like
>>> thing in a wait loop..
>>>
>>> Can't put might_sleep() in ___wait_event() because there's the locked
>>> primitives which call ___wait_event() with locks held.
>>>
>>> Signed-off-by: Peter Zijlstra
>>> Link: http://lkml.kernel.org/n/tip-amr894sd1j012khd3fgyh9m8@git.kernel.org
>>>
>>>
>>> [ 13.363454] BUG: sleeping function called from invalid context at include/linux/wait.h:976
>>> [ 13.365679] in_atomic(): 1, irqs_disabled(): 0, pid: 2715, name: dmesg
>>> [ 13.367109] CPU: 1 PID: 2715 Comm: dmesg Not tainted 3.16.0-00048-gb87699e #1
>>> [ 13.368385] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>> [ 13.369544] 0000000000000000 ffff88003e3efad0 ffffffff819c6bad ffff880035e9e480
>>> [ 13.371838] ffff88003e3efae0 ffffffff81106e96 ffff88003e3efb00 ffffffff81306a2e
>>> [ 13.373822] ffff880035e9e480 ffff88001b8a1a40 ffff88003e3efb48 ffffffff8130975a
>>> [ 13.376165] Call Trace:
>>> [ 13.376890] [] dump_stack+0x4d/0x66
>>> [ 13.377903] [] __might_sleep+0x10a/0x10c
>>> [ 13.379247] [] nfs_page_group_lock+0x4e/0x7b
>>> [ 13.380739] [] nfs_lock_and_join_requests+0x83/0x334
>>> [ 13.381884] [] nfs_do_writepage+0x94/0x191
>>> [ 13.383078] [] nfs_writepages_callback+0x13/0x25
>>> [ 13.384592] [] ? nfs_do_writepage+0x191/0x191
>>> [ 13.385745] [] write_cache_pages+0x281/0x3a9
>>> [ 13.386829] [] ? nfs_do_writepage+0x191/0x191
>>> [ 13.388524] [] nfs_writepages+0xa9/0x10f
>>> [ 13.389903] [] ? release_pages+0x1a2/0x20b
>>> [ 13.391087] [] ? free_pcppages_bulk+0x298/0x33c
>>> [ 13.392198] [] do_writepages+0x1e/0x2c
>>> [ 13.393202] [] __filemap_fdatawrite_range+0x55/0x57
>>> [ 13.394313] [] filemap_write_and_wait_range+0x2a/0x58
>>> [ 13.395448] [] nfs_file_fsync+0x4e/0x10c
>>> [ 13.396546] [] vfs_fsync_range+0x1b/0x23
>>> [ 13.397567] [] vfs_fsync+0x1c/0x1e
>>> [ 13.398517] [] nfs_file_flush+0x6c/0x6f
>>> [ 13.399513] [] filp_close+0x3c/0x72
>>> [ 13.400646] [] put_files_struct+0x67/0xb3
>>> [ 13.401759] [] exit_files+0x4a/0x4f
>>> [ 13.402756] [] do_exit+0x3c9/0x985
>>> [ 13.403740] [] ? trace_do_page_fault+0x52/0xb7
>>> [ 13.404875] [] do_group_exit+0x44/0xac
>>> [ 13.405878] [] SyS_exit_group+0x14/0x14
>>> [ 13.406892] [] system_call_fastpath+0x16/0x1b
>>> [ 15.733276] BUG: sleeping function called from invalid context at include/linux/wait.h:976
>>> [ 15.735154] in_atomic(): 1, irqs_disabled(): 0, pid: 2716, name: cat
>>> [ 15.736263] CPU: 1 PID: 2716 Comm: cat Not tainted 3.16.0-00048-gb87699e #1
>>> [ 15.737414] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>>> [ 15.738754] 0000000000000000 ffff88000068b910 ffffffff819c6bad ffff880035e9e080
>>> [ 15.746872] ffff88000068b920 ffffffff81106e96 ffff88000068b940 ffffffff81306a2e
>>> [ 15.748694] ffff880035e9e080 ffff88001b8a1a40 ffff88000068b988 ffffffff8130975a
>>> [ 15.750479] Call Trace:
>>> [ 15.751164] [] dump_stack+0x4d/0x66
>>> [ 15.752141] [] __might_sleep+0x10a/0x10c
>>> [ 15.753148] [] nfs_page_group_lock+0x4e/0x7b
>>> [ 15.754183] [] nfs_lock_and_join_requests+0x83/0x334
>>> [ 15.755294] [] nfs_do_writepage+0x94/0x191
>>> [ 15.756335] [] nfs_writepages_callback+0x13/0x25
>>> [ 15.757405] [] ? nfs_do_writepage+0x191/0x191
>>> [ 15.758452] [] write_cache_pages+0x281/0x3a9
>>> [ 15.759483] [] ? nfs_do_writepage+0x191/0x191
>>> [ 15.760547] [] nfs_writepages+0xa9/0x10f
>>> [ 15.761541] [] do_writepages+0x1e/0x2c
>>> [ 15.762514] [] __filemap_fdatawrite_range+0x55/0x57
>>> [ 15.763612] [] filemap_write_and_wait_range+0x2a/0x58
>>> [ 15.764746] [] nfs_file_fsync+0x4e/0x10c
>>> [ 15.765748] [] vfs_fsync_range+0x1b/0x23
>>> [ 15.766745] [] vfs_fsync+0x1c/0x1e
>>> [ 15.767681] [] nfs_file_flush+0x6c/0x6f
>>> [ 15.768690] [] filp_close+0x3c/0x72
>>> [ 15.769638] [] put_files_struct+0x67/0xb3
>>> [ 15.770635] [] exit_files+0x4a/0x4f
>>> [ 15.771581] [] do_exit+0x3c9/0x985
>>> [ 15.772537] [] ? __schedule+0x4cb/0x734
>>> [ 15.773522] [] do_group_exit+0x44/0xac
>>> [ 15.774493] [] get_signal_to_deliver+0x53b/0x5cb
>>> [ 15.775575] [] do_signal+0x49/0x511
>>> [ 15.776561] [] ? do_syslog+0x141/0x4a8
>>> [ 15.777564] [] ? kmsg_read+0x2d/0x54
>>> [ 15.778513] [] ? proc_reg_read+0x56/0x69
>>> [ 15.779504] [] do_notify_resume+0x35/0x72
>>> [ 15.780521] [] int_signal+0x12/0x17
>>>
>>> This script may reproduce the error.
>>>
>>> ----------------------------------------------------------------------------
>>> #!/bin/bash
>>>
>>> kernel=$1
>>> initrd=debian-x86_64.cgz
>>>
>>> wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
>>>
>>> kvm=(
>>>     qemu-system-x86_64
>>>     -enable-kvm
>>>     -cpu Nehalem
>>>     -kernel $kernel
>>>     -initrd $initrd
>>>     -m 1024
>>>     -smp 2
>>>     -net nic,vlan=1,model=e1000
>>>     -net user,vlan=1
>>>     -boot order=nc
>>>     -no-reboot
>>>     -watchdog i6300esb
>>>     -rtc base=localtime
>>>     -serial stdio
>>>     -display none
>>>     -monitor null
>>> )
>>>
>>> append=(
>>>     root=/dev/ram0
>>>     ip=::::vm-vp-1G-5::dhcp
>>>     oops=panic
>>>     earlyprintk=ttyS0,115200
>>>     debug
>>>     apic=debug
>>>     sysrq_always_enabled
>>>     rcupdate.rcu_cpu_stall_timeout=100
>>>     panic=10
>>>     softlockup_panic=1
>>>     nmi_watchdog=panic
>>>     load_ramdisk=2
>>>     prompt_ramdisk=0
>>>     console=ttyS0,115200
>>>     console=tty0
>>>     vga=normal
>>> )
>>>
>>> "${kvm[@]}" --append "${append[*]}"
>>> ----------------------------------------------------------------------------
>>>
>>> Thanks,
>>> Fengguang
>>
> Hey Weston,
> After doing the trace for you, it seems the issue is in nfs_page_group_lock.
> Because head is not equal to head->wb_head, we are hitting a WARN_ON.
>
> nfs_page_group_lock(struct nfs_page *req)
> {
>         struct nfs_page *head = req->wb_head;
>
>         WARN_ON_ONCE(head != head->wb_head);
>
>         wait_on_bit_lock(&head->wb_flags, PG_HEADLOCK,
>                          nfs_wait_bit_uninterruptible,
>                          TASK_UNINTERRUPTIBLE);
> }
>
> I am pasting the function for your convenience.
> Please CC me to let me know how this goes.
> Cheers Nick
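The function quoted above always waits on the bit lock, which is what trips the might_sleep check when it is called in atomic context. As a rough userspace illustration of what "make nonblocking calls really nonblocking" could look like, here is a sketch of a bit lock that takes a `nonblock` argument. Everything in it is an assumption for illustration: `demo_page`, `DEMO_HEADLOCK_BIT`, the `demo_*` functions, and the choice of `-EAGAIN` as the "would block" result are hypothetical and are not taken from the patches.

```c
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_HEADLOCK_BIT 0UL /* hypothetical bit index */

/* Hypothetical stand-in for struct nfs_page. */
struct demo_page {
	struct demo_page *head; /* group head, in the role of wb_head */
	atomic_ulong flags;     /* in the role of wb_flags */
};

void demo_page_init(struct demo_page *p, struct demo_page *head)
{
	p->head = head;
	atomic_init(&p->flags, 0UL);
}

/* Returns 0 once the head's lock bit is held. With nonblock set it
 * never waits and returns -EAGAIN (an assumed convention) if the bit
 * is already taken. */
int demo_page_group_lock(struct demo_page *req, bool nonblock)
{
	struct demo_page *head = req->head;
	const unsigned long mask = 1UL << DEMO_HEADLOCK_BIT;

	for (;;) {
		/* Atomically set the bit; the old value says who won. */
		if (!(atomic_fetch_or(&head->flags, mask) & mask))
			return 0;
		if (nonblock)
			return -EAGAIN;
		sched_yield(); /* blocking callers wait and retry */
	}
}

void demo_page_group_unlock(struct demo_page *req)
{
	const unsigned long mask = 1UL << DEMO_HEADLOCK_BIT;

	atomic_fetch_and(&req->head->flags, ~mask);
}
```

In this sketch a caller in atomic context would pass `nonblock = true` and handle `-EAGAIN` itself, while a caller that is allowed to sleep would pass `false`.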