Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754714AbYFVSPB (ORCPT ); Sun, 22 Jun 2008 14:15:01 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753157AbYFVSOx (ORCPT ); Sun, 22 Jun 2008 14:14:53 -0400
Received: from gir.skynet.ie ([193.1.99.77]:60133 "EHLO gir.skynet.ie"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753081AbYFVSOw (ORCPT ); Sun, 22 Jun 2008 14:14:52 -0400
Date: Sun, 22 Jun 2008 19:14:49 +0100
From: Mel Gorman
To: Daniel J Blueman
Cc: Christoph Lameter , Linus Torvalds , Alexander Beregalov ,
	Linux Kernel , david@fromorbit.com, xfs@oss.sgi.com
Subject: Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...
Message-ID: <20080622181449.GD625@csn.ul.ie>
References: <6278d2220806220256g674304ectb945c14e7e09fede@mail.gmail.com>
	<6278d2220806220258p28de00c1x615ad7b2f708e3f8@mail.gmail.com>
	<20080622181011.GC625@csn.ul.ie>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20080622181011.GC625@csn.ul.ie>
User-Agent: Mutt/1.5.13 (2006-08-11)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5484
Lines: 130

(Sorry for the resend, the wrong Dave Chinner's email address was used)

On (22/06/08 10:58), Daniel J Blueman didst pronounce:
> I'm seeing a similar issue [2] to what was recently reported [1] by
> Alexander, but with another workload involving XFS and memory
> pressure.
>

Is NFS involved or is this XFS only? It looks like XFS-only but no harm in
being sure.

I'm beginning to wonder if this is a problem where a lot of dirty inodes are
being written back in this path and we stall while that happens. I'm still
not getting why we are triggering this now and did not before 2.6.26-rc1 or
why it bisects to the zonelist modifications.
Diffing the reclaim and allocation paths between 2.6.25 and 2.6.26-rc1 has
not yielded any candidates for me yet that would explain this.

> SLUB allocator is in use and config is at http://quora.org/config-client-debug .
>
> Let me know if you'd like more details/vmlinux objdump etc.
>
> Thanks,
>   Daniel
>
> --- [1]
>
> http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/e673c9173d45a735/db9213ef39e4e11c
>
> --- [2]
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.26-rc7-210c #2
> -------------------------------------------------------
> AutopanoPro/4470 is trying to acquire lock:
>  (iprune_mutex){--..}, at: [] shrink_icache_memory+0x7d/0x290
>
> but task is already holding lock:
>  (&mm->mmap_sem){----}, at: [] do_page_fault+0x255/0x890
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&mm->mmap_sem){----}:
>        [] __lock_acquire+0xbdd/0x1020
>        [] lock_acquire+0x65/0x90
>        [] down_read+0x3b/0x70
>        [] do_page_fault+0x27c/0x890
>        [] error_exit+0x0/0xa9
>        [] 0xffffffffffffffff
>
> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
>        [] __lock_acquire+0xbdd/0x1020
>        [] lock_acquire+0x65/0x90
>        [] down_write_nested+0x46/0x80
>        [] xfs_ilock+0x99/0xa0
>        [] xfs_ireclaim+0x3f/0x90
>        [] xfs_finish_reclaim+0x59/0x1a0
>        [] xfs_reclaim+0x109/0x110
>        [] xfs_fs_clear_inode+0xe1/0x110
>        [] clear_inode+0x7d/0x110
>        [] dispose_list+0x2a/0x100
>        [] shrink_icache_memory+0x22f/0x290
>        [] shrink_slab+0x168/0x1d0
>        [] kswapd+0x3b6/0x560
>        [] kthread+0x4d/0x80
>        [] child_rip+0xa/0x12
>        [] 0xffffffffffffffff
>
> -> #0 (iprune_mutex){--..}:
>        [] __lock_acquire+0xa47/0x1020
>        [] lock_acquire+0x65/0x90
>        [] mutex_lock_nested+0xb5/0x300
>        [] shrink_icache_memory+0x7d/0x290
>        [] shrink_slab+0x168/0x1d0
>        [] try_to_free_pages+0x268/0x3a0
>        [] __alloc_pages_internal+0x206/0x4b0
>        [] __alloc_pages_nodemask+0x9/0x10
>        [] alloc_page_vma+0x72/0x1b0
>        [] handle_mm_fault+0x462/0x7b0
>        [] do_page_fault+0x30c/0x890
>        [] error_exit+0x0/0xa9
>        [] 0xffffffffffffffff
>
> other info that might help us debug this:
>
> 2 locks held by AutopanoPro/4470:
>  #0:  (&mm->mmap_sem){----}, at: [] do_page_fault+0x255/0x890
>  #1:  (shrinker_rwsem){----}, at: [] shrink_slab+0x32/0x1d0
>
> stack backtrace:
> Pid: 4470, comm: AutopanoPro Not tainted 2.6.26-rc7-210c #2
>
> Call Trace:
>  [] print_circular_bug_tail+0x83/0x90
>  [] ? print_circular_bug_entry+0x49/0x60
>  [] __lock_acquire+0xa47/0x1020
>  [] lock_acquire+0x65/0x90
>  [] ? shrink_icache_memory+0x7d/0x290
>  [] mutex_lock_nested+0xb5/0x300
>  [] ? shrink_icache_memory+0x7d/0x290
>  [] shrink_icache_memory+0x7d/0x290
>  [] ? shrink_slab+0x32/0x1d0
>  [] shrink_slab+0x168/0x1d0
>  [] try_to_free_pages+0x268/0x3a0
>  [] ? isolate_pages_global+0x0/0x40
>  [] __alloc_pages_internal+0x206/0x4b0
>  [] __alloc_pages_nodemask+0x9/0x10
>  [] alloc_page_vma+0x72/0x1b0
>  [] handle_mm_fault+0x462/0x7b0
>  [] ? trace_hardirqs_on+0xbf/0x150
>  [] ? do_page_fault+0x255/0x890
>  [] do_page_fault+0x30c/0x890
>  [] error_exit+0x0/0xa9
> --
> Daniel J Blueman
>

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
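
For readers following the lockdep report quoted above, here is a rough sketch of why the three chain entries amount to a cycle. It is not kernel code and not how lockdep itself is implemented; it just treats each "held, then acquired" pair as a directed edge and looks for a loop with a depth-first search. The edge list is one reading of the trace: entries #1 and #0 come straight from the backtraces, while the i_iolock -> mmap_sem edge is an assumption about what entry #2 records (a page fault taken while the XFS iolock is held). The names EDGES and find_cycle are made up for illustration.

```python
# Lock-order edges (held, then_acquired) as read from the quoted report.
# Assumption: entry #2 means mmap_sem was taken while i_iolock was held.
EDGES = [
    ("iprune_mutex", "i_iolock"),   # #1: kswapd, shrink_icache_memory -> xfs_ilock
    ("i_iolock", "mmap_sem"),       # #2: do_page_fault with iolock held (assumed)
    ("mmap_sem", "iprune_mutex"),   # #0: page fault -> reclaim -> shrink_icache_memory
]


def find_cycle(edges):
    """Return one lock-order cycle as a list of names, or None if acyclic."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    state = {}   # node -> "visiting" while on the DFS stack, "done" afterwards
    path = []    # current DFS stack, in acquisition order

    def visit(node):
        state[node] = "visiting"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                # Back edge: nxt is already on the stack, so the order loops.
                return path[path.index(nxt):] + [nxt]
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = "done"
        return None

    for node in graph:
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    cycle = find_cycle(EDGES)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping any one of the three edges makes the graph acyclic again, which is why breaking a single ordering (for example, not entering inode reclaim from the page fault path) would be enough to silence the warning under this reading.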