From: Linus Torvalds
Subject: Re: 2.6.26-rc: nfsd hangs for a few sec
Date: Sat, 21 Jun 2008 11:36:41 -0700 (PDT)
Message-ID:
References:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: kernel-testers@vger.kernel.org, kernel list, linux-mm@kvack.org,
    Mel Gorman, Christoph Lameter, Lee Schermerhorn, KAMEZAWA Hiroyuki,
    Hugh Dickins, Nick Piggin, Andrew Morton, bfields@fieldses.org,
    neilb@suse.de, linux-nfs@vger.kernel.org
To: Alexander Beregalov
Return-path:
Received: from smtp1.linux-foundation.org ([140.211.169.13]:35949
    "EHLO smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1751623AbYFUShR (ORCPT );
    Sat, 21 Jun 2008 14:37:17 -0400
In-Reply-To:
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Sat, 21 Jun 2008, Alexander Beregalov wrote:
> >
> > -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> >    [] __lock_acquire+0xa0c/0xbc6
> >    [] lock_acquire+0x6a/0x86
> >    [] down_write_nested+0x33/0x6a
> >    [] xfs_ilock+0x7b/0xd6
> >    [] xfs_ireclaim+0x1d/0x59
> >    [] xfs_finish_reclaim+0x173/0x195
> >    [] xfs_reclaim+0xb3/0x138
> >    [] xfs_fs_clear_inode+0x55/0x8e
> >    [] clear_inode+0x83/0xd2
> >    [] dispose_list+0x3c/0xc1
> >    [] shrink_icache_memory+0x173/0x19b
> >    [] shrink_slab+0xda/0x153
> >    [] try_to_free_pages+0x1e0/0x2a1
> >    [] __alloc_pages_internal+0x23f/0x3a7
> >    [] __alloc_pages+0xa/0xc
> >    [] __slab_alloc+0x1c7/0x513
> >    [] kmem_cache_alloc+0x45/0xb3
> >    [] reiserfs_alloc_inode+0x12/0x23
> >    [] alloc_inode+0x14/0x1a9
> >    [] iget5_locked+0x47/0x133

Hmm.
Both the trace above and the trace below:

> > -> #0 (iprune_mutex){--..}:
> >    [] __lock_acquire+0x929/0xbc6
> >    [] lock_acquire+0x6a/0x86
> >    [] mutex_lock_nested+0xba/0x232
> >    [] shrink_icache_memory+0x38/0x19b
> >    [] shrink_slab+0xda/0x153
> >    [] try_to_free_pages+0x1e0/0x2a1
> >    [] __alloc_pages_internal+0x23f/0x3a7
> >    [] __alloc_pages+0xa/0xc
> >    [] __do_page_cache_readahead+0xaa/0x16a
> >    [] ondemand_readahead+0x119/0x127
> >    [] page_cache_async_readahead+0x52/0x5d
> >    [] generic_file_splice_read+0x290/0x4a8
> >    [] xfs_splice_read+0x4b/0x78

are kind of scary, because they are both filesystem memory allocation
paths that don't have GFP_NOFS, so they cause a callback back into the
filesystem to free things.

Which in general isn't necessarily wrong: under inode pressure, it may
well make sense to try to shrink the inode caches when allocating a new
inode, or things may well blow up out of proportion, but it does make me
a bit nervous.

However, it's not clear why things apparently bisected down to the
commit it did (54a6eb5c4765aa573a030ceeba2c14e3d2ea5706: "mm: use two
zonelist that are filtered by GFP mask"). That part makes me worry that
that commit screwed up the freeing pressure logic.

Mel?

		Linus