Date: Sun, 1 May 2011 18:01:49 +1000
From: Dave Chinner <david@fromorbit.com>
To: Christian Kujau
Cc: Markus Trippelsdorf, LKML, xfs@oss.sgi.com, minchan.kim@gmail.com
Subject: Re: 2.6.39-rc4+: oom-killer busy killing tasks
Message-ID: <20110501080149.GD13542@dastard>

On Fri, Apr 29, 2011 at 05:17:53PM -0700, Christian Kujau wrote:
> On Fri, 29 Apr 2011 at 22:17, Markus Trippelsdorf wrote:
> > It could be the hrtimer bug again. Would you try to reproduce the issue
> > with this patch applied?
> > http://git.us.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commit;h=ce31332d3c77532d6ea97ddcb475a2b02dd358b4
>
> With that patch applied, the OOM killer still kicks in; this time the OOM
> messages were written to the syslog again:
>
> http://nerdbynature.de/bits/2.6.39-rc4/oom/
> (The -9 files are the current ones)
>
> Also, this time xfs did not show up in the backtrace:
>
> ssh invoked oom-killer: gfp_mask=0x44d0, order=2, oom_adj=0, oom_score_adj=0
> Call Trace:
> [c22bfae0] [c0009d30] show_stack+0x70/0x1bc (unreliable)
> [c22bfb20] [c009cd3c] T.545+0x74/0x1d0
> [c22bfb70] [c009cf6c] T.543+0xd4/0x2a0
> [c22bfbb0] [c009d3b4] out_of_memory+0x27c/0x360
> [c22bfc00] [c00a199c] __alloc_pages_nodemask+0x6f8/0x708
> [c22bfca0] [c00a19c8] __get_free_pages+0x1c/0x44
> [c22bfcb0] [c00d283c] __kmalloc_track_caller+0x1c0/0x1dc
> [c22bfcd0] [c036ff1c] __alloc_skb+0x74/0x140
> [c22bfd00] [c0369b08] sock_alloc_send_pskb+0x23c/0x37c
> [c22bfd70] [c03e8974] unix_stream_sendmsg+0x354/0x478
> [c22bfde0] [c0364118] sock_aio_write+0x170/0x180
> [c22bfe50] [c00d580c] do_sync_write+0xb8/0x144
> [c22bfef0] [c00d68d0] vfs_write+0x1b8/0x1c0
> [c22bff10] [c00d6a10] sys_write+0x58/0xc8
> [c22bff40] [c00127d4] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0x2044cc14

XFS doesn't need to be in the stack trace - the inode cache is consuming
all of low memory. Indeed, I wonder if that is the problem: this is a
highmem configuration with ~450MB of highmem free and very little lowmem
free, and the lowmem zone is considered "all unreclaimable". The lowmem
zone:

Apr 29 15:59:10 alice kernel: [ 3834.754358] DMA free:64704kB min:3532kB
low:4412kB high:5296kB active_anon:0kB inactive_anon:0kB active_file:132kB
inactive_file:168kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
present:780288kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB
slab_reclaimable:639680kB slab_unreclaimable:41652kB kernel_stack:1128kB
pagetables:1788kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:516 all_unreclaimable? yes

That's ~625MB of reclaimable slab sitting in a ~760MB lowmem zone. I
really don't know why the xfs inode cache is not being trimmed.
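For background, the main mechanism that trims that cache under memory
pressure is the shrinker the filesystem registers with the VM: reclaim
calls it with a scan target derived from how much LRU scanning it just
did. A minimal sketch of what such a shrinker looks like in this era -
the callback prototype is from memory and the my_cache_* names are
placeholders, not the real XFS code:

#include <linux/mm.h>

/* placeholders for whatever cache is being shrunk */
static int my_cache_count(void);	/* objects currently cached */
static void my_cache_free(int nr);	/* free up to nr objects */

static int my_cache_shrink(struct shrinker *shrink, int nr_to_scan,
			   gfp_t gfp_mask)
{
	if (nr_to_scan) {
		/* can't recurse into the filesystem from this context */
		if (!(gfp_mask & __GFP_FS))
			return -1;
		my_cache_free(nr_to_scan);
	}
	/* report how much is left so the VM can size the next scan */
	return my_cache_count();
}

static struct shrinker my_cache_shrinker = {
	.shrink	= my_cache_shrink,
	.seeks	= DEFAULT_SEEKS,
};

/* register_shrinker(&my_cache_shrinker) at init,
 * unregister_shrinker(&my_cache_shrinker) at teardown. */

If that callback blocks or never gets run, nothing else is going to trim
the cache.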
I really, really need to know if the XFS inode cache shrinker is getting
blocked or not running at all - do you have those sysrq-w traces from
when the machine is near OOM that I asked for a while back?

It may be that zone reclaim is simply fubar here, because slab cache
reclaim is proportional to the number of pages scanned on the LRU. With
most of the cached pages in the highmem zone, the lowmem zone scan only
scanned 516 pages, and I can't see that freeing many inodes (there are
>600,000 of them in memory) based on such a low page scan number. I've
appended a rough sketch of that calculation below my sig.

Maybe you should tweak /proc/sys/vm/vfs_cache_pressure to make it
reclaim VFS structures more rapidly (the default is 100; larger values
bias reclaim towards dentries and inodes). It might help, but I'm
starting to think that this problem is actually a VM zone reclaim
balance problem, not an XFS problem as such....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
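PS: the "proportional to pages scanned" sum, paraphrased from memory
rather than quoted from mm/vmscan.c, so treat the shape as approximate.
The lru_pages value below is a guess at the size of the (mostly highmem)
page cache, only there to show the order of magnitude:

#include <stdio.h>

int main(void)
{
	unsigned long long scanned   = 516;	/* pages_scanned in the lowmem zone */
	unsigned long long objects   = 600000;	/* cached XFS inodes, roughly */
	unsigned long long lru_pages = 200000;	/* ~800MB of LRU pages - assumed, not measured */
	unsigned long long seeks     = 2;	/* DEFAULT_SEEKS */

	/* shrink_slab()-style proportional scan target (paraphrased) */
	unsigned long long delta = (4 * scanned / seeks) * objects / (lru_pages + 1);

	printf("slab objects asked to be scanned: %llu\n", delta);	/* ~3000 */
	return 0;
}

Scanning ~3000 of >600,000 cached inodes per reclaim pass is never going
to empty the lowmem zone before the order-2 allocation gives up.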