From: Andrew Morton
Date: Mon, 21 Oct 2002 21:20:26 -0700
To: "Martin J. Bligh"
CC: Rik van Riel, linux-kernel, linux-mm mailing list
Subject: Re: ZONE_NORMAL exhaustion (dcache slab)
Message-ID: <3DB4D20A.8A579516@digeo.com>

"Martin J. Bligh" wrote:
>
> > I cannot make it happen here, either.  2.5.43-mm2 or current devel
> > stuff.  Heisenbug; maybe something broke dcache-rcu?  Or the math
> > overflow (unlikely).
>
> Dipankar is going to give me some debug code once he's slept for
> a while ... that should help see if dcache-rcu went wacko.

Well, if it doesn't happen again...

> >> So it looks as though it's actually ext2_inode_cache that's first
> >> against the wall.
> >
> > Well, that's to be expected.  Each ext2 directory inode has highmem
> > pagecache attached to it, which pins the inode.  There's no highmem
> > eviction pressure, so your normal zone gets stuffed full of inodes.
> >
> > There's a fix for this in Andrea's tree, although that's perhaps a
> > bit heavy on inode_lock for 2.5 purposes.  It's a matter of running
> > invalidate_inode_pages() against the inodes as they come off the
> > unused_list.  I haven't got around to it yet.
>
> Thanks; no urgent problem (though we did seem to have a customer hitting
> a very similar situation very easily in 2.4 ... we'll see if Andrea's
> fixes that, then I'll try to reproduce their problem on current 2.5).

Oh, it's reproducible OK.  Just run

	make-teeny-files 7 7

against a few filesystems and watch the fun:

	http://www.zip.com.au/~akpm/linux/patches/stuff/make-teeny-files.c

> >> larry:~# egrep '(dentry|inode)' /proc/slabinfo
> >> isofs_inode_cache        0       0   320      0      0   1 :  120   60
> >> ext2_inode_cache    667345  809181   416  89909  89909   1 :  120   60
> >> shmem_inode_cache        3       9   416      1      1   1 :  120   60
> >> sock_inode_cache        16      22   352      2      2   1 :  120   60
> >> proc_inode_cache        12      12   320      1      1   1 :  120   60
> >> inode_cache            385     396   320     33     33   1 :  120   60
> >> dentry_cache       1068289 1131096   160  47129  47129   1 :  248  124
> >
> > OK, so there's reasonable dentry shrinkage there, and the inodes
> > for regular files which have no attached pagecache were reaped.
> > But all the directory inodes are sitting there pinned.
>
> OK, this all makes a lot of sense ... apart from one thing,
> from looking at meminfo:
>
> HighTotal:     15335424 kB
> HighFree:      15066160 kB
>
> Even if every highmem page is pagecache, that's only 67316 pages by
> my reckoning (is pagecache broken out separately in meminfo?  Both
> Buffers and Cached seem too large).  If I only have 67316 pages of
> pagecache, how can I have 667345 inodes with attached pagecache pages?
> Or am I just missing something obvious and fundamental?

Maybe you didn't cat /dev/sda2 for long enough?

You should end up with very little dcache and tons of icache.
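(The used/total/occupancy figures below are just active_objs * objsize
and num_objs * objsize per cache, from /proc/slabinfo.  A rough userspace
sketch that prints them - assuming the 2.5-era field order of
name / active_objs / num_objs / objsize, which differs on later kernels -
would be something like:)

/* slab-occ.c: rough sketch, not from the original thread.
 * Prints "name usedKB totalKB percent" for each slab cache,
 * assuming the 2.5-era /proc/slabinfo layout:
 *   name  active_objs  num_objs  objsize  active_slabs  num_slabs ...
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/slabinfo", "r");
	char line[512], name[64];
	unsigned long active, total, objsize;

	if (!f) {
		perror("/proc/slabinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* skip the "slabinfo - version: ..." header and short lines */
		if (sscanf(line, "%63s %lu %lu %lu",
			   name, &active, &total, &objsize) != 4)
			continue;
		if (total == 0)
			continue;
		printf("%-24s %8luKB %8luKB %6.2f\n", name,
		       active * objsize / 1024,
		       total * objsize / 1024,
		       100.0 * active / total);
	}
	fclose(f);
	return 0;
}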
Here's what I get:

ext2_inode_cache:       420248KB   420256KB   99.99
buffer_head:             40422KB    41648KB   97.5
dentry_cache:              667KB    10211KB    6.54
biovec-BIO_MAX_PAGES:      768KB      780KB   98.46

Massive internal fragmentation of the dcache there.  But it takes
a long time.

Generally, I feel that the proportional-shrink on slab is applying
too much pressure when there's not much slab and too little when
there's a lot.  If you have 400 megs of inodes, I don't really think
they are likely to be used again soon.

Perhaps we need to multiply the slab cache scanning pressure by
the slab occupancy.  That's simple to do.
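Roughly, the idea is something like the toy calculation below.  This is
illustrative only - it is not the actual 2.5 shrink code, and the base
scan count and zone size are invented numbers:

/* occupancy-scale.c: toy demonstration of weighting slab scan pressure
 * by slab occupancy.  Not kernel code; the tunables here are made up.
 */
#include <stdio.h>

/* scale the proportional scan count by the cache's share of memory */
static unsigned long scaled_scan(unsigned long base_scan,
				 unsigned long slab_kb,
				 unsigned long total_kb)
{
	return base_scan * ((slab_kb * 100) / total_kb) / 100;
}

int main(void)
{
	const unsigned long total_kb = 896 * 1024;  /* roughly ZONE_NORMAL on ia32 */
	const unsigned long base_scan = 1000;       /* unweighted entries per pass */
	struct cache { const char *name; unsigned long kb; } caches[] = {
		{ "ext2_inode_cache", 420256 },     /* footprints from the table above */
		{ "dentry_cache",      10211 },
	};
	int i;

	for (i = 0; i < 2; i++)
		printf("%-18s plain=%lu  occupancy-scaled=%lu\n",
		       caches[i].name, base_scan,
		       scaled_scan(base_scan, caches[i].kb, total_kb));
	return 0;
}

The point is just the weighting: the 400MB inode cache ends up scanned
much harder than the mostly-empty dentry cache, instead of both getting
the same proportional treatment.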