Date: Mon, 3 Jan 2005 11:40:41 -0500 (EST)
From: Rik van Riel
To: Andrea Arcangeli
cc: Marcelo Tosatti, Andrew Morton, linux-kernel@vger.kernel.org, robert_hentosh@dell.com
Subject: Re: [PATCH][2/2] do not OOM kill if we skip writing many pages
In-Reply-To: <20050103162500.GX5164@dualathlon.random>
References: <20050102172929.GL5164@dualathlon.random> <20050103122241.GE29158@logos.cnet> <20050103162500.GX5164@dualathlon.random>

On Mon, 3 Jan 2005, Andrea Arcangeli wrote:

> On Mon, Jan 03, 2005 at 10:22:41AM -0200, Marcelo Tosatti wrote:
>> What are the details of the OOM kills (output, workload,
>> configuration, etc)?

The workload is a simple dd to a block device, on a system with
highmem.  The mapping for the block device can only be cached in
lowmem.

kernel: oom-killer: gfp_mask=0xd0
...
kernel: Free pages:      968016kB (966400kB HighMem)
kernel: Active:31932 inactive:185316 dirty:8 writeback:165518 unstable:0 free:242004 slab:55830 mapped:33266 pagetables:1135
kernel: DMA free:16kB min:16kB low:32kB high:48kB active:0kB inactive:9656kB present:16384kB
kernel: protections[]: 0 0 0
kernel: Normal free:1600kB min:936kB low:1872kB high:2808kB active:208kB inactive:653148kB present:901120kB
kernel: protections[]: 0 0 0
kernel: HighMem free:966400kB min:512kB low:1024kB high:1536kB active:127520kB inactive:78464kB present:1179584kB
kernel: protections[]: 0 0 0
...

If you run on a system with more highmem, you'll simply get an OOM
kill with more free highmem pages.  The only thing that lives in
highmem is the process code, which the VM is not scanning, for
obvious reasons.

>> Are these running 2.6.10-mm?

The latest rawhide kernel, with a few VM fixes, including all the
important ones that I could see from -mm.

Reading balance_dirty_pages, I do not understand how we could end up
with so many pages in writeback state and still continue writing out
more - surely we should have run out of dirty pages long ago and
stalled in blk_congestion_wait() until lots of IO had completed?

Why can we build up 660MB of pages in writeback state for a mapping
that can only live in the low 900MB of memory?

Yes, it has my patch 1/2 applied (lowering the dirty limit for
lowmem-only mappings)...

> And did they apply Con's patch? (i.e. my 3/4 I posted a few days ago)

Con's patch is not relevant for this bug, since there are so few
mapped pages (and those almost certainly live in highmem, which the
VM is not scanning).

-- 
"Debugging is twice as hard as writing the code in the first place.
 Therefore, if you write the code as cleverly as possible, you are,
 by definition, not smart enough to debug it." - Brian W. Kernighan
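To make the arithmetic behind the question above concrete, here is a
minimal user-space sketch, not the kernel's actual balance_dirty_pages
code: it plugs the numbers from the oom-killer log into a dirty limit
computed against lowmem only, in the spirit of patch 1/2 as described
above.  The helper names (lowmem_dirty_limit, should_throttle) and the
40% dirty ratio are illustrative assumptions, not values taken from
the patch or the running kernel.

/*
 * Illustrative sketch only.  Shows that ~660MB of writeback against a
 * ~900MB Normal zone is far past any plausible dirty limit, so the
 * writer "should" have been throttled long before the OOM kill.
 */
#include <stdio.h>
#include <stdbool.h>

#define PAGE_KB 4UL	/* 4kB pages, as on i386 */

/* Hypothetical helper: dirty limit for a mapping that can only be
 * cached in lowmem, computed against lowmem alone. */
static unsigned long lowmem_dirty_limit(unsigned long lowmem_kb,
					unsigned int dirty_ratio)
{
	return lowmem_kb * dirty_ratio / 100;
}

/* Should the writer stall (e.g. in something like
 * blk_congestion_wait()) before dirtying more pages? */
static bool should_throttle(unsigned long dirty_kb,
			    unsigned long writeback_kb,
			    unsigned long limit_kb)
{
	return dirty_kb + writeback_kb > limit_kb;
}

int main(void)
{
	unsigned long lowmem_kb    = 901120;		/* Normal zone present:901120kB */
	unsigned long dirty_kb     = 8 * PAGE_KB;	/* dirty:8 pages */
	unsigned long writeback_kb = 165518 * PAGE_KB;	/* writeback:165518 pages, ~660MB */
	unsigned long limit_kb     = lowmem_dirty_limit(lowmem_kb, 40);

	printf("dirty limit %lukB, dirty+writeback %lukB -> throttle: %s\n",
	       limit_kb, dirty_kb + writeback_kb,
	       should_throttle(dirty_kb, writeback_kb, limit_kb) ? "yes" : "no");
	return 0;
}

With these numbers the limit comes out around 360MB while
dirty+writeback is roughly 660MB, so the throttling check trips by a
wide margin; the open question in the mail is why the real kernel
nevertheless kept accepting writes instead of stalling.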