Date: Thu, 13 Sep 2007 11:32:26 -0700 (PDT)
From: Christoph Lameter
To: Peter Zijlstra
cc: Nick Piggin, Daniel Phillips, linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, dkegel@google.com, David Miller
Subject: Re: [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC)
In-Reply-To: <1189671552.21778.158.camel@twins>
References: <20070814142103.204771292@sgi.com> <200709050220.53801.phillips@phunq.net> <20070905114242.GA19938@wotan.suse.de> <1189594373.21778.114.camel@twins> <1189671552.21778.158.camel@twins>

On Thu, 13 Sep 2007, Peter Zijlstra wrote:

> > > Every user of memory relies on the VM, and we only get into trouble if
> > > the VM in turn relies on one of these users. Traditionally that has only
> > > been the block layer, and we special cased that using mempools and
> > > PF_MEMALLOC.
> > >
> > > Why do you object to me doing a similar thing for networking?
> >
> > I have not seen you using mempools for the networking layer. I would not
> > object to such a solution. It already exists for other subsystems.
>
> Dude, listen, how often do I have to say this: I cannot use mempools for
> the network subsystem because it's built on kmalloc! What I've done is
> build a replacement for mempools - a reserve system - that does work
> similar to mempools but also provides the flexibility of kmalloc.
>
> That is all, no more, no less.

It's different since it becomes a privileged player that can suck all the
available memory out of the page allocator.

> I'm confused by this; I've never claimed any such thing. All I'm saying
> is that because of the circular dependency between the VM and the IO
> subsystem used for swap (not file backed paging [*], just swap) you have
> to do something special to avoid deadlocks.

How are dirty file backed pages different? They may also be written out by
the VM during reclaim.

> > Replacing the mempools for the block layer sounds pretty good. But how do
> > these various subsystems that may live in different portions of the system
> > for various devices avoid global serialization and livelock through your
> > system?
>
> The reserves are spread over all kernel mapped zones, the slab allocator
> is still per cpu, the page allocator tries to get pages from the nearest
> node.

But it seems that you now have unbounded allocations with PF_MEMALLOC for
the networking case? So networking can exhaust all reserves?
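
For reference, PF_MEMALLOC essentially lets a task ignore the zone
watermarks and take the last free pages, which is why such allocations are
unbounded unless something else caps them. A condensed sketch of that
behaviour (simplified for illustration, not verbatim mm/page_alloc.c):

	/*
	 * Illustrative sketch: how a PF_MEMALLOC task gets watermark-free
	 * access to the page allocator. Condensed, not the literal kernel
	 * source.
	 */
	static int alloc_flags_for(struct task_struct *p, gfp_t gfp_mask)
	{
		int alloc_flags = ALLOC_WMARK_MIN;	/* normally respect watermarks */

		if (gfp_mask & __GFP_HIGH)
			alloc_flags |= ALLOC_HIGH;	/* may dig a bit deeper */

		/*
		 * A task that is itself doing reclaim (PF_MEMALLOC) may
		 * bypass the watermarks entirely and take the last free pages.
		 */
		if (p->flags & PF_MEMALLOC)
			alloc_flags |= ALLOC_NO_WATERMARKS;

		return alloc_flags;
	}

Without a per-subsystem cap on top of this, any PF_MEMALLOC user can drain
the entire reserve.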
> > And how is fairness addressed? I may want to run a fileserver on some
> > nodes and an HPC application that relies on a fiberchannel connection
> > on other nodes. How do we guarantee that the HPC application is not
> > impacted if the network services of the fileserver flood the system with
> > messages and exhaust memory?
>
> The network system reserves A pages, the block layer reserves B pages;
> once they start getting pages from the reserves they go bean counting,
> and once they reach their respective limit they stop.

That sounds good.
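
For illustration, the bean counting described above could be as simple as a
per-subsystem counter checked against its reserved share; the names below
are invented for this sketch and are not taken from the actual reserve
patches:

	/*
	 * Toy sketch of per-subsystem bean counting against a shared reserve.
	 * Hypothetical names, not the real reserve API.
	 */
	struct reserve_sketch {
		atomic_long_t	used;	/* pages currently taken from the reserve */
		long		limit;	/* pages this subsystem may take (A, B, ...) */
	};

	/* Returns true if the caller may take one more page from the reserve. */
	static bool reserve_charge(struct reserve_sketch *res)
	{
		if (atomic_long_add_return(1, &res->used) > res->limit) {
			atomic_long_dec(&res->used);
			return false;	/* over its share: stop here */
		}
		return true;
	}

	static void reserve_uncharge(struct reserve_sketch *res)
	{
		atomic_long_dec(&res->used);
	}

With each subsystem charged only against its own limit while it draws from
the reserves, a flood of network traffic from the fileserver could not eat
into the share set aside for the block layer or other users.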