Subject: Re: [patch 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc
From: Peter Zijlstra
To: "Ted Ts'o"
Cc: Dave Chinner, David Rientjes, Jens Axboe, Andrew Morton, Neil Brown,
    Alasdair G Kergon, Chris Mason, Steven Whitehouse, Jan Kara,
    Frederic Weisbecker, "linux-raid@vger.kernel.org",
    "linux-btrfs@vger.kernel.org", "cluster-devel@redhat.com",
    "linux-ext4@vger.kernel.org", "reiserfs-devel@vger.kernel.org",
    "linux-kernel@vger.kernel.org"
Date: Wed, 25 Aug 2010 23:35:25 +0200
Message-ID: <1282772125.1975.153.camel@laptop>
In-Reply-To: <20100825205342.GG4453@thunk.org>

On Wed, 2010-08-25 at 16:53 -0400, Ted Ts'o wrote:
> On Wed, Aug 25, 2010 at 03:35:42PM +0200, Peter Zijlstra wrote:
> >
> > While I appreciate that it might be somewhat (a lot) harder for a
> > filesystem to provide that guarantee, I'd be deeply worried about your
> > claim that it's impossible.
> >
> > It would render a system without swap very prone to deadlocks. Even with
> > the very tight dirty page accounting we currently have you can fill all
> > your memory with anonymous pages, at which point there's nothing free
> > and you require writeout of dirty pages to succeed.
>
> For file systems that do delayed allocation, the situation is very
> similar to swapping over NFS. Sometimes in order to make some free
> memory, you need to spend some free memory...

Which means you need to be able to compute a bounded amount of that
memory.

> which implies that for
> these file systems, being more aggressive about triggering writeout,
> and being more aggressive about throttling processes which are
> creating too many dirty pages, especially dirty delayed allocation
> pages (regardless of whether this is via write(2) or accessing mmapped
> memory), is a really good idea.

That seems unrelated, the VM has a strict dirty limit and controls
writeback when needed. That part works.

> A pool of free pages which is reserved for routines that are doing
> page cleaning would probably also be a good idea. Maybe that's just
> retrying with GFP_ATOMIC if a normal allocation fails, or maybe we
> need our own special pool, or maybe we need to dynamically resize the
> GFP_ATOMIC pool based on how many subsystems might need to use it....

We have a smallish reserve, accessible with PF_MEMALLOC, but its use is
not regulated nor bounded, it just mostly works good enough.
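
To make the "bounded amount" point concrete, here is a minimal userspace sketch of what a regulated reserve could look like. It is only an illustration and not kernel code: struct page_pool, struct writeout_ctx and the writeout_*() helpers are hypothetical names invented for this example, and pages are modelled as plain counters. The idea is that a writeout path declares its worst-case need up front, is admitted only if that need still fits in the reserve, and can then never take more than it declared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: pages are counters, not real memory. */
struct page_pool {
	size_t free_pages;   /* ordinary free pages, usable by anyone */
	size_t uncommitted;  /* reserve pages not yet promised to a writeout */
};

struct writeout_ctx {
	size_t budget;    /* worst-case reserve pages declared at begin */
	size_t taken;     /* reserve pages actually taken so far */
	size_t borrowed;  /* ordinary free pages taken so far */
};

/* Admit a writeout only if its declared worst case fits in the reserve. */
static bool writeout_begin(struct page_pool *pool, struct writeout_ctx *w,
			   size_t worst_case)
{
	if (worst_case > pool->uncommitted)
		return false;	/* caller must wait; the reserve cannot cover it */

	pool->uncommitted -= worst_case;
	w->budget = worst_case;
	w->taken = 0;
	w->borrowed = 0;
	return true;
}

/* Prefer ordinary memory; dip into the reserve only within the budget. */
static bool writeout_alloc_page(struct page_pool *pool, struct writeout_ctx *w)
{
	if (pool->free_pages > 0) {
		pool->free_pages--;
		w->borrowed++;
		return true;
	}
	if (w->taken < w->budget) {
		w->taken++;
		return true;
	}
	return false;	/* budget exhausted: the declared bound was wrong */
}

/* Return everything and credit the pages that the writeout cleaned. */
static void writeout_end(struct page_pool *pool, struct writeout_ctx *w,
			 size_t cleaned)
{
	assert(w->taken <= w->budget);

	pool->uncommitted += w->budget;
	pool->free_pages += w->borrowed + cleaned;
	w->budget = 0;
	w->taken = 0;
	w->borrowed = 0;
}
```

With a reserve of 4 pages and no ordinary free memory, a writeout that declares a worst case of 3 is admitted and can allocate exactly 3 pages; a second one declaring 2 is refused until the first finishes. That is the difference from an unregulated reserve: forward progress depends on an admission check rather than on the reserve happening to be large enough.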