To: Chris Mason
Cc: Mel Gorman, Dave Chinner, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] mm: disallow direct reclaim page writeback
From: Andi Kleen
References: <1271117878-19274-1-git-send-email-david@fromorbit.com>
 <20100413095815.GU25756@csn.ul.ie> <20100413111902.GY2493@dastard>
 <20100413193428.GI25756@csn.ul.ie> <20100413202021.GZ13327@think>
 <877hoa9wlv.fsf@basil.nowhere.org> <20100414112015.GO13327@think>
Date: Wed, 14 Apr 2010 14:15:16 +0200
In-Reply-To: <20100414112015.GO13327@think> (Chris Mason's message of
 "Wed, 14 Apr 2010 07:20:15 -0400")
Message-ID: <8739yy9qnf.fsf@basil.nowhere.org>

Chris Mason writes:
>>
>> Basically if you cannot tolerate 1K (or more likely more) of stack
>> used before your fs is called you're toast in lots of other situations
>> anyways.
>
> Well, on a 4K stack kernel, 832 bytes is a very large percentage for
> just one function.

To be honest I think 4K stack simply has to go. I tend to call it
"Russian roulette" mode.

It was just an old workaround for a very old buggy VM that couldn't free
8K pages, and the VM is a lot better at that now. And the general trend
is to more complex code everywhere, so 4K stacks become more and more
hazardous.

It was a bad idea back then and is still a bad idea, getting worse and
worse with each MLOC being added to the kernel each year.
We don't have any good ways to verify that obscure paths through the
growing number of subsystems won't exceed it (in fact I'm pretty sure
there are plenty of problems in exotic configurations). And even if you
can make a specific load work there's basically no safety net.

The only part of the 4K stack code that's good is the separate interrupt
stack, but that one should just be combined with a sane 8K process
stack.

But yes, on a 4K kernel you probably don't want to do any direct reclaim.
Maybe force GFP_NOFS everywhere except user allocations when it's set?
Or simply drop it?

> But they don't realize their function can dive down into ecryptfs then
> the filesystem then maybe loop and then perhaps raid6 on top of a
> network block device.

Those stackings need to use separate threads anyways. A lot of them do
in fact. Block avoided this problem by iterating instead of recursing.
Those that still recurse on the same stack simply need to be fixed.

> Yeah, but since the call chain does eventually go into the allocator,
> this function needs to be more stack friendly.

For common fast paths it doesn't go into the allocator.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
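
The "iterating instead of recursing" point can be illustrated with a small userspace sketch. This is not the block layer's code; the names device, request, pending and submit are hypothetical, and the fixed-size queue is an assumption made to keep the example short. The idea is that a stacked layer queues the remapped request for the device below it rather than calling down directly, and a single loop at the top drains the queue, so stack depth does not grow with the number of stacked layers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* assumed bound on queued requests per submission */

/* Hypothetical stacked device: remaps a sector and hands it to 'lower'. */
struct device {
    const struct device *lower;  /* NULL for the bottom device */
    long offset;                 /* remapping applied by this layer */
};

struct request {
    const struct device *dev;
    long sector;
};

/* Pending list that replaces the call chain of the recursive version. */
struct pending {
    struct request items[MAX_PENDING];
    size_t head, tail;
};

static void pending_push(struct pending *q, struct request r)
{
    assert(q->tail < MAX_PENDING);
    q->items[q->tail++] = r;
}

/* One layer's handler: it never calls the lower layer itself. */
static void handle(struct pending *q, struct request r, long *result)
{
    long remapped = r.sector + r.dev->offset;

    if (r.dev->lower == NULL) {
        *result = remapped;  /* bottom device: request is "completed" */
        return;
    }
    pending_push(q, (struct request){ r.dev->lower, remapped });
}

/* Top-level submission: one loop, one handler frame at a time. */
long submit(const struct device *top, long sector)
{
    struct pending q = { .head = 0, .tail = 0 };
    long result = -1;

    pending_push(&q, (struct request){ top, sector });
    while (q.head < q.tail)
        handle(&q, q.items[q.head++], &result);
    return result;
}
```

With three stacked devices (for example crypt on raid on disk) the handler runs three times from the same loop, instead of nesting three frames deep. The cost is the queue itself, which in this sketch lives on the caller's stack and would need to be sized or placed elsewhere in a stack-constrained setting.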