Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1761095AbXETFWs (ORCPT ); Sun, 20 May 2007 01:22:48 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1758601AbXETFWk (ORCPT ); Sun, 20 May 2007 01:22:40 -0400
Received: from ns.suse.de ([195.135.220.2]:60187 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1758214AbXETFWj (ORCPT ); Sun, 20 May 2007 01:22:39 -0400
Date: Sun, 20 May 2007 07:22:29 +0200
From: Nick Piggin
To: William Lee Irwin III
Cc: Christoph Lameter, Linux Kernel Mailing List,
	Linux Memory Management List, linux-arch@vger.kernel.org
Subject: Re: [rfc] increase struct page size?!
Message-ID: <20070520052229.GA9372@wotan.suse.de>
References: <20070518040854.GA15654@wotan.suse.de>
	<20070519012530.GB15569@wotan.suse.de>
	<20070519181501.GC19966@holomorphy.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20070519181501.GC19966@holomorphy.com>
User-Agent: Mutt/1.5.9i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3118
Lines: 78

On Sat, May 19, 2007 at 11:15:01AM -0700, William Lee Irwin III wrote:
> On Fri, May 18, 2007 at 11:14:26AM -0700, Christoph Lameter wrote:
> >> Right. That would simplify the calculations.
>
> On Sat, May 19, 2007 at 03:25:30AM +0200, Nick Piggin wrote:
> > It isn't the calculations I'm worried about, although they'll get simpler
> > too. It is the cache cost.
>
> The cache cost argument is specious. Even misaligned, smaller is
> smaller.

Of course smaller is smaller ;) Why would that make the cache cost
argument specious?

> The cache footprint reduction is merely amortized,
> probabilistic, etc.

I don't really know what you mean by this, or what part of my cache cost
argument you disagree with...
I think it is that you could construct mem_map access patterns, without
specifically looking at alignment, where a 56 byte struct page would suffer
about 75% more cache misses than a 64 byte aligned one (and you could also
get about 12% fewer cache misses with other access patterns).

I also think the kernel's mem_map access patterns would be more on the
random side, so overall would result in significantly fewer cache misses
with 64 byte aligned struct pages.

Which part do you disagree with?

> On Fri, May 18, 2007 at 11:14:26AM -0700, Christoph Lameter wrote:
> >> I wonder if there are other uses for the free space?
>
> On Sat, May 19, 2007 at 03:25:30AM +0200, Nick Piggin wrote:
> > Hugh points out that we should make _count and _mapcount atomic_long_t's,
> > which would probably be a better use of the space once your vmemmap goes
> > in.
>
> I'm not so sure about that. I doubt we have issues with that. I say

The issue is that userspace can DoS or crash the kernel by deliberately
overflowing count or mapcount.

> if there's to be padding to 64B to use the whole of the additional
> space for additional flag bits. I'm sure fs's could make good use of
> 64 spare flag bits, or whatever's left over after the VM has its fill.
> Perhaps so many spare flag bits could be used in lieu of buffer_heads.

Really? 64-bit architectures can already use about 16 or 32 more page
flag bits than 32-bit architectures, and I definitely do not want to
increase the size of 32-bit struct page, so I think this wouldn't work.

> page->virtual is the same old mistake as it was when it was removed.
> The virtual mem_map code should be used to resolve the computational

Don't get too hung up on the page->virtual thing. I'll send another patch
with atomic_t/atomic_long_t conversion.

> expense. Much the same holds for the atomic_t's; 32 + PAGE_SHIFT is
> 44 bits or more, about as much as is possible, and one reference per
> page per page is not even feasible.
> Full-length atomic_t's are just
> not necessary.

I don't know what your 32 + PAGE_SHIFT calculation is for, but yes you can
wrap these counters from userspace on 64-bit architectures.
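
The "about 75% more" and "about 12% fewer" cache miss figures in this
message can be reproduced with a rough calculation. The sketch below is
only an illustration, not anything from the patch. It assumes 64 byte
cache lines, a mem_map array that starts on a cache line boundary, and
accesses that touch the whole struct page, which makes the random-access
figure a worst case. The function names are made up for this example.

```python
from math import gcd

CACHE_LINE = 64  # assumed cache line size in bytes


def lines_touched(index, size, line=CACHE_LINE):
    """Cache lines spanned by array element `index` of `size` bytes."""
    start = index * size
    end = start + size - 1
    return end // line - start // line + 1


def random_access_cost(size, line=CACHE_LINE):
    """Average lines touched per element for uniformly random accesses."""
    # Start offsets within a cache line repeat after `period` elements.
    period = line // gcd(size, line)
    return sum(lines_touched(i, size, line) for i in range(period)) / period


def sequential_cost(size, line=CACHE_LINE):
    """Average lines fetched per element when walking the array in order."""
    period = line // gcd(size, line)
    lines_in_period = period * size // line  # a period covers whole lines
    return lines_in_period / period


for size in (56, 64):
    print(size, random_access_cost(size), sequential_cost(size))
```

Under these assumptions, 6 of every 8 consecutive 56 byte structs straddle
two cache lines, so a random access touches 1.75 lines on average against
1.0 for the 64 byte aligned layout. A sequential walk fetches 0.875 lines
per struct against 1.0, which is where the 12% saving comes from.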