Date: Tue, 20 Apr 2004 16:49:37 +0200
From: Andrea Arcangeli <andrea@suse.de>
To: Andrew Morton <akpm@osdl.org>
Cc: Manfred Spraul <manfred@colorfullife.com>, agruen@suse.de,
       linux-kernel@vger.kernel.org
Subject: Re: slab-alignment-rework.patch in -mc
Message-ID: <20040420144937.GG29954@dualathlon.random>
References: <1082383751.6746.33.camel@f235.suse.de> <20040419162533.GR29954@dualathlon.random> <4084017C.5080706@colorfullife.com> <20040420002423.469cca01.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20040420002423.469cca01.akpm@osdl.org>
User-Agent: Mutt/1.4.1i
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1883
Lines: 35

On Tue, Apr 20, 2004 at 12:24:23AM -0700, Andrew Morton wrote:
> So I do think that we should either make "align=0" translate to "pack them
> densely" or do the big sweep across all kmem_cache_create() callsites.

agreed.

> If the latter, while we're there, let's remove SLAB_HWCACHE_ALIGN where it
> isn't obviously appropriate.  I'd imagine that being able to fit more inodes
> into memory is a net win over the occasional sharing effect, for example.

One warning here: false sharing isn't the only reason for hw
alignment. Structures like inodes are often coded with the fields that
are used at the same time packed together in the same cacheline. In
practice this can reduce the cache footprint to 1 cacheline instead of
2 cachelines at runtime (even if there's no false sharing at all
because the structure is much bigger than the l1 cacheline size anyway).
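
To make the packing argument concrete, here is a minimal userspace
sketch (not kernel code). The 64-byte line size, the 200-byte
struct demo_obj and its 48 bytes of hot fields are all assumptions
picked for illustration; none of them come from a real kernel structure.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64u	/* assumed l1 line size, for illustration only */

/* Hypothetical object: hot fields grouped at the front, cold data after. */
struct demo_obj {
	unsigned char hot[48];		/* fields used together at runtime */
	unsigned char cold[152];	/* rarely touched fields */
};

/* Number of cache lines covered by the byte range [start, start + len). */
static size_t lines_touched(size_t start, size_t len)
{
	assert(len > 0);
	return (start + len - 1) / CACHE_LINE - start / CACHE_LINE + 1;
}

/* Lines touched by the hot fields of object 'idx' when objects are
 * laid out back to back every 'stride' bytes from an aligned base. */
static size_t hot_lines(size_t idx, size_t stride)
{
	size_t start = idx * stride + offsetof(struct demo_obj, hot);

	return lines_touched(start, sizeof(((struct demo_obj *)0)->hot));
}

/* How many of the first 'n' objects need two lines for their hot fields. */
static size_t straddlers(size_t n, size_t stride)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (hot_lines(i, stride) > 1)
			count++;
	return count;
}
```

With these assumed sizes, straddlers(16, sizeof(struct demo_obj)) finds
10 of 16 densely packed objects whose hot fields spill into a second
line, while straddlers(16, 256) finds none once each object starts on a
line boundary.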

So the hardware alignment should be removed with care, looking at the
layout of the structures and evaluating whether we're losing cacheline
packing. For example the task_struct definitely must be fully l1
aligned, not because of false sharing issues, which are probably
nonexistent in the task struct anyway, but because the most important
fields in the task struct are packed to maximize the cache utilization
at runtime.

For small 12-byte things that include locks, like anon-vma, false
sharing is the biggest issue (but it still isn't worth l1 aligning it
in the anon-vma case). For buffer headers and task_structs, the
cacheline packing provided by the l1 alignment of the structure is the
primary reason for wanting an l1 alignment. Each case should be
evaluated separately.
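
The memory side of that per-case evaluation can be estimated with a
rough sketch like the one below. It is userspace arithmetic only: it
assumes every object is rounded up to a full multiple of the alignment
and it ignores slab bookkeeping, so the real allocator will give
somewhat different numbers. The helper names are made up for this
example.

```c
#include <assert.h>
#include <stddef.h>

/* Round 'size' up to the next multiple of 'align' (a power of two). */
static size_t round_up_pow2(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Objects that fit in one page of 'page_size' bytes at the given
 * alignment, ignoring any per-slab management overhead. */
static size_t objs_per_page(size_t obj_size, size_t align, size_t page_size)
{
	return page_size / round_up_pow2(obj_size, align);
}
```

Under these assumptions a 12-byte object goes from 341 per 4096-byte
page at 4-byte alignment to 64 per page at 64-byte alignment, while a
hypothetical 200-byte object only drops from 20 to 16. That difference
is part of why a small object and a large one with packed hot fields
can deserve different answers.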