From: Miklos Szeredi
Date: Wed, 22 Oct 2008 21:46:03 +0200
To: cl@linux-foundation.org
CC: miklos@szeredi.hu, penberg@cs.helsinki.fi, nickpiggin@yahoo.com.au,
    hugh@veritas.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: SLUB defrag pull request?
In-reply-to: (message from Christoph Lameter on Wed, 22 Oct 2008 08:42:25 -0700 (PDT))

On Wed, 22 Oct 2008, Christoph Lameter wrote:
> On Wed, 22 Oct 2008, Miklos Szeredi wrote:
>
> > On Tue, 21 Oct 2008, Christoph Lameter wrote:
> >> The only way that a secure reference can be established is if the
> >> slab page is locked. That requires a spinlock. The slab allocator
> >> calls the get() functions while the slab lock guarantees object
> >> existence. Then locks are dropped and reclaim actions can start with
> >> the guarantee that the slab object will not suddenly vanish.
> >
> > Yes, you've made up your mind, that you want to do it this way.  But
> > it's the _wrong_ way, this "want to get a secure reference for use
> > later" leads to madness when applied to dentries or inodes.  Try for a
> > minute to think outside this template.
> >
> > For example dcache_lock will protect against dentries moving to/from
> > d_lru.  So you can do this:
> >
> >  take dcache_lock
> >  check if d_lru is non-empty
>
> The dentry could have been freed even before we take the dcache_lock. We
> cannot access d_lru without a stable reference to the dentry.

Why?  The kmem_cache_free() doesn't touch the contents of the object,
does it?

> >  take sb->s_umount
> >  free dentry
> >  release sb->s_umount
> >  release dcache_lock
> >
> > Yeah, locking will be more complicated in reality.  Still, much less
> > complicated than trying to do the same across two separate phases.
> >
> > Why can't something like that work?
>
> Because the slab starts out with a series of objects left in a slab. It
> needs to do build a list of objects etc in a way that is independent as
> possible from the user of the slab page. It does that by locking the slab
> page so that free operations stall until the reference has been
> established. If it would not be shutting off frees then the objects could
> vanish under us.

It doesn't matter.  All we care about is that the dentry is on the
lru: it's cached but unused.  Every other state (being created,
active, being freed, freed) is uninteresting.

> We could also avoid frees by calling some cache specific method that locks
> out frees before and after. But then frees would stall everywhere and
> every slab cache would have to check a global lock before freeing objects
> (there would be numerous complications with RCU free etc etc).
>
> Slab defrag only stops frees on a particular slab page.
>
> The slab defrag approach also allows the slab cache (dentry or inodes
> here) to do something else than free the object. It would be possible f.e.
> to move the object by allocating a new entry and moving the information to
> the new dentry. That would actually be better since it would preserve the
> objects and just move them into the same slab page.

Sure, and all that is possible without doing this messy 2 phase thing.
Unless I'm still missing something obvious...

Miklos
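
For illustration only, here is a minimal userspace sketch of the
single-pass sequence discussed in the message above.  Every name in it
(toy_dentry, toy_slab_page, toy_dcache_lock, toy_lru_add,
toy_defrag_page) is hypothetical and is not a kernel API.  A pthread
mutex stands in for dcache_lock, an on_lru flag stands in for "d_lru is
non-empty", and the sb->s_umount step is left out.  It assumes, as the
message suggests, that freeing a slot leaves the object's contents
readable, so the flag can be checked without first taking a reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS_PER_PAGE 4

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct toy_dentry {
    bool allocated;  /* slot is currently handed out by the toy allocator */
    bool on_lru;     /* models "d_lru is non-empty": cached but unused */
};

/* One toy slab page holding a fixed number of object slots. */
struct toy_slab_page {
    struct toy_dentry slot[SLOTS_PER_PAGE];
};

/* Models dcache_lock: serializes every move to or from the LRU. */
static pthread_mutex_t toy_dcache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Put an allocated, unused object on the LRU. */
static void toy_lru_add(struct toy_dentry *d)
{
    pthread_mutex_lock(&toy_dcache_lock);
    assert(d->allocated && !d->on_lru);
    d->on_lru = true;
    pthread_mutex_unlock(&toy_dcache_lock);
}

/*
 * Free an object; the caller must hold toy_dcache_lock.  The LRU flag is
 * cleared before the slot is released, and releasing the slot does not
 * scribble over the object (assumption of this sketch), so a later
 * on_lru check on a freed slot simply reads false.
 */
static void toy_free_locked(struct toy_dentry *d)
{
    d->on_lru = false;
    d->allocated = false;
}

/*
 * Single-pass reclaim of one page: take the lock, look at every slot,
 * free the ones that are on the LRU, skip everything else (active,
 * being set up, or already freed).  Returns the number of objects freed.
 */
static int toy_defrag_page(struct toy_slab_page *page)
{
    int freed = 0;

    pthread_mutex_lock(&toy_dcache_lock);
    for (size_t i = 0; i < SLOTS_PER_PAGE; i++) {
        struct toy_dentry *d = &page->slot[i];

        if (!d->on_lru)
            continue;
        toy_free_locked(d);
        freed++;
    }
    pthread_mutex_unlock(&toy_dcache_lock);
    return freed;
}
```

The sketch only shows the shape of the argument: the decision to
reclaim is made and acted on under one lock hold, rather than taking
references in one phase and acting on them in a second.  It says
nothing about RCU frees or the real lock ordering, which the thread
notes would be more complicated.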