Date: Wed, 14 Nov 2012 11:25:40 -0500
From: Konrad Rzeszutek Wilk
To: Bob Liu
Cc: Seth Jennings, Dan Magenheimer, devel@linuxdriverproject.org,
	linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
	linux-mm@kvack.org, ngupta@vflare.org, minchan@kernel.org,
	fschmaus@gmail.com, andor.daam@googlemail.com, ilendir@googlemail.com,
	akpm@linux-foundation.org, mgorman@suse.de
Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem
	backends to build/run as modules
Message-ID: <20121114162540.GA28650@localhost.localdomain>
References: <1351696074-29362-1-git-send-email-dan.magenheimer@oracle.com>
	<1351696074-29362-3-git-send-email-dan.magenheimer@oracle.com>
	<50915A5C.8000303@linux.vnet.ibm.com>
	<20121102182749.GB30100@konrad-lan.dumpdata.com>

> On Sat, Nov 3, 2012 at 2:27 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Oct 31, 2012 at 12:05:32PM -0500, Seth Jennings wrote:
> >> On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
> >> > With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> >> > built/loaded as modules rather than built-in and enabled by a boot parameter,
> >> > this patch provides "lazy initialization", allowing backends to register to
> >> > frontswap even after swapon was run. Before a backend registers all calls
> >> > to init are recorded and the creation of tmem_pools delayed until a backend
> >> > registers or until a frontswap put is attempted.
> >> >
> >> > Signed-off-by: Stefan Hengelein
> >> > Signed-off-by: Florian Schmaus
> >> > Signed-off-by: Andor Daam
> >> > Signed-off-by: Dan Magenheimer
> >> > ---
> >> >  include/linux/frontswap.h |    1 +
> >> >  mm/frontswap.c            |   70 +++++++++++++++++++++++++++++++++++++++-----
> >> >  2 files changed, 63 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
> >> > index 3044254..ef6ada6 100644
> >> > --- a/include/linux/frontswap.h
> >> > +++ b/include/linux/frontswap.h
> >> > @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
> >> >  extern void frontswap_tmem_exclusive_gets(bool);
> >> >
> >> >  extern void __frontswap_init(unsigned type);
> >> > +#define FRONTSWAP_HAS_LAZY_INIT
> >> >  extern int __frontswap_store(struct page *page);
> >> >  extern int __frontswap_load(struct page *page);
> >> >  extern void __frontswap_invalidate_page(unsigned, pgoff_t);
> >> > diff --git a/mm/frontswap.c b/mm/frontswap.c
> >> > index 2890e67..523a19b 100644
> >> > --- a/mm/frontswap.c
> >> > +++ b/mm/frontswap.c
> >> > @@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
> >> >  static inline void inc_frontswap_failed_stores(void) { }
> >> >  static inline void inc_frontswap_invalidates(void) { }
> >> >  #endif
> >> > +
> >> > +/*
> >> > + * When no backend is registered all calls to init are registered and
> >> > + * remembered but fail to create tmem_pools. When a backend registers with
> >> > + * frontswap the previous calls to init are executed to create tmem_pools
> >> > + * and set the respective poolids.
> >> > + * While no backend is registered all "puts", "gets" and "flushes" are
> >> > + * ignored or fail.
> >> > + */
> >> > +#define MAX_INITIALIZABLE_SD 32
> >>
> >> MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES
> >>
> >> > +static int sds[MAX_INITIALIZABLE_SD];
> >>
> >> Rather than store and array of enabled types indexed by type, why not
> >> an array of booleans indexed by type. Or a bitfield if you really
> >> want to save space.
> >>
> >> > +static int backend_registered;
> >>
> >> (backend_registered) is equivalent to checking (frontswap_ops != NULL)
> >> right?

Kind of. frontswap_ops is not a pointer though, so it would be more of
a frontswap_ops != dummy. Let's make another patch that makes this a
pointer and then rip out the backend_registered.

.. snip..
> >  	if (sis->frontswap_map == NULL)
> >  		return;
> > -	frontswap_ops.init(type);
> > +	if (backend_registered) {
> > +		(*frontswap_ops.init)(type);
> > +		set_bit(type, sds);
> > +	}
> >  }
>
> What about set bit if backend not registered and clear bit when invalidate.
> I think that looks more directly.
> Like:
> +	if (backend_registered) {
> +		BUG_ON(sis == NULL);
> +		if (sis->frontswap_map == NULL)
> +			return;
> +		frontswap_ops.init(type);
> +	}
> +	else {
> +		BUG_ON(type > MAX_SWAPFILES);
> +		set_bit(type, sds);
> +	}

Good idea.

> >
> >  EXPORT_SYMBOL(__frontswap_init);
> >
> > @@ -147,10 +169,20 @@ int __frontswap_store(struct page *page)
> >  	struct swap_info_struct *sis = swap_info[type];
> >  	pgoff_t offset = swp_offset(entry);
> >
> > +	if (!backend_registered) {
> > +		inc_frontswap_failed_stores();
> > +		return ret;
> > +	}
> > +
> >  	BUG_ON(!PageLocked(page));
> >  	BUG_ON(sis == NULL);
> >  	if (frontswap_test(sis, offset))
> >  		dup = 1;
> > +	if (type < MAX_SWAPFILES && !test_bit(type, sds)) {
> > +		/* lazy init call to handle post-boot insmod backends*/
> > +		(*frontswap_ops.init)(type);
> > +		set_bit(type, sds);
> > +	}
>
> Then rm this.

Right, b/c the frontswap_init takes care of initializing the backend.
And this does not get called _until_ backend_registered is set. So we
have to be extra careful to set backend_registered _after_ all the
frontswap.init have been called.

>
> >  	ret = frontswap_ops.store(type, offset, page);
> >  	if (ret == 0) {
> >  		frontswap_set(sis, offset);
> > @@ -186,6 +218,9 @@ int __frontswap_load(struct page *page)
> >  	struct swap_info_struct *sis = swap_info[type];
> >  	pgoff_t offset = swp_offset(entry);
> >
> > +	if (!backend_registered)
> > +		return ret;
> > +
> >  	BUG_ON(!PageLocked(page));
> >  	BUG_ON(sis == NULL);
> >  	if (frontswap_test(sis, offset))
> > @@ -209,6 +244,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> >  {
> >  	struct swap_info_struct *sis = swap_info[type];
> >
> > +	if (!backend_registered)
> > +		return;
> > +
>
> I'm not sure whether __frontswap_invalidate_page() will be called if
> backend not registered.

Yes. User could do:

	swapon /dev/sda3
	swapoff /dev/sda3
	modprobe zcache

>
> >  	BUG_ON(sis == NULL);
> >  	if (frontswap_test(sis, offset)) {
> >  		frontswap_ops.invalidate_page(type, offset);
> > @@ -226,12 +264,16 @@ void __frontswap_invalidate_area(unsigned type)
> >  {
> >  	struct swap_info_struct *sis = swap_info[type];
> >
> > -	BUG_ON(sis == NULL);
> > -	if (sis->frontswap_map == NULL)
> > -		return;
> > -	frontswap_ops.invalidate_area(type);
> > -	atomic_set(&sis->frontswap_pages, 0);
> > -	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > +	if (backend_registered) {
> > +		BUG_ON(sis == NULL);
> > +		if (sis->frontswap_map == NULL)
> > +			return;
> > +		(*frontswap_ops.invalidate_area)(type);
> > +		atomic_set(&sis->frontswap_pages, 0);
> > +		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > +	} else {
> > +		bitmap_zero(sds, MAX_SWAPFILES);
>
> Use clear_bit(type, sds) here;

Yikes. Yes.
It actually could be unconditional too.

>
> > +	}
> >  }
> >  EXPORT_SYMBOL(__frontswap_invalidate_area);
> >
> > @@ -364,6 +406,9 @@ static int __init init_frontswap(void)
> >  	debugfs_create_u64("invalidates", S_IRUGO,
> >  				root, &frontswap_invalidates);
> >  #endif
> > +	bitmap_zero(sds, MAX_SWAPFILES);
> > +
> > +	frontswap_enabled = 1;
>
> We'd better init backend_registered = false also.

I think we are OK. The .bss is set to zero, so backend_registered is
zero by default.

The end result would look like this (I have not compile-tested it yet):

From a13ed2c85b220c62035ab7ac79ad8a62f9f29c13 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer
Date: Wed, 31 Oct 2012 08:07:51 -0700
Subject: [PATCH] mm: frontswap: lazy initialization to allow tmem backends to
 build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon was run. Before a backend registers all calls
to init are recorded and the creation of tmem_pools delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein
Signed-off-by: Florian Schmaus
Signed-off-by: Andor Daam
Signed-off-by: Dan Magenheimer
[v1: Fixes per Seth Jennings suggestions]
[v2: Removed FRONTSWAP_HAS_..]
[v3: Fix up per Bob Liu recommendations]
Signed-off-by: Konrad Rzeszutek Wilk
---
 mm/frontswap.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..db90736 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
 static inline void inc_frontswap_failed_stores(void) { }
 static inline void inc_frontswap_invalidates(void) { }
 #endif
+
+/*
+ * When no backend is registered all calls to init are registered and
+ * remembered but fail to create tmem_pools. When a backend registers with
+ * frontswap the previous calls to init are executed to create tmem_pools
+ * and set the respective poolids.
+ * While no backend is registered all "puts", "gets" and "flushes" are
+ * ignored or fail.
+ */
+static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
@@ -87,9 +99,19 @@ static inline void inc_frontswap_invalidates(void) { }
 struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
 {
 	struct frontswap_ops old = frontswap_ops;
+	int i;

 	frontswap_ops = *ops;
 	frontswap_enabled = true;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		if (test_and_clear_bit(i, need_init))
+			(*frontswap_ops.init)(i);
+	}
+	/* We MUST have backend_registered set _after_ the frontswap_init's
+	 * have been called. Otherwise __frontswap_store might fail. */
+	barrier();
+	backend_registered = true;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -119,10 +141,17 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];

-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.init(type);
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.init)(type);
+	}
+	else {
+		BUG_ON(type >= MAX_SWAPFILES);
+		set_bit(type, need_init);
+	}
+
 }
 EXPORT_SYMBOL(__frontswap_init);

@@ -147,6 +176,11 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);

+	if (!backend_registered) {
+		inc_frontswap_failed_stores();
+		return ret;
+	}
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -186,6 +220,9 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);

+	if (!backend_registered)
+		return ret;
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -209,6 +246,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];

+	if (!backend_registered)
+		return;
+
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
 		frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +266,15 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];

-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.invalidate_area(type);
-	atomic_set(&sis->frontswap_pages, 0);
-	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.invalidate_area)(type);
+		atomic_set(&sis->frontswap_pages, 0);
+		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	}
+	clear_bit(type, need_init);
 }
 EXPORT_SYMBOL(__frontswap_invalidate_area);

@@ -364,6 +407,9 @@ static int __init init_frontswap(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &frontswap_invalidates);
 #endif
+	bitmap_zero(need_init, MAX_SWAPFILES);
+
+	frontswap_enabled = 1;
 	return 0;
 }

-- 
1.7.7.6
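
For anyone who wants to look at the bookkeeping in isolation, here is a rough
userspace sketch of the lazy-init scheme discussed in this thread. It is not
the kernel code. The names model_init, model_invalidate_area,
model_register_ops, model_store, struct backend_ops and demo_init are made up
for illustration, the value of MAX_SWAPFILES is assumed, a plain bool array
stands in for the kernel bitmap, and assert() stands in for BUG_ON(). It is
single-threaded, so it says nothing about the ordering that barrier() is meant
to address.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SWAPFILES 32	/* assumed value, for the sketch only */

/* Hypothetical, cut-down backend interface: only the init hook. */
struct backend_ops {
	void (*init)(unsigned type);
};

static struct backend_ops ops;
static bool need_init[MAX_SWAPFILES];	/* stands in for the kernel bitmap */
static bool backend_registered;

/* Demo backend hook: counts how often each type gets initialized. */
int init_calls[MAX_SWAPFILES];
void demo_init(unsigned type)
{
	init_calls[type]++;
}

/* swapon path: init right away if a backend exists, else remember it. */
void model_init(unsigned type)
{
	assert(type < MAX_SWAPFILES);	/* stands in for BUG_ON() */
	if (backend_registered)
		ops.init(type);
	else
		need_init[type] = true;
}

/* swapoff path: a pending init for this type is no longer wanted. */
void model_invalidate_area(unsigned type)
{
	assert(type < MAX_SWAPFILES);
	need_init[type] = false;
}

/* Backend load: replay the recorded inits, then flip the flag. */
void model_register_ops(const struct backend_ops *new_ops)
{
	unsigned i;

	ops = *new_ops;
	for (i = 0; i < MAX_SWAPFILES; i++) {
		if (need_init[i]) {
			need_init[i] = false;
			ops.init(i);
		}
	}
	/* Set only after the replay, so stores never see an uninitialized type. */
	backend_registered = true;
}

/* Store path: fails (-1) until a backend has registered. */
int model_store(unsigned type)
{
	assert(type < MAX_SWAPFILES);
	if (!backend_registered)
		return -1;
	return 0;
}
```

With this model, calling model_init(0) before any backend exists and then
model_register_ops() with a backend whose init hook is demo_init should lead
to exactly one demo_init(0) call, while a type that went through
model_invalidate_area() before registration is not replayed. That mirrors the
swapon / swapoff / modprobe sequence mentioned above.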