2012-10-31 15:08:27

by Dan Magenheimer

Subject: [PATCH 0/5] enable all tmem backends to be built and loaded as modules

Since various parts of transcendent memory ("tmem") [1] were first posted in
2009, reviewers have suggested that the various tmem features should be built
as modules and enabled by loading the module, rather than by the current
clunky method of compiling them as built-ins and enabling them via boot
parameters. Due to certain tmem initialization steps, that was not feasible
at the time.

[1] http://lwn.net/Articles/454795/

This patchset allows each of the three merged transcendent memory
backends (zcache, ramster, Xen tmem) to be used as modules by first
enabling transcendent memory frontends (cleancache, frontswap) to deal
with "lazy initialization" and, second, by adding the necessary code for
the backends to be built and loaded as modules.

The original mechanism to enable tmem backends -- namely to hardwire
them into the kernel and select/enable one with a kernel boot
parameter -- is retained but should be considered deprecated. When
backends are loaded as modules, certain knobs will now be
properly selected via module_params rather than via undocumented
kernel boot parameters. Note that module UNloading is not yet
supported as it is lower priority and will require significant
additional work.

The lazy initialization support is necessary because filesystems
and swap devices are normally mounted early in boot and these
activities normally trigger tmem calls to set up certain data structures;
if the respective cleancache/frontswap ops are not yet registered
by a backend, the tmem setup fails for these devices and
cleancache/frontswap would never be enabled for them, which limits
much of the value of tmem in many system configurations. Lazy
initialization records the necessary information in cleancache/frontswap
data structures and "replays" it after the ops are registered,
ensuring that all filesystems and swap devices can benefit from
the loaded tmem backend.
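
In pseudo-C, the record-and-replay idea looks roughly like this (an
illustrative sketch only; the names below are placeholders, not the
actual symbols used by patches 1 and 2):

/*
 * Sketch of lazy initialization: record init calls that arrive before
 * any backend has registered, then replay them at registration time.
 */
#define MAX_PENDING 32

struct backend_ops {
	void (*init)(int id);
};

static struct backend_ops backend_ops;
static int pending_init[MAX_PENDING];	/* ids seen before a backend registered */
static int nr_pending;
static int backend_present;

/* hook called early in boot when a filesystem or swap device comes up */
void frontend_init_hook(int id)
{
	if (!backend_present) {
		if (nr_pending < MAX_PENDING)
			pending_init[nr_pending++] = id;	/* record */
		return;
	}
	backend_ops.init(id);
}

/* called when a tmem backend (possibly a module) finally registers */
void frontend_register_backend(struct backend_ops *ops)
{
	int i;

	backend_ops = *ops;
	backend_present = 1;
	for (i = 0; i < nr_pending; i++)
		backend_ops.init(pending_init[i]);	/* replay */
}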

Patches 1 and 2 are the original [2] patches to cleancache and frontswap
proposed by Erlangen University, but rebased to 3.7-rcN plus a couple
of bug fixes I found necessary for them to run properly. I have not
attempted any code cleanup. I have also added defines to ensure at runtime
that backends are not loaded as modules if the frontend patches are not
yet merged; this is useful to avoid any build dependency (since the
frontends may be merged into linux-next through different trees and
at different times than some backends) and, once the entire patchset
is safely merged, these defines/ifdefs can be removed.

[2] http://www.spinics.net/lists/linux-mm/msg31490.html
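
For reference, the compile-time guard (condensed from the check added to
zcache2 in patch 3; patch 5 does the same for the Xen shim) looks like:

#if defined(CONFIG_ZCACHE2_MODULE) && !defined(FRONTSWAP_HAS_LAZY_INIT)
	/* frontend lazy-init support not merged yet: refuse to run as a module */
	if (!disable_frontswap) {
		pr_err("zcache module requires frontswap lazy init\n");
		return -EINVAL;
	}
#endif

FRONTSWAP_HAS_LAZY_INIT (and the analogous CLEANCACHE_HAS_LAZY_INIT) are
the defines added by patches 1 and 2.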

Patch 3 enables module support for zcache2. Zsmalloc support
has not yet been merged into zcache2 but, once merged, could then
easily be selected via a module_param.

Patch 4 enables module support for ramster. Ramster will now be
enabled with a module_param to zcache2.

Patch 5 enables module support for the Xen tmem shim. Xen
self-ballooning and frontswap-selfshrinking are also "lazily"
initialized when the Xen tmem shim is loaded as a module, unless
explicitly disabled by module_params.

Signed-off-by: Dan Magenheimer <[email protected]>

---
Diffstat:

drivers/staging/ramster/Kconfig | 6 +-
drivers/staging/ramster/Makefile | 11 +-
drivers/staging/ramster/ramster.h | 6 +-
drivers/staging/ramster/ramster/nodemanager.c | 9 +-
drivers/staging/ramster/ramster/ramster.c | 29 +++-
drivers/staging/ramster/ramster/ramster.h | 2 +-
.../staging/ramster/ramster/ramster_nodemanager.h | 2 +
drivers/staging/ramster/tmem.c | 6 +-
drivers/staging/ramster/tmem.h | 8 +-
drivers/staging/ramster/zcache-main.c | 61 +++++++-
drivers/staging/ramster/zcache.h | 2 +-
drivers/xen/Kconfig | 4 +-
drivers/xen/tmem.c | 56 ++++++--
drivers/xen/xen-selfballoon.c | 13 +-
include/linux/cleancache.h | 1 +
include/linux/frontswap.h | 1 +
include/xen/tmem.h | 8 +
mm/cleancache.c | 157 +++++++++++++++++--
mm/frontswap.c | 70 ++++++++-
19 files changed, 379 insertions(+), 73 deletions(-)


2012-10-31 15:08:29

by Dan Magenheimer

Subject: [PATCH 1/5] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register with
cleancache even after filesystems have been mounted. Calls to init_fs and
init_shared_fs are remembered via fake poolids, but no real tmem_pools are
created. On backend registration the fake poolids are mapped to real poolids
and their respective tmem_pools.

Signed-off-by: Stefan Hengelein <[email protected]>
Signed-off-by: Florian Schmaus <[email protected]>
Signed-off-by: Andor Daam <[email protected]>
Signed-off-by: Dan Magenheimer <[email protected]>
---
include/linux/cleancache.h | 1 +
mm/cleancache.c | 157 +++++++++++++++++++++++++++++++++++++++-----
2 files changed, 141 insertions(+), 17 deletions(-)

diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 42e55de..f7e32f0 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -37,6 +37,7 @@ extern struct cleancache_ops
cleancache_register_ops(struct cleancache_ops *ops);
extern void __cleancache_init_fs(struct super_block *);
extern void __cleancache_init_shared_fs(char *, struct super_block *);
+#define CLEANCACHE_HAS_LAZY_INIT
extern int __cleancache_get_page(struct page *);
extern void __cleancache_put_page(struct page *);
extern void __cleancache_invalidate_page(struct address_space *, struct page *);
diff --git a/mm/cleancache.c b/mm/cleancache.c
index 32e6f41..29430b7 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -45,15 +45,42 @@ static u64 cleancache_puts;
static u64 cleancache_invalidates;

/*
+ * When no backend is registered all calls to init_fs and init_shared_fs
+ * are registered and fake poolids are given to the respective
+ * super block but no tmem_pools are created. When a backend
+ * registers with cleancache the previous calls to init_fs and
+ * init_shared_fs are executed to create tmem_pools and set the
+ * respective poolids. While no backend is registered all "puts",
+ * "gets" and "flushes" are ignored or fail.
+ */
+#define MAX_INITIALIZABLE_FS 32
+#define FAKE_FS_POOLID_OFFSET 1000
+#define FAKE_SHARED_FS_POOLID_OFFSET 2000
+static int fs_poolid_map[MAX_INITIALIZABLE_FS];
+static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
+static char *uuids[MAX_INITIALIZABLE_FS];
+static int backend_registered;
+
+/*
* register operations for cleancache, returning previous thus allowing
* detection of multiple backends and possible nesting
*/
struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
{
struct cleancache_ops old = cleancache_ops;
+ int i;

cleancache_ops = *ops;
- cleancache_enabled = 1;
+
+ backend_registered = 1;
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (fs_poolid_map[i] == -1)
+ fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+ if (shared_fs_poolid_map[i] == -1)
+ shared_fs_poolid_map[i] =
+ (*cleancache_ops.init_shared_fs)
+ (uuids[i], PAGE_SIZE);
+ }
return old;
}
EXPORT_SYMBOL(cleancache_register_ops);
@@ -61,15 +88,42 @@ EXPORT_SYMBOL(cleancache_register_ops);
/* Called by a cleancache-enabled filesystem at time of mount */
void __cleancache_init_fs(struct super_block *sb)
{
- sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
+ int i;
+
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (fs_poolid_map[i] == -2) {
+ sb->cleancache_poolid =
+ i + FAKE_FS_POOLID_OFFSET;
+ if (backend_registered)
+ fs_poolid_map[i] =
+ (*cleancache_ops.init_fs)(PAGE_SIZE);
+ else
+ fs_poolid_map[i] = -1;
+ break;
+ }
+ }
}
EXPORT_SYMBOL(__cleancache_init_fs);

/* Called by a cleancache-enabled clustered filesystem at time of mount */
void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
{
- sb->cleancache_poolid =
- (*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
+ int i;
+
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (shared_fs_poolid_map[i] == -2) {
+ sb->cleancache_poolid =
+ i + FAKE_SHARED_FS_POOLID_OFFSET;
+ uuids[i] = uuid;
+ if (backend_registered)
+ shared_fs_poolid_map[i] =
+ (*cleancache_ops.init_shared_fs)
+ (uuid, PAGE_SIZE);
+ else
+ shared_fs_poolid_map[i] = -1;
+ break;
+ }
+ }
}
EXPORT_SYMBOL(__cleancache_init_shared_fs);

@@ -99,6 +153,19 @@ static int cleancache_get_key(struct inode *inode,
}

/*
+ * Returns a pool_id that is associated with a given fake poolid.
+ */
+static int get_poolid_from_fake(int fake_pool_id)
+{
+ if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
+ return shared_fs_poolid_map[fake_pool_id -
+ FAKE_SHARED_FS_POOLID_OFFSET];
+ else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
+ return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
+ return -1;
+}
+
+/*
* "Get" data from cleancache associated with the poolid/inode/index
* that were specified when the data was put to cleanache and, if
* successful, use it to fill the specified page with data and return 0.
@@ -109,17 +176,26 @@ int __cleancache_get_page(struct page *page)
{
int ret = -1;
int pool_id;
+ int fake_pool_id;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered) {
+ cleancache_failed_gets++;
+ goto out;
+ }
+
VM_BUG_ON(!PageLocked(page));
- pool_id = page->mapping->host->i_sb->cleancache_poolid;
- if (pool_id < 0)
+ fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ if (fake_pool_id < 0)
goto out;
+ pool_id = get_poolid_from_fake(fake_pool_id);

if (cleancache_get_key(page->mapping->host, &key) < 0)
goto out;

- ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
+ if (pool_id >= 0)
+ ret = (*cleancache_ops.get_page)(pool_id,
+ key, page->index, page);
if (ret == 0)
cleancache_succ_gets++;
else
@@ -138,12 +214,23 @@ EXPORT_SYMBOL(__cleancache_get_page);
void __cleancache_put_page(struct page *page)
{
int pool_id;
+ int fake_pool_id;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered) {
+ cleancache_puts++;
+ return;
+ }
+
VM_BUG_ON(!PageLocked(page));
- pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ if (fake_pool_id < 0)
+ return;
+
+ pool_id = get_poolid_from_fake(fake_pool_id);
+
if (pool_id >= 0 &&
- cleancache_get_key(page->mapping->host, &key) >= 0) {
+ cleancache_get_key(page->mapping->host, &key) >= 0) {
(*cleancache_ops.put_page)(pool_id, key, page->index, page);
cleancache_puts++;
}
@@ -158,14 +245,22 @@ void __cleancache_invalidate_page(struct address_space *mapping,
struct page *page)
{
/* careful... page->mapping is NULL sometimes when this is called */
- int pool_id = mapping->host->i_sb->cleancache_poolid;
+ int pool_id;
+ int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
struct cleancache_filekey key = { .u.key = { 0 } };

- if (pool_id >= 0) {
+ if (!backend_registered)
+ return;
+
+ if (fake_pool_id >= 0) {
+ pool_id = get_poolid_from_fake(fake_pool_id);
+ if (pool_id < 0)
+ return;
+
VM_BUG_ON(!PageLocked(page));
if (cleancache_get_key(mapping->host, &key) >= 0) {
(*cleancache_ops.invalidate_page)(pool_id,
- key, page->index);
+ key, page->index);
cleancache_invalidates++;
}
}
@@ -179,9 +274,18 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
*/
void __cleancache_invalidate_inode(struct address_space *mapping)
{
- int pool_id = mapping->host->i_sb->cleancache_poolid;
+ int pool_id;
+ int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered)
+ return;
+
+ if (fake_pool_id < 0)
+ return;
+
+ pool_id = get_poolid_from_fake(fake_pool_id);
+
if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
(*cleancache_ops.invalidate_inode)(pool_id, key);
}
@@ -194,16 +298,30 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
*/
void __cleancache_invalidate_fs(struct super_block *sb)
{
- if (sb->cleancache_poolid >= 0) {
- int old_poolid = sb->cleancache_poolid;
- sb->cleancache_poolid = -1;
- (*cleancache_ops.invalidate_fs)(old_poolid);
+ int index;
+ int fake_pool_id = sb->cleancache_poolid;
+ int old_poolid = fake_pool_id;
+
+ if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
+ index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
+ old_poolid = shared_fs_poolid_map[index];
+ shared_fs_poolid_map[index] = -2;
+ uuids[index] = NULL;
+ } else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
+ index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
+ old_poolid = fs_poolid_map[index];
+ fs_poolid_map[index] = -2;
}
+ sb->cleancache_poolid = -1;
+ if (backend_registered)
+ (*cleancache_ops.invalidate_fs)(old_poolid);
}
EXPORT_SYMBOL(__cleancache_invalidate_fs);

static int __init init_cleancache(void)
{
+ int i;
+
#ifdef CONFIG_DEBUG_FS
struct dentry *root = debugfs_create_dir("cleancache", NULL);
if (root == NULL)
@@ -215,6 +333,11 @@ static int __init init_cleancache(void)
debugfs_create_u64("invalidates", S_IRUGO,
root, &cleancache_invalidates);
#endif
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ fs_poolid_map[i] = -2;
+ shared_fs_poolid_map[i] = -2;
+ }
+ cleancache_enabled = 1;
return 0;
}
module_init(init_cleancache)
--
1.7.1

2012-10-31 15:08:34

by Dan Magenheimer

Subject: [PATCH 3/5] staging: zcache2+ramster: enable zcache2 to be built/loaded as a module

Allow zcache2 to be built/loaded as a module. Note that a runtime dependency
disallows loading if the cleancache/frontswap lazy initialization patches
are not present. Zsmalloc support has not yet been merged into zcache2
but, once merged, could then easily be selected via a module_param.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <[email protected]>
---
drivers/staging/ramster/Kconfig | 6 ++--
drivers/staging/ramster/Makefile | 11 +++---
drivers/staging/ramster/tmem.c | 6 +++-
drivers/staging/ramster/tmem.h | 8 ++--
drivers/staging/ramster/zcache-main.c | 61 ++++++++++++++++++++++++++++++--
drivers/staging/ramster/zcache.h | 2 +-
6 files changed, 76 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/ramster/Kconfig b/drivers/staging/ramster/Kconfig
index 843c541..fd3e767 100644
--- a/drivers/staging/ramster/Kconfig
+++ b/drivers/staging/ramster/Kconfig
@@ -1,5 +1,5 @@
config ZCACHE2
- bool "Dynamic compression of swap pages and clean pagecache pages"
+ tristate "Dynamic compression of swap pages and clean pagecache pages"
depends on CRYPTO=y && SWAP=y && CLEANCACHE && FRONTSWAP && !ZCACHE
select CRYPTO_LZO
default n
@@ -16,8 +16,8 @@ config ZCACHE2
version of ramster.

config RAMSTER
- bool "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
- depends on CONFIGFS_FS=y && SYSFS=y && !HIGHMEM && ZCACHE2=y
+ tristate "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
+ depends on CONFIGFS_FS && SYSFS && !HIGHMEM && ZCACHE2
# must ensure struct page is 8-byte aligned
select HAVE_ALIGNED_STRUCT_PAGE if !64_BIT
default n
diff --git a/drivers/staging/ramster/Makefile b/drivers/staging/ramster/Makefile
index 2d8b9d0..fcb25cb 100644
--- a/drivers/staging/ramster/Makefile
+++ b/drivers/staging/ramster/Makefile
@@ -1,6 +1,7 @@
zcache-y := zcache-main.o tmem.o zbud.o
-zcache-$(CONFIG_RAMSTER) += ramster/ramster.o ramster/r2net.o
-zcache-$(CONFIG_RAMSTER) += ramster/nodemanager.o ramster/tcp.o
-zcache-$(CONFIG_RAMSTER) += ramster/heartbeat.o ramster/masklog.o
-
-obj-$(CONFIG_ZCACHE2) += zcache.o
+ifneq (,$(filter $(CONFIG_RAMSTER),y m))
+zcache-y += ramster/ramster.o ramster/r2net.o
+zcache-y += ramster/nodemanager.o ramster/tcp.o
+zcache-y += ramster/heartbeat.o ramster/masklog.o
+endif
+obj-$(CONFIG_MODULES) += zcache.o
diff --git a/drivers/staging/ramster/tmem.c b/drivers/staging/ramster/tmem.c
index a2b7e03..d7e51e4 100644
--- a/drivers/staging/ramster/tmem.c
+++ b/drivers/staging/ramster/tmem.c
@@ -35,7 +35,8 @@
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/atomic.h>
-#ifdef CONFIG_RAMSTER
+#include <linux/export.h>
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
#include <linux/delay.h>
#endif

@@ -641,6 +642,7 @@ void *tmem_localify_get_pampd(struct tmem_pool *pool, struct tmem_oid *oidp,
/* note, hashbucket remains locked */
return pampd;
}
+EXPORT_SYMBOL_GPL(tmem_localify_get_pampd);

void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
void *pampd, void *saved_hb, bool delete)
@@ -658,6 +660,7 @@ void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
}
spin_unlock(&hb->lock);
}
+EXPORT_SYMBOL_GPL(tmem_localify_finish);

/*
* For ramster only. Helper function to support asynchronous tmem_get.
@@ -719,6 +722,7 @@ out:
spin_unlock(&hb->lock);
return ret;
}
+EXPORT_SYMBOL_GPL(tmem_replace);
#endif

/*
diff --git a/drivers/staging/ramster/tmem.h b/drivers/staging/ramster/tmem.h
index adbe5a8..d128ce2 100644
--- a/drivers/staging/ramster/tmem.h
+++ b/drivers/staging/ramster/tmem.h
@@ -126,7 +126,7 @@ static inline unsigned tmem_oid_hash(struct tmem_oid *oidp)
TMEM_HASH_BUCKET_BITS);
}

-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
struct tmem_xhandle {
uint8_t client_id;
uint8_t xh_data_cksum;
@@ -171,7 +171,7 @@ struct tmem_obj {
unsigned int objnode_tree_height;
unsigned long objnode_count;
long pampd_count;
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
/*
* for current design of ramster, all pages belonging to
* an object reside on the same remotenode and extra is
@@ -215,7 +215,7 @@ struct tmem_pamops {
uint32_t);
void (*free)(void *, struct tmem_pool *,
struct tmem_oid *, uint32_t, bool);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
void (*new_obj)(struct tmem_obj *);
void (*free_obj)(struct tmem_pool *, struct tmem_obj *, bool);
void *(*repatriate_preload)(void *, struct tmem_pool *,
@@ -247,7 +247,7 @@ extern int tmem_flush_page(struct tmem_pool *, struct tmem_oid *,
extern int tmem_flush_object(struct tmem_pool *, struct tmem_oid *);
extern int tmem_destroy_pool(struct tmem_pool *);
extern void tmem_new_pool(struct tmem_pool *, uint32_t);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
extern int tmem_replace(struct tmem_pool *, struct tmem_oid *, uint32_t index,
void *);
extern void *tmem_localify_get_pampd(struct tmem_pool *, struct tmem_oid *,
diff --git a/drivers/staging/ramster/zcache-main.c b/drivers/staging/ramster/zcache-main.c
index a09dd5c..956d16d 100644
--- a/drivers/staging/ramster/zcache-main.c
+++ b/drivers/staging/ramster/zcache-main.c
@@ -31,8 +31,10 @@
#include "ramster.h"
#ifdef CONFIG_RAMSTER
static int ramster_enabled;
+static int disable_frontswap_selfshrink;
#else
#define ramster_enabled 0
+#define disable_frontswap_selfshrink 0
#endif

#ifndef __PG_WAS_ACTIVE
@@ -68,8 +70,12 @@ static char *namestr __read_mostly = "zcache";
MODULE_LICENSE("GPL");

/* crypto API for zcache */
+#ifdef CONFIG_ZCACHE2_MODULE
+static char *zcache_comp_name = "lzo";
+#else
#define ZCACHE_COMP_NAME_SZ CRYPTO_MAX_ALG_NAME
static char zcache_comp_name[ZCACHE_COMP_NAME_SZ] __read_mostly;
+#endif
static struct crypto_comp * __percpu *zcache_comp_pcpu_tfms __read_mostly;

enum comp_op {
@@ -1346,7 +1352,7 @@ static int zcache_local_new_pool(uint32_t flags)
int zcache_autocreate_pool(unsigned int cli_id, unsigned int pool_id, bool eph)
{
struct tmem_pool *pool;
- struct zcache_client *cli;
+ struct zcache_client *cli = NULL;
uint32_t flags = eph ? 0 : TMEM_POOL_PERSIST;
int ret = -1;

@@ -1640,6 +1646,7 @@ struct frontswap_ops zcache_frontswap_register_ops(void)
* OR NOTHING HAPPENS!
*/

+#ifndef CONFIG_ZCACHE2_MODULE
static int __init enable_zcache(char *s)
{
zcache_enabled = 1;
@@ -1706,18 +1713,27 @@ static int __init enable_zcache_compressor(char *s)
return 1;
}
__setup("zcache=", enable_zcache_compressor);
+#endif


-static int __init zcache_comp_init(void)
+static int zcache_comp_init(void)
{
int ret = 0;

/* check crypto algorithm */
+#ifdef CONFIG_ZCACHE2_MODULE
+ ret = crypto_has_comp(zcache_comp_name, 0, 0);
+ if (!ret) {
+ ret = -1;
+ goto out;
+ }
+#else
if (*zcache_comp_name != '\0') {
ret = crypto_has_comp(zcache_comp_name, 0, 0);
if (!ret)
pr_info("zcache: %s not supported\n",
zcache_comp_name);
+ goto out;
}
if (!ret)
strcpy(zcache_comp_name, "lzo");
@@ -1726,6 +1742,7 @@ static int __init zcache_comp_init(void)
ret = 1;
goto out;
}
+#endif
pr_info("zcache: using %s compressor\n", zcache_comp_name);

/* alloc percpu transforms */
@@ -1737,10 +1754,27 @@ out:
return ret;
}

-static int __init zcache_init(void)
+static int zcache_init(void)
{
int ret = 0;

+#ifdef CONFIG_ZCACHE2_MODULE
+#if defined(CONFIG_FRONTSWAP) && !defined(FRONTSWAP_HAS_LAZY_INIT)
+ /* This ifdef can be removed when frontswap lazy init is merged */
+ if (!disable_frontswap) {
+ pr_err("zcache module requires frontswap lazy init\n");
+ return -EINVAL;
+ }
+#endif
+#if defined(CONFIG_CLEANCACHE) && !defined(CLEANCACHE_HAS_LAZY_INIT)
+ /* This ifdef can be removed when cleancache lazy init is merged */
+ if (!disable_cleancache) {
+ pr_err("zcache module requires cleancache lazy init\n");
+ return -EINVAL;
+ }
+#endif
+ zcache_enabled = 1;
+#endif
if (ramster_enabled) {
namestr = "ramster";
ramster_register_pamops(&zcache_pamops);
@@ -1812,9 +1846,28 @@ static int __init zcache_init(void)
}
if (ramster_enabled)
ramster_init(!disable_cleancache, !disable_frontswap,
- frontswap_has_exclusive_gets);
+ frontswap_has_exclusive_gets,
+ !disable_frontswap_selfshrink);
out:
return ret;
}

+#ifdef CONFIG_ZCACHE2_MODULE
+#ifdef CONFIG_RAMSTER
+module_param(ramster_enabled, int, S_IRUGO);
+module_param(disable_frontswap_selfshrink, int, S_IRUGO);
+#endif
+module_param(disable_cleancache, int, S_IRUGO);
+module_param(disable_frontswap, int, S_IRUGO);
+#ifdef FRONTSWAP_HAS_EXCLUSIVE_GETS
+module_param(frontswap_has_exclusive_gets, bool, S_IRUGO);
+#endif
+module_param(disable_frontswap_ignore_nonactive, int, S_IRUGO);
+module_param(zcache_comp_name, charp, S_IRUGO);
+module_init(zcache_init);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <[email protected]>");
+MODULE_DESCRIPTION("In-kernel compression of cleancache/frontswap pages");
+#else
late_initcall(zcache_init);
+#endif
diff --git a/drivers/staging/ramster/zcache.h b/drivers/staging/ramster/zcache.h
index 81722b3..8491200 100644
--- a/drivers/staging/ramster/zcache.h
+++ b/drivers/staging/ramster/zcache.h
@@ -39,7 +39,7 @@ extern int zcache_flush_page(int, int, struct tmem_oid *, uint32_t);
extern int zcache_flush_object(int, int, struct tmem_oid *);
extern void zcache_decompress_to_page(char *, unsigned int, struct page *);

-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
extern void *zcache_pampd_create(char *, unsigned int, bool, int,
struct tmem_handle *);
int zcache_autocreate_pool(unsigned int cli_id, unsigned int pool_id, bool eph);
--
1.7.1

2012-10-31 15:09:47

by Dan Magenheimer

Subject: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register with
frontswap even after swapon has been run. Before a backend registers, all calls
to init are recorded and the creation of tmem_pools is delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <[email protected]>
Signed-off-by: Florian Schmaus <[email protected]>
Signed-off-by: Andor Daam <[email protected]>
Signed-off-by: Dan Magenheimer <[email protected]>
---
include/linux/frontswap.h | 1 +
mm/frontswap.c | 70 +++++++++++++++++++++++++++++++++++++++-----
2 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 3044254..ef6ada6 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
extern void frontswap_tmem_exclusive_gets(bool);

extern void __frontswap_init(unsigned type);
+#define FRONTSWAP_HAS_LAZY_INIT
extern int __frontswap_store(struct page *page);
extern int __frontswap_load(struct page *page);
extern void __frontswap_invalidate_page(unsigned, pgoff_t);
diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..523a19b 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
static inline void inc_frontswap_failed_stores(void) { }
static inline void inc_frontswap_invalidates(void) { }
#endif
+
+/*
+ * When no backend is registered all calls to init are registered and
+ * remembered but fail to create tmem_pools. When a backend registers with
+ * frontswap the previous calls to init are executed to create tmem_pools
+ * and set the respective poolids.
+ * While no backend is registered all "puts", "gets" and "flushes" are
+ * ignored or fail.
+ */
+#define MAX_INITIALIZABLE_SD 32
+static int sds[MAX_INITIALIZABLE_SD];
+static int backend_registered;
+
/*
* Register operations for frontswap, returning previous thus allowing
* detection of multiple backends and possible nesting.
@@ -87,9 +100,16 @@ static inline void inc_frontswap_invalidates(void) { }
struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
{
struct frontswap_ops old = frontswap_ops;
+ int i;

frontswap_ops = *ops;
frontswap_enabled = true;
+
+ backend_registered = 1;
+ for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
+ if (sds[i] != -1)
+ (*frontswap_ops.init)(sds[i]);
+ }
return old;
}
EXPORT_SYMBOL(frontswap_register_ops);
@@ -122,7 +142,10 @@ void __frontswap_init(unsigned type)
BUG_ON(sis == NULL);
if (sis->frontswap_map == NULL)
return;
- frontswap_ops.init(type);
+ if (backend_registered) {
+ (*frontswap_ops.init)(type);
+ sds[type] = type;
+ }
}
EXPORT_SYMBOL(__frontswap_init);

@@ -147,10 +170,20 @@ int __frontswap_store(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered) {
+ inc_frontswap_failed_stores();
+ return ret;
+ }
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
dup = 1;
+ if (type < MAX_INITIALIZABLE_SD && sds[type] == -1) {
+ /* lazy init call to handle post-boot insmod backends*/
+ (*frontswap_ops.init)(type);
+ sds[type] = type;
+ }
ret = frontswap_ops.store(type, offset, page);
if (ret == 0) {
frontswap_set(sis, offset);
@@ -186,6 +219,9 @@ int __frontswap_load(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered)
+ return ret;
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
@@ -209,6 +245,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
{
struct swap_info_struct *sis = swap_info[type];

+ if (!backend_registered)
+ return;
+
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset)) {
frontswap_ops.invalidate_page(type, offset);
@@ -225,13 +264,23 @@ EXPORT_SYMBOL(__frontswap_invalidate_page);
void __frontswap_invalidate_area(unsigned type)
{
struct swap_info_struct *sis = swap_info[type];
-
- BUG_ON(sis == NULL);
- if (sis->frontswap_map == NULL)
- return;
- frontswap_ops.invalidate_area(type);
- atomic_set(&sis->frontswap_pages, 0);
- memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ int i;
+
+ if (backend_registered) {
+ BUG_ON(sis == NULL);
+ if (sis->frontswap_map == NULL)
+ return;
+ (*frontswap_ops.invalidate_area)(type);
+ atomic_set(&sis->frontswap_pages, 0);
+ memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ } else {
+ for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
+ if (sds[i] == type) {
+ sds[i] = -1;
+ break;
+ }
+ }
+ }
}
EXPORT_SYMBOL(__frontswap_invalidate_area);

@@ -353,6 +402,7 @@ EXPORT_SYMBOL(frontswap_curr_pages);

static int __init init_frontswap(void)
{
+ int i;
#ifdef CONFIG_DEBUG_FS
struct dentry *root = debugfs_create_dir("frontswap", NULL);
if (root == NULL)
@@ -364,6 +414,10 @@ static int __init init_frontswap(void)
debugfs_create_u64("invalidates", S_IRUGO,
root, &frontswap_invalidates);
#endif
+ for (i = 0; i < MAX_INITIALIZABLE_SD; i++)
+ sds[i] = -1;
+
+ frontswap_enabled = 1;
return 0;
}

--
1.7.1

2012-10-31 15:09:58

by Dan Magenheimer

Subject: [PATCH 4/5] staging: zcache2+ramster: enable ramster to be built/loaded as a module

Enable module support for ramster. Note that a runtime dependency disallows
loading if the cleancache/frontswap lazy initialization patches are not
present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <[email protected]>
---
drivers/staging/ramster/ramster.h | 6 +++-
drivers/staging/ramster/ramster/nodemanager.c | 9 +++---
drivers/staging/ramster/ramster/ramster.c | 29 ++++++++++++++++---
drivers/staging/ramster/ramster/ramster.h | 2 +-
.../staging/ramster/ramster/ramster_nodemanager.h | 2 +
5 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/ramster/ramster.h b/drivers/staging/ramster/ramster.h
index 1b71aea..e1f91d5 100644
--- a/drivers/staging/ramster/ramster.h
+++ b/drivers/staging/ramster/ramster.h
@@ -11,10 +11,14 @@
#ifndef _ZCACHE_RAMSTER_H_
#define _ZCACHE_RAMSTER_H_

+#ifdef CONFIG_RAMSTER_MODULE
+#define CONFIG_RAMSTER
+#endif
+
#ifdef CONFIG_RAMSTER
#include "ramster/ramster.h"
#else
-static inline void ramster_init(bool x, bool y, bool z)
+static inline void ramster_init(bool x, bool y, bool z, bool w)
{
}

diff --git a/drivers/staging/ramster/ramster/nodemanager.c b/drivers/staging/ramster/ramster/nodemanager.c
index c0f4815..2cfe933 100644
--- a/drivers/staging/ramster/ramster/nodemanager.c
+++ b/drivers/staging/ramster/ramster/nodemanager.c
@@ -949,7 +949,7 @@ static void __exit exit_r2nm(void)
r2hb_exit();
}

-static int __init init_r2nm(void)
+int r2nm_init(void)
{
int ret = -1;

@@ -986,10 +986,11 @@ out_r2hb:
out:
return ret;
}
+EXPORT_SYMBOL_GPL(r2nm_init);

MODULE_AUTHOR("Oracle");
MODULE_LICENSE("GPL");

-/* module_init(init_r2nm) */
-late_initcall(init_r2nm);
-/* module_exit(exit_r2nm) */
+#ifndef CONFIG_RAMSTER_MODULE
+late_initcall(r2nm_init);
+#endif
diff --git a/drivers/staging/ramster/ramster/ramster.c b/drivers/staging/ramster/ramster/ramster.c
index c06709f..491ec70 100644
--- a/drivers/staging/ramster/ramster/ramster.c
+++ b/drivers/staging/ramster/ramster/ramster.c
@@ -92,7 +92,7 @@ static unsigned long ramster_remote_page_flushes_failed;
#include <linux/debugfs.h>
#define zdfs debugfs_create_size_t
#define zdfs64 debugfs_create_u64
-static int __init ramster_debugfs_init(void)
+static int ramster_debugfs_init(void)
{
struct dentry *root = debugfs_create_dir("ramster", NULL);
if (root == NULL)
@@ -191,6 +191,7 @@ int ramster_do_preload_flnode(struct tmem_pool *pool)
kmem_cache_free(ramster_flnode_cache, flnode);
return ret;
}
+EXPORT_SYMBOL_GPL(ramster_do_preload_flnode);

/*
* Called by the message handler after a (still compressed) page has been
@@ -458,6 +459,7 @@ void *ramster_pampd_free(void *pampd, struct tmem_pool *pool,
}
return local_pampd;
}
+EXPORT_SYMBOL_GPL(ramster_pampd_free);

void ramster_count_foreign_pages(bool eph, int count)
{
@@ -489,6 +491,7 @@ void ramster_count_foreign_pages(bool eph, int count)
ramster_foreign_pers_pages = c;
}
}
+EXPORT_SYMBOL_GPL(ramster_count_foreign_pages);

/*
* For now, just push over a few pages every few seconds to
@@ -674,7 +677,7 @@ requeue:
ramster_remotify_queue_delayed_work(HZ);
}

-void __init ramster_remotify_init(void)
+void ramster_remotify_init(void)
{
unsigned long n = 60UL;
ramster_remotify_workqueue =
@@ -849,8 +852,10 @@ static bool frontswap_selfshrinking __read_mostly;
static void selfshrink_process(struct work_struct *work);
static DECLARE_DELAYED_WORK(selfshrink_worker, selfshrink_process);

+#ifndef CONFIG_RAMSTER_MODULE
/* Enable/disable with kernel boot option. */
static bool use_frontswap_selfshrink __initdata = true;
+#endif

/*
* The default values for the following parameters were deemed reasonable
@@ -905,6 +910,7 @@ static void frontswap_selfshrink(void)
frontswap_shrink(tgt_frontswap_pages);
}

+#ifndef CONFIG_RAMSTER_MODULE
static int __init ramster_nofrontswap_selfshrink_setup(char *s)
{
use_frontswap_selfshrink = false;
@@ -912,6 +918,7 @@ static int __init ramster_nofrontswap_selfshrink_setup(char *s)
}

__setup("noselfshrink", ramster_nofrontswap_selfshrink_setup);
+#endif

static void selfshrink_process(struct work_struct *work)
{
@@ -930,6 +937,7 @@ void ramster_cpu_up(int cpu)
per_cpu(ramster_remoteputmem1, cpu) = p1;
per_cpu(ramster_remoteputmem2, cpu) = p2;
}
+EXPORT_SYMBOL_GPL(ramster_cpu_up);

void ramster_cpu_down(int cpu)
{
@@ -945,6 +953,7 @@ void ramster_cpu_down(int cpu)
kp->flnode = NULL;
}
}
+EXPORT_SYMBOL_GPL(ramster_cpu_down);

void ramster_register_pamops(struct tmem_pamops *pamops)
{
@@ -955,9 +964,11 @@ void ramster_register_pamops(struct tmem_pamops *pamops)
pamops->repatriate = ramster_pampd_repatriate;
pamops->repatriate_preload = ramster_pampd_repatriate_preload;
}
+EXPORT_SYMBOL_GPL(ramster_register_pamops);

-void __init ramster_init(bool cleancache, bool frontswap,
- bool frontswap_exclusive_gets)
+void ramster_init(bool cleancache, bool frontswap,
+ bool frontswap_exclusive_gets,
+ bool frontswap_selfshrink)
{
int ret = 0;

@@ -972,10 +983,17 @@ void __init ramster_init(bool cleancache, bool frontswap,
if (ret)
pr_err("ramster: can't create sysfs for ramster\n");
(void)r2net_register_handlers();
+#ifdef CONFIG_RAMSTER_MODULE
+ ret = r2nm_init();
+ if (ret)
+ pr_err("ramster: can't init r2net\n");
+ frontswap_selfshrinking = frontswap_selfshrink;
+#else
+ frontswap_selfshrinking = use_frontswap_selfshrink;
+#endif
INIT_LIST_HEAD(&ramster_rem_op_list);
ramster_flnode_cache = kmem_cache_create("ramster_flnode",
sizeof(struct flushlist_node), 0, 0, NULL);
- frontswap_selfshrinking = use_frontswap_selfshrink;
if (frontswap_selfshrinking) {
pr_info("ramster: Initializing frontswap selfshrink driver.\n");
schedule_delayed_work(&selfshrink_worker,
@@ -983,3 +1001,4 @@ void __init ramster_init(bool cleancache, bool frontswap,
}
ramster_remotify_init();
}
+EXPORT_SYMBOL_GPL(ramster_init);
diff --git a/drivers/staging/ramster/ramster/ramster.h b/drivers/staging/ramster/ramster/ramster.h
index 12ae56f..6d41a7a 100644
--- a/drivers/staging/ramster/ramster/ramster.h
+++ b/drivers/staging/ramster/ramster/ramster.h
@@ -147,7 +147,7 @@ extern int r2net_register_handlers(void);
extern int r2net_remote_target_node_set(int);

extern int ramster_remotify_pageframe(bool);
-extern void ramster_init(bool, bool, bool);
+extern void ramster_init(bool, bool, bool, bool);
extern void ramster_register_pamops(struct tmem_pamops *);
extern int ramster_localify(int, struct tmem_oid *oidp, uint32_t, char *,
unsigned int, void *);
diff --git a/drivers/staging/ramster/ramster/ramster_nodemanager.h b/drivers/staging/ramster/ramster/ramster_nodemanager.h
index 49f879d..dbaae34 100644
--- a/drivers/staging/ramster/ramster/ramster_nodemanager.h
+++ b/drivers/staging/ramster/ramster/ramster_nodemanager.h
@@ -36,4 +36,6 @@
/* host name, group name, cluster name all 64 bytes */
#define R2NM_MAX_NAME_LEN 64 /* __NEW_UTS_LEN */

+extern int r2nm_init(void);
+
#endif /* _RAMSTER_NODEMANAGER_H */
--
1.7.1

2012-10-31 15:09:55

by Dan Magenheimer

Subject: [PATCH 5/5] xen: tmem: enable Xen tmem shim to be built/loaded as a module

Allow Xen tmem shim to be built/loaded as a module. Xen self-ballooning
and frontswap-selfshrinking are now also "lazily" initialized when the
Xen tmem shim is loaded as a module, unless explicitly disabled
by module parameters.

Note that a runtime dependency disallows loading if the cleancache/frontswap
lazy initialization patches are not present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <[email protected]>
---
drivers/xen/Kconfig | 4 +-
drivers/xen/tmem.c | 56 +++++++++++++++++++++++++++++++++--------
drivers/xen/xen-selfballoon.c | 13 +++++----
include/xen/tmem.h | 8 ++++++
4 files changed, 62 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index d4dffcd..68dd78f 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -144,8 +144,8 @@ config SWIOTLB_XEN
select SWIOTLB

config XEN_TMEM
- bool
- default y if (CLEANCACHE || FRONTSWAP)
+ tristate
+ default m if (CLEANCACHE || FRONTSWAP)
help
Shim to interface in-kernel Transcendent Memory hooks
(e.g. cleancache and frontswap) to Xen tmem hypercalls.
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 144564e..8ba0a3f 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -5,6 +5,7 @@
* Author: Dan Magenheimer
*/

+#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/init.h>
@@ -128,6 +129,7 @@ static int xen_tmem_flush_object(u32 pool_id, struct tmem_oid oid)
return xen_tmem_op(TMEM_FLUSH_OBJECT, pool_id, oid, 0, 0, 0, 0, 0);
}

+#ifndef CONFIG_XEN_TMEM_MODULE
bool __read_mostly tmem_enabled = false;

static int __init enable_tmem(char *s)
@@ -136,6 +138,7 @@ static int __init enable_tmem(char *s)
return 1;
}
__setup("tmem", enable_tmem);
+#endif

#ifdef CONFIG_CLEANCACHE
static int xen_tmem_destroy_pool(u32 pool_id)
@@ -227,16 +230,21 @@ static int tmem_cleancache_init_shared_fs(char *uuid, size_t pagesize)
return xen_tmem_new_pool(shared_uuid, TMEM_POOL_SHARED, pagesize);
}

-static bool __initdata use_cleancache = true;
-
+static bool disable_cleancache __read_mostly;
+static bool disable_selfballooning __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_cleancache, bool, S_IRUGO);
+module_param(disable_selfballooning, bool, S_IRUGO);
+#else
static int __init no_cleancache(char *s)
{
- use_cleancache = false;
+ disable_cleancache = true;
return 1;
}
__setup("nocleancache", no_cleancache);
+#endif

-static struct cleancache_ops __initdata tmem_cleancache_ops = {
+static struct cleancache_ops tmem_cleancache_ops = {
.put_page = tmem_cleancache_put_page,
.get_page = tmem_cleancache_get_page,
.invalidate_page = tmem_cleancache_flush_page,
@@ -353,16 +361,21 @@ static void tmem_frontswap_init(unsigned ignored)
xen_tmem_new_pool(private, TMEM_POOL_PERSIST, PAGE_SIZE);
}

-static bool __initdata use_frontswap = true;
-
+static bool disable_frontswap __read_mostly;
+static bool disable_frontswap_selfshrinking __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_frontswap, bool, S_IRUGO);
+module_param(disable_frontswap_selfshrinking, bool, S_IRUGO);
+#else
static int __init no_frontswap(char *s)
{
- use_frontswap = false;
+ disable_frontswap = true;
return 1;
}
__setup("nofrontswap", no_frontswap);
+#endif

-static struct frontswap_ops __initdata tmem_frontswap_ops = {
+static struct frontswap_ops tmem_frontswap_ops = {
.store = tmem_frontswap_store,
.load = tmem_frontswap_load,
.invalidate_page = tmem_frontswap_flush_page,
@@ -371,12 +384,19 @@ static struct frontswap_ops __initdata tmem_frontswap_ops = {
};
#endif

-static int __init xen_tmem_init(void)
+static int xen_tmem_init(void)
{
if (!xen_domain())
return 0;
#ifdef CONFIG_FRONTSWAP
- if (tmem_enabled && use_frontswap) {
+#if defined(CONFIG_XEN_TMEM_MODULE) && !defined(FRONTSWAP_HAS_LAZY_INIT)
+ /* This ifdef can be removed when frontswap lazy init is merged */
+ if (!disable_frontswap) {
+ pr_err("Xen tmem module requires frontswap lazy init\n");
+ return -EINVAL;
+ }
+#endif
+ if (tmem_enabled && !disable_frontswap) {
char *s = "";
struct frontswap_ops old_ops =
frontswap_register_ops(&tmem_frontswap_ops);
@@ -389,8 +409,15 @@ static int __init xen_tmem_init(void)
}
#endif
#ifdef CONFIG_CLEANCACHE
+#if defined(CONFIG_XEN_TMEM_MODULE) && !defined(CLEANCACHE_HAS_LAZY_INIT)
+ /* This ifdef can be removed when cleancache lazy init is merged */
+ if (!disable_cleancache) {
+ pr_err("Xen tmem module requires cleancache lazy init\n");
+ return -EINVAL;
+ }
+#endif
BUG_ON(sizeof(struct cleancache_filekey) != sizeof(struct tmem_oid));
- if (tmem_enabled && use_cleancache) {
+ if (tmem_enabled && !disable_cleancache) {
char *s = "";
struct cleancache_ops old_ops =
cleancache_register_ops(&tmem_cleancache_ops);
@@ -400,7 +427,14 @@ static int __init xen_tmem_init(void)
"Xen Transcendent Memory%s\n", s);
}
#endif
+#ifdef CONFIG_XEN_SELFBALLOONING
+ xen_selfballoon_init(!disable_selfballooning,
+ !disable_frontswap_selfshrinking);
+#endif
return 0;
}

module_init(xen_tmem_init)
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <[email protected]>");
+MODULE_DESCRIPTION("Shim to Xen transcendent memory");
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 7d041cb..f4808aa 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -121,7 +121,7 @@ static DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
static bool frontswap_selfshrinking __read_mostly;

/* Enable/disable with kernel boot option. */
-static bool use_frontswap_selfshrink __initdata = true;
+static bool use_frontswap_selfshrink = true;

/*
* The default values for the following parameters were deemed reasonable
@@ -185,7 +185,7 @@ static int __init xen_nofrontswap_selfshrink_setup(char *s)
__setup("noselfshrink", xen_nofrontswap_selfshrink_setup);

/* Disable with kernel boot option. */
-static bool use_selfballooning __initdata = true;
+static bool use_selfballooning = true;

static int __init xen_noselfballooning_setup(char *s)
{
@@ -196,7 +196,7 @@ static int __init xen_noselfballooning_setup(char *s)
__setup("noselfballooning", xen_noselfballooning_setup);
#else /* !CONFIG_FRONTSWAP */
/* Enable with kernel boot option. */
-static bool use_selfballooning __initdata = false;
+static bool use_selfballooning;

static int __init xen_selfballooning_setup(char *s)
{
@@ -537,7 +537,7 @@ int register_xen_selfballooning(struct device *dev)
}
EXPORT_SYMBOL(register_xen_selfballooning);

-static int __init xen_selfballoon_init(void)
+int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
{
bool enable = false;

@@ -571,7 +571,8 @@ static int __init xen_selfballoon_init(void)

return 0;
}
+EXPORT_SYMBOL(xen_selfballoon_init);

+#ifndef CONFIG_XEN_TMEM_MODULE
subsys_initcall(xen_selfballoon_init);
-
-MODULE_LICENSE("GPL");
+#endif
diff --git a/include/xen/tmem.h b/include/xen/tmem.h
index 591550a..f94bdd1 100644
--- a/include/xen/tmem.h
+++ b/include/xen/tmem.h
@@ -3,7 +3,15 @@

#include <linux/types.h>

+#ifdef CONFIG_XEN_TMEM_MODULE
+#define tmem_enabled 1
+#else
/* defined in drivers/xen/tmem.c */
extern bool tmem_enabled;
+#endif
+
+#ifdef CONFIG_XEN_SELFBALLOONING
+extern int xen_selfballoon_init(bool, bool);
+#endif

#endif /* _XEN_TMEM_H */
--
1.7.1

2012-10-31 15:16:11

by Dan Magenheimer

Subject: RE: [PATCH 0/5] enable all tmem backends to be built and loaded as modules

Apologies... I misspelled the family name of one of the
Erlangen University authors of the first two patches
in this patchset, so any reply-alls to any of the
patch posts will see a bounce. If you reply-all to any
of these patches, kindly change one of the recipients
to:

[email protected] (was misspelled andor.damm)

I regret the inconvenience... :-(


2012-10-31 17:07:36

by Seth Jennings

Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> frontswap even after swapon was run. Before a backend registers all calls
> to init are recorded and the creation of tmem_pools delayed until a backend
> registers or until a frontswap put is attempted.
>
> Signed-off-by: Stefan Hengelein <[email protected]>
> Signed-off-by: Florian Schmaus <[email protected]>
> Signed-off-by: Andor Daam <[email protected]>
> Signed-off-by: Dan Magenheimer <[email protected]>
> ---
> include/linux/frontswap.h | 1 +
> mm/frontswap.c | 70 +++++++++++++++++++++++++++++++++++++++-----
> 2 files changed, 63 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
> index 3044254..ef6ada6 100644
> --- a/include/linux/frontswap.h
> +++ b/include/linux/frontswap.h
> @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
> extern void frontswap_tmem_exclusive_gets(bool);
>
> extern void __frontswap_init(unsigned type);
> +#define FRONTSWAP_HAS_LAZY_INIT
> extern int __frontswap_store(struct page *page);
> extern int __frontswap_load(struct page *page);
> extern void __frontswap_invalidate_page(unsigned, pgoff_t);
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 2890e67..523a19b 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
> static inline void inc_frontswap_failed_stores(void) { }
> static inline void inc_frontswap_invalidates(void) { }
> #endif
> +
> +/*
> + * When no backend is registered all calls to init are registered and
> + * remembered but fail to create tmem_pools. When a backend registers with
> + * frontswap the previous calls to init are executed to create tmem_pools
> + * and set the respective poolids.
> + * While no backend is registered all "puts", "gets" and "flushes" are
> + * ignored or fail.
> + */
> +#define MAX_INITIALIZABLE_SD 32

MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES

> +static int sds[MAX_INITIALIZABLE_SD];

Rather than store an array of enabled types indexed by type, why not
an array of booleans indexed by type? Or a bitfield if you really
want to save space.
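
Something like this, maybe (just a sketch, untested):

/* one bit per swap type instead of the sds[] array of ints;
 * bit set == swapon has already run for this type */
static DECLARE_BITMAP(swap_seen, MAX_SWAPFILES);

void __frontswap_init(unsigned type)
{
	struct swap_info_struct *sis = swap_info[type];

	BUG_ON(sis == NULL);
	if (sis->frontswap_map == NULL)
		return;
	set_bit(type, swap_seen);	/* remember for replay */
	if (backend_registered)
		(*frontswap_ops.init)(type);
}

struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
{
	struct frontswap_ops old = frontswap_ops;
	unsigned type;

	frontswap_ops = *ops;
	frontswap_enabled = true;
	backend_registered = 1;
	/* replay every swapon seen before a backend showed up */
	for_each_set_bit(type, swap_seen, MAX_SWAPFILES)
		(*frontswap_ops.init)(type);
	return old;
}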

> +static int backend_registered;

(backend_registered) is equivalent to checking (frontswap_ops != NULL)
right?

> +
> /*
> * Register operations for frontswap, returning previous thus allowing
> * detection of multiple backends and possible nesting.
> @@ -87,9 +100,16 @@ static inline void inc_frontswap_invalidates(void) { }
> struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
> {
> struct frontswap_ops old = frontswap_ops;
> + int i;
>
> frontswap_ops = *ops;
> frontswap_enabled = true;
> +
> + backend_registered = 1;
> + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
> + if (sds[i] != -1)
> + (*frontswap_ops.init)(sds[i]);
> + }
> return old;
> }
> EXPORT_SYMBOL(frontswap_register_ops);
> @@ -122,7 +142,10 @@ void __frontswap_init(unsigned type)
> BUG_ON(sis == NULL);
> if (sis->frontswap_map == NULL)
> return;
> - frontswap_ops.init(type);
> + if (backend_registered) {
> + (*frontswap_ops.init)(type);
> + sds[type] = type;

This is weird, storing the type in an array indexed by type. Hence my
suggestion above about an array of booleans or a bitfield.

> + }
> }
> EXPORT_SYMBOL(__frontswap_init);
>
> @@ -147,10 +170,20 @@ int __frontswap_store(struct page *page)
> struct swap_info_struct *sis = swap_info[type];
> pgoff_t offset = swp_offset(entry);
>
> + if (!backend_registered) {
> + inc_frontswap_failed_stores();
> + return ret;
> + }
> +
> BUG_ON(!PageLocked(page));
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset))
> dup = 1;
> + if (type < MAX_INITIALIZABLE_SD && sds[type] == -1) {
> + /* lazy init call to handle post-boot insmod backends*/
> + (*frontswap_ops.init)(type);
> + sds[type] = type;
> + }
> ret = frontswap_ops.store(type, offset, page);
> if (ret == 0) {
> frontswap_set(sis, offset);
> @@ -186,6 +219,9 @@ int __frontswap_load(struct page *page)
> struct swap_info_struct *sis = swap_info[type];
> pgoff_t offset = swp_offset(entry);
>
> + if (!backend_registered)
> + return ret;
> +
> BUG_ON(!PageLocked(page));
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset))
> @@ -209,6 +245,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> {
> struct swap_info_struct *sis = swap_info[type];
>
> + if (!backend_registered)
> + return;
> +
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset)) {
> frontswap_ops.invalidate_page(type, offset);
> @@ -225,13 +264,23 @@ EXPORT_SYMBOL(__frontswap_invalidate_page);
> void __frontswap_invalidate_area(unsigned type)
> {
> struct swap_info_struct *sis = swap_info[type];
> -
> - BUG_ON(sis == NULL);
> - if (sis->frontswap_map == NULL)
> - return;
> - frontswap_ops.invalidate_area(type);
> - atomic_set(&sis->frontswap_pages, 0);
> - memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> + int i;
> +
> + if (backend_registered) {
> + BUG_ON(sis == NULL);
> + if (sis->frontswap_map == NULL)
> + return;
> + (*frontswap_ops.invalidate_area)(type);
> + atomic_set(&sis->frontswap_pages, 0);
> + memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> + } else {
> + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
> + if (sds[i] == type) {

Additional weirdness with sds. It seems this whole for loop could
just be reduced to:

sds[type] = -1;

> + sds[i] = -1;
> + break;
> + }
> + }
> + }
> }
> EXPORT_SYMBOL(__frontswap_invalidate_area);
>
> @@ -353,6 +402,7 @@ EXPORT_SYMBOL(frontswap_curr_pages);
>
> static int __init init_frontswap(void)
> {
> + int i;
> #ifdef CONFIG_DEBUG_FS
> struct dentry *root = debugfs_create_dir("frontswap", NULL);
> if (root == NULL)
> @@ -364,6 +414,10 @@ static int __init init_frontswap(void)
> debugfs_create_u64("invalidates", S_IRUGO,
> root, &frontswap_invalidates);
> #endif
> + for (i = 0; i < MAX_INITIALIZABLE_SD; i++)
> + sds[i] = -1;
> +
> + frontswap_enabled = 1;

If frontswap_enabled is going to be on all the time, then what point
does it serve? By extension, can all of the static inline wrappers in
frontswap.h be done away with?

--
Seth

2012-10-31 21:43:14

by Cesar Eduardo Barros

Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

On 31-10-2012 15:05, Seth Jennings wrote:
> On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
>> +#define MAX_INITIALIZABLE_SD 32
>
> MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES
>
>> +static int sds[MAX_INITIALIZABLE_SD];
>
> Rather than store and array of enabled types indexed by type, why not
> an array of booleans indexed by type. Or a bitfield if you really
> want to save space.

Since it is indexed by swap_info_struct's type, and frontswap already
pokes directly inside the swap_info_structs, it would be even cleaner to
use a boolean field within the swap_info_struct.

And if you are using a field within the swap_info_struct, you could
overload the already existing frontswap_map field, which should only
have any use if you have a frontswap module already loaded. That is,
move the vzalloc of the frontswap_map to within frontswap's init
function, and call it outside the swapfile_lock/swapon_mutex. This also
has the advantage of not allocating the frontswap_map when it is not
going to be used.
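
Roughly like this (sketch only, untested):

/* allocate frontswap_map from frontswap itself; a non-NULL
 * frontswap_map then doubles as the "swapon has run" marker */
void __frontswap_init(unsigned type)
{
	struct swap_info_struct *sis = swap_info[type];

	BUG_ON(sis == NULL);
	if (sis->frontswap_map == NULL)
		sis->frontswap_map = vzalloc(BITS_TO_LONGS(sis->max) *
					     sizeof(unsigned long));
	if (sis->frontswap_map == NULL)
		return;		/* no memory, frontswap stays off for this device */
	if (backend_registered)
		(*frontswap_ops.init)(type);
}

The replay loop in frontswap_register_ops() would then walk swap_info[]
looking for a non-NULL frontswap_map instead of consulting sds[].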

--
Cesar Eduardo Barros
[email protected]
[email protected]

2012-11-01 15:33:34

by Dan Magenheimer

Subject: RE: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

> From: Seth Jennings [mailto:[email protected]]
> Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as
> modules
>
> > static int __init init_frontswap(void)
> > {
> > + int i;
> > #ifdef CONFIG_DEBUG_FS
> > struct dentry *root = debugfs_create_dir("frontswap", NULL);
> > if (root == NULL)
> > @@ -364,6 +414,10 @@ static int __init init_frontswap(void)
> > debugfs_create_u64("invalidates", S_IRUGO,
> > root, &frontswap_invalidates);
> > #endif
> > + for (i = 0; i < MAX_INITIALIZABLE_SD; i++)
> > + sds[i] = -1;
> > +
> > + frontswap_enabled = 1;
>
> If frontswap_enabled is going to be on all the time, then what point
> does it serve? By extension, can all of the static inline wrappers in
> frontswap.h be done away with?

The intent of frontswap_enabled and cleancache_enabled was
to avoid the overhead of a function call at the point where
each frontswap/cleancache "hooks" is placed, using a global
variable check instead. I'm not sure if this minor
performance tuning effort is worth preserving: If not,
I agree frontswap_enabled and the static inline wrappers (as
well as their cleancache brethren) could be done away with **;
if worth preserving, then I think frontswap_enabled could
be set in the init method instead but the check for enabled
in the frontswap init method and the cleancache init_fs
method would need to be removed else lazy initialization
wouldn't work.

Dan

** Note to anyone that tries this: There is a subtle but
clever hack in the wrappers suggested by Jeremy Fitzhardinge
that disables the wrappers at compile-time as well as
runtime. IOW, make sure you test-compile both with
CONFIG_{CLEANCACHE|FRONTSWAP} _and_ with them unconfig'd.
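
For reference, the pattern looks roughly like this (a paraphrased sketch of
the frontswap.h hooks, not a verbatim copy):

#ifdef CONFIG_FRONTSWAP
extern bool frontswap_enabled;
#else
/* with FRONTSWAP unconfig'd the check is a compile-time constant zero */
#define frontswap_enabled (0)
#endif

static inline int frontswap_store(struct page *page)
{
	int ret = -1;

	if (frontswap_enabled)
		ret = __frontswap_store(page);
	return ret;
}

so with the config option off the compiler can drop both the check and the
call to __frontswap_store() entirely.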

2012-11-02 18:26:23

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

> > > + frontswap_enabled = 1;
> >
> > If frontswap_enabled is going to be on all the time, then what point
> > does it serve? By extension, can all of the static inline wrappers in
> > frontswap.h be done away with?

Hm, or the frontswap_enabled can be converted to a "frontswap_flag"
which has:

#define FRONTSWAP_ON (1<<1)
#define FRONTSWAP_BACKEND_ON (1<<2)

or so? And then we can see if we can squash the 'backend_registered'
and 'frontswap_enabled' together.
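
i.e. a single flags word checked at the hook sites; something like this
(hypothetical sketch, names made up for illustration):

static unsigned long frontswap_flags;

/* init_frontswap():         frontswap_flags |= FRONTSWAP_ON;         */
/* frontswap_register_ops(): frontswap_flags |= FRONTSWAP_BACKEND_ON; */

/* and in __frontswap_store() and friends: */
if (!(frontswap_flags & FRONTSWAP_BACKEND_ON))
	return -1;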
>
> The intent of frontswap_enabled and cleancache_enabled was
> to avoid the overhead of a function call at the point where
> each frontswap/cleancache "hooks" is placed, using a global
> variable check instead. I'm not sure if this minor
> performance tuning effort is worth preserving: If not,
> I agree frontswap_enabled and the static inline wrappers (as
> well as their cleancache brethren) could be done away with **;
> if worth preserving, then I think frontswap_enabled could
> be set in the init method instead but the check for enabled
> in the frontswap init method and the cleancache init_fs
> method would need to be removed else lazy initialization
> wouldn't work.

Either way, that should be a separate patch.
>
> Dan
>
> ** Note to anyone that tries this: There is a subtle but
> clever hack in the wrappers suggested by Jeremy Fitzhardinge
> that disables the wrappers at compile-time as well as
> runtime. IOW, make sure you test-compile both with
> CONFIG_{CLEANCACHE|FRONTSWAP} _and_ with them unconfig'd.
>

2012-11-02 18:28:12

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

On Wed, Oct 31, 2012 at 12:05:32PM -0500, Seth Jennings wrote:
> On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
> > With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> > built/loaded as modules rather than built-in and enabled by a boot parameter,
> > this patch provides "lazy initialization", allowing backends to register to
> > frontswap even after swapon was run. Before a backend registers all calls
> > to init are recorded and the creation of tmem_pools delayed until a backend
> > registers or until a frontswap put is attempted.
> >
> > Signed-off-by: Stefan Hengelein <[email protected]>
> > Signed-off-by: Florian Schmaus <[email protected]>
> > Signed-off-by: Andor Daam <[email protected]>
> > Signed-off-by: Dan Magenheimer <[email protected]>
> > ---
> > include/linux/frontswap.h | 1 +
> > mm/frontswap.c | 70 +++++++++++++++++++++++++++++++++++++++-----
> > 2 files changed, 63 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
> > index 3044254..ef6ada6 100644
> > --- a/include/linux/frontswap.h
> > +++ b/include/linux/frontswap.h
> > @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
> > extern void frontswap_tmem_exclusive_gets(bool);
> >
> > extern void __frontswap_init(unsigned type);
> > +#define FRONTSWAP_HAS_LAZY_INIT
> > extern int __frontswap_store(struct page *page);
> > extern int __frontswap_load(struct page *page);
> > extern void __frontswap_invalidate_page(unsigned, pgoff_t);
> > diff --git a/mm/frontswap.c b/mm/frontswap.c
> > index 2890e67..523a19b 100644
> > --- a/mm/frontswap.c
> > +++ b/mm/frontswap.c
> > @@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
> > static inline void inc_frontswap_failed_stores(void) { }
> > static inline void inc_frontswap_invalidates(void) { }
> > #endif
> > +
> > +/*
> > + * When no backend is registered all calls to init are registered and
> > + * remembered but fail to create tmem_pools. When a backend registers with
> > + * frontswap the previous calls to init are executed to create tmem_pools
> > + * and set the respective poolids.
> > + * While no backend is registered all "puts", "gets" and "flushes" are
> > + * ignored or fail.
> > + */
> > +#define MAX_INITIALIZABLE_SD 32
>
> MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES
>
> > +static int sds[MAX_INITIALIZABLE_SD];
>
> Rather than store and array of enabled types indexed by type, why not
> an array of booleans indexed by type. Or a bitfield if you really
> want to save space.
>
> > +static int backend_registered;
>
> (backend_registered) is equivalent to checking (frontswap_ops != NULL)
> right?
>
> > +
> > /*
> > * Register operations for frontswap, returning previous thus allowing
> > * detection of multiple backends and possible nesting.
> > @@ -87,9 +100,16 @@ static inline void inc_frontswap_invalidates(void) { }
> > struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
> > {
> > struct frontswap_ops old = frontswap_ops;
> > + int i;
> >
> > frontswap_ops = *ops;
> > frontswap_enabled = true;
> > +
> > + backend_registered = 1;
> > + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
> > + if (sds[i] != -1)
> > + (*frontswap_ops.init)(sds[i]);
> > + }
> > return old;
> > }
> > EXPORT_SYMBOL(frontswap_register_ops);
> > @@ -122,7 +142,10 @@ void __frontswap_init(unsigned type)
> > BUG_ON(sis == NULL);
> > if (sis->frontswap_map == NULL)
> > return;
> > - frontswap_ops.init(type);
> > + if (backend_registered) {
> > + (*frontswap_ops.init)(type);
> > + sds[type] = type;
>
> This is weird, storing the type in an array indexed by type. Hence my
> suggestion above about an array of booleans or a bitfield.
>
> > + }
> > }
> > EXPORT_SYMBOL(__frontswap_init);
> >
> > @@ -147,10 +170,20 @@ int __frontswap_store(struct page *page)
> > struct swap_info_struct *sis = swap_info[type];
> > pgoff_t offset = swp_offset(entry);
> >
> > + if (!backend_registered) {
> > + inc_frontswap_failed_stores();
> > + return ret;
> > + }
> > +
> > BUG_ON(!PageLocked(page));
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset))
> > dup = 1;
> > + if (type < MAX_INITIALIZABLE_SD && sds[type] == -1) {
> > + /* lazy init call to handle post-boot insmod backends*/
> > + (*frontswap_ops.init)(type);
> > + sds[type] = type;
> > + }
> > ret = frontswap_ops.store(type, offset, page);
> > if (ret == 0) {
> > frontswap_set(sis, offset);
> > @@ -186,6 +219,9 @@ int __frontswap_load(struct page *page)
> > struct swap_info_struct *sis = swap_info[type];
> > pgoff_t offset = swp_offset(entry);
> >
> > + if (!backend_registered)
> > + return ret;
> > +
> > BUG_ON(!PageLocked(page));
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset))
> > @@ -209,6 +245,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> > {
> > struct swap_info_struct *sis = swap_info[type];
> >
> > + if (!backend_registered)
> > + return;
> > +
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset)) {
> > frontswap_ops.invalidate_page(type, offset);
> > @@ -225,13 +264,23 @@ EXPORT_SYMBOL(__frontswap_invalidate_page);
> > void __frontswap_invalidate_area(unsigned type)
> > {
> > struct swap_info_struct *sis = swap_info[type];
> > -
> > - BUG_ON(sis == NULL);
> > - if (sis->frontswap_map == NULL)
> > - return;
> > - frontswap_ops.invalidate_area(type);
> > - atomic_set(&sis->frontswap_pages, 0);
> > - memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > + int i;
> > +
> > + if (backend_registered) {
> > + BUG_ON(sis == NULL);
> > + if (sis->frontswap_map == NULL)
> > + return;
> > + (*frontswap_ops.invalidate_area)(type);
> > + atomic_set(&sis->frontswap_pages, 0);
> > + memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > + } else {
> > + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
> > + if (sds[i] == type) {
>
> Additional weirdness with sds. It seems this whole for loop could
> just be reduced to:
>
> sds[type] = -1;


How does this look? (I haven't actually run-tested it, but it does
compile.)

From f545530e9ef2b0623ab9e78d490595e3b7eaa3fa Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <[email protected]>
Date: Wed, 31 Oct 2012 08:07:51 -0700
Subject: [PATCH 2/2] mm: frontswap: lazy initialization to allow tmem
backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon was run. Before a backend registers all calls
to init are recorded and the creation of tmem_pools delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <[email protected]>
Signed-off-by: Florian Schmaus <[email protected]>
Signed-off-by: Andor Daam <[email protected]>
Signed-off-by: Dan Magenheimer <[email protected]>
[v1: Fixes per Seth Jennings suggestions]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
include/linux/frontswap.h | 1 +
mm/frontswap.c | 59 +++++++++++++++++++++++++++++++++++++++++------
2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 3044254..ef6ada6 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
extern void frontswap_tmem_exclusive_gets(bool);

extern void __frontswap_init(unsigned type);
+#define FRONTSWAP_HAS_LAZY_INIT
extern int __frontswap_store(struct page *page);
extern int __frontswap_load(struct page *page);
extern void __frontswap_invalidate_page(unsigned, pgoff_t);
diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..4e04549 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
static inline void inc_frontswap_failed_stores(void) { }
static inline void inc_frontswap_invalidates(void) { }
#endif
+
+/*
+ * When no backend is registered all calls to init are registered and
+ * remembered but fail to create tmem_pools. When a backend registers with
+ * frontswap the previous calls to init are executed to create tmem_pools
+ * and set the respective poolids.
+ * While no backend is registered all "puts", "gets" and "flushes" are
+ * ignored or fail.
+ */
+static DECLARE_BITMAP(sds, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
/*
* Register operations for frontswap, returning previous thus allowing
* detection of multiple backends and possible nesting.
@@ -87,9 +99,16 @@ static inline void inc_frontswap_invalidates(void) { }
struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
{
struct frontswap_ops old = frontswap_ops;
+ int i;

frontswap_ops = *ops;
frontswap_enabled = true;
+
+ backend_registered = true;
+ for (i = 0; i < MAX_SWAPFILES; i++) {
+ if (test_bit(i, sds))
+ (*frontswap_ops.init)(i);
+ }
return old;
}
EXPORT_SYMBOL(frontswap_register_ops);
@@ -122,7 +141,10 @@ void __frontswap_init(unsigned type)
BUG_ON(sis == NULL);
if (sis->frontswap_map == NULL)
return;
- frontswap_ops.init(type);
+ if (backend_registered) {
+ (*frontswap_ops.init)(type);
+ set_bit(type, sds);
+ }
}
EXPORT_SYMBOL(__frontswap_init);

@@ -147,10 +169,20 @@ int __frontswap_store(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered) {
+ inc_frontswap_failed_stores();
+ return ret;
+ }
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
dup = 1;
+ if (type < MAX_SWAPFILES && !test_bit(type, sds)) {
+ /* lazy init call to handle post-boot insmod backends*/
+ (*frontswap_ops.init)(type);
+ set_bit(type, sds);
+ }
ret = frontswap_ops.store(type, offset, page);
if (ret == 0) {
frontswap_set(sis, offset);
@@ -186,6 +218,9 @@ int __frontswap_load(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered)
+ return ret;
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
@@ -209,6 +244,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
{
struct swap_info_struct *sis = swap_info[type];

+ if (!backend_registered)
+ return;
+
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset)) {
frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +264,16 @@ void __frontswap_invalidate_area(unsigned type)
{
struct swap_info_struct *sis = swap_info[type];

- BUG_ON(sis == NULL);
- if (sis->frontswap_map == NULL)
- return;
- frontswap_ops.invalidate_area(type);
- atomic_set(&sis->frontswap_pages, 0);
- memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ if (backend_registered) {
+ BUG_ON(sis == NULL);
+ if (sis->frontswap_map == NULL)
+ return;
+ (*frontswap_ops.invalidate_area)(type);
+ atomic_set(&sis->frontswap_pages, 0);
+ memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ } else {
+ bitmap_zero(sds, MAX_SWAPFILES);
+ }
}
EXPORT_SYMBOL(__frontswap_invalidate_area);

@@ -364,6 +406,9 @@ static int __init init_frontswap(void)
debugfs_create_u64("invalidates", S_IRUGO,
root, &frontswap_invalidates);
#endif
+ bitmap_zero(sds, MAX_SWAPFILES);
+
+ frontswap_enabled = 1;
return 0;
}

--
1.7.11.7

2012-11-02 18:32:38

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 1/5] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules

On Wed, Oct 31, 2012 at 08:07:50AM -0700, Dan Magenheimer wrote:
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> cleancache even after filesystems were mounted. Calls to init_fs and
> init_shared_fs are remembered as fake poolids but no real tmem_pools created.
> On backend registration the fake poolids are mapped to real poolids and
> respective tmem_pools.
>
> Signed-off-by: Stefan Hengelein <[email protected]>
> Signed-off-by: Florian Schmaus <[email protected]>
> Signed-off-by: Andor Daam <[email protected]>
> Signed-off-by: Dan Magenheimer <[email protected]>
> ---
> include/linux/cleancache.h | 1 +
> mm/cleancache.c | 157 +++++++++++++++++++++++++++++++++++++++-----
> 2 files changed, 141 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
> index 42e55de..f7e32f0 100644
> --- a/include/linux/cleancache.h
> +++ b/include/linux/cleancache.h
> @@ -37,6 +37,7 @@ extern struct cleancache_ops
> cleancache_register_ops(struct cleancache_ops *ops);
> extern void __cleancache_init_fs(struct super_block *);
> extern void __cleancache_init_shared_fs(char *, struct super_block *);
> +#define CLEANCACHE_HAS_LAZY_INIT
> extern int __cleancache_get_page(struct page *);
> extern void __cleancache_put_page(struct page *);
> extern void __cleancache_invalidate_page(struct address_space *, struct page *);
> diff --git a/mm/cleancache.c b/mm/cleancache.c
> index 32e6f41..29430b7 100644
> --- a/mm/cleancache.c
> +++ b/mm/cleancache.c
> @@ -45,15 +45,42 @@ static u64 cleancache_puts;
> static u64 cleancache_invalidates;
>
> /*
> > + * When no backend is registered all calls to init_fs and init_shared_fs
> + * are registered and fake poolids are given to the respective
> + * super block but no tmem_pools are created. When a backend
> + * registers with cleancache the previous calls to init_fs and
> + * init_shared_fs are executed to create tmem_pools and set the
> + * respective poolids. While no backend is registered all "puts",
> + * "gets" and "flushes" are ignored or fail.
> + */
> +#define MAX_INITIALIZABLE_FS 32
> +#define FAKE_FS_POOLID_OFFSET 1000
> +#define FAKE_SHARED_FS_POOLID_OFFSET 2000
> +static int fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static char *uuids[MAX_INITIALIZABLE_FS];
> +static int backend_registered;

Those could use some #define's and bool, so please see attached
patch which does this.

From a89c1224ec1957f1afaf4fbc1de349124bed6c67 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <[email protected]>
Date: Wed, 31 Oct 2012 08:07:50 -0700
Subject: [PATCH 1/2] mm: cleancache: lazy initialization to allow tmem
backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
cleancache even after filesystems were mounted. Calls to init_fs and
init_shared_fs are remembered as fake poolids but no real tmem_pools created.
On backend registration the fake poolids are mapped to real poolids and
respective tmem_pools.

Signed-off-by: Stefan Hengelein <[email protected]>
Signed-off-by: Florian Schmaus <[email protected]>
Signed-off-by: Andor Daam <[email protected]>
Signed-off-by: Dan Magenheimer <[email protected]>
[v1: Minor fixes: used #define for some values and bools]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
include/linux/cleancache.h | 1 +
mm/cleancache.c | 156 ++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 140 insertions(+), 17 deletions(-)

diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 42e55de..f7e32f0 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -37,6 +37,7 @@ extern struct cleancache_ops
cleancache_register_ops(struct cleancache_ops *ops);
extern void __cleancache_init_fs(struct super_block *);
extern void __cleancache_init_shared_fs(char *, struct super_block *);
+#define CLEANCACHE_HAS_LAZY_INIT
extern int __cleancache_get_page(struct page *);
extern void __cleancache_put_page(struct page *);
extern void __cleancache_invalidate_page(struct address_space *, struct page *);
diff --git a/mm/cleancache.c b/mm/cleancache.c
index 32e6f41..318a0ad 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -45,15 +45,45 @@ static u64 cleancache_puts;
static u64 cleancache_invalidates;

/*
+ * When no backend is registered all calls to init_fs and init_shared_fs
+ * are registered and fake poolids are given to the respective
+ * super block but no tmem_pools are created. When a backend
+ * registers with cleancache the previous calls to init_fs and
+ * init_shared_fs are executed to create tmem_pools and set the
+ * respective poolids. While no backend is registered all "puts",
+ * "gets" and "flushes" are ignored or fail.
+ */
+#define MAX_INITIALIZABLE_FS 32
+#define FAKE_FS_POOLID_OFFSET 1000
+#define FAKE_SHARED_FS_POOLID_OFFSET 2000
+
+#define FS_NO_BACKEND (-1)
+#define FS_UNKNOWN (-2)
+static int fs_poolid_map[MAX_INITIALIZABLE_FS];
+static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
+
+static char *uuids[MAX_INITIALIZABLE_FS];
+static bool __read_mostly backend_registered;
+
+/*
* register operations for cleancache, returning previous thus allowing
* detection of multiple backends and possible nesting
*/
struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
{
struct cleancache_ops old = cleancache_ops;
+ int i;

cleancache_ops = *ops;
- cleancache_enabled = 1;
+
+ backend_registered = true;
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (fs_poolid_map[i] == FS_NO_BACKEND)
+ fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+ if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
+ shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+ (uuids[i], PAGE_SIZE);
+ }
return old;
}
EXPORT_SYMBOL(cleancache_register_ops);
@@ -61,15 +91,38 @@ EXPORT_SYMBOL(cleancache_register_ops);
/* Called by a cleancache-enabled filesystem at time of mount */
void __cleancache_init_fs(struct super_block *sb)
{
- sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
+ int i;
+
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (fs_poolid_map[i] == FS_UNKNOWN) {
+ sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
+ if (backend_registered)
+ fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+ else
+ fs_poolid_map[i] = FS_NO_BACKEND;
+ break;
+ }
+ }
}
EXPORT_SYMBOL(__cleancache_init_fs);

/* Called by a cleancache-enabled clustered filesystem at time of mount */
void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
{
- sb->cleancache_poolid =
- (*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
+ int i;
+
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
+ sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
+ uuids[i] = uuid;
+ if (backend_registered)
+ shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+ (uuid, PAGE_SIZE);
+ else
+ shared_fs_poolid_map[i] = FS_NO_BACKEND;
+ break;
+ }
+ }
}
EXPORT_SYMBOL(__cleancache_init_shared_fs);

@@ -99,6 +152,19 @@ static int cleancache_get_key(struct inode *inode,
}

/*
+ * Returns a pool_id that is associated with a given fake poolid.
+ */
+static int get_poolid_from_fake(int fake_pool_id)
+{
+ if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
+ return shared_fs_poolid_map[fake_pool_id -
+ FAKE_SHARED_FS_POOLID_OFFSET];
+ else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
+ return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
+ return FS_NO_BACKEND;
+}
+
+/*
* "Get" data from cleancache associated with the poolid/inode/index
* that were specified when the data was put to cleanache and, if
* successful, use it to fill the specified page with data and return 0.
@@ -109,17 +175,26 @@ int __cleancache_get_page(struct page *page)
{
int ret = -1;
int pool_id;
+ int fake_pool_id;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered) {
+ cleancache_failed_gets++;
+ goto out;
+ }
+
VM_BUG_ON(!PageLocked(page));
- pool_id = page->mapping->host->i_sb->cleancache_poolid;
- if (pool_id < 0)
+ fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ if (fake_pool_id < 0)
goto out;
+ pool_id = get_poolid_from_fake(fake_pool_id);

if (cleancache_get_key(page->mapping->host, &key) < 0)
goto out;

- ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
+ if (pool_id >= 0)
+ ret = (*cleancache_ops.get_page)(pool_id,
+ key, page->index, page);
if (ret == 0)
cleancache_succ_gets++;
else
@@ -138,12 +213,23 @@ EXPORT_SYMBOL(__cleancache_get_page);
void __cleancache_put_page(struct page *page)
{
int pool_id;
+ int fake_pool_id;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered) {
+ cleancache_puts++;
+ return;
+ }
+
VM_BUG_ON(!PageLocked(page));
- pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+ if (fake_pool_id < 0)
+ return;
+
+ pool_id = get_poolid_from_fake(fake_pool_id);
+
if (pool_id >= 0 &&
- cleancache_get_key(page->mapping->host, &key) >= 0) {
+ cleancache_get_key(page->mapping->host, &key) >= 0) {
(*cleancache_ops.put_page)(pool_id, key, page->index, page);
cleancache_puts++;
}
@@ -158,14 +244,22 @@ void __cleancache_invalidate_page(struct address_space *mapping,
struct page *page)
{
/* careful... page->mapping is NULL sometimes when this is called */
- int pool_id = mapping->host->i_sb->cleancache_poolid;
+ int pool_id;
+ int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
struct cleancache_filekey key = { .u.key = { 0 } };

- if (pool_id >= 0) {
+ if (!backend_registered)
+ return;
+
+ if (fake_pool_id >= 0) {
+ pool_id = get_poolid_from_fake(fake_pool_id);
+ if (pool_id < 0)
+ return;
+
VM_BUG_ON(!PageLocked(page));
if (cleancache_get_key(mapping->host, &key) >= 0) {
(*cleancache_ops.invalidate_page)(pool_id,
- key, page->index);
+ key, page->index);
cleancache_invalidates++;
}
}
@@ -179,9 +273,18 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
*/
void __cleancache_invalidate_inode(struct address_space *mapping)
{
- int pool_id = mapping->host->i_sb->cleancache_poolid;
+ int pool_id;
+ int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
struct cleancache_filekey key = { .u.key = { 0 } };

+ if (!backend_registered)
+ return;
+
+ if (fake_pool_id < 0)
+ return;
+
+ pool_id = get_poolid_from_fake(fake_pool_id);
+
if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
(*cleancache_ops.invalidate_inode)(pool_id, key);
}
@@ -194,16 +297,30 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
*/
void __cleancache_invalidate_fs(struct super_block *sb)
{
- if (sb->cleancache_poolid >= 0) {
- int old_poolid = sb->cleancache_poolid;
- sb->cleancache_poolid = -1;
- (*cleancache_ops.invalidate_fs)(old_poolid);
+ int index;
+ int fake_pool_id = sb->cleancache_poolid;
+ int old_poolid = fake_pool_id;
+
+ if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
+ index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
+ old_poolid = shared_fs_poolid_map[index];
+ shared_fs_poolid_map[index] = FS_UNKNOWN;
+ uuids[index] = NULL;
+ } else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
+ index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
+ old_poolid = fs_poolid_map[index];
+ fs_poolid_map[index] = FS_UNKNOWN;
}
+ sb->cleancache_poolid = -1;
+ if (backend_registered)
+ (*cleancache_ops.invalidate_fs)(old_poolid);
}
EXPORT_SYMBOL(__cleancache_invalidate_fs);

static int __init init_cleancache(void)
{
+ int i;
+
#ifdef CONFIG_DEBUG_FS
struct dentry *root = debugfs_create_dir("cleancache", NULL);
if (root == NULL)
@@ -215,6 +332,11 @@ static int __init init_cleancache(void)
debugfs_create_u64("invalidates", S_IRUGO,
root, &cleancache_invalidates);
#endif
+ for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+ fs_poolid_map[i] = FS_UNKNOWN;
+ shared_fs_poolid_map[i] = FS_UNKNOWN;
+ }
+ cleancache_enabled = 1;
return 0;
}
module_init(init_cleancache)
--
1.7.11.7

2012-11-02 18:40:16

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 3/5] staging: zcache2+ramster: enable zcache2 to be built/loaded as a module

On Wed, Oct 31, 2012 at 08:07:52AM -0700, Dan Magenheimer wrote:
> Allow zcache2 to be built/loaded as a module. Note runtime dependency
> disallows loading if cleancache/frontswap lazy initialization patches
> are not present. Zsmalloc support has not yet been merged into zcache2
> but, once merged, could now easily be selected via a module_param.
>
> If built-in (not built as a module), the original mechanism of enabling via
> a kernel boot parameter is retained, but this should be considered deprecated.
>
> Note that module unload is explicitly not yet supported.

I had an issue applying it on v3.7-rc3 because of the Kconfig. Not sure why,
as it looks exactly the same.

The patch looks good, however..

> @@ -1812,9 +1846,28 @@ static int __init zcache_init(void)
> }
> if (ramster_enabled)
> ramster_init(!disable_cleancache, !disable_frontswap,
> - frontswap_has_exclusive_gets);
> + frontswap_has_exclusive_gets,
> + !disable_frontswap_selfshrink);
> out:
> return ret;
> }

.. ramster_init change is in the next patch. So it looks like the
patch order is a bit mismatched.

2012-11-03 01:21:31

by Bob Liu

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

On Sat, Nov 3, 2012 at 2:27 AM, Konrad Rzeszutek Wilk
<[email protected]> wrote:
> On Wed, Oct 31, 2012 at 12:05:32PM -0500, Seth Jennings wrote:
>> On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
>> > With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
>> > built/loaded as modules rather than built-in and enabled by a boot parameter,
>> > this patch provides "lazy initialization", allowing backends to register to
>> > frontswap even after swapon was run. Before a backend registers all calls
>> > to init are recorded and the creation of tmem_pools delayed until a backend
>> > registers or until a frontswap put is attempted.
>> >
>> > Signed-off-by: Stefan Hengelein <[email protected]>
>> > Signed-off-by: Florian Schmaus <[email protected]>
>> > Signed-off-by: Andor Daam <[email protected]>
>> > Signed-off-by: Dan Magenheimer <[email protected]>
>> > ---
>> > include/linux/frontswap.h | 1 +
>> > mm/frontswap.c | 70 +++++++++++++++++++++++++++++++++++++++-----
>> > 2 files changed, 63 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
>> > index 3044254..ef6ada6 100644
>> > --- a/include/linux/frontswap.h
>> > +++ b/include/linux/frontswap.h
>> > @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
>> > extern void frontswap_tmem_exclusive_gets(bool);
>> >
>> > extern void __frontswap_init(unsigned type);
>> > +#define FRONTSWAP_HAS_LAZY_INIT
>> > extern int __frontswap_store(struct page *page);
>> > extern int __frontswap_load(struct page *page);
>> > extern void __frontswap_invalidate_page(unsigned, pgoff_t);
>> > diff --git a/mm/frontswap.c b/mm/frontswap.c
>> > index 2890e67..523a19b 100644
>> > --- a/mm/frontswap.c
>> > +++ b/mm/frontswap.c
>> > @@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
>> > static inline void inc_frontswap_failed_stores(void) { }
>> > static inline void inc_frontswap_invalidates(void) { }
>> > #endif
>> > +
>> > +/*
>> > + * When no backend is registered all calls to init are registered and
>> > + * remembered but fail to create tmem_pools. When a backend registers with
>> > + * frontswap the previous calls to init are executed to create tmem_pools
>> > + * and set the respective poolids.
>> > + * While no backend is registered all "puts", "gets" and "flushes" are
>> > + * ignored or fail.
>> > + */
>> > +#define MAX_INITIALIZABLE_SD 32
>>
>> MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES
>>
>> > +static int sds[MAX_INITIALIZABLE_SD];
>>
>> Rather than store and array of enabled types indexed by type, why not
>> an array of booleans indexed by type. Or a bitfield if you really
>> want to save space.
>>
>> > +static int backend_registered;
>>
>> (backend_registered) is equivalent to checking (frontswap_ops != NULL)
>> right?
>>
>> > +
>> > /*
>> > * Register operations for frontswap, returning previous thus allowing
>> > * detection of multiple backends and possible nesting.
>> > @@ -87,9 +100,16 @@ static inline void inc_frontswap_invalidates(void) { }
>> > struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
>> > {
>> > struct frontswap_ops old = frontswap_ops;
>> > + int i;
>> >
>> > frontswap_ops = *ops;
>> > frontswap_enabled = true;
>> > +
>> > + backend_registered = 1;
>> > + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
>> > + if (sds[i] != -1)
>> > + (*frontswap_ops.init)(sds[i]);
>> > + }
>> > return old;
>> > }
>> > EXPORT_SYMBOL(frontswap_register_ops);
>> > @@ -122,7 +142,10 @@ void __frontswap_init(unsigned type)
>> > BUG_ON(sis == NULL);
>> > if (sis->frontswap_map == NULL)
>> > return;
>> > - frontswap_ops.init(type);
>> > + if (backend_registered) {
>> > + (*frontswap_ops.init)(type);
>> > + sds[type] = type;
>>
>> This is weird, storing the type in an array indexed by type. Hence my
>> suggestion above about an array of booleans or a bitfield.
>>
>> > + }
>> > }
>> > EXPORT_SYMBOL(__frontswap_init);
>> >
>> > @@ -147,10 +170,20 @@ int __frontswap_store(struct page *page)
>> > struct swap_info_struct *sis = swap_info[type];
>> > pgoff_t offset = swp_offset(entry);
>> >
>> > + if (!backend_registered) {
>> > + inc_frontswap_failed_stores();
>> > + return ret;
>> > + }
>> > +
>> > BUG_ON(!PageLocked(page));
>> > BUG_ON(sis == NULL);
>> > if (frontswap_test(sis, offset))
>> > dup = 1;
>> > + if (type < MAX_INITIALIZABLE_SD && sds[type] == -1) {
>> > + /* lazy init call to handle post-boot insmod backends*/
>> > + (*frontswap_ops.init)(type);
>> > + sds[type] = type;
>> > + }
>> > ret = frontswap_ops.store(type, offset, page);
>> > if (ret == 0) {
>> > frontswap_set(sis, offset);
>> > @@ -186,6 +219,9 @@ int __frontswap_load(struct page *page)
>> > struct swap_info_struct *sis = swap_info[type];
>> > pgoff_t offset = swp_offset(entry);
>> >
>> > + if (!backend_registered)
>> > + return ret;
>> > +
>> > BUG_ON(!PageLocked(page));
>> > BUG_ON(sis == NULL);
>> > if (frontswap_test(sis, offset))
>> > @@ -209,6 +245,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
>> > {
>> > struct swap_info_struct *sis = swap_info[type];
>> >
>> > + if (!backend_registered)
>> > + return;
>> > +
>> > BUG_ON(sis == NULL);
>> > if (frontswap_test(sis, offset)) {
>> > frontswap_ops.invalidate_page(type, offset);
>> > @@ -225,13 +264,23 @@ EXPORT_SYMBOL(__frontswap_invalidate_page);
>> > void __frontswap_invalidate_area(unsigned type)
>> > {
>> > struct swap_info_struct *sis = swap_info[type];
>> > -
>> > - BUG_ON(sis == NULL);
>> > - if (sis->frontswap_map == NULL)
>> > - return;
>> > - frontswap_ops.invalidate_area(type);
>> > - atomic_set(&sis->frontswap_pages, 0);
>> > - memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> > + int i;
>> > +
>> > + if (backend_registered) {
>> > + BUG_ON(sis == NULL);
>> > + if (sis->frontswap_map == NULL)
>> > + return;
>> > + (*frontswap_ops.invalidate_area)(type);
>> > + atomic_set(&sis->frontswap_pages, 0);
>> > + memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> > + } else {
>> > + for (i = 0; i < MAX_INITIALIZABLE_SD; i++) {
>> > + if (sds[i] == type) {
>>
>> Additional weirdness with sds. It seems this whole for loop could
>> just be reduced to:
>>
>> sds[type] = -1;
>
>
> How does this look? (I hadn't actually tested it, but did compile test
> it)
>
> From f545530e9ef2b0623ab9e78d490595e3b7eaa3fa Mon Sep 17 00:00:00 2001
> From: Dan Magenheimer <[email protected]>
> Date: Wed, 31 Oct 2012 08:07:51 -0700
> Subject: [PATCH 2/2] mm: frontswap: lazy initialization to allow tmem
> backends to build/run as modules
>
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> frontswap even after swapon was run. Before a backend registers all calls
> to init are recorded and the creation of tmem_pools delayed until a backend
> registers or until a frontswap put is attempted.
>
> Signed-off-by: Stefan Hengelein <[email protected]>
> Signed-off-by: Florian Schmaus <[email protected]>
> Signed-off-by: Andor Daam <[email protected]>
> Signed-off-by: Dan Magenheimer <[email protected]>
> [v1: Fixes per Seth Jennings suggestions]
> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> include/linux/frontswap.h | 1 +
> mm/frontswap.c | 59 +++++++++++++++++++++++++++++++++++++++++------
> 2 files changed, 53 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
> index 3044254..ef6ada6 100644
> --- a/include/linux/frontswap.h
> +++ b/include/linux/frontswap.h
> @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
> extern void frontswap_tmem_exclusive_gets(bool);
>
> extern void __frontswap_init(unsigned type);
> +#define FRONTSWAP_HAS_LAZY_INIT
> extern int __frontswap_store(struct page *page);
> extern int __frontswap_load(struct page *page);
> extern void __frontswap_invalidate_page(unsigned, pgoff_t);
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 2890e67..4e04549 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
> static inline void inc_frontswap_failed_stores(void) { }
> static inline void inc_frontswap_invalidates(void) { }
> #endif
> +
> +/*
> + * When no backend is registered all calls to init are registered and
> + * remembered but fail to create tmem_pools. When a backend registers with
> + * frontswap the previous calls to init are executed to create tmem_pools
> + * and set the respective poolids.
> + * While no backend is registered all "puts", "gets" and "flushes" are
> + * ignored or fail.
> + */
> +static DECLARE_BITMAP(sds, MAX_SWAPFILES);
> +static bool backend_registered __read_mostly;
> +

Yes, I also prefer to use a bitmap and reuse MAX_SWAPFILES.

> /*
> * Register operations for frontswap, returning previous thus allowing
> * detection of multiple backends and possible nesting.
> @@ -87,9 +99,16 @@ static inline void inc_frontswap_invalidates(void) { }
> struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
> {
> struct frontswap_ops old = frontswap_ops;
> + int i;
>
> frontswap_ops = *ops;
> frontswap_enabled = true;
> +
> + backend_registered = true;
> + for (i = 0; i < MAX_SWAPFILES; i++) {
> + if (test_bit(i, sds))
> + (*frontswap_ops.init)(i);
> + }
> return old;
> }
> EXPORT_SYMBOL(frontswap_register_ops);
> @@ -122,7 +141,10 @@ void __frontswap_init(unsigned type)
> BUG_ON(sis == NULL);
> if (sis->frontswap_map == NULL)
> return;
> - frontswap_ops.init(type);
> + if (backend_registered) {
> + (*frontswap_ops.init)(type);
> + set_bit(type, sds);
> + }
> }

What about setting the bit if the backend is not registered and clearing it on invalidate?
I think that reads more directly.
Like:
+ if (backend_registered) {
+ BUG_ON(sis == NULL);
+ if (sis->frontswap_map == NULL)
+ return;
+ frontswap_ops.init(type);
+ }
+ else {
+ BUG_ON(type > MAX_SWAPFILES);
+ set_bit(type, sds);
+ }


> EXPORT_SYMBOL(__frontswap_init);
>
> @@ -147,10 +169,20 @@ int __frontswap_store(struct page *page)
> struct swap_info_struct *sis = swap_info[type];
> pgoff_t offset = swp_offset(entry);
>
> + if (!backend_registered) {
> + inc_frontswap_failed_stores();
> + return ret;
> + }
> +
> BUG_ON(!PageLocked(page));
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset))
> dup = 1;
> + if (type < MAX_SWAPFILES && !test_bit(type, sds)) {
> + /* lazy init call to handle post-boot insmod backends*/
> + (*frontswap_ops.init)(type);
> + set_bit(type, sds);
> + }

Then rm this.

> ret = frontswap_ops.store(type, offset, page);
> if (ret == 0) {
> frontswap_set(sis, offset);
> @@ -186,6 +218,9 @@ int __frontswap_load(struct page *page)
> struct swap_info_struct *sis = swap_info[type];
> pgoff_t offset = swp_offset(entry);
>
> + if (!backend_registered)
> + return ret;
> +
> BUG_ON(!PageLocked(page));
> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset))
> @@ -209,6 +244,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> {
> struct swap_info_struct *sis = swap_info[type];
>
> + if (!backend_registered)
> + return;
> +

I'm not sure whether __frontswap_invalidate_page() will ever be called if
no backend is registered.

> BUG_ON(sis == NULL);
> if (frontswap_test(sis, offset)) {
> frontswap_ops.invalidate_page(type, offset);
> @@ -226,12 +264,16 @@ void __frontswap_invalidate_area(unsigned type)
> {
> struct swap_info_struct *sis = swap_info[type];
>
> - BUG_ON(sis == NULL);
> - if (sis->frontswap_map == NULL)
> - return;
> - frontswap_ops.invalidate_area(type);
> - atomic_set(&sis->frontswap_pages, 0);
> - memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> + if (backend_registered) {
> + BUG_ON(sis == NULL);
> + if (sis->frontswap_map == NULL)
> + return;
> + (*frontswap_ops.invalidate_area)(type);
> + atomic_set(&sis->frontswap_pages, 0);
> + memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> + } else {
> + bitmap_zero(sds, MAX_SWAPFILES);

Use clear_bit(type, sds) here;

> + }
> }
> EXPORT_SYMBOL(__frontswap_invalidate_area);
>
> @@ -364,6 +406,9 @@ static int __init init_frontswap(void)
> debugfs_create_u64("invalidates", S_IRUGO,
> root, &frontswap_invalidates);
> #endif
> + bitmap_zero(sds, MAX_SWAPFILES);
> +
> + frontswap_enabled = 1;

We'd better init backend_registered = false also.

> return 0;
> }
>
> --
> 1.7.11.7
>

--
Thanks,
--Bob

2012-11-14 16:26:28

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 2/5] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

> On Sat, Nov 3, 2012 at 2:27 AM, Konrad Rzeszutek Wilk
> <[email protected]> wrote:
> > On Wed, Oct 31, 2012 at 12:05:32PM -0500, Seth Jennings wrote:
> >> On 10/31/2012 10:07 AM, Dan Magenheimer wrote:
> >> > With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> >> > built/loaded as modules rather than built-in and enabled by a boot parameter,
> >> > this patch provides "lazy initialization", allowing backends to register to
> >> > frontswap even after swapon was run. Before a backend registers all calls
> >> > to init are recorded and the creation of tmem_pools delayed until a backend
> >> > registers or until a frontswap put is attempted.
> >> >
> >> > Signed-off-by: Stefan Hengelein <[email protected]>
> >> > Signed-off-by: Florian Schmaus <[email protected]>
> >> > Signed-off-by: Andor Daam <[email protected]>
> >> > Signed-off-by: Dan Magenheimer <[email protected]>
> >> > ---
> >> > include/linux/frontswap.h | 1 +
> >> > mm/frontswap.c | 70 +++++++++++++++++++++++++++++++++++++++-----
> >> > 2 files changed, 63 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
> >> > index 3044254..ef6ada6 100644
> >> > --- a/include/linux/frontswap.h
> >> > +++ b/include/linux/frontswap.h
> >> > @@ -23,6 +23,7 @@ extern void frontswap_writethrough(bool);
> >> > extern void frontswap_tmem_exclusive_gets(bool);
> >> >
> >> > extern void __frontswap_init(unsigned type);
> >> > +#define FRONTSWAP_HAS_LAZY_INIT
> >> > extern int __frontswap_store(struct page *page);
> >> > extern int __frontswap_load(struct page *page);
> >> > extern void __frontswap_invalidate_page(unsigned, pgoff_t);
> >> > diff --git a/mm/frontswap.c b/mm/frontswap.c
> >> > index 2890e67..523a19b 100644
> >> > --- a/mm/frontswap.c
> >> > +++ b/mm/frontswap.c
> >> > @@ -80,6 +80,19 @@ static inline void inc_frontswap_succ_stores(void) { }
> >> > static inline void inc_frontswap_failed_stores(void) { }
> >> > static inline void inc_frontswap_invalidates(void) { }
> >> > #endif
> >> > +
> >> > +/*
> >> > + * When no backend is registered all calls to init are registered and
> >> > + * remembered but fail to create tmem_pools. When a backend registers with
> >> > + * frontswap the previous calls to init are executed to create tmem_pools
> >> > + * and set the respective poolids.
> >> > + * While no backend is registered all "puts", "gets" and "flushes" are
> >> > + * ignored or fail.
> >> > + */
> >> > +#define MAX_INITIALIZABLE_SD 32
> >>
> >> MAX_INITIALIZABLE_SD should just be MAX_SWAPFILES
> >>
> >> > +static int sds[MAX_INITIALIZABLE_SD];
> >>
> >> Rather than store and array of enabled types indexed by type, why not
> >> an array of booleans indexed by type. Or a bitfield if you really
> >> want to save space.
> >>
> >> > +static int backend_registered;
> >>
> >> (backend_registered) is equivalent to checking (frontswap_ops != NULL)
> >> right?

Kind of. frontswap_ops is not a pointer though, so it would be more of
a frontswap_ops != dummy check. Let's make another patch that converts it to a
pointer and then rip out backend_registered.
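
Roughly (a hypothetical sketch of that follow-up, not part of this series):

static struct frontswap_ops *frontswap_ops __read_mostly;

struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
{
	struct frontswap_ops *old = frontswap_ops;

	frontswap_ops = ops;
	return old;
}

/* the hooks can then test the pointer instead of backend_registered: */
if (!frontswap_ops)
	return -1;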
.. snip..
> > if (sis->frontswap_map == NULL)
> > return;
> > - frontswap_ops.init(type);
> > + if (backend_registered) {
> > + (*frontswap_ops.init)(type);
> > + set_bit(type, sds);
> > + }
> > }
>
> What about setting the bit if the backend is not registered and clearing it on invalidate?
> I think that reads more directly.
> Like:
> + if (backend_registered) {
> + BUG_ON(sis == NULL);
> + if (sis->frontswap_map == NULL)
> + return;
> + frontswap_ops.init(type);
> + }
> + else {
> + BUG_ON(type > MAX_SWAPFILES);
> + set_bit(type, sds);
> + }

Good idea.
>
>
> > EXPORT_SYMBOL(__frontswap_init);
> >
> > @@ -147,10 +169,20 @@ int __frontswap_store(struct page *page)
> > struct swap_info_struct *sis = swap_info[type];
> > pgoff_t offset = swp_offset(entry);
> >
> > + if (!backend_registered) {
> > + inc_frontswap_failed_stores();
> > + return ret;
> > + }
> > +
> > BUG_ON(!PageLocked(page));
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset))
> > dup = 1;
> > + if (type < MAX_SWAPFILES && !test_bit(type, sds)) {
> > + /* lazy init call to handle post-boot insmod backends*/
> > + (*frontswap_ops.init)(type);
> > + set_bit(type, sds);
> > + }
>
> Then rm this.

Right, b/c the frontswap_init takes care of initializing the backend.
And this does not get called _until_ backend_registered is set.

So we have to be extra careful to set backend_registered _after_
all the frontswap.init have been called.
>
> > ret = frontswap_ops.store(type, offset, page);
> > if (ret == 0) {
> > frontswap_set(sis, offset);
> > @@ -186,6 +218,9 @@ int __frontswap_load(struct page *page)
> > struct swap_info_struct *sis = swap_info[type];
> > pgoff_t offset = swp_offset(entry);
> >
> > + if (!backend_registered)
> > + return ret;
> > +
> > BUG_ON(!PageLocked(page));
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset))
> > @@ -209,6 +244,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> > {
> > struct swap_info_struct *sis = swap_info[type];
> >
> > + if (!backend_registered)
> > + return;
> > +
>
> I'm not sure whether __frontswap_invalidate_page() will ever be called if
> no backend is registered.

Yes.

User could do:

swapon /dev/sda3
swapoff /dev/sda3
modprobe zcache

>
> > BUG_ON(sis == NULL);
> > if (frontswap_test(sis, offset)) {
> > frontswap_ops.invalidate_page(type, offset);
> > @@ -226,12 +264,16 @@ void __frontswap_invalidate_area(unsigned type)
> > {
> > struct swap_info_struct *sis = swap_info[type];
> >
> > - BUG_ON(sis == NULL);
> > - if (sis->frontswap_map == NULL)
> > - return;
> > - frontswap_ops.invalidate_area(type);
> > - atomic_set(&sis->frontswap_pages, 0);
> > - memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > + if (backend_registered) {
> > + BUG_ON(sis == NULL);
> > + if (sis->frontswap_map == NULL)
> > + return;
> > + (*frontswap_ops.invalidate_area)(type);
> > + atomic_set(&sis->frontswap_pages, 0);
> > + memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> > + } else {
> > + bitmap_zero(sds, MAX_SWAPFILES);
>
> Use clear_bit(type, sds) here;

Yikes. Yes. It actually could be unconditional too.
>
> > + }
> > }
> > EXPORT_SYMBOL(__frontswap_invalidate_area);
> >
> > @@ -364,6 +406,9 @@ static int __init init_frontswap(void)
> > debugfs_create_u64("invalidates", S_IRUGO,
> > root, &frontswap_invalidates);
> > #endif
> > + bitmap_zero(sds, MAX_SWAPFILES);
> > +
> > + frontswap_enabled = 1;
>
> We'd better init backend_registered = false also.

I think we are OK. The .bss is set to zero so that means
backend_registered is by default zero.

The end result would look like this (I have not compile-tested it yet):

From a13ed2c85b220c62035ab7ac79ad8a62f9f29c13 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <[email protected]>
Date: Wed, 31 Oct 2012 08:07:51 -0700
Subject: [PATCH] mm: frontswap: lazy initialization to allow tmem backends to
build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon was run. Before a backend registers all calls
to init are recorded and the creation of tmem_pools delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <[email protected]>
Signed-off-by: Florian Schmaus <[email protected]>
Signed-off-by: Andor Daam <[email protected]>
Signed-off-by: Dan Magenheimer <[email protected]>
[v1: Fixes per Seth Jennings suggestions]
[v2: Removed FRONTSWAP_HAS_.. ]
[v3: Fix up per Bob Liu <[email protected]> recommendations]
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
---
mm/frontswap.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++--------
1 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..db90736 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
static inline void inc_frontswap_failed_stores(void) { }
static inline void inc_frontswap_invalidates(void) { }
#endif
+
+/*
+ * When no backend is registered all calls to init are registered and
+ * remembered but fail to create tmem_pools. When a backend registers with
+ * frontswap the previous calls to init are executed to create tmem_pools
+ * and set the respective poolids.
+ * While no backend is registered all "puts", "gets" and "flushes" are
+ * ignored or fail.
+ */
+static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
/*
* Register operations for frontswap, returning previous thus allowing
* detection of multiple backends and possible nesting.
@@ -87,9 +99,19 @@ static inline void inc_frontswap_invalidates(void) { }
struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
{
struct frontswap_ops old = frontswap_ops;
+ int i;

frontswap_ops = *ops;
frontswap_enabled = true;
+
+ for (i = 0; i < MAX_SWAPFILES; i++) {
+ if (test_and_clear_bit(i, need_init))
+ (*frontswap_ops.init)(i);
+ }
+ /* We MUST have backend_registered called _after_ the frontswap_init's
+ * have been called. Otherwise __frontswap_store might fail. */
+ barrier();
+ backend_registered = true;
return old;
}
EXPORT_SYMBOL(frontswap_register_ops);
@@ -119,10 +141,17 @@ void __frontswap_init(unsigned type)
{
struct swap_info_struct *sis = swap_info[type];

- BUG_ON(sis == NULL);
- if (sis->frontswap_map == NULL)
- return;
- frontswap_ops.init(type);
+ if (backend_registered) {
+ BUG_ON(sis == NULL);
+ if (sis->frontswap_map == NULL)
+ return;
+ (*frontswap_ops.init)(type);
+ }
+ else {
+ BUG_ON(type > MAX_SWAPFILES);
+ set_bit(type, need_init);
+ }
+
}
EXPORT_SYMBOL(__frontswap_init);

@@ -147,6 +176,11 @@ int __frontswap_store(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered) {
+ inc_frontswap_failed_stores();
+ return ret;
+ }
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
@@ -186,6 +220,9 @@ int __frontswap_load(struct page *page)
struct swap_info_struct *sis = swap_info[type];
pgoff_t offset = swp_offset(entry);

+ if (!backend_registered)
+ return ret;
+
BUG_ON(!PageLocked(page));
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset))
@@ -209,6 +246,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
{
struct swap_info_struct *sis = swap_info[type];

+ if (!backend_registered)
+ return;
+
BUG_ON(sis == NULL);
if (frontswap_test(sis, offset)) {
frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +266,15 @@ void __frontswap_invalidate_area(unsigned type)
{
struct swap_info_struct *sis = swap_info[type];

- BUG_ON(sis == NULL);
- if (sis->frontswap_map == NULL)
- return;
- frontswap_ops.invalidate_area(type);
- atomic_set(&sis->frontswap_pages, 0);
- memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ if (backend_registered) {
+ BUG_ON(sis == NULL);
+ if (sis->frontswap_map == NULL)
+ return;
+ (*frontswap_ops.invalidate_area)(type);
+ atomic_set(&sis->frontswap_pages, 0);
+ memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+ }
+ clear_bit(type, need_init);
}
EXPORT_SYMBOL(__frontswap_invalidate_area);

@@ -364,6 +407,9 @@ static int __init init_frontswap(void)
debugfs_create_u64("invalidates", S_IRUGO,
root, &frontswap_invalidates);
#endif
+ bitmap_zero(need_init, MAX_SWAPFILES);
+
+ frontswap_enabled = 1;
return 0;
}

--
1.7.7.6