2015-08-05 13:50:50

by Dan Streetman

[permalink] [raw]
Subject: [PATCH 0/3] make zswap params changeable at runtime

This is a resend of the patch series. It makes creation of the zpool and
compressor dynamic, so that they can be changed at runtime. This makes
using/configuring zswap easier, as before this zswap had to be configured
at boot time, using boot params.

This uses a single list to track both the zpool and compressor together,
although Seth had mentioned an alternative which is to track the zpools
and compressors using separate lists. In the most common case, only a
single zpool and single compressor, using one list is slightly simpler
than using two lists, and for the uncommon case of multiple zpools and/or
compressors, using one list is slightly less simple (and uses slightly
more memory, probably) than using two lists.

Dan Streetman (3):
zpool: add zpool_has_pool()
zswap: dynamic pool creation
zswap: change zpool/compressor at runtime

include/linux/zpool.h | 2 +
mm/zpool.c | 25 ++
mm/zswap.c | 683 ++++++++++++++++++++++++++++++++++++++------------
3 files changed, 555 insertions(+), 155 deletions(-)

--
2.1.0


2015-08-05 13:47:04

by Dan Streetman

[permalink] [raw]
Subject: [PATCH 1/3] zpool: add zpool_has_pool()

Add zpool_has_pool() function, indicating if the specified type of zpool
is available (i.e. zsmalloc or zbud). This allows checking if a pool is
available, without actually trying to allocate it, similar to
crypto_has_alg().

This is used by a following patch to zswap that enables the dynamic
runtime creation of zswap zpools.

Signed-off-by: Dan Streetman <[email protected]>
---
include/linux/zpool.h | 2 ++
mm/zpool.c | 25 +++++++++++++++++++++++++
2 files changed, 27 insertions(+)

diff --git a/include/linux/zpool.h b/include/linux/zpool.h
index c924a28..42f8ec9 100644
--- a/include/linux/zpool.h
+++ b/include/linux/zpool.h
@@ -36,6 +36,8 @@ enum zpool_mapmode {
ZPOOL_MM_DEFAULT = ZPOOL_MM_RW
};

+bool zpool_has_pool(char *type);
+
struct zpool *zpool_create_pool(char *type, char *name,
gfp_t gfp, const struct zpool_ops *ops);

diff --git a/mm/zpool.c b/mm/zpool.c
index 951db32..aafcf8f 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -100,6 +100,31 @@ static void zpool_put_driver(struct zpool_driver *driver)
}

/**
+ * zpool_has_pool() - Check if the pool driver is available
+ * @type The type of the zpool to check (e.g. zbud, zsmalloc)
+ *
+ * This checks if the @type pool driver is available.
+ *
+ * Returns: true if @type pool is available, false if not
+ */
+bool zpool_has_pool(char *type)
+{
+ struct zpool_driver *driver = zpool_get_driver(type);
+
+ if (!driver) {
+ request_module("zpool-%s", type);
+ driver = zpool_get_driver(type);
+ }
+
+ if (!driver)
+ return false;
+
+ zpool_put_driver(driver);
+ return true;
+}
+EXPORT_SYMBOL(zpool_has_pool);
+
+/**
* zpool_create_pool() - Create a new zpool
* @type The type of the zpool to create (e.g. zbud, zsmalloc)
* @name The name of the zpool (e.g. zram0, zswap)
--
2.1.0

2015-08-05 13:47:13

by Dan Streetman

[permalink] [raw]
Subject: [PATCH 2/3] zswap: dynamic pool creation

Add dynamic creation of pools. Move the static crypto compression
per-cpu transforms into each pool. Add a pointer to zswap_entry to
the pool it's in.

This is required by the following patch which enables changing the
zswap zpool and compressor params at runtime.

Signed-off-by: Dan Streetman <[email protected]>
---
mm/zswap.c | 550 +++++++++++++++++++++++++++++++++++++++++++++----------------
1 file changed, 407 insertions(+), 143 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 48a1d08..f8fcd7e 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -99,66 +99,19 @@ module_param_named(zpool, zswap_zpool_type, charp, 0444);
static struct zpool *zswap_pool;

/*********************************
-* compression functions
+* data structures
**********************************/
-/* per-cpu compression transforms */
-static struct crypto_comp * __percpu *zswap_comp_pcpu_tfms;

-enum comp_op {
- ZSWAP_COMPOP_COMPRESS,
- ZSWAP_COMPOP_DECOMPRESS
+struct zswap_pool {
+ struct zpool *zpool;
+ struct kref kref;
+ struct list_head list;
+ struct rcu_head rcu_head;
+ struct notifier_block notifier;
+ char tfm_name[CRYPTO_MAX_ALG_NAME];
+ struct crypto_comp * __percpu *tfm;
};

-static int zswap_comp_op(enum comp_op op, const u8 *src, unsigned int slen,
- u8 *dst, unsigned int *dlen)
-{
- struct crypto_comp *tfm;
- int ret;
-
- tfm = *per_cpu_ptr(zswap_comp_pcpu_tfms, get_cpu());
- switch (op) {
- case ZSWAP_COMPOP_COMPRESS:
- ret = crypto_comp_compress(tfm, src, slen, dst, dlen);
- break;
- case ZSWAP_COMPOP_DECOMPRESS:
- ret = crypto_comp_decompress(tfm, src, slen, dst, dlen);
- break;
- default:
- ret = -EINVAL;
- }
-
- put_cpu();
- return ret;
-}
-
-static int __init zswap_comp_init(void)
-{
- if (!crypto_has_comp(zswap_compressor, 0, 0)) {
- pr_info("%s compressor not available\n", zswap_compressor);
- /* fall back to default compressor */
- zswap_compressor = ZSWAP_COMPRESSOR_DEFAULT;
- if (!crypto_has_comp(zswap_compressor, 0, 0))
- /* can't even load the default compressor */
- return -ENODEV;
- }
- pr_info("using %s compressor\n", zswap_compressor);
-
- /* alloc percpu transforms */
- zswap_comp_pcpu_tfms = alloc_percpu(struct crypto_comp *);
- if (!zswap_comp_pcpu_tfms)
- return -ENOMEM;
- return 0;
-}
-
-static void __init zswap_comp_exit(void)
-{
- /* free percpu transforms */
- free_percpu(zswap_comp_pcpu_tfms);
-}
-
-/*********************************
-* data structures
-**********************************/
/*
* struct zswap_entry
*
@@ -166,22 +119,24 @@ static void __init zswap_comp_exit(void)
* page within zswap.
*
* rbnode - links the entry into red-black tree for the appropriate swap type
+ * offset - the swap offset for the entry. Index into the red-black tree.
* refcount - the number of outstanding reference to the entry. This is needed
* to protect against premature freeing of the entry by code
* concurrent calls to load, invalidate, and writeback. The lock
* for the zswap_tree structure that contains the entry must
* be held while changing the refcount. Since the lock must
* be held, there is no reason to also make refcount atomic.
- * offset - the swap offset for the entry. Index into the red-black tree.
- * handle - zpool allocation handle that stores the compressed page data
* length - the length in bytes of the compressed page data. Needed during
* decompression
+ * pool - the zswap_pool the entry's data is in
+ * handle - zpool allocation handle that stores the compressed page data
*/
struct zswap_entry {
struct rb_node rbnode;
pgoff_t offset;
int refcount;
unsigned int length;
+ struct zswap_pool *pool;
unsigned long handle;
};

@@ -201,6 +156,44 @@ struct zswap_tree {

static struct zswap_tree *zswap_trees[MAX_SWAPFILES];

+/* RCU-protected iteration */
+static LIST_HEAD(zswap_pools);
+/* protects zswap_pools list modification */
+static DEFINE_SPINLOCK(zswap_pools_lock);
+
+/*********************************
+* helpers and fwd declarations
+**********************************/
+
+#define zswap_pool_debug(msg, p) \
+ pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name, \
+ zpool_get_type((p)->zpool))
+
+static int zswap_writeback_entry(struct zpool *pool, unsigned long handle);
+static int zswap_pool_get(struct zswap_pool *pool);
+static void zswap_pool_put(struct zswap_pool *pool);
+
+static bool zswap_is_full(void)
+{
+ return totalram_pages * zswap_max_pool_percent / 100 <
+ DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
+}
+
+static void zswap_update_total_size(void)
+{
+ struct zswap_pool *pool;
+ u64 total = 0;
+
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(pool, &zswap_pools, list)
+ total += zpool_get_total_size(pool->zpool);
+
+ rcu_read_unlock();
+
+ zswap_pool_total_size = total;
+}
+
/*********************************
* zswap entry functions
**********************************/
@@ -294,10 +287,11 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
*/
static void zswap_free_entry(struct zswap_entry *entry)
{
- zpool_free(zswap_pool, entry->handle);
+ zpool_free(entry->pool->zpool, entry->handle);
+ zswap_pool_put(entry->pool);
zswap_entry_cache_free(entry);
atomic_dec(&zswap_stored_pages);
- zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+ zswap_update_total_size();
}

/* caller must hold the tree lock */
@@ -339,35 +333,21 @@ static struct zswap_entry *zswap_entry_find_get(struct rb_root *root,
**********************************/
static DEFINE_PER_CPU(u8 *, zswap_dstmem);

-static int __zswap_cpu_notifier(unsigned long action, unsigned long cpu)
+static int __zswap_cpu_dstmem_notifier(unsigned long action, unsigned long cpu)
{
- struct crypto_comp *tfm;
u8 *dst;

switch (action) {
case CPU_UP_PREPARE:
- tfm = crypto_alloc_comp(zswap_compressor, 0, 0);
- if (IS_ERR(tfm)) {
- pr_err("can't allocate compressor transform\n");
- return NOTIFY_BAD;
- }
- *per_cpu_ptr(zswap_comp_pcpu_tfms, cpu) = tfm;
dst = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
if (!dst) {
pr_err("can't allocate compressor buffer\n");
- crypto_free_comp(tfm);
- *per_cpu_ptr(zswap_comp_pcpu_tfms, cpu) = NULL;
return NOTIFY_BAD;
}
per_cpu(zswap_dstmem, cpu) = dst;
break;
case CPU_DEAD:
case CPU_UP_CANCELED:
- tfm = *per_cpu_ptr(zswap_comp_pcpu_tfms, cpu);
- if (tfm) {
- crypto_free_comp(tfm);
- *per_cpu_ptr(zswap_comp_pcpu_tfms, cpu) = NULL;
- }
dst = per_cpu(zswap_dstmem, cpu);
kfree(dst);
per_cpu(zswap_dstmem, cpu) = NULL;
@@ -378,43 +358,309 @@ static int __zswap_cpu_notifier(unsigned long action, unsigned long cpu)
return NOTIFY_OK;
}

-static int zswap_cpu_notifier(struct notifier_block *nb,
- unsigned long action, void *pcpu)
+static int zswap_cpu_dstmem_notifier(struct notifier_block *nb,
+ unsigned long action, void *pcpu)
{
- unsigned long cpu = (unsigned long)pcpu;
- return __zswap_cpu_notifier(action, cpu);
+ return __zswap_cpu_dstmem_notifier(action, (unsigned long)pcpu);
}

-static struct notifier_block zswap_cpu_notifier_block = {
- .notifier_call = zswap_cpu_notifier
+static struct notifier_block zswap_dstmem_notifier = {
+ .notifier_call = zswap_cpu_dstmem_notifier,
};

-static int __init zswap_cpu_init(void)
+static int __init zswap_cpu_dstmem_init(void)
{
unsigned long cpu;

cpu_notifier_register_begin();
for_each_online_cpu(cpu)
- if (__zswap_cpu_notifier(CPU_UP_PREPARE, cpu) != NOTIFY_OK)
+ if (__zswap_cpu_dstmem_notifier(CPU_UP_PREPARE, cpu) ==
+ NOTIFY_BAD)
goto cleanup;
- __register_cpu_notifier(&zswap_cpu_notifier_block);
+ __register_cpu_notifier(&zswap_dstmem_notifier);
cpu_notifier_register_done();
return 0;

cleanup:
for_each_online_cpu(cpu)
- __zswap_cpu_notifier(CPU_UP_CANCELED, cpu);
+ __zswap_cpu_dstmem_notifier(CPU_UP_CANCELED, cpu);
cpu_notifier_register_done();
return -ENOMEM;
}

+static void zswap_cpu_dstmem_destroy(void)
+{
+ unsigned long cpu;
+
+ cpu_notifier_register_begin();
+ for_each_online_cpu(cpu)
+ __zswap_cpu_dstmem_notifier(CPU_UP_CANCELED, cpu);
+ __unregister_cpu_notifier(&zswap_dstmem_notifier);
+ cpu_notifier_register_done();
+}
+
+static int __zswap_cpu_comp_notifier(struct zswap_pool *pool,
+ unsigned long action, unsigned long cpu)
+{
+ struct crypto_comp *tfm;
+
+ switch (action) {
+ case CPU_UP_PREPARE:
+ if (WARN_ON(*per_cpu_ptr(pool->tfm, cpu)))
+ break;
+ tfm = crypto_alloc_comp(pool->tfm_name, 0, 0);
+ if (IS_ERR_OR_NULL(tfm)) {
+ pr_err("could not alloc crypto comp %s : %ld\n",
+ pool->tfm_name, PTR_ERR(tfm));
+ return NOTIFY_BAD;
+ }
+ *per_cpu_ptr(pool->tfm, cpu) = tfm;
+ break;
+ case CPU_DEAD:
+ case CPU_UP_CANCELED:
+ tfm = *per_cpu_ptr(pool->tfm, cpu);
+ if (!IS_ERR_OR_NULL(tfm))
+ crypto_free_comp(tfm);
+ *per_cpu_ptr(pool->tfm, cpu) = NULL;
+ break;
+ default:
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+static int zswap_cpu_comp_notifier(struct notifier_block *nb,
+ unsigned long action, void *pcpu)
+{
+ unsigned long cpu = (unsigned long)pcpu;
+ struct zswap_pool *pool = container_of(nb, typeof(*pool), notifier);
+
+ return __zswap_cpu_comp_notifier(pool, action, cpu);
+}
+
+static int zswap_cpu_comp_init(struct zswap_pool *pool)
+{
+ unsigned long cpu;
+
+ memset(&pool->notifier, 0, sizeof(pool->notifier));
+ pool->notifier.notifier_call = zswap_cpu_comp_notifier;
+
+ cpu_notifier_register_begin();
+ for_each_online_cpu(cpu)
+ if (__zswap_cpu_comp_notifier(pool, CPU_UP_PREPARE, cpu) ==
+ NOTIFY_BAD)
+ goto cleanup;
+ __register_cpu_notifier(&pool->notifier);
+ cpu_notifier_register_done();
+ return 0;
+
+cleanup:
+ for_each_online_cpu(cpu)
+ __zswap_cpu_comp_notifier(pool, CPU_UP_CANCELED, cpu);
+ cpu_notifier_register_done();
+ return -ENOMEM;
+}
+
+static void zswap_cpu_comp_destroy(struct zswap_pool *pool)
+{
+ unsigned long cpu;
+
+ cpu_notifier_register_begin();
+ for_each_online_cpu(cpu)
+ __zswap_cpu_comp_notifier(pool, CPU_UP_CANCELED, cpu);
+ __unregister_cpu_notifier(&pool->notifier);
+ cpu_notifier_register_done();
+}
+
/*********************************
-* helpers
+* pool functions
**********************************/
-static bool zswap_is_full(void)
+
+static struct zswap_pool *__zswap_pool_current(void)
{
- return totalram_pages * zswap_max_pool_percent / 100 <
- DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
+ struct zswap_pool *pool;
+
+ pool = list_first_or_null_rcu(&zswap_pools, typeof(*pool), list);
+ WARN_ON(!pool);
+
+ return pool;
+}
+
+static struct zswap_pool *zswap_pool_current(void)
+{
+ assert_spin_locked(&zswap_pools_lock);
+
+ return __zswap_pool_current();
+}
+
+static struct zswap_pool *zswap_pool_current_get(void)
+{
+ struct zswap_pool *pool;
+
+ rcu_read_lock();
+
+ pool = __zswap_pool_current();
+ if (!pool || !zswap_pool_get(pool))
+ pool = NULL;
+
+ rcu_read_unlock();
+
+ return pool;
+}
+
+static struct zswap_pool *zswap_pool_last_get(void)
+{
+ struct zswap_pool *pool, *last = NULL;
+
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(pool, &zswap_pools, list)
+ last = pool;
+ if (!WARN_ON(!last) && !zswap_pool_get(last))
+ last = NULL;
+
+ rcu_read_unlock();
+
+ return last;
+}
+
+static struct zpool_ops zswap_zpool_ops = {
+ .evict = zswap_writeback_entry
+};
+
+static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
+{
+ struct zswap_pool *pool;
+
+ assert_spin_locked(&zswap_pools_lock);
+
+ list_for_each_entry_rcu(pool, &zswap_pools, list) {
+ if (strncmp(pool->tfm_name, compressor, sizeof(pool->tfm_name)))
+ continue;
+ if (strncmp(zpool_get_type(pool->zpool), type,
+ sizeof(zswap_zpool_type)))
+ continue;
+ /* if we can't get it, it's about to be destroyed */
+ if (!zswap_pool_get(pool))
+ continue;
+ return pool;
+ }
+
+ return NULL;
+}
+
+static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
+{
+ struct zswap_pool *pool;
+ gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN;
+
+ pool = kzalloc(sizeof(*pool), GFP_KERNEL);
+ if (!pool) {
+ pr_err("pool alloc failed\n");
+ return NULL;
+ }
+
+ pool->zpool = zpool_create_pool(type, "zswap", gfp, &zswap_zpool_ops);
+ if (!pool->zpool) {
+ pr_err("%s zpool not available\n", type);
+ goto error;
+ }
+ pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
+
+ strlcpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
+ pool->tfm = alloc_percpu(struct crypto_comp *);
+ if (!pool->tfm) {
+ pr_err("percpu alloc failed\n");
+ goto error;
+ }
+
+ if (zswap_cpu_comp_init(pool))
+ goto error;
+ pr_debug("using %s compressor\n", pool->tfm_name);
+
+ /* being the current pool takes 1 ref; this func expects the
+ * caller to always add the new pool as the current pool
+ */
+ kref_init(&pool->kref);
+ INIT_LIST_HEAD(&pool->list);
+
+ zswap_pool_debug("created", pool);
+
+ return pool;
+
+error:
+ free_percpu(pool->tfm);
+ if (pool->zpool)
+ zpool_destroy_pool(pool->zpool);
+ kfree(pool);
+ return NULL;
+}
+
+static struct zswap_pool *__zswap_pool_create_fallback(void)
+{
+ if (!crypto_has_comp(zswap_compressor, 0, 0)) {
+ pr_err("compressor %s not available, using default %s\n",
+ zswap_compressor, ZSWAP_COMPRESSOR_DEFAULT);
+ strncpy(zswap_compressor, ZSWAP_COMPRESSOR_DEFAULT,
+ sizeof(zswap_compressor));
+ }
+ if (!zpool_has_pool(zswap_zpool_type)) {
+ pr_err("zpool %s not available, using default %s\n",
+ zswap_zpool_type, ZSWAP_ZPOOL_DEFAULT);
+ strncpy(zswap_zpool_type, ZSWAP_ZPOOL_DEFAULT,
+ sizeof(zswap_zpool_type));
+ }
+
+ return zswap_pool_create(zswap_zpool_type, zswap_compressor);
+}
+
+static void zswap_pool_destroy(struct zswap_pool *pool)
+{
+ zswap_pool_debug("destroying", pool);
+
+ zswap_cpu_comp_destroy(pool);
+ free_percpu(pool->tfm);
+ zpool_destroy_pool(pool->zpool);
+ kfree(pool);
+}
+
+static int __must_check zswap_pool_get(struct zswap_pool *pool)
+{
+ return kref_get_unless_zero(&pool->kref);
+}
+
+static void __zswap_pool_release(struct rcu_head *head)
+{
+ struct zswap_pool *pool = container_of(head, typeof(*pool), rcu_head);
+
+ /* nobody should have been able to get a kref... */
+ WARN_ON(kref_get_unless_zero(&pool->kref));
+
+ /* pool is now off zswap_pools list and has no references. */
+ zswap_pool_destroy(pool);
+}
+
+static void __zswap_pool_empty(struct kref *kref)
+{
+ struct zswap_pool *pool;
+
+ pool = container_of(kref, typeof(*pool), kref);
+
+ spin_lock(&zswap_pools_lock);
+
+ WARN_ON(pool == zswap_pool_current());
+
+ list_del_rcu(&pool->list);
+ call_rcu(&pool->rcu_head, __zswap_pool_release);
+
+ spin_unlock(&zswap_pools_lock);
+}
+
+static void zswap_pool_put(struct zswap_pool *pool)
+{
+ kref_put(&pool->kref, __zswap_pool_empty);
+}
+
}

/*********************************
@@ -477,6 +723,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
pgoff_t offset;
struct zswap_entry *entry;
struct page *page;
+ struct crypto_comp *tfm;
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -517,13 +764,15 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
case ZSWAP_SWAPCACHE_NEW: /* page is locked */
/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ src = (u8 *)zpool_map_handle(entry->pool->zpool, entry->handle,
ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
- ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src,
- entry->length, dst, &dlen);
+ tfm = *get_cpu_ptr(entry->pool->tfm);
+ ret = crypto_comp_decompress(tfm, src, entry->length,
+ dst, &dlen);
+ put_cpu_ptr(entry->pool->tfm);
kunmap_atomic(dst);
- zpool_unmap_handle(zswap_pool, entry->handle);
+ zpool_unmap_handle(entry->pool->zpool, entry->handle);
BUG_ON(ret);
BUG_ON(dlen != PAGE_SIZE);

@@ -572,6 +821,22 @@ end:
return ret;
}

+static int zswap_shrink(void)
+{
+ struct zswap_pool *pool;
+ int ret;
+
+ pool = zswap_pool_last_get();
+ if (!pool)
+ return -ENOENT;
+
+ ret = zpool_shrink(pool->zpool, 1, NULL);
+
+ zswap_pool_put(pool);
+
+ return ret;
+}
+
/*********************************
* frontswap hooks
**********************************/
@@ -581,6 +846,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry, *dupentry;
+ struct crypto_comp *tfm;
int ret;
unsigned int dlen = PAGE_SIZE, len;
unsigned long handle;
@@ -596,7 +862,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* reclaim space if needed */
if (zswap_is_full()) {
zswap_pool_limit_hit++;
- if (zpool_shrink(zswap_pool, 1, NULL)) {
+ if (zswap_shrink()) {
zswap_reject_reclaim_fail++;
ret = -ENOMEM;
goto reject;
@@ -611,33 +877,42 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
goto reject;
}

+ /* if entry is successfully added, it keeps the reference */
+ entry->pool = zswap_pool_current_get();
+ if (!entry->pool) {
+ ret = -EINVAL;
+ goto freepage;
+ }
+
/* compress */
dst = get_cpu_var(zswap_dstmem);
+ tfm = *get_cpu_ptr(entry->pool->tfm);
src = kmap_atomic(page);
- ret = zswap_comp_op(ZSWAP_COMPOP_COMPRESS, src, PAGE_SIZE, dst, &dlen);
+ ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
kunmap_atomic(src);
+ put_cpu_ptr(entry->pool->tfm);
if (ret) {
ret = -EINVAL;
- goto freepage;
+ goto put_dstmem;
}

/* store */
len = dlen + sizeof(struct zswap_header);
- ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
- &handle);
+ ret = zpool_malloc(entry->pool->zpool, len,
+ __GFP_NORETRY | __GFP_NOWARN, &handle);
if (ret == -ENOSPC) {
zswap_reject_compress_poor++;
- goto freepage;
+ goto put_dstmem;
}
if (ret) {
zswap_reject_alloc_fail++;
- goto freepage;
+ goto put_dstmem;
}
- zhdr = zpool_map_handle(zswap_pool, handle, ZPOOL_MM_RW);
+ zhdr = zpool_map_handle(entry->pool->zpool, handle, ZPOOL_MM_RW);
zhdr->swpentry = swp_entry(type, offset);
buf = (u8 *)(zhdr + 1);
memcpy(buf, dst, dlen);
- zpool_unmap_handle(zswap_pool, handle);
+ zpool_unmap_handle(entry->pool->zpool, handle);
put_cpu_var(zswap_dstmem);

/* populate entry */
@@ -660,12 +935,14 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* update stats */
atomic_inc(&zswap_stored_pages);
- zswap_pool_total_size = zpool_get_total_size(zswap_pool);
+ zswap_update_total_size();

return 0;

-freepage:
+put_dstmem:
put_cpu_var(zswap_dstmem);
+ zswap_pool_put(entry->pool);
+freepage:
zswap_entry_cache_free(entry);
reject:
return ret;
@@ -680,6 +957,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
{
struct zswap_tree *tree = zswap_trees[type];
struct zswap_entry *entry;
+ struct crypto_comp *tfm;
u8 *src, *dst;
unsigned int dlen;
int ret;
@@ -696,13 +974,14 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,

/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ src = (u8 *)zpool_map_handle(entry->pool->zpool, entry->handle,
ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
- ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src, entry->length,
- dst, &dlen);
+ tfm = *get_cpu_ptr(entry->pool->tfm);
+ ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
+ put_cpu_ptr(entry->pool->tfm);
kunmap_atomic(dst);
- zpool_unmap_handle(zswap_pool, entry->handle);
+ zpool_unmap_handle(entry->pool->zpool, entry->handle);
BUG_ON(ret);

spin_lock(&tree->lock);
@@ -755,10 +1034,6 @@ static void zswap_frontswap_invalidate_area(unsigned type)
zswap_trees[type] = NULL;
}

-static const struct zpool_ops zswap_zpool_ops = {
- .evict = zswap_writeback_entry
-};
-
static void zswap_frontswap_init(unsigned type)
{
struct zswap_tree *tree;
@@ -839,49 +1114,38 @@ static void __exit zswap_debugfs_exit(void) { }
**********************************/
static int __init init_zswap(void)
{
- gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN;
-
- pr_info("loading zswap\n");
-
- zswap_pool = zpool_create_pool(zswap_zpool_type, "zswap", gfp,
- &zswap_zpool_ops);
- if (!zswap_pool && strcmp(zswap_zpool_type, ZSWAP_ZPOOL_DEFAULT)) {
- pr_info("%s zpool not available\n", zswap_zpool_type);
- zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
- zswap_pool = zpool_create_pool(zswap_zpool_type, "zswap", gfp,
- &zswap_zpool_ops);
- }
- if (!zswap_pool) {
- pr_err("%s zpool not available\n", zswap_zpool_type);
- pr_err("zpool creation failed\n");
- goto error;
- }
- pr_info("using %s pool\n", zswap_zpool_type);
+ struct zswap_pool *pool;

if (zswap_entry_cache_create()) {
pr_err("entry cache creation failed\n");
- goto cachefail;
+ goto cache_fail;
}
- if (zswap_comp_init()) {
- pr_err("compressor initialization failed\n");
- goto compfail;
+
+ if (zswap_cpu_dstmem_init()) {
+ pr_err("dstmem alloc failed\n");
+ goto dstmem_fail;
}
- if (zswap_cpu_init()) {
- pr_err("per-cpu initialization failed\n");
- goto pcpufail;
+
+ pool = __zswap_pool_create_fallback();
+ if (!pool) {
+ pr_err("pool creation failed\n");
+ goto pool_fail;
}
+ pr_info("loaded using pool %s/%s\n", pool->tfm_name,
+ zpool_get_type(pool->zpool));
+
+ list_add(&pool->list, &zswap_pools);

frontswap_register_ops(&zswap_frontswap_ops);
if (zswap_debugfs_init())
pr_warn("debugfs initialization failed\n");
return 0;
-pcpufail:
- zswap_comp_exit();
-compfail:
+
+pool_fail:
+ zswap_cpu_dstmem_destroy();
+dstmem_fail:
zswap_entry_cache_destroy();
-cachefail:
- zpool_destroy_pool(zswap_pool);
-error:
+cache_fail:
return -ENOMEM;
}
/* must be late so crypto has time to come up */
--
2.1.0

2015-08-05 13:49:55

by Dan Streetman

[permalink] [raw]
Subject: [PATCH 3/3] zswap: change zpool/compressor at runtime

Update the zpool and compressor parameters to be changeable at runtime.
When changed, a new pool is created with the requested zpool/compressor,
and added as the current pool at the front of the pool list. Previous
pools remain in the list only to remove existing compressed pages from.
The old pool(s) are removed once they become empty.

Signed-off-by: Dan Streetman <[email protected]>
---
mm/zswap.c | 135 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 122 insertions(+), 13 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f8fcd7e..3eaff21 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -80,23 +80,39 @@ static u64 zswap_duplicate_entry;
static bool zswap_enabled;
module_param_named(enabled, zswap_enabled, bool, 0644);

-/* Compressor to be used by zswap (fixed at boot for now) */
+/* Crypto compressor to use */
#define ZSWAP_COMPRESSOR_DEFAULT "lzo"
-static char *zswap_compressor = ZSWAP_COMPRESSOR_DEFAULT;
-module_param_named(compressor, zswap_compressor, charp, 0444);
-
-/* The maximum percentage of memory that the compressed pool can occupy */
-static unsigned int zswap_max_pool_percent = 20;
-module_param_named(max_pool_percent,
- zswap_max_pool_percent, uint, 0644);
+static char zswap_compressor[CRYPTO_MAX_ALG_NAME] = ZSWAP_COMPRESSOR_DEFAULT;
+static struct kparam_string zswap_compressor_kparam = {
+ .string = zswap_compressor,
+ .maxlen = sizeof(zswap_compressor),
+};
+static int zswap_compressor_param_set(const char *,
+ const struct kernel_param *);
+static struct kernel_param_ops zswap_compressor_param_ops = {
+ .set = zswap_compressor_param_set,
+ .get = param_get_string,
+};
+module_param_cb(compressor, &zswap_compressor_param_ops,
+ &zswap_compressor_kparam, 0644);

-/* Compressed storage to use */
+/* Compressed storage zpool to use */
#define ZSWAP_ZPOOL_DEFAULT "zbud"
-static char *zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
-module_param_named(zpool, zswap_zpool_type, charp, 0444);
+static char zswap_zpool_type[32 /* arbitrary */] = ZSWAP_ZPOOL_DEFAULT;
+static struct kparam_string zswap_zpool_kparam = {
+ .string = zswap_zpool_type,
+ .maxlen = sizeof(zswap_zpool_type),
+};
+static int zswap_zpool_param_set(const char *, const struct kernel_param *);
+static struct kernel_param_ops zswap_zpool_param_ops = {
+ .set = zswap_zpool_param_set,
+ .get = param_get_string,
+};
+module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_kparam, 0644);

-/* zpool is shared by all of zswap backend */
-static struct zpool *zswap_pool;
+/* The maximum percentage of memory that the compressed pool can occupy */
+static unsigned int zswap_max_pool_percent = 20;
+module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);

/*********************************
* data structures
@@ -161,6 +177,9 @@ static LIST_HEAD(zswap_pools);
/* protects zswap_pools list modification */
static DEFINE_SPINLOCK(zswap_pools_lock);

+/* used by param callback function */
+static bool zswap_init_started;
+
/*********************************
* helpers and fwd declarations
**********************************/
@@ -661,6 +680,94 @@ static void zswap_pool_put(struct zswap_pool *pool)
kref_put(&pool->kref, __zswap_pool_empty);
}

+/*********************************
+* param callbacks
+**********************************/
+
+static int __zswap_param_set(const char *val, const struct kernel_param *kp,
+ char *type, char *compressor)
+{
+ struct zswap_pool *pool, *put_pool = NULL;
+ char str[kp->str->maxlen], *s;
+ int ret;
+
+ strlcpy(str, val, kp->str->maxlen);
+ s = strim(str);
+
+ /* if this is load-time (pre-init) param setting,
+ * don't create a pool; that's done during init.
+ */
+ if (!zswap_init_started)
+ return param_set_copystring(s, kp);
+
+ /* no change required */
+ if (!strncmp(kp->str->string, s, kp->str->maxlen))
+ return 0;
+
+ if (!type) {
+ type = s;
+ if (!zpool_has_pool(type)) {
+ pr_err("zpool %s not available\n", type);
+ return -ENOENT;
+ }
+ } else if (!compressor) {
+ compressor = s;
+ if (!crypto_has_comp(compressor, 0, 0)) {
+ pr_err("compressor %s not available\n", compressor);
+ return -ENOENT;
+ }
+ }
+
+ spin_lock(&zswap_pools_lock);
+
+ pool = zswap_pool_find_get(type, compressor);
+ if (pool) {
+ zswap_pool_debug("using existing", pool);
+ list_del_rcu(&pool->list);
+ } else {
+ spin_unlock(&zswap_pools_lock);
+ pool = zswap_pool_create(type, compressor);
+ spin_lock(&zswap_pools_lock);
+ }
+
+ if (pool)
+ ret = param_set_copystring(s, kp);
+ else
+ ret = -EINVAL;
+
+ if (!ret) {
+ put_pool = zswap_pool_current();
+ list_add_rcu(&pool->list, &zswap_pools);
+ } else if (pool) {
+ /* add the possibly pre-existing pool to the end of the pools
+ * list; if it's new (and empty) then it'll be removed and
+ * destroyed by the put after we drop the lock
+ */
+ list_add_tail_rcu(&pool->list, &zswap_pools);
+ put_pool = pool;
+ }
+
+ spin_unlock(&zswap_pools_lock);
+
+ /* drop the ref from either the old current pool,
+ * or the new pool we failed to add
+ */
+ if (put_pool)
+ zswap_pool_put(put_pool);
+
+ return ret;
+}
+
+static int zswap_compressor_param_set(const char *val,
+ const struct kernel_param *kp)
+{
+ return __zswap_param_set(val, kp, zswap_zpool_type, NULL);
+}
+
+static int zswap_zpool_param_set(const char *val,
+ const struct kernel_param *kp)
+{
+ return __zswap_param_set(val, kp, NULL, zswap_compressor);
}

/*********************************
@@ -1116,6 +1223,8 @@ static int __init init_zswap(void)
{
struct zswap_pool *pool;

+ zswap_init_started = true;
+
if (zswap_entry_cache_create()) {
pr_err("entry cache creation failed\n");
goto cache_fail;
--
2.1.0

2015-08-05 20:08:39

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

On Wed, 5 Aug 2015 09:46:41 -0400 Dan Streetman <[email protected]> wrote:

> Add zpool_has_pool() function, indicating if the specified type of zpool
> is available (i.e. zsmalloc or zbud). This allows checking if a pool is
> available, without actually trying to allocate it, similar to
> crypto_has_alg().
>
> This is used by a following patch to zswap that enables the dynamic
> runtime creation of zswap zpools.
>
> ...
>
> /**
> + * zpool_has_pool() - Check if the pool driver is available
> + * @type The type of the zpool to check (e.g. zbud, zsmalloc)
> + *
> + * This checks if the @type pool driver is available.
> + *
> + * Returns: true if @type pool is available, false if not
> + */
> +bool zpool_has_pool(char *type)
> +{
> + struct zpool_driver *driver = zpool_get_driver(type);
> +
> + if (!driver) {
> + request_module("zpool-%s", type);
> + driver = zpool_get_driver(type);
> + }
> +
> + if (!driver)
> + return false;
> +
> + zpool_put_driver(driver);
> + return true;
> +}

This looks racy: after that zpool_put_driver() has completed, an rmmod
will invalidate zpool_has_pool()'s return value.

If there's some reason why this can't happen, can we please have a code
comment which reveals that reason?

2015-08-05 20:14:10

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

On Wed, 5 Aug 2015 09:46:43 -0400 Dan Streetman <[email protected]> wrote:

> Update the zpool and compressor parameters to be changeable at runtime.
> When changed, a new pool is created with the requested zpool/compressor,
> and added as the current pool at the front of the pool list. Previous
> pools remain in the list only to remove existing compressed pages from.
> The old pool(s) are removed once they become empty.
>
> +/*********************************
> +* param callbacks
> +**********************************/
> +
> +static int __zswap_param_set(const char *val, const struct kernel_param *kp,
> + char *type, char *compressor)
> +{
> + struct zswap_pool *pool, *put_pool = NULL;
> + char str[kp->str->maxlen], *s;

What's the upper bound on the size of this variable-sized array?

> + int ret;
> +
> + strlcpy(str, val, kp->str->maxlen);
> + s = strim(str);
> +
> + /* if this is load-time (pre-init) param setting,

2015-08-05 22:00:48

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

On Wed, Aug 5, 2015 at 4:08 PM, Andrew Morton <[email protected]> wrote:
> On Wed, 5 Aug 2015 09:46:41 -0400 Dan Streetman <[email protected]> wrote:
>
>> Add zpool_has_pool() function, indicating if the specified type of zpool
>> is available (i.e. zsmalloc or zbud). This allows checking if a pool is
>> available, without actually trying to allocate it, similar to
>> crypto_has_alg().
>>
>> This is used by a following patch to zswap that enables the dynamic
>> runtime creation of zswap zpools.
>>
>> ...
>>
>> /**
>> + * zpool_has_pool() - Check if the pool driver is available
>> + * @type The type of the zpool to check (e.g. zbud, zsmalloc)
>> + *
>> + * This checks if the @type pool driver is available.
>> + *
>> + * Returns: true if @type pool is available, false if not
>> + */
>> +bool zpool_has_pool(char *type)
>> +{
>> + struct zpool_driver *driver = zpool_get_driver(type);
>> +
>> + if (!driver) {
>> + request_module("zpool-%s", type);
>> + driver = zpool_get_driver(type);
>> + }
>> +
>> + if (!driver)
>> + return false;
>> +
>> + zpool_put_driver(driver);
>> + return true;
>> +}
>
> This looks racy: after that zpool_put_driver() has completed, an rmmod
> will invalidate zpool_has_pool()'s return value.

the true/false return value is only a snapshot of that moment in time;
zswap's use of this is only to validate that the user-provided zpool
name is valid; if this fails, zswap will just return failure to the
user (or if this happens at init-time, falls back to LZO). If this
succeeds, zswap still must use zpool_create_pool() which will fail if
the requested module can't be loaded.

essentially zswap does:

if (!zpool_has_pool(zpool_type) || !crypto_has_comp(compressor_type))
return -EINVAL;

that allows it to check that the requested zpool and compressor types
are valid, before actually creating anything. The creation of the
zpool and compressor do have error handling if either of them fail.

>
> If there's some reason why this can't happen, can we please have a code
> comment which reveals that reason?

zpool_create_pool() should work if this returns true, unless as you
say the module is rmmod'ed *and* removed from the system - since
zpool_create_pool() will call request_module() just as this function
does. I can add a comment explaining that.

2015-08-05 22:07:04

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

On Wed, 5 Aug 2015 18:00:26 -0400 Dan Streetman <[email protected]> wrote:

> >
> > If there's some reason why this can't happen, can we please have a code
> > comment which reveals that reason?
>
> zpool_create_pool() should work if this returns true, unless as you
> say the module is rmmod'ed *and* removed from the system - since
> zpool_create_pool() will call request_module() just as this function
> does. I can add a comment explaining that.

I like comments ;)

Seth, I'm planning on sitting on these patches until you've had a
chance to review them.

2015-08-06 00:08:11

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

Hi,

On (08/05/15 09:46), Dan Streetman wrote:
> Update the zpool and compressor parameters to be changeable at runtime.
> When changed, a new pool is created with the requested zpool/compressor,
> and added as the current pool at the front of the pool list. Previous
> pools remain in the list only to remove existing compressed pages from.
> The old pool(s) are removed once they become empty.
>

Sorry, just curious, is this functionality/complication really
necessary? How often do you expect people to do that? The way I
see it -- a static configuration works just fine: boot, test,
re-configure, boot test; compare the results and done.

-ss

> Signed-off-by: Dan Streetman <[email protected]>
> ---
> mm/zswap.c | 135 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 122 insertions(+), 13 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index f8fcd7e..3eaff21 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -80,23 +80,39 @@ static u64 zswap_duplicate_entry;
> static bool zswap_enabled;
> module_param_named(enabled, zswap_enabled, bool, 0644);
>
> -/* Compressor to be used by zswap (fixed at boot for now) */
> +/* Crypto compressor to use */
> #define ZSWAP_COMPRESSOR_DEFAULT "lzo"
> -static char *zswap_compressor = ZSWAP_COMPRESSOR_DEFAULT;
> -module_param_named(compressor, zswap_compressor, charp, 0444);
> -
> -/* The maximum percentage of memory that the compressed pool can occupy */
> -static unsigned int zswap_max_pool_percent = 20;
> -module_param_named(max_pool_percent,
> - zswap_max_pool_percent, uint, 0644);
> +static char zswap_compressor[CRYPTO_MAX_ALG_NAME] = ZSWAP_COMPRESSOR_DEFAULT;
> +static struct kparam_string zswap_compressor_kparam = {
> + .string = zswap_compressor,
> + .maxlen = sizeof(zswap_compressor),
> +};
> +static int zswap_compressor_param_set(const char *,
> + const struct kernel_param *);
> +static struct kernel_param_ops zswap_compressor_param_ops = {
> + .set = zswap_compressor_param_set,
> + .get = param_get_string,
> +};
> +module_param_cb(compressor, &zswap_compressor_param_ops,
> + &zswap_compressor_kparam, 0644);
>
> -/* Compressed storage to use */
> +/* Compressed storage zpool to use */
> #define ZSWAP_ZPOOL_DEFAULT "zbud"
> -static char *zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
> -module_param_named(zpool, zswap_zpool_type, charp, 0444);
> +static char zswap_zpool_type[32 /* arbitrary */] = ZSWAP_ZPOOL_DEFAULT;
> +static struct kparam_string zswap_zpool_kparam = {
> + .string = zswap_zpool_type,
> + .maxlen = sizeof(zswap_zpool_type),
> +};
> +static int zswap_zpool_param_set(const char *, const struct kernel_param *);
> +static struct kernel_param_ops zswap_zpool_param_ops = {
> + .set = zswap_zpool_param_set,
> + .get = param_get_string,
> +};
> +module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_kparam, 0644);
>
> -/* zpool is shared by all of zswap backend */
> -static struct zpool *zswap_pool;
> +/* The maximum percentage of memory that the compressed pool can occupy */
> +static unsigned int zswap_max_pool_percent = 20;
> +module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
>
> /*********************************
> * data structures
> @@ -161,6 +177,9 @@ static LIST_HEAD(zswap_pools);
> /* protects zswap_pools list modification */
> static DEFINE_SPINLOCK(zswap_pools_lock);
>
> +/* used by param callback function */
> +static bool zswap_init_started;
> +
> /*********************************
> * helpers and fwd declarations
> **********************************/
> @@ -661,6 +680,94 @@ static void zswap_pool_put(struct zswap_pool *pool)
> kref_put(&pool->kref, __zswap_pool_empty);
> }
>
> +/*********************************
> +* param callbacks
> +**********************************/
> +
> +static int __zswap_param_set(const char *val, const struct kernel_param *kp,
> + char *type, char *compressor)
> +{
> + struct zswap_pool *pool, *put_pool = NULL;
> + char str[kp->str->maxlen], *s;
> + int ret;
> +
> + strlcpy(str, val, kp->str->maxlen);
> + s = strim(str);
> +
> + /* if this is load-time (pre-init) param setting,
> + * don't create a pool; that's done during init.
> + */
> + if (!zswap_init_started)
> + return param_set_copystring(s, kp);
> +
> + /* no change required */
> + if (!strncmp(kp->str->string, s, kp->str->maxlen))
> + return 0;
> +
> + if (!type) {
> + type = s;
> + if (!zpool_has_pool(type)) {
> + pr_err("zpool %s not available\n", type);
> + return -ENOENT;
> + }
> + } else if (!compressor) {
> + compressor = s;
> + if (!crypto_has_comp(compressor, 0, 0)) {
> + pr_err("compressor %s not available\n", compressor);
> + return -ENOENT;
> + }
> + }
> +
> + spin_lock(&zswap_pools_lock);
> +
> + pool = zswap_pool_find_get(type, compressor);
> + if (pool) {
> + zswap_pool_debug("using existing", pool);
> + list_del_rcu(&pool->list);
> + } else {
> + spin_unlock(&zswap_pools_lock);
> + pool = zswap_pool_create(type, compressor);
> + spin_lock(&zswap_pools_lock);
> + }
> +
> + if (pool)
> + ret = param_set_copystring(s, kp);
> + else
> + ret = -EINVAL;
> +
> + if (!ret) {
> + put_pool = zswap_pool_current();
> + list_add_rcu(&pool->list, &zswap_pools);
> + } else if (pool) {
> + /* add the possibly pre-existing pool to the end of the pools
> + * list; if it's new (and empty) then it'll be removed and
> + * destroyed by the put after we drop the lock
> + */
> + list_add_tail_rcu(&pool->list, &zswap_pools);
> + put_pool = pool;
> + }
> +
> + spin_unlock(&zswap_pools_lock);
> +
> + /* drop the ref from either the old current pool,
> + * or the new pool we failed to add
> + */
> + if (put_pool)
> + zswap_pool_put(put_pool);
> +
> + return ret;
> +}
> +
> +static int zswap_compressor_param_set(const char *val,
> + const struct kernel_param *kp)
> +{
> + return __zswap_param_set(val, kp, zswap_zpool_type, NULL);
> +}
> +
> +static int zswap_zpool_param_set(const char *val,
> + const struct kernel_param *kp)
> +{
> + return __zswap_param_set(val, kp, NULL, zswap_compressor);
> }
>
> /*********************************
> @@ -1116,6 +1223,8 @@ static int __init init_zswap(void)
> {
> struct zswap_pool *pool;
>
> + zswap_init_started = true;
> +
> if (zswap_entry_cache_create()) {
> pr_err("entry cache creation failed\n");
> goto cache_fail;
> --
> 2.1.0
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>

2015-08-06 10:07:11

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

On Wed, Aug 5, 2015 at 4:14 PM, Andrew Morton <[email protected]> wrote:
> On Wed, 5 Aug 2015 09:46:43 -0400 Dan Streetman <[email protected]> wrote:
>
>> Update the zpool and compressor parameters to be changeable at runtime.
>> When changed, a new pool is created with the requested zpool/compressor,
>> and added as the current pool at the front of the pool list. Previous
>> pools remain in the list only to remove existing compressed pages from.
>> The old pool(s) are removed once they become empty.
>>
>> +/*********************************
>> +* param callbacks
>> +**********************************/
>> +
>> +static int __zswap_param_set(const char *val, const struct kernel_param *kp,
>> + char *type, char *compressor)
>> +{
>> + struct zswap_pool *pool, *put_pool = NULL;
>> + char str[kp->str->maxlen], *s;
>
> What's the upper bound on the size of this variable-sized array?

the kernel_param in this function will always be either
zswap_compressor_kparam or zswap_zpool_kparam, which are defined at
the top, and their maxlen fields are set to sizeof(their string),
which is either CRYPTO_MAX_ALG_NAME (currently 64) or 32 (arbitrary
max for zpool name).

I can also add a comment here to clarify that.

>
>> + int ret;
>> +
>> + strlcpy(str, val, kp->str->maxlen);
>> + s = strim(str);
>> +
>> + /* if this is load-time (pre-init) param setting,
>

2015-08-06 10:21:15

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

On Wed, Aug 5, 2015 at 8:08 PM, Sergey Senozhatsky
<[email protected]> wrote:
> Hi,
>
> On (08/05/15 09:46), Dan Streetman wrote:
>> Update the zpool and compressor parameters to be changeable at runtime.
>> When changed, a new pool is created with the requested zpool/compressor,
>> and added as the current pool at the front of the pool list. Previous
>> pools remain in the list only to remove existing compressed pages from.
>> The old pool(s) are removed once they become empty.
>>
>
> Sorry, just curious, is this functionality/complication really
> necessary?

Well you could ask the same question about many other module params;
can't people just configure everything using boot params? ;-)

> How often do you expect people to do that? The way I
> see it -- a static configuration works just fine: boot, test,
> re-configure, boot test; compare the results and done.

Sure a static configuration will work (it has since Seth wrote zswap),
but that doesn't guarantee everyone will want to do it that way.
Certainly for testing/development/benchmarking avoiding a reboot is
helpful. And for long-running and/or critical systems that need to
change their zpool or compressor, for whatever reason, forcing a
reboot isn't desirable.

Why would someone want to change their compressor or zpool? A simple
exampe comes to mind - maybe they have 1000's of systems and a bug was
found in the current level of compressor or zpool - they would then
have to either reboot all the systems to change to a different
zpool/compressor, or leave it using the known-buggy one.

In addition, a static boot-time configuration requires adding params
to the bootloader configuration, *and* rebuilding the initramfs to
include both the required zpool and compressor. So even for static
configurations, it's simpler to be able to set the zpool and
compressor immediately after boot, instead of at boot time.

>
> -ss
>
>> Signed-off-by: Dan Streetman <[email protected]>
>> ---
>> mm/zswap.c | 135 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------
>> 1 file changed, 122 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index f8fcd7e..3eaff21 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -80,23 +80,39 @@ static u64 zswap_duplicate_entry;
>> static bool zswap_enabled;
>> module_param_named(enabled, zswap_enabled, bool, 0644);
>>
>> -/* Compressor to be used by zswap (fixed at boot for now) */
>> +/* Crypto compressor to use */
>> #define ZSWAP_COMPRESSOR_DEFAULT "lzo"
>> -static char *zswap_compressor = ZSWAP_COMPRESSOR_DEFAULT;
>> -module_param_named(compressor, zswap_compressor, charp, 0444);
>> -
>> -/* The maximum percentage of memory that the compressed pool can occupy */
>> -static unsigned int zswap_max_pool_percent = 20;
>> -module_param_named(max_pool_percent,
>> - zswap_max_pool_percent, uint, 0644);
>> +static char zswap_compressor[CRYPTO_MAX_ALG_NAME] = ZSWAP_COMPRESSOR_DEFAULT;
>> +static struct kparam_string zswap_compressor_kparam = {
>> + .string = zswap_compressor,
>> + .maxlen = sizeof(zswap_compressor),
>> +};
>> +static int zswap_compressor_param_set(const char *,
>> + const struct kernel_param *);
>> +static struct kernel_param_ops zswap_compressor_param_ops = {
>> + .set = zswap_compressor_param_set,
>> + .get = param_get_string,
>> +};
>> +module_param_cb(compressor, &zswap_compressor_param_ops,
>> + &zswap_compressor_kparam, 0644);
>>
>> -/* Compressed storage to use */
>> +/* Compressed storage zpool to use */
>> #define ZSWAP_ZPOOL_DEFAULT "zbud"
>> -static char *zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
>> -module_param_named(zpool, zswap_zpool_type, charp, 0444);
>> +static char zswap_zpool_type[32 /* arbitrary */] = ZSWAP_ZPOOL_DEFAULT;
>> +static struct kparam_string zswap_zpool_kparam = {
>> + .string = zswap_zpool_type,
>> + .maxlen = sizeof(zswap_zpool_type),
>> +};
>> +static int zswap_zpool_param_set(const char *, const struct kernel_param *);
>> +static struct kernel_param_ops zswap_zpool_param_ops = {
>> + .set = zswap_zpool_param_set,
>> + .get = param_get_string,
>> +};
>> +module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_kparam, 0644);
>>
>> -/* zpool is shared by all of zswap backend */
>> -static struct zpool *zswap_pool;
>> +/* The maximum percentage of memory that the compressed pool can occupy */
>> +static unsigned int zswap_max_pool_percent = 20;
>> +module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
>>
>> /*********************************
>> * data structures
>> @@ -161,6 +177,9 @@ static LIST_HEAD(zswap_pools);
>> /* protects zswap_pools list modification */
>> static DEFINE_SPINLOCK(zswap_pools_lock);
>>
>> +/* used by param callback function */
>> +static bool zswap_init_started;
>> +
>> /*********************************
>> * helpers and fwd declarations
>> **********************************/
>> @@ -661,6 +680,94 @@ static void zswap_pool_put(struct zswap_pool *pool)
>> kref_put(&pool->kref, __zswap_pool_empty);
>> }
>>
>> +/*********************************
>> +* param callbacks
>> +**********************************/
>> +
>> +static int __zswap_param_set(const char *val, const struct kernel_param *kp,
>> + char *type, char *compressor)
>> +{
>> + struct zswap_pool *pool, *put_pool = NULL;
>> + char str[kp->str->maxlen], *s;
>> + int ret;
>> +
>> + strlcpy(str, val, kp->str->maxlen);
>> + s = strim(str);
>> +
>> + /* if this is load-time (pre-init) param setting,
>> + * don't create a pool; that's done during init.
>> + */
>> + if (!zswap_init_started)
>> + return param_set_copystring(s, kp);
>> +
>> + /* no change required */
>> + if (!strncmp(kp->str->string, s, kp->str->maxlen))
>> + return 0;
>> +
>> + if (!type) {
>> + type = s;
>> + if (!zpool_has_pool(type)) {
>> + pr_err("zpool %s not available\n", type);
>> + return -ENOENT;
>> + }
>> + } else if (!compressor) {
>> + compressor = s;
>> + if (!crypto_has_comp(compressor, 0, 0)) {
>> + pr_err("compressor %s not available\n", compressor);
>> + return -ENOENT;
>> + }
>> + }
>> +
>> + spin_lock(&zswap_pools_lock);
>> +
>> + pool = zswap_pool_find_get(type, compressor);
>> + if (pool) {
>> + zswap_pool_debug("using existing", pool);
>> + list_del_rcu(&pool->list);
>> + } else {
>> + spin_unlock(&zswap_pools_lock);
>> + pool = zswap_pool_create(type, compressor);
>> + spin_lock(&zswap_pools_lock);
>> + }
>> +
>> + if (pool)
>> + ret = param_set_copystring(s, kp);
>> + else
>> + ret = -EINVAL;
>> +
>> + if (!ret) {
>> + put_pool = zswap_pool_current();
>> + list_add_rcu(&pool->list, &zswap_pools);
>> + } else if (pool) {
>> + /* add the possibly pre-existing pool to the end of the pools
>> + * list; if it's new (and empty) then it'll be removed and
>> + * destroyed by the put after we drop the lock
>> + */
>> + list_add_tail_rcu(&pool->list, &zswap_pools);
>> + put_pool = pool;
>> + }
>> +
>> + spin_unlock(&zswap_pools_lock);
>> +
>> + /* drop the ref from either the old current pool,
>> + * or the new pool we failed to add
>> + */
>> + if (put_pool)
>> + zswap_pool_put(put_pool);
>> +
>> + return ret;
>> +}
>> +
>> +static int zswap_compressor_param_set(const char *val,
>> + const struct kernel_param *kp)
>> +{
>> + return __zswap_param_set(val, kp, zswap_zpool_type, NULL);
>> +}
>> +
>> +static int zswap_zpool_param_set(const char *val,
>> + const struct kernel_param *kp)
>> +{
>> + return __zswap_param_set(val, kp, NULL, zswap_compressor);
>> }
>>
>> /*********************************
>> @@ -1116,6 +1223,8 @@ static int __init init_zswap(void)
>> {
>> struct zswap_pool *pool;
>>
>> + zswap_init_started = true;
>> +
>> if (zswap_entry_cache_create()) {
>> pr_err("entry cache creation failed\n");
>> goto cache_fail;
>> --
>> 2.1.0
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to [email protected]. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>>

2015-08-06 10:59:12

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

On (08/06/15 06:20), Dan Streetman wrote:
> > On (08/05/15 09:46), Dan Streetman wrote:
> >> Update the zpool and compressor parameters to be changeable at runtime.
> >> When changed, a new pool is created with the requested zpool/compressor,
> >> and added as the current pool at the front of the pool list. Previous
> >> pools remain in the list only to remove existing compressed pages from.
> >> The old pool(s) are removed once they become empty.
> >>
> >
> > Sorry, just curious, is this functionality/complication really
> > necessary?
>
> Well you could ask the same question about many other module params;
> can't people just configure everything using boot params? ;-)
>
> > How often do you expect people to do that? The way I
> > see it -- a static configuration works just fine: boot, test,
> > re-configure, boot test; compare the results and done.
>
> Sure a static configuration will work (it has since Seth wrote zswap),
> but that doesn't guarantee everyone will want to do it that way.
> Certainly for testing/development/benchmarking avoiding a reboot is
> helpful. And for long-running and/or critical systems that need to
> change their zpool or compressor, for whatever reason, forcing a
> reboot isn't desirable.

Sorry, I didn't have time to read the patches carefully/attentively
(will do); so my email may be a complete nonsense.

> Why would someone want to change their compressor or zpool? A simple
> exampe comes to mind - maybe they have 1000's of systems and a bug was
> found in the current level of compressor or zpool - they would then
> have to either reboot all the systems to change to a different
> zpool/compressor, or leave it using the known-buggy one.

Well, if that buggy compressor is being used by other modules
then rebooting is sort of inevitable. But you still preserve pages
compressed with the old compressor and let user access them, right?
Thus read operation possibly will hit the bug regardless of current
'front' pool.

> In addition, a static boot-time configuration requires adding params
> to the bootloader configuration, *and* rebuilding the initramfs to
> include both the required zpool and compressor. So even for static
> configurations, it's simpler to be able to set the zpool and
> compressor immediately after boot, instead of at boot time.

I mean, it just feels that this is a way too big change for no particular
use case (no offense). It doesn't take much time to figure out (a simple
google request does the trick here) which one of the available compressors
gives best ratio in general or which one has better read/write
(compress/decompress) speeds.

A buggy compressor is a good use case, I agree (with the exception that
reboot is still very much possible). But if someone changes compressing
backend because he or she estimates a better compression ratio or
performance then there will be no immediate benefit -- pages compressed
with the old compressor are still there and it will take some unpredictable
amount of time to drain old pools and to remove them.

-ss

2015-08-06 11:07:38

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 3/3] zswap: change zpool/compressor at runtime

On Thu, Aug 6, 2015 at 6:59 AM, Sergey Senozhatsky
<[email protected]> wrote:
> On (08/06/15 06:20), Dan Streetman wrote:
>> > On (08/05/15 09:46), Dan Streetman wrote:
>> >> Update the zpool and compressor parameters to be changeable at runtime.
>> >> When changed, a new pool is created with the requested zpool/compressor,
>> >> and added as the current pool at the front of the pool list. Previous
>> >> pools remain in the list only to remove existing compressed pages from.
>> >> The old pool(s) are removed once they become empty.
>> >>
>> >
>> > Sorry, just curious, is this functionality/complication really
>> > necessary?
>>
>> Well you could ask the same question about many other module params;
>> can't people just configure everything using boot params? ;-)
>>
>> > How often do you expect people to do that? The way I
>> > see it -- a static configuration works just fine: boot, test,
>> > re-configure, boot test; compare the results and done.
>>
>> Sure a static configuration will work (it has since Seth wrote zswap),
>> but that doesn't guarantee everyone will want to do it that way.
>> Certainly for testing/development/benchmarking avoiding a reboot is
>> helpful. And for long-running and/or critical systems that need to
>> change their zpool or compressor, for whatever reason, forcing a
>> reboot isn't desirable.
>
> Sorry, I didn't have time to read the patches carefully/attentively
> (will do); so my email may be a complete nonsense.
>
>> Why would someone want to change their compressor or zpool? A simple
>> exampe comes to mind - maybe they have 1000's of systems and a bug was
>> found in the current level of compressor or zpool - they would then
>> have to either reboot all the systems to change to a different
>> zpool/compressor, or leave it using the known-buggy one.
>
> Well, if that buggy compressor is being used by other modules
> then rebooting is sort of inevitable. But you still preserve pages
> compressed with the old compressor and let user access them, right?
> Thus read operation possibly will hit the bug regardless of current
> 'front' pool.

Yes, currently-compressed pages will be uncompressed using the same
compressor. It's only freed once all the pages using it have been
removed.

>
>> In addition, a static boot-time configuration requires adding params
>> to the bootloader configuration, *and* rebuilding the initramfs to
>> include both the required zpool and compressor. So even for static
>> configurations, it's simpler to be able to set the zpool and
>> compressor immediately after boot, instead of at boot time.
>
> I mean, it just feels that this is a way too big change for no particular
> use case (no offense). It doesn't take much time to figure out (a simple
> google request does the trick here) which one of the available compressors
> gives best ratio in general or which one has better read/write
> (compress/decompress) speeds.

There are hardware compressors now, you know (see PowerPC 842 hw
compressor). While a sw compressor won't fail during use (excepting a
buggy driver), a hw compressor might fail for
who-knows-what-hardware-issue. I suspect there will be more hw
compressors in the future.

>
> A buggy compressor is a good use case, I agree (with the exception that
> reboot is still very much possible). But if someone changes compressing
> backend because he or she estimates a better compression ratio or
> performance then there will be no immediate benefit -- pages compressed
> with the old compressor are still there and it will take some unpredictable
> amount of time to drain old pools and to remove them.

To me, avoiding the need to set boot parameters through the bootloader
AND the need to rebuild the initramfs is use case enough to justify
this. I can't think of any other driver configuration that *requires*
updating the bootloader config and rebuilding the initramfs.

>
> -ss

2015-08-06 17:54:43

by Dan Streetman

[permalink] [raw]
Subject: [PATCH] zpool: clarification comment for zpool_has_pool

Add clarification in the documentation comment for zpool_has_pool() to
explain the caller should assume the requested driver is or is not
available, depending on return value. If true is returned, the caller
should assume zpool_create_pool() will succeed, but still must be
prepared to handle failure.

Signed-off-by: Dan Streetman <[email protected]>
---
mm/zpool.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/zpool.c b/mm/zpool.c
index aafcf8f..d8cf7cd 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -103,7 +103,15 @@ static void zpool_put_driver(struct zpool_driver *driver)
* zpool_has_pool() - Check if the pool driver is available
* @type The type of the zpool to check (e.g. zbud, zsmalloc)
*
- * This checks if the @type pool driver is available.
+ * This checks if the @type pool driver is available. This will try to load
+ * the requested module, if needed, but there is no guarantee the module will
+ * still be loaded and available immediately after calling. If this returns
+ * true, the caller should assume the pool is available, but must be prepared
+ * to handle the @zpool_create_pool() returning failure. However if this
+ * returns false, the caller should assume the requested pool type is not
+ * available; either the requested pool type module does not exist, or could
+ * not be loaded, and calling @zpool_create_pool() with the pool type will
+ * fail.
*
* Returns: true if @type pool is available, false if not
*/
--
2.1.0

2015-08-06 17:55:00

by Dan Streetman

[permalink] [raw]
Subject: [PATCH] zswap: comment clarifying maxlen

Add a comment clarifying the variable-size array created on the stack will
always be either CRYPTO_MAX_ALG_NAME (64) or 32 bytes long.

Signed-off-by: Dan Streetman <[email protected]>
---
mm/zswap.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/mm/zswap.c b/mm/zswap.c
index 7bbecd9..b198081 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -691,6 +691,11 @@ static int __zswap_param_set(const char *val, const struct kernel_param *kp,
char str[kp->str->maxlen], *s;
int ret;

+ /*
+ * kp is either zswap_zpool_kparam or zswap_compressor_kparam, defined
+ * at the top of this file, so maxlen is CRYPTO_MAX_ALG_NAME (64) or
+ * 32 (arbitrary).
+ */
strlcpy(str, val, kp->str->maxlen);
s = strim(str);

--
2.1.0

2015-08-06 21:56:11

by Seth Jennings

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

On Wed, Aug 05, 2015 at 03:06:59PM -0700, Andrew Morton wrote:
> On Wed, 5 Aug 2015 18:00:26 -0400 Dan Streetman <[email protected]> wrote:
>
> > >
> > > If there's some reason why this can't happen, can we please have a code
> > > comment which reveals that reason?
> >
> > zpool_create_pool() should work if this returns true, unless as you
> > say the module is rmmod'ed *and* removed from the system - since
> > zpool_create_pool() will call request_module() just as this function
> > does. I can add a comment explaining that.
>
> I like comments ;)
>
> Seth, I'm planning on sitting on these patches until you've had a
> chance to review them.

Thanks Andrew. I'm reviewing now. Patch 2/3 is pretty huge. I've got
the gist of the changes now. I'm also building and testing for myself
as this creates a lot more surface area for issues, alternating between
compressors and allocating new compression transforms on the fly.

I'm kinda with Sergey on this in that it adds yet another complexity to
an already complex feature. This adds more locking, more RCU, more
refcounting. It's becoming harder to review, test, and verify.

I should have results tomorrow.

Thanks,
Seth

2015-08-07 03:30:48

by Seth Jennings

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

On Thu, Aug 06, 2015 at 04:50:23PM -0500, Seth Jennings wrote:
> On Wed, Aug 05, 2015 at 03:06:59PM -0700, Andrew Morton wrote:
> > On Wed, 5 Aug 2015 18:00:26 -0400 Dan Streetman <[email protected]> wrote:
> >
> > > >
> > > > If there's some reason why this can't happen, can we please have a code
> > > > comment which reveals that reason?
> > >
> > > zpool_create_pool() should work if this returns true, unless as you
> > > say the module is rmmod'ed *and* removed from the system - since
> > > zpool_create_pool() will call request_module() just as this function
> > > does. I can add a comment explaining that.
> >
> > I like comments ;)
> >
> > Seth, I'm planning on sitting on these patches until you've had a
> > chance to review them.
>
> Thanks Andrew. I'm reviewing now. Patch 2/3 is pretty huge. I've got
> the gist of the changes now. I'm also building and testing for myself
> as this creates a lot more surface area for issues, alternating between
> compressors and allocating new compression transforms on the fly.
>
> I'm kinda with Sergey on this in that it adds yet another complexity to
> an already complex feature. This adds more locking, more RCU, more
> refcounting. It's becoming harder to review, test, and verify.
>
> I should have results tomorrow.

So I gave it a test run turning all the knobs (compressor, enabled,
max_pool_percent, and zpool) like a crazy person and it was stable,
and all the adjustments had the expected result.

Dan, you might follow up with an update to Documentation/vm/zswap.txt
noting that these parameters are runtime adjustable now.

The growing complexity is a concern, but it is nice to have the
flexibility. Thanks for the good work!

To patchset:

Acked-by: Seth Jennings <[email protected]>

>
> Thanks,
> Seth

2015-08-07 06:30:28

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH 2/3] zswap: dynamic pool creation

Hello,

On (08/05/15 09:46), Dan Streetman wrote:
[..]
> -enum comp_op {
> - ZSWAP_COMPOP_COMPRESS,
> - ZSWAP_COMPOP_DECOMPRESS
> +struct zswap_pool {
> + struct zpool *zpool;
> + struct kref kref;
> + struct list_head list;
> + struct rcu_head rcu_head;
> + struct notifier_block notifier;
> + char tfm_name[CRYPTO_MAX_ALG_NAME];

do you need to keep a second CRYPTO_MAX_ALG_NAME copy? shouldn't it
be `tfm->__crt_alg->cra_name`, which is what
crypto_tfm_alg_name(struct crypto_tfm *tfm)
does?

> + struct crypto_comp * __percpu *tfm;
> };

->tfm will be access pretty often, right? did you intentionally put it
at the bottom offset of `struct zswap_pool'?

[..]
> +static struct zswap_pool *__zswap_pool_current(void)
> {
> - return totalram_pages * zswap_max_pool_percent / 100 <
> - DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
> + struct zswap_pool *pool;
> +
> + pool = list_first_or_null_rcu(&zswap_pools, typeof(*pool), list);
> + WARN_ON(!pool);
> +
> + return pool;
> +}
> +
> +static struct zswap_pool *zswap_pool_current(void)
> +{
> + assert_spin_locked(&zswap_pools_lock);
> +
> + return __zswap_pool_current();
> +}

this one seems to be used only once. do you want to replace
that single usage (well, if it's really needed)

WARN_ON(pool == zswap_pool_current());
with
WARN_ON(pool == __zswap_pool_current);

?

you can then drop zswap_pool_current()... and probably rename
__zswap_pool_current() to zswap_pool_current().

-ss

> +static struct zswap_pool *zswap_pool_current_get(void)
> +{
> + struct zswap_pool *pool;
> +
> + rcu_read_lock();
> +
> + pool = __zswap_pool_current();
> + if (!pool || !zswap_pool_get(pool))
> + pool = NULL;
> +
> + rcu_read_unlock();
> +
> + return pool;
> +}
> +
> +static struct zswap_pool *zswap_pool_last_get(void)
> +{
> + struct zswap_pool *pool, *last = NULL;
> +
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(pool, &zswap_pools, list)
> + last = pool;
> + if (!WARN_ON(!last) && !zswap_pool_get(last))
> + last = NULL;
> +
> + rcu_read_unlock();
> +
> + return last;
> +}

2015-08-07 14:25:07

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 2/3] zswap: dynamic pool creation

On Fri, Aug 7, 2015 at 2:30 AM, Sergey Senozhatsky
<[email protected]> wrote:
> Hello,
>
> On (08/05/15 09:46), Dan Streetman wrote:
> [..]
>> -enum comp_op {
>> - ZSWAP_COMPOP_COMPRESS,
>> - ZSWAP_COMPOP_DECOMPRESS
>> +struct zswap_pool {
>> + struct zpool *zpool;
>> + struct kref kref;
>> + struct list_head list;
>> + struct rcu_head rcu_head;
>> + struct notifier_block notifier;
>> + char tfm_name[CRYPTO_MAX_ALG_NAME];
>
> do you need to keep a second CRYPTO_MAX_ALG_NAME copy? shouldn't it
> be `tfm->__crt_alg->cra_name`, which is what
> crypto_tfm_alg_name(struct crypto_tfm *tfm)
> does?

well, we don't absolutely have to keep a copy of tfm_name. However,
->tfm is a __percpu variable, so each time we want to check the pool's
tfm name, we would need to do:
crypto_comp_name(this_cpu_ptr(pool->tfm))

nothing wrong with that really, just adds a bit more code each time we
want to check the tfm name. I'll send a patch to change it.

>
>> + struct crypto_comp * __percpu *tfm;
>> };
>
> ->tfm will be access pretty often, right? did you intentionally put it
> at the bottom offset of `struct zswap_pool'?

no it wasn't intentional; does moving it up provide a benefit?

>
> [..]
>> +static struct zswap_pool *__zswap_pool_current(void)
>> {
>> - return totalram_pages * zswap_max_pool_percent / 100 <
>> - DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
>> + struct zswap_pool *pool;
>> +
>> + pool = list_first_or_null_rcu(&zswap_pools, typeof(*pool), list);
>> + WARN_ON(!pool);
>> +
>> + return pool;
>> +}
>> +
>> +static struct zswap_pool *zswap_pool_current(void)
>> +{
>> + assert_spin_locked(&zswap_pools_lock);
>> +
>> + return __zswap_pool_current();
>> +}
>
> this one seems to be used only once. do you want to replace
> that single usage (well, if it's really needed)

it's actually used twice, in __zswap_pool_empty() and
__zswap_param_set(). The next patch adds __zswap_param_set().

>
> WARN_ON(pool == zswap_pool_current());
> with
> WARN_ON(pool == __zswap_pool_current);
>
> ?
>
> you can then drop zswap_pool_current()... and probably rename
> __zswap_pool_current() to zswap_pool_current().
>
> -ss
>
>> +static struct zswap_pool *zswap_pool_current_get(void)
>> +{
>> + struct zswap_pool *pool;
>> +
>> + rcu_read_lock();
>> +
>> + pool = __zswap_pool_current();
>> + if (!pool || !zswap_pool_get(pool))
>> + pool = NULL;
>> +
>> + rcu_read_unlock();
>> +
>> + return pool;
>> +}
>> +
>> +static struct zswap_pool *zswap_pool_last_get(void)
>> +{
>> + struct zswap_pool *pool, *last = NULL;
>> +
>> + rcu_read_lock();
>> +
>> + list_for_each_entry_rcu(pool, &zswap_pools, list)
>> + last = pool;
>> + if (!WARN_ON(!last) && !zswap_pool_get(last))
>> + last = NULL;
>> +
>> + rcu_read_unlock();
>> +
>> + return last;
>> +}

2015-08-07 18:57:36

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 2/3] zswap: dynamic pool creation

On Fri, Aug 7, 2015 at 10:24 AM, Dan Streetman <[email protected]> wrote:
> On Fri, Aug 7, 2015 at 2:30 AM, Sergey Senozhatsky
> <[email protected]> wrote:
>> Hello,
>>
>> On (08/05/15 09:46), Dan Streetman wrote:
>> [..]
>>> -enum comp_op {
>>> - ZSWAP_COMPOP_COMPRESS,
>>> - ZSWAP_COMPOP_DECOMPRESS
>>> +struct zswap_pool {
>>> + struct zpool *zpool;
>>> + struct kref kref;
>>> + struct list_head list;
>>> + struct rcu_head rcu_head;
>>> + struct notifier_block notifier;
>>> + char tfm_name[CRYPTO_MAX_ALG_NAME];
>>
>> do you need to keep a second CRYPTO_MAX_ALG_NAME copy? shouldn't it
>> be `tfm->__crt_alg->cra_name`, which is what
>> crypto_tfm_alg_name(struct crypto_tfm *tfm)
>> does?
>
> well, we don't absolutely have to keep a copy of tfm_name. However,
> ->tfm is a __percpu variable, so each time we want to check the pool's
> tfm name, we would need to do:
> crypto_comp_name(this_cpu_ptr(pool->tfm))
>
> nothing wrong with that really, just adds a bit more code each time we
> want to check the tfm name. I'll send a patch to change it.

i knew there was a reason i added the tfm_name ;-)

since ->tfm is a percpu, we add a notifier for added/removed cpus.
when a cpu is added, we create a new tfm for it. If we don't have the
tfm_name separate from the percpu ->tfm, we have to check some other
cpu's tfm for its name, and i don't think the complexity of checking
what cpus are present *and* have a ->tfm allocated already just to get
the name is worth it, for only 64 bytes ;-)

>
>>
>>> + struct crypto_comp * __percpu *tfm;
>>> };
>>
>> ->tfm will be access pretty often, right? did you intentionally put it
>> at the bottom offset of `struct zswap_pool'?
>
> no it wasn't intentional; does moving it up provide a benefit?
>
>>
>> [..]
>>> +static struct zswap_pool *__zswap_pool_current(void)
>>> {
>>> - return totalram_pages * zswap_max_pool_percent / 100 <
>>> - DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
>>> + struct zswap_pool *pool;
>>> +
>>> + pool = list_first_or_null_rcu(&zswap_pools, typeof(*pool), list);
>>> + WARN_ON(!pool);
>>> +
>>> + return pool;
>>> +}
>>> +
>>> +static struct zswap_pool *zswap_pool_current(void)
>>> +{
>>> + assert_spin_locked(&zswap_pools_lock);
>>> +
>>> + return __zswap_pool_current();
>>> +}
>>
>> this one seems to be used only once. do you want to replace
>> that single usage (well, if it's really needed)
>
> it's actually used twice, in __zswap_pool_empty() and
> __zswap_param_set(). The next patch adds __zswap_param_set().
>
>>
>> WARN_ON(pool == zswap_pool_current());
>> with
>> WARN_ON(pool == __zswap_pool_current);
>>
>> ?
>>
>> you can then drop zswap_pool_current()... and probably rename
>> __zswap_pool_current() to zswap_pool_current().
>>
>> -ss
>>
>>> +static struct zswap_pool *zswap_pool_current_get(void)
>>> +{
>>> + struct zswap_pool *pool;
>>> +
>>> + rcu_read_lock();
>>> +
>>> + pool = __zswap_pool_current();
>>> + if (!pool || !zswap_pool_get(pool))
>>> + pool = NULL;
>>> +
>>> + rcu_read_unlock();
>>> +
>>> + return pool;
>>> +}
>>> +
>>> +static struct zswap_pool *zswap_pool_last_get(void)
>>> +{
>>> + struct zswap_pool *pool, *last = NULL;
>>> +
>>> + rcu_read_lock();
>>> +
>>> + list_for_each_entry_rcu(pool, &zswap_pools, list)
>>> + last = pool;
>>> + if (!WARN_ON(!last) && !zswap_pool_get(last))
>>> + last = NULL;
>>> +
>>> + rcu_read_unlock();
>>> +
>>> + return last;
>>> +}

2015-08-10 00:48:39

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH 2/3] zswap: dynamic pool creation

Hello,

On (08/07/15 10:24), Dan Streetman wrote:
> > On (08/05/15 09:46), Dan Streetman wrote:
> > [..]
> >> -enum comp_op {
> >> - ZSWAP_COMPOP_COMPRESS,
> >> - ZSWAP_COMPOP_DECOMPRESS
> >> +struct zswap_pool {
> >> + struct zpool *zpool;
> >> + struct kref kref;
> >> + struct list_head list;
> >> + struct rcu_head rcu_head;
> >> + struct notifier_block notifier;
> >> + char tfm_name[CRYPTO_MAX_ALG_NAME];
> >
> > do you need to keep a second CRYPTO_MAX_ALG_NAME copy? shouldn't it
> > be `tfm->__crt_alg->cra_name`, which is what
> > crypto_tfm_alg_name(struct crypto_tfm *tfm)
> > does?
>
> well, we don't absolutely have to keep a copy of tfm_name. However,
> ->tfm is a __percpu variable, so each time we want to check the pool's
> tfm name, we would need to do:
> crypto_comp_name(this_cpu_ptr(pool->tfm))
>
> nothing wrong with that really, just adds a bit more code each time we
> want to check the tfm name. I'll send a patch to change it.
>
> >
> >> + struct crypto_comp * __percpu *tfm;
> >> };
> >
> > ->tfm will be access pretty often, right? did you intentionally put it
> > at the bottom offset of `struct zswap_pool'?
>
> no it wasn't intentional; does moving it up provide a benefit?

well, I just prefer to keep 'read mostly' pointers together. all
those cache lines, etc.

gcc 5.1, x86_64

struct zswap_pool {
struct zpool *zpool;
+ struct crypto_comp * __percpu *tfm;
struct kref kref;
struct list_head list;
struct rcu_head rcu_head;
struct notifier_block notifier;
char tfm_name[CRYPTO_MAX_ALG_NAME];
- struct crypto_comp * __percpu *tfm;
};

../scripts/bloat-o-meter zswap.o.old zswap.o
add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-27 (-27)
function old new delta
zswap_writeback_entry 659 656 -3
zswap_frontswap_store 1445 1442 -3
zswap_frontswap_load 417 414 -3
zswap_pool_create 438 432 -6
__zswap_cpu_comp_notifier.part 152 146 -6
__zswap_cpu_comp_notifier 122 116 -6


you know it better ;-)


[..]
> > this one seems to be used only once. do you want to replace
> > that single usage (well, if it's really needed)
>
> it's actually used twice, in __zswap_pool_empty() and
> __zswap_param_set(). The next patch adds __zswap_param_set().

Aha, sorry, didn't read the next patch in advance.

-ss

2015-08-14 20:01:52

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 1/3] zpool: add zpool_has_pool()

>>On Wed, Aug 5, 2015 at 6:06 PM, Andrew Morton <[email protected]> wrote:
>>> On Wed, 5 Aug 2015 18:00:26 -0400 Dan Streetman <[email protected]> wrote:
>>>
>>>> >
>>>> > If there's some reason why this can't happen, can we please have a code
>>>> > comment which reveals that reason?
>>>>
>>>> zpool_create_pool() should work if this returns true, unless as you
>>>> say the module is rmmod'ed *and* removed from the system - since
>>>> zpool_create_pool() will call request_module() just as this function
>>>> does. I can add a comment explaining that.
>>>
>>> I like comments ;)
>>>
>>> Seth, I'm planning on sitting on these patches until you've had a
>>> chance to review them.
>>>
>>>
>> Thanks Andrew. I'm reviewing now. Patch 2/3 is pretty huge. I've got
>> the gist of the changes now. I'm also building and testing for myself
>> as this creates a lot more surface area for issues, alternating between
>> compressors and allocating new compression transforms on the fly.
>>
>> I'm kinda with Sergey on this in that it adds yet another complexity to
>> an already complex feature. This adds more locking, more RCU, more
>> refcounting. It's becoming harder to review, test, and verify.
>>
>> I should have results tomorrow.
>
>So I gave it a test run turning all the knobs (compressor, enabled,
>max_pool_percent, and zpool) like a crazy person and it was stable,
>and all the adjustments had the expected result.
>
>Dan, you might follow up with an update to Documentation/vm/zswap.txt
>noting that these parameters are runtime adjustable now.
>
>The growing complexity is a concern, but it is nice to have the
>flexibility. Thanks for the good work!
>
>To patchset:
>
>Acked-by: Seth Jennings <[email protected]>
>

Hi Seth!

FYI, for whatever reason I'm still not directly getting your emails :(
I use gmail, if that helps...I don't know if there's a problem on
your end or mine...at least this time I knew to check the list archive
;-)

Thanks for reviewing! I'll send a patch to update zswap.txt also.

Andrew, would you prefer an additional patch to update zswap.txt, or
should I roll up that patch and the other few correction patches and
resend this patch set?

2015-08-14 20:03:11

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCH 2/3] zswap: dynamic pool creation

On Sun, Aug 9, 2015 at 8:49 PM, Sergey Senozhatsky
<[email protected]> wrote:
> Hello,
>
> On (08/07/15 10:24), Dan Streetman wrote:
>> > On (08/05/15 09:46), Dan Streetman wrote:
>> > [..]
>> >> -enum comp_op {
>> >> - ZSWAP_COMPOP_COMPRESS,
>> >> - ZSWAP_COMPOP_DECOMPRESS
>> >> +struct zswap_pool {
>> >> + struct zpool *zpool;
>> >> + struct kref kref;
>> >> + struct list_head list;
>> >> + struct rcu_head rcu_head;
>> >> + struct notifier_block notifier;
>> >> + char tfm_name[CRYPTO_MAX_ALG_NAME];
>> >
>> > do you need to keep a second CRYPTO_MAX_ALG_NAME copy? shouldn't it
>> > be `tfm->__crt_alg->cra_name`, which is what
>> > crypto_tfm_alg_name(struct crypto_tfm *tfm)
>> > does?
>>
>> well, we don't absolutely have to keep a copy of tfm_name. However,
>> ->tfm is a __percpu variable, so each time we want to check the pool's
>> tfm name, we would need to do:
>> crypto_comp_name(this_cpu_ptr(pool->tfm))
>>
>> nothing wrong with that really, just adds a bit more code each time we
>> want to check the tfm name. I'll send a patch to change it.
>>
>> >
>> >> + struct crypto_comp * __percpu *tfm;
>> >> };
>> >
>> > ->tfm will be access pretty often, right? did you intentionally put it
>> > at the bottom offset of `struct zswap_pool'?
>>
>> no it wasn't intentional; does moving it up provide a benefit?
>
> well, I just prefer to keep 'read mostly' pointers together. all
> those cache lines, etc.
>
> gcc 5.1, x86_64
>
> struct zswap_pool {
> struct zpool *zpool;
> + struct crypto_comp * __percpu *tfm;
> struct kref kref;
> struct list_head list;
> struct rcu_head rcu_head;
> struct notifier_block notifier;
> char tfm_name[CRYPTO_MAX_ALG_NAME];
> - struct crypto_comp * __percpu *tfm;
> };
>
> ../scripts/bloat-o-meter zswap.o.old zswap.o
> add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-27 (-27)
> function old new delta
> zswap_writeback_entry 659 656 -3
> zswap_frontswap_store 1445 1442 -3
> zswap_frontswap_load 417 414 -3
> zswap_pool_create 438 432 -6
> __zswap_cpu_comp_notifier.part 152 146 -6
> __zswap_cpu_comp_notifier 122 116 -6
>
>
> you know it better ;-)

Ah, well sure that looks better, I'll send a patch (or roll it into a
patch set resend).

Thanks!

>
>
> [..]
>> > this one seems to be used only once. do you want to replace
>> > that single usage (well, if it's really needed)
>>
>> it's actually used twice, in __zswap_pool_empty() and
>> __zswap_param_set(). The next patch adds __zswap_param_set().
>
> Aha, sorry, didn't read the next patch in advance.
>
> -ss