2014-06-02 22:20:16

by Dan Streetman

Subject: [PATCHv4 0/6] mm/zpool: add common api for zswap to use zbud/zsmalloc

In order to allow zswap users to choose between zbud and zsmalloc for
the compressed storage pool, this patch set adds a new api "zpool" that
provides an interface to both zbud and zsmalloc. Only minor changes
to zbud's interface were needed. This does not include implementing
shrinking in zsmalloc, which will be sent separately.

I believe Seth originally was using zsmalloc for swap, but there were
concerns about how significant the impact of shrinking zsmalloc would
be when zswap had to start reclaiming pages. That still may be an
issue, but this at least allows users to choose themselves whether
they want a lower-density or higher-density compressed storage medium.
At least for situations where zswap reclaim is never or rarely reached,
it probably makes sense to use the higher density of zsmalloc.

Note this patch set does not change zram to use zpool, although that
change should be possible as well.
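
To illustrate the api, here is a minimal sketch of a user (error
handling trimmed; "ops", "src" and "len" are caller-provided here,
and the function names are the ones added in patch 3/6):

    struct zpool *pool;
    unsigned long handle;
    char *buf;

    /* pick a backend by name; the gfp flags are used as the pool grows */
    pool = zpool_create_pool("zbud", __GFP_NORETRY | __GFP_NOWARN, &ops);
    if (!pool)
        return -ENOMEM;

    if (zpool_malloc(pool, len, &handle) == 0) {
        buf = zpool_map_handle(pool, handle, ZPOOL_MM_RW);
        memcpy(buf, src, len);
        zpool_unmap_handle(pool, handle);
        /* ... use the stored data, then release it ... */
        zpool_free(pool, handle);
    }

    zpool_destroy_pool(pool);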

---

Changes since v3 : https://lkml.org/lkml/2014/5/24/130
-In zpool_shrink() use # pages instead of # bytes
-Add reclaimed param to zpool_shrink() to indicate to caller
# pages actually reclaimed
-move module usage counting to zpool, from zbud/zsmalloc
-update zbud_zpool_shrink() to call zbud_reclaim_page() in a
loop until requested # pages have been reclaimed (or error)

Changes since v2 : https://lkml.org/lkml/2014/5/7/927
-Change zpool to use driver registration instead of hardcoding
implementations
-Add module use counting in zbud/zsmalloc

Changes since v1 https://lkml.org/lkml/2014/4/19/97
-remove zsmalloc shrinking
-change zbud size param type from unsigned int to size_t
-remove zpool fallback creation
-zswap manually falls back to zbud if specified type fails


Dan Streetman (6):
mm/zbud: zbud_alloc() minor param change
mm/zbud: change zbud_alloc size type to size_t
mm/zpool: implement common zpool api to zbud/zsmalloc
mm/zpool: zbud/zsmalloc implement zpool
mm/zpool: update zswap to use zpool
mm/zpool: prevent zbud/zsmalloc from unloading when used

include/linux/zbud.h | 2 +-
include/linux/zpool.h | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++
mm/Kconfig | 43 ++++++----
mm/Makefile | 1 +
mm/zbud.c | 123 +++++++++++++++++++++++----
mm/zpool.c | 206 ++++++++++++++++++++++++++++++++++++++++++++++
mm/zsmalloc.c | 83 +++++++++++++++++++
mm/zswap.c | 76 ++++++++++-------
8 files changed, 694 insertions(+), 64 deletions(-)
create mode 100644 include/linux/zpool.h
create mode 100644 mm/zpool.c

--
1.8.3.1


2014-06-02 22:20:24

by Dan Streetman

Subject: [PATCHv2 1/6] mm/zbud: zbud_alloc() minor param change

Change zbud to store gfp_t flags passed at pool creation to use for
each alloc; this allows the api to be closer to the existing zsmalloc
interface, and the only current zbud user (zswap) uses the same gfp
flags for all allocs. Update zswap to use changed interface.
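
That is, the gfp flags move from each alloc call to pool creation
time, e.g. (a sketch based on the diff below, using zswap's flags):

    /* before */
    pool = zbud_create_pool(GFP_KERNEL, &zswap_zbud_ops);
    zbud_alloc(pool, len, __GFP_NORETRY | __GFP_NOWARN, &handle);

    /* after */
    pool = zbud_create_pool(__GFP_NORETRY | __GFP_NOWARN, &zswap_zbud_ops);
    zbud_alloc(pool, len, &handle);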

Signed-off-by: Dan Streetman <[email protected]>
Acked-by: Seth Jennings <[email protected]>
Cc: Weijie Yang <[email protected]>
---

No change since v2 : https://lkml.org/lkml/2014/5/7/727

Changes since v1 https://lkml.org/lkml/2014/4/19/98
-context changes only; zbud_alloc parameter type changed since
last patch

include/linux/zbud.h | 2 +-
mm/zbud.c | 27 +++++++++++++++------------
mm/zswap.c | 6 +++---
3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/zbud.h b/include/linux/zbud.h
index 13af0d4..0b2534e 100644
--- a/include/linux/zbud.h
+++ b/include/linux/zbud.h
@@ -11,7 +11,7 @@ struct zbud_ops {

struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops);
void zbud_destroy_pool(struct zbud_pool *pool);
-int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,
+int zbud_alloc(struct zbud_pool *pool, unsigned int size,
unsigned long *handle);
void zbud_free(struct zbud_pool *pool, unsigned long handle);
int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries);
diff --git a/mm/zbud.c b/mm/zbud.c
index 01df13a..847c01c 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -94,6 +94,7 @@ struct zbud_pool {
struct list_head lru;
u64 pages_nr;
struct zbud_ops *ops;
+ gfp_t gfp;
};

/*
@@ -193,9 +194,12 @@ static int num_free_chunks(struct zbud_header *zhdr)
*****************/
/**
* zbud_create_pool() - create a new zbud pool
- * @gfp: gfp flags when allocating the zbud pool structure
+ * @gfp: gfp flags when growing the pool
* @ops: user-defined operations for the zbud pool
*
+ * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used
+ * as zbud pool pages.
+ *
* Return: pointer to the new zbud pool or NULL if the metadata allocation
* failed.
*/
@@ -204,7 +208,9 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops)
struct zbud_pool *pool;
int i;

- pool = kmalloc(sizeof(struct zbud_pool), gfp);
+ if (gfp & __GFP_HIGHMEM)
+ return NULL;
+ pool = kmalloc(sizeof(struct zbud_pool), GFP_KERNEL);
if (!pool)
return NULL;
spin_lock_init(&pool->lock);
@@ -214,6 +220,7 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops)
INIT_LIST_HEAD(&pool->lru);
pool->pages_nr = 0;
pool->ops = ops;
+ pool->gfp = gfp;
return pool;
}

@@ -232,7 +239,6 @@ void zbud_destroy_pool(struct zbud_pool *pool)
* zbud_alloc() - allocates a region of a given size
* @pool: zbud pool from which to allocate
* @size: size in bytes of the desired allocation
- * @gfp: gfp flags used if the pool needs to grow
* @handle: handle of the new allocation
*
* This function will attempt to find a free region in the pool large enough to
@@ -240,14 +246,11 @@ void zbud_destroy_pool(struct zbud_pool *pool)
* performed first. If no suitable free region is found, then a new page is
* allocated and added to the pool to satisfy the request.
*
- * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used
- * as zbud pool pages.
- *
- * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
- * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
- * a new page.
+ * Return: 0 if success and @handle is set, -ENOSPC if the @size is too large,
+ * -EINVAL if the @size is 0, or -ENOMEM if the pool was unable to
+ * allocate a new page.
*/
-int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,
+int zbud_alloc(struct zbud_pool *pool, unsigned int size,
unsigned long *handle)
{
int chunks, i, freechunks;
@@ -255,7 +258,7 @@ int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,
enum buddy bud;
struct page *page;

- if (!size || (gfp & __GFP_HIGHMEM))
+ if (!size)
return -EINVAL;
if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
return -ENOSPC;
@@ -279,7 +282,7 @@ int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,

/* Couldn't find unbuddied zbud page, create new one */
spin_unlock(&pool->lock);
- page = alloc_page(gfp);
+ page = alloc_page(pool->gfp);
if (!page)
return -ENOMEM;
spin_lock(&pool->lock);
diff --git a/mm/zswap.c b/mm/zswap.c
index aeaef0f..1cc6770 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -679,8 +679,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* store */
len = dlen + sizeof(struct zswap_header);
- ret = zbud_alloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
- &handle);
+ ret = zbud_alloc(zswap_pool, len, &handle);
if (ret == -ENOSPC) {
zswap_reject_compress_poor++;
goto freepage;
@@ -900,7 +899,8 @@ static int __init init_zswap(void)

pr_info("loading zswap\n");

- zswap_pool = zbud_create_pool(GFP_KERNEL, &zswap_zbud_ops);
+ zswap_pool = zbud_create_pool(__GFP_NORETRY | __GFP_NOWARN,
+ &zswap_zbud_ops);
if (!zswap_pool) {
pr_err("zbud pool creation failed\n");
goto error;
--
1.8.3.1

2014-06-02 22:20:29

by Dan Streetman

Subject: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

Add zpool api.

zpool provides an interface for memory storage, typically of compressed
memory. Users can select what backend to use; currently the only
implementations are zbud, a low density implementation with up to
two compressed pages per storage page, and zsmalloc, a higher density
implementation with multiple compressed pages per storage page.
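
A backend provides a struct zpool_driver and registers it with zpool;
roughly like this (the example_* names are placeholders for a
backend's own functions - the real zbud/zsmalloc glue comes in the
next patch):

    static struct zpool_driver example_zpool_driver = {
        .type = "example",
        .create = example_create,
        .destroy = example_destroy,
        .malloc = example_malloc,
        .free = example_free,
        .shrink = example_shrink,
        .map = example_map,
        .unmap = example_unmap,
        .total_size = example_total_size,
    };

    static int __init example_init(void)
    {
        zpool_register_driver(&example_zpool_driver);
        return 0;
    }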

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Note this patch set is against the mmotm tree at
git://git.cmpxchg.org/linux-mmotm.git
This patch may need context changes to the -next or other trees.

Changes since v3 : https://lkml.org/lkml/2014/5/24/135
-change zpool_shrink() to use # pages instead of # bytes
-change zpool_shrink() to update *reclaimed param with
number of pages actually reclaimed

Changes since v2 : https://lkml.org/lkml/2014/5/7/733
-Remove hardcoded zbud/zsmalloc implementations
-Add driver (un)register functions

Changes since v1 https://lkml.org/lkml/2014/4/19/101
-add some pr_info() during creation and pr_err() on errors
-remove zpool code to call zs_shrink(), since zsmalloc shrinking
was removed from this patchset
-remove fallback; only specified pool type will be tried
-pr_fmt() is defined in zpool to prefix zpool: in any pr_XXX() calls

include/linux/zpool.h | 219 ++++++++++++++++++++++++++++++++++++++++++++++++++
mm/Kconfig | 41 ++++++----
mm/Makefile | 1 +
mm/zpool.c | 198 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 442 insertions(+), 17 deletions(-)
create mode 100644 include/linux/zpool.h
create mode 100644 mm/zpool.c

diff --git a/include/linux/zpool.h b/include/linux/zpool.h
new file mode 100644
index 0000000..a528f7c
--- /dev/null
+++ b/include/linux/zpool.h
@@ -0,0 +1,219 @@
+/*
+ * zpool memory storage api
+ *
+ * Copyright (C) 2014 Dan Streetman
+ *
+ * This is a common frontend for the zbud and zsmalloc memory
+ * storage pool implementations. Typically, this is used to
+ * store compressed memory.
+ */
+
+#ifndef _ZPOOL_H_
+#define _ZPOOL_H_
+
+struct zpool;
+
+struct zpool_ops {
+ int (*evict)(struct zpool *pool, unsigned long handle);
+};
+
+/*
+ * Control how a handle is mapped. It will be ignored if the
+ * implementation does not support it. Its use is optional.
+ * Note that this does not refer to memory protection, it
+ * refers to how the memory will be copied in/out if copying
+ * is necessary during mapping; read-write is the safest as
+ * it copies the existing memory in on map, and copies the
+ * changed memory back out on unmap. Write-only does not copy
+ * in the memory and should only be used for initialization.
+ * If in doubt, use ZPOOL_MM_DEFAULT which is read-write.
+ */
+enum zpool_mapmode {
+ ZPOOL_MM_RW, /* normal read-write mapping */
+ ZPOOL_MM_RO, /* read-only (no copy-out at unmap time) */
+ ZPOOL_MM_WO, /* write-only (no copy-in at map time) */
+
+ ZPOOL_MM_DEFAULT = ZPOOL_MM_RW
+};
+
+/**
+ * zpool_create_pool() - Create a new zpool
+ * @type The type of the zpool to create (e.g. zbud, zsmalloc)
+ * @flags What GFP flags should be used when the zpool allocates memory.
+ * @ops The optional ops callback.
+ *
+ * This creates a new zpool of the specified type. The zpool will use the
+ * given flags when allocating any memory. If the ops param is NULL, then
+ * the created zpool will not be shrinkable.
+ *
+ * Returns: New zpool on success, NULL on failure.
+ */
+struct zpool *zpool_create_pool(char *type, gfp_t flags,
+ struct zpool_ops *ops);
+
+/**
+ * zpool_get_type() - Get the type of the zpool
+ * @pool The zpool to check
+ *
+ * This returns the type of the pool.
+ *
+ * Returns: The type of zpool.
+ */
+char *zpool_get_type(struct zpool *pool);
+
+/**
+ * zpool_destroy_pool() - Destroy a zpool
+ * @pool The zpool to destroy.
+ *
+ * This destroys an existing zpool. The zpool should not be in use.
+ */
+void zpool_destroy_pool(struct zpool *pool);
+
+/**
+ * zpool_malloc() - Allocate memory
+ * @pool The zpool to allocate from.
+ * @size The amount of memory to allocate.
+ * @handle Pointer to the handle to set
+ *
+ * This allocates the requested amount of memory from the pool.
+ * The provided @handle will be set to the allocated object handle.
+ *
+ * Returns: 0 on success, negative value on error.
+ */
+int zpool_malloc(struct zpool *pool, size_t size, unsigned long *handle);
+
+/**
+ * zpool_free() - Free previously allocated memory
+ * @pool The zpool that allocated the memory.
+ * @handle The handle to the memory to free.
+ *
+ * This frees previously allocated memory. This does not guarantee
+ * that the pool will actually free memory, only that the memory
+ * in the pool will become available for use by the pool.
+ */
+void zpool_free(struct zpool *pool, unsigned long handle);
+
+/**
+ * zpool_shrink() - Shrink the pool size
+ * @pool The zpool to shrink.
+ * @pages The number of pages to shrink the pool.
+ * @reclaimed The number of pages successfully evicted.
+ *
+ * This attempts to shrink the actual memory size of the pool
+ * by evicting currently used handle(s). If the pool was
+ * created with no zpool_ops, or the evict call fails for any
+ * of the handles, this will fail. If non-NULL, the @reclaimed
+ * parameter will be set to the number of pages reclaimed,
+ * which may be more than the number of pages requested.
+ *
+ * Returns: 0 on success, negative value on error/failure.
+ */
+int zpool_shrink(struct zpool *pool, unsigned int pages,
+ unsigned int *reclaimed);
+
+/**
+ * zpool_map_handle() - Map a previously allocated handle into memory
+ * @pool The zpool that the handle was allocated from
+ * @handle The handle to map
+ * @mm How the memory should be mapped
+ *
+ * This maps a previously allocated handle into memory. The @mm
+ * param indicates to the implementation how the memory will be
+ * used, i.e. read-only, write-only, read-write. If the
+ * implementation does not support it, the memory will be treated
+ * as read-write.
+ *
+ * This may hold locks, disable interrupts, and/or preemption,
+ * and the zpool_unmap_handle() must be called to undo those
+ * actions. The code that uses the mapped handle should complete
+ * its operations on the mapped handle memory quickly and unmap
+ * as soon as possible. Multiple handles should not be mapped
+ * concurrently on a cpu.
+ *
+ * Returns: A pointer to the handle's mapped memory area.
+ */
+void *zpool_map_handle(struct zpool *pool, unsigned long handle,
+ enum zpool_mapmode mm);
+
+/**
+ * zpool_unmap_handle() - Unmap a previously mapped handle
+ * @pool The zpool that the handle was allocated from
+ * @handle The handle to unmap
+ *
+ * This unmaps a previously mapped handle. Any locks or other
+ * actions that the implementation took in zpool_map_handle()
+ * will be undone here. The memory area returned from
+ * zpool_map_handle() should no longer be used after this.
+ */
+void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
+
+/**
+ * zpool_get_total_size() - The total size of the pool
+ * @pool The zpool to check
+ *
+ * This returns the total size in bytes of the pool.
+ *
+ * Returns: Total size of the zpool in bytes.
+ */
+u64 zpool_get_total_size(struct zpool *pool);
+
+
+/**
+ * struct zpool_driver - driver implementation for zpool
+ * @type: name of the driver.
+ * @list: entry in the list of zpool drivers.
+ * @create: create a new pool.
+ * @destroy: destroy a pool.
+ * @malloc: allocate mem from a pool.
+ * @free: free mem from a pool.
+ * @shrink: shrink the pool.
+ * @map: map a handle.
+ * @unmap: unmap a handle.
+ * @total_size: get total size of a pool.
+ *
+ * This is created by a zpool implementation and registered
+ * with zpool.
+ */
+struct zpool_driver {
+ char *type;
+ struct list_head list;
+
+ void *(*create)(gfp_t gfp, struct zpool_ops *ops);
+ void (*destroy)(void *pool);
+
+ int (*malloc)(void *pool, size_t size, unsigned long *handle);
+ void (*free)(void *pool, unsigned long handle);
+
+ int (*shrink)(void *pool, unsigned int pages,
+ unsigned int *reclaimed);
+
+ void *(*map)(void *pool, unsigned long handle,
+ enum zpool_mapmode mm);
+ void (*unmap)(void *pool, unsigned long handle);
+
+ u64 (*total_size)(void *pool);
+};
+
+/**
+ * zpool_register_driver() - register a zpool implementation.
+ * @driver: driver to register
+ */
+void zpool_register_driver(struct zpool_driver *driver);
+
+/**
+ * zpool_unregister_driver() - unregister a zpool implementation.
+ * @driver: driver to unregister.
+ */
+void zpool_unregister_driver(struct zpool_driver *driver);
+
+/**
+ * zpool_evict() - evict callback from a zpool implementation.
+ * @pool: pool to evict from.
+ * @handle: handle to evict.
+ *
+ * This can be used by zpool implementations to call the
+ * user's zpool_ops evict callback.
+ */
+int zpool_evict(void *pool, unsigned long handle);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 7511b4a..00f7720 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -515,15 +515,17 @@ config CMA_DEBUG
processing calls such as dma_alloc_from_contiguous().
This option does not affect warning and error messages.

-config ZBUD
- tristate
- default n
+config MEM_SOFT_DIRTY
+ bool "Track memory changes"
+ depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
+ select PROC_PAGE_MONITOR
help
- A special purpose allocator for storing compressed pages.
- It is designed to store up to two compressed pages per physical
- page. While this design limits storage density, it has simple and
- deterministic reclaim properties that make it preferable to a higher
- density approach when reclaim will be used.
+ This option enables memory changes tracking by introducing a
+ soft-dirty bit on pte-s. This bit is set when someone writes
+ into a page just as the regular dirty bit, but unlike the latter
+ it can be cleared by hand.
+
+ See Documentation/vm/soft-dirty.txt for more details.

config ZSWAP
bool "Compressed cache for swap pages (EXPERIMENTAL)"
@@ -545,17 +547,22 @@ config ZSWAP
they have not be fully explored on the large set of potential
configurations and workloads that exist.

-config MEM_SOFT_DIRTY
- bool "Track memory changes"
- depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
- select PROC_PAGE_MONITOR
+config ZPOOL
+ tristate "Common API for compressed memory storage"
+ default n
help
- This option enables memory changes tracking by introducing a
- soft-dirty bit on pte-s. This bit it set when someone writes
- into a page just as regular dirty bit, but unlike the latter
- it can be cleared by hands.
+ Compressed memory storage API. This allows using either zbud or
+ zsmalloc.

- See Documentation/vm/soft-dirty.txt for more details.
+config ZBUD
+ tristate "Low density storage for compressed pages"
+ default n
+ help
+ A special purpose allocator for storing compressed pages.
+ It is designed to store up to two compressed pages per physical
+ page. While this design limits storage density, it has simple and
+ deterministic reclaim properties that make it preferable to a higher
+ density approach when reclaim will be used.

config ZSMALLOC
tristate "Memory allocator for compressed pages"
diff --git a/mm/Makefile b/mm/Makefile
index 2b6fff2..759db04 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
obj-$(CONFIG_PAGE_OWNER) += pageowner.o
+obj-$(CONFIG_ZPOOL) += zpool.o
obj-$(CONFIG_ZBUD) += zbud.o
obj-$(CONFIG_ZSMALLOC) += zsmalloc.o
obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o
diff --git a/mm/zpool.c b/mm/zpool.c
new file mode 100644
index 0000000..578c379
--- /dev/null
+++ b/mm/zpool.c
@@ -0,0 +1,198 @@
+/*
+ * zpool memory storage api
+ *
+ * Copyright (C) 2014 Dan Streetman
+ *
+ * This is a common frontend for memory storage pool implementations.
+ * Typically, this is used to store compressed memory.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/zpool.h>
+
+struct zpool {
+ char *type;
+
+ struct zpool_driver *driver;
+ void *pool;
+ struct zpool_ops *ops;
+
+ struct list_head list;
+};
+
+static LIST_HEAD(drivers_head);
+static DEFINE_SPINLOCK(drivers_lock);
+
+static LIST_HEAD(pools_head);
+static DEFINE_SPINLOCK(pools_lock);
+
+void zpool_register_driver(struct zpool_driver *driver)
+{
+ spin_lock(&drivers_lock);
+ list_add(&driver->list, &drivers_head);
+ spin_unlock(&drivers_lock);
+}
+EXPORT_SYMBOL(zpool_register_driver);
+
+void zpool_unregister_driver(struct zpool_driver *driver)
+{
+ spin_lock(&drivers_lock);
+ list_del(&driver->list);
+ spin_unlock(&drivers_lock);
+}
+EXPORT_SYMBOL(zpool_unregister_driver);
+
+int zpool_evict(void *pool, unsigned long handle)
+{
+ struct zpool *zpool;
+
+ spin_lock(&pools_lock);
+ list_for_each_entry(zpool, &pools_head, list) {
+ if (zpool->pool == pool) {
+ spin_unlock(&pools_lock);
+ if (!zpool->ops || !zpool->ops->evict)
+ return -EINVAL;
+ return zpool->ops->evict(zpool, handle);
+ }
+ }
+ spin_unlock(&pools_lock);
+
+ return -ENOENT;
+}
+EXPORT_SYMBOL(zpool_evict);
+
+static struct zpool_driver *zpool_get_driver(char *type)
+{
+ struct zpool_driver *driver;
+
+ assert_spin_locked(&drivers_lock);
+ list_for_each_entry(driver, &drivers_head, list) {
+ if (!strcmp(driver->type, type))
+ return driver;
+ }
+
+ return NULL;
+}
+
+struct zpool *zpool_create_pool(char *type, gfp_t flags,
+ struct zpool_ops *ops)
+{
+ struct zpool_driver *driver;
+ struct zpool *zpool;
+
+ pr_info("creating pool type %s\n", type);
+
+ spin_lock(&drivers_lock);
+ driver = zpool_get_driver(type);
+ spin_unlock(&drivers_lock);
+
+ if (!driver) {
+ request_module(type);
+ spin_lock(&drivers_lock);
+ driver = zpool_get_driver(type);
+ spin_unlock(&drivers_lock);
+ }
+
+ if (!driver) {
+ pr_err("no driver for type %s\n", type);
+ return NULL;
+ }
+
+ zpool = kmalloc(sizeof(*zpool), GFP_KERNEL);
+ if (!zpool) {
+ pr_err("couldn't create zpool - out of memory\n");
+ return NULL;
+ }
+
+ zpool->type = driver->type;
+ zpool->driver = driver;
+ zpool->pool = driver->create(flags, ops);
+ zpool->ops = ops;
+
+ if (!zpool->pool) {
+ pr_err("couldn't create %s pool\n", type);
+ kfree(zpool);
+ return NULL;
+ }
+
+ pr_info("created %s pool\n", type);
+
+ spin_lock(&pools_lock);
+ list_add(&zpool->list, &pools_head);
+ spin_unlock(&pools_lock);
+
+ return zpool;
+}
+
+void zpool_destroy_pool(struct zpool *zpool)
+{
+ pr_info("destroying pool type %s\n", zpool->type);
+
+ spin_lock(&pools_lock);
+ list_del(&zpool->list);
+ spin_unlock(&pools_lock);
+ zpool->driver->destroy(zpool->pool);
+ kfree(zpool);
+}
+
+char *zpool_get_type(struct zpool *zpool)
+{
+ return zpool->type;
+}
+
+int zpool_malloc(struct zpool *zpool, size_t size, unsigned long *handle)
+{
+ return zpool->driver->malloc(zpool->pool, size, handle);
+}
+
+void zpool_free(struct zpool *zpool, unsigned long handle)
+{
+ zpool->driver->free(zpool->pool, handle);
+}
+
+int zpool_shrink(struct zpool *zpool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ return zpool->driver->shrink(zpool->pool, pages, reclaimed);
+}
+
+void *zpool_map_handle(struct zpool *zpool, unsigned long handle,
+ enum zpool_mapmode mapmode)
+{
+ return zpool->driver->map(zpool->pool, handle, mapmode);
+}
+
+void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
+{
+ zpool->driver->unmap(zpool->pool, handle);
+}
+
+u64 zpool_get_total_size(struct zpool *zpool)
+{
+ return zpool->driver->total_size(zpool->pool);
+}
+
+static int __init init_zpool(void)
+{
+ pr_info("loaded\n");
+ return 0;
+}
+
+static void __exit exit_zpool(void)
+{
+ pr_info("unloaded\n");
+}
+
+module_init(init_zpool);
+module_exit(exit_zpool);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Streetman <[email protected]>");
+MODULE_DESCRIPTION("Common API for compressed memory storage");
--
1.8.3.1

2014-06-02 22:20:34

by Dan Streetman

Subject: [PATCHv4 5/6] mm/zpool: update zswap to use zpool

Change zswap to use the zpool api instead of directly using zbud.
Add a boot-time param to allow selecting which zpool implementation
to use, with zbud as the default.
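
E.g. booting with zswap.zpool=zsmalloc on the kernel command line
selects zsmalloc instead of the zbud default (the parameter name
follows from the module_param_named() in the diff below).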

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Changes since v3 : https://lkml.org/lkml/2014/5/24/131
-use new parameters in call to zpool_shrink()

Changes since v2 : https://lkml.org/lkml/2014/5/7/894
-change zswap to select ZPOOL instead of ZBUD
-only try zbud default if not already tried

Changes since v1 https://lkml.org/lkml/2014/4/19/102
-since zpool fallback is removed, manually fall back to zbud if
specified type fails

mm/Kconfig | 2 +-
mm/zswap.c | 76 +++++++++++++++++++++++++++++++++++++-------------------------
2 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 00f7720..5ae0016 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -531,7 +531,7 @@ config ZSWAP
bool "Compressed cache for swap pages (EXPERIMENTAL)"
depends on FRONTSWAP && CRYPTO=y
select CRYPTO_LZO
- select ZBUD
+ select ZPOOL
default n
help
A lightweight compressed cache for swap pages. It takes
diff --git a/mm/zswap.c b/mm/zswap.c
index 1cc6770..67cf9d8 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -34,7 +34,7 @@
#include <linux/swap.h>
#include <linux/crypto.h>
#include <linux/mempool.h>
-#include <linux/zbud.h>
+#include <linux/zpool.h>

#include <linux/mm_types.h>
#include <linux/page-flags.h>
@@ -45,8 +45,8 @@
/*********************************
* statistics
**********************************/
-/* Number of memory pages used by the compressed pool */
-static u64 zswap_pool_pages;
+/* Total bytes used by the compressed storage */
+static u64 zswap_pool_total_size;
/* The number of compressed pages currently stored in zswap */
static atomic_t zswap_stored_pages = ATOMIC_INIT(0);

@@ -89,8 +89,13 @@ static unsigned int zswap_max_pool_percent = 20;
module_param_named(max_pool_percent,
zswap_max_pool_percent, uint, 0644);

-/* zbud_pool is shared by all of zswap backend */
-static struct zbud_pool *zswap_pool;
+/* Compressed storage to use */
+#define ZSWAP_ZPOOL_DEFAULT "zbud"
+static char *zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
+module_param_named(zpool, zswap_zpool_type, charp, 0444);
+
+/* zpool is shared by all of zswap backend */
+static struct zpool *zswap_pool;

/*********************************
* compression functions
@@ -168,7 +173,7 @@ static void zswap_comp_exit(void)
* be held while changing the refcount. Since the lock must
* be held, there is no reason to also make refcount atomic.
* offset - the swap offset for the entry. Index into the red-black tree.
- * handle - zbud allocation handle that stores the compressed page data
+ * handle - zpool allocation handle that stores the compressed page data
* length - the length in bytes of the compressed page data. Needed during
* decompression
*/
@@ -284,15 +289,15 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
}

/*
- * Carries out the common pattern of freeing and entry's zbud allocation,
+ * Carries out the common pattern of freeing an entry's zpool allocation,
* freeing the entry itself, and decrementing the number of stored pages.
*/
static void zswap_free_entry(struct zswap_entry *entry)
{
- zbud_free(zswap_pool, entry->handle);
+ zpool_free(zswap_pool, entry->handle);
zswap_entry_cache_free(entry);
atomic_dec(&zswap_stored_pages);
- zswap_pool_pages = zbud_get_pool_size(zswap_pool);
+ zswap_pool_total_size = zpool_get_total_size(zswap_pool);
}

/* caller must hold the tree lock */
@@ -409,7 +414,7 @@ cleanup:
static bool zswap_is_full(void)
{
return totalram_pages * zswap_max_pool_percent / 100 <
- zswap_pool_pages;
+ DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
}

/*********************************
@@ -525,7 +530,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
* the swap cache, the compressed version stored by zswap can be
* freed.
*/
-static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
+static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
{
struct zswap_header *zhdr;
swp_entry_t swpentry;
@@ -541,9 +546,9 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
};

/* extract swpentry from data */
- zhdr = zbud_map(pool, handle);
+ zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
swpentry = zhdr->swpentry; /* here */
- zbud_unmap(pool, handle);
+ zpool_unmap_handle(pool, handle);
tree = zswap_trees[swp_type(swpentry)];
offset = swp_offset(swpentry);

@@ -573,13 +578,13 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
case ZSWAP_SWAPCACHE_NEW: /* page is locked */
/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zbud_map(zswap_pool, entry->handle) +
- sizeof(struct zswap_header);
+ src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src,
entry->length, dst, &dlen);
kunmap_atomic(dst);
- zbud_unmap(zswap_pool, entry->handle);
+ zpool_unmap_handle(zswap_pool, entry->handle);
BUG_ON(ret);
BUG_ON(dlen != PAGE_SIZE);

@@ -652,7 +657,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* reclaim space if needed */
if (zswap_is_full()) {
zswap_pool_limit_hit++;
- if (zbud_reclaim_page(zswap_pool, 8)) {
+ if (zpool_shrink(zswap_pool, 1, NULL)) {
zswap_reject_reclaim_fail++;
ret = -ENOMEM;
goto reject;
@@ -679,7 +684,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* store */
len = dlen + sizeof(struct zswap_header);
- ret = zbud_alloc(zswap_pool, len, &handle);
+ ret = zpool_malloc(zswap_pool, len, &handle);
if (ret == -ENOSPC) {
zswap_reject_compress_poor++;
goto freepage;
@@ -688,11 +693,11 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
zswap_reject_alloc_fail++;
goto freepage;
}
- zhdr = zbud_map(zswap_pool, handle);
+ zhdr = zpool_map_handle(zswap_pool, handle, ZPOOL_MM_RW);
zhdr->swpentry = swp_entry(type, offset);
buf = (u8 *)(zhdr + 1);
memcpy(buf, dst, dlen);
- zbud_unmap(zswap_pool, handle);
+ zpool_unmap_handle(zswap_pool, handle);
put_cpu_var(zswap_dstmem);

/* populate entry */
@@ -715,7 +720,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* update stats */
atomic_inc(&zswap_stored_pages);
- zswap_pool_pages = zbud_get_pool_size(zswap_pool);
+ zswap_pool_total_size = zpool_get_total_size(zswap_pool);

return 0;

@@ -751,13 +756,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,

/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zbud_map(zswap_pool, entry->handle) +
- sizeof(struct zswap_header);
+ src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src, entry->length,
dst, &dlen);
kunmap_atomic(dst);
- zbud_unmap(zswap_pool, entry->handle);
+ zpool_unmap_handle(zswap_pool, entry->handle);
BUG_ON(ret);

spin_lock(&tree->lock);
@@ -810,7 +815,7 @@ static void zswap_frontswap_invalidate_area(unsigned type)
zswap_trees[type] = NULL;
}

-static struct zbud_ops zswap_zbud_ops = {
+static struct zpool_ops zswap_zpool_ops = {
.evict = zswap_writeback_entry
};

@@ -868,8 +873,8 @@ static int __init zswap_debugfs_init(void)
zswap_debugfs_root, &zswap_written_back_pages);
debugfs_create_u64("duplicate_entry", S_IRUGO,
zswap_debugfs_root, &zswap_duplicate_entry);
- debugfs_create_u64("pool_pages", S_IRUGO,
- zswap_debugfs_root, &zswap_pool_pages);
+ debugfs_create_u64("pool_total_size", S_IRUGO,
+ zswap_debugfs_root, &zswap_pool_total_size);
debugfs_create_atomic_t("stored_pages", S_IRUGO,
zswap_debugfs_root, &zswap_stored_pages);

@@ -894,17 +899,26 @@ static void __exit zswap_debugfs_exit(void) { }
**********************************/
static int __init init_zswap(void)
{
+ gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN;
+
if (!zswap_enabled)
return 0;

pr_info("loading zswap\n");

- zswap_pool = zbud_create_pool(__GFP_NORETRY | __GFP_NOWARN,
- &zswap_zbud_ops);
+ zswap_pool = zpool_create_pool(zswap_zpool_type, gfp, &zswap_zpool_ops);
+ if (!zswap_pool && strcmp(zswap_zpool_type, ZSWAP_ZPOOL_DEFAULT)) {
+ pr_info("%s zpool not available\n", zswap_zpool_type);
+ zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
+ zswap_pool = zpool_create_pool(zswap_zpool_type, gfp,
+ &zswap_zpool_ops);
+ }
if (!zswap_pool) {
- pr_err("zbud pool creation failed\n");
+ pr_err("%s zpool not available\n", zswap_zpool_type);
+ pr_err("zpool creation failed\n");
goto error;
}
+ pr_info("using %s pool\n", zswap_zpool_type);

if (zswap_entry_cache_create()) {
pr_err("entry cache creation failed\n");
@@ -928,7 +942,7 @@ pcpufail:
compfail:
zswap_entry_cache_destory();
cachefail:
- zbud_destroy_pool(zswap_pool);
+ zpool_destroy_pool(zswap_pool);
error:
return -ENOMEM;
}
--
1.8.3.1

2014-06-02 22:20:51

by Dan Streetman

Subject: [PATCHv2 6/6] mm/zpool: prevent zbud/zsmalloc from unloading when used

Add try_module_get() to zpool_create_pool(), and module_put() to
zpool_destroy_pool(). Without module usage counting, the driver module(s)
could be unloaded while their pool(s) were active, resulting in an oops
when zpool tried to access them.

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Changes since v1 : https://lkml.org/lkml/2014/5/24/134
-add owner field to struct zpool_driver, pointing to driver module
-move module usage counting from zbud/zsmalloc into zpool

include/linux/zpool.h | 5 +++++
mm/zbud.c | 1 +
mm/zpool.c | 22 +++++++++++++++-------
mm/zsmalloc.c | 1 +
4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/linux/zpool.h b/include/linux/zpool.h
index a528f7c..49bd02b 100644
--- a/include/linux/zpool.h
+++ b/include/linux/zpool.h
@@ -176,6 +176,7 @@ u64 zpool_get_total_size(struct zpool *pool);
*/
struct zpool_driver {
char *type;
+ struct module *owner;
struct list_head list;

void *(*create)(gfp_t gfp, struct zpool_ops *ops);
@@ -203,6 +204,10 @@ void zpool_register_driver(struct zpool_driver *driver);
/**
* zpool_unregister_driver() - unregister a zpool implementation.
* @driver: driver to unregister.
+ *
+ * Module usage counting is used to prevent using a driver
+ * while/after unloading. Please only call unregister from
+ * the module exit function.
*/
void zpool_unregister_driver(struct zpool_driver *driver);

diff --git a/mm/zbud.c b/mm/zbud.c
index 645379e..440bab7 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -184,6 +184,7 @@ u64 zbud_zpool_total_size(void *pool)

static struct zpool_driver zbud_zpool_driver = {
.type = "zbud",
+ .owner = THIS_MODULE,
.create = zbud_zpool_create,
.destroy = zbud_zpool_destroy,
.malloc = zbud_zpool_malloc,
diff --git a/mm/zpool.c b/mm/zpool.c
index 578c379..119f340 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -72,15 +72,24 @@ static struct zpool_driver *zpool_get_driver(char *type)
{
struct zpool_driver *driver;

- assert_spin_locked(&drivers_lock);
+ spin_lock(&drivers_lock);
list_for_each_entry(driver, &drivers_head, list) {
- if (!strcmp(driver->type, type))
- return driver;
+ if (!strcmp(driver->type, type)) {
+ bool got = try_module_get(driver->owner);
+ spin_unlock(&drivers_lock);
+ return got ? driver : NULL;
+ }
}

+ spin_unlock(&drivers_lock);
return NULL;
}

+static void zpool_put_driver(struct zpool_driver *driver)
+{
+ module_put(driver->owner);
+}
+
struct zpool *zpool_create_pool(char *type, gfp_t flags,
struct zpool_ops *ops)
{
@@ -89,15 +98,11 @@ struct zpool *zpool_create_pool(char *type, gfp_t flags,

pr_info("creating pool type %s\n", type);

- spin_lock(&drivers_lock);
driver = zpool_get_driver(type);
- spin_unlock(&drivers_lock);

if (!driver) {
request_module(type);
- spin_lock(&drivers_lock);
driver = zpool_get_driver(type);
- spin_unlock(&drivers_lock);
}

if (!driver) {
@@ -108,6 +113,7 @@ struct zpool *zpool_create_pool(char *type, gfp_t flags,
zpool = kmalloc(sizeof(*zpool), GFP_KERNEL);
if (!zpool) {
pr_err("couldn't create zpool - out of memory\n");
+ zpool_put_driver(driver);
return NULL;
}

@@ -118,6 +124,7 @@ struct zpool *zpool_create_pool(char *type, gfp_t flags,

if (!zpool->pool) {
pr_err("couldn't create %s pool\n", type);
+ zpool_put_driver(driver);
kfree(zpool);
return NULL;
}
@@ -139,6 +146,7 @@ void zpool_destroy_pool(struct zpool *zpool)
list_del(&zpool->list);
spin_unlock(&pools_lock);
zpool->driver->destroy(zpool->pool);
+ zpool_put_driver(zpool->driver);
kfree(zpool);
}

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index feba644..ae3a28f 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -303,6 +303,7 @@ u64 zs_zpool_total_size(void *pool)

static struct zpool_driver zs_zpool_driver = {
.type = "zsmalloc",
+ .owner = THIS_MODULE,
.create = zs_zpool_create,
.destroy = zs_zpool_destroy,
.malloc = zs_zpool_malloc,
--
1.8.3.1

2014-06-02 22:20:36

by Dan Streetman

Subject: [PATCHv2 4/6] mm/zpool: zbud/zsmalloc implement zpool

Update zbud and zsmalloc to implement the zpool api.

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Note to Seth: We talked about removing the retries parameter from
zbud_reclaim_page(), but I did not include that in this patch.
I'll send a separate patch for that.

Changes since v1 : https://lkml.org/lkml/2014/5/24/136
-Update zbud_zpool_shrink() to call zbud_reclaim_page()
in a loop until number of pages requested has been
reclaimed, or error
-Update zbud_zpool_shrink() to update passed *reclaimed
param with # pages actually reclaimed
-Update zs_pool_shrink() with new param, although function
is not implemented yet

mm/zbud.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mm/zsmalloc.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 174 insertions(+)

diff --git a/mm/zbud.c b/mm/zbud.c
index dd13665..645379e 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -51,6 +51,7 @@
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/zbud.h>
+#include <linux/zpool.h>

/*****************
* Structures
@@ -114,6 +115,88 @@ struct zbud_header {
};

/*****************
+ * zpool
+ ****************/
+
+#ifdef CONFIG_ZPOOL
+
+static int zbud_zpool_evict(struct zbud_pool *pool, unsigned long handle)
+{
+ return zpool_evict(pool, handle);
+}
+
+static struct zbud_ops zbud_zpool_ops = {
+ .evict = zbud_zpool_evict
+};
+
+static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
+{
+ return zbud_create_pool(gfp, &zbud_zpool_ops);
+}
+
+void zbud_zpool_destroy(void *pool)
+{
+ zbud_destroy_pool(pool);
+}
+
+int zbud_zpool_malloc(void *pool, size_t size, unsigned long *handle)
+{
+ return zbud_alloc(pool, size, handle);
+}
+void zbud_zpool_free(void *pool, unsigned long handle)
+{
+ zbud_free(pool, handle);
+}
+
+int zbud_zpool_shrink(void *pool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ unsigned int total = 0;
+ int ret = -EINVAL;
+
+ while (total < pages) {
+ ret = zbud_reclaim_page(pool, 8);
+ if (ret < 0)
+ break;
+ total++;
+ }
+
+ if (reclaimed)
+ *reclaimed = total;
+
+ return ret;
+}
+
+void *zbud_zpool_map(void *pool, unsigned long handle,
+ enum zpool_mapmode mm)
+{
+ return zbud_map(pool, handle);
+}
+void zbud_zpool_unmap(void *pool, unsigned long handle)
+{
+ zbud_unmap(pool, handle);
+}
+
+u64 zbud_zpool_total_size(void *pool)
+{
+ return zbud_get_pool_size(pool) * PAGE_SIZE;
+}
+
+static struct zpool_driver zbud_zpool_driver = {
+ .type = "zbud",
+ .create = zbud_zpool_create,
+ .destroy = zbud_zpool_destroy,
+ .malloc = zbud_zpool_malloc,
+ .free = zbud_zpool_free,
+ .shrink = zbud_zpool_shrink,
+ .map = zbud_zpool_map,
+ .unmap = zbud_zpool_unmap,
+ .total_size = zbud_zpool_total_size,
+};
+
+#endif /* CONFIG_ZPOOL */
+
+/*****************
* Helpers
*****************/
/* Just to make the code easier to read */
@@ -513,11 +596,20 @@ static int __init init_zbud(void)
/* Make sure the zbud header will fit in one chunk */
BUILD_BUG_ON(sizeof(struct zbud_header) > ZHDR_SIZE_ALIGNED);
pr_info("loaded\n");
+
+#ifdef CONFIG_ZPOOL
+ zpool_register_driver(&zbud_zpool_driver);
+#endif
+
return 0;
}

static void __exit exit_zbud(void)
{
+#ifdef CONFIG_ZPOOL
+ zpool_unregister_driver(&zbud_zpool_driver);
+#endif
+
pr_info("unloaded\n");
}

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fe78189..feba644 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -92,6 +92,7 @@
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/zsmalloc.h>
+#include <linux/zpool.h>

/*
* This must be a power of 2 and greater than or equal to sizeof(link_free).
@@ -240,6 +241,79 @@ struct mapping_area {
enum zs_mapmode vm_mm; /* mapping mode */
};

+/* zpool driver */
+
+#ifdef CONFIG_ZPOOL
+
+static void *zs_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
+{
+ return zs_create_pool(gfp);
+}
+
+void zs_zpool_destroy(void *pool)
+{
+ zs_destroy_pool(pool);
+}
+
+int zs_zpool_malloc(void *pool, size_t size, unsigned long *handle)
+{
+ *handle = zs_malloc(pool, size);
+ return *handle ? 0 : -1;
+}
+void zs_zpool_free(void *pool, unsigned long handle)
+{
+ zs_free(pool, handle);
+}
+
+int zs_zpool_shrink(void *pool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ return -EINVAL;
+}
+
+void *zs_zpool_map(void *pool, unsigned long handle,
+ enum zpool_mapmode mm)
+{
+ enum zs_mapmode zs_mm;
+
+ switch (mm) {
+ case ZPOOL_MM_RO:
+ zs_mm = ZS_MM_RO;
+ break;
+ case ZPOOL_MM_WO:
+ zs_mm = ZS_MM_WO;
+ break;
+ case ZPOOL_MM_RW: /* fallthru */
+ default:
+ zs_mm = ZS_MM_RW;
+ break;
+ }
+
+ return zs_map_object(pool, handle, zs_mm);
+}
+void zs_zpool_unmap(void *pool, unsigned long handle)
+{
+ zs_unmap_object(pool, handle);
+}
+
+u64 zs_zpool_total_size(void *pool)
+{
+ return zs_get_total_size_bytes(pool);
+}
+
+static struct zpool_driver zs_zpool_driver = {
+ .type = "zsmalloc",
+ .create = zs_zpool_create,
+ .destroy = zs_zpool_destroy,
+ .malloc = zs_zpool_malloc,
+ .free = zs_zpool_free,
+ .shrink = zs_zpool_shrink,
+ .map = zs_zpool_map,
+ .unmap = zs_zpool_unmap,
+ .total_size = zs_zpool_total_size,
+};
+
+#endif /* CONFIG_ZPOOL */

/* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
static DEFINE_PER_CPU(struct mapping_area, zs_map_area);
@@ -814,6 +888,10 @@ static void zs_exit(void)
{
int cpu;

+#ifdef CONFIG_ZPOOL
+ zpool_unregister_driver(&zs_zpool_driver);
+#endif
+
cpu_notifier_register_begin();

for_each_online_cpu(cpu)
@@ -840,6 +918,10 @@ static int zs_init(void)

cpu_notifier_register_done();

+#ifdef CONFIG_ZPOOL
+ zpool_register_driver(&zs_zpool_driver);
+#endif
+
return 0;
fail:
zs_exit();
--
1.8.3.1

2014-06-02 22:21:34

by Dan Streetman

Subject: [PATCH 2/6] mm/zbud: change zbud_alloc size type to size_t

Change the type of the zbud_alloc() size param from unsigned int
to size_t.

Technically, this should not make any difference, as the zbud
implementation already restricts the size to well within either
type's limits; but as zsmalloc (and kmalloc) use size_t, and
zpool will use size_t, this brings the size parameter type
in line with zsmalloc/zpool.

Signed-off-by: Dan Streetman <[email protected]>
Acked-by: Seth Jennings <[email protected]>
Cc: Weijie Yang <[email protected]>
---

No change since v1 : https://lkml.org/lkml/2014/5/7/757

include/linux/zbud.h | 2 +-
mm/zbud.c | 5 ++---
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/zbud.h b/include/linux/zbud.h
index 0b2534e..1e9cb57 100644
--- a/include/linux/zbud.h
+++ b/include/linux/zbud.h
@@ -11,7 +11,7 @@ struct zbud_ops {

struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops);
void zbud_destroy_pool(struct zbud_pool *pool);
-int zbud_alloc(struct zbud_pool *pool, unsigned int size,
+int zbud_alloc(struct zbud_pool *pool, size_t size,
unsigned long *handle);
void zbud_free(struct zbud_pool *pool, unsigned long handle);
int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries);
diff --git a/mm/zbud.c b/mm/zbud.c
index 847c01c..dd13665 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -123,7 +123,7 @@ enum buddy {
};

/* Converts an allocation size in bytes to size in zbud chunks */
-static int size_to_chunks(int size)
+static int size_to_chunks(size_t size)
{
return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;
}
@@ -250,8 +250,7 @@ void zbud_destroy_pool(struct zbud_pool *pool)
* -EINVAL if the @size is 0, or -ENOMEM if the pool was unable to
* allocate a new page.
*/
-int zbud_alloc(struct zbud_pool *pool, unsigned int size,
- unsigned long *handle)
+int zbud_alloc(struct zbud_pool *pool, size_t size, unsigned long *handle)
{
int chunks, i, freechunks;
struct zbud_header *zhdr = NULL;
--
1.8.3.1

2014-06-04 01:38:59

by Bob Liu

Subject: Re: [PATCHv4 0/6] mm/zpool: add common api for zswap to use zbud/zsmalloc



On 06/03/2014 06:19 AM, Dan Streetman wrote:
> In order to allow zswap users to choose between zbud and zsmalloc for
> the compressed storage pool, this patch set adds a new api "zpool" that
> provides an interface to both zbud and zsmalloc. Only minor changes
> to zbud's interface were needed. This does not include implementing
> shrinking in zsmalloc, which will be sent separately.
>
> I believe Seth originally was using zsmalloc for swap, but there were
> concerns about how significant the impact of shrinking zsmalloc would
> be when zswap had to start reclaiming pages. That still may be an
> issue, but this at least allows users to choose themselves whether
> they want a lower-density or higher-density compressed storage medium.
> At least for situations where zswap reclaim is never or rarely reached,
> it probably makes sense to use the higher density of zsmalloc.
>

Nice job!
I also made an attempt last year, but didn't finish.

> Note this patch set does not change zram to use zpool, although that
> change should be possible as well.
>

This version looks good to me!

Reviewed-by: Bob Liu <[email protected]>

> ---
>
> Changes since v3 : https://lkml.org/lkml/2014/5/24/130
> -In zpool_shrink() use # pages instead of # bytes
> -Add reclaimed param to zpool_shrink() to indicate to caller
> # pages actually reclaimed
> -move module usage counting to zpool, from zbud/zsmalloc
> -update zbud_zpool_shrink() to call zbud_reclaim_page() in a
> loop until requested # pages have been reclaimed (or error)
>
> Changes since v2 : https://lkml.org/lkml/2014/5/7/927
> -Change zpool to use driver registration instead of hardcoding
> implementations
> -Add module use counting in zbud/zsmalloc
>
> Changes since v1 https://lkml.org/lkml/2014/4/19/97
> -remove zsmalloc shrinking
> -change zbud size param type from unsigned int to size_t
> -remove zpool fallback creation
> -zswap manually falls back to zbud if specified type fails
>
>
> Dan Streetman (6):
> mm/zbud: zbud_alloc() minor param change
> mm/zbud: change zbud_alloc size type to size_t
> mm/zpool: implement common zpool api to zbud/zsmalloc
> mm/zpool: zbud/zsmalloc implement zpool
> mm/zpool: update zswap to use zpool
> mm/zpool: prevent zbud/zsmalloc from unloading when used
>
> include/linux/zbud.h | 2 +-
> include/linux/zpool.h | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++
> mm/Kconfig | 43 ++++++----
> mm/Makefile | 1 +
> mm/zbud.c | 123 +++++++++++++++++++++++----
> mm/zpool.c | 206 ++++++++++++++++++++++++++++++++++++++++++++++
> mm/zsmalloc.c | 83 +++++++++++++++++++
> mm/zswap.c | 76 ++++++++++-------
> 8 files changed, 694 insertions(+), 64 deletions(-)
> create mode 100644 include/linux/zpool.h
> create mode 100644 mm/zpool.c
>

--
Regards,
-Bob

2014-06-06 21:01:24

by Seth Jennings

Subject: Re: [PATCHv4 0/6] mm/zpool: add common api for zswap to use zbud/zsmalloc

On Mon, Jun 02, 2014 at 06:19:40PM -0400, Dan Streetman wrote:
> In order to allow zswap users to choose between zbud and zsmalloc for
> the compressed storage pool, this patch set adds a new api "zpool" that
> provides an interface to both zbud and zsmalloc. Only minor changes
> to zbud's interface were needed. This does not include implementing
> shrinking in zsmalloc, which will be sent separately.
>
> I believe Seth originally was using zsmalloc for swap, but there were
> concerns about how significant the impact of shrinking zsmalloc would
> be when zswap had to start reclaiming pages. That still may be an
> issue, but this at least allows users to choose themselves whether
> they want a lower-density or higher-density compressed storage medium.
> At least for situations where zswap reclaim is never or rarely reached,
> it probably makes sense to use the higher density of zsmalloc.
>
> Note this patch set does not change zram to use zpool, although that
> change should be possible as well.

This looks good! Much better than when we first started :) Thanks Dan.
I haven't had a chance to test it out yet so I'm going to wait to Ack it
until then, which might be as late as 6/16 due to a vacation and a
conference.

Thanks,
Seth

>
> ---
>
> Changes since v3 : https://lkml.org/lkml/2014/5/24/130
> -In zpool_shrink() use # pages instead of # bytes
> -Add reclaimed param to zpool_shrink() to indicate to caller
> # pages actually reclaimed
> -move module usage counting to zpool, from zbud/zsmalloc
> -update zbud_zpool_shrink() to call zbud_reclaim_page() in a
> loop until requested # pages have been reclaimed (or error)
>
> Changes since v2 : https://lkml.org/lkml/2014/5/7/927
> -Change zpool to use driver registration instead of hardcoding
> implementations
> -Add module use counting in zbud/zsmalloc
>
> Changes since v1 https://lkml.org/lkml/2014/4/19/97
> -remove zsmalloc shrinking
> -change zbud size param type from unsigned int to size_t
> -remove zpool fallback creation
> -zswap manually falls back to zbud if specified type fails
>
>
> Dan Streetman (6):
> mm/zbud: zbud_alloc() minor param change
> mm/zbud: change zbud_alloc size type to size_t
> mm/zpool: implement common zpool api to zbud/zsmalloc
> mm/zpool: zbud/zsmalloc implement zpool
> mm/zpool: update zswap to use zpool
> mm/zpool: prevent zbud/zsmalloc from unloading when used
>
> include/linux/zbud.h | 2 +-
> include/linux/zpool.h | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++
> mm/Kconfig | 43 ++++++----
> mm/Makefile | 1 +
> mm/zbud.c | 123 +++++++++++++++++++++++----
> mm/zpool.c | 206 ++++++++++++++++++++++++++++++++++++++++++++++
> mm/zsmalloc.c | 83 +++++++++++++++++++
> mm/zswap.c | 76 ++++++++++-------
> 8 files changed, 694 insertions(+), 64 deletions(-)
> create mode 100644 include/linux/zpool.h
> create mode 100644 mm/zpool.c
>
> --
> 1.8.3.1
>

2014-06-23 21:19:29

by Andrew Morton

Subject: Re: [PATCHv2 1/6] mm/zbud: zbud_alloc() minor param change

On Mon, 2 Jun 2014 18:19:41 -0400 Dan Streetman <[email protected]> wrote:

> Change zbud to store gfp_t flags passed at pool creation to use for
> each alloc; this allows the api to be closer to the existing zsmalloc
> interface, and the only current zbud user (zswap) uses the same gfp
> flags for all allocs. Update zswap to use changed interface.

This would appear to be a step backwards. There's nothing wrong with
requiring all callers to pass in a gfp_t, and removing this option makes
the API less usable.

IMO the patch needs much better justification, or dropping.

2014-06-23 21:46:16

by Andrew Morton

Subject: Re: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

On Mon, 2 Jun 2014 18:19:43 -0400 Dan Streetman <[email protected]> wrote:

> Add zpool api.
>
> zpool provides an interface for memory storage, typically of compressed
> memory. Users can select what backend to use; currently the only
> implementations are zbud, a low density implementation with up to
> two compressed pages per storage page, and zsmalloc, a higher density
> implementation with multiple compressed pages per storage page.
>
> ...
>
> +/**
> + * zpool_create_pool() - Create a new zpool
> + * @type The type of the zpool to create (e.g. zbud, zsmalloc)
> + * @flags What GFP flags should be used when the zpool allocates memory.
> + * @ops The optional ops callback.
> + *
> + * This creates a new zpool of the specified type. The zpool will use the
> + * given flags when allocating any memory. If the ops param is NULL, then
> + * the created zpool will not be shrinkable.
> + *
> + * Returns: New zpool on success, NULL on failure.
> + */
> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
> + struct zpool_ops *ops);

It is unconventional to document the API in the .h file. It's better
to put the documentation where people expect to find it.

It's irritating for me (for example) because this kernel convention has
permitted me to train my tags system to ignore prototypes in headers.
But if I want to find the zpool_create_pool documentation I will need
to jump through hoops.

>
> ...
>
> +int zpool_evict(void *pool, unsigned long handle)
> +{
> + struct zpool *zpool;
> +
> + spin_lock(&pools_lock);
> + list_for_each_entry(zpool, &pools_head, list) {
> + if (zpool->pool == pool) {
> + spin_unlock(&pools_lock);

This is racy against zpool_unregister_driver().

> + if (!zpool->ops || !zpool->ops->evict)
> + return -EINVAL;
> + return zpool->ops->evict(zpool, handle);
> + }
> + }
> + spin_unlock(&pools_lock);
> +
> + return -ENOENT;
> +}
> +EXPORT_SYMBOL(zpool_evict);
> +
> +static struct zpool_driver *zpool_get_driver(char *type)

In kernel convention, "get" implies "take a reference upon". A better
name would be zpool_find_driver or zpool_lookup_driver.

This is especially important because the code appears to need a
for-real zpool_get_driver to fix the races!

>
> ...
>
> +
> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
> + struct zpool_ops *ops)
> +{
> + struct zpool_driver *driver;
> + struct zpool *zpool;
> +
> + pr_info("creating pool type %s\n", type);
> +
> + spin_lock(&drivers_lock);
> + driver = zpool_get_driver(type);
> + spin_unlock(&drivers_lock);

Racy against unregister. Can be solved with a standard get/put
refcounting implementation. Or perhaps a big fat mutex.
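
For example, take the reference while still holding the lock. A
sketch, assuming an owner field on the driver (as a later patch in
this series adds):

    struct zpool_driver *d, *driver = NULL;

    spin_lock(&drivers_lock);
    list_for_each_entry(d, &drivers_head, list) {
        if (!strcmp(d->type, type) && try_module_get(d->owner)) {
            driver = d;
            break;
        }
    }
    spin_unlock(&drivers_lock);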

> + if (!driver) {
> + request_module(type);
> + spin_lock(&drivers_lock);
> + driver = zpool_get_driver(type);
> + spin_unlock(&drivers_lock);
> + }
> +
> + if (!driver) {
> + pr_err("no driver for type %s\n", type);
> + return NULL;
> + }
> +
> + zpool = kmalloc(sizeof(*zpool), GFP_KERNEL);
> + if (!zpool) {
> + pr_err("couldn't create zpool - out of memory\n");
> + return NULL;
> + }
> +
> + zpool->type = driver->type;
> + zpool->driver = driver;
> + zpool->pool = driver->create(flags, ops);
> + zpool->ops = ops;
> +
> + if (!zpool->pool) {
> + pr_err("couldn't create %s pool\n", type);
> + kfree(zpool);
> + return NULL;
> + }
> +
> + pr_info("created %s pool\n", type);
> +
> + spin_lock(&pools_lock);
> + list_add(&zpool->list, &pools_head);
> + spin_unlock(&pools_lock);
> +
> + return zpool;
> +}
>
> ...
>
> +void zpool_destroy_pool(struct zpool *zpool)
> +{
> + pr_info("destroying pool type %s\n", zpool->type);
> +
> + spin_lock(&pools_lock);
> + list_del(&zpool->list);
> + spin_unlock(&pools_lock);
> + zpool->driver->destroy(zpool->pool);
> + kfree(zpool);
> +}

What are the lifecycle rules here? How do we know that nobody else can
be concurrently using this pool?

>
> ...
>

2014-06-23 21:48:35

by Andrew Morton

Subject: Re: [PATCHv2 6/6] mm/zpool: prevent zbud/zsmalloc from unloading when used

On Mon, 2 Jun 2014 18:19:46 -0400 Dan Streetman <[email protected]> wrote:

> Add try_module_get() to zpool_create_pool(), and module_put() to
> zpool_destroy_pool(). Without module usage counting, the driver module(s)
> could be unloaded while their pool(s) were active, resulting in an oops
> when zpool tried to access them.

Was wondering about that ;) We may as well fold
this fix into "mm/zpool: implement common zpool api to zbud/zsmalloc"?

2014-06-24 15:24:38

by Dan Streetman

Subject: Re: [PATCHv2 1/6] mm/zbud: zbud_alloc() minor param change

On Mon, Jun 23, 2014 at 5:19 PM, Andrew Morton
<[email protected]> wrote:
> On Mon, 2 Jun 2014 18:19:41 -0400 Dan Streetman <[email protected]> wrote:
>
>> Change zbud to store gfp_t flags passed at pool creation to use for
>> each alloc; this allows the api to be closer to the existing zsmalloc
>> interface, and the only current zbud user (zswap) uses the same gfp
>> flags for all allocs. Update zswap to use changed interface.
>
> This would appear to be a step backwards. There's nothing wrong with
> requiring all callers to pass in a gfp_t and removing this option makes
> the API less usable.
>
> IMO the patch needs much better justification, or dropping.

Well, since zpool can be backed by either zsmalloc or zbud, those 2
apis have to be consistent, and currently zbud does use a per-malloc
gfp_t param while zsmalloc doesn't. Does it make more sense to add a
gfp_t param to zsmalloc's alloc function?


I wonder, though, if allowing the caller to pass a gfp_t for each
alloc really does make sense. Any memory alloc'ed isn't actually
controllable by the caller, and in fact it's currently impossible for
the caller to free memory alloc'ed by the backing pool - the caller
can invalidate specific handles, but that doesn't guarantee the memory
alloc'ed for that handle will then be freed - it could remain in use
by some other handle(s). Additionally, there's no guarantee that when
the user creates a new handle, any new memory will be allocated -
a previously available handle could be used.

So I guess what I'm suggesting is that because 1) there is no
guarantee that a call to zpool_malloc() will actually call kmalloc()
with the provided gfp_t; previously kmalloc'ed memory with a different
gfp_t could be (and probably in many cases will be) used, and 2) the
caller has no way to free any memory kmalloc'ed with specific gfp_t
(so for example, using GFP_ATOMIC would be a bad idea, since the
caller couldn't then free that memory directly), it makes more sense
to me to keep all allocations in the pool using the same gfp_t flags.
If there was a need to create pool handles using different gfp_t
flags, it would probably be more effective to create multiple pools,
each with the desired gfp_t flags to use.
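
To illustrate - just a sketch against the zpool api in this set,
where "my_ops" is a made-up placeholder for the caller's zpool_ops -
a caller needing two different allocation behaviors would create two
pools, each with its own flags:

	struct zpool *pool_a, *pool_b;

	/* each pool keeps its creation gfp flags for all of its
	 * allocs; error handling omitted for brevity */
	pool_a = zpool_create_pool("zbud", GFP_KERNEL, &my_ops);
	pool_b = zpool_create_pool("zsmalloc",
			GFP_NOWAIT | __GFP_NOWARN, &my_ops);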

However, from the implementation side, it's trivial to change zsmalloc
to just add a gfp_t param to its alloc, and to update zpool_malloc()
to accept and pass through the gfp_t param. So if that still makes
more sense to you, I can update things to add the param to the
zsmalloc api, instead of this patch which removes the param from
zbud's api - assuming that Minchan and Nitin also have no problem with
updating the zsmalloc api. There should be no functional difference
in the zram/zsmalloc relationship, since zram would simply always pass
the same gfp_t to zsmalloc.
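
For reference, that alternative would just be a signature change on
both sides - a sketch, not actual patches:

	/* zsmalloc: take the gfp per alloc */
	unsigned long zs_malloc(struct zs_pool *pool, size_t size,
			gfp_t gfp);

	/* zpool: accept and pass through the per-alloc gfp */
	int zpool_malloc(struct zpool *pool, size_t size, gfp_t gfp,
			unsigned long *handle);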

2014-06-24 15:39:40

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

On Mon, Jun 23, 2014 at 5:46 PM, Andrew Morton
<[email protected]> wrote:
> On Mon, 2 Jun 2014 18:19:43 -0400 Dan Streetman <[email protected]> wrote:
>
>> Add zpool api.
>>
>> zpool provides an interface for memory storage, typically of compressed
>> memory. Users can select what backend to use; currently the only
>> implementations are zbud, a low density implementation with up to
>> two compressed pages per storage page, and zsmalloc, a higher density
>> implementation with multiple compressed pages per storage page.
>>
>> ...
>>
>> +/**
>> + * zpool_create_pool() - Create a new zpool
>> + * @type The type of the zpool to create (e.g. zbud, zsmalloc)
>> + * @flags What GFP flags should be used when the zpool allocates memory.
>> + * @ops The optional ops callback.
>> + *
>> + * This creates a new zpool of the specified type. The zpool will use the
>> + * given flags when allocating any memory. If the ops param is NULL, then
>> + * the created zpool will not be shrinkable.
>> + *
>> + * Returns: New zpool on success, NULL on failure.
>> + */
>> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
>> + struct zpool_ops *ops);
>
> It is unconventional to document the API in the .h file. It's better
> to put the documentation where people expect to find it.
>
> It's irritating for me (for example) because this kernel convention has
> permitted me to train my tags system to ignore prototypes in headers.
> But if I want to find the zpool_create_pool documentation I will need
> to jump through hoops.

Got it, I will move it to the .c file.

I noticed you pulled these into -mm, do you want me to send follow-on
patches for these changes, or actually update the original patches and
resend the patch set?


>
>>
>> ...
>>
>> +int zpool_evict(void *pool, unsigned long handle)
>> +{
>> + struct zpool *zpool;
>> +
>> + spin_lock(&pools_lock);
>> + list_for_each_entry(zpool, &pools_head, list) {
>> + if (zpool->pool == pool) {
>> + spin_unlock(&pools_lock);
>
> This is racy against zpool_unregister_driver().
>
>> + if (!zpool->ops || !zpool->ops->evict)
>> + return -EINVAL;
>> + return zpool->ops->evict(zpool, handle);
>> + }
>> + }
>> + spin_unlock(&pools_lock);
>> +
>> + return -ENOENT;
>> +}
>> +EXPORT_SYMBOL(zpool_evict);
>> +
>> +static struct zpool_driver *zpool_get_driver(char *type)
>
> In kernel convention, "get" implies "take a reference upon". A better
> name would be zpool_find_driver or zpool_lookup_driver.
>
> This is especially important because the code appears to need a
> for-real zpool_get_driver to fix the races!

Yep, as you mentioned in your next email, I will roll the
try_module_get() protection into this patch.

>
>>
>> ...
>>
>> +
>> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
>> + struct zpool_ops *ops)
>> +{
>> + struct zpool_driver *driver;
>> + struct zpool *zpool;
>> +
>> + pr_info("creating pool type %s\n", type);
>> +
>> + spin_lock(&drivers_lock);
>> + driver = zpool_get_driver(type);
>> + spin_unlock(&drivers_lock);
>
> Racy against unregister. Can be solved with a standard get/put
> refcounting implementation. Or perhaps a big fat mutex.
>
>> + if (!driver) {
>> + request_module(type);
>> + spin_lock(&drivers_lock);
>> + driver = zpool_get_driver(type);
>> + spin_unlock(&drivers_lock);
>> + }
>> +
>> + if (!driver) {
>> + pr_err("no driver for type %s\n", type);
>> + return NULL;
>> + }
>> +
>> + zpool = kmalloc(sizeof(*zpool), GFP_KERNEL);
>> + if (!zpool) {
>> + pr_err("couldn't create zpool - out of memory\n");
>> + return NULL;
>> + }
>> +
>> + zpool->type = driver->type;
>> + zpool->driver = driver;
>> + zpool->pool = driver->create(flags, ops);
>> + zpool->ops = ops;
>> +
>> + if (!zpool->pool) {
>> + pr_err("couldn't create %s pool\n", type);
>> + kfree(zpool);
>> + return NULL;
>> + }
>> +
>> + pr_info("created %s pool\n", type);
>> +
>> + spin_lock(&pools_lock);
>> + list_add(&zpool->list, &pools_head);
>> + spin_unlock(&pools_lock);
>> +
>> + return zpool;
>> +}
>>
>> ...
>>
>> +void zpool_destroy_pool(struct zpool *zpool)
>> +{
>> + pr_info("destroying pool type %s\n", zpool->type);
>> +
>> + spin_lock(&pools_lock);
>> + list_del(&zpool->list);
>> + spin_unlock(&pools_lock);
>> + zpool->driver->destroy(zpool->pool);
>> + kfree(zpool);
>> +}
>
> What are the lifecycle rules here? How do we know that nobody else can
> be concurrently using this pool?

Well I think with zpools, as well as direct use of zsmalloc and zbud
pools, whoever creates a pool is responsible for making sure it's no
longer in use before destroying it. I think in most use cases, pool
creators won't be sharing their pools, so there should be no issue
with concurrent use. In fact, concurrent pool use is probably a bad
idea in general - zsmalloc for example relies on per-cpu data during
handle mapping, so concurrent use of a single pool might result in the
per-cpu data being overwritten if multiple users of a single pool
tried to map and use different handles from the same cpu.

Should some use/sharing restrictions be added to the zpool documentation?

>
>>
>> ...
>>
>

2014-06-24 15:41:58

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv2 6/6] mm/zpool: prevent zbud/zsmalloc from unloading when used

On Mon, Jun 23, 2014 at 5:48 PM, Andrew Morton
<[email protected]> wrote:
> On Mon, 2 Jun 2014 18:19:46 -0400 Dan Streetman <[email protected]> wrote:
>
>> Add try_module_get() to zpool_create_pool(), and module_put() to
>> zpool_destroy_pool(). Without module usage counting, the driver module(s)
>> could be unloaded while their pool(s) were active, resulting in an oops
>> when zpool tried to access them.
>
> Was wondering about that ;) We may as well fold
> this fix into "mm/zpool: implement common zpool api to zbud/zsmalloc"?

Yes. Sorry, I had this pulled out of that because I was trying to
keep the patches logically separated. But they do need to be
together, to be safe ;-)

2014-06-24 23:09:00

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

On Tue, 24 Jun 2014 11:39:12 -0400 Dan Streetman <[email protected]> wrote:

> On Mon, Jun 23, 2014 at 5:46 PM, Andrew Morton
> <[email protected]> wrote:
> > On Mon, 2 Jun 2014 18:19:43 -0400 Dan Streetman <[email protected]> wrote:
> >
> >> Add zpool api.
> >>
> >> zpool provides an interface for memory storage, typically of compressed
> >> memory. Users can select what backend to use; currently the only
> >> implementations are zbud, a low density implementation with up to
> >> two compressed pages per storage page, and zsmalloc, a higher density
> >> implementation with multiple compressed pages per storage page.
> >>
> >> ...
> >>
> >> +/**
> >> + * zpool_create_pool() - Create a new zpool
> >> + * @type The type of the zpool to create (e.g. zbud, zsmalloc)
> >> + * @flags What GFP flags should be used when the zpool allocates memory.
> >> + * @ops The optional ops callback.
> >> + *
> >> + * This creates a new zpool of the specified type. The zpool will use the
> >> + * given flags when allocating any memory. If the ops param is NULL, then
> >> + * the created zpool will not be shrinkable.
> >> + *
> >> + * Returns: New zpool on success, NULL on failure.
> >> + */
> >> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
> >> + struct zpool_ops *ops);
> >
> > It is unconventional to document the API in the .h file. It's better
> > to put the documentation where people expect to find it.
> >
> > It's irritating for me (for example) because this kernel convention has
> > permitted me to train my tags system to ignore prototypes in headers.
> > But if I want to find the zpool_create_pool documentation I will need
> > to jump through hoops.
>
> Got it, I will move it to the .c file.
>
> I noticed you pulled these into -mm, do you want me to send follow-on
> patches for these changes, or actually update the original patches and
> resend the patch set?

Full resend, I guess. I often add things which are
not-quite-fully-baked to give them a bit of testing, check for
integration with other changes, etc.

> >
> >>
> >> ...
> >>
> >> +
> >> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
> >> + struct zpool_ops *ops)
> >> +{
> >> + struct zpool_driver *driver;
> >> + struct zpool *zpool;
> >> +
> >> + pr_info("creating pool type %s\n", type);
> >> +
> >> + spin_lock(&drivers_lock);
> >> + driver = zpool_get_driver(type);
> >> + spin_unlock(&drivers_lock);
> >
> > Racy against unregister. Can be solved with a standard get/put
> > refcounting implementation. Or perhaps a big fat mutex.

Was there a decision here?

> >> +void zpool_destroy_pool(struct zpool *zpool)
> >> +{
> >> + pr_info("destroying pool type %s\n", zpool->type);
> >> +
> >> + spin_lock(&pools_lock);
> >> + list_del(&zpool->list);
> >> + spin_unlock(&pools_lock);
> >> + zpool->driver->destroy(zpool->pool);
> >> + kfree(zpool);
> >> +}
> >
> > What are the lifecycle rules here? How do we know that nobody else can
> > be concurrently using this pool?
>
> Well I think with zpools, as well as direct use of zsmalloc and zbud
> pools, whoever creates a pool is responsible for making sure it's no
> longer in use before destroying it.

Sounds reasonable. Perhaps there's some convenient WARN_ON we can put
in here to check that.

> I think in most use cases, pool
> creators won't be sharing their pools, so there should be no issue
> with concurrent use. In fact, concurrent pool use is probably a bad
> idea in general - zsmalloc for example relies on per-cpu data during
> handle mapping, so concurrent use of a single pool might result in the
> per-cpu data being overwritten if multiple users of a single pool
> tried to map and use different handles from the same cpu.

That's all a bit waffly. Either we support concurrent use or we don't!

> Should some use/sharing restrictions be added to the zpool documentation?

Sure. And the code if possible. If a second user tries to use a pool
which is already in use, that attempt should just fail, with WARN,
printk, return -EBUSY, whatever.

2014-06-27 17:11:38

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

On Tue, Jun 24, 2014 at 7:08 PM, Andrew Morton
<[email protected]> wrote:
> On Tue, 24 Jun 2014 11:39:12 -0400 Dan Streetman <[email protected]> wrote:
>
>> On Mon, Jun 23, 2014 at 5:46 PM, Andrew Morton
>> <[email protected]> wrote:
>> > On Mon, 2 Jun 2014 18:19:43 -0400 Dan Streetman <[email protected]> wrote:
>> >
>> >> Add zpool api.
>> >>
>> >> zpool provides an interface for memory storage, typically of compressed
>> >> memory. Users can select what backend to use; currently the only
>> >> implementations are zbud, a low density implementation with up to
>> >> two compressed pages per storage page, and zsmalloc, a higher density
>> >> implementation with multiple compressed pages per storage page.
>> >>
>> >> ...
>> >>
>> >> +/**
>> >> + * zpool_create_pool() - Create a new zpool
>> >> + * @type The type of the zpool to create (e.g. zbud, zsmalloc)
>> >> + * @flags What GFP flags should be used when the zpool allocates memory.
>> >> + * @ops The optional ops callback.
>> >> + *
>> >> + * This creates a new zpool of the specified type. The zpool will use the
>> >> + * given flags when allocating any memory. If the ops param is NULL, then
>> >> + * the created zpool will not be shrinkable.
>> >> + *
>> >> + * Returns: New zpool on success, NULL on failure.
>> >> + */
>> >> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
>> >> + struct zpool_ops *ops);
>> >
>> > It is unconventional to document the API in the .h file. It's better
>> > to put the documentation where people expect to find it.
>> >
>> > It's irritating for me (for example) because this kernel convention has
>> > permitted me to train my tags system to ignore prototypes in headers.
>> > But if I want to find the zpool_create_pool documentation I will need
>> > to jump through hoops.
>>
>> Got it, I will move it to the .c file.
>>
>> I noticed you pulled these into -mm, do you want me to send follow-on
>> patches for these changes, or actually update the original patches and
>> resend the patch set?
>
> Full resend, I guess. I often add things which are
> not-quite-fully-baked to give them a bit of testing, check for
> integration with other changes, etc.
>
>> >
>> >>
>> >> ...
>> >>
>> >> +
>> >> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
>> >> + struct zpool_ops *ops)
>> >> +{
>> >> + struct zpool_driver *driver;
>> >> + struct zpool *zpool;
>> >> +
>> >> + pr_info("creating pool type %s\n", type);
>> >> +
>> >> + spin_lock(&drivers_lock);
>> >> + driver = zpool_get_driver(type);
>> >> + spin_unlock(&drivers_lock);
>> >
>> > Racy against unregister. Can be solved with a standard get/put
>> > refcounting implementation. Or perhaps a big fat mutex.
>
> Was there a decision here?

What I tried to do, with the final patch in the set, was use module
usage counting combined with function documentation - in
zpool_create_pool() the zpool_get_driver() does try_module_get()
before releasing the spinlock, so if the driver *only* calls
unregister from its module exit function, I think we should be good -
once zpool_create_pool() gets the driver module, the driver won't
enter its exit function and thus won't unregister; and if the driver
module has started its exit function, try_module_get() will return
failure and zpool_create_pool() will return failure.
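
Concretely, the driver side looks like this (this is what the zbud
patch in this set does; zsmalloc is analogous):

	static void __exit exit_zbud(void)
	{
		/* the only place the driver unregisters, so the
		 * try_module_get() in zpool_get_driver() is enough
		 * to keep it loaded while pools exist */
		zpool_unregister_driver(&zbud_zpool_driver);
	}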

Now, if we remove the restriction that the driver module can only
unregister from its module exit function, then we would need an
additional refcount (we could use module_refcount() but the module may
have refcounts unrelated to us) and unregister would need a return
value, to indicate failure. I think the problem I had with that is,
in the driver module's exit function it can't abort if unregister
fails; but with the module refcounting, unregister shouldn't ever fail
in the driver's exit function...

So should I remove the unregister function doc asking to only call
unregister from the module exit function, and add a separate refcount
to the driver get/put functions? I don't think we need to use a kref,
since we don't want to free the driver once kref == 0, we want to be
able to check in the unregister function if there are any refs, so
just an atomic_t should work. And we would still need to keep the
module get/put, too, so it would be something like:

	spin_lock(&drivers_lock);
	...
	bool got = try_module_get(driver->owner);
	if (got)
		atomic_inc(&driver->refs);
	spin_unlock(&drivers_lock);
	return got ? driver : NULL;

with the appropriate atomic_dec in zpool_put_driver(), and unregister
would change to:

int zpool_unregister_driver(struct zpool_driver *driver)
{
	spin_lock(&drivers_lock);
	if (atomic_read(&driver->refs) > 0) {
		spin_unlock(&drivers_lock);
		return -EBUSY;
	}
	list_del(&driver->list);
	spin_unlock(&drivers_lock);
	return 0;
}


>
>> >> +void zpool_destroy_pool(struct zpool *zpool)
>> >> +{
>> >> + pr_info("destroying pool type %s\n", zpool->type);
>> >> +
>> >> + spin_lock(&pools_lock);
>> >> + list_del(&zpool->list);
>> >> + spin_unlock(&pools_lock);
>> >> + zpool->driver->destroy(zpool->pool);
>> >> + kfree(zpool);
>> >> +}
>> >
>> > What are the lifecycle rules here? How do we know that nobody else can
>> > be concurrently using this pool?
>>
>> Well I think with zpools, as well as direct use of zsmalloc and zbud
>> pools, whoever creates a pool is responsible for making sure it's no
>> longer in use before destroying it.
>
> Sounds reasonable. Perhaps there's some convenient WARN_ON we can put
> in here to check that.

Since zpool's just a passthrough, there's no simple way for it to tell
whether a pool is in use or not, but warnings could be added to
zbud/zsmalloc's destroy functions. zs_destroy_pool() already checks
and pr_info()s if a non-empty pool is destroyed.
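
For zbud, a hypothetical sketch of such a check - zbud_pool already
tracks its page count in pages_nr:

	void zbud_destroy_pool(struct zbud_pool *pool)
	{
		/* warn if the pool still holds pages when destroyed */
		WARN_ON(pool->pages_nr != 0);
		kfree(pool);
	}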

>
>> I think in most use cases, pool
>> creators won't be sharing their pools, so there should be no issue
>> with concurrent use. In fact, concurrent pool use is probably a bad
>> idea in general - zsmalloc for example relies on per-cpu data during
>> handle mapping, so concurrent use of a single pool might result in the
>> per-cpu data being overwritten if multiple users of a single pool
>> tried to map and use different handles from the same cpu.
>
> That's all a bit waffly. Either we support concurrent use or we don't!

I think I got offtrack talking about pool creators and pool users.
zpool, and zbud/zsmalloc, really don't care about *who* is calling
each of their functions. Only concurrency matters, and most of the
functions are safe for concurrent use, protected internally by
spinlocks, etc., in each pool driver (zbud/zsmalloc). The map/unmap
functions are a notable exception, but the function doc for
zpool_map_handle() clarifies the restrictions on how to call it, what
the implementation may do (hold spinlocks, disable preempt/ints), and
that the caller should call unmap quickly after using the mapped
handle. And whoever creates the pool will need to also destroy the
pool, or at least coordinate who destroys it and when (beyond warning,
I don't think there is much the pool driver can do when a non-empty
pool is destroyed; it could refuse to destroy the pool, but that risks
leaking memory if nobody ever uses the pool again).

I'll review zbud and zsmalloc again to make sure each function is
thread-safe, and state that in each zpool function doc, or make sure
to clarify any restrictions.
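
For example, the discipline the zpool_map_handle() doc asks for is
just this (a sketch, roughly the zswap store path; dst/src/len stand
in for the caller's buffers):

	/* map, copy, and unmap quickly, all on the same cpu; the
	 * implementation may hold locks and/or disable preemption
	 * between map and unmap */
	void *dst = zpool_map_handle(pool, handle, ZPOOL_MM_WO);
	memcpy(dst, src, len);
	zpool_unmap_handle(pool, handle);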

Since you already mentioned a few changes, let me get an updated patch
set sent - I'll try to send that by Monday - and we can go from there
if more changes are needed. Thanks for the review!


>
>> Should some use/sharing restrictions be added to the zpool documentation?
>
> Sure. And the code if possible. If a second user tries to use a pool
> which is already in use, that attempt should just fail, with WARN,
> printk, return -EBUSY, whatever.

2014-06-27 19:17:55

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCHv4 3/6] mm/zpool: implement common zpool api to zbud/zsmalloc

On Fri, 27 Jun 2014 13:11:15 -0400 Dan Streetman <[email protected]> wrote:

> >> >> +struct zpool *zpool_create_pool(char *type, gfp_t flags,
> >> >> + struct zpool_ops *ops)
> >> >> +{
> >> >> + struct zpool_driver *driver;
> >> >> + struct zpool *zpool;
> >> >> +
> >> >> + pr_info("creating pool type %s\n", type);
> >> >> +
> >> >> + spin_lock(&drivers_lock);
> >> >> + driver = zpool_get_driver(type);
> >> >> + spin_unlock(&drivers_lock);
> >> >
> >> > Racy against unregister. Can be solved with a standard get/put
> >> > refcounting implementation. Or perhaps a big fat mutex.
> >
> > Was there a decision here?
>
> What I tried to do, with the final patch in the set, was use module
> usage counting combined with function documentation - in
> zpool_create_pool() the zpool_get_driver() does try_module_get()
> before releasing the spinlock, so if the driver *only* calls
> unregister from its module exit function, I think we should be good -
> once zpool_create_pool() gets the driver module, the driver won't
> enter its exit function and thus won't unregister; and if the driver
> module has started its exit function, try_module_get() will return
> failure and zpool_create_pool() will return failure.
>
> Now, if we remove the restriction that the driver module can only
> unregister from its module exit function, then we would need an
> additional refcount (we could use module_refcount() but the module may
> have refcounts unrelated to us) and unregister would need a return
> value, to indicate failure. I think the problem I had with that is,
> in the driver module's exit function it can't abort if unregister
> fails; but with the module refcounting, unregister shouldn't ever fail
> in the driver's exit function...
>
> So should I remove the unregister function doc asking to only call
> unregister from the module exit function, and add a separate refcount
> to the driver get/put functions? I don't think we need to use a kref,
> since we don't want to free the driver once kref == 0, we want to be
> able to check in the unregister function if there are any refs, so
> just an atomic_t should work. And we would still need to keep the
> module get/put, too, so it would be something like:

I'm not sure I understood all that. But I don't want to understand it
in this context! Readers should be able to gather all this from
looking at the code.

> 	spin_lock(&drivers_lock);
> 	...
> 	bool got = try_module_get(driver->owner);
> 	if (got)
> 		atomic_inc(&driver->refs);
> 	spin_unlock(&drivers_lock);
> 	return got ? driver : NULL;
>
> with the appropriate atomic_dec in zpool_put_driver(), and unregister
> would change to:
>
> int zpool_unregister_driver(struct zpool_driver *driver)
> {
> 	spin_lock(&drivers_lock);
> 	if (atomic_read(&driver->refs) > 0) {
> 		spin_unlock(&drivers_lock);
> 		return -EBUSY;
> 	}
> 	list_del(&driver->list);
> 	spin_unlock(&drivers_lock);
> 	return 0;
> }

It sounds like that will work.

2014-07-02 21:45:57

by Dan Streetman

[permalink] [raw]
Subject: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

In order to allow zswap users to choose between zbud and zsmalloc for
the compressed storage pool, this patch set adds a new api "zpool" that
provides an interface to both zbud and zsmalloc. This does not include
implementing shrinking in zsmalloc, which will be sent separately.

I believe Seth originally was using zsmalloc for swap, but there were
concerns about how significant the impact of shrinking zsmalloc would
be when zswap had to start reclaiming pages. That still may be an
issue, but this at least allows users to choose themselves whether
they want a lower-density or higher-density compressed storage medium.
At least for situations where zswap reclaim is never or rarely reached,
it probably makes sense to use the higher density of zsmalloc.

Note this patch set does not change zram to use zpool, although that
change should be possible as well.

---
Changes since v4 : https://lkml.org/lkml/2014/6/2/711
-omit first patch, that removed gfp_t param from zpool_malloc()
-move function doc from zpool.h to zpool.c
-move module usage refcounting into patch that adds zpool
-add extra refcounting to prevent driver unregister if in use
-add doc clarifying concurrency usage
-make zbud/zsmalloc zpool functions static
-typo corrections

Changes since v3 : https://lkml.org/lkml/2014/5/24/130
-In zpool_shrink() use # pages instead of # bytes
-Add reclaimed param to zpool_shrink() to indicate to caller
# pages actually reclaimed
-move module usage counting to zpool, from zbud/zsmalloc
-update zbud_zpool_shrink() to call zbud_reclaim_page() in a
loop until requested # pages have been reclaimed (or error)

Changes since v2 : https://lkml.org/lkml/2014/5/7/927
-Change zpool to use driver registration instead of hardcoding
implementations
-Add module use counting in zbud/zsmalloc

Changes since v1 https://lkml.org/lkml/2014/4/19/97
-remove zsmalloc shrinking
-change zbud size param type from unsigned int to size_t
-remove zpool fallback creation
-zswap manually falls back to zbud if specified type fails


Dan Streetman (4):
mm/zbud: change zbud_alloc size type to size_t
mm/zpool: implement common zpool api to zbud/zsmalloc
mm/zpool: zbud/zsmalloc implement zpool
mm/zpool: update zswap to use zpool

include/linux/zbud.h | 2 +-
include/linux/zpool.h | 106 +++++++++++++++
mm/Kconfig | 43 +++---
mm/Makefile | 1 +
mm/zbud.c | 98 +++++++++++++-
mm/zpool.c | 364 ++++++++++++++++++++++++++++++++++++++++++++++++++
mm/zsmalloc.c | 84 ++++++++++++
mm/zswap.c | 75 ++++++-----
8 files changed, 722 insertions(+), 51 deletions(-)
create mode 100644 include/linux/zpool.h
create mode 100644 mm/zpool.c

--
1.8.3.1

2014-07-02 21:46:03

by Dan Streetman

[permalink] [raw]
Subject: [PATCHv2 1/4] mm/zbud: change zbud_alloc size type to size_t

Change the type of the zbud_alloc() size param from unsigned int
to size_t.

Technically, this should not make any difference, as the zbud
implementation already restricts the size to well within either
type's limits; but as zsmalloc (and kmalloc) use size_t, and
zpool will use size_t, this brings the size parameter type
in line with zsmalloc/zpool.

Signed-off-by: Dan Streetman <[email protected]>
Acked-by: Seth Jennings <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Changes since v1 : https://lkml.org/lkml/2014/5/7/757
-context change due to omitting patch to remove gfp_t param

include/linux/zbud.h | 2 +-
mm/zbud.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/zbud.h b/include/linux/zbud.h
index 13af0d4..f9d41a6 100644
--- a/include/linux/zbud.h
+++ b/include/linux/zbud.h
@@ -11,7 +11,7 @@ struct zbud_ops {

struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops);
void zbud_destroy_pool(struct zbud_pool *pool);
-int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,
+int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle);
void zbud_free(struct zbud_pool *pool, unsigned long handle);
int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries);
diff --git a/mm/zbud.c b/mm/zbud.c
index 01df13a..d012261 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -122,7 +122,7 @@ enum buddy {
};

/* Converts an allocation size in bytes to size in zbud chunks */
-static int size_to_chunks(int size)
+static int size_to_chunks(size_t size)
{
return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;
}
@@ -247,7 +247,7 @@ void zbud_destroy_pool(struct zbud_pool *pool)
* gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
* a new page.
*/
-int zbud_alloc(struct zbud_pool *pool, unsigned int size, gfp_t gfp,
+int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
{
int chunks, i, freechunks;
--
1.8.3.1

2014-07-02 21:46:07

by Dan Streetman

[permalink] [raw]
Subject: [PATCHv5 2/4] mm/zpool: implement common zpool api to zbud/zsmalloc

Add zpool api.

zpool provides an interface for memory storage, typically of compressed
memory. Users can select what backend to use; currently the only
implementations are zbud, a low density implementation with up to
two compressed pages per storage page, and zsmalloc, a higher density
implementation with multiple compressed pages per storage page.

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
Cc: Andrew Morton <[email protected]>
---

Changes since v4 : https://lkml.org/lkml/2014/6/2/707
-move function doc from zpool.h to zpool.c
-add doc clarifying concurrency use
-move module usage refcounting into this patch from later patch
-add atomic_t refcounting
-change driver unregister function to return failure if driver in use
-update to pass gfp to create and malloc functions

Changes since v3 : https://lkml.org/lkml/2014/5/24/135
-change zpool_shrink() to use # pages instead of # bytes
-change zpool_shrink() to update *reclaimed param with
number of pages actually reclaimed

Changes since v2 : https://lkml.org/lkml/2014/5/7/733
-Remove hardcoded zbud/zsmalloc implementations
-Add driver (un)register functions

Changes since v1 https://lkml.org/lkml/2014/4/19/101
-add some pr_info() during creation and pr_err() on errors
-remove zpool code to call zs_shrink(), since zsmalloc shrinking
was removed from this patchset
-remove fallback; only specified pool type will be tried
-pr_fmt() is defined in zpool to prefix zpool: in any pr_XXX() calls


include/linux/zpool.h | 106 +++++++++++++++
mm/Kconfig | 41 +++---
mm/Makefile | 1 +
mm/zpool.c | 364 ++++++++++++++++++++++++++++++++++++++++++++++++++
mm/zsmalloc.c | 1 -
5 files changed, 495 insertions(+), 18 deletions(-)
create mode 100644 include/linux/zpool.h
create mode 100644 mm/zpool.c

diff --git a/include/linux/zpool.h b/include/linux/zpool.h
new file mode 100644
index 0000000..f14bd75
--- /dev/null
+++ b/include/linux/zpool.h
@@ -0,0 +1,106 @@
+/*
+ * zpool memory storage api
+ *
+ * Copyright (C) 2014 Dan Streetman
+ *
+ * This is a common frontend for the zbud and zsmalloc memory
+ * storage pool implementations. Typically, this is used to
+ * store compressed memory.
+ */
+
+#ifndef _ZPOOL_H_
+#define _ZPOOL_H_
+
+struct zpool;
+
+struct zpool_ops {
+ int (*evict)(struct zpool *pool, unsigned long handle);
+};
+
+/*
+ * Control how a handle is mapped. It will be ignored if the
+ * implementation does not support it. Its use is optional.
+ * Note that this does not refer to memory protection, it
+ * refers to how the memory will be copied in/out if copying
+ * is necessary during mapping; read-write is the safest as
+ * it copies the existing memory in on map, and copies the
+ * changed memory back out on unmap. Write-only does not copy
+ * in the memory and should only be used for initialization.
+ * If in doubt, use ZPOOL_MM_DEFAULT which is read-write.
+ */
+enum zpool_mapmode {
+ ZPOOL_MM_RW, /* normal read-write mapping */
+ ZPOOL_MM_RO, /* read-only (no copy-out at unmap time) */
+ ZPOOL_MM_WO, /* write-only (no copy-in at map time) */
+
+ ZPOOL_MM_DEFAULT = ZPOOL_MM_RW
+};
+
+struct zpool *zpool_create_pool(char *type, gfp_t gfp, struct zpool_ops *ops);
+
+char *zpool_get_type(struct zpool *pool);
+
+void zpool_destroy_pool(struct zpool *pool);
+
+int zpool_malloc(struct zpool *pool, size_t size, gfp_t gfp,
+ unsigned long *handle);
+
+void zpool_free(struct zpool *pool, unsigned long handle);
+
+int zpool_shrink(struct zpool *pool, unsigned int pages,
+ unsigned int *reclaimed);
+
+void *zpool_map_handle(struct zpool *pool, unsigned long handle,
+ enum zpool_mapmode mm);
+
+void zpool_unmap_handle(struct zpool *pool, unsigned long handle);
+
+u64 zpool_get_total_size(struct zpool *pool);
+
+
+/**
+ * struct zpool_driver - driver implementation for zpool
+ * @type: name of the driver.
+ * @list: entry in the list of zpool drivers.
+ * @create: create a new pool.
+ * @destroy: destroy a pool.
+ * @malloc: allocate mem from a pool.
+ * @free: free mem from a pool.
+ * @shrink: shrink the pool.
+ * @map: map a handle.
+ * @unmap: unmap a handle.
+ * @total_size: get total size of a pool.
+ *
+ * This is created by a zpool implementation and registered
+ * with zpool.
+ */
+struct zpool_driver {
+ char *type;
+ struct module *owner;
+ atomic_t refcount;
+ struct list_head list;
+
+ void *(*create)(gfp_t gfp, struct zpool_ops *ops);
+ void (*destroy)(void *pool);
+
+ int (*malloc)(void *pool, size_t size, gfp_t gfp,
+ unsigned long *handle);
+ void (*free)(void *pool, unsigned long handle);
+
+ int (*shrink)(void *pool, unsigned int pages,
+ unsigned int *reclaimed);
+
+ void *(*map)(void *pool, unsigned long handle,
+ enum zpool_mapmode mm);
+ void (*unmap)(void *pool, unsigned long handle);
+
+ u64 (*total_size)(void *pool);
+};
+
+void zpool_register_driver(struct zpool_driver *driver);
+
+int zpool_unregister_driver(struct zpool_driver *driver);
+
+int zpool_evict(void *pool, unsigned long handle);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 3e9977a..865f91c 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -508,15 +508,17 @@ config CMA_DEBUG
processing calls such as dma_alloc_from_contiguous().
This option does not affect warning and error messages.

-config ZBUD
- tristate
- default n
+config MEM_SOFT_DIRTY
+ bool "Track memory changes"
+ depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
+ select PROC_PAGE_MONITOR
help
- A special purpose allocator for storing compressed pages.
- It is designed to store up to two compressed pages per physical
- page. While this design limits storage density, it has simple and
- deterministic reclaim properties that make it preferable to a higher
- density approach when reclaim will be used.
+ This option enables memory changes tracking by introducing a
+ soft-dirty bit on pte-s. This bit is set when someone writes
+ into a page just as regular dirty bit, but unlike the latter
+ it can be cleared by hands.
+
+ See Documentation/vm/soft-dirty.txt for more details.

config ZSWAP
bool "Compressed cache for swap pages (EXPERIMENTAL)"
@@ -538,17 +540,22 @@ config ZSWAP
they have not be fully explored on the large set of potential
configurations and workloads that exist.

-config MEM_SOFT_DIRTY
- bool "Track memory changes"
- depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
- select PROC_PAGE_MONITOR
+config ZPOOL
+ tristate "Common API for compressed memory storage"
+ default n
help
- This option enables memory changes tracking by introducing a
- soft-dirty bit on pte-s. This bit is set when someone writes
- into a page just as regular dirty bit, but unlike the latter
- it can be cleared by hands.
+ Compressed memory storage API. This allows using either zbud or
+ zsmalloc.

- See Documentation/vm/soft-dirty.txt for more details.
+config ZBUD
+ tristate "Low density storage for compressed pages"
+ default n
+ help
+ A special purpose allocator for storing compressed pages.
+ It is designed to store up to two compressed pages per physical
+ page. While this design limits storage density, it has simple and
+ deterministic reclaim properties that make it preferable to a higher
+ density approach when reclaim will be used.

config ZSMALLOC
tristate "Memory allocator for compressed pages"
diff --git a/mm/Makefile b/mm/Makefile
index 4064f3e..2f6eeb6 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
+obj-$(CONFIG_ZPOOL) += zpool.o
obj-$(CONFIG_ZBUD) += zbud.o
obj-$(CONFIG_ZSMALLOC) += zsmalloc.o
obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o
diff --git a/mm/zpool.c b/mm/zpool.c
new file mode 100644
index 0000000..e40612a
--- /dev/null
+++ b/mm/zpool.c
@@ -0,0 +1,364 @@
+/*
+ * zpool memory storage api
+ *
+ * Copyright (C) 2014 Dan Streetman
+ *
+ * This is a common frontend for memory storage pool implementations.
+ * Typically, this is used to store compressed memory.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/zpool.h>
+
+struct zpool {
+ char *type;
+
+ struct zpool_driver *driver;
+ void *pool;
+ struct zpool_ops *ops;
+
+ struct list_head list;
+};
+
+static LIST_HEAD(drivers_head);
+static DEFINE_SPINLOCK(drivers_lock);
+
+static LIST_HEAD(pools_head);
+static DEFINE_SPINLOCK(pools_lock);
+
+/**
+ * zpool_register_driver() - register a zpool implementation.
+ * @driver: driver to register
+ */
+void zpool_register_driver(struct zpool_driver *driver)
+{
+ spin_lock(&drivers_lock);
+ atomic_set(&driver->refcount, 0);
+ list_add(&driver->list, &drivers_head);
+ spin_unlock(&drivers_lock);
+}
+EXPORT_SYMBOL(zpool_register_driver);
+
+/**
+ * zpool_unregister_driver() - unregister a zpool implementation.
+ * @driver: driver to unregister.
+ *
+ * Module usage counting is used to prevent using a driver
+ * while/after unloading, so if this is called from module
+ * exit function, this should never fail; if called from
+ * other than the module exit function, and this returns
+ * failure, the driver is in use and must remain available.
+ */
+int zpool_unregister_driver(struct zpool_driver *driver)
+{
+ int ret = 0, refcount;
+
+ spin_lock(&drivers_lock);
+ refcount = atomic_read(&driver->refcount);
+ WARN_ON(refcount < 0);
+ if (refcount > 0)
+ ret = -EBUSY;
+ else
+ list_del(&driver->list);
+ spin_unlock(&drivers_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL(zpool_unregister_driver);
+
+/**
+ * zpool_evict() - evict callback from a zpool implementation.
+ * @pool: pool to evict from.
+ * @handle: handle to evict.
+ *
+ * This can be used by zpool implementations to call the
+ * user's evict zpool_ops struct evict callback.
+ */
+int zpool_evict(void *pool, unsigned long handle)
+{
+ struct zpool *zpool;
+
+ spin_lock(&pools_lock);
+ list_for_each_entry(zpool, &pools_head, list) {
+ if (zpool->pool == pool) {
+ spin_unlock(&pools_lock);
+ if (!zpool->ops || !zpool->ops->evict)
+ return -EINVAL;
+ return zpool->ops->evict(zpool, handle);
+ }
+ }
+ spin_unlock(&pools_lock);
+
+ return -ENOENT;
+}
+EXPORT_SYMBOL(zpool_evict);
+
+static struct zpool_driver *zpool_get_driver(char *type)
+{
+ struct zpool_driver *driver;
+
+ spin_lock(&drivers_lock);
+ list_for_each_entry(driver, &drivers_head, list) {
+ if (!strcmp(driver->type, type)) {
+ bool got = try_module_get(driver->owner);
+
+ if (got)
+ atomic_inc(&driver->refcount);
+ spin_unlock(&drivers_lock);
+ return got ? driver : NULL;
+ }
+ }
+
+ spin_unlock(&drivers_lock);
+ return NULL;
+}
+
+static void zpool_put_driver(struct zpool_driver *driver)
+{
+ atomic_dec(&driver->refcount);
+ module_put(driver->owner);
+}
+
+/**
+ * zpool_create_pool() - Create a new zpool
+ * @type The type of the zpool to create (e.g. zbud, zsmalloc)
+ * @gfp The GFP flags to use when allocating the pool.
+ * @ops The optional ops callback.
+ *
+ * This creates a new zpool of the specified type. The gfp flags will be
+ * used when allocating memory, if the implementation supports it. If the
+ * ops param is NULL, then the created zpool will not be shrinkable.
+ *
+ * Implementations must guarantee this to be thread-safe.
+ *
+ * Returns: New zpool on success, NULL on failure.
+ */
+struct zpool *zpool_create_pool(char *type, gfp_t gfp, struct zpool_ops *ops)
+{
+ struct zpool_driver *driver;
+ struct zpool *zpool;
+
+ pr_info("creating pool type %s\n", type);
+
+ driver = zpool_get_driver(type);
+
+ if (!driver) {
+ request_module(type);
+ driver = zpool_get_driver(type);
+ }
+
+ if (!driver) {
+ pr_err("no driver for type %s\n", type);
+ return NULL;
+ }
+
+ zpool = kmalloc(sizeof(*zpool), gfp);
+ if (!zpool) {
+ pr_err("couldn't create zpool - out of memory\n");
+ zpool_put_driver(driver);
+ return NULL;
+ }
+
+ zpool->type = driver->type;
+ zpool->driver = driver;
+ zpool->pool = driver->create(gfp, ops);
+ zpool->ops = ops;
+
+ if (!zpool->pool) {
+ pr_err("couldn't create %s pool\n", type);
+ zpool_put_driver(driver);
+ kfree(zpool);
+ return NULL;
+ }
+
+ pr_info("created %s pool\n", type);
+
+ spin_lock(&pools_lock);
+ list_add(&zpool->list, &pools_head);
+ spin_unlock(&pools_lock);
+
+ return zpool;
+}
+
+/**
+ * zpool_destroy_pool() - Destroy a zpool
+ * @pool The zpool to destroy.
+ *
+ * Implementations must guarantee this to be thread-safe,
+ * however only when destroying different pools. The same
+ * pool should only be destroyed once, and should not be used
+ * after it is destroyed.
+ *
+ * This destroys an existing zpool. The zpool should not be in use.
+ */
+void zpool_destroy_pool(struct zpool *zpool)
+{
+ pr_info("destroying pool type %s\n", zpool->type);
+
+ spin_lock(&pools_lock);
+ list_del(&zpool->list);
+ spin_unlock(&pools_lock);
+ zpool->driver->destroy(zpool->pool);
+ zpool_put_driver(zpool->driver);
+ kfree(zpool);
+}
+
+/**
+ * zpool_get_type() - Get the type of the zpool
+ * @pool The zpool to check
+ *
+ * This returns the type of the pool.
+ *
+ * Implementations must guarantee this to be thread-safe.
+ *
+ * Returns: The type of zpool.
+ */
+char *zpool_get_type(struct zpool *zpool)
+{
+ return zpool->type;
+}
+
+/**
+ * zpool_malloc() - Allocate memory
+ * @pool The zpool to allocate from.
+ * @size The amount of memory to allocate.
+ * @gfp The GFP flags to use when allocating memory.
+ * @handle Pointer to the handle to set
+ *
+ * This allocates the requested amount of memory from the pool.
+ * The gfp flags will be used when allocating memory, if the
+ * implementation supports it. The provided @handle will be
+ * set to the allocated object handle.
+ *
+ * Implementations must guarantee this to be thread-safe.
+ *
+ * Returns: 0 on success, negative value on error.
+ */
+int zpool_malloc(struct zpool *zpool, size_t size, gfp_t gfp,
+ unsigned long *handle)
+{
+ return zpool->driver->malloc(zpool->pool, size, gfp, handle);
+}
+
+/**
+ * zpool_free() - Free previously allocated memory
+ * @pool The zpool that allocated the memory.
+ * @handle The handle to the memory to free.
+ *
+ * This frees previously allocated memory. This does not guarantee
+ * that the pool will actually free memory, only that the memory
+ * in the pool will become available for use by the pool.
+ *
+ * Implementations must guarantee this to be thread-safe,
+ * however only when freeing different handles. The same
+ * handle should only be freed once, and should not be used
+ * after freeing.
+ */
+void zpool_free(struct zpool *zpool, unsigned long handle)
+{
+ zpool->driver->free(zpool->pool, handle);
+}
+
+/**
+ * zpool_shrink() - Shrink the pool size
+ * @pool The zpool to shrink.
+ * @pages The number of pages to shrink the pool.
+ * @reclaimed The number of pages successfully evicted.
+ *
+ * This attempts to shrink the actual memory size of the pool
+ * by evicting currently used handle(s). If the pool was
+ * created with no zpool_ops, or the evict call fails for any
+ * of the handles, this will fail. If non-NULL, the @reclaimed
+ * parameter will be set to the number of pages reclaimed,
+ * which may be more than the number of pages requested.
+ *
+ * Implementations must guarantee this to be thread-safe.
+ *
+ * Returns: 0 on success, negative value on error/failure.
+ */
+int zpool_shrink(struct zpool *zpool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ return zpool->driver->shrink(zpool->pool, pages, reclaimed);
+}
+
+/**
+ * zpool_map_handle() - Map a previously allocated handle into memory
+ * @pool The zpool that the handle was allocated from
+ * @handle The handle to map
+ * @mm How the memory should be mapped
+ *
+ * This maps a previously allocated handle into memory. The @mm
+ * param indicates to the implementation how the memory will be
+ * used, i.e. read-only, write-only, read-write. If the
+ * implementation does not support it, the memory will be treated
+ * as read-write.
+ *
+ * This may hold locks, disable interrupts, and/or preemption,
+ * and the zpool_unmap_handle() must be called to undo those
+ * actions. The code that uses the mapped handle should complete
+ * its operations on the mapped handle memory quickly and unmap
+ * as soon as possible. As the implementation may use per-cpu
+ * data, multiple handles should not be mapped concurrently on
+ * any cpu.
+ *
+ * Returns: A pointer to the handle's mapped memory area.
+ */
+void *zpool_map_handle(struct zpool *zpool, unsigned long handle,
+ enum zpool_mapmode mapmode)
+{
+ return zpool->driver->map(zpool->pool, handle, mapmode);
+}
+
+/**
+ * zpool_unmap_handle() - Unmap a previously mapped handle
+ * @pool The zpool that the handle was allocated from
+ * @handle The handle to unmap
+ *
+ * This unmaps a previously mapped handle. Any locks or other
+ * actions that the implementation took in zpool_map_handle()
+ * will be undone here. The memory area returned from
+ * zpool_map_handle() should no longer be used after this.
+ */
+void zpool_unmap_handle(struct zpool *zpool, unsigned long handle)
+{
+ zpool->driver->unmap(zpool->pool, handle);
+}
+
+/**
+ * zpool_get_total_size() - The total size of the pool
+ * @pool The zpool to check
+ *
+ * This returns the total size in bytes of the pool.
+ *
+ * Returns: Total size of the zpool in bytes.
+ */
+u64 zpool_get_total_size(struct zpool *zpool)
+{
+ return zpool->driver->total_size(zpool->pool);
+}
+
+static int __init init_zpool(void)
+{
+ pr_info("loaded\n");
+ return 0;
+}
+
+static void __exit exit_zpool(void)
+{
+ pr_info("unloaded\n");
+}
+
+module_init(init_zpool);
+module_exit(exit_zpool);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Streetman <[email protected]>");
+MODULE_DESCRIPTION("Common API for compressed memory storage");
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index fe78189..4cd5479 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -240,7 +240,6 @@ struct mapping_area {
enum zs_mapmode vm_mm; /* mapping mode */
};

-
/* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
static DEFINE_PER_CPU(struct mapping_area, zs_map_area);

--
1.8.3.1

2014-07-02 21:46:13

by Dan Streetman

[permalink] [raw]
Subject: [PATCHv3 3/4] mm/zpool: zbud/zsmalloc implement zpool

Update zbud and zsmalloc to implement the zpool api.

[Fengguang Wu <[email protected]>: make functions static]
Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Note to Seth: We talked about removing the retries parameter from
zbud_reclaim_page(), but I did not include that in this patch.
I'll send a separate patch for that.

Changes since v2 : https://lkml.org/lkml/2014/6/2/801
-make functions static per suggestion from Fengguang Wu
-move module owner initialization from later patch
-update to use gfp params for create and malloc

Changes since v1 : https://lkml.org/lkml/2014/5/24/136
-Update zbud_zpool_shrink() to call zbud_reclaim_page()
in a loop until number of pages requested has been
reclaimed, or error
-Update zbud_zpool_shrink() to update passed *reclaimed
param with # pages actually reclaimed
-Update zs_pool_shrink() with new param, although function
is not implemented yet

mm/zbud.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mm/zsmalloc.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 179 insertions(+)

diff --git a/mm/zbud.c b/mm/zbud.c
index d012261..a05790b 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -51,6 +51,7 @@
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/zbud.h>
+#include <linux/zpool.h>

/*****************
* Structures
@@ -113,6 +114,90 @@ struct zbud_header {
};

/*****************
+ * zpool
+ ****************/
+
+#ifdef CONFIG_ZPOOL
+
+static int zbud_zpool_evict(struct zbud_pool *pool, unsigned long handle)
+{
+ return zpool_evict(pool, handle);
+}
+
+static struct zbud_ops zbud_zpool_ops = {
+ .evict = zbud_zpool_evict
+};
+
+static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
+{
+ return zbud_create_pool(gfp, &zbud_zpool_ops);
+}
+
+static void zbud_zpool_destroy(void *pool)
+{
+ zbud_destroy_pool(pool);
+}
+
+static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp,
+ unsigned long *handle)
+{
+ return zbud_alloc(pool, size, gfp, handle);
+}
+static void zbud_zpool_free(void *pool, unsigned long handle)
+{
+ zbud_free(pool, handle);
+}
+
+static int zbud_zpool_shrink(void *pool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ unsigned int total = 0;
+ int ret = -EINVAL;
+
+ while (total < pages) {
+ ret = zbud_reclaim_page(pool, 8);
+ if (ret < 0)
+ break;
+ total++;
+ }
+
+ if (reclaimed)
+ *reclaimed = total;
+
+ return ret;
+}
+
+static void *zbud_zpool_map(void *pool, unsigned long handle,
+ enum zpool_mapmode mm)
+{
+ return zbud_map(pool, handle);
+}
+static void zbud_zpool_unmap(void *pool, unsigned long handle)
+{
+ zbud_unmap(pool, handle);
+}
+
+static u64 zbud_zpool_total_size(void *pool)
+{
+ return zbud_get_pool_size(pool) * PAGE_SIZE;
+}
+
+static struct zpool_driver zbud_zpool_driver = {
+ .type = "zbud",
+ .owner = THIS_MODULE,
+ .create = zbud_zpool_create,
+ .destroy = zbud_zpool_destroy,
+ .malloc = zbud_zpool_malloc,
+ .free = zbud_zpool_free,
+ .shrink = zbud_zpool_shrink,
+ .map = zbud_zpool_map,
+ .unmap = zbud_zpool_unmap,
+ .total_size = zbud_zpool_total_size,
+};
+
+#endif /* CONFIG_ZPOOL */
+
+/*****************
* Helpers
*****************/
/* Just to make the code easier to read */
@@ -511,11 +596,20 @@ static int __init init_zbud(void)
/* Make sure the zbud header will fit in one chunk */
BUILD_BUG_ON(sizeof(struct zbud_header) > ZHDR_SIZE_ALIGNED);
pr_info("loaded\n");
+
+#ifdef CONFIG_ZPOOL
+ zpool_register_driver(&zbud_zpool_driver);
+#endif
+
return 0;
}

static void __exit exit_zbud(void)
{
+#ifdef CONFIG_ZPOOL
+ zpool_unregister_driver(&zbud_zpool_driver);
+#endif
+
pr_info("unloaded\n");
}

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 4cd5479..6c1e2a4 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -92,6 +92,7 @@
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/zsmalloc.h>
+#include <linux/zpool.h>

/*
* This must be a power of 2 and greater than or equal to sizeof(link_free).
@@ -240,6 +241,82 @@ struct mapping_area {
enum zs_mapmode vm_mm; /* mapping mode */
};

+/* zpool driver */
+
+#ifdef CONFIG_ZPOOL
+
+static void *zs_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
+{
+ return zs_create_pool(gfp);
+}
+
+static void zs_zpool_destroy(void *pool)
+{
+ zs_destroy_pool(pool);
+}
+
+static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp,
+ unsigned long *handle)
+{
+ *handle = zs_malloc(pool, size);
+ return *handle ? 0 : -1;
+}
+static void zs_zpool_free(void *pool, unsigned long handle)
+{
+ zs_free(pool, handle);
+}
+
+static int zs_zpool_shrink(void *pool, unsigned int pages,
+ unsigned int *reclaimed)
+{
+ return -EINVAL;
+}
+
+static void *zs_zpool_map(void *pool, unsigned long handle,
+ enum zpool_mapmode mm)
+{
+ enum zs_mapmode zs_mm;
+
+ switch (mm) {
+ case ZPOOL_MM_RO:
+ zs_mm = ZS_MM_RO;
+ break;
+ case ZPOOL_MM_WO:
+ zs_mm = ZS_MM_WO;
+ break;
+ case ZPOOL_MM_RW: /* fallthru */
+ default:
+ zs_mm = ZS_MM_RW;
+ break;
+ }
+
+ return zs_map_object(pool, handle, zs_mm);
+}
+static void zs_zpool_unmap(void *pool, unsigned long handle)
+{
+ zs_unmap_object(pool, handle);
+}
+
+static u64 zs_zpool_total_size(void *pool)
+{
+ return zs_get_total_size_bytes(pool);
+}
+
+static struct zpool_driver zs_zpool_driver = {
+ .type = "zsmalloc",
+ .owner = THIS_MODULE,
+ .create = zs_zpool_create,
+ .destroy = zs_zpool_destroy,
+ .malloc = zs_zpool_malloc,
+ .free = zs_zpool_free,
+ .shrink = zs_zpool_shrink,
+ .map = zs_zpool_map,
+ .unmap = zs_zpool_unmap,
+ .total_size = zs_zpool_total_size,
+};
+
+#endif /* CONFIG_ZPOOL */
+
/* per-cpu VM mapping areas for zspage accesses that cross page boundaries */
static DEFINE_PER_CPU(struct mapping_area, zs_map_area);

@@ -813,6 +890,10 @@ static void zs_exit(void)
{
int cpu;

+#ifdef CONFIG_ZPOOL
+ zpool_unregister_driver(&zs_zpool_driver);
+#endif
+
cpu_notifier_register_begin();

for_each_online_cpu(cpu)
@@ -839,6 +920,10 @@ static int zs_init(void)

cpu_notifier_register_done();

+#ifdef CONFIG_ZPOOL
+ zpool_register_driver(&zs_zpool_driver);
+#endif
+
return 0;
fail:
zs_exit();
--
1.8.3.1

2014-07-02 21:46:34

by Dan Streetman

[permalink] [raw]
Subject: [PATCHv5 4/4] mm/zpool: update zswap to use zpool

Change zswap to use the zpool api instead of directly using zbud.
Add a boot-time param to allow selecting which zpool implementation
to use, with zbud as the default.

Signed-off-by: Dan Streetman <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Weijie Yang <[email protected]>
---

Changes since v4 : https://lkml.org/lkml/2014/6/2/709
-update to use pass gfp params to create and malloc

Changes since v3 : https://lkml.org/lkml/2014/5/24/131
-use new parameters in call to zpool_shrink()

Changes since v2 : https://lkml.org/lkml/2014/5/7/894
-change zswap to select ZPOOL instead of ZBUD
-only try zbud default if not already tried

Changes since v1 https://lkml.org/lkml/2014/4/19/102
-since zpool fallback is removed, manually fall back to zbud if
specified type fails


mm/Kconfig | 2 +-
mm/zswap.c | 75 +++++++++++++++++++++++++++++++++++++-------------------------
2 files changed, 46 insertions(+), 31 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 865f91c..7fddb52 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -524,7 +524,7 @@ config ZSWAP
bool "Compressed cache for swap pages (EXPERIMENTAL)"
depends on FRONTSWAP && CRYPTO=y
select CRYPTO_LZO
- select ZBUD
+ select ZPOOL
default n
help
A lightweight compressed cache for swap pages. It takes
diff --git a/mm/zswap.c b/mm/zswap.c
index 008388fe..032c21e 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -34,7 +34,7 @@
#include <linux/swap.h>
#include <linux/crypto.h>
#include <linux/mempool.h>
-#include <linux/zbud.h>
+#include <linux/zpool.h>

#include <linux/mm_types.h>
#include <linux/page-flags.h>
@@ -45,8 +45,8 @@
/*********************************
* statistics
**********************************/
-/* Number of memory pages used by the compressed pool */
-static u64 zswap_pool_pages;
+/* Total bytes used by the compressed storage */
+static u64 zswap_pool_total_size;
/* The number of compressed pages currently stored in zswap */
static atomic_t zswap_stored_pages = ATOMIC_INIT(0);

@@ -89,8 +89,13 @@ static unsigned int zswap_max_pool_percent = 20;
module_param_named(max_pool_percent,
zswap_max_pool_percent, uint, 0644);

-/* zbud_pool is shared by all of zswap backend */
-static struct zbud_pool *zswap_pool;
+/* Compressed storage to use */
+#define ZSWAP_ZPOOL_DEFAULT "zbud"
+static char *zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
+module_param_named(zpool, zswap_zpool_type, charp, 0444);
+
+/* zpool is shared by all of zswap backend */
+static struct zpool *zswap_pool;

/*********************************
* compression functions
@@ -168,7 +173,7 @@ static void zswap_comp_exit(void)
* be held while changing the refcount. Since the lock must
* be held, there is no reason to also make refcount atomic.
* offset - the swap offset for the entry. Index into the red-black tree.
- * handle - zbud allocation handle that stores the compressed page data
+ * handle - zpool allocation handle that stores the compressed page data
* length - the length in bytes of the compressed page data. Needed during
* decompression
*/
@@ -284,15 +289,15 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
}

/*
- * Carries out the common pattern of freeing and entry's zbud allocation,
+ * Carries out the common pattern of freeing an entry's zpool allocation,
* freeing the entry itself, and decrementing the number of stored pages.
*/
static void zswap_free_entry(struct zswap_entry *entry)
{
- zbud_free(zswap_pool, entry->handle);
+ zpool_free(zswap_pool, entry->handle);
zswap_entry_cache_free(entry);
atomic_dec(&zswap_stored_pages);
- zswap_pool_pages = zbud_get_pool_size(zswap_pool);
+ zswap_pool_total_size = zpool_get_total_size(zswap_pool);
}

/* caller must hold the tree lock */
@@ -409,7 +414,7 @@ cleanup:
static bool zswap_is_full(void)
{
return totalram_pages * zswap_max_pool_percent / 100 <
- zswap_pool_pages;
+ DIV_ROUND_UP(zswap_pool_total_size, PAGE_SIZE);
}

/*********************************
@@ -525,7 +530,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
* the swap cache, the compressed version stored by zswap can be
* freed.
*/
-static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
+static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
{
struct zswap_header *zhdr;
swp_entry_t swpentry;
@@ -541,9 +546,9 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
};

/* extract swpentry from data */
- zhdr = zbud_map(pool, handle);
+ zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
swpentry = zhdr->swpentry; /* here */
- zbud_unmap(pool, handle);
+ zpool_unmap_handle(pool, handle);
tree = zswap_trees[swp_type(swpentry)];
offset = swp_offset(swpentry);

@@ -573,13 +578,13 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
case ZSWAP_SWAPCACHE_NEW: /* page is locked */
/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zbud_map(zswap_pool, entry->handle) +
- sizeof(struct zswap_header);
+ src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src,
entry->length, dst, &dlen);
kunmap_atomic(dst);
- zbud_unmap(zswap_pool, entry->handle);
+ zpool_unmap_handle(zswap_pool, entry->handle);
BUG_ON(ret);
BUG_ON(dlen != PAGE_SIZE);

@@ -652,7 +657,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* reclaim space if needed */
if (zswap_is_full()) {
zswap_pool_limit_hit++;
- if (zbud_reclaim_page(zswap_pool, 8)) {
+ if (zpool_shrink(zswap_pool, 1, NULL)) {
zswap_reject_reclaim_fail++;
ret = -ENOMEM;
goto reject;
@@ -679,7 +684,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* store */
len = dlen + sizeof(struct zswap_header);
- ret = zbud_alloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
+ ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
&handle);
if (ret == -ENOSPC) {
zswap_reject_compress_poor++;
@@ -689,11 +694,11 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
zswap_reject_alloc_fail++;
goto freepage;
}
- zhdr = zbud_map(zswap_pool, handle);
+ zhdr = zpool_map_handle(zswap_pool, handle, ZPOOL_MM_RW);
zhdr->swpentry = swp_entry(type, offset);
buf = (u8 *)(zhdr + 1);
memcpy(buf, dst, dlen);
- zbud_unmap(zswap_pool, handle);
+ zpool_unmap_handle(zswap_pool, handle);
put_cpu_var(zswap_dstmem);

/* populate entry */
@@ -716,7 +721,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

/* update stats */
atomic_inc(&zswap_stored_pages);
- zswap_pool_pages = zbud_get_pool_size(zswap_pool);
+ zswap_pool_total_size = zpool_get_total_size(zswap_pool);

return 0;

@@ -752,13 +757,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,

/* decompress */
dlen = PAGE_SIZE;
- src = (u8 *)zbud_map(zswap_pool, entry->handle) +
- sizeof(struct zswap_header);
+ src = (u8 *)zpool_map_handle(zswap_pool, entry->handle,
+ ZPOOL_MM_RO) + sizeof(struct zswap_header);
dst = kmap_atomic(page);
ret = zswap_comp_op(ZSWAP_COMPOP_DECOMPRESS, src, entry->length,
dst, &dlen);
kunmap_atomic(dst);
- zbud_unmap(zswap_pool, entry->handle);
+ zpool_unmap_handle(zswap_pool, entry->handle);
BUG_ON(ret);

spin_lock(&tree->lock);
@@ -811,7 +816,7 @@ static void zswap_frontswap_invalidate_area(unsigned type)
zswap_trees[type] = NULL;
}

-static struct zbud_ops zswap_zbud_ops = {
+static struct zpool_ops zswap_zpool_ops = {
.evict = zswap_writeback_entry
};

@@ -869,8 +874,8 @@ static int __init zswap_debugfs_init(void)
zswap_debugfs_root, &zswap_written_back_pages);
debugfs_create_u64("duplicate_entry", S_IRUGO,
zswap_debugfs_root, &zswap_duplicate_entry);
- debugfs_create_u64("pool_pages", S_IRUGO,
- zswap_debugfs_root, &zswap_pool_pages);
+ debugfs_create_u64("pool_total_size", S_IRUGO,
+ zswap_debugfs_root, &zswap_pool_total_size);
debugfs_create_atomic_t("stored_pages", S_IRUGO,
zswap_debugfs_root, &zswap_stored_pages);

@@ -895,16 +900,26 @@ static void __exit zswap_debugfs_exit(void) { }
**********************************/
static int __init init_zswap(void)
{
+ gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN;
+
if (!zswap_enabled)
return 0;

pr_info("loading zswap\n");

- zswap_pool = zbud_create_pool(GFP_KERNEL, &zswap_zbud_ops);
+ zswap_pool = zpool_create_pool(zswap_zpool_type, gfp, &zswap_zpool_ops);
+ if (!zswap_pool && strcmp(zswap_zpool_type, ZSWAP_ZPOOL_DEFAULT)) {
+ pr_info("%s zpool not available\n", zswap_zpool_type);
+ zswap_zpool_type = ZSWAP_ZPOOL_DEFAULT;
+ zswap_pool = zpool_create_pool(zswap_zpool_type, gfp,
+ &zswap_zpool_ops);
+ }
if (!zswap_pool) {
- pr_err("zbud pool creation failed\n");
+ pr_err("%s zpool not available\n", zswap_zpool_type);
+ pr_err("zpool creation failed\n");
goto error;
}
+ pr_info("using %s pool\n", zswap_zpool_type);

if (zswap_entry_cache_create()) {
pr_err("entry cache creation failed\n");
@@ -928,7 +943,7 @@ pcpufail:
compfail:
zswap_entry_cache_destory();
cachefail:
- zbud_destroy_pool(zswap_pool);
+ zpool_destroy_pool(zswap_pool);
error:
return -ENOMEM;
}
--
1.8.3.1
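
With this applied, the compressed-storage backend can be chosen at boot
through zswap's new module parameter; for example, booting with

    zswap.enabled=1 zswap.zpool=zsmalloc

requests zsmalloc, and zswap falls back to zbud (logging which pool it
ended up with) if the requested type is unavailable. Since the parameter
is created with mode 0444, /sys/module/zswap/parameters/zpool shows the
active choice at runtime but cannot be used to change it.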

2014-07-14 18:11:16

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

Andrew, any thoughts on this latest version of the patch set? Let me
know if I missed anything or you have any other suggestions.

Seth, did you get a chance to review this and/or test it out?



On Wed, Jul 2, 2014 at 5:45 PM, Dan Streetman <[email protected]> wrote:
> In order to allow zswap users to choose between zbud and zsmalloc for
> the compressed storage pool, this patch set adds a new api "zpool" that
> provides an interface to both zbud and zsmalloc. This does not include
> implementing shrinking in zsmalloc, which will be sent separately.
>
> I believe Seth originally was using zsmalloc for swap, but there were
> concerns about how significant the impact of shrinking zsmalloc would
> be when zswap had to start reclaiming pages. That still may be an
> issue, but this at least allows users to choose themselves whether
> they want a lower-density or higher-density compressed storage medium.
> At least for situations where zswap reclaim is never or rarely reached,
> it probably makes sense to use the higher density of zsmalloc.
>
> Note this patch set does not change zram to use zpool, although that
> change should be possible as well.
>
> ---
> Changes since v4 : https://lkml.org/lkml/2014/6/2/711
> -omit first patch, that removed gfp_t param from zpool_malloc()
> -move function doc from zpool.h to zpool.c
> -move module usage refcounting into patch that adds zpool
> -add extra refcounting to prevent driver unregister if in use
> -add doc clarifying concurrency usage
> -make zbud/zsmalloc zpool functions static
> -typo corrections
>
> Changes since v3 : https://lkml.org/lkml/2014/5/24/130
> -In zpool_shrink() use # pages instead of # bytes
> -Add reclaimed param to zpool_shrink() to indicate to caller
> # pages actually reclaimed
> -move module usage counting to zpool, from zbud/zsmalloc
> -update zbud_zpool_shrink() to call zbud_reclaim_page() in a
> loop until requested # pages have been reclaimed (or error)
>
> Changes since v2 : https://lkml.org/lkml/2014/5/7/927
> -Change zpool to use driver registration instead of hardcoding
> implementations
> -Add module use counting in zbud/zsmalloc
>
> Changes since v1 https://lkml.org/lkml/2014/4/19/97
> -remove zsmalloc shrinking
> -change zbud size param type from unsigned int to size_t
> -remove zpool fallback creation
> -zswap manually falls back to zbud if specified type fails
>
>
> Dan Streetman (4):
> mm/zbud: change zbud_alloc size type to size_t
> mm/zpool: implement common zpool api to zbud/zsmalloc
> mm/zpool: zbud/zsmalloc implement zpool
> mm/zpool: update zswap to use zpool
>
> include/linux/zbud.h | 2 +-
> include/linux/zpool.h | 106 +++++++++++++++
> mm/Kconfig | 43 +++---
> mm/Makefile | 1 +
> mm/zbud.c | 98 +++++++++++++-
> mm/zpool.c | 364 ++++++++++++++++++++++++++++++++++++++++++++++++++
> mm/zsmalloc.c | 84 ++++++++++++
> mm/zswap.c | 75 ++++++-----
> 8 files changed, 722 insertions(+), 51 deletions(-)
> create mode 100644 include/linux/zpool.h
> create mode 100644 mm/zpool.c
>
> --
> 1.8.3.1
>

2014-07-16 20:59:25

by Seth Jennings

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

On Mon, Jul 14, 2014 at 02:10:42PM -0400, Dan Streetman wrote:
> Andrew, any thoughts on this latest version of the patch set? Let me
> know if I missed anything or you have any other suggestions.
>
> Seth, did you get a chance to review this and/or test it out?

I did have a chance to test it out quickly and didn't run into any
issues. Your patchset is already in linux-next so I'll test more from
there.

Seth


2014-07-16 21:06:13

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

On Wed, Jul 16, 2014 at 4:59 PM, Seth Jennings <[email protected]> wrote:
> On Mon, Jul 14, 2014 at 02:10:42PM -0400, Dan Streetman wrote:
>> Andrew, any thoughts on this latest version of the patch set? Let me
>> know if I missed anything or you have any other suggestions.
>>
>> Seth, did you get a chance to review this and/or test it out?
>
> I did have a chance to test it out quickly and didn't run into any
> issues. Your patchset is already in linux-next so I'll test more from
> there.

This latest version has a few changes that Andrew requested, which
presumably will replace the patches that are currently in -mm and
-next; can you test with these patches instead of (or in addition to)
what's in -next?


2014-07-16 22:01:00

by Seth Jennings

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

On Wed, Jul 16, 2014 at 05:05:45PM -0400, Dan Streetman wrote:
> On Wed, Jul 16, 2014 at 4:59 PM, Seth Jennings <[email protected]> wrote:
> > On Mon, Jul 14, 2014 at 02:10:42PM -0400, Dan Streetman wrote:
> >> Andrew, any thoughts on this latest version of the patch set? Let me
> >> know if I missed anything or you have any other suggestions.
> >>
> >> Seth, did you get a chance to review this and/or test it out?
> >
> > I did have a chance to test it out quickly and didn't run into any
> > issues. Your patchset is already in linux-next so I'll test more from
> > there.
>
> This latest version has a few changes that Andrew requested, which
> presumably will replace the patches that are currently in -mm and
> -next; can you test with these patches instead of (or in addition to)
> what's in -next?

Looks like Andrew just did the legwork for me to get the new patches
into mmotm. When they hit there (tomorrow?), I'll pull it down and test
with that.

Thanks,
Seth


2014-07-25 16:59:29

by Dan Streetman

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

Hey Seth,

have a chance to test yet?

On Wed, Jul 16, 2014 at 6:00 PM, Seth Jennings <[email protected]> wrote:
> On Wed, Jul 16, 2014 at 05:05:45PM -0400, Dan Streetman wrote:
>> On Wed, Jul 16, 2014 at 4:59 PM, Seth Jennings <[email protected]> wrote:
>> > On Mon, Jul 14, 2014 at 02:10:42PM -0400, Dan Streetman wrote:
>> >> Andrew, any thoughts on this latest version of the patch set? Let me
>> >> know if I missed anything or you have any other suggestions.
>> >>
>> >> Seth, did you get a chance to review this and/or test it out?
>> >
>> > I did have a chance to test it out quickly and didn't run into any
>> > issues. Your patchset is already in linux-next so I'll test more from
>> > there.
>>
>> This latest version has a few changes that Andrew requested, which
>> presumably will replace the patches that are currently in -mm and
>> -next; can you test with these patches instead of (or in addition to)
>> what's in -next?
>
> Looks like Andrew just did the legwork for me to get the new patches
> > into mmotm. When they hit there (tomorrow?), I'll pull it down and test
> with that.
>
> Thanks,
> Seth

2014-07-28 20:41:01

by Seth Jennings

[permalink] [raw]
Subject: Re: [PATCHv5 0/4] mm/zpool: add common api for zswap to use zbud/zsmalloc

On Fri, Jul 25, 2014 at 12:59:05PM -0400, Dan Streetman wrote:
> Hey Seth,
>
> have a chance to test yet?

Sorry for the delay. Things have been crazy.

Just pulled down -next and tested with an 8-thread kernel build on a
machine restricted to 512MB of RAM. It thrashed pretty well but completed
without any problems, with the compressed pool mitigating the swapping
for the most part. I ran swapoff at the end and all the statistics were
sane.

Looks stable afaict. Good work! :)

Again sorry about dragging my feet on this.

Thanks,
Seth
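
For anyone who wants to rerun this kind of check, a rough recipe along
the lines Seth describes; the swap device and exact commands here are
illustrative placeholders, not his actual setup:

    # boot with: mem=512M zswap.enabled=1 zswap.zpool=zsmalloc
    swapon /dev/sdb2     # placeholder swap device
    make -j8             # 8-thread kernel build, to force swapping
    grep . /sys/kernel/debug/zswap/*   # pool_total_size, stored_pages, ...
    swapoff -a           # statistics should unwind cleanly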
