zbud is a memory allocator for storing compressed data pages. It keeps
two data objects of arbitrary size on a single page. This simple design
provides very deterministic behavior on reclamation, which is one of
the reasons why zswap selected zbud as its default allocator over
zsmalloc.

Unlike zsmalloc, however, zbud does not support highmem. This is
problematic especially on 32-bit machines with relatively small
lowmem. Compressing anonymous pages from highmem and storing them in
lowmem can eat up lowmem.

This limitation is due to the fact that zbud keeps its internal data
structures in a zbud_header held at the head of each zbud page. For
example, zbud pages are tracked by several lists and carry some status
information, which may be referenced by the kernel at any time. Thus,
zbud pages must be allocated from a directly mapped memory region,
i.e. lowmem.
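
For reference, all of this metadata currently lives in a small header
occupying the first chunk of every zbud page, which the series below
dismantles field by field:

    struct zbud_header {
            struct list_head buddy;        /* unbuddied/buddied list links */
            struct list_head lru;          /* pool LRU links */
            unsigned int first_chunks;     /* size of first buddy, 0 if free */
            unsigned int last_chunks;      /* size of last buddy, 0 if free */
            bool under_reclaim;            /* protects page during eviction */
    };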

After some digging, I found that the internal data structures of zbud
can be kept in the struct page, the same way zsmalloc does. So, this
series moves all fields of zbud_header out into struct page. Although
it touches quite a lot of code, it introduces no functional differences
except highmem support. I hope that this kind of modification, reusing
several fields of struct page, is acceptable.

Heesub Shin (9):
mm/zbud: tidy up a bit
mm/zbud: remove buddied list from zbud_pool
mm/zbud: remove lru from zbud_header
mm/zbud: remove first|last_chunks from zbud_header
mm/zbud: encode zbud handle using struct page
mm/zbud: remove list_head for buddied list from zbud_header
mm/zbud: drop zbud_header
mm/zbud: allow clients to use highmem pages
mm/zswap: use highmem pages for compressed pool
mm/zbud.c | 244 ++++++++++++++++++++++++++++++-------------------------------
mm/zswap.c | 4 +-
2 files changed, 121 insertions(+), 127 deletions(-)
--
1.9.1
As preparation for further patches, this patch changes the way a zbud
handle is encoded. Currently, a zbud handle is actually just a virtual
address cast to unsigned long before being returned. Exporting the
address to clients would be inappropriate once we use highmem pages for
zbud pages, which the following patches implement.

Change the zbud handle to a struct page pointer with the least
significant bit indicating the first or last buddy. All other
information is hidden in the struct page.
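
Since struct page is always at least word-aligned, bit 0 of its address
is guaranteed to be zero and can carry the buddy tag. A minimal sketch
of the round trip (assuming FIRST == 0 and LAST == 1, as in the
existing enum buddy):

    unsigned long handle = (unsigned long) page | LAST; /* tag: last buddy */
    struct page *p = (struct page *) (handle & ~LAST);  /* strip the tag */
    int last = (handle & LAST) == LAST;                 /* query the tag */
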
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 50 ++++++++++++++++++++++++++++----------------------
1 file changed, 28 insertions(+), 22 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 193ea4f..383bab0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -240,35 +240,32 @@ static void free_zbud_page(struct zbud_header *zhdr)
__free_page(virt_to_page(zhdr));
}
+static int is_last_chunk(unsigned long handle)
+{
+ return (handle & LAST) == LAST;
+}
+
/*
* Encodes the handle of a particular buddy within a zbud page
* Pool lock should be held as this function accesses first|last_chunks
*/
-static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
+static unsigned long encode_handle(struct page *page, enum buddy bud)
{
- unsigned long handle;
- struct page *page = virt_to_page(zhdr);
+ return (unsigned long) page | bud;
+}
- /*
- * For now, the encoded handle is actually just the pointer to the data
- * but this might not always be the case. A little information hiding.
- * Add CHUNK_SIZE to the handle if it is the first allocation to jump
- * over the zbud header in the first chunk.
- */
- handle = (unsigned long)zhdr;
- if (bud == FIRST)
- /* skip over zbud header */
- handle += ZHDR_SIZE_ALIGNED;
- else /* bud == LAST */
- handle += PAGE_SIZE -
- (get_num_chunks(page, LAST) << CHUNK_SHIFT);
- return handle;
+/* Returns struct page of the zbud page where a given handle is stored */
+static struct page *handle_to_zbud_page(unsigned long handle)
+{
+ return (struct page *) (handle & ~LAST);
}
/* Returns the zbud page where a given handle is stored */
static struct zbud_header *handle_to_zbud_header(unsigned long handle)
{
- return (struct zbud_header *)(handle & PAGE_MASK);
+ struct page *page = handle_to_zbud_page(handle);
+
+ return page_address(page);
}
/* Returns the number of free chunks in a zbud page */
@@ -395,7 +392,7 @@ found:
list_del(&page->lru);
list_add(&page->lru, &pool->lru);
- *handle = encode_handle(zhdr, bud);
+ *handle = encode_handle(page, bud);
spin_unlock(&pool->lock);
return 0;
@@ -514,9 +511,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
first_handle = 0;
last_handle = 0;
if (get_num_chunks(page, FIRST))
- first_handle = encode_handle(zhdr, FIRST);
+ first_handle = encode_handle(page, FIRST);
if (get_num_chunks(page, LAST))
- last_handle = encode_handle(zhdr, LAST);
+ last_handle = encode_handle(page, LAST);
spin_unlock(&pool->lock);
/* Issue the eviction callback(s) */
@@ -570,7 +567,16 @@ next:
*/
void *zbud_map(struct zbud_pool *pool, unsigned long handle)
{
- return (void *)(handle);
+ size_t offset;
+ struct page *page = handle_to_zbud_page(handle);
+
+ if (is_last_chunk(handle))
+ offset = PAGE_SIZE -
+ (get_num_chunks(page, LAST) << CHUNK_SHIFT);
+ else
+ offset = ZHDR_SIZE_ALIGNED;
+
+ return (unsigned char *) page_address(page) + offset;
}
/**
--
1.9.1
There's no point in having the _buddied_ list of zbud pages, as nobody
refers to it. Tracking it only adds runtime overhead, so let's remove it.
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 17 +++--------------
1 file changed, 3 insertions(+), 14 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 6f36394..0f5add0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -79,8 +79,6 @@
* @unbuddied: array of lists tracking zbud pages that only contain one buddy;
* the lists each zbud page is added to depends on the size of
* its free region.
- * @buddied: list tracking the zbud pages that contain two buddies;
- * these zbud pages are full
* @lru: list tracking the zbud pages in LRU order by most recently
* added buddy.
* @pages_nr: number of zbud pages in the pool.
@@ -93,7 +91,6 @@
struct zbud_pool {
spinlock_t lock;
struct list_head unbuddied[NCHUNKS];
- struct list_head buddied;
struct list_head lru;
u64 pages_nr;
struct zbud_ops *ops;
@@ -102,7 +99,7 @@ struct zbud_pool {
/*
* struct zbud_header - zbud page metadata occupying the first chunk of each
* zbud page.
- * @buddy: links the zbud page into the unbuddied/buddied lists in the pool
+ * @buddy: links the zbud page into the unbuddied lists in the pool
* @lru: links the zbud page into the lru list in the pool
* @first_chunks: the size of the first buddy in chunks, 0 if free
* @last_chunks: the size of the last buddy in chunks, 0 if free
@@ -299,7 +296,6 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops)
spin_lock_init(&pool->lock);
for_each_unbuddied_list(i, 0)
INIT_LIST_HEAD(&pool->unbuddied[i]);
- INIT_LIST_HEAD(&pool->buddied);
INIT_LIST_HEAD(&pool->lru);
pool->pages_nr = 0;
pool->ops = ops;
@@ -383,9 +379,6 @@ found:
/* Add to unbuddied list */
freechunks = num_free_chunks(zhdr);
list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
- } else {
- /* Add to buddied list */
- list_add(&zhdr->buddy, &pool->buddied);
}
/* Add/move zbud page to beginning of LRU */
@@ -429,10 +422,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
return;
}
- /* Remove from existing buddy list */
- list_del(&zhdr->buddy);
-
if (num_free_chunks(zhdr) == NCHUNKS) {
+ /* Remove from existing unbuddied list */
+ list_del(&zhdr->buddy);
/* zbud page is empty, free */
list_del(&zhdr->lru);
free_zbud_page(zhdr);
@@ -542,9 +534,6 @@ next:
/* add to unbuddied list */
freechunks = num_free_chunks(zhdr);
list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
- } else {
- /* add to buddied list */
- list_add(&zhdr->buddy, &pool->buddied);
}
/* add to beginning of LRU */
--
1.9.1
The zbud allocator links _unbuddied_ zbud pages into lists in the pool.
When it tries to allocate space, these lists are searched first for the
best possible fit. Thus, the current implementation has a list_head in
the zbud_header structure to construct the lists.

This patch simulates a list_head using the second double word of struct
page instead, so the list_head in zbud_header can be eliminated. Using
the _index and _mapcount fields (and _count as well on 64-bit machines)
of struct page for list management looks a bit odd, but there is no
better option for now, considering that page->lru is already in use.
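
Concretely, the address of page->index is treated as if it were an
embedded list_head; a sketch of the pattern the diff below applies
throughout (pool and freechunks as in the existing code):

    INIT_LIST_HEAD((struct list_head *) &page->index);
    list_add((struct list_head *) &page->index,
             &pool->unbuddied[freechunks]);
    /* and back from the first list node to its page: */
    page = list_entry((unsigned long *) pool->unbuddied[freechunks].next,
                      struct page, index);
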
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 383bab0..8a6dd6b 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -99,10 +99,8 @@ struct zbud_pool {
/*
* struct zbud_header - zbud page metadata occupying the first chunk of each
* zbud page.
- * @buddy: links the zbud page into the unbuddied lists in the pool
*/
struct zbud_header {
- struct list_head buddy;
bool under_reclaim;
};
@@ -223,21 +221,24 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
/* Initializes the zbud header of a newly allocated zbud page */
-static struct zbud_header *init_zbud_page(struct page *page)
+static void init_zbud_page(struct page *page)
{
struct zbud_header *zhdr = page_address(page);
set_num_chunks(page, FIRST, 0);
set_num_chunks(page, LAST, 0);
- INIT_LIST_HEAD(&zhdr->buddy);
+ INIT_LIST_HEAD((struct list_head *) &page->index);
INIT_LIST_HEAD(&page->lru);
zhdr->under_reclaim = 0;
- return zhdr;
}
/* Resets the struct page fields and frees the page */
static void free_zbud_page(struct zbud_header *zhdr)
{
- __free_page(virt_to_page(zhdr));
+ struct page *page = virt_to_page(zhdr);
+
+ init_page_count(page);
+ page_mapcount_reset(page);
+ __free_page(page);
}
static int is_last_chunk(unsigned long handle)
@@ -341,7 +342,6 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
unsigned long *handle)
{
int chunks, i, freechunks;
- struct zbud_header *zhdr = NULL;
enum buddy bud;
struct page *page;
@@ -355,10 +355,9 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
/* First, try to find an unbuddied zbud page. */
for_each_unbuddied_list(i, chunks) {
if (!list_empty(&pool->unbuddied[i])) {
- zhdr = list_first_entry(&pool->unbuddied[i],
- struct zbud_header, buddy);
- page = virt_to_page(zhdr);
- list_del(&zhdr->buddy);
+ page = list_entry((unsigned long *)
+ pool->unbuddied[i].next, struct page, index);
+ list_del((struct list_head *) &page->index);
goto found;
}
}
@@ -370,7 +369,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
return -ENOMEM;
spin_lock(&pool->lock);
pool->pages_nr++;
- zhdr = init_zbud_page(page);
+ init_zbud_page(page);
found:
if (get_num_chunks(page, FIRST) == 0)
@@ -384,7 +383,8 @@ found:
get_num_chunks(page, LAST) == 0) {
/* Add to unbuddied list */
freechunks = num_free_chunks(page);
- list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+ list_add((struct list_head *) &page->index,
+ &pool->unbuddied[freechunks]);
}
/* Add/move zbud page to beginning of LRU */
@@ -433,14 +433,15 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
freechunks = num_free_chunks(page);
if (freechunks == NCHUNKS) {
/* Remove from existing unbuddied list */
- list_del(&zhdr->buddy);
+ list_del((struct list_head *) &page->index);
/* zbud page is empty, free */
list_del(&page->lru);
free_zbud_page(zhdr);
pool->pages_nr--;
} else {
/* Add to unbuddied list */
- list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+ list_add((struct list_head *) &page->index,
+ &pool->unbuddied[freechunks]);
}
spin_unlock(&pool->lock);
@@ -501,7 +502,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
page = list_tail_entry(&pool->lru, struct page, lru);
zhdr = page_address(page);
list_del(&page->lru);
- list_del(&zhdr->buddy);
+ list_del((struct list_head *) &page->index);
/* Protect zbud page against free */
zhdr->under_reclaim = true;
/*
@@ -543,7 +544,8 @@ next:
} else if (get_num_chunks(page, FIRST) == 0 ||
get_num_chunks(page, LAST) == 0) {
/* add to unbuddied list */
- list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+ list_add((struct list_head *) &page->index,
+ &pool->unbuddied[freechunks]);
}
/* add to beginning of LRU */
--
1.9.1
Now that zbud supports highmem, storing compressed anonymous pages in
highmem looks more reasonable. So, pass the __GFP_HIGHMEM flag to zpool
when zswap allocates memory from it.
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zswap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/zswap.c b/mm/zswap.c
index ea064c1..eaabe95 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -684,8 +684,8 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* store */
len = dlen + sizeof(struct zswap_header);
- ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
- &handle);
+ ret = zpool_malloc(zswap_pool, len,
+ __GFP_NORETRY | __GFP_NOWARN | __GFP_HIGHMEM, &handle);
if (ret == -ENOSPC) {
zswap_reject_compress_poor++;
goto freepage;
--
1.9.1
Now that all fields of zbud's internal data structures have moved to
struct page, there is no reason to restrict zbud pages to lowmem. This
patch allows highmem pages to be used for zbud pages. Pages from
highmem are mapped using kmap_atomic() before being accessed.
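
One consequence for clients: zbud_map() may now return a kmap_atomic()
address, so the mapping must stay short-lived and atomic. A usage
sketch (pool, handle, src and len assumed to be set up by the caller):

    void *dst = zbud_map(pool, handle);  /* may be a kmap_atomic() address */
    memcpy(dst, src, len);               /* no sleeping while mapped */
    zbud_unmap(pool, handle);            /* kunmap_atomic() + put_cpu_var() */
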
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 5a392f3..677fdc1 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -52,6 +52,7 @@
#include <linux/spinlock.h>
#include <linux/zbud.h>
#include <linux/zpool.h>
+#include <linux/highmem.h>
/*****************
* Structures
@@ -94,6 +95,9 @@ struct zbud_pool {
struct zbud_ops *ops;
};
+/* per-cpu mapping addresses of kmap_atomic()'ed zbud pages */
+static DEFINE_PER_CPU(void *, zbud_mapping);
+
/*****************
* zpool
****************/
@@ -310,9 +314,6 @@ void zbud_destroy_pool(struct zbud_pool *pool)
* performed first. If no suitable free region is found, then a new page is
* allocated and added to the pool to satisfy the request.
*
- * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used
- * as zbud pool pages.
- *
* Return: 0 if success and handle is set, otherwise -EINVAL if the size or
* gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
* a new page.
@@ -324,7 +325,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
enum buddy bud;
struct page *page;
- if (!size || (gfp & __GFP_HIGHMEM))
+ if (!size)
return -EINVAL;
if (size > PAGE_SIZE - CHUNK_SIZE)
return -ENOSPC;
@@ -543,14 +544,24 @@ next:
*/
void *zbud_map(struct zbud_pool *pool, unsigned long handle)
{
+ void **mapping;
size_t offset = 0;
struct page *page = handle_to_zbud_page(handle);
+ /*
+ * Because we use per-cpu mapping shared among the pools/users,
+ * we can't allow mapping in interrupt context because it can
+ * corrupt another users mappings.
+ */
+ BUG_ON(in_interrupt());
+
if (is_last_chunk(handle))
offset = PAGE_SIZE -
(get_num_chunks(page, LAST) << CHUNK_SHIFT);
- return (unsigned char *) page_address(page) + offset;
+ mapping = &get_cpu_var(zbud_mapping);
+ *mapping = kmap_atomic(page);
+ return (char *) *mapping + offset;
}
/**
@@ -560,6 +571,10 @@ void *zbud_map(struct zbud_pool *pool, unsigned long handle)
*/
void zbud_unmap(struct zbud_pool *pool, unsigned long handle)
{
+ void **mapping = this_cpu_ptr(&zbud_mapping);
+
+ kunmap_atomic(*mapping);
+ put_cpu_var(zbud_mapping);
}
/**
--
1.9.1
zbud_pool has an LRU list for tracking zbud pages, which are strung
together via zhdr->lru. If we reuse page->lru for linking zbud pages
instead, the lru field in zbud_header can be dropped.
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 0f5add0..a2390f6 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,11 @@ struct zbud_pool {
* struct zbud_header - zbud page metadata occupying the first chunk of each
* zbud page.
* @buddy: links the zbud page into the unbuddied lists in the pool
- * @lru: links the zbud page into the lru list in the pool
* @first_chunks: the size of the first buddy in chunks, 0 if free
* @last_chunks: the size of the last buddy in chunks, 0 if free
*/
struct zbud_header {
struct list_head buddy;
- struct list_head lru;
unsigned int first_chunks;
unsigned int last_chunks;
bool under_reclaim;
@@ -224,7 +222,7 @@ static struct zbud_header *init_zbud_page(struct page *page)
zhdr->first_chunks = 0;
zhdr->last_chunks = 0;
INIT_LIST_HEAD(&zhdr->buddy);
- INIT_LIST_HEAD(&zhdr->lru);
+ INIT_LIST_HEAD(&page->lru);
zhdr->under_reclaim = 0;
return zhdr;
}
@@ -352,6 +350,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
if (!list_empty(&pool->unbuddied[i])) {
zhdr = list_first_entry(&pool->unbuddied[i],
struct zbud_header, buddy);
+ page = virt_to_page(zhdr);
list_del(&zhdr->buddy);
goto found;
}
@@ -382,9 +381,9 @@ found:
}
/* Add/move zbud page to beginning of LRU */
- if (!list_empty(&zhdr->lru))
- list_del(&zhdr->lru);
- list_add(&zhdr->lru, &pool->lru);
+ if (!list_empty(&page->lru))
+ list_del(&page->lru);
+ list_add(&page->lru, &pool->lru);
*handle = encode_handle(zhdr, bud);
spin_unlock(&pool->lock);
@@ -405,10 +404,12 @@ found:
void zbud_free(struct zbud_pool *pool, unsigned long handle)
{
struct zbud_header *zhdr;
+ struct page *page;
int freechunks;
spin_lock(&pool->lock);
zhdr = handle_to_zbud_header(handle);
+ page = virt_to_page(zhdr);
/* If first buddy, handle will be page aligned */
if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
@@ -426,7 +427,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
/* Remove from existing unbuddied list */
list_del(&zhdr->buddy);
/* zbud page is empty, free */
- list_del(&zhdr->lru);
+ list_del(&page->lru);
free_zbud_page(zhdr);
pool->pages_nr--;
} else {
@@ -479,6 +480,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
{
int i, ret, freechunks;
+ struct page *page;
struct zbud_header *zhdr;
unsigned long first_handle, last_handle;
@@ -489,8 +491,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
return -EINVAL;
}
for (i = 0; i < retries; i++) {
- zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru);
- list_del(&zhdr->lru);
+ page = list_tail_entry(&pool->lru, struct page, lru);
+ zhdr = page_address(page);
+ list_del(&page->lru);
list_del(&zhdr->buddy);
/* Protect zbud page against free */
zhdr->under_reclaim = true;
@@ -537,7 +540,7 @@ next:
}
/* add to beginning of LRU */
- list_add(&zhdr->lru, &pool->lru);
+ list_add(&page->lru, &pool->lru);
}
spin_unlock(&pool->lock);
return -EAGAIN;
--
1.9.1
The sizes of the first and last buddies are stored in the first_chunks
and last_chunks fields of struct zbud_header, respectively. Put them
into page->private instead of zbud_header.
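
The two counts are packed as the 16-bit halves of page->private,
FIRST (== 0) in bits 0-15 and LAST (== 1) in bits 16-31. For example,
with the accessors added below:

    set_num_chunks(page, FIRST, 3);  /* page->private == 0x00000003 */
    set_num_chunks(page, LAST, 5);   /* page->private == 0x00050003 */
    get_num_chunks(page, LAST);      /* (0x00050003 >> 16) & 0xffff == 5 */
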
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 62 ++++++++++++++++++++++++++++++++++++--------------------------
1 file changed, 36 insertions(+), 26 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index a2390f6..193ea4f 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,9 @@ struct zbud_pool {
* struct zbud_header - zbud page metadata occupying the first chunk of each
* zbud page.
* @buddy: links the zbud page into the unbuddied lists in the pool
- * @first_chunks: the size of the first buddy in chunks, 0 if free
- * @last_chunks: the size of the last buddy in chunks, 0 if free
*/
struct zbud_header {
struct list_head buddy;
- unsigned int first_chunks;
- unsigned int last_chunks;
bool under_reclaim;
};
@@ -212,6 +208,17 @@ static int size_to_chunks(size_t size)
return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;
}
+static void set_num_chunks(struct page *page, enum buddy bud, size_t chunks)
+{
+ page->private = (page->private & (0xffff << (16 * !bud))) |
+ ((chunks & 0xffff) << (16 * bud));
+}
+
+static size_t get_num_chunks(struct page *page, enum buddy bud)
+{
+ return (page->private >> (16 * bud)) & 0xffff;
+}
+
#define for_each_unbuddied_list(_iter, _begin) \
for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
@@ -219,8 +226,8 @@ static int size_to_chunks(size_t size)
static struct zbud_header *init_zbud_page(struct page *page)
{
struct zbud_header *zhdr = page_address(page);
- zhdr->first_chunks = 0;
- zhdr->last_chunks = 0;
+ set_num_chunks(page, FIRST, 0);
+ set_num_chunks(page, LAST, 0);
INIT_LIST_HEAD(&zhdr->buddy);
INIT_LIST_HEAD(&page->lru);
zhdr->under_reclaim = 0;
@@ -240,6 +247,7 @@ static void free_zbud_page(struct zbud_header *zhdr)
static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
{
unsigned long handle;
+ struct page *page = virt_to_page(zhdr);
/*
* For now, the encoded handle is actually just the pointer to the data
@@ -252,7 +260,8 @@ static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
/* skip over zbud header */
handle += ZHDR_SIZE_ALIGNED;
else /* bud == LAST */
- handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+ handle += PAGE_SIZE -
+ (get_num_chunks(page, LAST) << CHUNK_SHIFT);
return handle;
}
@@ -263,13 +272,14 @@ static struct zbud_header *handle_to_zbud_header(unsigned long handle)
}
/* Returns the number of free chunks in a zbud page */
-static int num_free_chunks(struct zbud_header *zhdr)
+static int num_free_chunks(struct page *page)
{
/*
* Rather than branch for different situations, just use the fact that
* free buddies have a length of zero to simplify everything.
*/
- return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
+ return NCHUNKS - get_num_chunks(page, FIRST)
+ - get_num_chunks(page, LAST);
}
/*****************
@@ -366,17 +376,17 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
zhdr = init_zbud_page(page);
found:
- if (zhdr->first_chunks == 0) {
- zhdr->first_chunks = chunks;
+ if (get_num_chunks(page, FIRST) == 0)
bud = FIRST;
- } else {
- zhdr->last_chunks = chunks;
+ else
bud = LAST;
- }
- if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
+ set_num_chunks(page, bud, chunks);
+
+ if (get_num_chunks(page, FIRST) == 0 ||
+ get_num_chunks(page, LAST) == 0) {
/* Add to unbuddied list */
- freechunks = num_free_chunks(zhdr);
+ freechunks = num_free_chunks(page);
list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
}
@@ -413,9 +423,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
/* If first buddy, handle will be page aligned */
if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
- zhdr->last_chunks = 0;
+ set_num_chunks(page, LAST, 0);
else
- zhdr->first_chunks = 0;
+ set_num_chunks(page, FIRST, 0);
if (zhdr->under_reclaim) {
/* zbud page is under reclaim, reclaim will free */
@@ -423,7 +433,8 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
return;
}
- if (num_free_chunks(zhdr) == NCHUNKS) {
+ freechunks = num_free_chunks(page);
+ if (freechunks == NCHUNKS) {
/* Remove from existing unbuddied list */
list_del(&zhdr->buddy);
/* zbud page is empty, free */
@@ -432,7 +443,6 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
pool->pages_nr--;
} else {
/* Add to unbuddied list */
- freechunks = num_free_chunks(zhdr);
list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
}
@@ -503,9 +513,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
*/
first_handle = 0;
last_handle = 0;
- if (zhdr->first_chunks)
+ if (get_num_chunks(page, FIRST))
first_handle = encode_handle(zhdr, FIRST);
- if (zhdr->last_chunks)
+ if (get_num_chunks(page, LAST))
last_handle = encode_handle(zhdr, LAST);
spin_unlock(&pool->lock);
@@ -523,7 +533,8 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
next:
spin_lock(&pool->lock);
zhdr->under_reclaim = false;
- if (num_free_chunks(zhdr) == NCHUNKS) {
+ freechunks = num_free_chunks(page);
+ if (freechunks == NCHUNKS) {
/*
* Both buddies are now free, free the zbud page and
* return success.
@@ -532,10 +543,9 @@ next:
pool->pages_nr--;
spin_unlock(&pool->lock);
return 0;
- } else if (zhdr->first_chunks == 0 ||
- zhdr->last_chunks == 0) {
+ } else if (get_num_chunks(page, FIRST) == 0 ||
+ get_num_chunks(page, LAST) == 0) {
/* add to unbuddied list */
- freechunks = num_free_chunks(zhdr);
list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
}
--
1.9.1
For aesthetics, add a blank line between functions, remove useless
initialization statements, and simplify the code a bit. No functional
differences are introduced.
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index ecf1dbe..6f36394 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -145,6 +145,7 @@ static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp,
{
return zbud_alloc(pool, size, gfp, handle);
}
+
static void zbud_zpool_free(void *pool, unsigned long handle)
{
zbud_free(pool, handle);
@@ -174,6 +175,7 @@ static void *zbud_zpool_map(void *pool, unsigned long handle,
{
return zbud_map(pool, handle);
}
+
static void zbud_zpool_unmap(void *pool, unsigned long handle)
{
zbud_unmap(pool, handle);
@@ -350,16 +352,11 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
spin_lock(&pool->lock);
/* First, try to find an unbuddied zbud page. */
- zhdr = NULL;
for_each_unbuddied_list(i, chunks) {
if (!list_empty(&pool->unbuddied[i])) {
zhdr = list_first_entry(&pool->unbuddied[i],
struct zbud_header, buddy);
list_del(&zhdr->buddy);
- if (zhdr->first_chunks == 0)
- bud = FIRST;
- else
- bud = LAST;
goto found;
}
}
@@ -372,13 +369,15 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
spin_lock(&pool->lock);
pool->pages_nr++;
zhdr = init_zbud_page(page);
- bud = FIRST;
found:
- if (bud == FIRST)
+ if (zhdr->first_chunks == 0) {
zhdr->first_chunks = chunks;
- else
+ bud = FIRST;
+ } else {
zhdr->last_chunks = chunks;
+ bud = LAST;
+ }
if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
/* Add to unbuddied list */
@@ -433,7 +432,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
/* Remove from existing buddy list */
list_del(&zhdr->buddy);
- if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+ if (num_free_chunks(zhdr) == NCHUNKS) {
/* zbud page is empty, free */
list_del(&zhdr->lru);
free_zbud_page(zhdr);
@@ -489,7 +488,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
{
int i, ret, freechunks;
struct zbud_header *zhdr;
- unsigned long first_handle = 0, last_handle = 0;
+ unsigned long first_handle, last_handle;
spin_lock(&pool->lock);
if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) ||
@@ -529,7 +528,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
next:
spin_lock(&pool->lock);
zhdr->under_reclaim = false;
- if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+ if (num_free_chunks(zhdr) == NCHUNKS) {
/*
* Both buddies are now free, free the zbud page and
* return success.
--
1.9.1
Now that the only field left in zbud_header is .under_reclaim, get it
out of the struct and let the PG_reclaim bit in page->flags take over.
As a result of this change, we can finally eliminate struct zbud_header,
and hence all of zbud's internal data lives in struct page.
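
The replacement is one-to-one:

    SetPageReclaim(page);    /* was: zhdr->under_reclaim = true;  */
    PageReclaim(page);       /* was: zhdr->under_reclaim          */
    ClearPageReclaim(page);  /* was: zhdr->under_reclaim = false; */
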
Signed-off-by: Heesub Shin <[email protected]>
---
mm/zbud.c | 66 +++++++++++++++++----------------------------------------------
1 file changed, 18 insertions(+), 48 deletions(-)
diff --git a/mm/zbud.c b/mm/zbud.c
index 8a6dd6b..5a392f3 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -60,17 +60,15 @@
* NCHUNKS_ORDER determines the internal allocation granularity, effectively
* adjusting internal fragmentation. It also determines the number of
* freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the
- * allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk
- * in allocated page is occupied by zbud header, NCHUNKS will be calculated to
- * 63 which shows the max number of free chunks in zbud page, also there will be
- * 63 freelists per pool.
+ * allocation granularity will be in chunks of size PAGE_SIZE/64.
+ * NCHUNKS will be calculated to 64 which shows the max number of free
+ * chunks in zbud page, also there will be 64 freelists per pool.
*/
#define NCHUNKS_ORDER 6
#define CHUNK_SHIFT (PAGE_SHIFT - NCHUNKS_ORDER)
#define CHUNK_SIZE (1 << CHUNK_SHIFT)
-#define ZHDR_SIZE_ALIGNED CHUNK_SIZE
-#define NCHUNKS ((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
+#define NCHUNKS (PAGE_SIZE >> CHUNK_SHIFT)
/**
* struct zbud_pool - stores metadata for each zbud pool
@@ -96,14 +94,6 @@ struct zbud_pool {
struct zbud_ops *ops;
};
-/*
- * struct zbud_header - zbud page metadata occupying the first chunk of each
- * zbud page.
- */
-struct zbud_header {
- bool under_reclaim;
-};
-
/*****************
* zpool
****************/
@@ -220,22 +210,19 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
#define for_each_unbuddied_list(_iter, _begin) \
for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)
-/* Initializes the zbud header of a newly allocated zbud page */
+/* Initializes a newly allocated zbud page */
static void init_zbud_page(struct page *page)
{
- struct zbud_header *zhdr = page_address(page);
set_num_chunks(page, FIRST, 0);
set_num_chunks(page, LAST, 0);
INIT_LIST_HEAD((struct list_head *) &page->index);
INIT_LIST_HEAD(&page->lru);
- zhdr->under_reclaim = 0;
+ ClearPageReclaim(page);
}
/* Resets the struct page fields and frees the page */
-static void free_zbud_page(struct zbud_header *zhdr)
+static void free_zbud_page(struct page *page)
{
- struct page *page = virt_to_page(zhdr);
-
init_page_count(page);
page_mapcount_reset(page);
__free_page(page);
@@ -261,14 +248,6 @@ static struct page *handle_to_zbud_page(unsigned long handle)
return (struct page *) (handle & ~LAST);
}
-/* Returns the zbud page where a given handle is stored */
-static struct zbud_header *handle_to_zbud_header(unsigned long handle)
-{
- struct page *page = handle_to_zbud_page(handle);
-
- return page_address(page);
-}
-
/* Returns the number of free chunks in a zbud page */
static int num_free_chunks(struct page *page)
{
@@ -347,7 +326,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
if (!size || (gfp & __GFP_HIGHMEM))
return -EINVAL;
- if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+ if (size > PAGE_SIZE - CHUNK_SIZE)
return -ENOSPC;
chunks = size_to_chunks(size);
spin_lock(&pool->lock);
@@ -410,21 +389,18 @@ found:
*/
void zbud_free(struct zbud_pool *pool, unsigned long handle)
{
- struct zbud_header *zhdr;
struct page *page;
int freechunks;
spin_lock(&pool->lock);
- zhdr = handle_to_zbud_header(handle);
- page = virt_to_page(zhdr);
+ page = handle_to_zbud_page(handle);
- /* If first buddy, handle will be page aligned */
- if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
- set_num_chunks(page, LAST, 0);
- else
+ if (!is_last_chunk(handle))
set_num_chunks(page, FIRST, 0);
+ else
+ set_num_chunks(page, LAST, 0);
- if (zhdr->under_reclaim) {
+ if (PageReclaim(page)) {
/* zbud page is under reclaim, reclaim will free */
spin_unlock(&pool->lock);
return;
@@ -436,7 +412,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
list_del((struct list_head *) &page->index);
/* zbud page is empty, free */
list_del(&page->lru);
- free_zbud_page(zhdr);
+ free_zbud_page(page);
pool->pages_nr--;
} else {
/* Add to unbuddied list */
@@ -489,7 +465,6 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
{
int i, ret, freechunks;
struct page *page;
- struct zbud_header *zhdr;
unsigned long first_handle, last_handle;
spin_lock(&pool->lock);
@@ -500,11 +475,10 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
}
for (i = 0; i < retries; i++) {
page = list_tail_entry(&pool->lru, struct page, lru);
- zhdr = page_address(page);
list_del(&page->lru);
list_del((struct list_head *) &page->index);
/* Protect zbud page against free */
- zhdr->under_reclaim = true;
+ SetPageReclaim(page);
/*
* We need encode the handles before unlocking, since we can
* race with free that will set (first|last)_chunks to 0
@@ -530,14 +504,14 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
}
next:
spin_lock(&pool->lock);
- zhdr->under_reclaim = false;
+ ClearPageReclaim(page);
freechunks = num_free_chunks(page);
if (freechunks == NCHUNKS) {
/*
* Both buddies are now free, free the zbud page and
* return success.
*/
- free_zbud_page(zhdr);
+ free_zbud_page(page);
pool->pages_nr--;
spin_unlock(&pool->lock);
return 0;
@@ -569,14 +543,12 @@ next:
*/
void *zbud_map(struct zbud_pool *pool, unsigned long handle)
{
- size_t offset;
+ size_t offset = 0;
struct page *page = handle_to_zbud_page(handle);
if (is_last_chunk(handle))
offset = PAGE_SIZE -
(get_num_chunks(page, LAST) << CHUNK_SHIFT);
- else
- offset = ZHDR_SIZE_ALIGNED;
return (unsigned char *) page_address(page) + offset;
}
@@ -604,8 +576,6 @@ u64 zbud_get_pool_size(struct zbud_pool *pool)
static int __init init_zbud(void)
{
- /* Make sure the zbud header will fit in one chunk */
- BUILD_BUG_ON(sizeof(struct zbud_header) > ZHDR_SIZE_ALIGNED);
pr_info("loaded\n");
#ifdef CONFIG_ZPOOL
--
1.9.1
On Tue, Oct 14, 2014 at 7:59 AM, Heesub Shin <[email protected]> wrote:
> zbud is a memory allocator for storing compressed data pages. It keeps
> two data objects of arbitrary size on a single page. This simple design
> provides very deterministic behavior on reclamation, which is one of
> the reasons why zswap selected zbud as its default allocator over
> zsmalloc.
>
> Unlike zsmalloc, however, zbud does not support highmem. This is
> problematic especially on 32-bit machines with relatively small
> lowmem. Compressing anonymous pages from highmem and storing them in
> lowmem can eat up lowmem.
>
> This limitation is due to the fact that zbud keeps its internal data
> structures in a zbud_header held at the head of each zbud page. For
> example, zbud pages are tracked by several lists and carry some status
> information, which may be referenced by the kernel at any time. Thus,
> zbud pages must be allocated from a directly mapped memory region,
> i.e. lowmem.
>
> After some digging, I found that the internal data structures of zbud
> can be kept in the struct page, the same way zsmalloc does. So, this
> series moves all fields of zbud_header out into struct page. Although
> it touches quite a lot of code, it introduces no functional differences
> except highmem support. I hope that this kind of modification, reusing
> several fields of struct page, is acceptable.
Seth, have you had a chance to review this yet? I'm going to try to
take a look at it next week if you haven't yet. Letting zbud use
highmem would be a good thing.
On Thu, Oct 23, 2014 at 07:14:15PM -0400, Dan Streetman wrote:
> On Tue, Oct 14, 2014 at 7:59 AM, Heesub Shin <[email protected]> wrote:
> > zbud is a memory allocator for storing compressed data pages. It keeps
> > two data objects of arbitrary size on a single page. This simple design
> > provides very deterministic behavior on reclamation, which is one of
> > the reasons why zswap selected zbud as its default allocator over
> > zsmalloc.
> >
> > Unlike zsmalloc, however, zbud does not support highmem. This is
> > problematic especially on 32-bit machines with relatively small
> > lowmem. Compressing anonymous pages from highmem and storing them in
> > lowmem can eat up lowmem.
> >
> > This limitation is due to the fact that zbud keeps its internal data
> > structures in a zbud_header held at the head of each zbud page. For
> > example, zbud pages are tracked by several lists and carry some status
> > information, which may be referenced by the kernel at any time. Thus,
> > zbud pages must be allocated from a directly mapped memory region,
> > i.e. lowmem.
> >
> > After some digging, I found that the internal data structures of zbud
> > can be kept in the struct page, the same way zsmalloc does. So, this
> > series moves all fields of zbud_header out into struct page. Although
> > it touches quite a lot of code, it introduces no functional differences
> > except highmem support. I hope that this kind of modification, reusing
> > several fields of struct page, is acceptable.
>
> Seth, have you had a chance to review this yet? I'm going to try to
> take a look at it next week if you haven't yet. Letting zbud use
> highmem would be a good thing.
I have looked at it, and it looks sound to me. I seem to remember
having a comment on something, but I'll have to look back over
it. Haven't tested it yet.
Seth
On Tue, Oct 14, 2014 at 08:59:19PM +0900, Heesub Shin wrote:
> zbud is a memory allocator for storing compressed data pages. It keeps
> two data objects of arbitrary size on a single page. This simple design
> provides very deterministic behavior on reclamation, which is one of
> the reasons why zswap selected zbud as its default allocator over
> zsmalloc.
>
> Unlike zsmalloc, however, zbud does not support highmem. This is
> problematic especially on 32-bit machines with relatively small
> lowmem. Compressing anonymous pages from highmem and storing them in
> lowmem can eat up lowmem.
>
> This limitation is due to the fact that zbud keeps its internal data
> structures in a zbud_header held at the head of each zbud page. For
> example, zbud pages are tracked by several lists and carry some status
> information, which may be referenced by the kernel at any time. Thus,
> zbud pages must be allocated from a directly mapped memory region,
> i.e. lowmem.
>
> After some digging, I found that the internal data structures of zbud
> can be kept in the struct page, the same way zsmalloc does. So, this
> series moves all fields of zbud_header out into struct page. Although
> it touches quite a lot of code, it introduces no functional differences
> except highmem support. I hope that this kind of modification, reusing
> several fields of struct page, is acceptable.
Hi Heesub,
Sorry for the very late reply. The end of October was very busy for me.
A little history on zbud. I didn't put the metadata in the struct
page, even though I knew that was an option since we had done it with
zsmalloc. At the time, Andrew Morton had concerns about memmap walkers
getting messed up with unexpected values in the struct page fields. In
order to smooth zbud's acceptance, I decided to store the metadata
inline in the page itself.
Later, zsmalloc eventually got accepted, which basically gave the
impression that putting the metadata in the struct page was acceptable.
I have recently been looking at implementing compaction for zsmalloc,
but having the metadata in the struct page and having the handle
directly encode the PFN and offset of the data block prevents
transparent relocation of the data. zbud has a similar issue, as it
currently encodes the page address in the handle returned to the user
(which is also the limitation preventing the use of highmem pages).
I would like to implement compaction for zbud too and moving the
metadata into the struct page is going to work against that. In fact,
I'm looking at the option of converting the current zbud_header into a
per-allocation metadata structure, which would provide a layer of
indirection between zbud and the user, allowing for transparent
relocation and compaction.
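
Purely as a hypothetical sketch of that indirection (the names here
are invented for illustration, not code from any patch): the handle
would point at a small per-allocation descriptor rather than encoding
the page directly, so zbud could relocate the data and update the
descriptor behind the user's back:

    struct zbud_alloc {                  /* hypothetical */
            struct page *page;           /* updated when data is moved */
            unsigned int offset;         /* byte offset within the page */
    };
    /* handle == (unsigned long) alloc; zbud_map() would follow the
     * descriptor, so compaction can change page/offset safely. */
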
However, I do like the part about letting zbud use highmem pages.
I have something in mind that would allow highmem pages _and_ move
toward something that would support compaction. I'll see if I can put
it into code today.
Thanks,
Seth