2019-04-02 23:09:18

by Tobin C. Harding

Subject: [PATCH v5 0/7] mm: Use slab_list list_head instead of lru

Hi Andrew,

Here is the updated version of the series that caused the bug found by
the 0day test robot. To ease the load on your memory, this series is
aimed at replacing the following:

Original buggy series:

mm-remove-stale-comment-from-page-struct.patch
slub-use-slab_list-instead-of-lru.patch
slab-use-slab_list-instead-of-lru.patch
slob-use-slab_list-instead-of-lru.patch
slub-add-comments-to-endif-pre-processor-macros.patch
slob-respect-list_head-abstraction-layer.patch
list-add-function-list_rotate_to_front.patch

And the bug fix patch:

slob-only-use-list-functions-when-safe-to-do-so.patch

Applies cleanly on top of Linus' tree (tag: v5.1-rc3).

This series differs from the bug fix above by adding a separate return
parameter to slob_page_alloc() instead of using a double pointer. This
is defensive in case someone later adds code that accesses sp (struct
page *); it also makes the code easier to read/verify since it's less 'tricky'.

Tested by building and booting a kernel using the SLOB allocator and
with CONFIG_DEBUG_LIST.

From v4 ...

Currently the slab allocators (ab)use the struct page 'lru' list_head.
We have a list head for slab allocators to use, 'slab_list'.

During v2 it was noted by Christoph that the SLOB allocator was reaching
into a list_head; this version adds two patches to the front of the set
to fix that.

Clean up all three allocators by using the 'slab_list' list_head instead
of overloading the 'lru' list_head.

Changes since v4:

- Add return parameter to slob_page_alloc() to indicate whether the
page is removed from the freelist during allocation.
- Only do list rotate optimisation if the page was _not_ removed from
the freelist (fix bug found by 0day test robot).

Changes since v3:

- Change all ->lru to ->slab_list in slob (thanks Roman).

Changes since v2:

- Add list_rotate_to_front().
- Fix slob to use list_head API.
- Re-order patches to put the list.h changes up front.
- Add acks from Christoph.

Changes since v1:

- Verify object files are the same before and after the patch set is
applied (suggested by Matthew).
- Add extra explanation to the commit logs explaining why these changes
are safe to make (suggested by Roman).
- Remove stale comment (thanks Willy).


thanks,
Tobin.


Tobin C. Harding (7):
list: Add function list_rotate_to_front()
slob: Respect list_head abstraction layer
slob: Use slab_list instead of lru
slub: Add comments to endif pre-processor macros
slub: Use slab_list instead of lru
slab: Use slab_list instead of lru
mm: Remove stale comment from page struct

include/linux/list.h | 18 ++++++++++++
include/linux/mm_types.h | 2 +-
mm/slab.c | 49 ++++++++++++++++----------------
mm/slob.c | 59 +++++++++++++++++++++++++++------------
mm/slub.c | 60 ++++++++++++++++++++--------------------
5 files changed, 115 insertions(+), 73 deletions(-)

--
2.21.0


2019-04-02 23:07:29

by Tobin C. Harding

Subject: [PATCH v5 1/7] list: Add function list_rotate_to_front()

Currently if we wish to rotate a list until a specific item is at the
front of the list we can call list_move_tail(head, list). Note that the
arguments are the reverse way to the usual use of list_move_tail(list,
head). This is a hack, it depends on the developer knowing how the
list_head operates internally which violates the layer of abstraction
offered by the list_head. Also, it is not intuitive so the next
developer to come along must study list.h in order to fully understand
what is meant by the call, while this is 'good for' the developer it
makes reading the code harder. We should have a function appropriately
named that does this if there are users for it in tree.

By grep'ing the tree for list_move_tail() and list_tail() and attempting
to guess the argument order from the names it seems there is only one
place currently in the tree that does this - the slob allocator.

Add function list_rotate_to_front() to rotate a list until the specified
item is at the front of the list.
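
For illustration, a minimal before/after sketch using placeholder names
@head (the list head) and @item (the entry that should end up at the
front); this is not code taken from the patch:

    /* Before: relies on knowing that moving the list head to the
     * tail of @item is equivalent to rotating the list. */
    list_move_tail(head, item);

    /* After: the intent is explicit and stays within the list_head
     * abstraction. */
    list_rotate_to_front(item, head);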

Signed-off-by: Tobin C. Harding <[email protected]>
---
include/linux/list.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index 58aa3adf94e6..9e9a6403dbe4 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -270,6 +270,24 @@ static inline void list_rotate_left(struct list_head *head)
}
}

+/**
+ * list_rotate_to_front() - Rotate list to specific item.
+ * @list: The desired new front of the list.
+ * @head: The head of the list.
+ *
+ * Rotates list so that @list becomes the new front of the list.
+ */
+static inline void list_rotate_to_front(struct list_head *list,
+ struct list_head *head)
+{
+ /*
+ * Deletes the list head from the list denoted by @head and
+ * places it as the tail of @list, this effectively rotates the
+ * list so that @list is at the front.
+ */
+ list_move_tail(head, list);
+}
+
/**
* list_is_singular - tests whether a list has just one entry.
* @head: the list to test.
--
2.21.0

2019-04-02 23:07:37

by Tobin C. Harding

Subject: [PATCH v5 2/7] slob: Respect list_head abstraction layer

Currently we reach inside the list_head. This is a violation of the
layer of abstraction provided by the list_head. It makes the code
fragile. More importantly it makes the code wicked hard to understand.

The code reaches into the list_head structure to counteract the fact
that the list _may_ have been changed during slob_page_alloc(). Instead
of this we can add a return parameter to slob_page_alloc() to signal
that the list was modified (list_del() called with page->lru to remove
page from the freelist).

This code is concerned with an optimisation that counters the tendency
for first fit allocation algorithm to fragment memory into many small
chunks at the front of the memory pool. Since the page is only removed
from the list when an allocation uses _all_ the remaining memory in the
page then in this special case fragmentation does not occur and we
therefore do not need the optimisation.

Add a return parameter to slob_page_alloc() to signal that the
allocation used up the whole page and that the page was removed from the
free list. After calling slob_page_alloc() check the return value just
added and only attempt optimisation if the page is still on the list.

Use list_head API instead of reaching into the list_head structure to
check if sp is at the front of the list.

Signed-off-by: Tobin C. Harding <[email protected]>
---
mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/mm/slob.c b/mm/slob.c
index 307c2c9feb44..07356e9feaaa 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
}

/*
- * Allocate a slob block within a given slob_page sp.
+ * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
+ * @sp: Page to look in.
+ * @size: Size of the allocation.
+ * @align: Allocation alignment.
+ * @page_removed_from_list: Return parameter.
+ *
+ * Tries to find a chunk of memory at least @size bytes big within @page.
+ *
+ * Return: Pointer to memory if allocated, %NULL otherwise. If the
+ * allocation fills up @page then the page is removed from the
+ * freelist, in this case @page_removed_from_list will be set to
+ * true (set to false otherwise).
*/
-static void *slob_page_alloc(struct page *sp, size_t size, int align)
+static void *slob_page_alloc(struct page *sp, size_t size, int align,
+ bool *page_removed_from_list)
{
slob_t *prev, *cur, *aligned = NULL;
int delta = 0, units = SLOB_UNITS(size);

+ *page_removed_from_list = false;
for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
slobidx_t avail = slob_units(cur);

@@ -254,8 +267,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
}

sp->units -= units;
- if (!sp->units)
+ if (!sp->units) {
clear_slob_page_free(sp);
+ *page_removed_from_list = true;
+ }
return cur;
}
if (slob_last(cur))
@@ -269,10 +284,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
{
struct page *sp;
- struct list_head *prev;
struct list_head *slob_list;
slob_t *b = NULL;
unsigned long flags;
+ bool _unused;

if (size < SLOB_BREAK1)
slob_list = &free_slob_small;
@@ -284,6 +299,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
spin_lock_irqsave(&slob_lock, flags);
/* Iterate through each partially free page, try to find room */
list_for_each_entry(sp, slob_list, lru) {
+ bool page_removed_from_list = false;
#ifdef CONFIG_NUMA
/*
* If there's a node specification, search for a partial
@@ -296,18 +312,25 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
if (sp->units < SLOB_UNITS(size))
continue;

- /* Attempt to alloc */
- prev = sp->lru.prev;
- b = slob_page_alloc(sp, size, align);
+ b = slob_page_alloc(sp, size, align, &page_removed_from_list);
if (!b)
continue;

- /* Improve fragment distribution and reduce our average
- * search time by starting our next search here. (see
- * Knuth vol 1, sec 2.5, pg 449) */
- if (prev != slob_list->prev &&
- slob_list->next != prev->next)
- list_move_tail(slob_list, prev->next);
+ /*
+ * If slob_page_alloc() removed sp from the list then we
+ * cannot call list functions on sp. If so allocation
+ * did not fragment the page anyway so optimisation is
+ * unnecessary.
+ */
+ if (!page_removed_from_list) {
+ /*
+ * Improve fragment distribution and reduce our average
+ * search time by starting our next search here. (see
+ * Knuth vol 1, sec 2.5, pg 449)
+ */
+ if (!list_is_first(&sp->lru, slob_list))
+ list_rotate_to_front(&sp->lru, slob_list);
+ }
break;
}
spin_unlock_irqrestore(&slob_lock, flags);
@@ -326,7 +349,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
INIT_LIST_HEAD(&sp->lru);
set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
set_slob_page_free(sp, slob_list);
- b = slob_page_alloc(sp, size, align);
+ b = slob_page_alloc(sp, size, align, &_unused);
BUG_ON(!b);
spin_unlock_irqrestore(&slob_lock, flags);
}
--
2.21.0

2019-04-02 23:07:45

by Tobin C. Harding

Subject: [PATCH v5 3/7] slob: Use slab_list instead of lru

Currently we use the page->lru list for maintaining lists of slabs. We
have a list_head in the page structure (slab_list) that can be used for
this purpose. Doing so makes the code cleaner since we are not
overloading the lru list.

The slab_list is part of a union within the page struct (included here
stripped down):

union {
struct { /* Page cache and anonymous pages */
struct list_head lru;
...
};
struct {
dma_addr_t dma_addr;
};
struct { /* slab, slob and slub */
union {
struct list_head slab_list;
struct { /* Partial pages */
struct page *next;
int pages; /* Nr of pages left */
int pobjects; /* Approximate count */
};
};
...

Here we see that slab_list and lru are the same bits. We can verify
that this change is safe to do by examining the object file produced from
slob.c before and after this patch is applied.

Steps taken to verify:

1. checkout current tip of Linus' tree

commit a667cb7a94d4 ("Merge branch 'akpm' (patches from Andrew)")

2. configure and build (select SLOB allocator)

CONFIG_SLOB=y
CONFIG_SLAB_MERGE_DEFAULT=y

3. disassemble object file `objdump -dr mm/slob.o > before.s`
4. apply patch
5. build
6. disassemble object file `objdump -dr mm/slob.o > after.s`
7. diff before.s after.s

Use slab_list list_head instead of the lru list_head for maintaining
lists of slabs.

Reviewed-by: Roman Gushchin <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
---
mm/slob.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/slob.c b/mm/slob.c
index 07356e9feaaa..84aefd9b91ee 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -112,13 +112,13 @@ static inline int slob_page_free(struct page *sp)

static void set_slob_page_free(struct page *sp, struct list_head *list)
{
- list_add(&sp->lru, list);
+ list_add(&sp->slab_list, list);
__SetPageSlobFree(sp);
}

static inline void clear_slob_page_free(struct page *sp)
{
- list_del(&sp->lru);
+ list_del(&sp->slab_list);
__ClearPageSlobFree(sp);
}

@@ -298,7 +298,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)

spin_lock_irqsave(&slob_lock, flags);
/* Iterate through each partially free page, try to find room */
- list_for_each_entry(sp, slob_list, lru) {
+ list_for_each_entry(sp, slob_list, slab_list) {
bool page_removed_from_list = false;
#ifdef CONFIG_NUMA
/*
@@ -328,8 +328,8 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
* search time by starting our next search here. (see
* Knuth vol 1, sec 2.5, pg 449)
*/
- if (!list_is_first(&sp->lru, slob_list))
- list_rotate_to_front(&sp->lru, slob_list);
+ if (!list_is_first(&sp->slab_list, slob_list))
+ list_rotate_to_front(&sp->slab_list, slob_list);
}
break;
}
@@ -346,7 +346,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
spin_lock_irqsave(&slob_lock, flags);
sp->units = SLOB_UNITS(PAGE_SIZE);
sp->freelist = b;
- INIT_LIST_HEAD(&sp->lru);
+ INIT_LIST_HEAD(&sp->slab_list);
set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
set_slob_page_free(sp, slob_list);
b = slob_page_alloc(sp, size, align, &_unused);
--
2.21.0

2019-04-02 23:07:50

by Tobin C. Harding

Subject: [PATCH v5 4/7] slub: Add comments to endif pre-processor macros

SLUB allocator makes heavy use of ifdef/endif pre-processor macros.
The pairing of these statements is at times hard to follow e.g. if the
pair are further than a screen apart or if there are nested pairs. We
can reduce cognitive load by adding a comment to the endif statement of
form

#ifdef CONFIG_FOO
...
#endif /* CONFIG_FOO */

Add comments to endif pre-processor macros if ifdef/endif pair is not
immediately apparent.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
---
mm/slub.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d30ede89f4a6..8fbba4ff6c67 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1951,7 +1951,7 @@ static void *get_any_partial(struct kmem_cache *s, gfp_t flags,
}
}
} while (read_mems_allowed_retry(cpuset_mems_cookie));
-#endif
+#endif /* CONFIG_NUMA */
return NULL;
}

@@ -2249,7 +2249,7 @@ static void unfreeze_partials(struct kmem_cache *s,
discard_slab(s, page);
stat(s, FREE_SLAB);
}
-#endif
+#endif /* CONFIG_SLUB_CPU_PARTIAL */
}

/*
@@ -2308,7 +2308,7 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
local_irq_restore(flags);
}
preempt_enable();
-#endif
+#endif /* CONFIG_SLUB_CPU_PARTIAL */
}

static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
@@ -2813,7 +2813,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
}
EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
#endif
-#endif
+#endif /* CONFIG_NUMA */

/*
* Slow path handling. This may still be called frequently since objects
@@ -3848,7 +3848,7 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)
return ret;
}
EXPORT_SYMBOL(__kmalloc_node);
-#endif
+#endif /* CONFIG_NUMA */

#ifdef CONFIG_HARDENED_USERCOPY
/*
@@ -4066,7 +4066,7 @@ void __kmemcg_cache_deactivate(struct kmem_cache *s)
*/
slab_deactivate_memcg_cache_rcu_sched(s, kmemcg_cache_deact_after_rcu);
}
-#endif
+#endif /* CONFIG_MEMCG */

static int slab_mem_going_offline_callback(void *arg)
{
@@ -4699,7 +4699,7 @@ static int list_locations(struct kmem_cache *s, char *buf,
len += sprintf(buf, "No data\n");
return len;
}
-#endif
+#endif /* CONFIG_SLUB_DEBUG */

#ifdef SLUB_RESILIENCY_TEST
static void __init resiliency_test(void)
@@ -4759,7 +4759,7 @@ static void __init resiliency_test(void)
#ifdef CONFIG_SYSFS
static void resiliency_test(void) {};
#endif
-#endif
+#endif /* SLUB_RESILIENCY_TEST */

#ifdef CONFIG_SYSFS
enum slab_stat_type {
@@ -5416,7 +5416,7 @@ STAT_ATTR(CPU_PARTIAL_ALLOC, cpu_partial_alloc);
STAT_ATTR(CPU_PARTIAL_FREE, cpu_partial_free);
STAT_ATTR(CPU_PARTIAL_NODE, cpu_partial_node);
STAT_ATTR(CPU_PARTIAL_DRAIN, cpu_partial_drain);
-#endif
+#endif /* CONFIG_SLUB_STATS */

static struct attribute *slab_attrs[] = {
&slab_size_attr.attr,
@@ -5617,7 +5617,7 @@ static void memcg_propagate_slab_attrs(struct kmem_cache *s)

if (buffer)
free_page((unsigned long)buffer);
-#endif
+#endif /* CONFIG_MEMCG */
}

static void kmem_cache_release(struct kobject *k)
--
2.21.0

2019-04-02 23:07:56

by Tobin C. Harding

Subject: [PATCH v5 5/7] slub: Use slab_list instead of lru

Currently we use the page->lru list for maintaining lists of slabs. We
have a list in the page structure (slab_list) that can be used for this
purpose. Doing so makes the code cleaner since we are not overloading
the lru list.

Use the slab_list instead of the lru list for maintaining lists of
slabs.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
---
mm/slub.c | 40 ++++++++++++++++++++--------------------
1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8fbba4ff6c67..d17f117830a9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1023,7 +1023,7 @@ static void add_full(struct kmem_cache *s,
return;

lockdep_assert_held(&n->list_lock);
- list_add(&page->lru, &n->full);
+ list_add(&page->slab_list, &n->full);
}

static void remove_full(struct kmem_cache *s, struct kmem_cache_node *n, struct page *page)
@@ -1032,7 +1032,7 @@ static void remove_full(struct kmem_cache *s, struct kmem_cache_node *n, struct
return;

lockdep_assert_held(&n->list_lock);
- list_del(&page->lru);
+ list_del(&page->slab_list);
}

/* Tracking of the number of slabs for debugging purposes */
@@ -1773,9 +1773,9 @@ __add_partial(struct kmem_cache_node *n, struct page *page, int tail)
{
n->nr_partial++;
if (tail == DEACTIVATE_TO_TAIL)
- list_add_tail(&page->lru, &n->partial);
+ list_add_tail(&page->slab_list, &n->partial);
else
- list_add(&page->lru, &n->partial);
+ list_add(&page->slab_list, &n->partial);
}

static inline void add_partial(struct kmem_cache_node *n,
@@ -1789,7 +1789,7 @@ static inline void remove_partial(struct kmem_cache_node *n,
struct page *page)
{
lockdep_assert_held(&n->list_lock);
- list_del(&page->lru);
+ list_del(&page->slab_list);
n->nr_partial--;
}

@@ -1863,7 +1863,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
return NULL;

spin_lock(&n->list_lock);
- list_for_each_entry_safe(page, page2, &n->partial, lru) {
+ list_for_each_entry_safe(page, page2, &n->partial, slab_list) {
void *t;

if (!pfmemalloc_match(page, flags))
@@ -2407,7 +2407,7 @@ static unsigned long count_partial(struct kmem_cache_node *n,
struct page *page;

spin_lock_irqsave(&n->list_lock, flags);
- list_for_each_entry(page, &n->partial, lru)
+ list_for_each_entry(page, &n->partial, slab_list)
x += get_count(page);
spin_unlock_irqrestore(&n->list_lock, flags);
return x;
@@ -3705,10 +3705,10 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)

BUG_ON(irqs_disabled());
spin_lock_irq(&n->list_lock);
- list_for_each_entry_safe(page, h, &n->partial, lru) {
+ list_for_each_entry_safe(page, h, &n->partial, slab_list) {
if (!page->inuse) {
remove_partial(n, page);
- list_add(&page->lru, &discard);
+ list_add(&page->slab_list, &discard);
} else {
list_slab_objects(s, page,
"Objects remaining in %s on __kmem_cache_shutdown()");
@@ -3716,7 +3716,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
}
spin_unlock_irq(&n->list_lock);

- list_for_each_entry_safe(page, h, &discard, lru)
+ list_for_each_entry_safe(page, h, &discard, slab_list)
discard_slab(s, page);
}

@@ -3996,7 +3996,7 @@ int __kmem_cache_shrink(struct kmem_cache *s)
* Note that concurrent frees may occur while we hold the
* list_lock. page->inuse here is the upper limit.
*/
- list_for_each_entry_safe(page, t, &n->partial, lru) {
+ list_for_each_entry_safe(page, t, &n->partial, slab_list) {
int free = page->objects - page->inuse;

/* Do not reread page->inuse */
@@ -4006,10 +4006,10 @@ int __kmem_cache_shrink(struct kmem_cache *s)
BUG_ON(free <= 0);

if (free == page->objects) {
- list_move(&page->lru, &discard);
+ list_move(&page->slab_list, &discard);
n->nr_partial--;
} else if (free <= SHRINK_PROMOTE_MAX)
- list_move(&page->lru, promote + free - 1);
+ list_move(&page->slab_list, promote + free - 1);
}

/*
@@ -4022,7 +4022,7 @@ int __kmem_cache_shrink(struct kmem_cache *s)
spin_unlock_irqrestore(&n->list_lock, flags);

/* Release empty slabs */
- list_for_each_entry_safe(page, t, &discard, lru)
+ list_for_each_entry_safe(page, t, &discard, slab_list)
discard_slab(s, page);

if (slabs_node(s, node))
@@ -4214,11 +4214,11 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
for_each_kmem_cache_node(s, node, n) {
struct page *p;

- list_for_each_entry(p, &n->partial, lru)
+ list_for_each_entry(p, &n->partial, slab_list)
p->slab_cache = s;

#ifdef CONFIG_SLUB_DEBUG
- list_for_each_entry(p, &n->full, lru)
+ list_for_each_entry(p, &n->full, slab_list)
p->slab_cache = s;
#endif
}
@@ -4435,7 +4435,7 @@ static int validate_slab_node(struct kmem_cache *s,

spin_lock_irqsave(&n->list_lock, flags);

- list_for_each_entry(page, &n->partial, lru) {
+ list_for_each_entry(page, &n->partial, slab_list) {
validate_slab_slab(s, page, map);
count++;
}
@@ -4446,7 +4446,7 @@ static int validate_slab_node(struct kmem_cache *s,
if (!(s->flags & SLAB_STORE_USER))
goto out;

- list_for_each_entry(page, &n->full, lru) {
+ list_for_each_entry(page, &n->full, slab_list) {
validate_slab_slab(s, page, map);
count++;
}
@@ -4642,9 +4642,9 @@ static int list_locations(struct kmem_cache *s, char *buf,
continue;

spin_lock_irqsave(&n->list_lock, flags);
- list_for_each_entry(page, &n->partial, lru)
+ list_for_each_entry(page, &n->partial, slab_list)
process_slab(&t, s, page, alloc, map);
- list_for_each_entry(page, &n->full, lru)
+ list_for_each_entry(page, &n->full, slab_list)
process_slab(&t, s, page, alloc, map);
spin_unlock_irqrestore(&n->list_lock, flags);
}
--
2.21.0

2019-04-02 23:09:17

by Tobin C. Harding

Subject: [PATCH v5 6/7] slab: Use slab_list instead of lru

Currently we use the page->lru list for maintaining lists of slabs. We
have a list in the page structure (slab_list) that can be used for this
purpose. Doing so makes the code cleaner since we are not overloading
the lru list.

Use the slab_list instead of the lru list for maintaining lists of
slabs.

Signed-off-by: Tobin C. Harding <[email protected]>
---
mm/slab.c | 49 +++++++++++++++++++++++++------------------------
1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 329bfe67f2ca..09e2a0131338 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1710,8 +1710,8 @@ static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
{
struct page *page, *n;

- list_for_each_entry_safe(page, n, list, lru) {
- list_del(&page->lru);
+ list_for_each_entry_safe(page, n, list, slab_list) {
+ list_del(&page->slab_list);
slab_destroy(cachep, page);
}
}
@@ -2267,8 +2267,8 @@ static int drain_freelist(struct kmem_cache *cache,
goto out;
}

- page = list_entry(p, struct page, lru);
- list_del(&page->lru);
+ page = list_entry(p, struct page, slab_list);
+ list_del(&page->slab_list);
n->free_slabs--;
n->total_slabs--;
/*
@@ -2728,13 +2728,13 @@ static void cache_grow_end(struct kmem_cache *cachep, struct page *page)
if (!page)
return;

- INIT_LIST_HEAD(&page->lru);
+ INIT_LIST_HEAD(&page->slab_list);
n = get_node(cachep, page_to_nid(page));

spin_lock(&n->list_lock);
n->total_slabs++;
if (!page->active) {
- list_add_tail(&page->lru, &(n->slabs_free));
+ list_add_tail(&page->slab_list, &n->slabs_free);
n->free_slabs++;
} else
fixup_slab_list(cachep, n, page, &list);
@@ -2843,9 +2843,9 @@ static inline void fixup_slab_list(struct kmem_cache *cachep,
void **list)
{
/* move slabp to correct slabp list: */
- list_del(&page->lru);
+ list_del(&page->slab_list);
if (page->active == cachep->num) {
- list_add(&page->lru, &n->slabs_full);
+ list_add(&page->slab_list, &n->slabs_full);
if (OBJFREELIST_SLAB(cachep)) {
#if DEBUG
/* Poisoning will be done without holding the lock */
@@ -2859,7 +2859,7 @@ static inline void fixup_slab_list(struct kmem_cache *cachep,
page->freelist = NULL;
}
} else
- list_add(&page->lru, &n->slabs_partial);
+ list_add(&page->slab_list, &n->slabs_partial);
}

/* Try to find non-pfmemalloc slab if needed */
@@ -2882,20 +2882,20 @@ static noinline struct page *get_valid_first_slab(struct kmem_cache_node *n,
}

/* Move pfmemalloc slab to the end of list to speed up next search */
- list_del(&page->lru);
+ list_del(&page->slab_list);
if (!page->active) {
- list_add_tail(&page->lru, &n->slabs_free);
+ list_add_tail(&page->slab_list, &n->slabs_free);
n->free_slabs++;
} else
- list_add_tail(&page->lru, &n->slabs_partial);
+ list_add_tail(&page->slab_list, &n->slabs_partial);

- list_for_each_entry(page, &n->slabs_partial, lru) {
+ list_for_each_entry(page, &n->slabs_partial, slab_list) {
if (!PageSlabPfmemalloc(page))
return page;
}

n->free_touched = 1;
- list_for_each_entry(page, &n->slabs_free, lru) {
+ list_for_each_entry(page, &n->slabs_free, slab_list) {
if (!PageSlabPfmemalloc(page)) {
n->free_slabs--;
return page;
@@ -2910,11 +2910,12 @@ static struct page *get_first_slab(struct kmem_cache_node *n, bool pfmemalloc)
struct page *page;

assert_spin_locked(&n->list_lock);
- page = list_first_entry_or_null(&n->slabs_partial, struct page, lru);
+ page = list_first_entry_or_null(&n->slabs_partial, struct page,
+ slab_list);
if (!page) {
n->free_touched = 1;
page = list_first_entry_or_null(&n->slabs_free, struct page,
- lru);
+ slab_list);
if (page)
n->free_slabs--;
}
@@ -3415,29 +3416,29 @@ static void free_block(struct kmem_cache *cachep, void **objpp,
objp = objpp[i];

page = virt_to_head_page(objp);
- list_del(&page->lru);
+ list_del(&page->slab_list);
check_spinlock_acquired_node(cachep, node);
slab_put_obj(cachep, page, objp);
STATS_DEC_ACTIVE(cachep);

/* fixup slab chains */
if (page->active == 0) {
- list_add(&page->lru, &n->slabs_free);
+ list_add(&page->slab_list, &n->slabs_free);
n->free_slabs++;
} else {
/* Unconditionally move a slab to the end of the
* partial list on free - maximum time for the
* other objects to be freed, too.
*/
- list_add_tail(&page->lru, &n->slabs_partial);
+ list_add_tail(&page->slab_list, &n->slabs_partial);
}
}

while (n->free_objects > n->free_limit && !list_empty(&n->slabs_free)) {
n->free_objects -= cachep->num;

- page = list_last_entry(&n->slabs_free, struct page, lru);
- list_move(&page->lru, list);
+ page = list_last_entry(&n->slabs_free, struct page, slab_list);
+ list_move(&page->slab_list, list);
n->free_slabs--;
n->total_slabs--;
}
@@ -3475,7 +3476,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
int i = 0;
struct page *page;

- list_for_each_entry(page, &n->slabs_free, lru) {
+ list_for_each_entry(page, &n->slabs_free, slab_list) {
BUG_ON(page->active);

i++;
@@ -4338,9 +4339,9 @@ static int leaks_show(struct seq_file *m, void *p)
check_irq_on();
spin_lock_irq(&n->list_lock);

- list_for_each_entry(page, &n->slabs_full, lru)
+ list_for_each_entry(page, &n->slabs_full, slab_list)
handle_slab(x, cachep, page);
- list_for_each_entry(page, &n->slabs_partial, lru)
+ list_for_each_entry(page, &n->slabs_partial, slab_list)
handle_slab(x, cachep, page);
spin_unlock_irq(&n->list_lock);
}
--
2.21.0

2019-04-02 23:09:28

by Tobin C. Harding

Subject: [PATCH v5 7/7] mm: Remove stale comment from page struct

We now use the slab_list list_head instead of the lru list_head. This
comment has become stale.

Remove stale comment from page struct slab_list list_head.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
---
include/linux/mm_types.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7eade9132f02..63a34e3d7c29 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -103,7 +103,7 @@ struct page {
};
struct { /* slab, slob and slub */
union {
- struct list_head slab_list; /* uses lru */
+ struct list_head slab_list;
struct { /* Partial pages */
struct page *next;
#ifdef CONFIG_64BIT
--
2.21.0

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, 3 Apr 2019, Tobin C. Harding wrote:

> Currently we reach inside the list_head. This is a violation of the
> layer of abstraction provided by the list_head. It makes the code
> fragile. More importantly it makes the code wicked hard to understand.

Great.... It definitely makes it clearer. The boolean parameter is not
so nice but I have no idea how to avoid it in the brief time I spent
looking at it.

Acked-by: Christoph Lameter <[email protected]>

Subject: Re: [PATCH v5 1/7] list: Add function list_rotate_to_front()

On Wed, 3 Apr 2019, Tobin C. Harding wrote:

> Add function list_rotate_to_front() to rotate a list until the specified
> item is at the front of the list.

Reviewed-by: Christoph Lameter <[email protected]>

2019-04-03 18:00:26

by Roman Gushchin

Subject: Re: [PATCH v5 1/7] list: Add function list_rotate_to_front()

On Wed, Apr 03, 2019 at 10:05:39AM +1100, Tobin C. Harding wrote:
> Currently if we wish to rotate a list until a specific item is at the
> front of the list we can call list_move_tail(head, list). Note that the
> arguments are the reverse way to the usual use of list_move_tail(list,
> head). This is a hack, it depends on the developer knowing how the
> list_head operates internally which violates the layer of abstraction
> offered by the list_head. Also, it is not intuitive so the next
> developer to come along must study list.h in order to fully understand
> what is meant by the call, while this is 'good for' the developer it
> makes reading the code harder. We should have a function appropriately
> named that does this if there are users for it in tree.
>
> By grep'ing the tree for list_move_tail() and list_tail() and attempting
> to guess the argument order from the names it seems there is only one
> place currently in the tree that does this - the slob allocator.
>
> Add function list_rotate_to_front() to rotate a list until the specified
> item is at the front of the list.
>
> Signed-off-by: Tobin C. Harding <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

2019-04-03 18:01:39

by Roman Gushchin

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, Apr 03, 2019 at 10:05:40AM +1100, Tobin C. Harding wrote:
> Currently we reach inside the list_head. This is a violation of the
> layer of abstraction provided by the list_head. It makes the code
> fragile. More importantly it makes the code wicked hard to understand.
>
> The code reaches into the list_head structure to counteract the fact
> that the list _may_ have been changed during slob_page_alloc(). Instead
> of this we can add a return parameter to slob_page_alloc() to signal
> that the list was modified (list_del() called with page->lru to remove
> page from the freelist).
>
> This code is concerned with an optimisation that counters the tendency
> for first fit allocation algorithm to fragment memory into many small
> chunks at the front of the memory pool. Since the page is only removed
> from the list when an allocation uses _all_ the remaining memory in the
> page then in this special case fragmentation does not occur and we
> therefore do not need the optimisation.
>
> Add a return parameter to slob_page_alloc() to signal that the
> allocation used up the whole page and that the page was removed from the
> free list. After calling slob_page_alloc() check the return value just
> added and only attempt optimisation if the page is still on the list.
>
> Use list_head API instead of reaching into the list_head structure to
> check if sp is at the front of the list.
>
> Signed-off-by: Tobin C. Harding <[email protected]>
> ---
> mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
> 1 file changed, 37 insertions(+), 14 deletions(-)
>
> diff --git a/mm/slob.c b/mm/slob.c
> index 307c2c9feb44..07356e9feaaa 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
> }
>
> /*
> - * Allocate a slob block within a given slob_page sp.
> + * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
> + * @sp: Page to look in.
> + * @size: Size of the allocation.
> + * @align: Allocation alignment.
> + * @page_removed_from_list: Return parameter.
> + *
> + * Tries to find a chunk of memory at least @size bytes big within @page.
> + *
> + * Return: Pointer to memory if allocated, %NULL otherwise. If the
> + * allocation fills up @page then the page is removed from the
> + * freelist, in this case @page_removed_from_list will be set to
> + * true (set to false otherwise).
> */
> -static void *slob_page_alloc(struct page *sp, size_t size, int align)
> +static void *slob_page_alloc(struct page *sp, size_t size, int align,
> + bool *page_removed_from_list)

Hi Tobin!

Isn't it better to make slob_page_alloc() return a bool value?
Then it's easier to ignore the returned value, no need to introduce "_unused".

Thanks!

> {
> slob_t *prev, *cur, *aligned = NULL;
> int delta = 0, units = SLOB_UNITS(size);
>
> + *page_removed_from_list = false;
> for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
> slobidx_t avail = slob_units(cur);
>
> @@ -254,8 +267,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
> }
>
> sp->units -= units;
> - if (!sp->units)
> + if (!sp->units) {
> clear_slob_page_free(sp);
> + *page_removed_from_list = true;
> + }
> return cur;
> }
> if (slob_last(cur))
> @@ -269,10 +284,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
> static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> {
> struct page *sp;
> - struct list_head *prev;
> struct list_head *slob_list;
> slob_t *b = NULL;
> unsigned long flags;
> + bool _unused;
>
> if (size < SLOB_BREAK1)
> slob_list = &free_slob_small;
> @@ -284,6 +299,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> spin_lock_irqsave(&slob_lock, flags);
> /* Iterate through each partially free page, try to find room */
> list_for_each_entry(sp, slob_list, lru) {
> + bool page_removed_from_list = false;
> #ifdef CONFIG_NUMA
> /*
> * If there's a node specification, search for a partial
> @@ -296,18 +312,25 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> if (sp->units < SLOB_UNITS(size))
> continue;
>
> - /* Attempt to alloc */
> - prev = sp->lru.prev;
> - b = slob_page_alloc(sp, size, align);
> + b = slob_page_alloc(sp, size, align, &page_removed_from_list);
> if (!b)
> continue;
>
> - /* Improve fragment distribution and reduce our average
> - * search time by starting our next search here. (see
> - * Knuth vol 1, sec 2.5, pg 449) */
> - if (prev != slob_list->prev &&
> - slob_list->next != prev->next)
> - list_move_tail(slob_list, prev->next);
> + /*
> + * If slob_page_alloc() removed sp from the list then we
> + * cannot call list functions on sp. If so allocation
> + * did not fragment the page anyway so optimisation is
> + * unnecessary.
> + */
> + if (!page_removed_from_list) {
> + /*
> + * Improve fragment distribution and reduce our average
> + * search time by starting our next search here. (see
> + * Knuth vol 1, sec 2.5, pg 449)
> + */
> + if (!list_is_first(&sp->lru, slob_list))
> + list_rotate_to_front(&sp->lru, slob_list);
> + }
> break;
> }
> spin_unlock_irqrestore(&slob_lock, flags);
> @@ -326,7 +349,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> INIT_LIST_HEAD(&sp->lru);
> set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
> set_slob_page_free(sp, slob_list);
> - b = slob_page_alloc(sp, size, align);
> + b = slob_page_alloc(sp, size, align, &_unused);
> BUG_ON(!b);
> spin_unlock_irqrestore(&slob_lock, flags);
> }
> --
> 2.21.0
>

2019-04-03 18:43:53

by Roman Gushchin

Subject: Re: [PATCH v5 4/7] slub: Add comments to endif pre-processor macros

On Wed, Apr 03, 2019 at 10:05:42AM +1100, Tobin C. Harding wrote:
> SLUB allocator makes heavy use of ifdef/endif pre-processor macros.
> The pairing of these statements is at times hard to follow e.g. if the
> pair are further than a screen apart or if there are nested pairs. We
> can reduce cognitive load by adding a comment to the endif statement of
> form
>
> #ifdef CONFIG_FOO
> ...
> #endif /* CONFIG_FOO */
>
> Add comments to endif pre-processor macros if ifdef/endif pair is not
> immediately apparent.
>
> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Tobin C. Harding <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

2019-04-03 18:45:57

by Roman Gushchin

Subject: Re: [PATCH v5 5/7] slub: Use slab_list instead of lru

On Wed, Apr 03, 2019 at 10:05:43AM +1100, Tobin C. Harding wrote:
> Currently we use the page->lru list for maintaining lists of slabs. We
> have a list in the page structure (slab_list) that can be used for this
> purpose. Doing so makes the code cleaner since we are not overloading
> the lru list.
>
> Use the slab_list instead of the lru list for maintaining lists of
> slabs.
>
> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Tobin C. Harding <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

2019-04-03 18:46:20

by Roman Gushchin

Subject: Re: [PATCH v5 7/7] mm: Remove stale comment from page struct

On Wed, Apr 03, 2019 at 10:05:45AM +1100, Tobin C. Harding wrote:
> We now use the slab_list list_head instead of the lru list_head. This
> comment has become stale.
>
> Remove stale comment from page struct slab_list list_head.
>
> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Tobin C. Harding <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

Thank you!

2019-04-03 18:48:05

by Roman Gushchin

Subject: Re: [PATCH v5 6/7] slab: Use slab_list instead of lru

On Wed, Apr 03, 2019 at 10:05:44AM +1100, Tobin C. Harding wrote:
> Currently we use the page->lru list for maintaining lists of slabs. We
> have a list in the page structure (slab_list) that can be used for this
> purpose. Doing so makes the code cleaner since we are not overloading
> the lru list.
>
> Use the slab_list instead of the lru list for maintaining lists of
> slabs.
>
> Signed-off-by: Tobin C. Harding <[email protected]>

Reviewed-by: Roman Gushchin <[email protected]>

2019-04-03 21:04:59

by Tobin C. Harding

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, Apr 03, 2019 at 06:00:30PM +0000, Roman Gushchin wrote:
> On Wed, Apr 03, 2019 at 10:05:40AM +1100, Tobin C. Harding wrote:
> > Currently we reach inside the list_head. This is a violation of the
> > layer of abstraction provided by the list_head. It makes the code
> > fragile. More importantly it makes the code wicked hard to understand.
> >
> > The code reaches into the list_head structure to counteract the fact
> > that the list _may_ have been changed during slob_page_alloc(). Instead
> > of this we can add a return parameter to slob_page_alloc() to signal
> > that the list was modified (list_del() called with page->lru to remove
> > page from the freelist).
> >
> > This code is concerned with an optimisation that counters the tendency
> > for first fit allocation algorithm to fragment memory into many small
> > chunks at the front of the memory pool. Since the page is only removed
> > from the list when an allocation uses _all_ the remaining memory in the
> > page then in this special case fragmentation does not occur and we
> > therefore do not need the optimisation.
> >
> > Add a return parameter to slob_page_alloc() to signal that the
> > allocation used up the whole page and that the page was removed from the
> > free list. After calling slob_page_alloc() check the return value just
> > added and only attempt optimisation if the page is still on the list.
> >
> > Use list_head API instead of reaching into the list_head structure to
> > check if sp is at the front of the list.
> >
> > Signed-off-by: Tobin C. Harding <[email protected]>
> > ---
> > mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
> > 1 file changed, 37 insertions(+), 14 deletions(-)
> >
> > diff --git a/mm/slob.c b/mm/slob.c
> > index 307c2c9feb44..07356e9feaaa 100644
> > --- a/mm/slob.c
> > +++ b/mm/slob.c
> > @@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
> > }
> >
> > /*
> > - * Allocate a slob block within a given slob_page sp.
> > + * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
> > + * @sp: Page to look in.
> > + * @size: Size of the allocation.
> > + * @align: Allocation alignment.
> > + * @page_removed_from_list: Return parameter.
> > + *
> > + * Tries to find a chunk of memory at least @size bytes big within @page.
> > + *
> > + * Return: Pointer to memory if allocated, %NULL otherwise. If the
> > + * allocation fills up @page then the page is removed from the
> > + * freelist, in this case @page_removed_from_list will be set to
> > + * true (set to false otherwise).
> > */
> > -static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > +static void *slob_page_alloc(struct page *sp, size_t size, int align,
> > + bool *page_removed_from_list)
>
> Hi Tobin!
>
> Isn't it better to make slob_page_alloc() return a bool value?
> Then it's easier to ignore the returned value, no need to introduce "_unused".

We need a pointer to the memory allocated also so AFAICS it's either a
return parameter for the memory pointer or a return parameter to
indicate the boolean value? Open to any other ideas I'm missing.

In a previous crack at this I used a double pointer to the page struct
then set that to null to indicate the boolean value. I think the
explicit boolean parameter is cleaner.
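
Roughly, the two shapes being weighed are sketched below; the first
mirrors the signature added by the patch, the second is the hypothetical
bool-returning alternative (the name 'block' is just for illustration):

    /* As in the patch: memory via the return value, removal status
     * via an out-parameter. */
    static void *slob_page_alloc(struct page *sp, size_t size, int align,
                                 bool *page_removed_from_list);

    /* Hypothetical alternative: removal status via the return value,
     * memory via a double pointer. */
    static bool slob_page_alloc(struct page *sp, size_t size, int align,
                                void **block);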

thanks,
Tobin.

2019-04-03 21:15:35

by Tobin C. Harding

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, Apr 03, 2019 at 06:00:30PM +0000, Roman Gushchin wrote:
> On Wed, Apr 03, 2019 at 10:05:40AM +1100, Tobin C. Harding wrote:
> > Currently we reach inside the list_head. This is a violation of the
> > layer of abstraction provided by the list_head. It makes the code
> > fragile. More importantly it makes the code wicked hard to understand.
> >
> > The code reaches into the list_head structure to counteract the fact
> > that the list _may_ have been changed during slob_page_alloc(). Instead
> > of this we can add a return parameter to slob_page_alloc() to signal
> > that the list was modified (list_del() called with page->lru to remove
> > page from the freelist).
> >
> > This code is concerned with an optimisation that counters the tendency
> > for first fit allocation algorithm to fragment memory into many small
> > chunks at the front of the memory pool. Since the page is only removed
> > from the list when an allocation uses _all_ the remaining memory in the
> > page then in this special case fragmentation does not occur and we
> > therefore do not need the optimisation.
> >
> > Add a return parameter to slob_page_alloc() to signal that the
> > allocation used up the whole page and that the page was removed from the
> > free list. After calling slob_page_alloc() check the return value just
> > added and only attempt optimisation if the page is still on the list.
> >
> > Use list_head API instead of reaching into the list_head structure to
> > check if sp is at the front of the list.
> >
> > Signed-off-by: Tobin C. Harding <[email protected]>
> > ---
> > mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
> > 1 file changed, 37 insertions(+), 14 deletions(-)
> >
> > diff --git a/mm/slob.c b/mm/slob.c
> > index 307c2c9feb44..07356e9feaaa 100644
> > --- a/mm/slob.c
> > +++ b/mm/slob.c
> > @@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
> > }
> >
> > /*
> > - * Allocate a slob block within a given slob_page sp.
> > + * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
> > + * @sp: Page to look in.
> > + * @size: Size of the allocation.
> > + * @align: Allocation alignment.
> > + * @page_removed_from_list: Return parameter.
> > + *
> > + * Tries to find a chunk of memory at least @size bytes big within @page.
> > + *
> > + * Return: Pointer to memory if allocated, %NULL otherwise. If the
> > + * allocation fills up @page then the page is removed from the
> > + * freelist, in this case @page_removed_from_list will be set to
> > + * true (set to false otherwise).
> > */
> > -static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > +static void *slob_page_alloc(struct page *sp, size_t size, int align,
> > + bool *page_removed_from_list)
>
> Hi Tobin!
>
> Isn't it better to make slob_page_alloc() return a bool value?
> Then it's easier to ignore the returned value, no need to introduce "_unused".
>
> Thanks!
>
> > {
> > slob_t *prev, *cur, *aligned = NULL;
> > int delta = 0, units = SLOB_UNITS(size);
> >
> > + *page_removed_from_list = false;
> > for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
> > slobidx_t avail = slob_units(cur);
> >
> > @@ -254,8 +267,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > }
> >
> > sp->units -= units;
> > - if (!sp->units)
> > + if (!sp->units) {
> > clear_slob_page_free(sp);
> > + *page_removed_from_list = true;
> > + }
> > return cur;
> > }
> > if (slob_last(cur))
> > @@ -269,10 +284,10 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> > {
> > struct page *sp;
> > - struct list_head *prev;
> > struct list_head *slob_list;
> > slob_t *b = NULL;
> > unsigned long flags;
> > + bool _unused;
> >
> > if (size < SLOB_BREAK1)
> > slob_list = &free_slob_small;
> > @@ -284,6 +299,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> > spin_lock_irqsave(&slob_lock, flags);
> > /* Iterate through each partially free page, try to find room */
> > list_for_each_entry(sp, slob_list, lru) {
> > + bool page_removed_from_list = false;
> > #ifdef CONFIG_NUMA
> > /*
> > * If there's a node specification, search for a partial
> > @@ -296,18 +312,25 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
> > if (sp->units < SLOB_UNITS(size))
> > continue;
> >
> > - /* Attempt to alloc */
> > - prev = sp->lru.prev;
> > - b = slob_page_alloc(sp, size, align);
> > + b = slob_page_alloc(sp, size, align, &page_removed_from_list);
> > if (!b)
> > continue;
> >
> > - /* Improve fragment distribution and reduce our average
> > - * search time by starting our next search here. (see
> > - * Knuth vol 1, sec 2.5, pg 449) */
> > - if (prev != slob_list->prev &&
> > - slob_list->next != prev->next)
> > - list_move_tail(slob_list, prev->next);
> > + /*
> > + * If slob_page_alloc() removed sp from the list then we
> > + * cannot call list functions on sp. If so allocation
> > + * did not fragment the page anyway so optimisation is
> > + * unnecessary.
> > + */
> > + if (!page_removed_from_list) {
> > + /*
> > + * Improve fragment distribution and reduce our average
> > + * search time by starting our next search here. (see
> > + * Knuth vol 1, sec 2.5, pg 449)
> > + */
> > + if (!list_is_first(&sp->lru, slob_list))
> > + list_rotate_to_front(&sp->lru, slob_list);

According to 0day test robot this is triggering an error from
CHECK_DATA_CORRUPTION when the kernel is built with CONFIG_DEBUG_LIST.
I think this is because list_rotate_to_front() puts the list into an
invalid state before it calls __list_add(). The thing that has me
stumped is why this was not happening before this patch series was
applied? ATM I'm not able to get my test module to trigger this but I'm
going to try a bit harder today. If I'm right one solution is to modify
list_rotate_to_front() to _not_ call __list_add() but do it manually,
this solution doesn't sit well with me though.

So, summing up, I think the patch is correct in that it does the correct
thing but I think the debugging code doesn't like it because we are
violating typical usage - so the patch is wrong :)

thanks,
Tobin.

2019-04-03 21:50:44

by Roman Gushchin

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Thu, Apr 04, 2019 at 08:03:27AM +1100, Tobin C. Harding wrote:
> On Wed, Apr 03, 2019 at 06:00:30PM +0000, Roman Gushchin wrote:
> > On Wed, Apr 03, 2019 at 10:05:40AM +1100, Tobin C. Harding wrote:
> > > Currently we reach inside the list_head. This is a violation of the
> > > layer of abstraction provided by the list_head. It makes the code
> > > fragile. More importantly it makes the code wicked hard to understand.
> > >
> > > The code reaches into the list_head structure to counteract the fact
> > > that the list _may_ have been changed during slob_page_alloc(). Instead
> > > of this we can add a return parameter to slob_page_alloc() to signal
> > > that the list was modified (list_del() called with page->lru to remove
> > > page from the freelist).
> > >
> > > This code is concerned with an optimisation that counters the tendency
> > > for first fit allocation algorithm to fragment memory into many small
> > > chunks at the front of the memory pool. Since the page is only removed
> > > from the list when an allocation uses _all_ the remaining memory in the
> > > page then in this special case fragmentation does not occur and we
> > > therefore do not need the optimisation.
> > >
> > > Add a return parameter to slob_page_alloc() to signal that the
> > > allocation used up the whole page and that the page was removed from the
> > > free list. After calling slob_page_alloc() check the return value just
> > > added and only attempt optimisation if the page is still on the list.
> > >
> > > Use list_head API instead of reaching into the list_head structure to
> > > check if sp is at the front of the list.
> > >
> > > Signed-off-by: Tobin C. Harding <[email protected]>
> > > ---
> > > mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
> > > 1 file changed, 37 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/mm/slob.c b/mm/slob.c
> > > index 307c2c9feb44..07356e9feaaa 100644
> > > --- a/mm/slob.c
> > > +++ b/mm/slob.c
> > > @@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
> > > }
> > >
> > > /*
> > > - * Allocate a slob block within a given slob_page sp.
> > > + * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
> > > + * @sp: Page to look in.
> > > + * @size: Size of the allocation.
> > > + * @align: Allocation alignment.
> > > + * @page_removed_from_list: Return parameter.
> > > + *
> > > + * Tries to find a chunk of memory at least @size bytes big within @page.
> > > + *
> > > + * Return: Pointer to memory if allocated, %NULL otherwise. If the
> > > + * allocation fills up @page then the page is removed from the
> > > + * freelist, in this case @page_removed_from_list will be set to
> > > + * true (set to false otherwise).
> > > */
> > > -static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > > +static void *slob_page_alloc(struct page *sp, size_t size, int align,
> > > + bool *page_removed_from_list)
> >
> > Hi Tobin!
> >
> > Isn't it better to make slob_page_alloc() return a bool value?
> > Then it's easier to ignore the returned value, no need to introduce "_unused".
>
> We need a pointer to the memory allocated also so AFAICS it's either a
> return parameter for the memory pointer or a return parameter to
> indicate the boolean value? Open to any other ideas I'm missing.
>
> In a previous crack at this I used a double pointer to the page struct
> then set that to null to indicate the boolean value. I think the
> explicit boolean parameter is cleaner.

Yeah, sorry, it's my fault. Please, ignore this comment.
Bool* argument is perfectly fine here.

Thanks!

2019-04-03 22:15:56

by Tobin C. Harding

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, Apr 03, 2019 at 09:23:28PM +0000, Roman Gushchin wrote:
> On Thu, Apr 04, 2019 at 08:03:27AM +1100, Tobin C. Harding wrote:
> > On Wed, Apr 03, 2019 at 06:00:30PM +0000, Roman Gushchin wrote:
> > > On Wed, Apr 03, 2019 at 10:05:40AM +1100, Tobin C. Harding wrote:
> > > > Currently we reach inside the list_head. This is a violation of the
> > > > layer of abstraction provided by the list_head. It makes the code
> > > > fragile. More importantly it makes the code wicked hard to understand.
> > > >
> > > > The code reaches into the list_head structure to counteract the fact
> > > > that the list _may_ have been changed during slob_page_alloc(). Instead
> > > > of this we can add a return parameter to slob_page_alloc() to signal
> > > > that the list was modified (list_del() called with page->lru to remove
> > > > page from the freelist).
> > > >
> > > > This code is concerned with an optimisation that counters the tendency
> > > > for first fit allocation algorithm to fragment memory into many small
> > > > chunks at the front of the memory pool. Since the page is only removed
> > > > from the list when an allocation uses _all_ the remaining memory in the
> > > > page then in this special case fragmentation does not occur and we
> > > > therefore do not need the optimisation.
> > > >
> > > > Add a return parameter to slob_page_alloc() to signal that the
> > > > allocation used up the whole page and that the page was removed from the
> > > > free list. After calling slob_page_alloc() check the return value just
> > > > added and only attempt optimisation if the page is still on the list.
> > > >
> > > > Use list_head API instead of reaching into the list_head structure to
> > > > check if sp is at the front of the list.
> > > >
> > > > Signed-off-by: Tobin C. Harding <[email protected]>
> > > > ---
> > > > mm/slob.c | 51 +++++++++++++++++++++++++++++++++++++--------------
> > > > 1 file changed, 37 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/mm/slob.c b/mm/slob.c
> > > > index 307c2c9feb44..07356e9feaaa 100644
> > > > --- a/mm/slob.c
> > > > +++ b/mm/slob.c
> > > > @@ -213,13 +213,26 @@ static void slob_free_pages(void *b, int order)
> > > > }
> > > >
> > > > /*
> > > > - * Allocate a slob block within a given slob_page sp.
> > > > + * slob_page_alloc() - Allocate a slob block within a given slob_page sp.
> > > > + * @sp: Page to look in.
> > > > + * @size: Size of the allocation.
> > > > + * @align: Allocation alignment.
> > > > + * @page_removed_from_list: Return parameter.
> > > > + *
> > > > + * Tries to find a chunk of memory at least @size bytes big within @page.
> > > > + *
> > > > + * Return: Pointer to memory if allocated, %NULL otherwise. If the
> > > > + * allocation fills up @page then the page is removed from the
> > > > + * freelist, in this case @page_removed_from_list will be set to
> > > > + * true (set to false otherwise).
> > > > */
> > > > -static void *slob_page_alloc(struct page *sp, size_t size, int align)
> > > > +static void *slob_page_alloc(struct page *sp, size_t size, int align,
> > > > + bool *page_removed_from_list)
> > >
> > > Hi Tobin!
> > >
> > > Isn't it better to make slob_page_alloc() return a bool value?
> > > Then it's easier to ignore the returned value, no need to introduce "_unused".
> >
> > We need a pointer to the memory allocated also so AFAICS it's either a
> > return parameter for the memory pointer or a return parameter to
> > indicate the boolean value? Open to any other ideas I'm missing.
> >
> > In a previous crack at this I used a double pointer to the page struct
> > then set that to null to indicate the boolean value. I think the
> > explicit boolean parameter is cleaner.
>
> Yeah, sorry, it's my fault. Please, ignore this comment.
> Bool* argument is perfectly fine here.

Cheers man, no sweat. I appreciate you looking at this stuff.

Tobin

2019-04-09 13:00:54

by Vlastimil Babka

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On 4/3/19 11:13 PM, Tobin C. Harding wrote:

> According to 0day test robot this is triggering an error from
> CHECK_DATA_CORRUPTION when the kernel is built with CONFIG_DEBUG_LIST.

FWIW, that report [1] was for commit 15c8410c67adef from next-20190401. I've
checked and it's still the v4 version, although the report came after you
submitted v5 (it wasn't testing the patches from mailing list, but mmotm). I
don't see any report for the v5 version so I'd expect it to be indeed fixed by
the new approach that adds boolean return parameter to slob_page_alloc().

Vlastimil

[1] https://lore.kernel.org/linux-mm/5ca413c6.9TM84kwWw8lLhnmK%[email protected]/T/#u

> I think this is because list_rotate_to_front() puts the list into an
> invalid state before it calls __list_add(). The thing that has me
> stumped is why this was not happening before this patch series was
> applied? ATM I'm not able to get my test module to trigger this but I'm
> going to try a bit harder today. If I'm right one solution is to modify
> list_rotate_to_front() to _not_ call __list_add() but do it manually,
> this solution doesn't sit well with me though.
>
> So, summing up, I think the patch is correct in that it does the correct
> thing but I think the debugging code doesn't like it because we are
> violating typical usage - so the patch is wrong :)
>
> thanks,
> Tobin.
>

2019-04-09 13:09:18

by Vlastimil Babka

Subject: Re: [PATCH v5 0/7] mm: Use slab_list list_head instead of lru

On 4/3/19 1:05 AM, Tobin C. Harding wrote:
> Tobin C. Harding (7):
> list: Add function list_rotate_to_front()
> slob: Respect list_head abstraction layer
> slob: Use slab_list instead of lru
> slub: Add comments to endif pre-processor macros
> slub: Use slab_list instead of lru
> slab: Use slab_list instead of lru
> mm: Remove stale comment from page struct

For the whole series:

Acked-by: Vlastimil Babka <[email protected]>

>
> include/linux/list.h | 18 ++++++++++++
> include/linux/mm_types.h | 2 +-
> mm/slab.c | 49 ++++++++++++++++----------------
> mm/slob.c | 59 +++++++++++++++++++++++++++------------
> mm/slub.c | 60 ++++++++++++++++++++--------------------
> 5 files changed, 115 insertions(+), 73 deletions(-)
>

2019-04-09 20:08:42

by Tobin C. Harding

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Tue, Apr 09, 2019 at 02:59:52PM +0200, Vlastimil Babka wrote:
> On 4/3/19 11:13 PM, Tobin C. Harding wrote:
>
> > According to 0day test robot this is triggering an error from
> > CHECK_DATA_CORRUPTION when the kernel is built with CONFIG_DEBUG_LIST.
>
> FWIW, that report [1] was for commit 15c8410c67adef from next-20190401. I've
> checked and it's still the v4 version, although the report came after you
> submitted v5 (it wasn't testing the patches from mailing list, but mmotm). I
> don't see any report for the v5 version so I'd expect it to be indeed fixed by
> the new approach that adds boolean return parameter to slob_page_alloc().
>
> Vlastimil

Oh man thanks! That is super cool, thanks for letting me know
Vlastimil.

Tobin

2019-04-09 22:29:19

by Andrew Morton

Subject: Re: [PATCH v5 2/7] slob: Respect list_head abstraction layer

On Wed, 10 Apr 2019 06:06:49 +1000 "Tobin C. Harding" <[email protected]> wrote:

> On Tue, Apr 09, 2019 at 02:59:52PM +0200, Vlastimil Babka wrote:
> > On 4/3/19 11:13 PM, Tobin C. Harding wrote:
> >
> > > According to 0day test robot this is triggering an error from
> > > CHECK_DATA_CORRUPTION when the kernel is built with CONFIG_DEBUG_LIST.
> >
> > FWIW, that report [1] was for commit 15c8410c67adef from next-20190401. I've
> > checked and it's still the v4 version, although the report came after you
> > submitted v5 (it wasn't testing the patches from mailing list, but mmotm). I
> > don't see any report for the v5 version so I'd expect it to be indeed fixed by
> > the new approach that adds boolean return parameter to slob_page_alloc().
> >
> > Vlastimil
>
> Oh man thanks! That is super cool, thanks for letting me know
> Vlastimil.

Yes, thanks for the followup.