2014-07-01 08:24:05

by Joonsoo Kim

Subject: [PATCH v3 0/9] clean-up and remove lockdep annotation in SLAB

This patchset does some clean-up and tries to remove lockdep annotation.

Patches 1~2 are just really minor improvements.
Patches 3~9 are for clean-up and for removing the lockdep annotation.

There are two cases where lockdep annotation is needed in SLAB:
1) holding two node locks
2) holding two array cache(alien cache) locks

I looked at the code and found that we can avoid these cases without
any negative effect.

1) occurs when freeing an object produces a new free slab and we decide to
destroy it. Although we don't need to hold the node lock while destroying
a slab, the current code does so. Destroying a slab without holding the
lock helps reduce lock contention. To achieve this, I change the
implementation so that the new free slab is destroyed after the lock is
released.

2) occurs in a similar situation. When we free an object that belongs to a
non-local node, we put it into that node's alien cache while holding the
alien cache lock. If the alien cache is full, we try to flush it to the
proper node cache, and at that point a new free slab could be created.
Destroying it then frees a metadata object that comes from yet another
node, and freeing that object requires the other node's alien cache lock.
This forces us to hold two array cache locks at once and therefore requires
lockdep annotation, even though the two locks are always different and a
deadlock is not possible. To prevent this situation, I use the same
approach as in 1).

In this way, we can avoid cases 1) and 2), and can then remove the lockdep
annotation. As the diffstat shows, this makes the SLAB code much simpler.
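
Below is a minimal stand-alone sketch of the deferred-destroy pattern
described above. It is illustrative only: the names (detach_surplus,
destroy_detached) are made up, and a user-space pthread mutex stands in
for the node's list_lock. The real mm/slab.c changes are in patches 3
and 7 of this series.

/*
 * Under the lock we only *detach* surplus free slabs onto a private
 * list; the destruction itself happens after the unlock.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct slab {
        struct slab *next;      /* stands in for page->lru */
        int id;
};

static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;
static struct slab *free_slabs;         /* protected by node_lock */

/* Phase 1: under the lock, unlink surplus slabs onto a local list. */
static struct slab *detach_surplus(int free_limit)
{
        struct slab *to_destroy = NULL;
        struct slab **pp;
        int kept = 0;

        pthread_mutex_lock(&node_lock);
        pp = &free_slabs;
        while (*pp) {
                if (kept < free_limit) {
                        kept++;
                        pp = &(*pp)->next;
                } else {
                        struct slab *s = *pp;

                        *pp = s->next;          /* unlink under the lock */
                        s->next = to_destroy;   /* park on the private list */
                        to_destroy = s;
                }
        }
        pthread_mutex_unlock(&node_lock);
        return to_destroy;
}

/* Phase 2: destroy with no lock held (cf. slabs_destroy() in patch 3). */
static void destroy_detached(struct slab *s)
{
        while (s) {
                struct slab *next = s->next;

                printf("destroying slab %d outside the lock\n", s->id);
                free(s);
                s = next;
        }
}

int main(void)
{
        int i;

        for (i = 0; i < 5; i++) {
                struct slab *s = malloc(sizeof(*s));

                s->id = i;
                s->next = free_slabs;
                free_slabs = s;
        }
        destroy_detached(detach_surplus(2));    /* keep 2, destroy 3 */
        destroy_detached(detach_surplus(0));    /* drain the rest */
        return 0;
}

The point is that the only work done while the lock is held is list
manipulation; the potentially expensive (and recursive) destruction runs
with no lock held.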

All patches of this series got an Ack from Christoph Lameter on the
previous iteration, and there is no big change since then. Just one
clean-up patch is dropped, because it did not seem like a good clean-up.
The others are simply rebased on current linux-next.

Thanks.

Joonsoo Kim (9):
slab: add unlikely macro to help compiler
slab: move up code to get kmem_cache_node in free_block()
slab: defer slab_destroy in free_block()
slab: factor out initialization of arracy cache
slab: introduce alien_cache
slab: use the lock on alien_cache, instead of the lock on array_cache
slab: destroy a slab without holding any alien cache lock
slab: remove a useless lockdep annotation
slab: remove BAD_ALIEN_MAGIC

mm/slab.c | 377 ++++++++++++++++++++++---------------------------------------
mm/slab.h | 2 +-
2 files changed, 137 insertions(+), 242 deletions(-)

--
1.7.9.5


2014-07-01 08:22:37

by Joonsoo Kim

Subject: [PATCH v3 2/9] slab: move up code to get kmem_cache_node in free_block()

node isn't changed, so we don't need to retrieve this structure
every time we move an object. Maybe the compiler does this optimization,
but making it explicit is better.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index f8a0ed1..19e2136 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3417,7 +3417,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
int node)
{
int i;
- struct kmem_cache_node *n;
+ struct kmem_cache_node *n = get_node(cachep, node);

for (i = 0; i < nr_objects; i++) {
void *objp;
@@ -3427,7 +3427,6 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
objp = objpp[i];

page = virt_to_head_page(objp);
- n = get_node(cachep, node);
list_del(&page->lru);
check_spinlock_acquired_node(cachep, node);
slab_put_obj(cachep, page, objp, node);
--
1.7.9.5

2014-07-01 08:22:35

by Joonsoo Kim

Subject: [PATCH v3 4/9] slab: factor out initialization of arracy cache

Factor out initialization of array cache to use it in the following patch.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 59b9a4c..00b6bbc 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -791,13 +791,8 @@ static void start_cpu_timer(int cpu)
}
}

-static struct array_cache *alloc_arraycache(int node, int entries,
- int batchcount, gfp_t gfp)
+static void init_arraycache(struct array_cache *ac, int limit, int batch)
{
- int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
- struct array_cache *nc = NULL;
-
- nc = kmalloc_node(memsize, gfp, node);
/*
* The array_cache structures contain pointers to free object.
* However, when such objects are allocated or transferred to another
@@ -805,15 +800,25 @@ static struct array_cache *alloc_arraycache(int node, int entries,
* valid references during a kmemleak scan. Therefore, kmemleak must
* not scan such objects.
*/
- kmemleak_no_scan(nc);
- if (nc) {
- nc->avail = 0;
- nc->limit = entries;
- nc->batchcount = batchcount;
- nc->touched = 0;
- spin_lock_init(&nc->lock);
+ kmemleak_no_scan(ac);
+ if (ac) {
+ ac->avail = 0;
+ ac->limit = limit;
+ ac->batchcount = batch;
+ ac->touched = 0;
+ spin_lock_init(&ac->lock);
}
- return nc;
+}
+
+static struct array_cache *alloc_arraycache(int node, int entries,
+ int batchcount, gfp_t gfp)
+{
+ int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
+ struct array_cache *ac = NULL;
+
+ ac = kmalloc_node(memsize, gfp, node);
+ init_arraycache(ac, entries, batchcount);
+ return ac;
}

static inline bool is_slab_pfmemalloc(struct page *page)
--
1.7.9.5

2014-07-01 08:23:09

by Joonsoo Kim

Subject: [PATCH v3 9/9] slab: remove BAD_ALIEN_MAGIC

BAD_ALIEN_MAGIC value isn't used anymore. So remove it.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 7820a45..60c9e11 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -470,8 +470,6 @@ static struct kmem_cache kmem_cache_boot = {
.name = "kmem_cache",
};

-#define BAD_ALIEN_MAGIC 0x01020304ul
-
static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);

static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -838,7 +836,7 @@ static int transfer_objects(struct array_cache *to,
static inline struct alien_cache **alloc_alien_cache(int node,
int limit, gfp_t gfp)
{
- return (struct alien_cache **)BAD_ALIEN_MAGIC;
+ return NULL;
}

static inline void free_alien_cache(struct alien_cache **ac_ptr)
--
1.7.9.5

2014-07-01 08:24:04

by Joonsoo Kim

Subject: [PATCH v3 1/9] slab: add unlikely macro to help compiler

slab_should_failslab() is called on every allocation, so optimizing it
is reasonable. We normally don't allocate from kmem_cache; it is only
used when a new kmem_cache is created, which is a very rare case.
Therefore, add the unlikely macro to help compiler optimization.

Acked-by: David Rientjes <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index 179272f..f8a0ed1 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3067,7 +3067,7 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,

static bool slab_should_failslab(struct kmem_cache *cachep, gfp_t flags)
{
- if (cachep == kmem_cache)
+ if (unlikely(cachep == kmem_cache))
return false;

return should_failslab(cachep->object_size, flags, cachep->flags);
--
1.7.9.5

2014-07-01 08:24:03

by Joonsoo Kim

Subject: [PATCH v3 3/9] slab: defer slab_destroy in free_block()

In free_block(), if freeing an object creates a new free slab and the
number of free_objects exceeds free_limit, we start to destroy this new
free slab while holding the kmem_cache node lock. Holding the lock here is
unnecessary and, generally, holding a lock for as short a time as possible
is a good thing. I have never measured the performance effect of this, but
it is better not to hold the lock longer than necessary.

Commented by Christoph:
This is also good because kmem_cache_free is no longer called while
holding the node lock. So we avoid one case of recursion.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 63 +++++++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 43 insertions(+), 20 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 19e2136..59b9a4c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -242,7 +242,8 @@ static struct kmem_cache_node __initdata init_kmem_cache_node[NUM_INIT_LISTS];
static int drain_freelist(struct kmem_cache *cache,
struct kmem_cache_node *n, int tofree);
static void free_block(struct kmem_cache *cachep, void **objpp, int len,
- int node);
+ int node, struct list_head *list);
+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list);
static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp);
static void cache_reap(struct work_struct *unused);

@@ -1030,6 +1031,7 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
struct array_cache *ac, int node)
{
struct kmem_cache_node *n = get_node(cachep, node);
+ LIST_HEAD(list);

if (ac->avail) {
spin_lock(&n->list_lock);
@@ -1041,9 +1043,10 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
if (n->shared)
transfer_objects(n->shared, ac, ac->limit);

- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
ac->avail = 0;
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}

@@ -1087,6 +1090,7 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
struct kmem_cache_node *n;
struct array_cache *alien = NULL;
int node;
+ LIST_HEAD(list);

node = numa_mem_id();

@@ -1111,8 +1115,9 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
} else {
n = get_node(cachep, nodeid);
spin_lock(&n->list_lock);
- free_block(cachep, &objp, 1, nodeid);
+ free_block(cachep, &objp, 1, nodeid, &list);
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
return 1;
}
@@ -1184,6 +1189,7 @@ static void cpuup_canceled(long cpu)
struct array_cache *nc;
struct array_cache *shared;
struct array_cache **alien;
+ LIST_HEAD(list);

/* cpu is dead; no one can alloc from it. */
nc = cachep->array[cpu];
@@ -1199,7 +1205,7 @@ static void cpuup_canceled(long cpu)
if (!memcg_cache_dead(cachep))
n->free_limit -= cachep->batchcount;
if (nc)
- free_block(cachep, nc->entry, nc->avail, node);
+ free_block(cachep, nc->entry, nc->avail, node, &list);

if (!cpumask_empty(mask)) {
spin_unlock_irq(&n->list_lock);
@@ -1209,7 +1215,7 @@ static void cpuup_canceled(long cpu)
shared = n->shared;
if (shared) {
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);
n->shared = NULL;
}

@@ -1224,6 +1230,7 @@ static void cpuup_canceled(long cpu)
free_alien_cache(alien);
}
free_array_cache:
+ slabs_destroy(cachep, &list);
kfree(nc);
}
/*
@@ -2062,6 +2069,16 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
kmem_cache_free(cachep->freelist_cache, freelist);
}

+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
+{
+ struct page *page, *n;
+
+ list_for_each_entry_safe(page, n, list, lru) {
+ list_del(&page->lru);
+ slab_destroy(cachep, page);
+ }
+}
+
/**
* calculate_slab_order - calculate size (page order) of slabs
* @cachep: pointer to the cache that is being created
@@ -2465,6 +2482,7 @@ static void do_drain(void *arg)
struct array_cache *ac;
int node = numa_mem_id();
struct kmem_cache_node *n;
+ LIST_HEAD(list);

check_irq_off();
ac = cpu_cache_get(cachep);
@@ -2473,8 +2491,9 @@ static void do_drain(void *arg)

n = get_node(cachep, node);
spin_lock(&n->list_lock);
- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail = 0;
if (memcg_cache_dead(cachep)) {
cachep->array[smp_processor_id()] = NULL;
@@ -3413,8 +3432,8 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
/*
* Caller needs to acquire correct kmem_cache_node's list_lock
*/
-static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
- int node)
+static void free_block(struct kmem_cache *cachep, void **objpp,
+ int nr_objects, int node, struct list_head *list)
{
int i;
struct kmem_cache_node *n = get_node(cachep, node);
@@ -3437,13 +3456,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
if (page->active == 0) {
if (n->free_objects > n->free_limit) {
n->free_objects -= cachep->num;
- /* No need to drop any previously held
- * lock here, even if we have a off-slab slab
- * descriptor it is guaranteed to come from
- * a different cache, refer to comments before
- * alloc_slabmgmt.
- */
- slab_destroy(cachep, page);
+ list_add_tail(&page->lru, list);
} else {
list_add(&page->lru, &n->slabs_free);
}
@@ -3462,6 +3475,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
int batchcount;
struct kmem_cache_node *n;
int node = numa_mem_id();
+ LIST_HEAD(list);

batchcount = ac->batchcount;
#if DEBUG
@@ -3483,7 +3497,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
}
}

- free_block(cachep, ac->entry, batchcount, node);
+ free_block(cachep, ac->entry, batchcount, node, &list);
free_done:
#if STATS
{
@@ -3504,6 +3518,7 @@ free_done:
}
#endif
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail -= batchcount;
memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail);
}
@@ -3531,11 +3546,13 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,

#ifdef CONFIG_MEMCG_KMEM
if (unlikely(!ac)) {
+ LIST_HEAD(list);
int nodeid = page_to_nid(virt_to_page(objp));

spin_lock(&cachep->node[nodeid]->list_lock);
- free_block(cachep, &objp, 1, nodeid);
+ free_block(cachep, &objp, 1, nodeid, &list);
spin_unlock(&cachep->node[nodeid]->list_lock);
+ slabs_destroy(cachep, &list);
return;
}
#endif
@@ -3801,12 +3818,13 @@ static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
n = get_node(cachep, node);
if (n) {
struct array_cache *shared = n->shared;
+ LIST_HEAD(list);

spin_lock_irq(&n->list_lock);

if (shared)
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);

n->shared = new_shared;
if (!n->alien) {
@@ -3816,6 +3834,7 @@ static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
n->free_limit = (1 + nr_cpus_node(node)) *
cachep->batchcount + cachep->num;
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(shared);
free_alien_cache(new_alien);
continue;
@@ -3908,6 +3927,7 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
cachep->shared = shared;

for_each_online_cpu(i) {
+ LIST_HEAD(list);
struct array_cache *ccold = new->new[i];
int node;
struct kmem_cache_node *n;
@@ -3918,8 +3938,9 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
node = cpu_to_mem(i);
n = get_node(cachep, node);
spin_lock_irq(&n->list_lock);
- free_block(cachep, ccold->entry, ccold->avail, node);
+ free_block(cachep, ccold->entry, ccold->avail, node, &list);
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(ccold);
}
kfree(new);
@@ -4027,6 +4048,7 @@ skip_setup:
static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
struct array_cache *ac, int force, int node)
{
+ LIST_HEAD(list);
int tofree;

if (!ac || !ac->avail)
@@ -4039,12 +4061,13 @@ static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
tofree = force ? ac->avail : (ac->limit + 4) / 5;
if (tofree > ac->avail)
tofree = (ac->avail + 1) / 2;
- free_block(cachep, ac->entry, tofree, node);
+ free_block(cachep, ac->entry, tofree, node, &list);
ac->avail -= tofree;
memmove(ac->entry, &(ac->entry[tofree]),
sizeof(void *) * ac->avail);
}
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}

--
1.7.9.5

2014-07-01 08:24:01

by Joonsoo Kim

Subject: [PATCH v3 6/9] slab: use the lock on alien_cache, instead of the lock on array_cache

Now that we have a separate alien_cache structure, it is better to hold
the lock on alien_cache while manipulating the alien cache. After that,
we don't need the lock on array_cache, so remove it.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 25 ++++++++-----------------
1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index e1a473d..1c319ad 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -191,7 +191,6 @@ struct array_cache {
unsigned int limit;
unsigned int batchcount;
unsigned int touched;
- spinlock_t lock;
void *entry[]; /*
* Must have this definition in here for the proper
* alignment of array_cache. Also simplifies accessing
@@ -512,7 +511,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
return;
for_each_node(r) {
if (alc[r])
- lockdep_set_class(&(alc[r]->ac.lock), alc_key);
+ lockdep_set_class(&(alc[r]->lock), alc_key);
}
}

@@ -811,7 +810,6 @@ static void init_arraycache(struct array_cache *ac, int limit, int batch)
ac->limit = limit;
ac->batchcount = batch;
ac->touched = 0;
- spin_lock_init(&ac->lock);
}
}

@@ -1010,6 +1008,7 @@ static struct alien_cache *__alloc_alien_cache(int node, int entries,

alc = kmalloc_node(memsize, gfp, node);
init_arraycache(&alc->ac, entries, batch);
+ spin_lock_init(&alc->lock);
return alc;
}

@@ -1086,9 +1085,9 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)

if (alc) {
ac = &alc->ac;
- if (ac->avail && spin_trylock_irq(&ac->lock)) {
+ if (ac->avail && spin_trylock_irq(&alc->lock)) {
__drain_alien_cache(cachep, ac, node);
- spin_unlock_irq(&ac->lock);
+ spin_unlock_irq(&alc->lock);
}
}
}
@@ -1106,9 +1105,9 @@ static void drain_alien_cache(struct kmem_cache *cachep,
alc = alien[i];
if (alc) {
ac = &alc->ac;
- spin_lock_irqsave(&ac->lock, flags);
+ spin_lock_irqsave(&alc->lock, flags);
__drain_alien_cache(cachep, ac, i);
- spin_unlock_irqrestore(&ac->lock, flags);
+ spin_unlock_irqrestore(&alc->lock, flags);
}
}
}
@@ -1136,13 +1135,13 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
if (n->alien && n->alien[nodeid]) {
alien = n->alien[nodeid];
ac = &alien->ac;
- spin_lock(&ac->lock);
+ spin_lock(&alien->lock);
if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
__drain_alien_cache(cachep, ac, nodeid);
}
ac_put_obj(cachep, ac, objp);
- spin_unlock(&ac->lock);
+ spin_unlock(&alien->lock);
} else {
n = get_node(cachep, nodeid);
spin_lock(&n->list_lock);
@@ -1619,10 +1618,6 @@ void __init kmem_cache_init(void)

memcpy(ptr, cpu_cache_get(kmem_cache),
sizeof(struct arraycache_init));
- /*
- * Do not assume that spinlocks can be initialized via memcpy:
- */
- spin_lock_init(&ptr->lock);

kmem_cache->array[smp_processor_id()] = ptr;

@@ -1632,10 +1627,6 @@ void __init kmem_cache_init(void)
!= &initarray_generic.cache);
memcpy(ptr, cpu_cache_get(kmalloc_caches[INDEX_AC]),
sizeof(struct arraycache_init));
- /*
- * Do not assume that spinlocks can be initialized via memcpy:
- */
- spin_lock_init(&ptr->lock);

kmalloc_caches[INDEX_AC]->array[smp_processor_id()] = ptr;
}
--
1.7.9.5

2014-07-01 08:23:59

by Joonsoo Kim

Subject: [PATCH v3 5/9] slab: introduce alien_cache

Currently, we use array_cache for the alien cache. Although they are mostly
similar, there is one difference: the need for a spinlock. We don't need a
spinlock for array_cache itself, but to use array_cache as an alien cache,
the array_cache structure has to have a spinlock. This is needless
overhead, so removing it would be better. This patch prepares for that by
introducing alien_cache and using it. In the following patch, we remove
the spinlock from array_cache.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 108 ++++++++++++++++++++++++++++++++++++++-----------------------
mm/slab.h | 2 +-
2 files changed, 68 insertions(+), 42 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 00b6bbc..e1a473d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -203,6 +203,11 @@ struct array_cache {
*/
};

+struct alien_cache {
+ spinlock_t lock;
+ struct array_cache ac;
+};
+
#define SLAB_OBJ_PFMEMALLOC 1
static inline bool is_obj_pfmemalloc(void *objp)
{
@@ -491,7 +496,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
struct lock_class_key *l3_key, struct lock_class_key *alc_key,
struct kmem_cache_node *n)
{
- struct array_cache **alc;
+ struct alien_cache **alc;
int r;

lockdep_set_class(&n->list_lock, l3_key);
@@ -507,7 +512,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
return;
for_each_node(r) {
if (alc[r])
- lockdep_set_class(&alc[r]->lock, alc_key);
+ lockdep_set_class(&(alc[r]->ac.lock), alc_key);
}
}

@@ -965,12 +970,13 @@ static int transfer_objects(struct array_cache *to,
#define drain_alien_cache(cachep, alien) do { } while (0)
#define reap_alien(cachep, n) do { } while (0)

-static inline struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
+static inline struct alien_cache **alloc_alien_cache(int node,
+ int limit, gfp_t gfp)
{
- return (struct array_cache **)BAD_ALIEN_MAGIC;
+ return (struct alien_cache **)BAD_ALIEN_MAGIC;
}

-static inline void free_alien_cache(struct array_cache **ac_ptr)
+static inline void free_alien_cache(struct alien_cache **ac_ptr)
{
}

@@ -996,40 +1002,52 @@ static inline void *____cache_alloc_node(struct kmem_cache *cachep,
static void *____cache_alloc_node(struct kmem_cache *, gfp_t, int);
static void *alternate_node_alloc(struct kmem_cache *, gfp_t);

-static struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
+static struct alien_cache *__alloc_alien_cache(int node, int entries,
+ int batch, gfp_t gfp)
+{
+ int memsize = sizeof(void *) * entries + sizeof(struct alien_cache);
+ struct alien_cache *alc = NULL;
+
+ alc = kmalloc_node(memsize, gfp, node);
+ init_arraycache(&alc->ac, entries, batch);
+ return alc;
+}
+
+static struct alien_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
{
- struct array_cache **ac_ptr;
+ struct alien_cache **alc_ptr;
int memsize = sizeof(void *) * nr_node_ids;
int i;

if (limit > 1)
limit = 12;
- ac_ptr = kzalloc_node(memsize, gfp, node);
- if (ac_ptr) {
- for_each_node(i) {
- if (i == node || !node_online(i))
- continue;
- ac_ptr[i] = alloc_arraycache(node, limit, 0xbaadf00d, gfp);
- if (!ac_ptr[i]) {
- for (i--; i >= 0; i--)
- kfree(ac_ptr[i]);
- kfree(ac_ptr);
- return NULL;
- }
+ alc_ptr = kzalloc_node(memsize, gfp, node);
+ if (!alc_ptr)
+ return NULL;
+
+ for_each_node(i) {
+ if (i == node || !node_online(i))
+ continue;
+ alc_ptr[i] = __alloc_alien_cache(node, limit, 0xbaadf00d, gfp);
+ if (!alc_ptr[i]) {
+ for (i--; i >= 0; i--)
+ kfree(alc_ptr[i]);
+ kfree(alc_ptr);
+ return NULL;
}
}
- return ac_ptr;
+ return alc_ptr;
}

-static void free_alien_cache(struct array_cache **ac_ptr)
+static void free_alien_cache(struct alien_cache **alc_ptr)
{
int i;

- if (!ac_ptr)
+ if (!alc_ptr)
return;
for_each_node(i)
- kfree(ac_ptr[i]);
- kfree(ac_ptr);
+ kfree(alc_ptr[i]);
+ kfree(alc_ptr);
}

static void __drain_alien_cache(struct kmem_cache *cachep,
@@ -1063,25 +1081,31 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)
int node = __this_cpu_read(slab_reap_node);

if (n->alien) {
- struct array_cache *ac = n->alien[node];
-
- if (ac && ac->avail && spin_trylock_irq(&ac->lock)) {
- __drain_alien_cache(cachep, ac, node);
- spin_unlock_irq(&ac->lock);
+ struct alien_cache *alc = n->alien[node];
+ struct array_cache *ac;
+
+ if (alc) {
+ ac = &alc->ac;
+ if (ac->avail && spin_trylock_irq(&ac->lock)) {
+ __drain_alien_cache(cachep, ac, node);
+ spin_unlock_irq(&ac->lock);
+ }
}
}
}

static void drain_alien_cache(struct kmem_cache *cachep,
- struct array_cache **alien)
+ struct alien_cache **alien)
{
int i = 0;
+ struct alien_cache *alc;
struct array_cache *ac;
unsigned long flags;

for_each_online_node(i) {
- ac = alien[i];
- if (ac) {
+ alc = alien[i];
+ if (alc) {
+ ac = &alc->ac;
spin_lock_irqsave(&ac->lock, flags);
__drain_alien_cache(cachep, ac, i);
spin_unlock_irqrestore(&ac->lock, flags);
@@ -1093,7 +1117,8 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
{
int nodeid = page_to_nid(virt_to_page(objp));
struct kmem_cache_node *n;
- struct array_cache *alien = NULL;
+ struct alien_cache *alien = NULL;
+ struct array_cache *ac;
int node;
LIST_HEAD(list);

@@ -1110,13 +1135,14 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
STATS_INC_NODEFREES(cachep);
if (n->alien && n->alien[nodeid]) {
alien = n->alien[nodeid];
- spin_lock(&alien->lock);
- if (unlikely(alien->avail == alien->limit)) {
+ ac = &alien->ac;
+ spin_lock(&ac->lock);
+ if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
- __drain_alien_cache(cachep, alien, nodeid);
+ __drain_alien_cache(cachep, ac, nodeid);
}
- ac_put_obj(cachep, alien, objp);
- spin_unlock(&alien->lock);
+ ac_put_obj(cachep, ac, objp);
+ spin_unlock(&ac->lock);
} else {
n = get_node(cachep, nodeid);
spin_lock(&n->list_lock);
@@ -1193,7 +1219,7 @@ static void cpuup_canceled(long cpu)
list_for_each_entry(cachep, &slab_caches, list) {
struct array_cache *nc;
struct array_cache *shared;
- struct array_cache **alien;
+ struct alien_cache **alien;
LIST_HEAD(list);

/* cpu is dead; no one can alloc from it. */
@@ -1275,7 +1301,7 @@ static int cpuup_prepare(long cpu)
list_for_each_entry(cachep, &slab_caches, list) {
struct array_cache *nc;
struct array_cache *shared = NULL;
- struct array_cache **alien = NULL;
+ struct alien_cache **alien = NULL;

if (memcg_cache_dead(cachep))
continue;
@@ -3799,7 +3825,7 @@ static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
int node;
struct kmem_cache_node *n;
struct array_cache *new_shared;
- struct array_cache **new_alien = NULL;
+ struct alien_cache **new_alien = NULL;

for_each_online_node(node) {

diff --git a/mm/slab.h b/mm/slab.h
index 4789ca4..84c160a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -301,7 +301,7 @@ struct kmem_cache_node {
unsigned int free_limit;
unsigned int colour_next; /* Per-node cache coloring */
struct array_cache *shared; /* shared per node */
- struct array_cache **alien; /* on other nodes */
+ struct alien_cache **alien; /* on other nodes */
unsigned long next_reap; /* updated without locking */
int free_touched; /* updated without locking */
#endif
--
1.7.9.5

2014-07-01 08:23:58

by Joonsoo Kim

Subject: [PATCH v3 8/9] slab: remove a useless lockdep annotation

Now there is no code that holds two locks simultaneously, since
we don't call slab_destroy() while holding any lock. So the lockdep
annotation is useless now. Remove it.

v2: don't remove BAD_ALIEN_MAGIC in this patch. It will be removed
in the following patch.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 153 -------------------------------------------------------------
1 file changed, 153 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 854dfa0..7820a45 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -472,139 +472,6 @@ static struct kmem_cache kmem_cache_boot = {

#define BAD_ALIEN_MAGIC 0x01020304ul

-#ifdef CONFIG_LOCKDEP
-
-/*
- * Slab sometimes uses the kmalloc slabs to store the slab headers
- * for other slabs "off slab".
- * The locking for this is tricky in that it nests within the locks
- * of all other slabs in a few places; to deal with this special
- * locking we put on-slab caches into a separate lock-class.
- *
- * We set lock class for alien array caches which are up during init.
- * The lock annotation will be lost if all cpus of a node goes down and
- * then comes back up during hotplug
- */
-static struct lock_class_key on_slab_l3_key;
-static struct lock_class_key on_slab_alc_key;
-
-static struct lock_class_key debugobj_l3_key;
-static struct lock_class_key debugobj_alc_key;
-
-static void slab_set_lock_classes(struct kmem_cache *cachep,
- struct lock_class_key *l3_key, struct lock_class_key *alc_key,
- struct kmem_cache_node *n)
-{
- struct alien_cache **alc;
- int r;
-
- lockdep_set_class(&n->list_lock, l3_key);
- alc = n->alien;
- /*
- * FIXME: This check for BAD_ALIEN_MAGIC
- * should go away when common slab code is taught to
- * work even without alien caches.
- * Currently, non NUMA code returns BAD_ALIEN_MAGIC
- * for alloc_alien_cache,
- */
- if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
- return;
- for_each_node(r) {
- if (alc[r])
- lockdep_set_class(&(alc[r]->lock), alc_key);
- }
-}
-
-static void slab_set_debugobj_lock_classes_node(struct kmem_cache *cachep,
- struct kmem_cache_node *n)
-{
- slab_set_lock_classes(cachep, &debugobj_l3_key, &debugobj_alc_key, n);
-}
-
-static void slab_set_debugobj_lock_classes(struct kmem_cache *cachep)
-{
- int node;
- struct kmem_cache_node *n;
-
- for_each_kmem_cache_node(cachep, node, n)
- slab_set_debugobj_lock_classes_node(cachep, n);
-}
-
-static void init_node_lock_keys(int q)
-{
- int i;
-
- if (slab_state < UP)
- return;
-
- for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
- struct kmem_cache_node *n;
- struct kmem_cache *cache = kmalloc_caches[i];
-
- if (!cache)
- continue;
-
- n = get_node(cache, q);
- if (!n || OFF_SLAB(cache))
- continue;
-
- slab_set_lock_classes(cache, &on_slab_l3_key,
- &on_slab_alc_key, n);
- }
-}
-
-static void on_slab_lock_classes_node(struct kmem_cache *cachep,
- struct kmem_cache_node *n)
-{
- slab_set_lock_classes(cachep, &on_slab_l3_key,
- &on_slab_alc_key, n);
-}
-
-static inline void on_slab_lock_classes(struct kmem_cache *cachep)
-{
- int node;
- struct kmem_cache_node *n;
-
- VM_BUG_ON(OFF_SLAB(cachep));
- for_each_kmem_cache_node(cachep, node, n)
- on_slab_lock_classes_node(cachep, n);
-}
-
-static inline void __init init_lock_keys(void)
-{
- int node;
-
- for_each_node(node)
- init_node_lock_keys(node);
-}
-#else
-static void __init init_node_lock_keys(int q)
-{
-}
-
-static inline void init_lock_keys(void)
-{
-}
-
-static inline void on_slab_lock_classes(struct kmem_cache *cachep)
-{
-}
-
-static inline void on_slab_lock_classes_node(struct kmem_cache *cachep,
- struct kmem_cache_node *n)
-{
-}
-
-static void slab_set_debugobj_lock_classes_node(struct kmem_cache *cachep,
- struct kmem_cache_node *n)
-{
-}
-
-static void slab_set_debugobj_lock_classes(struct kmem_cache *cachep)
-{
-}
-#endif
-
static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);

static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -1354,13 +1221,7 @@ static int cpuup_prepare(long cpu)
spin_unlock_irq(&n->list_lock);
kfree(shared);
free_alien_cache(alien);
- if (cachep->flags & SLAB_DEBUG_OBJECTS)
- slab_set_debugobj_lock_classes_node(cachep, n);
- else if (!OFF_SLAB(cachep) &&
- !(cachep->flags & SLAB_DESTROY_BY_RCU))
- on_slab_lock_classes_node(cachep, n);
}
- init_node_lock_keys(node);

return 0;
bad:
@@ -1669,9 +1530,6 @@ void __init kmem_cache_init_late(void)
BUG();
mutex_unlock(&slab_mutex);

- /* Annotate slab for lockdep -- annotate the malloc caches */
- init_lock_keys();
-
/* Done! */
slab_state = FULL;

@@ -2452,17 +2310,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
return err;
}

- if (flags & SLAB_DEBUG_OBJECTS) {
- /*
- * Would deadlock through slab_destroy()->call_rcu()->
- * debug_object_activate()->kmem_cache_alloc().
- */
- WARN_ON_ONCE(flags & SLAB_DESTROY_BY_RCU);
-
- slab_set_debugobj_lock_classes(cachep);
- } else if (!OFF_SLAB(cachep) && !(flags & SLAB_DESTROY_BY_RCU))
- on_slab_lock_classes(cachep);
-
return 0;
}

--
1.7.9.5

2014-07-01 08:23:56

by Joonsoo Kim

Subject: [PATCH v3 7/9] slab: destroy a slab without holding any alien cache lock

I haven't heard that this alien cache lock is contended, but reducing the
chance of contention is generally better. And with this change, we can
simplify the complex lockdep annotation in the slab code; that is done
in the following patch.

Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 1c319ad..854dfa0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1050,10 +1050,10 @@ static void free_alien_cache(struct alien_cache **alc_ptr)
}

static void __drain_alien_cache(struct kmem_cache *cachep,
- struct array_cache *ac, int node)
+ struct array_cache *ac, int node,
+ struct list_head *list)
{
struct kmem_cache_node *n = get_node(cachep, node);
- LIST_HEAD(list);

if (ac->avail) {
spin_lock(&n->list_lock);
@@ -1065,10 +1065,9 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
if (n->shared)
transfer_objects(n->shared, ac, ac->limit);

- free_block(cachep, ac->entry, ac->avail, node, &list);
+ free_block(cachep, ac->entry, ac->avail, node, list);
ac->avail = 0;
spin_unlock(&n->list_lock);
- slabs_destroy(cachep, &list);
}
}

@@ -1086,8 +1085,11 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)
if (alc) {
ac = &alc->ac;
if (ac->avail && spin_trylock_irq(&alc->lock)) {
- __drain_alien_cache(cachep, ac, node);
+ LIST_HEAD(list);
+
+ __drain_alien_cache(cachep, ac, node, &list);
spin_unlock_irq(&alc->lock);
+ slabs_destroy(cachep, &list);
}
}
}
@@ -1104,10 +1106,13 @@ static void drain_alien_cache(struct kmem_cache *cachep,
for_each_online_node(i) {
alc = alien[i];
if (alc) {
+ LIST_HEAD(list);
+
ac = &alc->ac;
spin_lock_irqsave(&alc->lock, flags);
- __drain_alien_cache(cachep, ac, i);
+ __drain_alien_cache(cachep, ac, i, &list);
spin_unlock_irqrestore(&alc->lock, flags);
+ slabs_destroy(cachep, &list);
}
}
}
@@ -1138,10 +1143,11 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
spin_lock(&alien->lock);
if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
- __drain_alien_cache(cachep, ac, nodeid);
+ __drain_alien_cache(cachep, ac, nodeid, &list);
}
ac_put_obj(cachep, ac, objp);
spin_unlock(&alien->lock);
+ slabs_destroy(cachep, &list);
} else {
n = get_node(cachep, nodeid);
spin_lock(&n->list_lock);
--
1.7.9.5

2014-07-01 22:15:49

by Andrew Morton

Subject: Re: [PATCH v3 5/9] slab: introduce alien_cache

On Tue, 1 Jul 2014 17:27:34 +0900 Joonsoo Kim <[email protected]> wrote:

> -static struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
> +static struct alien_cache *__alloc_alien_cache(int node, int entries,
> + int batch, gfp_t gfp)
> +{
> + int memsize = sizeof(void *) * entries + sizeof(struct alien_cache);

nit: all five memsizes in slab.c have type `int'. size_t would be more
appropriate.

> + struct alien_cache *alc = NULL;
> +
> + alc = kmalloc_node(memsize, gfp, node);
> + init_arraycache(&alc->ac, entries, batch);
> + return alc;
> +}

2014-07-01 22:21:39

by David Rientjes

Subject: Re: [PATCH v3 2/9] slab: move up code to get kmem_cache_node in free_block()

On Tue, 1 Jul 2014, Joonsoo Kim wrote:

> node isn't changed, so we don't need to retrieve this structure
> every time we move an object. Maybe the compiler does this optimization,
> but making it explicit is better.
>

Qualifying the pointer as const would be even more explicit.

> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Joonsoo Kim <[email protected]>

Acked-by: David Rientjes <[email protected]>

2014-07-01 22:25:08

by David Rientjes

Subject: Re: [PATCH v3 3/9] slab: defer slab_destroy in free_block()

On Tue, 1 Jul 2014, Joonsoo Kim wrote:

> In free_block(), if freeing an object creates a new free slab and the
> number of free_objects exceeds free_limit, we start to destroy this new
> free slab while holding the kmem_cache node lock. Holding the lock here is
> unnecessary and, generally, holding a lock for as short a time as possible
> is a good thing. I have never measured the performance effect of this, but
> it is better not to hold the lock longer than necessary.
>
> Commented by Christoph:
> This is also good because kmem_cache_free is no longer called while
> holding the node lock. So we avoid one case of recursion.
>
> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Joonsoo Kim <[email protected]>

Not sure what happened to my

Acked-by: David Rientjes <[email protected]>

from http://marc.info/?l=linux-kernel&m=139951092124314, and for the
record, I still think the free_block() "list" formal should be commented.

2014-07-01 22:26:30

by David Rientjes

Subject: Re: [PATCH v3 4/9] slab: factor out initialization of arracy cache

On Tue, 1 Jul 2014, Joonsoo Kim wrote:

> Factor out initialization of array cache to use it in the following patch.
>
> Acked-by: Christoph Lameter <[email protected]>
> Signed-off-by: Joonsoo Kim <[email protected]>

Not sure what happened to my

Acked-by: David Rientjes <[email protected]>

from http://marc.info/?l=linux-mm&m=139951195724487 and my comment still
stands about s/arracy/array/ in the patch title.

2014-07-02 00:34:44

by Joonsoo Kim

Subject: Re: [PATCH v3 2/9] slab: move up code to get kmem_cache_node in free_block()

On Tue, Jul 01, 2014 at 03:21:21PM -0700, David Rientjes wrote:
> On Tue, 1 Jul 2014, Joonsoo Kim wrote:
>
> > node isn't changed, so we don't need to retrieve this structure
> > every time we move an object. Maybe the compiler does this optimization,
> > but making it explicit is better.
> >
>
> Qualifying the pointer as const would be even more explicit.

Hello,

So what you recommend is something like the below?

- struct kmem_cache_node *n = get_node(cachep, node);
+ struct kmem_cache_node * const n = get_node(cachep, node);

I haven't seen this form of code, protecting the pointer itself, in mm.
Instead, I have seen 'const struct kmem_cache_node *n', which protects
the memory pointed to by the pointer. But this isn't that case.

Am I missing something?
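
For reference, a small stand-alone example of the two const placements
being discussed. The type here is a made-up stand-in, not the real
struct kmem_cache_node; the first form is the one quoted above, which
only prevents reseating the pointer inside the function.

#include <stdio.h>

/* Stand-in type; not the real struct kmem_cache_node. */
struct node_like {
        int free_objects;
};

int main(void)
{
        struct node_like a = { 1 }, b = { 2 };

        /* "struct node_like * const n": the pointer itself is const. */
        struct node_like * const n = &a;
        n->free_objects = 10;   /* ok: the pointee is still writable */
        /* n = &b; */           /* error: n cannot point elsewhere */

        /* "const struct node_like *p": the pointee is const. */
        const struct node_like *p = &a;
        p = &b;                 /* ok: the pointer can be reseated */
        /* p->free_objects = 10; */  /* error: cannot modify through p */

        printf("%d %d\n", n->free_objects, p->free_objects);
        return 0;
}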

>
> > Acked-by: Christoph Lameter <[email protected]>
> > Signed-off-by: Joonsoo Kim <[email protected]>
>
> Acked-by: David Rientjes <[email protected]>

Thank you!

2014-07-02 00:39:17

by Joonsoo Kim

Subject: Re: [PATCH v3 3/9] slab: defer slab_destroy in free_block()

On Tue, Jul 01, 2014 at 03:25:04PM -0700, David Rientjes wrote:
> On Tue, 1 Jul 2014, Joonsoo Kim wrote:
>
> > In free_block(), if freeing an object creates a new free slab and the
> > number of free_objects exceeds free_limit, we start to destroy this new
> > free slab while holding the kmem_cache node lock. Holding the lock here is
> > unnecessary and, generally, holding a lock for as short a time as possible
> > is a good thing. I have never measured the performance effect of this, but
> > it is better not to hold the lock longer than necessary.
> >
> > Commented by Christoph:
> > This is also good because kmem_cache_free is no longer called while
> > holding the node lock. So we avoid one case of recursion.
> >
> > Acked-by: Christoph Lameter <[email protected]>
> > Signed-off-by: Joonsoo Kim <[email protected]>
>
> Not sure what happened to my
>
> Acked-by: David Rientjes <[email protected]>
>
> from http://marc.info/?l=linux-kernel&m=139951092124314, and for the
> record, I still think the free_block() "list" formal should be commented.


Really sorry about that.
My mail client didn't have this mail for some unknown reason, so I missed it.

Here is the new one with your comment applied.

--------->8------------
>From 39d0bd43b978583a3e735a4bc9896447f7873153 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <[email protected]>
Date: Thu, 29 Aug 2013 15:25:36 +0900
Subject: [PATCH v4 3/9] slab: defer slab_destroy in free_block()

In free_block(), if freeing an object creates a new free slab and the
number of free_objects exceeds free_limit, we start to destroy this new
free slab while holding the kmem_cache node lock. Holding the lock here is
unnecessary and, generally, holding a lock for as short a time as possible
is a good thing. I have never measured the performance effect of this, but
it is better not to hold the lock longer than necessary.

Commented by Christoph:
This is also good because kmem_cache_free is no longer called while
holding the node lock. So we avoid one case of recursion.

Acked-by: Christoph Lameter <[email protected]>
Acked-by: David Rientjes <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 64 ++++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 19e2136..8f9e176 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -242,7 +242,8 @@ static struct kmem_cache_node __initdata init_kmem_cache_node[NUM_INIT_LISTS];
static int drain_freelist(struct kmem_cache *cache,
struct kmem_cache_node *n, int tofree);
static void free_block(struct kmem_cache *cachep, void **objpp, int len,
- int node);
+ int node, struct list_head *list);
+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list);
static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp);
static void cache_reap(struct work_struct *unused);

@@ -1030,6 +1031,7 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
struct array_cache *ac, int node)
{
struct kmem_cache_node *n = get_node(cachep, node);
+ LIST_HEAD(list);

if (ac->avail) {
spin_lock(&n->list_lock);
@@ -1041,9 +1043,10 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
if (n->shared)
transfer_objects(n->shared, ac, ac->limit);

- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
ac->avail = 0;
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}

@@ -1087,6 +1090,7 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
struct kmem_cache_node *n;
struct array_cache *alien = NULL;
int node;
+ LIST_HEAD(list);

node = numa_mem_id();

@@ -1111,8 +1115,9 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
} else {
n = get_node(cachep, nodeid);
spin_lock(&n->list_lock);
- free_block(cachep, &objp, 1, nodeid);
+ free_block(cachep, &objp, 1, nodeid, &list);
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
return 1;
}
@@ -1184,6 +1189,7 @@ static void cpuup_canceled(long cpu)
struct array_cache *nc;
struct array_cache *shared;
struct array_cache **alien;
+ LIST_HEAD(list);

/* cpu is dead; no one can alloc from it. */
nc = cachep->array[cpu];
@@ -1199,7 +1205,7 @@ static void cpuup_canceled(long cpu)
if (!memcg_cache_dead(cachep))
n->free_limit -= cachep->batchcount;
if (nc)
- free_block(cachep, nc->entry, nc->avail, node);
+ free_block(cachep, nc->entry, nc->avail, node, &list);

if (!cpumask_empty(mask)) {
spin_unlock_irq(&n->list_lock);
@@ -1209,7 +1215,7 @@ static void cpuup_canceled(long cpu)
shared = n->shared;
if (shared) {
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);
n->shared = NULL;
}

@@ -1224,6 +1230,7 @@ static void cpuup_canceled(long cpu)
free_alien_cache(alien);
}
free_array_cache:
+ slabs_destroy(cachep, &list);
kfree(nc);
}
/*
@@ -2062,6 +2069,16 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
kmem_cache_free(cachep->freelist_cache, freelist);
}

+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
+{
+ struct page *page, *n;
+
+ list_for_each_entry_safe(page, n, list, lru) {
+ list_del(&page->lru);
+ slab_destroy(cachep, page);
+ }
+}
+
/**
* calculate_slab_order - calculate size (page order) of slabs
* @cachep: pointer to the cache that is being created
@@ -2465,6 +2482,7 @@ static void do_drain(void *arg)
struct array_cache *ac;
int node = numa_mem_id();
struct kmem_cache_node *n;
+ LIST_HEAD(list);

check_irq_off();
ac = cpu_cache_get(cachep);
@@ -2473,8 +2491,9 @@ static void do_drain(void *arg)

n = get_node(cachep, node);
spin_lock(&n->list_lock);
- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail = 0;
if (memcg_cache_dead(cachep)) {
cachep->array[smp_processor_id()] = NULL;
@@ -3412,9 +3431,10 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)

/*
* Caller needs to acquire correct kmem_cache_node's list_lock
+ * @list: List of detached free slabs should be freed by caller
*/
-static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
- int node)
+static void free_block(struct kmem_cache *cachep, void **objpp,
+ int nr_objects, int node, struct list_head *list)
{
int i;
struct kmem_cache_node *n = get_node(cachep, node);
@@ -3437,13 +3457,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
if (page->active == 0) {
if (n->free_objects > n->free_limit) {
n->free_objects -= cachep->num;
- /* No need to drop any previously held
- * lock here, even if we have a off-slab slab
- * descriptor it is guaranteed to come from
- * a different cache, refer to comments before
- * alloc_slabmgmt.
- */
- slab_destroy(cachep, page);
+ list_add_tail(&page->lru, list);
} else {
list_add(&page->lru, &n->slabs_free);
}
@@ -3462,6 +3476,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
int batchcount;
struct kmem_cache_node *n;
int node = numa_mem_id();
+ LIST_HEAD(list);

batchcount = ac->batchcount;
#if DEBUG
@@ -3483,7 +3498,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
}
}

- free_block(cachep, ac->entry, batchcount, node);
+ free_block(cachep, ac->entry, batchcount, node, &list);
free_done:
#if STATS
{
@@ -3504,6 +3519,7 @@ free_done:
}
#endif
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail -= batchcount;
memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail);
}
@@ -3531,11 +3547,13 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,

#ifdef CONFIG_MEMCG_KMEM
if (unlikely(!ac)) {
+ LIST_HEAD(list);
int nodeid = page_to_nid(virt_to_page(objp));

spin_lock(&cachep->node[nodeid]->list_lock);
- free_block(cachep, &objp, 1, nodeid);
+ free_block(cachep, &objp, 1, nodeid, &list);
spin_unlock(&cachep->node[nodeid]->list_lock);
+ slabs_destroy(cachep, &list);
return;
}
#endif
@@ -3801,12 +3819,13 @@ static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
n = get_node(cachep, node);
if (n) {
struct array_cache *shared = n->shared;
+ LIST_HEAD(list);

spin_lock_irq(&n->list_lock);

if (shared)
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);

n->shared = new_shared;
if (!n->alien) {
@@ -3816,6 +3835,7 @@ static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
n->free_limit = (1 + nr_cpus_node(node)) *
cachep->batchcount + cachep->num;
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(shared);
free_alien_cache(new_alien);
continue;
@@ -3908,6 +3928,7 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
cachep->shared = shared;

for_each_online_cpu(i) {
+ LIST_HEAD(list);
struct array_cache *ccold = new->new[i];
int node;
struct kmem_cache_node *n;
@@ -3918,8 +3939,9 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
node = cpu_to_mem(i);
n = get_node(cachep, node);
spin_lock_irq(&n->list_lock);
- free_block(cachep, ccold->entry, ccold->avail, node);
+ free_block(cachep, ccold->entry, ccold->avail, node, &list);
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(ccold);
}
kfree(new);
@@ -4027,6 +4049,7 @@ skip_setup:
static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
struct array_cache *ac, int force, int node)
{
+ LIST_HEAD(list);
int tofree;

if (!ac || !ac->avail)
@@ -4039,12 +4062,13 @@ static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
tofree = force ? ac->avail : (ac->limit + 4) / 5;
if (tofree > ac->avail)
tofree = (ac->avail + 1) / 2;
- free_block(cachep, ac->entry, tofree, node);
+ free_block(cachep, ac->entry, tofree, node, &list);
ac->avail -= tofree;
memmove(ac->entry, &(ac->entry[tofree]),
sizeof(void *) * ac->avail);
}
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}

--
1.7.9.5

2014-07-02 00:41:07

by Joonsoo Kim

Subject: Re: [PATCH v3 4/9] slab: factor out initialization of arracy cache

On Tue, Jul 01, 2014 at 03:26:26PM -0700, David Rientjes wrote:
> On Tue, 1 Jul 2014, Joonsoo Kim wrote:
>
> > Factor out initialization of array cache to use it in the following patch.
> >
> > Acked-by: Christoph Lameter <[email protected]>
> > Signed-off-by: Joonsoo Kim <[email protected]>
>
> Not sure what happened to my
>
> Acked-by: David Rientjes <[email protected]>
>
> from http://marc.info/?l=linux-mm&m=139951195724487 and my comment still
> stands about s/arracy/array/ in the patch title.

This is the new one with your comment applied.

Thanks.

--------------8<--------------
>From 5d4f47d839ae4fc0796b633d41858d9c87fd235d Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <[email protected]>
Date: Fri, 30 Aug 2013 14:34:38 +0900
Subject: [PATCH v4 4/9] slab: factor out initialization of array cache

Factor out initialization of array cache to use it in the following patch.

Acked-by: Christoph Lameter <[email protected]>
Acked-by: David Rientjes <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>
---
mm/slab.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 8f9e176..6fa3fdf 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -791,13 +791,8 @@ static void start_cpu_timer(int cpu)
}
}

-static struct array_cache *alloc_arraycache(int node, int entries,
- int batchcount, gfp_t gfp)
+static void init_arraycache(struct array_cache *ac, int limit, int batch)
{
- int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
- struct array_cache *nc = NULL;
-
- nc = kmalloc_node(memsize, gfp, node);
/*
* The array_cache structures contain pointers to free object.
* However, when such objects are allocated or transferred to another
@@ -805,15 +800,25 @@ static struct array_cache *alloc_arraycache(int node, int entries,
* valid references during a kmemleak scan. Therefore, kmemleak must
* not scan such objects.
*/
- kmemleak_no_scan(nc);
- if (nc) {
- nc->avail = 0;
- nc->limit = entries;
- nc->batchcount = batchcount;
- nc->touched = 0;
- spin_lock_init(&nc->lock);
+ kmemleak_no_scan(ac);
+ if (ac) {
+ ac->avail = 0;
+ ac->limit = limit;
+ ac->batchcount = batch;
+ ac->touched = 0;
+ spin_lock_init(&ac->lock);
}
- return nc;
+}
+
+static struct array_cache *alloc_arraycache(int node, int entries,
+ int batchcount, gfp_t gfp)
+{
+ int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
+ struct array_cache *ac = NULL;
+
+ ac = kmalloc_node(memsize, gfp, node);
+ init_arraycache(ac, entries, batchcount);
+ return ac;
}

static inline bool is_slab_pfmemalloc(struct page *page)
--
1.7.9.5

2014-07-02 00:43:10

by Joonsoo Kim

Subject: Re: [PATCH v3 5/9] slab: introduce alien_cache

On Tue, Jul 01, 2014 at 03:15:47PM -0700, Andrew Morton wrote:
> On Tue, 1 Jul 2014 17:27:34 +0900 Joonsoo Kim <[email protected]> wrote:
>
> > -static struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
> > +static struct alien_cache *__alloc_alien_cache(int node, int entries,
> > + int batch, gfp_t gfp)
> > +{
> > + int memsize = sizeof(void *) * entries + sizeof(struct alien_cache);
>
> nit: all five memsizes in slab.c have type `int'. size_t would be more
> appropriate.
>

Hello,

In my inspection, there are 4 memsizes. Can you confirm that?
Anyway, here is the patch you suggested.

Thanks.

----------8<-----------------
>From e9ae011804f3d90375b2e50c0dbd95e708afc509 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <[email protected]>
Date: Wed, 2 Jul 2014 09:23:05 +0900
Subject: [PATCH] slab: change int to size_t for representing allocation size

It is better to represent an allocation size as size_t rather than int.
So change it.

Suggested-by: Andrew Morton <[email protected]>
Signed-off-by: Joonsoo Kim <[email protected]>

diff --git a/mm/slab.c b/mm/slab.c
index 5b0224c..e7763db 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -681,7 +681,7 @@ static void init_arraycache(struct array_cache *ac, int limit, int batch)
static struct array_cache *alloc_arraycache(int node, int entries,
int batchcount, gfp_t gfp)
{
- int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
+ size_t memsize = sizeof(void *) * entries + sizeof(struct array_cache);
struct array_cache *ac = NULL;

ac = kmalloc_node(memsize, gfp, node);
@@ -868,7 +868,7 @@ static void *alternate_node_alloc(struct kmem_cache *, gfp_t);
static struct alien_cache *__alloc_alien_cache(int node, int entries,
int batch, gfp_t gfp)
{
- int memsize = sizeof(void *) * entries + sizeof(struct alien_cache);
+ size_t memsize = sizeof(void *) * entries + sizeof(struct alien_cache);
struct alien_cache *alc = NULL;

alc = kmalloc_node(memsize, gfp, node);
@@ -880,7 +880,7 @@ static struct alien_cache *__alloc_alien_cache(int node, int entries,
static struct alien_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
{
struct alien_cache **alc_ptr;
- int memsize = sizeof(void *) * nr_node_ids;
+ size_t memsize = sizeof(void *) * nr_node_ids;
int i;

if (limit > 1)
@@ -1037,7 +1037,7 @@ static int init_cache_node_node(int node)
{
struct kmem_cache *cachep;
struct kmem_cache_node *n;
- const int memsize = sizeof(struct kmem_cache_node);
+ const size_t memsize = sizeof(struct kmem_cache_node);

list_for_each_entry(cachep, &slab_caches, list) {
/*
--
1.7.9.5