This patchset does some cleanup and removes the lockdep annotations from SLAB.
Patches 1~3 are just very minor improvements.
Patches 4~9 clean up the code and remove the lockdep annotations.
There are two cases where lockdep annotation is needed in SLAB:
1) holding two node locks
2) holding two array cache (alien cache) locks
I looked at the code and found that we can avoid both cases without
any negative effect.
Case 1) occurs when freeing an object produces a new free slab and we decide
to destroy it. Although there is no need to hold the node lock while
destroying a slab, the current code does so. Destroying the slab outside the
lock also helps reduce lock contention. I therefore change the implementation
so that a newly created free slab is destroyed after the lock is released.
Case 2) arises in a similar situation. When we free an object that belongs to
a non-local node, we put it into that node's alien cache while holding the
alien cache lock. If the alien cache is full, we flush it to the proper node
cache, and at that point a new free slab can be created. Destroying it may
free a metadata object that comes from another node, and freeing that object
requires taking that node's alien cache lock. This forces us to hold two
alien cache locks at once, so lockdep annotation is needed even though the
two locks are always different and a deadlock is impossible. To prevent this
situation, I use the same approach as in 1).
This way, cases 1) and 2) are avoided and the lockdep annotation can be
removed. As the diffstat below shows, it also makes the SLAB code much
simpler.
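For reference, here is a minimal stand-alone sketch of the defer-and-destroy
pattern used for both cases. This is only an illustration, not the actual
mm/slab.c code: node_lock, free_slabs, destroy_slab() and drain_deferred()
are made-up names, and a pthread mutex stands in for the kernel spinlock.

	#include <pthread.h>
	#include <stdlib.h>

	struct slab { struct slab *next; void *mem; };

	static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;
	static struct slab *free_slabs;		/* protected by node_lock */

	static void destroy_slab(struct slab *s)
	{
		free(s->mem);		/* potentially expensive, needs no lock */
		free(s);
	}

	/*
	 * Collect the slabs to destroy on a local list while the lock is
	 * held, then destroy them after the lock is dropped.  The old code
	 * called destroy_slab() directly inside the locked region.
	 */
	static void drain_deferred(void)
	{
		struct slab *list = NULL;

		pthread_mutex_lock(&node_lock);
		while (free_slabs) {
			struct slab *s = free_slabs;

			free_slabs = s->next;
			s->next = list;		/* defer: just move to a local list */
			list = s;
		}
		pthread_mutex_unlock(&node_lock);

		while (list) {			/* destroy outside the lock */
			struct slab *s = list;

			list = s->next;
			destroy_slab(s);
		}
	}

	int main(void)
	{
		int i;

		for (i = 0; i < 4; i++) {
			struct slab *s = malloc(sizeof(*s));

			s->mem = malloc(64);
			s->next = free_slabs;
			free_slabs = s;
		}
		drain_deferred();
		return 0;
	}

Because the destruction runs outside the locked region, nothing it does
(such as freeing an off-slab freelist object that belongs to another node)
can ever happen while a node lock or an alien cache lock is held, which is
what lets the lockdep annotation go away.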
This patchset is based on slab/next branch on Pekka's git tree.
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git
Thanks.
Joonsoo Kim (9):
slab: add unlikely macro to help compiler
slab: make clear_obj_pfmemalloc() just return the masked value
slab: move up code to get kmem_cache_node in free_block()
slab: defer slab_destroy in free_block()
slab: factor out initialization of array cache
slab: introduce alien_cache
slab: use the lock on alien_cache, instead of the lock on array_cache
slab: destroy a slab without holding any alien cache lock
slab: remove a useless lockdep annotation
mm/slab.c | 384 ++++++++++++++++++++++---------------------------------------
mm/slab.h | 2 +-
2 files changed, 138 insertions(+), 248 deletions(-)
--
1.7.9.5
slab_should_failslab() is called on every allocation, so it is reasonable
to optimize it. We normally don't allocate from the kmem_cache cache; it is
only used when a new kmem_cache is created, which is a very rare case.
Therefore, add the unlikely() macro to help the compiler optimize this branch.
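For reference, unlikely() is essentially a branch-prediction hint built on
top of __builtin_expect(). A stand-alone illustration follows; my_unlikely()
is a stand-in for the kernel macro and should_fail() is not the real
slab_should_failslab(), both are made up for the example.

	#include <stdbool.h>

	/*
	 * Same idea as the kernel's unlikely(): tell the compiler that the
	 * condition is almost always false, so the rare branch is laid out
	 * off the hot path.
	 */
	#define my_unlikely(x)	__builtin_expect(!!(x), 0)

	static const void *bootstrap_cache;	/* stand-in for kmem_cache */

	static bool should_fail(const void *cachep)
	{
		if (my_unlikely(cachep == bootstrap_cache))
			return false;		/* rare: the bootstrap cache itself */
		return true;			/* common path */
	}

	int main(void)
	{
		int dummy;

		return should_fail(&dummy) ? 0 : 1;
	}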
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 8347d80..5906f8f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3009,7 +3009,7 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,
static bool slab_should_failslab(struct kmem_cache *cachep, gfp_t flags)
{
- if (cachep == kmem_cache)
+ if (unlikely(cachep == kmem_cache))
return false;
return should_failslab(cachep->object_size, flags, cachep->flags);
--
1.7.9.5
clear_obj_pfmemalloc() takes a pointer to the object pointer as its argument
so that it can store the masked value back through that address.
But this is unnecessary, since the stored value is never used afterwards;
all we need is the masked value itself. So make clear_obj_pfmemalloc()
simply return the masked value.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 5906f8f..6d17cad 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -215,9 +215,9 @@ static inline void set_obj_pfmemalloc(void **objp)
return;
}
-static inline void clear_obj_pfmemalloc(void **objp)
+static inline void *clear_obj_pfmemalloc(void *objp)
{
- *objp = (void *)((unsigned long)*objp & ~SLAB_OBJ_PFMEMALLOC);
+ return (void *)((unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC);
}
/*
@@ -810,7 +810,7 @@ static void *__ac_get_obj(struct kmem_cache *cachep, struct array_cache *ac,
struct kmem_cache_node *n;
if (gfp_pfmemalloc_allowed(flags)) {
- clear_obj_pfmemalloc(&objp);
+ objp = clear_obj_pfmemalloc(objp);
return objp;
}
@@ -833,7 +833,7 @@ static void *__ac_get_obj(struct kmem_cache *cachep, struct array_cache *ac,
if (!list_empty(&n->slabs_free) && force_refill) {
struct page *page = virt_to_head_page(objp);
ClearPageSlabPfmemalloc(page);
- clear_obj_pfmemalloc(&objp);
+ objp = clear_obj_pfmemalloc(objp);
recheck_pfmemalloc_active(cachep, ac);
return objp;
}
@@ -3365,8 +3365,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
void *objp;
struct page *page;
- clear_obj_pfmemalloc(&objpp[i]);
- objp = objpp[i];
+ objp = clear_obj_pfmemalloc(objpp[i]);
page = virt_to_head_page(objp);
n = cachep->node[node];
--
1.7.9.5
Now there is no code that holds two of these locks simultaneously, since we
no longer call slab_destroy() while holding any of them. So the lockdep
annotation is useless; remove it.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 9c9d4d4..f723a72 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -437,143 +437,6 @@ static struct kmem_cache kmem_cache_boot = {
.name = "kmem_cache",
};
-#define BAD_ALIEN_MAGIC 0x01020304ul
-
-#ifdef CONFIG_LOCKDEP
-
-/*
- * Slab sometimes uses the kmalloc slabs to store the slab headers
- * for other slabs "off slab".
- * The locking for this is tricky in that it nests within the locks
- * of all other slabs in a few places; to deal with this special
- * locking we put on-slab caches into a separate lock-class.
- *
- * We set lock class for alien array caches which are up during init.
- * The lock annotation will be lost if all cpus of a node goes down and
- * then comes back up during hotplug
- */
-static struct lock_class_key on_slab_l3_key;
-static struct lock_class_key on_slab_alc_key;
-
-static struct lock_class_key debugobj_l3_key;
-static struct lock_class_key debugobj_alc_key;
-
-static void slab_set_lock_classes(struct kmem_cache *cachep,
- struct lock_class_key *l3_key, struct lock_class_key *alc_key,
- int q)
-{
- struct alien_cache **alc;
- struct kmem_cache_node *n;
- int r;
-
- n = cachep->node[q];
- if (!n)
- return;
-
- lockdep_set_class(&n->list_lock, l3_key);
- alc = n->alien;
- /*
- * FIXME: This check for BAD_ALIEN_MAGIC
- * should go away when common slab code is taught to
- * work even without alien caches.
- * Currently, non NUMA code returns BAD_ALIEN_MAGIC
- * for alloc_alien_cache,
- */
- if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
- return;
- for_each_node(r) {
- if (alc[r])
- lockdep_set_class(&(alc[r]->lock), alc_key);
- }
-}
-
-static void slab_set_debugobj_lock_classes_node(struct kmem_cache *cachep, int node)
-{
- slab_set_lock_classes(cachep, &debugobj_l3_key, &debugobj_alc_key, node);
-}
-
-static void slab_set_debugobj_lock_classes(struct kmem_cache *cachep)
-{
- int node;
-
- for_each_online_node(node)
- slab_set_debugobj_lock_classes_node(cachep, node);
-}
-
-static void init_node_lock_keys(int q)
-{
- int i;
-
- if (slab_state < UP)
- return;
-
- for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
- struct kmem_cache_node *n;
- struct kmem_cache *cache = kmalloc_caches[i];
-
- if (!cache)
- continue;
-
- n = cache->node[q];
- if (!n || OFF_SLAB(cache))
- continue;
-
- slab_set_lock_classes(cache, &on_slab_l3_key,
- &on_slab_alc_key, q);
- }
-}
-
-static void on_slab_lock_classes_node(struct kmem_cache *cachep, int q)
-{
- if (!cachep->node[q])
- return;
-
- slab_set_lock_classes(cachep, &on_slab_l3_key,
- &on_slab_alc_key, q);
-}
-
-static inline void on_slab_lock_classes(struct kmem_cache *cachep)
-{
- int node;
-
- VM_BUG_ON(OFF_SLAB(cachep));
- for_each_node(node)
- on_slab_lock_classes_node(cachep, node);
-}
-
-static inline void init_lock_keys(void)
-{
- int node;
-
- for_each_node(node)
- init_node_lock_keys(node);
-}
-#else
-static void init_node_lock_keys(int q)
-{
-}
-
-static inline void init_lock_keys(void)
-{
-}
-
-static inline void on_slab_lock_classes(struct kmem_cache *cachep)
-{
-}
-
-static inline void on_slab_lock_classes_node(struct kmem_cache *cachep, int node)
-{
-}
-
-static void slab_set_debugobj_lock_classes_node(struct kmem_cache *cachep, int node)
-{
-}
-
-static void slab_set_debugobj_lock_classes(struct kmem_cache *cachep)
-{
-}
-#endif
-
static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);
static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -921,7 +784,7 @@ static int transfer_objects(struct array_cache *to,
static inline struct alien_cache **alloc_alien_cache(int node,
int limit, gfp_t gfp)
{
- return (struct alien_cache **)BAD_ALIEN_MAGIC;
+ return NULL;
}
static inline void free_alien_cache(struct alien_cache **ac_ptr)
@@ -1296,13 +1159,7 @@ static int cpuup_prepare(long cpu)
spin_unlock_irq(&n->list_lock);
kfree(shared);
free_alien_cache(alien);
- if (cachep->flags & SLAB_DEBUG_OBJECTS)
- slab_set_debugobj_lock_classes_node(cachep, node);
- else if (!OFF_SLAB(cachep) &&
- !(cachep->flags & SLAB_DESTROY_BY_RCU))
- on_slab_lock_classes_node(cachep, node);
}
- init_node_lock_keys(node);
return 0;
bad:
@@ -1611,9 +1468,6 @@ void __init kmem_cache_init_late(void)
BUG();
mutex_unlock(&slab_mutex);
- /* Annotate slab for lockdep -- annotate the malloc caches */
- init_lock_keys();
-
/* Done! */
slab_state = FULL;
@@ -2386,17 +2240,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
return err;
}
- if (flags & SLAB_DEBUG_OBJECTS) {
- /*
- * Would deadlock through slab_destroy()->call_rcu()->
- * debug_object_activate()->kmem_cache_alloc().
- */
- WARN_ON_ONCE(flags & SLAB_DESTROY_BY_RCU);
-
- slab_set_debugobj_lock_classes(cachep);
- } else if (!OFF_SLAB(cachep) && !(flags & SLAB_DESTROY_BY_RCU))
- on_slab_lock_classes(cachep);
-
return 0;
}
--
1.7.9.5
I haven't heard that this alien cache lock is contended, but reducing the
chance of contention is generally a good thing. And with this change, we
can simplify the complex lockdep annotation in the slab code; that is done
in the following patch.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index ec1df4c..9c9d4d4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1000,9 +1000,9 @@ static void free_alien_cache(struct alien_cache **alc_ptr)
}
static void __drain_alien_cache(struct kmem_cache *cachep,
- struct array_cache *ac, int node)
+ struct array_cache *ac, int node,
+ struct list_head *list)
{
- LIST_HEAD(list);
struct kmem_cache_node *n = cachep->node[node];
if (ac->avail) {
@@ -1015,10 +1015,9 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
if (n->shared)
transfer_objects(n->shared, ac, ac->limit);
- free_block(cachep, ac->entry, ac->avail, node, &list);
+ free_block(cachep, ac->entry, ac->avail, node, list);
ac->avail = 0;
spin_unlock(&n->list_lock);
- slabs_destroy(cachep, &list);
}
}
@@ -1036,8 +1035,11 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)
if (alc) {
ac = &alc->ac;
if (ac->avail && spin_trylock_irq(&alc->lock)) {
- __drain_alien_cache(cachep, ac, node);
+ LIST_HEAD(list);
+
+ __drain_alien_cache(cachep, ac, node, &list);
spin_unlock_irq(&alc->lock);
+ slabs_destroy(cachep, &list);
}
}
}
@@ -1054,10 +1056,13 @@ static void drain_alien_cache(struct kmem_cache *cachep,
for_each_online_node(i) {
alc = alien[i];
if (alc) {
+ LIST_HEAD(list);
+
ac = &alc->ac;
spin_lock_irqsave(&alc->lock, flags);
- __drain_alien_cache(cachep, ac, i);
+ __drain_alien_cache(cachep, ac, i, &list);
spin_unlock_irqrestore(&alc->lock, flags);
+ slabs_destroy(cachep, &list);
}
}
}
@@ -1088,10 +1093,11 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
spin_lock(&alien->lock);
if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
- __drain_alien_cache(cachep, ac, nodeid);
+ __drain_alien_cache(cachep, ac, nodeid, &list);
}
ac_put_obj(cachep, ac, objp);
spin_unlock(&alien->lock);
+ slabs_destroy(cachep, &list);
} else {
spin_lock(&(cachep->node[nodeid])->list_lock);
free_block(cachep, &objp, 1, nodeid, &list);
--
1.7.9.5
Now that we have a separate alien_cache structure, it is better to hold the
alien_cache lock while manipulating the alien cache. After that, the lock in
array_cache is no longer needed, so remove it.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index c048ac5..ec1df4c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -191,7 +191,6 @@ struct array_cache {
unsigned int limit;
unsigned int batchcount;
unsigned int touched;
- spinlock_t lock;
void *entry[]; /*
* Must have this definition in here for the proper
* alignment of array_cache. Also simplifies accessing
@@ -484,7 +483,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
return;
for_each_node(r) {
if (alc[r])
- lockdep_set_class(&(alc[r]->ac.lock), alc_key);
+ lockdep_set_class(&(alc[r]->lock), alc_key);
}
}
@@ -761,7 +760,6 @@ static void init_arraycache(struct array_cache *ac, int limit, int batch)
ac->limit = limit;
ac->batchcount = batch;
ac->touched = 0;
- spin_lock_init(&ac->lock);
}
}
@@ -960,6 +958,7 @@ static struct alien_cache *__alloc_alien_cache(int node, int entries,
alc = kmalloc_node(memsize, gfp, node);
init_arraycache(&alc->ac, entries, batch);
+ spin_lock_init(&alc->lock);
return alc;
}
@@ -1036,9 +1035,9 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)
if (alc) {
ac = &alc->ac;
- if (ac->avail && spin_trylock_irq(&ac->lock)) {
+ if (ac->avail && spin_trylock_irq(&alc->lock)) {
__drain_alien_cache(cachep, ac, node);
- spin_unlock_irq(&ac->lock);
+ spin_unlock_irq(&alc->lock);
}
}
}
@@ -1056,9 +1055,9 @@ static void drain_alien_cache(struct kmem_cache *cachep,
alc = alien[i];
if (alc) {
ac = &alc->ac;
- spin_lock_irqsave(&ac->lock, flags);
+ spin_lock_irqsave(&alc->lock, flags);
__drain_alien_cache(cachep, ac, i);
- spin_unlock_irqrestore(&ac->lock, flags);
+ spin_unlock_irqrestore(&alc->lock, flags);
}
}
}
@@ -1086,13 +1085,13 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
if (n->alien && n->alien[nodeid]) {
alien = n->alien[nodeid];
ac = &alien->ac;
- spin_lock(&ac->lock);
+ spin_lock(&alien->lock);
if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
__drain_alien_cache(cachep, ac, nodeid);
}
ac_put_obj(cachep, ac, objp);
- spin_unlock(&ac->lock);
+ spin_unlock(&alien->lock);
} else {
spin_lock(&(cachep->node[nodeid])->list_lock);
free_block(cachep, &objp, 1, nodeid, &list);
@@ -1561,10 +1560,6 @@ void __init kmem_cache_init(void)
memcpy(ptr, cpu_cache_get(kmem_cache),
sizeof(struct arraycache_init));
- /*
- * Do not assume that spinlocks can be initialized via memcpy:
- */
- spin_lock_init(&ptr->lock);
kmem_cache->array[smp_processor_id()] = ptr;
@@ -1574,10 +1569,6 @@ void __init kmem_cache_init(void)
!= &initarray_generic.cache);
memcpy(ptr, cpu_cache_get(kmalloc_caches[INDEX_AC]),
sizeof(struct arraycache_init));
- /*
- * Do not assume that spinlocks can be initialized via memcpy:
- */
- spin_lock_init(&ptr->lock);
kmalloc_caches[INDEX_AC]->array[smp_processor_id()] = ptr;
}
--
1.7.9.5
In free_block(), if freeing an object produces a new free slab and the
number of free_objects exceeds free_limit, we destroy this new free slab
while holding the kmem_cache node lock. Holding the lock during the
destruction is unnecessary and, generally, holding a lock for as short a
time as possible is a good thing. I haven't measured the performance effect
of this, but we are better off not holding the lock longer than needed.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 53d1a36..551d503 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -242,7 +242,8 @@ static struct kmem_cache_node __initdata init_kmem_cache_node[NUM_INIT_LISTS];
static int drain_freelist(struct kmem_cache *cache,
struct kmem_cache_node *n, int tofree);
static void free_block(struct kmem_cache *cachep, void **objpp, int len,
- int node);
+ int node, struct list_head *list);
+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list);
static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp);
static void cache_reap(struct work_struct *unused);
@@ -979,6 +980,7 @@ static void free_alien_cache(struct array_cache **ac_ptr)
static void __drain_alien_cache(struct kmem_cache *cachep,
struct array_cache *ac, int node)
{
+ LIST_HEAD(list);
struct kmem_cache_node *n = cachep->node[node];
if (ac->avail) {
@@ -991,9 +993,10 @@ static void __drain_alien_cache(struct kmem_cache *cachep,
if (n->shared)
transfer_objects(n->shared, ac, ac->limit);
- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
ac->avail = 0;
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}
@@ -1037,6 +1040,7 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
struct kmem_cache_node *n;
struct array_cache *alien = NULL;
int node;
+ LIST_HEAD(list);
node = numa_mem_id();
@@ -1060,8 +1064,9 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
spin_unlock(&alien->lock);
} else {
spin_lock(&(cachep->node[nodeid])->list_lock);
- free_block(cachep, &objp, 1, nodeid);
+ free_block(cachep, &objp, 1, nodeid, &list);
spin_unlock(&(cachep->node[nodeid])->list_lock);
+ slabs_destroy(cachep, &list);
}
return 1;
}
@@ -1130,6 +1135,7 @@ static void cpuup_canceled(long cpu)
struct array_cache *nc;
struct array_cache *shared;
struct array_cache **alien;
+ LIST_HEAD(list);
/* cpu is dead; no one can alloc from it. */
nc = cachep->array[cpu];
@@ -1144,7 +1150,7 @@ static void cpuup_canceled(long cpu)
/* Free limit for this kmem_cache_node */
n->free_limit -= cachep->batchcount;
if (nc)
- free_block(cachep, nc->entry, nc->avail, node);
+ free_block(cachep, nc->entry, nc->avail, node, &list);
if (!cpumask_empty(mask)) {
spin_unlock_irq(&n->list_lock);
@@ -1154,7 +1160,7 @@ static void cpuup_canceled(long cpu)
shared = n->shared;
if (shared) {
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);
n->shared = NULL;
}
@@ -1162,6 +1168,7 @@ static void cpuup_canceled(long cpu)
n->alien = NULL;
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(shared);
if (alien) {
@@ -1999,6 +2006,15 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
kmem_cache_free(cachep->freelist_cache, freelist);
}
+static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list)
+{
+ struct page *page, *n;
+ list_for_each_entry_safe(page, n, list, lru) {
+ list_del(&page->lru);
+ slab_destroy(cachep, page);
+ }
+}
+
/**
* calculate_slab_order - calculate size (page order) of slabs
* @cachep: pointer to the cache that is being created
@@ -2399,12 +2415,14 @@ static void do_drain(void *arg)
struct kmem_cache *cachep = arg;
struct array_cache *ac;
int node = numa_mem_id();
+ LIST_HEAD(list);
check_irq_off();
ac = cpu_cache_get(cachep);
spin_lock(&cachep->node[node]->list_lock);
- free_block(cachep, ac->entry, ac->avail, node);
+ free_block(cachep, ac->entry, ac->avail, node, &list);
spin_unlock(&cachep->node[node]->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail = 0;
}
@@ -3355,8 +3373,8 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
/*
* Caller needs to acquire correct kmem_list's list_lock
*/
-static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
- int node)
+static void free_block(struct kmem_cache *cachep, void **objpp,
+ int nr_objects, int node, struct list_head *list)
{
int i;
struct kmem_cache_node *n;
@@ -3379,13 +3397,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
if (page->active == 0) {
if (n->free_objects > n->free_limit) {
n->free_objects -= cachep->num;
- /* No need to drop any previously held
- * lock here, even if we have a off-slab slab
- * descriptor it is guaranteed to come from
- * a different cache, refer to comments before
- * alloc_slabmgmt.
- */
- slab_destroy(cachep, page);
+ list_add(&page->lru, list);
} else {
list_add(&page->lru, &n->slabs_free);
}
@@ -3404,6 +3416,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
int batchcount;
struct kmem_cache_node *n;
int node = numa_mem_id();
+ LIST_HEAD(list);
batchcount = ac->batchcount;
#if DEBUG
@@ -3425,7 +3438,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
}
}
- free_block(cachep, ac->entry, batchcount, node);
+ free_block(cachep, ac->entry, batchcount, node, &list);
free_done:
#if STATS
{
@@ -3446,6 +3459,7 @@ free_done:
}
#endif
spin_unlock(&n->list_lock);
+ slabs_destroy(cachep, &list);
ac->avail -= batchcount;
memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail);
}
@@ -3731,12 +3745,13 @@ static int alloc_kmemlist(struct kmem_cache *cachep, gfp_t gfp)
n = cachep->node[node];
if (n) {
struct array_cache *shared = n->shared;
+ LIST_HEAD(list);
spin_lock_irq(&n->list_lock);
if (shared)
free_block(cachep, shared->entry,
- shared->avail, node);
+ shared->avail, node, &list);
n->shared = new_shared;
if (!n->alien) {
@@ -3746,6 +3761,7 @@ static int alloc_kmemlist(struct kmem_cache *cachep, gfp_t gfp)
n->free_limit = (1 + nr_cpus_node(node)) *
cachep->batchcount + cachep->num;
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
kfree(shared);
free_alien_cache(new_alien);
continue;
@@ -3836,12 +3852,15 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
cachep->shared = shared;
for_each_online_cpu(i) {
+ LIST_HEAD(list);
struct array_cache *ccold = new->new[i];
if (!ccold)
continue;
spin_lock_irq(&cachep->node[cpu_to_mem(i)]->list_lock);
- free_block(cachep, ccold->entry, ccold->avail, cpu_to_mem(i));
+ free_block(cachep, ccold->entry, ccold->avail,
+ cpu_to_mem(i), &list);
spin_unlock_irq(&cachep->node[cpu_to_mem(i)]->list_lock);
+ slabs_destroy(cachep, &list);
kfree(ccold);
}
kfree(new);
@@ -3949,6 +3968,7 @@ skip_setup:
static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
struct array_cache *ac, int force, int node)
{
+ LIST_HEAD(list);
int tofree;
if (!ac || !ac->avail)
@@ -3961,12 +3981,13 @@ static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
tofree = force ? ac->avail : (ac->limit + 4) / 5;
if (tofree > ac->avail)
tofree = (ac->avail + 1) / 2;
- free_block(cachep, ac->entry, tofree, node);
+ free_block(cachep, ac->entry, tofree, node, &list);
ac->avail -= tofree;
memmove(ac->entry, &(ac->entry[tofree]),
sizeof(void *) * ac->avail);
}
spin_unlock_irq(&n->list_lock);
+ slabs_destroy(cachep, &list);
}
}
--
1.7.9.5
Currently, we use array_cache for the alien cache. Although the two are
mostly similar, there is one difference: the need for a spinlock.
array_cache itself does not need a spinlock, but as long as array_cache is
also used for the alien cache, the structure has to carry one. This is
needless overhead, so removing it would be better. This patch prepares for
that by introducing alien_cache and using it; the following patch removes
the spinlock from array_cache.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 90bfd79..c048ac5 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -203,6 +203,11 @@ struct array_cache {
*/
};
+struct alien_cache {
+ spinlock_t lock;
+ struct array_cache ac;
+};
+
#define SLAB_OBJ_PFMEMALLOC 1
static inline bool is_obj_pfmemalloc(void *objp)
{
@@ -458,7 +463,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
struct lock_class_key *l3_key, struct lock_class_key *alc_key,
int q)
{
- struct array_cache **alc;
+ struct alien_cache **alc;
struct kmem_cache_node *n;
int r;
@@ -479,7 +484,7 @@ static void slab_set_lock_classes(struct kmem_cache *cachep,
return;
for_each_node(r) {
if (alc[r])
- lockdep_set_class(&alc[r]->lock, alc_key);
+ lockdep_set_class(&(alc[r]->ac.lock), alc_key);
}
}
@@ -915,12 +920,13 @@ static int transfer_objects(struct array_cache *to,
#define drain_alien_cache(cachep, alien) do { } while (0)
#define reap_alien(cachep, n) do { } while (0)
-static inline struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
+static inline struct alien_cache **alloc_alien_cache(int node,
+ int limit, gfp_t gfp)
{
- return (struct array_cache **)BAD_ALIEN_MAGIC;
+ return (struct alien_cache **)BAD_ALIEN_MAGIC;
}
-static inline void free_alien_cache(struct array_cache **ac_ptr)
+static inline void free_alien_cache(struct alien_cache **ac_ptr)
{
}
@@ -946,40 +952,52 @@ static inline void *____cache_alloc_node(struct kmem_cache *cachep,
static void *____cache_alloc_node(struct kmem_cache *, gfp_t, int);
static void *alternate_node_alloc(struct kmem_cache *, gfp_t);
-static struct array_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
+static struct alien_cache *__alloc_alien_cache(int node, int entries,
+ int batch, gfp_t gfp)
+{
+ int memsize = sizeof(void *) * entries + sizeof(struct alien_cache);
+ struct alien_cache *alc = NULL;
+
+ alc = kmalloc_node(memsize, gfp, node);
+ init_arraycache(&alc->ac, entries, batch);
+ return alc;
+}
+
+static struct alien_cache **alloc_alien_cache(int node, int limit, gfp_t gfp)
{
- struct array_cache **ac_ptr;
+ struct alien_cache **alc_ptr;
int memsize = sizeof(void *) * nr_node_ids;
int i;
if (limit > 1)
limit = 12;
- ac_ptr = kzalloc_node(memsize, gfp, node);
- if (ac_ptr) {
- for_each_node(i) {
- if (i == node || !node_online(i))
- continue;
- ac_ptr[i] = alloc_arraycache(node, limit, 0xbaadf00d, gfp);
- if (!ac_ptr[i]) {
- for (i--; i >= 0; i--)
- kfree(ac_ptr[i]);
- kfree(ac_ptr);
- return NULL;
- }
+ alc_ptr = kzalloc_node(memsize, gfp, node);
+ if (!alc_ptr)
+ return NULL;
+
+ for_each_node(i) {
+ if (i == node || !node_online(i))
+ continue;
+ alc_ptr[i] = __alloc_alien_cache(node, limit, 0xbaadf00d, gfp);
+ if (!alc_ptr[i]) {
+ for (i--; i >= 0; i--)
+ kfree(alc_ptr[i]);
+ kfree(alc_ptr);
+ return NULL;
}
}
- return ac_ptr;
+ return alc_ptr;
}
-static void free_alien_cache(struct array_cache **ac_ptr)
+static void free_alien_cache(struct alien_cache **alc_ptr)
{
int i;
- if (!ac_ptr)
+ if (!alc_ptr)
return;
for_each_node(i)
- kfree(ac_ptr[i]);
- kfree(ac_ptr);
+ kfree(alc_ptr[i]);
+ kfree(alc_ptr);
}
static void __drain_alien_cache(struct kmem_cache *cachep,
@@ -1013,25 +1031,31 @@ static void reap_alien(struct kmem_cache *cachep, struct kmem_cache_node *n)
int node = __this_cpu_read(slab_reap_node);
if (n->alien) {
- struct array_cache *ac = n->alien[node];
-
- if (ac && ac->avail && spin_trylock_irq(&ac->lock)) {
- __drain_alien_cache(cachep, ac, node);
- spin_unlock_irq(&ac->lock);
+ struct alien_cache *alc = n->alien[node];
+ struct array_cache *ac;
+
+ if (alc) {
+ ac = &alc->ac;
+ if (ac->avail && spin_trylock_irq(&ac->lock)) {
+ __drain_alien_cache(cachep, ac, node);
+ spin_unlock_irq(&ac->lock);
+ }
}
}
}
static void drain_alien_cache(struct kmem_cache *cachep,
- struct array_cache **alien)
+ struct alien_cache **alien)
{
int i = 0;
+ struct alien_cache *alc;
struct array_cache *ac;
unsigned long flags;
for_each_online_node(i) {
- ac = alien[i];
- if (ac) {
+ alc = alien[i];
+ if (alc) {
+ ac = &alc->ac;
spin_lock_irqsave(&ac->lock, flags);
__drain_alien_cache(cachep, ac, i);
spin_unlock_irqrestore(&ac->lock, flags);
@@ -1043,7 +1067,8 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
{
int nodeid = page_to_nid(virt_to_page(objp));
struct kmem_cache_node *n;
- struct array_cache *alien = NULL;
+ struct alien_cache *alien = NULL;
+ struct array_cache *ac;
int node;
LIST_HEAD(list);
@@ -1060,13 +1085,14 @@ static inline int cache_free_alien(struct kmem_cache *cachep, void *objp)
STATS_INC_NODEFREES(cachep);
if (n->alien && n->alien[nodeid]) {
alien = n->alien[nodeid];
- spin_lock(&alien->lock);
- if (unlikely(alien->avail == alien->limit)) {
+ ac = &alien->ac;
+ spin_lock(&ac->lock);
+ if (unlikely(ac->avail == ac->limit)) {
STATS_INC_ACOVERFLOW(cachep);
- __drain_alien_cache(cachep, alien, nodeid);
+ __drain_alien_cache(cachep, ac, nodeid);
}
- ac_put_obj(cachep, alien, objp);
- spin_unlock(&alien->lock);
+ ac_put_obj(cachep, ac, objp);
+ spin_unlock(&ac->lock);
} else {
spin_lock(&(cachep->node[nodeid])->list_lock);
free_block(cachep, &objp, 1, nodeid, &list);
@@ -1139,7 +1165,7 @@ static void cpuup_canceled(long cpu)
list_for_each_entry(cachep, &slab_caches, list) {
struct array_cache *nc;
struct array_cache *shared;
- struct array_cache **alien;
+ struct alien_cache **alien;
LIST_HEAD(list);
/* cpu is dead; no one can alloc from it. */
@@ -1220,7 +1246,7 @@ static int cpuup_prepare(long cpu)
list_for_each_entry(cachep, &slab_caches, list) {
struct array_cache *nc;
struct array_cache *shared = NULL;
- struct array_cache **alien = NULL;
+ struct alien_cache **alien = NULL;
nc = alloc_arraycache(node, cachep->limit,
cachep->batchcount, GFP_KERNEL);
@@ -3726,7 +3752,7 @@ static int alloc_kmemlist(struct kmem_cache *cachep, gfp_t gfp)
int node;
struct kmem_cache_node *n;
struct array_cache *new_shared;
- struct array_cache **new_alien = NULL;
+ struct alien_cache **new_alien = NULL;
for_each_online_node(node) {
diff --git a/mm/slab.h b/mm/slab.h
index 8184a7c..4c2d801 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -283,7 +283,7 @@ struct kmem_cache_node {
unsigned int free_limit;
unsigned int colour_next; /* Per-node cache coloring */
struct array_cache *shared; /* shared per node */
- struct array_cache **alien; /* on other nodes */
+ struct alien_cache **alien; /* on other nodes */
unsigned long next_reap; /* updated without locking */
int free_touched; /* updated without locking */
#endif
--
1.7.9.5
Factor out the initialization of the array cache so it can be reused in the
following patch.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 551d503..90bfd79 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -741,13 +741,8 @@ static void start_cpu_timer(int cpu)
}
}
-static struct array_cache *alloc_arraycache(int node, int entries,
- int batchcount, gfp_t gfp)
+static void init_arraycache(struct array_cache *ac, int limit, int batch)
{
- int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
- struct array_cache *nc = NULL;
-
- nc = kmalloc_node(memsize, gfp, node);
/*
* The array_cache structures contain pointers to free object.
* However, when such objects are allocated or transferred to another
@@ -755,15 +750,25 @@ static struct array_cache *alloc_arraycache(int node, int entries,
* valid references during a kmemleak scan. Therefore, kmemleak must
* not scan such objects.
*/
- kmemleak_no_scan(nc);
- if (nc) {
- nc->avail = 0;
- nc->limit = entries;
- nc->batchcount = batchcount;
- nc->touched = 0;
- spin_lock_init(&nc->lock);
+ kmemleak_no_scan(ac);
+ if (ac) {
+ ac->avail = 0;
+ ac->limit = limit;
+ ac->batchcount = batch;
+ ac->touched = 0;
+ spin_lock_init(&ac->lock);
}
- return nc;
+}
+
+static struct array_cache *alloc_arraycache(int node, int entries,
+ int batchcount, gfp_t gfp)
+{
+ int memsize = sizeof(void *) * entries + sizeof(struct array_cache);
+ struct array_cache *ac = NULL;
+
+ ac = kmalloc_node(memsize, gfp, node);
+ init_arraycache(ac, entries, batchcount);
+ return ac;
}
static inline bool is_slab_pfmemalloc(struct page *page)
--
1.7.9.5
node doesn't change, so we don't need to retrieve this structure every
time we move an object. The compiler may do this optimization anyway, but
making it explicit is better.
Signed-off-by: Joonsoo Kim <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index 6d17cad..53d1a36 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3361,6 +3361,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
int i;
struct kmem_cache_node *n;
+ n = cachep->node[node];
for (i = 0; i < nr_objects; i++) {
void *objp;
struct page *page;
@@ -3368,7 +3369,6 @@ static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects,
objp = clear_obj_pfmemalloc(objpp[i]);
page = virt_to_head_page(objp);
- n = cachep->node[node];
list_del(&page->lru);
check_spinlock_acquired_node(cachep, node);
slab_put_obj(cachep, page, objp, node);
--
1.7.9.5
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> node doesn't change, so we don't need to retrieve this structure every
> time we move an object. The compiler may do this optimization anyway, but
> making it explicit is better.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> clear_obj_pfmemalloc() takes a pointer to the object pointer as its argument
> so that it can store the masked value back through that address.
> But this is unnecessary, since the stored value is never used afterwards;
> all we need is the masked value itself. So make clear_obj_pfmemalloc()
> simply return the masked value.
Could this be a bit more compact?
> @@ -215,9 +215,9 @@ static inline void set_obj_pfmemalloc(void **objp)
> return;
> }
>
> -static inline void clear_obj_pfmemalloc(void **objp)
> +static inline void *clear_obj_pfmemalloc(void *objp)
> {
> - *objp = (void *)((unsigned long)*objp & ~SLAB_OBJ_PFMEMALLOC);
> + return (void *)((unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC);
> }
I don't think you need the (void *) cast here.
> /*
> @@ -810,7 +810,7 @@ static void *__ac_get_obj(struct kmem_cache *cachep, struct array_cache *ac,
> struct kmem_cache_node *n;
>
> if (gfp_pfmemalloc_allowed(flags)) {
> - clear_obj_pfmemalloc(&objp);
> + objp = clear_obj_pfmemalloc(objp);
> return objp;
> }
No need for objp. Just "return clear_obj_....
> @@ -833,7 +833,7 @@ static void *__ac_get_obj(struct kmem_cache *cachep, struct array_cache *ac,
> if (!list_empty(&n->slabs_free) && force_refill) {
> struct page *page = virt_to_head_page(objp);
> ClearPageSlabPfmemalloc(page);
> - clear_obj_pfmemalloc(&objp);
> + objp = clear_obj_pfmemalloc(objp);
> recheck_pfmemalloc_active(cachep, ac);
> return objp;
Same here?
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> In free_block(), if freeing an object produces a new free slab and the
> number of free_objects exceeds free_limit, we destroy this new free slab
> while holding the kmem_cache node lock. Holding the lock during the
> destruction is unnecessary and, generally, holding a lock for as short a
> time as possible is a good thing. I haven't measured the performance effect
> of this, but we are better off not holding the lock longer than needed.
This is also good because kmem_cache_free is no longer called while
holding the node lock. So we avoid one case of recursion.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> Factor out the initialization of the array cache so it can be reused in the
> following patch.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> Currently, we use array_cache for the alien cache. Although the two are
> mostly similar, there is one difference: the need for a spinlock.
> array_cache itself does not need a spinlock, but as long as array_cache is
> also used for the alien cache, the structure has to carry one. This is
> needless overhead, so removing it would be better. This patch prepares for
> that by introducing alien_cache and using it; the following patch removes
> the spinlock from array_cache.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> Now that we have a separate alien_cache structure, it is better to hold the
> alien_cache lock while manipulating the alien cache. After that, the lock in
> array_cache is no longer needed, so remove it.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> I haven't heard that this alien cache lock is contended, but reducing the
> chance of contention is generally a good thing. And with this change, we
> can simplify the complex lockdep annotation in the slab code; that is done
> in the following patch.
Ok. Same move as before with the regular freeing.
Acked-by: Christoph Lameter <[email protected]>
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> @@ -921,7 +784,7 @@ static int transfer_objects(struct array_cache *to,
> static inline struct alien_cache **alloc_alien_cache(int node,
> int limit, gfp_t gfp)
> {
> - return (struct alien_cache **)BAD_ALIEN_MAGIC;
> + return NULL;
> }
>
Why change the BAD_ALIEN_MAGIC?
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> slab_should_failslab() is called on every allocation, so it is reasonable
> to optimize it. We normally don't allocate from the kmem_cache cache; it is
> only used when a new kmem_cache is created, which is a very rare case.
> Therefore, add the unlikely() macro to help the compiler optimize this branch.
>
> Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: David Rientjes <[email protected]>
On Fri, 14 Feb 2014, Christoph Lameter wrote:
> > @@ -215,9 +215,9 @@ static inline void set_obj_pfmemalloc(void **objp)
> > return;
> > }
> >
> > -static inline void clear_obj_pfmemalloc(void **objp)
> > +static inline void *clear_obj_pfmemalloc(void *objp)
> > {
> > - *objp = (void *)((unsigned long)*objp & ~SLAB_OBJ_PFMEMALLOC);
> > + return (void *)((unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC);
> > }
>
> I don't think you need the (void *) cast here.
>
Yeah, you don't need it, but don't you think it makes the code more
readable? Otherwise this is going to be just doing
return (unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC;
and you gotta figure out the function type to understand it's returned as
a pointer.
On Fri, 14 Feb 2014, Joonsoo Kim wrote:
> node doesn't change, so we don't need to retrieve this structure every
> time we move an object. The compiler may do this optimization anyway, but
> making it explicit is better.
>
Would it be possible to make it const struct kmem_cache_node *n then?
On Fri, 14 Feb 2014, David Rientjes wrote:
> Yeah, you don't need it, but don't you think it makes the code more
> readable? Otherwise this is going to be just doing
>
> return (unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC;
>
> and you gotta figure out the function type to understand it's returned as
Isn't there something like PTR_ALIGN() for this case that would make it
more readable?
On Fri, Feb 14, 2014 at 06:26:15PM -0600, Christoph Lameter wrote:
> On Fri, 14 Feb 2014, David Rientjes wrote:
>
> > Yeah, you don't need it, but don't you think it makes the code more
> > readable? Otherwise this is going to be just doing
> >
> > return (unsigned long)objp & ~SLAB_OBJ_PFMEMALLOC;
> >
> > and you gotta figure out the function type to understand it's returned as
>
> Isn't there something like PTR_ALIGN() for this case that would make it
> more readable?
I couldn't find anything suitable.
I agree with David's opinion and want to keep the patch as is.
Thanks.
On Fri, Feb 14, 2014 at 03:19:02PM -0800, David Rientjes wrote:
> On Fri, 14 Feb 2014, Joonsoo Kim wrote:
>
> > node doesn't change, so we don't need to retrieve this structure every
> > time we move an object. The compiler may do this optimization anyway, but
> > making it explicit is better.
> >
>
> Would it be possible to make it const struct kmem_cache_node *n then?
Hello, David.
Yes, it is possible.
If I send v2, I will change it.
Thanks.
On Fri, Feb 14, 2014 at 12:49:57PM -0600, Christoph Lameter wrote:
> On Fri, 14 Feb 2014, Joonsoo Kim wrote:
>
> > @@ -921,7 +784,7 @@ static int transfer_objects(struct array_cache *to,
> > static inline struct alien_cache **alloc_alien_cache(int node,
> > int limit, gfp_t gfp)
> > {
> > - return (struct alien_cache **)BAD_ALIEN_MAGIC;
> > + return NULL;
> > }
> >
>
> Why change the BAD_ALIEN_MAGIC?
Hello, Christoph.
BAD_ALIEN_MAGIC is only checked by slab_set_lock_classes(). We remove this
function in this patch, so returning BAD_ALIEN_MAGIC is useless.
And, in fact, BAD_ALIEN_MAGIC is already useless, because alloc_alien_cache()
can't be called on !CONFIG_NUMA. This function is called only if
use_alien_caches is positive, but on !CONFIG_NUMA, use_alien_caches is
always 0. So we can never encounter BAD_ALIEN_MAGIC at runtime.
Thanks.
On Mon, 17 Feb 2014, Joonsoo Kim wrote:
> > Why change the BAD_ALIEN_MAGIC?
>
> Hello, Christoph.
>
> BAD_ALIEN_MAGIC is only checked by slab_set_lock_classes(). We remove this
> function in this patch, so returning BAD_ALIEN_MAGIC is useless.
It's not useless. The point is that if there is a pointer deref then we will
see this as a pointer value and know that it is related to alien cache
processing.
> And, in fact, BAD_ALIEN_MAGIC is already useless, because alloc_alien_cache()
> can't be called on !CONFIG_NUMA. This function is called only if
> use_alien_caches is positive, but on !CONFIG_NUMA, use_alien_caches is
> always 0. So we can never encounter BAD_ALIEN_MAGIC at runtime.
Maybe it no longer serves a point. But note that caches may not be
populated because processors/nodes are not up yet.
On Tue, Feb 18, 2014 at 10:21:10AM -0600, Christoph Lameter wrote:
> On Mon, 17 Feb 2014, Joonsoo Kim wrote:
>
> > > Why change the BAD_ALIEN_MAGIC?
> >
> > Hello, Christoph.
> >
> > BAD_ALIEN_MAGIC is only checked by slab_set_lock_classes(). We remove this
> > function in this patch, so returning BAD_ALIEN_MAGIC is useless.
>
> It's not useless. The point is that if there is a pointer deref then we will
> see this as a pointer value and know that it is related to alien cache
> processing.
>
> > And, in fact, BAD_ALIEN_MAGIC is already useless, because alloc_alien_cache()
> > can't be called on !CONFIG_NUMA. This function is called only if
> > use_alien_caches is positive, but on !CONFIG_NUMA, use_alien_caches is
> > always 0. So we can never encounter BAD_ALIEN_MAGIC at runtime.
>
> Maybe it no longer serves a point. But note that caches may not be
> populated because processors/nodes are not up yet.
Hello,
Let me clarify alloc_alien_cache().
alloc_alien_cache() has two definitions, one for !CONFIG_NUMA and the other
for CONFIG_NUMA. BAD_ALIEN_MAGIC is only returned by the !CONFIG_NUMA
definition; the CONFIG_NUMA version doesn't use it at all. So it is
sufficient to consider just the !CONFIG_NUMA case.
As I mentioned before, this function isn't called when use_alien_caches is
zero, and use_alien_caches is always zero on !CONFIG_NUMA. Therefore we can
never see BAD_ALIEN_MAGIC in any configuration. I don't know why
BAD_ALIEN_MAGIC was introduced, but it no longer serves a purpose, so it is
better to remove it.
There is a lot of code that checks whether processors/nodes are up or not,
and none of it uses BAD_ALIEN_MAGIC; instead, it checks the alien_cache
pointer of the specific node against NULL. So removing BAD_ALIEN_MAGIC
doesn't harm anything here.
Thanks.