Hi Linus,
It's been rather quiet on the SLAB front. Here's the usual batch of minor
bug fixes accumulated over the past release cycle.
Pekka
The following changes since commit 053fe57ac249a9531c396175778160d9e9509399:

  Linus Torvalds (1):
        Merge git://git.kernel.org/.../davem/net-2.6

are available in the git repository at:

  ssh://master.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 for-linus

David Rientjes (1):
      slub: allow stats to be cleared

J. R. Okajima (2):
      slab, kmemleak: stop calling kmemleak_erase() unconditionally
      slab, kmemleak: pass the correct pointer to kmemleak_erase()

Pekka Enberg (3):
      SLUB: Fix __GFP_ZERO unlikely() annotation
      SLAB: Fix lockdep annotations for CPU hotplug
      Merge branches 'slab/fixes', 'slab/kmemleak', 'slub/perf' and 'slub/stats' into for-linus

Tim Blechmann (1):
      SLAB: Fix unlikely() annotation in __cache_alloc_node()
Documentation/ABI/testing/sysfs-kernel-slab | 109 +++++++++++++------------
mm/slab.c | 118 ++++++++++++++++-----------
mm/slub.c | 20 ++++-
3 files changed, 145 insertions(+), 102 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
index 6dcf75e..8b093f8 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -45,8 +45,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The alloc_fastpath file is read-only and specifies how many
- objects have been allocated using the fast path.
+ The alloc_fastpath file shows how many objects have been
+ allocated using the fast path. It can be written to clear the
+ current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_from_partial
@@ -55,9 +56,10 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The alloc_from_partial file is read-only and specifies how
- many times a cpu slab has been full and it has been refilled
- by using a slab from the list of partially used slabs.
+ The alloc_from_partial file shows how many times a cpu slab has
+ been full and it has been refilled by using a slab from the list
+ of partially used slabs. It can be written to clear the current
+ count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_refill
@@ -66,9 +68,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The alloc_refill file is read-only and specifies how many
- times the per-cpu freelist was empty but there were objects
- available as the result of remote cpu frees.
+ The alloc_refill file shows how many times the per-cpu freelist
+ was empty but there were objects available as the result of
+ remote cpu frees. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_slab
@@ -77,8 +79,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The alloc_slab file is read-only and specifies how many times
- a new slab had to be allocated from the page allocator.
+ The alloc_slab file shows how many times a new slab had to
+ be allocated from the page allocator. It can be written to
+ clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_slowpath
@@ -87,9 +90,10 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The alloc_slowpath file is read-only and specifies how many
- objects have been allocated using the slow path because of a
- refill or allocation from a partial or new slab.
+ The alloc_slowpath file shows how many objects have been
+ allocated using the slow path because of a refill or
+ allocation from a partial or new slab. It can be written to
+ clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/cache_dma
@@ -117,10 +121,11 @@ KernelVersion: 2.6.31
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file cpuslab_flush is read-only and specifies how many
- times a cache's cpu slabs have been flushed as the result of
- destroying or shrinking a cache, a cpu going offline, or as
- the result of forcing an allocation from a certain node.
+ The file cpuslab_flush shows how many times a cache's cpu slabs
+ have been flushed as the result of destroying or shrinking a
+ cache, a cpu going offline, or as the result of forcing an
+ allocation from a certain node. It can be written to clear the
+ current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/ctor
@@ -139,8 +144,8 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file deactivate_empty is read-only and specifies how many
- times an empty cpu slab was deactivated.
+ The deactivate_empty file shows how many times an empty cpu slab
+ was deactivated. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_full
@@ -149,8 +154,8 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file deactivate_full is read-only and specifies how many
- times a full cpu slab was deactivated.
+ The deactivate_full file shows how many times a full cpu slab
+ was deactivated. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_remote_frees
@@ -159,9 +164,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file deactivate_remote_frees is read-only and specifies how
- many times a cpu slab has been deactivated and contained free
- objects that were freed remotely.
+ The deactivate_remote_frees file shows how many times a cpu slab
+ has been deactivated and contained free objects that were freed
+ remotely. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_to_head
@@ -170,9 +175,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file deactivate_to_head is read-only and specifies how
- many times a partial cpu slab was deactivated and added to the
- head of its node's partial list.
+ The deactivate_to_head file shows how many times a partial cpu
+ slab was deactivated and added to the head of its node's partial
+ list. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_to_tail
@@ -181,9 +186,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file deactivate_to_tail is read-only and specifies how
- many times a partial cpu slab was deactivated and added to the
- tail of its node's partial list.
+ The deactivate_to_tail file shows how many times a partial cpu
+ slab was deactivated and added to the tail of its node's partial
+ list. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/destroy_by_rcu
@@ -201,9 +206,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file free_add_partial is read-only and specifies how many
- times an object has been freed in a full slab so that it had to
- added to its node's partial list.
+ The free_add_partial file shows how many times an object has
+ been freed in a full slab so that it had to be added to its
+ node's partial list. It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_calls
@@ -222,9 +227,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The free_fastpath file is read-only and specifies how many
- objects have been freed using the fast path because it was an
- object from the cpu slab.
+ The free_fastpath file shows how many objects have been freed
+ using the fast path because it was an object from the cpu slab.
+ It can be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_frozen
@@ -233,9 +238,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The free_frozen file is read-only and specifies how many
- objects have been freed to a frozen slab (i.e. a remote cpu
- slab).
+ The free_frozen file shows how many objects have been freed to
+ a frozen slab (i.e. a remote cpu slab). It can be written to
+ clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_remove_partial
@@ -244,9 +249,10 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file free_remove_partial is read-only and specifies how
- many times an object has been freed to a now-empty slab so
- that it had to be removed from its node's partial list.
+ The free_remove_partial file shows how many times an object has
+ been freed to a now-empty slab so that it had to be removed from
+ its node's partial list. It can be written to clear the current
+ count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_slab
@@ -255,8 +261,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The free_slab file is read-only and specifies how many times an
- empty slab has been freed back to the page allocator.
+ The free_slab file shows how many times an empty slab has been
+ freed back to the page allocator. It can be written to clear
+ the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_slowpath
@@ -265,9 +272,9 @@ KernelVersion: 2.6.25
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The free_slowpath file is read-only and specifies how many
- objects have been freed using the slow path (i.e. to a full or
- partial slab).
+ The free_slowpath file shows how many objects have been freed
+ using the slow path (i.e. to a full or partial slab). It can
+ be written to clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/hwcache_align
@@ -346,10 +353,10 @@ KernelVersion: 2.6.26
Contact: Pekka Enberg <[email protected]>,
Christoph Lameter <[email protected]>
Description:
- The file order_fallback is read-only and specifies how many
- times an allocation of a new slab has not been possible at the
- cache's order and instead fallen back to its minimum possible
- order.
+ The order_fallback file shows how many times an allocation of a
+ new slab has not been possible at the cache's order and instead
+ fallen back to its minimum possible order. It can be written to
+ clear the current count.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/partial
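For anyone curious what the new interface looks like from userspace, here
is a minimal sketch (not part of the patch; the cache name is illustrative
only, and it assumes CONFIG_SLUB_STATS=y plus write permission on the
sysfs file):

#include <stdio.h>

int main(void)
{
        const char *path = "/sys/kernel/slab/kmalloc-64/alloc_fastpath";
        char buf[256];
        FILE *f;

        /* read the current count */
        f = fopen(path, "r");
        if (!f) {
                perror(path);
                return 1;
        }
        if (fgets(buf, sizeof(buf), f))
                printf("before clear: %s", buf);
        fclose(f);

        /* clear it: the store hook rejects anything not starting with '0' */
        f = fopen(path, "w");
        if (!f) {
                perror(path);
                return 1;
        }
        fputs("0", f);
        fclose(f);
        return 0;
}
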
diff --git a/mm/slab.c b/mm/slab.c
index 7dfa481..a6c9166 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -604,6 +604,26 @@ static struct kmem_cache cache_cache = {
#define BAD_ALIEN_MAGIC 0x01020304ul
+/*
+ * chicken and egg problem: delay the per-cpu array allocation
+ * until the general caches are up.
+ */
+static enum {
+ NONE,
+ PARTIAL_AC,
+ PARTIAL_L3,
+ EARLY,
+ FULL
+} g_cpucache_up;
+
+/*
+ * used by boot code to determine if it can use slab based allocator
+ */
+int slab_is_available(void)
+{
+ return g_cpucache_up >= EARLY;
+}
+
#ifdef CONFIG_LOCKDEP
/*
@@ -620,40 +640,52 @@ static struct kmem_cache cache_cache = {
static struct lock_class_key on_slab_l3_key;
static struct lock_class_key on_slab_alc_key;
-static inline void init_lock_keys(void)
-
+static void init_node_lock_keys(int q)
{
- int q;
struct cache_sizes *s = malloc_sizes;
- while (s->cs_size != ULONG_MAX) {
- for_each_node(q) {
- struct array_cache **alc;
- int r;
- struct kmem_list3 *l3 = s->cs_cachep->nodelists[q];
- if (!l3 || OFF_SLAB(s->cs_cachep))
- continue;
- lockdep_set_class(&l3->list_lock, &on_slab_l3_key);
- alc = l3->alien;
- /*
- * FIXME: This check for BAD_ALIEN_MAGIC
- * should go away when common slab code is taught to
- * work even without alien caches.
- * Currently, non NUMA code returns BAD_ALIEN_MAGIC
- * for alloc_alien_cache,
- */
- if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
- continue;
- for_each_node(r) {
- if (alc[r])
- lockdep_set_class(&alc[r]->lock,
- &on_slab_alc_key);
- }
+ if (g_cpucache_up != FULL)
+ return;
+
+ for (s = malloc_sizes; s->cs_size != ULONG_MAX; s++) {
+ struct array_cache **alc;
+ struct kmem_list3 *l3;
+ int r;
+
+ l3 = s->cs_cachep->nodelists[q];
+ if (!l3 || OFF_SLAB(s->cs_cachep))
+ return;
+ lockdep_set_class(&l3->list_lock, &on_slab_l3_key);
+ alc = l3->alien;
+ /*
+ * FIXME: This check for BAD_ALIEN_MAGIC
+ * should go away when common slab code is taught to
+ * work even without alien caches.
+ * Currently, non NUMA code returns BAD_ALIEN_MAGIC
+ * for alloc_alien_cache,
+ */
+ if (!alc || (unsigned long)alc == BAD_ALIEN_MAGIC)
+ return;
+ for_each_node(r) {
+ if (alc[r])
+ lockdep_set_class(&alc[r]->lock,
+ &on_slab_alc_key);
}
- s++;
}
}
+
+static inline void init_lock_keys(void)
+{
+ int node;
+
+ for_each_node(node)
+ init_node_lock_keys(node);
+}
#else
+static void init_node_lock_keys(int q)
+{
+}
+
static inline void init_lock_keys(void)
{
}
@@ -665,26 +697,6 @@ static inline void init_lock_keys(void)
static DEFINE_MUTEX(cache_chain_mutex);
static struct list_head cache_chain;
-/*
- * chicken and egg problem: delay the per-cpu array allocation
- * until the general caches are up.
- */
-static enum {
- NONE,
- PARTIAL_AC,
- PARTIAL_L3,
- EARLY,
- FULL
-} g_cpucache_up;
-
-/*
- * used by boot code to determine if it can use slab based allocator
- */
-int slab_is_available(void)
-{
- return g_cpucache_up >= EARLY;
-}
-
static DEFINE_PER_CPU(struct delayed_work, reap_work);
static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -1254,6 +1266,8 @@ static int __cpuinit cpuup_prepare(long cpu)
kfree(shared);
free_alien_cache(alien);
}
+ init_node_lock_keys(node);
+
return 0;
bad:
cpuup_canceled(cpu);
@@ -3103,13 +3117,19 @@ static inline void *____cache_alloc(struct kmem_cache *cachep, gfp_t flags)
} else {
STATS_INC_ALLOCMISS(cachep);
objp = cache_alloc_refill(cachep, flags);
+ /*
+ * the 'ac' may be updated by cache_alloc_refill(),
+ * and kmemleak_erase() requires its correct value.
+ */
+ ac = cpu_cache_get(cachep);
}
/*
* To avoid a false negative, if an object that is in one of the
* per-CPU caches is leaked, we need to make sure kmemleak doesn't
* treat the array pointers as a reference to the object.
*/
- kmemleak_erase(&ac->entry[ac->avail]);
+ if (objp)
+ kmemleak_erase(&ac->entry[ac->avail]);
return objp;
}
@@ -3306,7 +3326,7 @@ __cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
cache_alloc_debugcheck_before(cachep, flags);
local_irq_save(save_flags);
- if (unlikely(nodeid == -1))
+ if (nodeid == -1)
nodeid = numa_node_id();
if (unlikely(!cachep->nodelists[nodeid])) {
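The two kmemleak fixes above come down to a classic stale-pointer pattern:
cache_alloc_refill() can install a new per-cpu array_cache, so 'ac' has to
be re-fetched with cpu_cache_get() before kmemleak_erase() is pointed at
&ac->entry[ac->avail]. A userspace analogy, purely illustrative (the types
and helpers below are made up, not kernel code):

#include <stdio.h>
#include <stdlib.h>

struct array_cache {
        unsigned int avail;
        void *entry[16];
};

/* stands in for the kernel's per-cpu array_cache slot */
static struct array_cache *percpu_ac;

static void refill(void)
{
        /* replaces the array_cache, as cache_alloc_refill() may */
        struct array_cache *new_ac = calloc(1, sizeof(*new_ac));

        new_ac->avail = 1;
        new_ac->entry[0] = malloc(32);
        free(percpu_ac);
        percpu_ac = new_ac;
}

static void *cache_alloc(void)
{
        struct array_cache *ac = percpu_ac;  /* like cpu_cache_get() */
        void *obj = NULL;

        if (!ac->avail) {
                refill();
                ac = percpu_ac;  /* re-fetch: the old 'ac' dangles now */
        }
        if (ac->avail)
                obj = ac->entry[--ac->avail];
        return obj;
}

int main(void)
{
        percpu_ac = calloc(1, sizeof(*percpu_ac));  /* starts empty */
        printf("allocated %p\n", cache_alloc());
        return 0;
}

The 'if (objp)' guard matters for the same reason: after a failed refill
there is no freelist slot worth erasing.
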
diff --git a/mm/slub.c b/mm/slub.c
index 4996fc7..da0ce55 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1735,7 +1735,7 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
}
local_irq_restore(flags);
- if (unlikely((gfpflags & __GFP_ZERO) && object))
+ if (unlikely(gfpflags & __GFP_ZERO) && object)
memset(object, 0, objsize);
kmemcheck_slab_alloc(s, gfpflags, object, c->objsize);
@@ -4371,12 +4371,28 @@ static int show_stat(struct kmem_cache *s, char *buf, enum stat_item si)
return len + sprintf(buf + len, "\n");
}
+static void clear_stat(struct kmem_cache *s, enum stat_item si)
+{
+ int cpu;
+
+ for_each_online_cpu(cpu)
+ get_cpu_slab(s, cpu)->stat[si] = 0;
+}
+
#define STAT_ATTR(si, text) \
static ssize_t text##_show(struct kmem_cache *s, char *buf) \
{ \
return show_stat(s, buf, si); \
} \
-SLAB_ATTR_RO(text); \
+static ssize_t text##_store(struct kmem_cache *s, \
+ const char *buf, size_t length) \
+{ \
+ if (buf[0] != '0') \
+ return -EINVAL; \
+ clear_stat(s, si); \
+ return length; \
+} \
+SLAB_ATTR(text); \
STAT_ATTR(ALLOC_FASTPATH, alloc_fastpath);
STAT_ATTR(ALLOC_SLOWPATH, alloc_slowpath);
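
As an aside, the two unlikely() fixes in this batch follow the same rule of
thumb: the hint should cover only the genuinely rare condition, not whatever
else happens to share the if. Roughly, in userspace terms (the flag value
and function below are illustrative, not the kernel's):

#include <stddef.h>
#include <string.h>

#define unlikely(x)  __builtin_expect(!!(x), 0)
#define GFP_ZERO     0x8000u  /* stand-in for __GFP_ZERO */

void *finish_alloc(void *object, unsigned int gfpflags, size_t size)
{
        /* before: if (unlikely((gfpflags & GFP_ZERO) && object)) */
        if (unlikely(gfpflags & GFP_ZERO) && object)
                memset(object, 0, size);
        return object;
}

Callers rarely pass GFP_ZERO, so that test is the rare part; 'object' is
very likely non-NULL, so folding it into the unlikely() mispredicts the
common case.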