2006-05-13 16:05:39

by Catalin Marinas

[permalink] [raw]
Subject: [RFC PATCH 2.6.17-rc4 0/6] Kernel memory leak detector

This is the first attempt at a kernel memory leak detector based on
the tracing garbage collection technique (similar to Valgrind's
"memcheck --leak-check" tool). See the Documentation/kmemleak.txt file
for a more detailed description.

The implementation is not complete and the code might contain
bugs. Only the ARM and i386 architectures are supported at the moment.

Let me know what you think. Thanks.

--
Catalin


2006-05-13 16:06:05

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 2/6] Some documentation for kmemleak

From: Catalin Marinas <[email protected]>

Signed-off-by: Catalin Marinas <[email protected]>
---

Documentation/kmemleak.txt | 94 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 94 insertions(+), 0 deletions(-)

diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt
new file mode 100644
index 0000000..2c36cbb
--- /dev/null
+++ b/Documentation/kmemleak.txt
@@ -0,0 +1,94 @@
+Kernel Memory Leak Detector
+===========================
+
+
+Introduction
+------------
+
+Kmemleak provides a way of detecting possible kernel memory leaks in a
+way similar to a tracing garbage collector
+(http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Tracing_garbage_collectors),
+with the difference that the orphan pointers are not freed but only
+reported via /proc/memleak. A similar method is used by the Valgrind
+tool (memcheck --leak-check) to detect the memory leaks in user-space
+applications.
+
+
+Basic Algorithm
+---------------
+
+The memory allocations via kmalloc, vmalloc, kmem_cache_alloc and
+friends are tracked and the pointers, together with additional
+information like size and stack trace, are stored in a radix tree. The
+corresponding freeing function calls are tracked and the pointers
+removed from the radix tree.
+
+An allocated block of memory is considered orphan if a pointer to its
+start address or to an alias (pointer aliases are explained later)
+cannot be found by scanning the memory (including saved
+registers). This means that there might be no way for the kernel to
+pass the address of the allocated block to a freeing function and
+therefore the block is considered a leak.
+
+The scanning algorithm steps:
+
+ 1. mark all pointers as white (remaining white pointers will later
+ be considered orphan)
+ 2. scan the memory starting with the data section and stacks,
+ checking the values against the addresses stored in the radix
+ tree. If a white pointer is found, it is added to the grey list
+ 3. scan the grey pointers for matching addresses (some white
+ pointers can become grey and added at the end of the grey list)
+ until the grey set is finished
+ 4. the remaining white pointers are considered orphan and reported
+ via /proc/memleak
+
+
+Improvements
+------------
+
+Because the Linux kernel calculates many pointers at run-time via the
+container_of macro (see the lists implementation), a lot of false
+positives would be reported. This tool re-writes the container_of
+macro so that the offset and size information is stored in the
+.init.memleak_offsets section. The memleak_init() function creates a
+radix tree with corresponding offsets for every encountered block
+size. The memory allocations hook stores the pointer address together
+with its aliases based on the size of the allocated block.
+
+While one level of offsets should be enough for most cases, two levels
+are considered, i.e. container_of(container_of(...)) (one false
+positive is the "struct socket_alloc" allocation in the
+sock_alloc_inode() function).
+
+Some allocated memory blocks have pointers stored in the kernel's
+internal data structures and they cannot be detected as orphans. To
+avoid this, kmemleak can also store the number of values equal to the
+pointer (or aliases) that need to be found so that the block is not
+considered a leak. One example is __vmalloc().
+
+
+Limitations and Drawbacks
+-------------------------
+
+The biggest drawback is the reduced performance of memory allocation
+and freeing. To avoid other penalties, the memory scanning is only
+performed when the /proc/memleak file is read. Anyway, this tool is
+intended for debugging purposes where the performance might not be the
+most important requirement.
+
+The tool can report false positives. These are cases where an
+allocated block doesn't need to be freed (some cases in the init_call
+functions), the pointer is calculated by other methods than the
+container_of macro or the pointer is stored in a location not scanned
+by kmemleak. If the "member" argument in the offsetof(type, member)
+call is not constant, kmemleak considers the offset as zero since it
+cannot be determined at compilation time (as a side node, it seems
+that gcc-4.0 doesn't compile these offsetof constructs either).
+
+Page allocations and ioremap are not tracked (yet).
+
+NUMA architectures are not supported. The tool wasn't tested on SMP
+hardware.
+
+Only the ARM and i386 architectures are currently supported.

2006-05-13 16:05:57

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

From: Catalin Marinas <[email protected]>

This patch adds the base support for the kernel memory leak detector. It
traces the memory allocation/freeing in a way similar to the Boehm's
conservative garbage collector, the difference being that the orphan
pointers are not freed but only shown in /proc/memleak. Enabling this
feature would introduce an overhead to memory allocations.

Signed-off-by: Catalin Marinas <[email protected]>
---

include/linux/kernel.h | 13 +
include/linux/memleak.h | 55 +++++
init/main.c | 5
lib/Kconfig.debug | 11 +
mm/Makefile | 2
mm/memleak.c | 549 +++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 632 insertions(+), 3 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e1bd084..988e368 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -13,6 +13,9 @@ #include <linux/stddef.h>
#include <linux/types.h>
#include <linux/compiler.h>
#include <linux/bitops.h>
+#ifdef CONFIG_DEBUG_MEMLEAK
+#include <linux/memleak.h>
+#endif
#include <asm/byteorder.h>
#include <asm/bug.h>

@@ -283,9 +286,17 @@ #define max_t(type,x,y) \
* @member: the name of the member within the struct.
*
*/
-#define container_of(ptr, type, member) ({ \
+#define __container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
+#ifdef CONFIG_DEBUG_MEMLEAK
+#define container_of(ptr, type, member) ({ \
+ DECLARE_MEMLEAK_OFFSET(container_of, type, member); \
+ __container_of(ptr, type, member); \
+})
+#else
+#define container_of(ptr, type, member) __container_of(ptr, type, member)
+#endif

/*
* Check at compile time that something is of a particular type.
diff --git a/include/linux/memleak.h b/include/linux/memleak.h
new file mode 100644
index 0000000..86763b4
--- /dev/null
+++ b/include/linux/memleak.h
@@ -0,0 +1,55 @@
+/*
+ * include/linux/memleak.h
+ *
+ * Copyright (C) 2006 ARM Limited
+ * Written by Catalin Marinas <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef __MEMLEAK_H
+#define __MEMLEAK_H
+
+#include <linux/compiler.h>
+
+
+struct memleak_offset {
+ unsigned long offset;
+ unsigned long size;
+ unsigned long member_size;
+};
+
+/* if (&((TYPE *) 0)->MEMBER) is not a constant known at compile time,
+ * just use 0 instead since we cannot add it to the
+ * .init.memleak_offsets section
+ */
+#define memleak_offsetof(type, member) \
+ (__builtin_constant_p(&((type *) 0)->member) ? \
+ ((size_t) &((type *) 0)->member) : 0)
+
+#define DECLARE_MEMLEAK_OFFSET(name, type, member) \
+ static const struct memleak_offset \
+ __attribute__ ((__section__ (".init.memleak_offsets"))) \
+ __attribute_used__ __memleak_offset__##name = { \
+ memleak_offsetof(type, member), \
+ sizeof(type), \
+ sizeof(((type *) 0)->member) \
+ }
+
+extern int memleak_init(void);
+/* Allocation/freeing hooks for pointers tracking */
+extern void memleak_alloc(const void *ptr, size_t size, int ref_count);
+extern void memleak_free(const void *ptr);
+
+#endif /* __MEMLEAK_H */
diff --git a/init/main.c b/init/main.c
index f715b9b..6304628 100644
--- a/init/main.c
+++ b/init/main.c
@@ -513,6 +513,10 @@ #endif
cpuset_init_early();
mem_init();
kmem_cache_init();
+ radix_tree_init();
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_init();
+#endif
setup_per_cpu_pageset();
numa_policy_init();
if (late_time_init)
@@ -533,7 +537,6 @@ #endif
key_init();
security_init();
vfs_caches_init(num_physpages);
- radix_tree_init();
signals_init();
/* rootfs populating might need page-writeback */
page_writeback_init();
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 6ecc180..5519d55 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -89,6 +89,17 @@ config DEBUG_SLAB_LEAK
bool "Memory leak debugging"
depends on DEBUG_SLAB

+config DEBUG_MEMLEAK
+ bool "Kernel memory leak detector"
+ depends on DEBUG_KERNEL && SLAB
+ help
+ Say Y here if you want to enable the memory leak
+ detector. The memory allocation/freeing is traced in a way
+ similar to the Boehm's conservative garbage collector, the
+ difference being that the orphan pointers are not freed but
+ only shown in /proc/memleak. Enabling this feature would
+ introduce an overhead to memory allocations.
+
config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
diff --git a/mm/Makefile b/mm/Makefile
index 0b8f73f..d487d96 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -23,4 +23,4 @@ obj-$(CONFIG_SLAB) += slab.o
obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
obj-$(CONFIG_FS_XIP) += filemap_xip.o
obj-$(CONFIG_MIGRATION) += migrate.o
-
+obj-$(CONFIG_DEBUG_MEMLEAK) += memleak.o
diff --git a/mm/memleak.c b/mm/memleak.c
new file mode 100644
index 0000000..aacdeb4
--- /dev/null
+++ b/mm/memleak.c
@@ -0,0 +1,549 @@
+/*
+ * mm/memleak.c
+ *
+ * Copyright (C) 2006 ARM Limited
+ * Written by Catalin Marinas <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+/* #define DEBUG */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/radix-tree.h>
+#include <linux/gfp.h>
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/kallsyms.h>
+#include <linux/mman.h>
+#include <linux/nodemask.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/vmalloc.h>
+#include <linux/mutex.h>
+
+#include <asm/bitops.h>
+#include <asm/sections.h>
+#include <asm/percpu.h>
+#include <asm/processor.h>
+#include <asm/thread_info.h>
+
+#include <linux/memleak.h>
+
+
+#ifdef CONFIG_FRAME_POINTER
+#define MAX_TRACE 4
+#else
+#define MAX_TRACE 1
+#endif
+
+
+extern struct memleak_offset __memleak_offsets_start[];
+extern struct memleak_offset __memleak_offsets_end[];
+
+
+struct memleak_alias {
+ struct hlist_node node;
+ unsigned long offset;
+};
+
+struct memleak_pointer {
+ struct list_head pointer_list;
+ struct list_head grey_list;
+ unsigned long pointer;
+ size_t size;
+ int ref_count; /* the minimum ecounters of the value */
+ int count; /* the ecounters of the value */
+ struct hlist_head *alias_list;
+ unsigned long trace[MAX_TRACE];
+};
+
+#define COLOR_WHITE(pointer) ((pointer)->count != -1 && (pointer)->count < (pointer)->ref_count)
+#define COLOR_GREY(pointer) ((pointer)->count >= (pointer)->ref_count)
+
+
+/* Tree storing the pointer aliases indexed by size */
+static RADIX_TREE(alias_tree, GFP_KERNEL);
+/* Tree storing all the possible pointers, indexed by the pointer value */
+static RADIX_TREE(pointer_tree, GFP_ATOMIC);
+/* The list of all allocated pointers */
+static LIST_HEAD(pointer_list);
+/* The list of the grey pointers */
+static LIST_HEAD(grey_list);
+
+static kmem_cache_t *pointer_cache;
+/* Used to avoid recursive call via the kmalloc/kfree functions */
+static spinlock_t memleak_lock = SPIN_LOCK_UNLOCKED;
+static int memleak_initialised = 0;
+static unsigned long memleak_running = 0;
+static DEFINE_MUTEX(memleak_mutex);
+
+
+/* Insert an element into the aliases radix tree.
+ * Return 0 on success.
+ */
+static inline int __init insert_alias(unsigned long size, unsigned long offset)
+{
+ int ret = 0;
+ struct hlist_head *alias_list;
+ struct memleak_alias *alias;
+
+ alias_list = radix_tree_lookup(&alias_tree, size);
+ if (alias_list) {
+ struct hlist_node *elem;
+
+ hlist_for_each_entry(alias, elem, alias_list, node) {
+ if (alias->offset == offset) {
+ ret = -EEXIST;
+ goto out;
+ }
+ }
+ } else {
+ alias_list = kmalloc(sizeof(struct hlist_head), GFP_KERNEL);
+ if (!alias_list)
+ panic("kmemleak: cannot allocate initial memory\n");
+ INIT_HLIST_HEAD(alias_list);
+
+ ret = radix_tree_insert(&alias_tree, size, alias_list);
+ if (ret)
+ panic("kmemleak: cannot insert into the aliases radix tree: %d\n", ret);
+ }
+
+ alias = kmalloc(sizeof(struct memleak_alias), GFP_KERNEL);
+ if (!alias)
+ panic("kmemleak: cannot allocate initial memory\n");
+ INIT_HLIST_NODE(&alias->node);
+ alias->offset = offset;
+
+ hlist_add_head(&alias->node, alias_list);
+
+ out:
+ return ret;
+}
+
+/* Insert a pointer and its aliases into the pointer radix tree
+ */
+static inline void insert_pointer(unsigned long ptr, size_t size, int ref_count)
+{
+ struct memleak_alias *alias;
+ struct hlist_node *elem;
+ struct memleak_pointer *pointer;
+ int err, i;
+#ifdef CONFIG_FRAME_POINTER
+ void *frame;
+#endif
+
+ pointer = kmem_cache_alloc(pointer_cache, SLAB_ATOMIC);
+ if (!pointer)
+ panic("kmemleak: cannot allocate a memleak_pointer structure\n");
+
+ INIT_LIST_HEAD(&pointer->pointer_list);
+ INIT_LIST_HEAD(&pointer->grey_list);
+ pointer->pointer = ptr;
+ pointer->size = size;
+ pointer->ref_count = ref_count;
+ pointer->count = -1;
+ pointer->alias_list = radix_tree_lookup(&alias_tree, size);
+
+#ifdef CONFIG_FRAME_POINTER
+ frame = __builtin_frame_address(0);
+ for (i = 0; i < MAX_TRACE; i++) {
+ if (kstack_end(frame)) {
+ pointer->trace[i] = 0;
+ continue;
+ }
+
+ pointer->trace[i] = arch_call_address(frame);
+ frame = arch_prev_frame(frame);
+ /* we don't need the return to do_exit() */
+ if (kstack_end(frame))
+ pointer->trace[i] = 0;
+ }
+#else
+ pointer->trace[0] = __builtin_return_address(0);
+#endif
+
+ err = radix_tree_insert(&pointer_tree, ptr, pointer);
+ if (err) {
+ dump_stack();
+ panic("kmemleak: cannot insert into the pointer radix tree: %d\n", err);
+ }
+
+ if (pointer->alias_list) {
+ hlist_for_each_entry(alias, elem, pointer->alias_list, node) {
+ if (alias->offset >= size)
+ BUG();
+
+ err = radix_tree_insert(&pointer_tree, ptr + alias->offset, pointer);
+ if (err) {
+ dump_stack();
+ panic("kmemleak: cannot insert alias into the pointer radix tree: %d\n", err);
+ }
+ }
+ }
+
+ list_add_tail(&pointer->pointer_list, &pointer_list);
+}
+
+/* Remove a pointer and its aliases from the pointer radix tree
+ */
+static inline void delete_pointer(unsigned long ptr)
+{
+ struct memleak_alias *alias;
+ struct hlist_node *elem;
+ struct memleak_pointer *pointer;
+
+ pointer = radix_tree_delete(&pointer_tree, ptr);
+ if (!pointer) {
+ printk(KERN_WARNING "kmemleak: freeing unknown pointer value 0x%08lx\n", ptr);
+ return;
+ }
+ if (pointer->pointer != ptr) {
+ dump_stack();
+ panic("kmemleak: freeing pointer by alias 0x%08lx\n", ptr);
+ }
+ if (COLOR_WHITE(pointer))
+ printk(KERN_WARNING "kmemleak: freeing orphan pointer 0x%08lx\n", ptr);
+
+ if (pointer->alias_list) {
+ hlist_for_each_entry(alias, elem, pointer->alias_list, node) {
+ if (!radix_tree_delete(&pointer_tree, ptr + alias->offset))
+ panic("kmemleak: cannot find pointer alias 0x%08lx, 0x%lx\n",
+ ptr, alias->offset);
+ }
+ }
+
+ list_del(&pointer->pointer_list);
+
+ kmem_cache_free(pointer_cache, pointer);
+}
+
+/* Allocation function hook
+ */
+void memleak_alloc(const void *ptr, size_t size, int ref_count)
+{
+ unsigned long flags;
+ unsigned int cpu_id = smp_processor_id();
+
+ if (!ptr)
+ return;
+
+ local_irq_save(flags);
+
+ /* avoid recursive calls. After disabling the interrupts, the
+ * only calls to this function on the same CPU should be from
+ * kmemleak itself and we ignore them. Calls from other CPU's
+ * would wait on the spin_lock.
+ */
+ if (!test_and_set_bit(cpu_id, &memleak_running)) {
+ spin_lock(&memleak_lock);
+
+ if (memleak_initialised) {
+ pr_debug("kmemleak: memleak_alloc(%p, %d)\n", ptr, size);
+ insert_pointer((unsigned long) ptr, size, ref_count);
+ }
+
+ spin_unlock(&memleak_lock);
+
+ clear_bit(cpu_id, &memleak_running);
+ }
+
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(memleak_alloc);
+
+/* Freeing function hook
+ */
+void memleak_free(const void *ptr)
+{
+ unsigned long flags;
+ unsigned int cpu_id = smp_processor_id();
+
+ if (!ptr)
+ return;
+
+ local_irq_save(flags);
+
+ /* avoid recursive calls. See memleak_alloc() for an explanation
+ */
+ if (!test_and_set_bit(cpu_id, &memleak_running)) {
+ spin_lock(&memleak_lock);
+
+ if (memleak_initialised) {
+ pr_debug("kmemleak: memleak_free(%p)\n", ptr);
+ delete_pointer((unsigned long) ptr);
+ }
+
+ spin_unlock(&memleak_lock);
+
+ clear_bit(cpu_id, &memleak_running);
+ }
+
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(memleak_free);
+
+/* Scan a block of memory (exclusive range) for pointers and move
+ * those found to the grey list
+ */
+static void memleak_scan_block(void *_start, void *_end)
+{
+ unsigned long *ptr;
+ unsigned long *start = _start;
+ unsigned long *end = _end;
+
+ for (ptr = start; ptr < end; ptr++) {
+ struct memleak_pointer *pointer =
+ radix_tree_lookup(&pointer_tree, *ptr);
+ if (!pointer)
+ continue;
+ if (!COLOR_WHITE(pointer))
+ continue;
+
+ pointer->count++;
+ if (COLOR_GREY(pointer))
+ list_add_tail_rcu(&pointer->grey_list, &grey_list);
+ }
+}
+
+/* Scan the memory and print the orphan pointers
+ */
+static void memleak_scan(void)
+{
+ unsigned long flags;
+ struct memleak_pointer *pointer;
+ struct task_struct *task;
+ int node;
+#ifdef CONFIG_SMP
+ int i;
+#endif
+
+ spin_lock_irqsave(&memleak_lock, flags);
+
+ list_for_each_entry_rcu(pointer, &pointer_list, pointer_list) {
+ pointer->count = 0;
+ }
+
+ /* data/bss scanning */
+ memleak_scan_block(_sdata, _edata);
+ memleak_scan_block(__bss_start, __bss_stop);
+
+#ifdef CONFIG_SMP
+ /* per-cpu scanning */
+ for (i = 0; i < NR_CPUS; i++)
+ memleak_scan_block(__per_cpu_offset[i] + __per_cpu_start,
+ __per_cpu_offset[i] + __per_cpu_end);
+#endif
+
+ /* mem_map scanning */
+ for_each_online_node(node) {
+ struct page *page, *end;
+
+ page = NODE_MEM_MAP(node);
+ end = page + NODE_DATA(node)->node_spanned_pages;
+
+ memleak_scan_block(page, end);
+ }
+
+ /* task stacks scanning */
+ for_each_process(task)
+ memleak_scan_block(task_stack_page(task),
+ task_stack_page(task) + THREAD_SIZE);
+
+ /* grey_list scanning */
+ list_for_each_entry_rcu(pointer, &grey_list, grey_list) {
+ memleak_scan_block((void *) pointer->pointer,
+ (void *) (pointer->pointer + pointer->size));
+ list_del_rcu(&pointer->grey_list);
+ }
+
+ spin_unlock_irqrestore(&memleak_lock, flags);
+}
+
+#ifdef CONFIG_PROC_FS
+static void *memleak_seq_start(struct seq_file *seq, loff_t *pos)
+{
+ struct memleak_pointer *pointer;
+ loff_t n = *pos;
+
+ mutex_lock(&memleak_mutex);
+
+ if (!n)
+ memleak_scan();
+
+ list_for_each_entry_rcu(pointer, &pointer_list, pointer_list) {
+ if (!n--)
+ return pointer;
+ }
+
+ return NULL;
+}
+
+static void *memleak_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+ struct list_head *n = ((struct memleak_pointer *) v)->pointer_list.next;
+
+ ++(*pos);
+
+ return (n != &pointer_list)
+ ? list_entry(n, struct memleak_pointer, pointer_list)
+ : NULL;
+}
+
+static void memleak_seq_stop(struct seq_file *seq, void *v)
+{
+ mutex_unlock(&memleak_mutex);
+}
+
+static int memleak_seq_show(struct seq_file *seq, void *v)
+{
+ const struct memleak_pointer *pointer = v;
+ char namebuf[KSYM_NAME_LEN + 1] = "";
+ char *modname;
+ unsigned long symsize;
+ unsigned long offset = 0;
+ int i;
+
+ if (!COLOR_WHITE(pointer))
+ return 0;
+
+ seq_printf(seq, "orphan pointer 0x%08lx (size %d):\n",
+ pointer->pointer, pointer->size);
+
+ for (i = 0; i < MAX_TRACE; i++) {
+ unsigned long trace = pointer->trace[i];
+ if (!trace)
+ break;
+
+#ifdef CONFIG_KALLSYMS
+ kallsyms_lookup(trace, &symsize, &offset, &modname, namebuf);
+#endif
+ seq_printf(seq, " %lx: <%s>\n", trace, namebuf);
+ }
+
+ return 0;
+}
+
+static struct seq_operations memleak_seq_ops = {
+ .start = memleak_seq_start,
+ .next = memleak_seq_next,
+ .stop = memleak_seq_stop,
+ .show = memleak_seq_show,
+};
+
+static int memleak_seq_open(struct inode *inode, struct file *file)
+{
+ return seq_open(file, &memleak_seq_ops);
+}
+
+static struct file_operations memleak_proc_fops = {
+ .owner = THIS_MODULE,
+ .open = memleak_seq_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+#endif /* CONFIG_PROC_FS */
+
+/* KMemLeak initialisation. Set up the radix tree for the pointer
+ * aliases
+ */
+int __init memleak_init(void)
+{
+ struct memleak_offset *ml_off;
+ int aliases = 0;
+ unsigned long flags;
+
+ printk(KERN_INFO "Kernel memory leak detector\n");
+
+ pointer_cache = kmem_cache_create("kmemleak cache", sizeof(struct memleak_pointer),
+ 0, SLAB_PANIC, NULL, NULL);
+
+ spin_lock_irqsave(&memleak_lock, flags);
+
+ /* primary aliases - container_of(member) */
+ for (ml_off = __memleak_offsets_start; ml_off < __memleak_offsets_end;
+ ml_off++) {
+ /* Note that offset == size is not a bug (see
+ * container_of usage in ipc/utils.c) but we ignore it
+ * because the alias can overlap with valid pointers
+ */
+ if (ml_off->offset > ml_off->size)
+ BUG();
+ else if (ml_off->offset != 0 && ml_off->offset < ml_off->size
+ && !insert_alias(ml_off->size, ml_off->offset))
+ aliases++;
+ }
+ pr_debug("kmemleak: found %d primary aliases\n", aliases);
+
+ /* secondary aliases - container_of(container_of(member)) */
+ for (ml_off = __memleak_offsets_start; ml_off < __memleak_offsets_end;
+ ml_off++) {
+ struct hlist_head *alias_list;
+ struct memleak_alias *alias;
+ struct hlist_node *elem;
+
+ alias_list = radix_tree_lookup(&alias_tree, ml_off->member_size);
+ if (!alias_list)
+ continue;
+
+ hlist_for_each_entry(alias, elem, alias_list, node) {
+ if (!insert_alias(ml_off->size,
+ ml_off->offset + alias->offset))
+ aliases++;
+ }
+ }
+ pr_debug("kmemleak: found %d aliases\n", aliases);
+
+ memleak_initialised = 1;
+
+ spin_unlock_irqrestore(&memleak_lock, flags);
+
+ return 0;
+}
+
+/* Late initialisation function
+ */
+int __init memleak_late_init(void)
+{
+#ifdef CONFIG_PROC_FS
+ struct proc_dir_entry *entry;
+#endif
+
+ pr_debug("kmemleak: late init\n");
+
+#if 0
+ /* make some orphan pointers for testing */
+ kmalloc(32, GFP_KERNEL);
+ kmalloc(32, GFP_KERNEL);
+ kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
+ kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
+ vmalloc(64);
+ vmalloc(64);
+#endif
+
+#ifdef CONFIG_PROC_FS
+ entry = create_proc_entry("memleak", S_IRUGO, NULL);
+ if (!entry)
+ return -ENOMEM;
+ entry->proc_fops = &memleak_proc_fops;
+#endif
+
+ return 0;
+}
+late_initcall(memleak_late_init);

2006-05-13 16:06:43

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 3/6] Add the memory allocation/freeing hooks for kmemleak

From: Catalin Marinas <[email protected]>

This patch adds the callbacks to memleak_(alloc|free) functions from
kmalloc/kree, kmem_cache_(alloc|free), vmalloc/vfree etc.

Signed-off-by: Catalin Marinas <[email protected]>
---

include/linux/slab.h | 4 ++++
mm/page_alloc.c | 3 +++
mm/slab.c | 30 ++++++++++++++++++++++++++----
mm/vmalloc.c | 20 ++++++++++++++++++--
4 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3af03b1..4a573ff 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -89,6 +89,7 @@ #endif

static inline void *kmalloc(size_t size, gfp_t flags)
{
+#ifndef CONFIG_DEBUG_MEMLEAK
if (__builtin_constant_p(size)) {
int i = 0;
#define CACHE(x) \
@@ -107,6 +108,7 @@ found:
malloc_sizes[i].cs_dmacachep :
malloc_sizes[i].cs_cachep, flags);
}
+#endif
return __kmalloc(size, flags);
}

@@ -114,6 +116,7 @@ extern void *__kzalloc(size_t, gfp_t);

static inline void *kzalloc(size_t size, gfp_t flags)
{
+#ifndef CONFIG_DEBUG_MEMLEAK
if (__builtin_constant_p(size)) {
int i = 0;
#define CACHE(x) \
@@ -132,6 +135,7 @@ found:
malloc_sizes[i].cs_dmacachep :
malloc_sizes[i].cs_cachep, flags);
}
+#endif
return __kzalloc(size, flags);
}

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ea77c99..d511586 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2788,6 +2788,9 @@ void *__init alloc_large_system_hash(con
if (_hash_mask)
*_hash_mask = (1 << log2qty) - 1;

+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_alloc(table, size, 1);
+#endif
return table;
}

diff --git a/mm/slab.c b/mm/slab.c
index c32af7e..5f4ebf4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -434,6 +434,8 @@ #if DEBUG
* variables contain the offset to the user object and its size.
*/
int obj_offset;
+#endif
+#if DEBUG || defined(CONFIG_DEBUG_MEMLEAK)
int obj_size;
#endif
};
@@ -672,7 +674,7 @@ static struct kmem_cache cache_cache = {
.shared = 1,
.buffer_size = sizeof(struct kmem_cache),
.name = "kmem_cache",
-#if DEBUG
+#if DEBUG || defined(CONFIG_DEBUG_MEMLEAK)
.obj_size = sizeof(struct kmem_cache),
#endif
};
@@ -2034,9 +2036,11 @@ #endif
if (!cachep)
goto oops;

-#if DEBUG
+#if DEBUG || defined(CONFIG_DEBUG_MEMLEAK)
cachep->obj_size = size;
+#endif

+#if DEBUG
if (flags & SLAB_RED_ZONE) {
/* redzoning only works with word aligned caches */
align = BYTES_PER_WORD;
@@ -3133,7 +3137,11 @@ #endif
*/
void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
{
- return __cache_alloc(cachep, flags, __builtin_return_address(0));
+ void *ptr = __cache_alloc(cachep, flags, __builtin_return_address(0));
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_alloc(ptr, cachep->obj_size, 1);
+#endif
+ return ptr;
}
EXPORT_SYMBOL(kmem_cache_alloc);

@@ -3148,6 +3156,9 @@ EXPORT_SYMBOL(kmem_cache_alloc);
void *kmem_cache_zalloc(struct kmem_cache *cache, gfp_t flags)
{
void *ret = __cache_alloc(cache, flags, __builtin_return_address(0));
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_alloc(ret, cache->obj_size, 1);
+#endif
if (ret)
memset(ret, 0, obj_size(cache));
return ret;
@@ -3269,6 +3280,7 @@ static __always_inline void *__do_kmallo
void *caller)
{
struct kmem_cache *cachep;
+ void *ptr;

/* If you want to save a few bytes .text space: replace
* __ with kmem_.
@@ -3278,7 +3290,11 @@ static __always_inline void *__do_kmallo
cachep = __find_general_cachep(size, flags);
if (unlikely(cachep == NULL))
return NULL;
- return __cache_alloc(cachep, flags, caller);
+ ptr = __cache_alloc(cachep, flags, caller);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_alloc(ptr, size, 1);
+#endif
+ return ptr;
}


@@ -3362,6 +3378,9 @@ void kmem_cache_free(struct kmem_cache *
unsigned long flags;

local_irq_save(flags);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_free(objp);
+#endif
__cache_free(cachep, objp);
local_irq_restore(flags);
}
@@ -3385,6 +3404,9 @@ void kfree(const void *objp)
return;
local_irq_save(flags);
kfree_debugcheck(objp);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_free(objp);
+#endif
c = virt_to_cache(objp);
mutex_debug_check_no_locks_freed(objp, obj_size(c));
__cache_free(c, (void *)objp);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c0504f1..ffd6154 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -349,6 +349,9 @@ void __vunmap(void *addr, int deallocate
void vfree(void *addr)
{
BUG_ON(in_interrupt());
+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_free(addr);
+#endif
__vunmap(addr, 1);
}
EXPORT_SYMBOL(vfree);
@@ -447,7 +450,14 @@ fail:

void *__vmalloc_area(struct vm_struct *area, gfp_t gfp_mask, pgprot_t prot)
{
- return __vmalloc_area_node(area, gfp_mask, prot, -1);
+ void *addr = __vmalloc_area_node(area, gfp_mask, prot, -1);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ /* this needs ref_count = 2 since vm_struct also contains a
+ pointer to this address. The guard page is also subtracted
+ from the size */
+ memleak_alloc(addr, area->size - PAGE_SIZE, 2);
+#endif
+ return addr;
}

/**
@@ -481,7 +491,13 @@ EXPORT_SYMBOL(__vmalloc_node);

void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot)
{
- return __vmalloc_node(size, gfp_mask, prot, -1);
+ void *addr = __vmalloc_node(size, gfp_mask, prot, -1);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ /* this needs ref_count = 2 since the vm_struct also contains
+ a pointer to this address */
+ memleak_alloc(addr, size, 2);
+#endif
+ return addr;
}
EXPORT_SYMBOL(__vmalloc);

2006-05-13 16:06:30

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 5/6] Add kmemleak support for ARM

From: Catalin Marinas <[email protected]>

This patch modifies the vmlinux.lds.S script and adds the backtrace support
for ARM to be used with kmemleak.

Signed-off-by: Catalin Marinas <[email protected]>
---

arch/arm/kernel/vmlinux.lds.S | 7 +++++++
include/asm-arm/processor.h | 12 ++++++++++++
2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 2b254e8..c6f038c 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -68,6 +68,11 @@ #endif
__per_cpu_start = .;
*(.data.percpu)
__per_cpu_end = .;
+#ifdef CONFIG_DEBUG_MEMLEAK
+ __memleak_offsets_start = .;
+ *(.init.memleak_offsets)
+ __memleak_offsets_end = .;
+#endif
#ifndef CONFIG_XIP_KERNEL
__init_begin = _stext;
*(.init.data)
@@ -110,6 +115,7 @@ #endif

.data : AT(__data_loc) {
__data_start = .; /* address in memory */
+ _sdata = .;

/*
* first, the init task union, aligned
@@ -158,6 +164,7 @@ #endif
__bss_start = .; /* BSS */
*(.bss)
*(COMMON)
+ __bss_stop = .;
_end = .;
}
/* Stabs debugging sections. */
diff --git a/include/asm-arm/processor.h b/include/asm-arm/processor.h
index 04f4d34..feaf017 100644
--- a/include/asm-arm/processor.h
+++ b/include/asm-arm/processor.h
@@ -121,6 +121,18 @@ #define spin_lock_prefetch(x) do { } whi

#endif

+#ifdef CONFIG_FRAME_POINTER
+static inline unsigned long arch_call_address(void *frame)
+{
+ return *(unsigned long *) (frame - 4) - 4;
+}
+
+static inline void *arch_prev_frame(void *frame)
+{
+ return *(void **) (frame - 12);
+}
+#endif
+
#endif

#endif /* __ASM_ARM_PROCESSOR_H */

2006-05-13 16:07:04

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

From: Catalin Marinas <[email protected]>

There are allocations for which the main pointer cannot be found but they
are not memory leaks. This patch fixes some of them.

Signed-off-by: Catalin Marinas <[email protected]>
---

ipc/util.c | 13 +++++++++++++
kernel/params.c | 9 ++++++++-
2 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index 8193299..788db50 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -31,6 +31,11 @@ #include <linux/seq_file.h>
#include <linux/proc_fs.h>
#include <linux/audit.h>

+#ifdef CONFIG_DEBUG_MEMLEAK
+#include <linux/memleak.h>
+#endif
+
+
#include <asm/unistd.h>

#include "util.h"
@@ -389,6 +394,10 @@ void* ipc_rcu_alloc(int size)
*/
if (rcu_use_vmalloc(size)) {
out = vmalloc(HDRLEN_VMALLOC + size);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ /* avoid a false alarm. That's not a memory leak */
+ memleak_free(out);
+#endif
if (out) {
out += HDRLEN_VMALLOC;
container_of(out, struct ipc_rcu_hdr, data)->is_vmalloc = 1;
@@ -396,6 +405,10 @@ void* ipc_rcu_alloc(int size)
}
} else {
out = kmalloc(HDRLEN_KMALLOC + size, GFP_KERNEL);
+#ifdef CONFIG_DEBUG_MEMLEAK
+ /* avoid a false alarm. That's not a memory leak */
+ memleak_free(out);
+#endif
if (out) {
out += HDRLEN_KMALLOC;
container_of(out, struct ipc_rcu_hdr, data)->is_vmalloc = 0;
diff --git a/kernel/params.c b/kernel/params.c
index af43ecd..fb268e8 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -548,6 +548,7 @@ static void __init kernel_param_sysfs_se
unsigned int name_skip)
{
struct module_kobject *mk;
+ struct module_param_attrs *mp;

mk = kzalloc(sizeof(struct module_kobject), GFP_KERNEL);
BUG_ON(!mk);
@@ -557,11 +558,17 @@ static void __init kernel_param_sysfs_se
kobject_set_name(&mk->kobj, name);
kobject_register(&mk->kobj);

+ mp = param_sysfs_setup(mk, kparam, num_params, name_skip);
/* no need to keep the kobject if no parameter is exported */
- if (!param_sysfs_setup(mk, kparam, num_params, name_skip)) {
+ if (!mp) {
kobject_unregister(&mk->kobj);
kfree(mk);
}
+#ifdef CONFIG_DEBUG_MEMLEAK
+ /* avoid a false alarm. That's not a memory leak */
+ else
+ memleak_free(mp);
+#endif
}

/*

2006-05-13 16:06:44

by Catalin Marinas

[permalink] [raw]
Subject: [PATCH 2.6.17-rc4 4/6] Add kmemleak support for i386

From: Catalin Marinas <[email protected]>

This patch modifies the vmlinux.lds.S script and adds the backtrace support
for i386 to be used with kmemleak.

Signed-off-by: Catalin Marinas <[email protected]>
---

arch/i386/kernel/vmlinux.lds.S | 4 ++++
include/asm-i386/processor.h | 12 ++++++++++++
2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/vmlinux.lds.S b/arch/i386/kernel/vmlinux.lds.S
index 8831303..370480e 100644
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -38,6 +38,7 @@ SECTIONS
RODATA

/* writeable */
+ _sdata = .; /* Start of data section */
.data : AT(ADDR(.data) - LOAD_OFFSET) { /* Data */
*(.data)
CONSTRUCTORS
@@ -140,6 +141,9 @@ SECTIONS
__per_cpu_start = .;
.data.percpu : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
__per_cpu_end = .;
+ __memleak_offsets_start = .;
+ .init.memleak_offsets : AT(ADDR(.init.memleak_offsets) - LOAD_OFFSET) { *(.init.memleak_offsets) }
+ __memleak_offsets_end = .;
. = ALIGN(4096);
__init_end = .;
/* freed after init ends here */
diff --git a/include/asm-i386/processor.h b/include/asm-i386/processor.h
index 805f0dc..9b6568a 100644
--- a/include/asm-i386/processor.h
+++ b/include/asm-i386/processor.h
@@ -743,4 +743,16 @@ #else
#define mcheck_init(c) do {} while(0)
#endif

+#ifdef CONFIG_FRAME_POINTER
+static inline unsigned long arch_call_address(void *frame)
+{
+ return *(unsigned long *) (frame + 4);
+}
+
+static inline void *arch_prev_frame(void *frame)
+{
+ return *(void **) frame;
+}
+#endif
+
#endif /* __ASM_I386_PROCESSOR_H */

2006-05-13 17:42:42

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

A few comments on your patch below.

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> From: Catalin Marinas <[email protected]>
>
> This patch adds the base support for the kernel memory leak detector. It
> traces the memory allocation/freeing in a way similar to the Boehm's
> conservative garbage collector, the difference being that the orphan
> pointers are not freed but only shown in /proc/memleak. Enabling this

/proc is such a mess already, do we have to add another file to it?
How about using sysfs instead? I know that is "one value pr file", but
then simply make one file pr leaked pointer or something like that...


> feature would introduce an overhead to memory allocations.
>
> Signed-off-by: Catalin Marinas <[email protected]>
> ---
>
> include/linux/kernel.h | 13 +
> include/linux/memleak.h | 55 +++++
> init/main.c | 5
> lib/Kconfig.debug | 11 +
> mm/Makefile | 2
> mm/memleak.c | 549 +++++++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 632 insertions(+), 3 deletions(-)
>
[snip]
> diff --git a/include/linux/memleak.h b/include/linux/memleak.h
[snip]
> +#define memleak_offsetof(type, member) \
> + (__builtin_constant_p(&((type *) 0)->member) ? \
> + ((size_t) &((type *) 0)->member) : 0)
> +

No spaces after the closing parenthesis of a cast and the value being
cast please.

(__builtin_constant_p(&((type *)0)->member) ? \
((size_t) &((type *)0)->member) : 0)

There are more occourances of this, only pointing out the first one.

[snip]
[snip]
> +config DEBUG_MEMLEAK
> + bool "Kernel memory leak detector"
> + depends on DEBUG_KERNEL && SLAB
> + help
> + Say Y here if you want to enable the memory leak
> + detector. The memory allocation/freeing is traced in a way
> + similar to the Boehm's conservative garbage collector, the
> + difference being that the orphan pointers are not freed but
> + only shown in /proc/memleak. Enabling this feature would
> + introduce an overhead to memory allocations.

Shouldn't that last bit read "Enabling this feature will introduce
overhead to memory allocations." ?


[snip]
> +#define MAX_TRACE 1
> +#endif
> +
> +
> +extern struct memleak_offset __memleak_offsets_start[];
> +extern struct memleak_offset __memleak_offsets_end[];
> +
> +
> +struct memleak_alias {

You seem to be very fond of double empty lines, here and elsewhere.
Surely just a single blank line would do just fine in many places -
no?


[snip]
> +static inline void delete_pointer(unsigned long ptr)

"inline" ? Isn't this function a little too fat for that?


[snip]
> +/* Freeing function hook
> + */

A lot of lines could be saved if all these small comments were on a
single line instead...

/* Freeing function hook */

[snip]
> + delete_pointer((unsigned long) ptr);

delete_pointer((unsigned long)ptr);


[snip]
> +static void memleak_scan(void)
> +{
> + unsigned long flags;
> + struct memleak_pointer *pointer;
> + struct task_struct *task;
> + int node;
> +#ifdef CONFIG_SMP
> + int i;
> +#endif

Why not just get rid of `i' and just use the `node' variable in the
single location where `i' is used (or get rid of `node' and use `i' in
its place) ?
As far as I can see that shouldn't be a problem and it'll save one
local variable on SMP.


[snip]
> + memleak_scan_block((void *) pointer->pointer,
> + (void *) (pointer->pointer + pointer->size));

memleak_scan_block((void *)pointer->pointer,
(void *)(pointer->pointer + pointer->size));


[snip]
> +static void *memleak_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> +{
> + struct list_head *n = ((struct memleak_pointer *) v)->pointer_list.next;

struct list_head *n = ((struct memleak_pointer *)v)->pointer_list.next;


> +
> + ++(*pos);
> +
> + return (n != &pointer_list)
> + ? list_entry(n, struct memleak_pointer, pointer_list)
> + : NULL;

Wouldn't this be more readable as

if (n != &pointer_list)
return list_entry(n, struct memleak_pointer, pointer_list);
return NULL

???

[snip]
> +int __init memleak_init(void)
> +{
> + struct memleak_offset *ml_off;
> + int aliases = 0;
> + unsigned long flags;
> +
> + printk(KERN_INFO "Kernel memory leak detector\n");

How about moving this printk() to the end of memleak_init() and changing it to :

printk(KERN_INFO "Kernel memory leak detector initialized.\n");


[snip]
> +#if 0
> + /* make some orphan pointers for testing */
> + kmalloc(32, GFP_KERNEL);
> + kmalloc(32, GFP_KERNEL);
> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
> + vmalloc(64);
> + vmalloc(64);
> +#endif

Stuff for testing is nice, but do we have to add it to the kernel? - I
assume you are done testing :-)
We have waay too much code already in the kernel inside #if 0



--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-13 17:47:23

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Jesper> /proc is such a mess already, do we have to add another
Jesper> file to it? How about using sysfs instead? I know that is
Jesper> "one value pr file", but then simply make one file pr
Jesper> leaked pointer or something like that...

Actually debugfs would make the most sense I think -- and the code
to create a debugfs file is simpler than the corresponding procfs code.

- R.

2006-05-13 18:11:53

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Please try to keep ifdef's out of the main kernel files.

Imagine what an unreadable mess the main kernel files
would be if every CONFIG option had its hooks wrapped
in #ifdef's.

==============================================

+#ifdef CONFIG_DEBUG_MEMLEAK
+#include <linux/memleak.h>
+#endif

should be changed to always include linux/memleak.h,
and bury the ifdefs with memleak.h so that if not
CONFIG'd, it just defines the empty stubs needed
to remove other ifdef's in the main files.

==============================================

+#ifdef CONFIG_DEBUG_MEMLEAK
+#define container_of(ptr, type, member) ({ \
+ DECLARE_MEMLEAK_OFFSET(container_of, type, member); \
+ __container_of(ptr, type, member); \
+})
+#else
+#define container_of(ptr, type, member) __container_of(ptr, type, member)
+#endif

could (not sure of this works - you'll have to experiment) become:

+#define container_of(ptr, type, member) ({ \
+ DECLARE_MEMLEAK_OFFSET(container_of, type, member); \
+ __container_of(ptr, type, member); \
+})

where DECLARE_MEMLEAK_OFFSET was an empty stub if not CONFIG'd

==============================================

+#ifdef CONFIG_DEBUG_MEMLEAK
+ memleak_init();
+#endif

becomes simply:

+ memleak_init();

where memleak_init() is an empty stuff if memleak if not CONFIG'd.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-05-13 18:24:12

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 4/6] Add kmemleak support for i386

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> From: Catalin Marinas <[email protected]>
>
> This patch modifies the vmlinux.lds.S script and adds the backtrace support
> for i386 to be used with kmemleak.
>
[snip]
> +static inline unsigned long arch_call_address(void *frame)
> +{
> + return *(unsigned long *) (frame + 4);

return *(unsigned long *)(frame + 4);


> +}
> +
> +static inline void *arch_prev_frame(void *frame)
> +{
> + return *(void **) frame;

return *(void **)frame;


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-13 18:25:20

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 5/6] Add kmemleak support for ARM

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> From: Catalin Marinas <[email protected]>
>
> This patch modifies the vmlinux.lds.S script and adds the backtrace support
> for ARM to be used with kmemleak.
>
[snip]
> +static inline unsigned long arch_call_address(void *frame)
> +{
> + return *(unsigned long *) (frame - 4) - 4;

return *(unsigned long *)(frame - 4) - 4;


> +}
> +
> +static inline void *arch_prev_frame(void *frame)
> +{
> + return *(void **) (frame - 12);

return *(void **)(frame - 12);


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-13 19:21:40

by Jesper Juhl

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

On 13/05/06, Catalin Marinas <[email protected]> wrote:
> From: Catalin Marinas <[email protected]>
>
> There are allocations for which the main pointer cannot be found but they
> are not memory leaks. This patch fixes some of them.
>
[snip]
> +#ifdef CONFIG_DEBUG_MEMLEAK
> + /* avoid a false alarm. That's not a memory leak */
> + memleak_free(out);
> +#endif

Hmm, so eventually we are going to end up with a bunch of ugly #ifdef
CONFIG_DEBUG_MEMLEAK's all over the place?

Wouldn't it be better to just make memleak_free() an empty stub in the
!CONFIG_DEBUG_MEMLEAK case?

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-13 21:20:37

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 4/6] Add kmemleak support for i386

> [snip]
>> +static inline unsigned long arch_call_address(void *frame)
>> +{
>> + return *(unsigned long *) (frame + 4);
>
> return *(unsigned long *)(frame + 4);

I would like to point out that a __bulitin_return_address(immediate int)
and __builtin_frame_address(immediate int) exist (but they can
unfortunately not be used with variables even though that would not pose
much of a problem on x86).


>> +static inline void *arch_prev_frame(void *frame)
>> +{
>> + return *(void **) frame;
>
> return *(void **)frame;
>



Jan Engelhardt
--

2006-05-13 23:20:42

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Catalin Marinas <[email protected]> writes:

> From: Catalin Marinas <[email protected]>
>
> This patch adds the base support for the kernel memory leak detector. It
> traces the memory allocation/freeing in a way similar to the Boehm's
> conservative garbage collector, the difference being that the orphan
> pointers are not freed but only shown in /proc/memleak. Enabling this
> feature would introduce an overhead to memory allocations.

Interesting approach. Did you actually find any leaks with this?

What looks a bit dubious is how objects reuse is handled. You can't
distingush an reused object from an old leaked pointer. But due
to the way slab allocates this should be pretty common. I guess
for your approach to be effective slab would need to be changed
to a queue?

-Andi

2006-05-14 07:24:08

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Jesper Juhl wrote:
> A few comments on your patch below.

Thanks.

> On 5/13/06, Catalin Marinas <[email protected]> wrote:
>> This patch adds the base support for the kernel memory leak detector. It
>> traces the memory allocation/freeing in a way similar to the Boehm's
>> conservative garbage collector, the difference being that the orphan
>> pointers are not freed but only shown in /proc/memleak. Enabling this
>
> /proc is such a mess already, do we have to add another file to it?
> How about using sysfs instead? I know that is "one value pr file", but
> then simply make one file pr leaked pointer or something like that...

I'll probably use debugfs.

>> +#define memleak_offsetof(type, member) \
>> + (__builtin_constant_p(&((type *) 0)->member) ? \
>> + ((size_t) &((type *) 0)->member) : 0)
>> +
>
>
> No spaces after the closing parenthesis of a cast and the value being
> cast please.

OK.

>> + only shown in /proc/memleak. Enabling this feature would
>> + introduce an overhead to memory allocations.
>
>
> Shouldn't that last bit read "Enabling this feature will introduce
> overhead to memory allocations." ?

Yes, indeed.

> You seem to be very fond of double empty lines, here and elsewhere.
> Surely just a single blank line would do just fine in many places -
> no?

That's just my style I seem to also add to many single empty lines but I
don't have any problem with removing them.

>> +static inline void delete_pointer(unsigned long ptr)
>
>
> "inline" ? Isn't this function a little too fat for that?

Not really since it is only called from one place and there won't be any
size overhead (there might not be a big speed gain either). I wanted to
separate the deleting algorithm from the locking in memleak_free.

>> +/* Freeing function hook
>> + */
>
>
> A lot of lines could be saved if all these small comments were on a
> single line instead...

OK.

>> +static void memleak_scan(void)
>> +{
>> + unsigned long flags;
>> + struct memleak_pointer *pointer;
>> + struct task_struct *task;
>> + int node;
>> +#ifdef CONFIG_SMP
>> + int i;
>> +#endif
>
> Why not just get rid of `i' and just use the `node' variable in the
> single location where `i' is used (or get rid of `node' and use `i' in
> its place) ?

OK.

>> + return (n != &pointer_list)
>> + ? list_entry(n, struct memleak_pointer, pointer_list)
>> + : NULL;
>
>
> Wouldn't this be more readable as
>
> if (n != &pointer_list)
> return list_entry(n, struct memleak_pointer, pointer_list);
> return NULL

Indeed, I'll change it.

>> +int __init memleak_init(void)
>> +{
>> + struct memleak_offset *ml_off;
>> + int aliases = 0;
>> + unsigned long flags;
>> +
>> + printk(KERN_INFO "Kernel memory leak detector\n");
>
>
> How about moving this printk() to the end of memleak_init() and changing
> it to :
>
> printk(KERN_INFO "Kernel memory leak detector initialized.\n");

OK.

>> +#if 0
>> + /* make some orphan pointers for testing */
>> + kmalloc(32, GFP_KERNEL);
>> + kmalloc(32, GFP_KERNEL);
>> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
>> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
>> + vmalloc(64);
>> + vmalloc(64);
>> +#endif
>
> Stuff for testing is nice, but do we have to add it to the kernel? - I
> assume you are done testing :-)
> We have waay too much code already in the kernel inside #if 0

The best would be to test it using a loadable module but I did most of
the work on an embedded ARM platform where it was much easier to add
some code directly. The code will be cleaned up in subsequent versions.

Thanks again,

Catalin

2006-05-14 07:28:08

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 4/6] Add kmemleak support for i386

Jan Engelhardt wrote:
>>[snip]
>>
>>>+static inline unsigned long arch_call_address(void *frame)
>>>+{
>>>+ return *(unsigned long *) (frame + 4);
>>
>> return *(unsigned long *)(frame + 4);
>
>
> I would like to point out that a __bulitin_return_address(immediate int)
> and __builtin_frame_address(immediate int) exist (but they can
> unfortunately not be used with variables even though that would not pose
> much of a problem on x86).

The code already uses __builtin_frame_address(0) and
__builtin_return_address(0) (the latter only if CONFIG_FRAME_POINTER is
not defined) but using these with a variable doesn't work on all
architectures (ARM being one of them, where I did the initial development).

Catalin

2006-05-14 07:31:58

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

Jesper Juhl wrote:
> On 13/05/06, Catalin Marinas <[email protected]> wrote:
>> +#ifdef CONFIG_DEBUG_MEMLEAK
>> + /* avoid a false alarm. That's not a memory leak */
>> + memleak_free(out);
>> +#endif
>
>
> Hmm, so eventually we are going to end up with a bunch of ussgly #ifdef
> CONFIG_DEBUG_MEMLEAK's all over the place?
>
> Wouldn't it be better to just make memleak_free() an empty stub in the
> !CONFIG_DEBUG_MEMLEAK case?

Yes, I'll make empty stubs (Paul suggested this as well).

Catalin

2006-05-14 08:19:32

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Andi Kleen wrote:
> Catalin Marinas <[email protected]> writes:
>>This patch adds the base support for the kernel memory leak detector. It
>>traces the memory allocation/freeing in a way similar to the Boehm's
>>conservative garbage collector, the difference being that the orphan
>>pointers are not freed but only shown in /proc/memleak. Enabling this
>>feature would introduce an overhead to memory allocations.
>
> Interesting approach. Did you actually find any leaks with this?

I haven't tested it intensively (that's the first version and I mainly
used a minimal kernel on an embedded system) but it could find
explicitely created leaks. For example, it should be able to detect
leaks similar to those fixed recently by Jesper:

http://lkml.org/lkml/2006/5/13/135
http://lkml.org/lkml/2006/5/13/140
http://lkml.org/lkml/2006/5/13/145

The above bugs were found by Coverity when analysing the code. The main
difference is that kmemleak finds them at run-time and only if they
happened.

> What looks a bit dubious is how objects reuse is handled. You can't
> distingush an reused object from an old leaked pointer.

The reused objects are not reported as leaks as long as the tool finds a
pointer to their address (or alias). The memleak_alloc hook is called in
kmem_cache_alloc after the object was actually allocated by
__cache_alloc. An object cannot be reused as long as it hasn't been
previously freed via kmem_cache_free (and the corresponding hook,
memleak_free, called). Kmemleak only reports allocated objects for which
there is no way to determine their address that can later be used in a
kmem_cache_free call.

Catalin

2006-05-14 14:49:32

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 3/6] Add the memory allocation/freeing hooks for kmemleak

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> From: Catalin Marinas <[email protected]>
>
> This patch adds the callbacks to memleak_(alloc|free) functions from
> kmalloc/kree, kmem_cache_(alloc|free), vmalloc/vfree etc.
>
> Signed-off-by: Catalin Marinas <[email protected]>

Way too many #ifdefs. Instead, please make memleak_alloc() and
memleak_free() empty static inline functions for the
non-CONFIG_DEBUG_MEMLEAK case.

Pekka

2006-05-14 14:53:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> This patch adds the base support for the kernel memory leak detector. It
> traces the memory allocation/freeing in a way similar to the Boehm's
> conservative garbage collector, the difference being that the orphan
> pointers are not freed but only shown in /proc/memleak. Enabling this
> feature would introduce an overhead to memory allocations.

Hmm. How much is the overhead anyway? I am guessing lots so can we
reasonably expect anyone to run such kernel? Why isn't DEBUG_SLAB_LEAK
enough for us?

Pekka

2006-05-14 14:55:48

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

On 5/13/06, Catalin Marinas <[email protected]> wrote:
> There are allocations for which the main pointer cannot be found but they
> are not memory leaks. This patch fixes some of them.

Why can't they be found? How many false positives are you expecting?

Pekka

2006-05-14 15:30:09

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Pekka Enberg wrote:
> On 5/13/06, Catalin Marinas <[email protected]> wrote:
>> This patch adds the base support for the kernel memory leak detector. It
>> traces the memory allocation/freeing in a way similar to the Boehm's
>> conservative garbage collector, the difference being that the orphan
>> pointers are not freed but only shown in /proc/memleak. Enabling this
>> feature would introduce an overhead to memory allocations.
>
> Hmm. How much is the overhead anyway? I am guessing lots so can we
> reasonably expect anyone to run such kernel?

The overhead is probably high since an allocated pointer needs to be
inserted in a radix tree, together with its aliases. However, no-one
should enable this feature for a production kernel, but only for
debugging (as with Valgrind for user-space, there isn't anyone using
those libraries for production versions).

> Why isn't DEBUG_SLAB_LEAK
> enough for us?

I haven't looked at DEBUG_SLAB_LEAK in detail but it looks to me like it
only shows the calling address of whoever allocated the object. Kmemleak
detects whether the object is still refered (tracing garbage collection).

So, DEBUG_SLAB_LEAK should be enough if you suspect a memory leak. If
you don't, some leaks might go unnoticed.

Catalin

2006-05-14 15:39:09

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

Pekka Enberg wrote:
> On 5/13/06, Catalin Marinas <[email protected]> wrote:
>> There are allocations for which the main pointer cannot be found but they
>> are not memory leaks. This patch fixes some of them.
>
> Why can't they be found? How many false positives are you expecting?

The tool scans the data section and kernel stacks for pointers. The
referred blocks are scanned as well. If a block address is not found in
all the scanned memory, it is considered unreferable or orphan (the
tracing garbage collection algorithm) and reported. There are some
memory blocks which get allocated but the address to their beginning is
discarded as the kernel doesn't need to free them (it happens with some
allocations during the booting process).

Another false positive is that the pointer to the beginning of the block
is determined by the kernel at run-time via the container_of macro or
some other method. KMemLeak currently supports up to two nested
container_of macros.

Yet another false positive can be caused by allocation of a size that
differs from the structure's size (kmalloc(sizeof(struct...) + ...)) and
kmemleak cannot properly determine the container_of aliases. One example
is the platform_device_alloc function and a false positive is in
add_pcspkr in arch/i386/kernel/setup.c.

I'll do more testing and post a new version next week (which will
include the suggestions I received so far).

Catalin

2006-05-14 15:52:23

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Pekka Enberg wrote:
> On 5/13/06, Catalin Marinas <[email protected]> wrote:
>> This patch adds the base support for the kernel memory leak detector. It
>> traces the memory allocation/freeing in a way similar to the Boehm's
>> conservative garbage collector, the difference being that the orphan
>> pointers are not freed but only shown in /proc/memleak. Enabling this
>> feature would introduce an overhead to memory allocations.
>
> Hmm. How much is the overhead anyway?

An additional note - the patch can be addapted so that the
allocated/freed pointers are only added to a list and a background
thread updates the radix tree at a later time. With this approach, the
overhead to the memory allocations would be relatively small.

Catalin

2006-05-14 17:35:11

by Ingo Oeser

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Hi Catalin,

On Sunday, 14. May 2006 09:24, Catalin Marinas wrote:
> >> +#if 0
> >> + /* make some orphan pointers for testing */
> >> + kmalloc(32, GFP_KERNEL);
> >> + kmalloc(32, GFP_KERNEL);
> >> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
> >> + kmem_cache_alloc(pointer_cache, GFP_ATOMIC);
> >> + vmalloc(64);
> >> + vmalloc(64);
> >> +#endif
> >
> > Stuff for testing is nice, but do we have to add it to the kernel? - I
> > assume you are done testing :-)
> > We have waay too much code already in the kernel inside #if 0
>
> The best would be to test it using a loadable module but I did most of
> the work on an embedded ARM platform where it was much easier to add
> some code directly. The code will be cleaned up in subsequent versions.

Fair enough. Just put it in a seperate file,
add a Kconfig "TEST_MEMLEAK_DETECTOR" tristate option,
depending on "DEBUG_MEMLEAK" and adjust the Makefile accordingly.

That way we can activate it from time to time by loading that module
and you can test it by compiling a new kernel.

RCU and CRYPTO have similiar features.


Regards

Ingo Oeser

2006-05-14 17:42:51

by Ingo Oeser

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

Hi Catalin,

On Sunday, 14. May 2006 17:39, Catalin Marinas wrote:
> Yet another false positive can be caused by allocation of a size that
> differs from the structure's size (kmalloc(sizeof(struct...) + ...)) and
> kmemleak cannot properly determine the container_of aliases. One example
> is the platform_device_alloc function and a false positive is in
> add_pcspkr in arch/i386/kernel/setup.c.
>
> I'll do more testing and post a new version next week (which will
> include the suggestions I received so far).

While we are at it: How do you handle the encoding of
info into the lower bits of a pointer? For Boehm GC,
this was a major problem, AFAIR. At least the RT-Mutex code
does this. There are others, but I'm to lazy to grep now...

Regards

Ingo Oeser

2006-05-15 09:15:33

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

> For example, it should be able to detect
> leaks similar to those fixed recently by Jesper:

Does it or does it not?

> > What looks a bit dubious is how objects reuse is handled. You can't
> > distingush an reused object from an old leaked pointer.
>
> The reused objects are not reported as leaks as long as the tool finds a
> pointer to their address (or alias). The memleak_alloc hook is called in
> kmem_cache_alloc after the object was actually allocated by
> __cache_alloc. An object cannot be reused as long as it hasn't been
> previously freed via kmem_cache_free (and the corresponding hook,
> memleak_free, called). Kmemleak only reports allocated objects for which
> there is no way to determine their address that can later be used in a
> kmem_cache_free call.

My point was that if you changed slab to be a queue and not
reuse objects that quickly you could likely find many more leaks with
your patch. It would make the patch more intrusive though and
slow slab down, so it would need to be ifdefed.

Another possibly less intrusive approach would be to use ioremap()
for all slab objects and hack __pa/__va to resolve it. Then you
would get unique addresses. You might need to increase the vmalloc
space on 32bit though. And ioremap again would need to be changed
to cycle through the address space, not reuse quickly (but it is
much simpler than slab and wouldn't be very difficult)

Using ioremap like this also has the advantage that use-after-free can
be easily detected.

-Andi

2006-05-15 10:09:24

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

On 15/05/06, Andi Kleen <[email protected]> wrote:
> > For example, it should be able to detect
> > leaks similar to those fixed recently by Jesper:
>
> Does it or does it not?

It does in most of the cases. Take for example
http://lkml.org/lkml/2006/5/13/145. KMemLeak detects it if the leak
happened, i.e. parse_num32 returns a non-zero value and kfree(name)
isn't present. When scanning the memory, the value of the name pointer
won't be found and therefore the block is shown as orphan.

I did similar tests to the above bug and it detected them in 90% of
the cases. However, I haven't done tests with objects reusing. I need
to write more comprehensive tests.

> > > What looks a bit dubious is how objects reuse is handled. You can't
> > > distingush an reused object from an old leaked pointer.
> >
> > The reused objects are not reported as leaks as long as the tool finds a
> > pointer to their address (or alias). The memleak_alloc hook is called in
> > kmem_cache_alloc after the object was actually allocated by
> > __cache_alloc. An object cannot be reused as long as it hasn't been
> > previously freed via kmem_cache_free (and the corresponding hook,
> > memleak_free, called). Kmemleak only reports allocated objects for which
> > there is no way to determine their address that can later be used in a
> > kmem_cache_free call.
>
> My point was that if you changed slab to be a queue and not
> reuse objects that quickly you could likely find many more leaks with
> your patch. It would make the patch more intrusive though and
> slow slab down, so it would need to be ifdefed.

I think I got it now. You mean that the value of a leaked pointer
could be found in other structures that previously used that object.
Anyway, I considered it a kernel bug if an invalid pointer remains in
a valid data structure. The latter would need to be freed as well and,
in this case, kmemleak won't scan it. I don't expect to have many
cases like this but, well, I haven't read the whole source.

> Another possibly less intrusive approach would be to use ioremap()
> for all slab objects and hack __pa/__va to resolve it. Then you
> would get unique addresses. You might need to increase the vmalloc
> space on 32bit though. And ioremap again would need to be changed
> to cycle through the address space, not reuse quickly (but it is
> much simpler than slab and wouldn't be very difficult)
>
> Using ioremap like this also has the advantage that use-after-free can
> be easily detected.

That's indeed an interesting idea. It has some difficulties as well
since you can only remap at the page level and a lot of virtual space
would be used (some way to compact consecutive blocks in one remapped
page would be needed).

Another improvement (but not enough) would be to zero the pointer
passed to kmem_cache_free (detecting at compilation time if it is
non-constant).

The tool can also be extended to track where the allocated blocks are
referred from and check whether those values are still in valid
structures after the blocks were freed.

Thanks for suggestions. They are really useful.

--
Catalin

2006-05-15 10:15:44

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

On 14/05/06, Ingo Oeser <[email protected]> wrote:
> On Sunday, 14. May 2006 09:24, Catalin Marinas wrote:
> > The best would be to test it using a loadable module but I did most of
> > the work on an embedded ARM platform where it was much easier to add
> > some code directly. The code will be cleaned up in subsequent versions.
>
> Fair enough. Just put it in a seperate file,
> add a Kconfig "TEST_MEMLEAK_DETECTOR" tristate option,
> depending on "DEBUG_MEMLEAK" and adjust the Makefile accordingly.

I'll do this since the patch needs a lot more testing to show that it
can actually find real bugs and also determine its limitations. I
posted this first version to see what people think about it but it is
far from the final version.

--
Catalin

2006-05-15 10:12:21

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

On 14/05/06, Ingo Oeser <[email protected]> wrote:
> While we are at it: How do you handle the encoding of
> info into the lower bits of a pointer? For Boehm GC,
> this was a major problem, AFAIR. At least the RT-Mutex code
> does this. There are others, but I'm to lazy to grep now...

I haven't looked at RT-Mutex but are more than the 2 bottom bits used
for this? If not, they can be masked out before look-up. The slab
allocator seems to always return blocks aligned to word size.

Thanks for pointing out.

--
Catalin

2006-05-15 11:53:03

by Avi Kivity

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

Catalin Marinas wrote:
> From: Catalin Marinas <[email protected]>
>
> There are allocations for which the main pointer cannot be found but they
> are not memory leaks. This patch fixes some of them.
>
>
> #include "util.h"
> @@ -389,6 +394,10 @@ void* ipc_rcu_alloc(int size)
> */
> if (rcu_use_vmalloc(size)) {
> out = vmalloc(HDRLEN_VMALLOC + size);
> +#ifdef CONFIG_DEBUG_MEMLEAK
> + /* avoid a false alarm. That's not a memory leak */
> + memleak_free(out);
> +#endif
>

Maybe add a function memleak_false_alarm() (which just calls
memleak_free()) to document the fact that nothing is actually freed.

--
error compiling committee.c: too many arguments to function

2006-05-15 18:35:36

by Ingo Oeser

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives

On Monday, 15. May 2006 12:12, Catalin Marinas wrote:
> I haven't looked at RT-Mutex but are more than the 2 bottom bits used
> for this?

No. Only 2 bits.

> If not, they can be masked out before look-up.

Yes, just do it that way, if it is possible.


Regards

Ingo Oeser

2006-05-26 08:59:04

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak


* Andi Kleen <[email protected]> wrote:

> > From: Catalin Marinas <[email protected]>
> >
> > This patch adds the base support for the kernel memory leak detector. It
> > traces the memory allocation/freeing in a way similar to the Boehm's
> > conservative garbage collector, the difference being that the orphan
> > pointers are not freed but only shown in /proc/memleak. Enabling this
> > feature would introduce an overhead to memory allocations.
>
> Interesting approach. Did you actually find any leaks with this?

we should be very careful here, since 'low hanging fruit' memory leaks
have already been clean-sweeped by a proprietary tool (Coverity), so the
utility of this free (GPL-ed) tool is artificially depressed!

What kmemleak will find are all the cases that Coverity does not check:
developer's own trees, uncommon architectures/drivers.

Also, kmemleak guarantees (assuming the implementation is correct) that
if a leak happens in practice, it will be detected immediately.
Coverity, being a static analyzer, wont find leaks that are obscured by
some sort of complex codepath.

All in one, i'm very much in favor of adding kmemleak to the upstream
kernel, once it gets clean enough and has seen some exposure on -mm.

Ingo

2006-05-26 09:41:28

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

On Friday 26 May 2006 10:59, Ingo Molnar wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > > From: Catalin Marinas <[email protected]>
> > >
> > > This patch adds the base support for the kernel memory leak detector. It
> > > traces the memory allocation/freeing in a way similar to the Boehm's
> > > conservative garbage collector, the difference being that the orphan
> > > pointers are not freed but only shown in /proc/memleak. Enabling this
> > > feature would introduce an overhead to memory allocations.
> >
> > Interesting approach. Did you actually find any leaks with this?
>
> we should be very careful here, since 'low hanging fruit' memory leaks
> have already been clean-sweeped by a proprietary tool (Coverity), so the
> utility of this free (GPL-ed) tool is artificially depressed!

It should be more. A real dynamic GC can find much larger
classes of leaks than even the best static checker.

iirc Coverty mostly just found missing free()s in error handling paths.
Which is surely nice, but in practice most of these error handling cases only occur
rarely if at all.

> What kmemleak will find are all the cases that Coverity does not check:
> developer's own trees, uncommon architectures/drivers.

> Also, kmemleak guarantees (assuming the implementation is correct) that
> if a leak happens in practice, it will be detected immediately.

Not if the slab object is reused quickly - which it often is.

That is why I think it would at least need some tweaks to slab to make
the reuse times longer.

>All in one, i'm very much in favor of adding kmemleak to the upstream
>kernel, once it gets clean enough and has seen some exposure on -mm.

Yes.

-Andi

2006-05-26 11:30:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak


* Andi Kleen <[email protected]> wrote:

> > Also, kmemleak guarantees (assuming the implementation is correct)
> > that if a leak happens in practice, it will be detected immediately.
>
> Not if the slab object is reused quickly - which it often is.

I dont see how slab object reuse could cause leak detection problems -
if something _is_ being freed, it's not a leak. What matters are the
objects that are 'forgotten' - and (at least statistically) kmemleak
should find them, because after some time all references to them go
away.

on 64-bit systems the statistical likelyhood of finding a leak could be
even increased by artificially relocating the kernel to a semi-random
base within the 64-bit address space. (that would mean that in practice
that all kernel pointers would be 'marked' with the top 28-32 bits that
are a good indicator of them being a kernel pointer)

Ingo

2006-05-26 16:37:33

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

On 26/05/06, Ingo Molnar <[email protected]> wrote:
> All in one, i'm very much in favor of adding kmemleak to the upstream
> kernel, once it gets clean enough and has seen some exposure on -mm.

I cleaned it and also tested it on SMP. I need to run a few more tests
on x86 (as I mainly work on ARM) and release a new version this
weekend.

A problem I'm facing (also because I'm not familiar with the other
architectures) is detecting the effective stack boundaries of the
threads running or waiting in kernel mode. Scanning the whole stack
(8K) hides some possible leaks (because of no longer used local
variables) and not scanning the list at all can lead to false
positives. I would need to investigate this a bit more.

--
Catalin

2006-05-26 17:47:10

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak


* Catalin Marinas <[email protected]> wrote:

> A problem I'm facing (also because I'm not familiar with the other
> architectures) is detecting the effective stack boundaries of the
> threads running or waiting in kernel mode. Scanning the whole stack
> (8K) hides some possible leaks (because of no longer used local
> variables) and not scanning the list at all can lead to false
> positives. I would need to investigate this a bit more.

i was thinking about this too, and i wanted to suggest a different
solution here: you could build a list of "temporary" objects that only
get registered with the memleak proper once a thread exits a system call
(or once a kernel thread goes back to its main loop). This means a
(lightweight) callback in the syscall exit (or irq exit) path. This way
you'd not have to scan kernel stacks at all, only .data and the objects
themselves.

the stack boundary rules can be quite complex: for example on x86_64 you
can have a pretty complex nesting of exception, interrupt and process
stacks. In fact on SMP we dont even know the precise stack boundary for
tasks that are running on some other CPU. [because we have no snapshot
of their register state]

avoiding the scanning of the kernel stacks gets rid of some of the
biggest source of natural entropy. (they contain strings and all sorts
of other binary data that could accidentally match up with a kernel
pointer)

Ingo

2006-05-26 21:50:03

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Hi Ingo,

Ingo Molnar wrote:
> * Catalin Marinas <[email protected]> wrote:
>>A problem I'm facing (also because I'm not familiar with the other
>>architectures) is detecting the effective stack boundaries of the
>>threads running or waiting in kernel mode. Scanning the whole stack
>>(8K) hides some possible leaks (because of no longer used local
>>variables) and not scanning the list at all can lead to false
>>positives. I would need to investigate this a bit more.
>
> i was thinking about this too, and i wanted to suggest a different
> solution here: you could build a list of "temporary" objects that only
> get registered with the memleak proper once a thread exits a system call
> (or once a kernel thread goes back to its main loop). This means a
> (lightweight) callback in the syscall exit (or irq exit) path. This way
> you'd not have to scan kernel stacks at all, only .data and the objects
> themselves.

That's an interesting approach. I'll first look at the level of false
positives without scanning the stacks at all.

> avoiding the scanning of the kernel stacks gets rid of some of the
> biggest source of natural entropy. (they contain strings and all sorts
> of other binary data that could accidentally match up with a kernel
> pointer)

Indeed, there is a quite high rate of false negatives (undetected leaks)
in my tests, especially on SMP, when scanning the stacks. However, I
haven't got any false positive when not scanning them (on an embedded
platform). I think even the false positives in this case would be mainly
temporary (until a moment of relative calm, i.e. most tasks sleeping).

On SMP there can be another issue - pointers kept in registers only -
leading to false positives. Anyway, I think it's more important to have
a few (temporary) false positives rather than missing real memory leaks.

Catalin

2006-05-26 22:02:00

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH 2.6.17-rc4 1/6] Base support for kmemleak

Ingo Molnar wrote:
> Also, kmemleak guarantees (assuming the implementation is correct) that
> if a leak happens in practice, it will be detected immediately.
> Coverity, being a static analyzer, wont find leaks that are obscured by
> some sort of complex codepath.

A good example is the skb allocation/freeing. I'm not sure Coverity is
able to track the code path in a protocol stack. I'll modify a network
driver to "forget" some skb freeing and test the kmemleak detection.

Catalin