Hi Christoph,
I've recently worked on reviving kmemleak (to be posted on LKML this
week) and tried the slub allocator. I got the following report of orphan
objects:
unreferenced object 0xdf80f180 (size 32):
comm "swapper", pid 1, jiffies 4294937343
backtrace:
[<c0082f44>] memleak_alloc
[<c0080ff0>] __kmalloc
[<c019edac>] proto_register
[<c001640c>] inet_init
[<c001e30c>] do_one_initcall
[<c0008400>] kernel_init
[<c0035aa8>] do_exit
[<ffffffff>]
unreferenced object 0xdf800840 (size 16):
comm "swapper", pid 1, jiffies 4294937343
backtrace:
[<c0082f44>] memleak_alloc
[<c0080ff0>] __kmalloc
[<c019ee3c>] proto_register
[<c001640c>] inet_init
[<c001e30c>] do_one_initcall
[<c0008400>] kernel_init
[<c0035aa8>] do_exit
[<ffffffff>]
The proto_register() function in net/core/sock.c allocates
request_sock_slab_name and timewait_sock_slab_name to generate the cache
names passed to kmem_cache_create(). However, kmem_cache_create() in
mm/slub.c takes the find_mergeable() path and reuses an existing cache
without setting s->name to the newly allocated pointers. Nothing
references them afterwards, so kmemleak reports them as orphans.
It could be worse: proto_unregister() frees the pointer returned by
kmem_cache_name(), which, because of the merging, is not the block that
proto_register() allocated.
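To make the aliasing concrete, here is a minimal userspace sketch of the
behaviour described above. It is not the SLUB code: the toy_* names are
hypothetical, and the model merges on object size alone, whereas the real
find_mergeable() applies more conditions than that.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct kmem_cache; only the fields needed here. */
struct toy_cache {
	const char *name;	/* name pointer recorded at creation time */
	size_t size;		/* object size, the only merge criterion here */
	int refcount;		/* number of creators sharing this cache */
};

#define TOY_MAX_CACHES 8

static struct toy_cache toy_caches[TOY_MAX_CACHES];
static int toy_nr_caches;

/*
 * Hypothetical model of kmem_cache_create() with merging: if a cache of
 * the same size already exists it is reused, and the caller's 'name'
 * pointer is not stored anywhere.
 */
struct toy_cache *toy_cache_create(const char *name, size_t size)
{
	int i;
	struct toy_cache *c;

	assert(name != NULL);

	for (i = 0; i < toy_nr_caches; i++) {
		if (toy_caches[i].size == size) {
			toy_caches[i].refcount++;
			return &toy_caches[i];	/* 'name' is dropped here */
		}
	}

	if (toy_nr_caches == TOY_MAX_CACHES)
		return NULL;

	c = &toy_caches[toy_nr_caches++];
	c->name = name;
	c->size = size;
	c->refcount = 1;
	return c;
}

/* Model of kmem_cache_name(): returns whatever the cache recorded. */
const char *toy_cache_name(const struct toy_cache *c)
{
	return c->name;
}
```

In this model, creating two caches of the same size returns the same
toy_cache, and toy_cache_name() yields the first caller's string. A second
caller that allocated its name and later frees the result of
toy_cache_name() would leak its own block and free someone else's.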
A solution could be to pass one of the SLUB_NEVER_MERGE bits to
kmem_cache_create() in proto_register(), though none of them has any
meaning for this situation. Alternatively, a new bit such as
SLAB_ALLOCATED_NAME could be defined to ensure that kmem_cache_name()
returns the pointer that was passed in.
Or just simplify proto_register() to no longer allocate memory for these
names, and state somewhere that kmem_cache_create() doesn't
necessarily save the pointer to the name.
Thanks.
--
Catalin
proto_register could add another field somewhere and store the pointer to
the name there? Then free the string on proto_unregister.
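A rough userspace sketch of that idea, assuming a hypothetical toy_ops
structure in place of the real ops structures and malloc()/free() in place
of kmalloc()/kfree(): the registrant keeps its own pointer to the name and
frees that pointer, never asking the cache for it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an ops structure that owns its cache name. */
struct toy_ops {
	int has_slab;		/* non-zero once the (imaginary) cache exists */
	char *slab_name;	/* owned here, freed on unregister */
};

/* Build "request_sock_<proto_name>" and remember the pointer ourselves. */
int toy_register(struct toy_ops *ops, const char *proto_name)
{
	static const char mask[] = "request_sock_%s";
	/* sizeof(mask) counts "%s" and the NUL, so this is large enough. */
	size_t len = strlen(proto_name) + sizeof(mask) - 1;

	ops->slab_name = malloc(len);
	if (ops->slab_name == NULL)
		return -1;

	snprintf(ops->slab_name, len, mask, proto_name);
	ops->has_slab = 1;	/* the cache would be created here */
	return 0;
}

/* Free the name through our own pointer, not through the cache. */
void toy_unregister(struct toy_ops *ops)
{
	if (!ops->has_slab)
		return;

	ops->has_slab = 0;	/* the cache would be destroyed here */
	free(ops->slab_name);
	ops->slab_name = NULL;
}
```

Because the pointer lives in the structure that allocated it, the free is
correct whether or not the allocator merged the cache with another one.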
On Wed, 2008-11-19 at 13:22 -0600, Christoph Lameter wrote:
> proto_register could add another field somewhere and store the pointer to
> the name there? Then free the string on proto_unregister.
The patch below fixes this issue for proto_register. There is another
similar case in net/dccp/ccid.c. I cc'ed the person who added the
original proto_register code and he also seems to be the DCCP
maintainer.
My point is that the API is slightly different when slub is used since
kmem_cache_name is no longer guaranteed to return the same pointer
passed to kmem_cache_create. Maybe a documentation update:
diff --git a/mm/slab.c b/mm/slab.c
index ea76bcb..9723a72 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2124,6 +2124,8 @@ static int __init_refok setup_cpu_cache(struct kmem_cache
*
* @name must be valid until the cache is destroyed. This implies that
* the module calling this has to destroy the cache before getting unloaded.
+ * Note that kmem_cache_name() is not guaranteed to return the same pointer,
+ * therefore applications must manage it themselves.
*
* The flags are
*
And the proto_register fix below (if it looks alright, I'll submit it
separately):
Fix memory leak in the proto_register function
From: Catalin Marinas <[email protected]>
If the slub allocator is used, kmem_cache_create() may merge two or more
kmem_cache's into one but the cache name pointer is not updated and
kmem_cache_name() is no longer guaranteed to return the pointer passed
to the former function. This patch stores the kmalloc'ed pointers in the
corresponding request_sock_ops and timewait_sock_ops structures.
Signed-off-by: Catalin Marinas <[email protected]>
---
include/net/request_sock.h | 1 +
include/net/timewait_sock.h | 1 +
net/core/sock.c | 31 ++++++++++++-------------------
3 files changed, 14 insertions(+), 19 deletions(-)
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index cac811e..c719084 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -31,6 +31,7 @@ struct request_sock_ops {
int family;
int obj_size;
struct kmem_cache *slab;
+ char *slab_name;
int (*rtx_syn_ack)(struct sock *sk,
struct request_sock *req);
void (*send_ack)(struct sock *sk, struct sk_buff *skb,
diff --git a/include/net/timewait_sock.h b/include/net/timewait_sock.h
index 1e1ee32..97c3b14 100644
--- a/include/net/timewait_sock.h
+++ b/include/net/timewait_sock.h
@@ -16,6 +16,7 @@
struct timewait_sock_ops {
struct kmem_cache *twsk_slab;
+ char *twsk_slab_name;
unsigned int twsk_obj_size;
int (*twsk_unique)(struct sock *sk,
struct sock *sktw, void *twp);
diff --git a/net/core/sock.c b/net/core/sock.c
index 5e2a313..b7300af 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2037,9 +2037,6 @@ static inline void release_proto_idx(struct proto *prot)
int proto_register(struct proto *prot, int alloc_slab)
{
- char *request_sock_slab_name = NULL;
- char *timewait_sock_slab_name;
-
if (alloc_slab) {
prot->slab = kmem_cache_create(prot->name, prot->obj_size, 0,
SLAB_HWCACHE_ALIGN, NULL);
@@ -2053,12 +2050,12 @@ int proto_register(struct proto *prot, int alloc_slab)
if (prot->rsk_prot != NULL) {
static const char mask[] = "request_sock_%s";
- request_sock_slab_name = kmalloc(strlen(prot->name) + sizeof(mask) - 1, GFP_KERNEL);
- if (request_sock_slab_name == NULL)
+ prot->rsk_prot->slab_name = kmalloc(strlen(prot->name) + sizeof(mask) - 1, GFP_KERNEL);
+ if (prot->rsk_prot->slab_name == NULL)
goto out_free_sock_slab;
- sprintf(request_sock_slab_name, mask, prot->name);
- prot->rsk_prot->slab = kmem_cache_create(request_sock_slab_name,
+ sprintf(prot->rsk_prot->slab_name, mask, prot->name);
+ prot->rsk_prot->slab = kmem_cache_create(prot->rsk_prot->slab_name,
prot->rsk_prot->obj_size, 0,
SLAB_HWCACHE_ALIGN, NULL);
@@ -2072,14 +2069,14 @@ int proto_register(struct proto *prot, int alloc_slab)
if (prot->twsk_prot != NULL) {
static const char mask[] = "tw_sock_%s";
- timewait_sock_slab_name = kmalloc(strlen(prot->name) + sizeof(mask) - 1, GFP_KERNEL);
+ prot->twsk_prot->twsk_slab_name = kmalloc(strlen(prot->name) + sizeof(mask) - 1, GFP_KERNEL);
- if (timewait_sock_slab_name == NULL)
+ if (prot->twsk_prot->twsk_slab_name == NULL)
goto out_free_request_sock_slab;
- sprintf(timewait_sock_slab_name, mask, prot->name);
+ sprintf(prot->twsk_prot->twsk_slab_name, mask, prot->name);
prot->twsk_prot->twsk_slab =
- kmem_cache_create(timewait_sock_slab_name,
+ kmem_cache_create(prot->twsk_prot->twsk_slab_name,
prot->twsk_prot->twsk_obj_size,
0, SLAB_HWCACHE_ALIGN,
NULL);
@@ -2095,14 +2092,14 @@ int proto_register(struct proto *prot, int alloc_slab)
return 0;
out_free_timewait_sock_slab_name:
- kfree(timewait_sock_slab_name);
+ kfree(prot->twsk_prot->twsk_slab_name);
out_free_request_sock_slab:
if (prot->rsk_prot && prot->rsk_prot->slab) {
kmem_cache_destroy(prot->rsk_prot->slab);
prot->rsk_prot->slab = NULL;
}
out_free_request_sock_slab_name:
- kfree(request_sock_slab_name);
+ kfree(prot->rsk_prot->slab_name);
out_free_sock_slab:
kmem_cache_destroy(prot->slab);
prot->slab = NULL;
@@ -2125,18 +2122,14 @@ void proto_unregister(struct proto *prot)
}
if (prot->rsk_prot != NULL && prot->rsk_prot->slab != NULL) {
- const char *name = kmem_cache_name(prot->rsk_prot->slab);
-
kmem_cache_destroy(prot->rsk_prot->slab);
- kfree(name);
+ kfree(prot->rsk_prot->slab_name);
prot->rsk_prot->slab = NULL;
}
if (prot->twsk_prot != NULL && prot->twsk_prot->twsk_slab != NULL) {
- const char *name = kmem_cache_name(prot->twsk_prot->twsk_slab);
-
kmem_cache_destroy(prot->twsk_prot->twsk_slab);
- kfree(name);
+ kfree(prot->twsk_prot->twsk_slab_name);
prot->twsk_prot->twsk_slab = NULL;
}
}
--
Catalin
Em Thu, Nov 20, 2008 at 09:51:50AM +0000, Catalin Marinas escreveu:
> On Wed, 2008-11-19 at 13:22 -0600, Christoph Lameter wrote:
> > proto_register could add another field somewhere and store the pointer to
> > the name there? Then free the string on proto_unregister.
>
> The patch below fixes this issue for proto_register. There is another
> similar case in net/dccp/ccid.c. I cc'ed the person who added the
> original proto_register code and he also seems to be the DCCP
> maintainer.
>
> My point is that the API is slightly different when slub is used since
> kmem_cache_name is no longer guaranteed to return the same pointer
> passed to kmem_cache_create. Maybe a documentation update:
>
> diff --git a/mm/slab.c b/mm/slab.c
> index ea76bcb..9723a72 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2124,6 +2124,8 @@ static int __init_refok setup_cpu_cache(struct kmem_cache
> *
> * @name must be valid until the cache is destroyed. This implies that
> * the module calling this has to destroy the cache before getting unloaded.
> + * Note that kmem_cache_name() is not guaranteed to return the same pointer,
> + * therefore applications must manage it themselves.
> *
> * The flags are
> *
>
> And the proto_register fix below (if it looks alright, I'll submit it
> separately):
>
>
> Fix memory leak in the proto_register function
>
> From: Catalin Marinas <[email protected]>
>
> If the slub allocator is used, kmem_cache_create() may merge two or more
> kmem_cache's into one but the cache name pointer is not updated and
> kmem_cache_name() is no longer guaranteed to return the pointer passed
> to the former function. This patch stores the kmalloc'ed pointers in the
> corresponding request_sock_ops and timewait_sock_ops structures.
>
> Signed-off-by: Catalin Marinas <[email protected]>
Thanks, when I wrote this the only X in slXb was 'a' :-)
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
I think I have reviewed a patch like this before. Wonder where it went.
Reviewed-by: Christoph Lameter <[email protected]>
Hi Catalin,
On Thu, Nov 20, 2008 at 11:51 AM, Catalin Marinas
<[email protected]> wrote:
> My point is that the API is slightly different when slub is used since
> kmem_cache_name is no longer guaranteed to return the same pointer
> passed to kmem_cache_create. Maybe a documentation update:
>
> diff --git a/mm/slab.c b/mm/slab.c
> index ea76bcb..9723a72 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2124,6 +2124,8 @@ static int __init_refok setup_cpu_cache(struct kmem_cache
> *
> * @name must be valid until the cache is destroyed. This implies that
> * the module calling this has to destroy the cache before getting unloaded.
> + * Note that kmem_cache_name() is not guaranteed to return the same pointer,
> + * therefore applications must manage it themselves.
Yes, makes sense. Care to send a patch I can apply?
Pekka