2003-08-14 10:32:44

by Jozsef Kadlecsik

[permalink] [raw]
Subject: kmem_cache_destroy: Can't free all objects (2.6)

Hello,

I'm working on performance testing netfilter ip_conntrack. The tested
machine is a dual Xeon PC with Serverworks chipset and e1000 cards.

"Calibrating" without ip_conntrack loaded in went fine. However as
the ip_conntrack module was removed after a test, sometimes I got the
message:

slab error in kmem_cache_destroy(): cache `ip_conntrack': Can't free all
objects
Call Trace:
[<c0136fe9>] kmem_cache_destroy+0xd8/0x116
[<f889f3ba>] ip_conntrack_cleanup+0x43/0x6c [ip_conntrack]
[<f889d61a>] init_or_cleanup+0x131/0x190 [ip_conntrack]
[<f889fc13>] fini+0xf/0x19 [ip_conntrack]
[<c012dd76>] sys_delete_module+0x13c/0x169
[<c01402ab>] sys_munmap+0x45/0x66
[<c0108f9b>] syscall_call+0x7/0xb

I cannot reliably reproduce the error. Sometimes it comes at the
first test run, sometimes at say the 20th, but I have never been able to
run all the planned 150 tests.

This happens both with 2.5.74 and 2.6.0-test1.

Rusty suggested to apply the following little patch to capture allocs and
frees:

--- linux-2.6.0-ct/mm/slab.c.orig 2003-07-25 13:48:26.000000000
+0200
+++ linux-2.6.0-ct/mm/slab.c 2003-07-25 13:56:51.000000000 +0200
@@ -280,6 +280,7 @@
struct list_head next;

/* 5) statistics */
+ atomic_t alloc,free;
#if STATS
unsigned long num_active;
unsigned long num_allocations;
@@ -1389,6 +1390,10 @@
up(&cache_chain_sem);

if (__cache_shrink(cachep)) {
+ printk("Slab cache %s: %u allocs vs %u frees\n",
+ cachep->name,
+ atomic_read(&cachep->alloc),
+ atomic_read(&cachep->free));
slab_error(cachep, "Can't free all objects");
down(&cache_chain_sem);
list_add(&cachep->next,&cache_chain);
@@ -2022,7 +2027,10 @@
*/
void * kmem_cache_alloc (kmem_cache_t *cachep, int flags)
{
- return __cache_alloc(cachep, flags);
+ void *ret = __cache_alloc(cachep, flags);
+ if (ret)
+ atomic_inc(&cachep->alloc);
+ return ret;
}

/**
@@ -2119,6 +2127,7 @@
{
unsigned long flags;

+ atomic_inc(&cachep->free);
local_irq_save(flags);
__cache_free(cachep, objp);
local_irq_restore(flags);

The result with the patch applied is as follows:

Slab cache ip_conntrack: 2052678 allocs vs 2052678 frees
slab error in kmem_cache_destroy(): cache `ip_conntrack': Can't free all objects
Call Trace:
[<c0133554>] __slab_error+0x24/0x28
[<c013426d>] kmem_cache_destroy+0xa5/0x128
[<f889f562>] ip_conntrack_cleanup+0x4a/0x70 [ip_conntrack]
[<f889d5fa>] init_or_cleanup+0x16a/0x170 [ip_conntrack]
[<f889fd63>] fini+0x7/0xb [ip_conntrack]
[<c012b75e>] sys_delete_module+0x14a/0x16c
[<c013da30>] sys_munmap+0x38/0x58
[<c0108c27>] syscall_call+0x7/0xb

Slab debugging is enabled, no single message in the log. /proc/slabinfo
says (long lines wrapped):

slabinfo - version: 2.0 (statistics)
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
: tunables <batchcount> <limit <sharedfactor>
: slabdata <active_slabs> <num_slabs> <sharedavail>
: globalstat <listallocs> <maxobjs> <grown> <reaped> <error> <maxfreeable> <freelimit>
: cpustat <allochit <allocmiss <freehit <freemiss>
broken 128 2486 180 22 1
: tunables 32 16 8
: slabdata 113 113 128
: globalstat 664846 590238 26829 0 0 1 70
: cpustat 1 974008 78670 1990714 61964

Any hint or idea what can be the problem?

Please Cc answers to me, I'm not on the list.

Best regards,
Jozsef
-
E-mail : [email protected], [email protected]
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary


2003-08-14 14:04:52

by Philippe Elie

[permalink] [raw]
Subject: Re: kmem_cache_destroy: Can't free all objects (2.6)

Jozsef Kadlecsik wrote:
> Hello,
>
> I'm working on performance testing netfilter ip_conntrack. The tested
> machine is a dual Xeon PC with Serverworks chipset and e1000 cards.
>
> "Calibrating" without ip_conntrack loaded in went fine. However as
> the ip_conntrack module was removed after a test, sometimes I got the
> message:
>
> slab error in kmem_cache_destroy(): cache `ip_conntrack': Can't free all
> objects
> Call Trace:
> [<c0136fe9>] kmem_cache_destroy+0xd8/0x116
> [<f889f3ba>] ip_conntrack_cleanup+0x43/0x6c [ip_conntrack]
> [<f889d61a>] init_or_cleanup+0x131/0x190 [ip_conntrack]
> [<f889fc13>] fini+0xf/0x19 [ip_conntrack]
> [<c012dd76>] sys_delete_module+0x13c/0x169
> [<c01402ab>] sys_munmap+0x45/0x66
> [<c0108f9b>] syscall_call+0x7/0xb
>
> I cannot reliably reproduce the error. Sometimes it comes at the
> first test run, sometimes at say the 20th, but I have never been able to
> run all the planned 150 tests.
>
> This happens both with 2.5.74 and 2.6.0-test1.

I fixed a bug with exactly this symptom, all object freed but
the cache was retaining some object internally. I look 2.6.0-test1
and the fix is in, apologies if you already did it correctly
but can you double check the bug is really in 2.6.0-test1.

> Slab cache ip_conntrack: 2052678 allocs vs 2052678 frees


> slab error in kmem_cache_destroy(): cache `ip_conntrack': Can't free all objects

regards,
Philippe Elie


2003-08-14 19:33:53

by Jozsef Kadlecsik

[permalink] [raw]
Subject: Re: kmem_cache_destroy: Can't free all objects (2.6)

On Thu, 14 Aug 2003, Philippe Elie wrote:

> > This happens both with 2.5.74 and 2.6.0-test1.
>
> I fixed a bug with exactly this symptom, all object freed but
> the cache was retaining some object internally. I look 2.6.0-test1
> and the fix is in, apologies if you already did it correctly
> but can you double check the bug is really in 2.6.0-test1.

Yes. Before I posted my mail I googled a lot and found your fix. I
checked the source of 2.6.0-test1 and your fix is there of course.

Best regards,
Jozsef
-
E-mail : [email protected], [email protected]
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary

2003-08-15 12:04:27

by Philippe Elie

[permalink] [raw]
Subject: Re: kmem_cache_destroy: Can't free all objects (2.6)

--- mm/slab.c~ 2003-07-14 03:36:48.000000000 +0000
+++ mm/slab.c 2003-08-15 13:59:34.000000000 +0000
@@ -2349,7 +2349,7 @@
int tofree;

check_spinlock_acquired(cachep);
- if (ac->touched) {
+ if (ac->touched && !force) {
ac->touched = 0;
} else if (ac->avail) {
tofree = force ? ac->avail : (ac->limit+4)/5;


Attachments:
slab.diff (321.00 B)

2003-08-15 12:32:08

by Jozsef Kadlecsik

[permalink] [raw]
Subject: Re: kmem_cache_destroy: Can't free all objects (2.6)

On Fri, 15 Aug 2003, Philippe Elie wrote:

> Thanks, I wanted to be sure before digging in the source. Try
> the attached patch please.

Thank you, the patched kernel is up and the tests are started.
I'll report you the results.

Best regards,
Jozsef
-
E-mail : [email protected], [email protected]
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary

2003-08-16 14:20:37

by Jozsef Kadlecsik

[permalink] [raw]
Subject: Re: kmem_cache_destroy: Can't free all objects (2.6)

On Fri, 15 Aug 2003, Jozsef Kadlecsik wrote:

> > Thanks, I wanted to be sure before digging in the source. Try
> > the attached patch please.
>
> I'll report you the results.

With your patch applied, the problem went away!
Thank you very much indeed!

Best regards,
Jozsef
-
E-mail : [email protected], [email protected]
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary