2002-10-24 07:40:51

by Helge Hafting

Subject: [long]2.5.44-mm3 UP went into unexpected trashing

1 0 0 1632 7092 5020 80104 0 0 0 0 1007 235 0 0 99
0 0 0 1632 8056 5020 78824 0 0 0 0 1006 232 0 0 99
1 0 0 1632 7832 5020 78884 0 0 0 0 1008 232 0 0 99
0 0 0 1632 7496 5020 78884 0 0 0 0 1006 233 0 0 99
0 0 0 1632 7328 5020 78884 0 0 0 0 1006 235 0 0 99
1 0 0 1632 6992 5020 78884 0 0 0 0 1005 231 0 0 99
1 0 0 1632 8012 5020 77604 0 0 0 0 1007 230 1 0 99
1 0 0 1632 7564 5020 77736 0 0 0 0 1007 233 0 0 99
1 0 0 1632 7396 5020 77736 0 0 0 0 1005 235 0 0 99
1 0 0 1632 7060 5020 77736 0 0 0 0 1005 231 0 0 99
1 0 0 1632 8200 4952 76440 0 0 0 0 1007 232 0 0 99
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 0 1632 7636 4952 76720 0 0 1 0 1005 232 0 0 99
1 0 0 1632 7468 4952 76720 0 0 0 0 1005 231 0 0 99
1 0 1 1632 7132 4952 76720 0 0 0 0 1005 232 0 0 99
0 0 0 1632 8212 4808 75568 0 0 0 0 1005 232 0 0 99
1 0 0 1632 7404 4824 76028 0 0 2 0 1007 233 0 0 99
0 0 0 1632 7232 4824 76028 0 0 0 0 1005 235 0 0 99
0 0 0 1632 8028 4516 75184 0 0 0 0 1005 232 0 0 99
2 0 0 1632 7804 4516 75244 0 0 0 0 1007 231 0 0 99
0 0 0 1632 7468 4516 75244 0 0 0 0 1006 231 0 0 99
0 0 0 1632 7300 4516 75244 0 0 0 0 1007 231 0 0 99
0 0 0 1632 8036 4316 74332 0 0 0 0 1009 232 0 0 99
0 0 0 1632 7700 4316 74504 0 0 1 0 1006 232 0 0 99
1 0 0 1632 7364 4320 74508 0 0 0 0 1005 232 0 0 99
1 0 0 1632 7196 4324 74508 0 0 0 0 1005 232 0 0 99
0 0 0 1632 8104 4208 73348 0 0 0 0 1005 231 0 0 99
1 0 0 1632 7876 4212 73408 0 0 0 0 1007 232 0 0 99
1 0 0 1632 7484 4212 73408 0 0 0 0 1009 232 0 0 99
0 0 0 1632 7316 4212 73408 0 0 0 0 1008 230 1 0 99
0 0 0 1632 6980 4212 73408 0 0 0 0 1007 230 1 0 99
1 0 0 1632 8612 4208 71576 0 0 0 0 1005 233 0 0 99
1 0 0 1632 8216 4208 71712 0 0 0 0 1005 233 0 0 99
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
2 0 0 1632 8048 4208 71712 0 0 0 0 1007 231 0 0 99
0 0 0 1632 7712 4208 71712 0 0 0 0 1005 232 0 0 99
0 0 0 1632 7544 4208 71712 0 0 0 0 1006 232 0 0 99
1 0 1 1632 7208 4208 71716 0 0 0 0 1006 231 0 0 99
1 0 0 1632 8232 4144 70472 0 0 0 0 1005 232 0 0 99
0 0 0 1632 7784 4152 70608 0 0 0 0 1005 231 0 0 99
1 0 0 1632 7616 4152 70608 0 0 0 0 1007 231 0 0 99
1 0 0 1632 7224 4152 70608 0 0 0 0 1006 234 0 0 99
1 0 0 1632 7052 4152 70608 0 0 0 0 1005 233 0 0 99
0 0 0 1632 7868 3848 69776 0 0 0 0 1006 235 0 0 99
1 0 0 1632 7644 3848 69836 0 0 0 0 1007 232 0 0 99
0 0 0 1632 7308 3848 69836 0 0 0 0 1006 232 0 0 99
1 0 0 1632 7136 3848 69836 0 0 0 0 1007 232 0 0 99
0 0 0 1632 7936 3492 69032 0 0 0 0 1005 231 0 0 99
1 0 0 1632 7708 3492 69092 0 0 0 0 1006 231 0 0 99
0 0 0 1632 7372 3492 69096 0 0 0 0 1005 231 0 0 99
1 0 0 1632 7204 3492 69096 0 0 0 0 1005 234 0 0 99
0 0 0 1632 7964 3400 68076 0 0 0 0 1008 232 1 0 99
1 0 1 1632 7736 3404 68136 0 0 0 0 1005 233 0 0 99
1 0 0 1632 7400 3404 68136 0 0 0 0 1006 231 0 0 99
0 0 1 1632 7232 3404 68136 0 0 0 0 1008 233 0 0 99
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
2 0 0 1632 8088 3312 67020 0 0 0 0 1005 235 0 0 99
1 0 0 1632 7808 3312 67124 0 0 0 0 1006 232 0 0 99
1 0 0 1632 7468 3312 67124 0 0 0 0 1007 232 0 0 99
1 0 0 1632 7300 3312 67124 0 0 0 0 1006 234 0 0 99
1 0 0 1632 8100 3232 66024 0 0 0 0 1006 230 1 0 99
1 0 0 1632 7876 3232 66084 0 0 0 0 1006 231 0 0 99
1 0 0 1632 7536 3232 66088 0 0 0 0 1006 232 0 0 99
1 0 0 1632 7176 3248 66148 0 0 0 0 1007 233 0 0 99
2 0 0 2656 2984 5084 16120 0 3 325 29 1087 555 10 27 63
2 0 1 8600 3028 3416 7096 0 20 473 50 1113 651 8 27 66
1 0 0 14760 6704 1728 7448 4 22 99 24 1029 312 1 4 95
1 0 0 14712 7128 1732 7528 1 0 4 0 1011 234 0 0 99
0 0 0 14880 6820 1572 7652 1 1 3 1 1012 234 0 0 99
0 0 0 15144 7164 1560 7432 0 1 2 1 1008 233 0 0 99
1 0 0 15132 7064 1416 7492 1 0 3 0 1007 233 0 0 99
1 0 0 15520 7396 1304 7176 0 1 2 2 1007 234 0 0 99
0 0 0 15532 7748 1200 6980 1 0 3 0 1010 235 0 0 99
1 0 0 15532 6964 1200 7400 1 0 2 0 1008 234 0 0 99
1 0 0 15516 6196 972 6992 5 0 6 0 1007 233 0 0 99
1 0 0 15460 7272 912 6744 0 0 1 0 1008 233 0 0 99
0 0 1 28136 3472 172 10340 64 51 188 54 1040 413 8 2 90
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
3 0 1 32704 3012 1040 9736 20 18 344 18 1049 416 6 2 92
1 0 1 40624 3568 788 6844 46 32 3275 39 1324 1367 46 21 33
1 0 1 43972 4104 188 6148 290 97 1964 100 1225 968 41 20 39
0 5 1 46240 2816 108 4584 102 25 157 29 1069 533 6 3 91
2 1 0 53952 3020 320 5880 285 131 634 175 1120 594 33 15 53


Attachments:
slabinfo (12.95 kB)
topm (1.14 kB)
vmstat1min (6.83 kB)
vmstat5min (6.71 kB)

2002-10-24 07:52:00

by Andrew Morton

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

Helge Hafting wrote:
>
> 2.5.44-mm3 just began thrashing. I ran debsums -s in order
> to verify installed packages. This checksums all
> binaries. Things got a little sluggish, but it finished.
>
> Then I started a compile for 2.5.44-mm4, and the machine
> became so hopeless that I stopped the compile. That
> hasn't happened before; I have 256M.
>
> Looking at a running vmstat I saw lots of swapping,
> almost no free memory (as usual) but _cache_
> was down to 4004 too!

Memory leak.


> dentry_cache 1174824 1174824 140 41958 41958 1 : 248 124 : 1174824 1174923 41958 0 0 0 276 : 1131352 43467 0 0

160 megabytes of dcache.

Hopefully the rcu fix in -mm4 will cure this.

2002-10-24 08:11:22

by Andrew Morton

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

Andrew Morton wrote:
>
> Hopefully the rcu fix in -mm4 will cure this.

Oh. It was in -mm3 too. But something went wrong with the
dcache shrinking there.

2002-10-24 09:23:10

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 08:22:07AM +0000, Andrew Morton wrote:
> Andrew Morton wrote:
> >
> > Hopefully the rcu fix in -mm4 will cure this.
>
> Oh. It was in -mm3 too. But something went wrong with the
> dcache shrinking there.

Hmm.. the thing to do here would be to look at the output of
cat /proc/sys/fs/dentry-state. The number of dentries in the system should
tally with the dentry slab. If it doesn't, it might be an RCU issue, in which
case I would like to look at /proc/rcu. If it isn't an RCU issue, then we
need to do some more digging.

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

2002-10-24 11:27:49

by Maneesh Soni

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 08:22:07AM +0000, Andrew Morton wrote:
> Andrew Morton wrote:
> >
> > Hopefully the rcu fix in -mm4 will cure this.
>
> Oh. It was in -mm3 too. But something went wrong with the
> dcache shrinking there.

Backing out larger-cpu-masks.patch fixes this in -mm3, so -mm4 should not have
this problem. Basically, callbacks are not getting processed due to an incorrect
rcu_cpu_mask.

Maneesh


--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5044999 email: [email protected]
http://lse.sourceforge.net/

2002-10-24 11:48:29

by Ed Tomlinson

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

Maneesh Soni wrote:

> On Thu, Oct 24, 2002 at 08:22:07AM +0000, Andrew Morton wrote:
>> Andrew Morton wrote:
>> >
>> > Hopefully the rcu fix in -mm4 will cure this.
>>
>> Oh. It was in -mm3 too. But something went wrong with the
>> dcache shrinking there.
>
> Backing out larger-cpu-masks.patch fixes this in -mm3 so, -mm4 should not
> give this problem. Basically callbacks are not getting processed due to
> incorrect rcu_cpu_mask.

Would this affect UP systems? I had the dentry leak on a UP box with 512M of
memory. About 400M ended up in unfreeable dentries...

Ed Tomlinson

2002-10-24 12:26:11

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 12:01:31PM +0000, Ed Tomlinson wrote:
> Maneesh Soni wrote:
> >> Oh. It was in -mm3 too. But something went wrong with the
> >> dcache shrinking there.
> >
> > Backing out larger-cpu-masks.patch fixes this in -mm3 so, -mm4 should not
> > give this problem. Basically callbacks are not getting processed due to
> > incorrect rcu_cpu_mask.
>
> Would this affect UP systems? Had the dentry leak on a UP box with 512m
> memory. About 400m ended up in unfreeable dentries...

It does affect UP systems.

A quick look at /proc/rcu on a leaky system indicated that, despite a batch
of RCU callbacks being queued, the batch is somehow not getting started.

 /* Fake initialization required by compiler */
@@ -106,10 +106,11 @@ static void rcu_start_batch(long newbatc
 		rcu_ctrlblk.maxbatch = newbatch;
 	}
 	if (rcu_batch_before(rcu_ctrlblk.maxbatch, rcu_ctrlblk.curbatch) ||
-	    (rcu_ctrlblk.rcu_cpu_mask != 0)) {
+	    (find_first_bit(rcu_ctrlblk.rcu_cpu_mask, NR_CPUS) != NR_CPUS)) {
 		return;
 	}
-	rcu_ctrlblk.rcu_cpu_mask = cpu_online_map;
+	memcpy(rcu_ctrlblk.rcu_cpu_mask, cpu_online_map,
+	       sizeof(rcu_ctrlblk.rcu_cpu_mask));
 }

Either find_first_bit() is not returning NR_CPUS when the bitmask has no
bit set or memcpy is not working on the UP version of cpu_online_map. Will
dig a little bit more.

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

2002-10-24 12:29:10

by Maneesh Soni

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 07:47:39AM -0400, Ed Tomlinson wrote:
> Maneesh Soni wrote:
>
> > On Thu, Oct 24, 2002 at 08:22:07AM +0000, Andrew Morton wrote:
> >> Andrew Morton wrote:
> >> >
> >> > Hopefully the rcu fix in -mm4 will cure this.
> >>
> >> Oh. It was in -mm3 too. But something went wrong with the
> >> dcache shrinking there.
> >
> > Backing out larger-cpu-masks.patch fixes this in -mm3 so, -mm4 should not
> > give this problem. Basically callbacks are not getting processed due to
> > incorrect rcu_cpu_mask.
>
> Would this affect UP systems? Had the dentry leak on a UP box with 512m
> memory. About 400m ended up in unfreeable dentries...

Actually, the problem does not appear on an SMP kernel (with the default
NR_CPUS) because it gets a proper rcu_cpu_mask; on UP the mask is not correct.

Maneesh


--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5044999 email: [email protected]
http://lse.sourceforge.net/

2002-10-24 15:18:57

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 06:08:09PM +0530, Dipankar Sarma wrote:
> On Thu, Oct 24, 2002 at 12:01:31PM +0000, Ed Tomlinson wrote:
> > Would this affect UP systems? Had the dentry leak on a UP box with 512m
> > memory. About 400m ended up in unfreeable dentries...
>
> It does affect UP systems.
>
> A quick look at /proc/rcu in a leaky system indicated that somehow
> despite having a batch of RCUs, they are not getting started.
>
> /* Fake initialization required by compiler */
> @@ -106,10 +106,11 @@ static void rcu_start_batch(long newbatc
> rcu_ctrlblk.maxbatch = newbatch;
> }
> if (rcu_batch_before(rcu_ctrlblk.maxbatch, rcu_ctrlblk.curbatch) ||
> - (rcu_ctrlblk.rcu_cpu_mask != 0)) {
> + (find_first_bit(rcu_ctrlblk.rcu_cpu_mask, NR_CPUS) != NR_CPUS)) {
> return;
> }
> - rcu_ctrlblk.rcu_cpu_mask = cpu_online_map;
> + memcpy(rcu_ctrlblk.rcu_cpu_mask, cpu_online_map,
> + sizeof(rcu_ctrlblk.rcu_cpu_mask));
> }
>
> Either find_first_bit() is not returning NR_CPUS when the bitmask has no
> bit set or memcpy is not working on the UP version of cpu_online_map. Will
> dig a little bit more.

OK, I think I know why this one didn't work.

If the bit_mask is 0, find_first_bit() returns 32 or BITS_PER_LONG.
That works fine as long as NR_CPUS is 32, but when it isn't things
are broken.

(find_first_bit(rcu_ctrlblk.rcu_cpu_mask, NR_CPUS) != BITS_PER_LONG)) {
return;

should probably work here.

I guess we need to audit all bitmask tests and fix them to check for
the right value.

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

2002-10-24 15:34:31

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Thu, Oct 24, 2002 at 09:01:05PM +0530, Dipankar Sarma wrote:
> OK, I think I know why this one didn't work.
>
> If the bit_mask is 0, find_first_bit() returns 32 or BITS_PER_LONG.
> That works fine as long as NR_CPUS is 32, but when it isn't things
> are broken.
>
> (find_first_bit(rcu_ctrlblk.rcu_cpu_mask, NR_CPUS) != BITS_PER_LONG)) {
> return;
>
> should probably work here.
>
> I guess we need to audit all bitmask tests and fix them to check for
> the right value.

Argh!! I spoke too soon.

AFAICS, find_first_bit() needs to be fixed to return "size" if the
bitmask is all zeros.

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

2002-10-25 00:16:56

by Rusty Russell

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

In message <[email protected]> you write:
> AFAICS, find_first_bit() needs to be fixed to return "size" if the
> bitmask is all zeros.

Yes, the x86 one looks wrong. Other archs seem to get this correct.

Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-10-25 12:14:50

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

On Fri, Oct 25, 2002 at 09:35:17AM +1000, Rusty Russell wrote:
> In message <[email protected]> you write:
> > AFAICS, find_first_bit() needs to be fixed to return "size" if the
> > bitmask is all zeros.
>
> Yes, the x86 one looks wrong. Other archs seem to get this correct.
>

I tested this code in userspace and it seems to do the right thing: return
"size" if no bit is set. But I can't find a larger_cpu_mask patch to
test with -mm5. Should I forward port the one from -mm4 experimental or
is there a new version available somewhere?

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

bitops_fix.patch
----------------

diff -urN linux-2.5.44-mm5/include/asm-i386/bitops.h linux-2.5.44-mm5-fix/include/asm-i386/bitops.h
--- linux-2.5.44-mm5/include/asm-i386/bitops.h Sat Oct 19 09:32:01 2002
+++ linux-2.5.44-mm5-fix/include/asm-i386/bitops.h Fri Oct 25 15:19:54 2002
@@ -306,18 +306,23 @@
int res;

/* This looks at memory. Mark it volatile to tell gcc not to move it around */
- __asm__ __volatile__(
- "xorl %%eax,%%eax\n\t"
- "repe; scasl\n\t"
- "jz 1f\n\t"
- "leal -4(%%edi),%%edi\n\t"
- "bsfl (%%edi),%%eax\n"
- "1:\tsubl %%ebx,%%edi\n\t"
- "shll $3,%%edi\n\t"
- "addl %%edi,%%eax"
- :"=a" (res), "=&c" (d0), "=&D" (d1)
- :"1" ((size + 31) >> 5), "2" (addr), "b" (addr));
- return res;
+ __asm__ __volatile__(
+ "movl %%edi,%%esi\n\t"
+ "movl %%ecx,%%edx\n\t"
+ "repe; scasl\n\t"
+ "subl %%ecx,%%edx\n\t"
+ "movl %%edx,%%ecx\n\t"
+ "shll $5,%%edx\n\t"
+ "movl (%%esi,%%ecx,4),%%edi\n\t"
+ "movl %%ebx,%%eax\n\t"
+ "testl %%edi,%%edi\n\t"
+ "jz 1f\n\t"
+ "bsfl %%edi,%%eax\n\t"
+ "addl %%edx,%%eax\n\t"
+ "1:\t"
+ :"=a" (res), "=&c" (d0), "=&D" (d1)
+ :"1" ((size - 1) >> 5), "2" (addr), "b" (size));
+ return res;
}

/**

2002-10-26 15:42:19

by Dipankar Sarma

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

Well, my earlier find_first_bit() implementation was completely bogus.
My sanity has now returned and I coded the patch below, which fixes
find_first_bit() to return "size" if all bits are zero. I have tested it
extensively in userspace and it boots 2.5.44-mm5, which crashed with the earlier
version of the bitops_fix patch. I have coded the assembly routine
as optimally as I could, without introducing any new
branches or memory loads.

Along with this patch, I applied the larger_cpu_mask patch to -mm5
and sanity tested both UP and SMP kernels for dcache leaks on a 4-CPU P3 box.
An ls -lR and subsequent unmounting of that filesystem showed that
the dentries were correctly getting returned to the dcache slab, which
indicates that the larger_cpu_mask patch no longer breaks RCU.
I will do some more testing with this combination later with
rcu_stats applied on this tree (just to be sure), but so far it looks good.

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


bitops_fix.patch
-----------------

diff -urN linux-2.5.44-mm5/include/asm-i386/bitops.h linux-2.5.44-mm5-fix/include/asm-i386/bitops.h
--- linux-2.5.44-mm5/include/asm-i386/bitops.h Sat Oct 19 09:32:01 2002
+++ linux-2.5.44-mm5-fix/include/asm-i386/bitops.h Sat Oct 26 17:52:09 2002
@@ -311,12 +311,13 @@
"repe; scasl\n\t"
"jz 1f\n\t"
"leal -4(%%edi),%%edi\n\t"
- "bsfl (%%edi),%%eax\n"
- "1:\tsubl %%ebx,%%edi\n\t"
+ "bsfl (%%edi),%%edx\n"
+ "subl %%ebx,%%edi\n\t"
"shll $3,%%edi\n\t"
- "addl %%edi,%%eax"
+ "addl %%edi,%%edx\n\t"
+ "1:\tmovl %%edx,%%eax\n\t"
:"=a" (res), "=&c" (d0), "=&D" (d1)
- :"1" ((size + 31) >> 5), "2" (addr), "b" (addr));
+ :"1" ((size + 31) >> 5), "2" (addr), "b" (addr), "d" (size));
return res;
}

2002-10-29 12:51:49

by Ed Tomlinson

Subject: Re: [long]2.5.44-mm3 UP went into unexpected trashing

Dipankar Sarma wrote:

> Well, my earlier find_first_bit() implementation was completely bogus.
> My sanity has now returned and I coded this patch below that fixes
> find_first_bit() to return "size" if all bits are zero. I have tested it
> extensively in userspace and it boots 2.5.44-mm5 which crashed with the
> earlier version of the bitops_fix patch. I have coded the assembly routine
> as optimal as I could think of and without introducing any new
> branches or memory loads.
>
> Along with this patch, I applied the larger_cpu_mask patch to -mm5
> and sanity tested both UP and SMP kernels for dcache leaks in a 4CPU P3
> box. An ls -lR and subsequent unmounting of that filesystem showed that
> the dentries were correctly getting returned to the dcache slab and
> that indicates that the larger_cpu_mask patch no longer breaks RCU.
> I will do some more testing with this combination later with
> rcu_stats applied on this tree (just to be sure), but so far it looks
> good.

-mm5 with this patch is working fine here.

Thanks
Ed Tomlinson