2002-09-20 21:04:12

by Steven Cole

[permalink] [raw]
Subject: 2.5.37 lockup with dbench 36 and make -j3 bzImage and PREEMPT=y.

While running 2.5.37 with 36 dbench clients and doing a kernel compile
with make -j3 bzImage, my test machine locked up. Repeating the test,
the box ran up to 80 dbench clients and was doing make clean on a kernel
tree when the freeze occurred. (My dbench ramp-up test runs dbench with
1,2,3,4,6,8,10,12,16,20,24,28,32,36,40,44,48,52,56,64,80,96,112, and 128
clients). When the box froze, it did not even respond to SYSRQ-H (tested
OK before lockup). Each time, I gave the system a few minutes to come
out of its coma, but it didn't.

I also saw this with 2.5.36-bk3, patched with Robert's preempt-fix,
under similar circumstances (dbench 40 and parallel make).

Without doing a kernel compile at the same time, the text box was able
to run up to 128 dbench clients successfully with 2.5.37 and PREEMPT.

The 2.5.37 kernel was patched with this fix for PREEMPT:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103255294016473&w=2
and this fix for eepro100:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103253526526144&w=2

Both of those kernels were configured with SMP and PREEMPT.

The test box is dual PIII, 1GB, scsi, ext3 fs. The dbench runs and
kernel compiles were both run from the same disk.

I have been unable so far to repeat this failure with plain 2.5.37 and
no PREEMPT. Plain vanilla 2.5.37 no preempt has run up to 112 dbench
clients while doing make -j3 bzImage. At one point, the box appeared to
have frozen, with vmstat -n 1 3600 output stopped printing for a while,
but the system is still responsive to SYSRQ-H, P, T, just slow with
dbench 128 running.

Steven




2002-09-21 10:00:09

by Helge Hafting

[permalink] [raw]
Subject: Re: 2.5.37 lockup with dbench 36 and make -j3 bzImage and PREEMPT=y.

On 2002.09.20 23:05 Steven Cole wrote:
> While running 2.5.37 with 36 dbench clients and doing a kernel compile
> with make -j3 bzImage, my test machine locked up.

2.5.36 SMP no preempt locks up under similiar circumstances - a compile
with make -j 3 and moderate disk activity on 2 scsi disks makes it
freeze solid now and then. Perhaps this is some sort of SMP problem
also exposed by preempt?

I haven't tested this in 2.5.37 because refuses to run X.

Helge Hafting

2002-09-23 18:27:31

by Robert Love

[permalink] [raw]
Subject: Re: 2.5.37 lockup with dbench 36 and make -j3 bzImage and PREEMPT=y.

On Mon, 2002-09-23 at 12:09, Steven Cole wrote:

> Further testing with 2.5.37 over the weekend showed that the lockup also
> occurred without preempt here too.
>
> This seems to have been fixed in 2.5.38-mm2. I've run up to dbench 128
> while doing several kernel compiles with make -j3 with PREEMPT enabled,
> and have not observed any lockups at all.

Excellent. Thanks a lot for this update.

I appreciate your testing, Steven. Any future preempt-related lock ups
please let me know.

Robert Love

2002-09-23 18:45:45

by Steven Cole

[permalink] [raw]
Subject: Re: 2.5.37 lockup with dbench 36 and make -j3 bzImage and PREEMPT=y.

On Sat, 2002-09-21 at 04:05, Helge Hafting wrote:
> On 2002.09.20 23:05 Steven Cole wrote:
> > While running 2.5.37 with 36 dbench clients and doing a kernel compile
> > with make -j3 bzImage, my test machine locked up.
>
> 2.5.36 SMP no preempt locks up under similiar circumstances - a compile
> with make -j 3 and moderate disk activity on 2 scsi disks makes it
> freeze solid now and then. Perhaps this is some sort of SMP problem
> also exposed by preempt?
>
> I haven't tested this in 2.5.37 because refuses to run X.
>
> Helge Hafting

Further testing with 2.5.37 over the weekend showed that the lockup also
occurred without preempt here too.

This seems to have been fixed in 2.5.38-mm2. I've run up to dbench 128
while doing several kernel compiles with make -j3 with PREEMPT enabled,
and have not observed any lockups at all.

Steven