2002-06-17 13:14:43

by Randy Hron

[permalink] [raw]
Subject: Re: [PATCH] 2.5.21 IDE 91

>> tiobench.pl --size 2048 --numruns 3 --threads 128 # 384 MB ram in machine

> From your trace it would seem that writeout completion has not
> occurred against one or more pages. Could be that the device
> driver lost an interrupt, or it failed to deliver completion
> for one or more BIO segments, or something screwed up at the
> VFS level.

> I am (of course ;)) disinclined to believe the latter, mainly
> because of the amount of testing I do here. Eight-hour Cerberus
> runs on quad CPU, five IDE disks and six SCSI disks all chugging
> along, no probs.

> Is it reproducible? Are you able to try it on a different machine?
> On other disks in the same machine?

tiobench had a similar livelock in 2.5.19, 2.5.19 + 2 versions of
wli's lazy-buddy allocator, 2.5.20, 2.5.21, and 2.5.21 with the
request queue size change we talked about. However, when I tried
it just now on a different disk and filesystem type (reiser instead
of ext2) it didn't happen. The non-reproduced livelock didn't
have any of the previous stress testing run against the machine
though.

Since the tiobench livelock appeared in 2.5.19, the following
kernels have completed all tests.

2.4.19-pre10
2.5.20-dj3
2.4.19-pre10-ac2
2.4.19-pre10-aa1
2.5.20-dj4
2.4.19-pre10-jam2
2.4.19-pre10-jam2-O2
2.4.19-pre10-jam2-O3
2.4.18-cmpr-cache
2.4.19-pre10-mjc1

2.5.22 is out now and I'm trying that. If it has a problem, I'm
inclined to rebuild the ext2 that tiobench, osdb, and dbench run
on. (may not help, but it won't hurt :)

--
Randy Hron