2003-05-20 17:30:10

by Cliff White

Subject: re-aim - 2.5.69, -mm6



These are the results of running the Reaim test against the
2.5.69 and 2.5.69-mm6 kernels. The -mm kernels are a bit
slower (roughly 5% at peak load), and I'm wondering if I'm missing
a tuning knob somewhere. Advice appreciated.

Re-aim is a rework of the AIM suite (locations below).

Two data points I look at:
1. Maximum jobs per minute
2. Number of children when jobs/second/child drops below 1.0
(convergence)

The load is the new_dbase load (the AIM7 dbase load with less synchronous I/O).
The system is a 4-CPU PIII with 4GB of physical
memory; the test used 4 SCSI disks on a qlogicfc adapter.
The test is run with two different convergence methods, 3 runs each.
Peak load is the average of 3 runs; I pick the best results
regardless of convergence method.

Kernel                  Peak Load (Jobs/Minute)
2.5.69 - base           5216.68
2.5.69-mm6(AS)          4963.36
2.5.69-mm6(deadline)    4966.71
2.5.69-mm7(AS)          4966.86
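
To put a number on "a bit slower", the peak-load figures above can be turned into a percentage drop relative to the 2.5.69 base kernel. This is a minimal Python sketch; the function name slowdown_pct and the dictionary layout are illustrative only and are not part of Reaim.

```python
# Peak load in jobs per minute, copied from the table above.
PEAK_JPM = {
    "2.5.69 - base": 5216.68,
    "2.5.69-mm6(AS)": 4963.36,
    "2.5.69-mm6(deadline)": 4966.71,
    "2.5.69-mm7(AS)": 4966.86,
}


def slowdown_pct(base_jpm, test_jpm):
    """Percentage drop in throughput relative to the base kernel."""
    return (base_jpm - test_jpm) / base_jpm * 100.0


if __name__ == "__main__":
    base = PEAK_JPM["2.5.69 - base"]
    for kernel, jpm in PEAK_JPM.items():
        pct = slowdown_pct(base, jpm)
        print(f"{kernel:<22} {jpm:8.2f} JPM  {pct:5.2f}% below base")
```

On these figures, all three -mm configurations come out a little under 5% below base.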

Load when JPS/child < 1.0 (c_ times are total for all children)
Average of six runs

Kernel                  Children  JPM      RunTime  c_utime  c_systime
2.5.69 - base           88        5185.88  104.87   376.01   40.28
2.5.69-mm6(AS)          84        4894.73  106.08   374.91   45.54
2.5.69-mm6(deadline)    84        4858.36  106.90   376.55   46.41
2.5.69-mm7(AS)          84        4853.80  106.95   378.02   46.26
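
The convergence rows above can be sanity-checked by dividing JPM by 60 and then by the child count. The sketch below assumes jobs/second/child is simply JPM / 60 / children; that is a reading of the column names, not something taken from the Reaim source, and the helper name jps_per_child is made up for illustration.

```python
# (children, JPM) at convergence, copied from the table above.
CONVERGENCE = {
    "2.5.69 - base": (88, 5185.88),
    "2.5.69-mm6(AS)": (84, 4894.73),
    "2.5.69-mm6(deadline)": (84, 4858.36),
    "2.5.69-mm7(AS)": (84, 4853.80),
}


def jps_per_child(jpm, children):
    """Jobs per second per child, assuming JPS/child = JPM / 60 / children."""
    return jpm / 60.0 / children


if __name__ == "__main__":
    for kernel, (children, jpm) in CONVERGENCE.items():
        rate = jps_per_child(jpm, children)
        print(f"{kernel:<22} {children:3d} children  {rate:.3f} JPS/child")
```

Under that assumption every row lands just below the 1.0 threshold, which matches the stated convergence criterion.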


Attempting a second pass of -mm7 caused the hang reported earlier.
cliffw
OSDL

report details and profile data at:
http://www.osdl.org/archive/cliffw/reaim

Reaim code at:
http://sourceforge.net/projects/re-aim-7
or
bk://bk.osdl.org/aimrework


2003-05-20 19:36:16

by Andrew Morton

Subject: Re: re-aim - 2.5.69, -mm6

Cliff White wrote:
>
> This is the result of running the Reaim test against the
> 2.5.69 and 2.5.69-mm6 kernels. The -mm kernels are a bit
> slower, and i'm wondering if i'm missing a tuning knob somewhere..
> advice appreciated.

I can look into the slowdown. Could you please tell me exactly how you are
invoking the benchmark? Show me what commands you're using, so I can do
exactly the same thing.

> Attempting a second pass of -mm7 caused the hang reported earlier.

I have a bad feeling I won't be able to reproduce this. If you could
capture the output from a sysrq-T (or "echo t > /proc/sysrq-trigger") then
that would help a lot.

It could be a hole in the new dynamic request allocation code, or a driver
problem. Or something else.

2003-05-20 20:08:41

by Cliff White

Subject: Re: re-aim - 2.5.69, -mm6

> Cliff White wrote:
> >
> > This is the result of running the Reaim test against the
> > 2.5.69 and 2.5.69-mm6 kernels. The -mm kernels are a bit
> > slower, and i'm wondering if i'm missing a tuning knob somewhere..
> > advice appreciated.
>
> I can look into the slowdown. Could you please tell me exactly how you are
> invoking the benchmark? Show me what commands you're using, so I can do
> exactly the same thing.

For these runs, I'm using the STP wrap.sh; you can get that kit from STP CVS
if you need it.
wrap.sh does the disk setup and a few other things, then invokes the test.

The two runs are done like this (4-CPU machine):

For the maxjobs convergence:
./reaim -s4 -x -t -i4 -f workfile.new_dbase -r3 -b -lstp.config

For the 'quick' convergence:
./reaim -s4 -q -t -i4 -f workfile.new_dbase -r3 -b -lstp.config

stp.config has the pool sizes and the paths for the disk directories:
FILESIZE 80k
POOLSIZE 1024k
DISKDIR /mnt/disk1
DISKDIR /mnt/disk2
DISKDIR /mnt/disk3
DISKDIR /mnt/disk4
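
For anyone scripting around this file, the layout above is simple enough to read as whitespace-separated "KEY value" lines, with DISKDIR allowed to repeat. The following is only a sketch under that assumption; it is not Reaim's own parser, and the function name parse_stp_config is hypothetical.

```python
# Sample contents, copied from the stp.config listing above.
STP_CONFIG = """\
FILESIZE 80k
POOLSIZE 1024k
DISKDIR /mnt/disk1
DISKDIR /mnt/disk2
DISKDIR /mnt/disk3
DISKDIR /mnt/disk4
"""


def parse_stp_config(text):
    """Collect 'KEY value' lines into a dict of lists.

    Values are kept in lists because a key such as DISKDIR may repeat.
    Blank lines are skipped; no other syntax is assumed.
    """
    settings = {}
    for raw_line in text.splitlines():
        parts = raw_line.split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings.setdefault(key, []).append(value)
    return settings


if __name__ == "__main__":
    config = parse_stp_config(STP_CONFIG)
    print("file size:", config["FILESIZE"][0])
    print("pool size:", config["POOLSIZE"][0])
    print("disk dirs:", ", ".join(config["DISKDIR"]))
```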

Options:
-b suppresses stdout and creates an HTML version of the report; this
   is for STP and not necessary.
-r3 runs the test three times.
-t turns off the adaptive increment.

cliffw

>
> > Attempting a second pass of -mm7 caused the hang reported earlier.
>
> I have a bad feeling I won't be able to reproduce this. If you could
> capture the output from a sysrq-T (or "echo t > /proc/sysrq-trigger") then
> that would help a lot.
>
> It could be a hole in the new dynamic request allocation code, or a driver
> problem. Or something else.
>


2003-05-21 16:09:57

by Andrew Morton

Subject: Re: re-aim - 2.5.69, -mm6

Cliff White wrote:
>
> The two runs are done like this -> (4 cpu machine)
> ./reaim -s4 -x -t -i4 -f workfile.new_dbase -r3 -b -lstp.config -> for the
> maxjobs convergence
> ./reaim -s4 -q -t -i4 -f workfile.new_dbase -r3 -b -lstp.config -> for the
> 'quick' convergence
>
> stp.config has the poolsizes and path for disk directories:
> FILESIZE 80k
> POOLSIZE 1024k
> DISKDIR /mnt/disk1
> DISKDIR /mnt/disk2
> DISKDIR /mnt/disk3
> DISKDIR /mnt/disk4

Well I spent a few hours running this on the quad xeon (aic7xxx).

There were no hangs, and there was no appreciable performance difference
between 2.5.69, 2.5.69-mm7++ with AS and 2.5.69-mm7++ with deadline.

Please confirm that the hang only happened with the anticipatory scheduler?

It could require a particular device driver to reproduce. Please see if
you can generate that sysrq-T output. Also if you can try a different
device driver sometime that would be interesting. There seem to be several
alternate ISP drivers around - the feral driver perhaps, and the new one in
the linux-scsi tree.

2003-05-21 17:26:30

by Cliff White

Subject: Re: re-aim - 2.5.69, -mm6

> Cliff White wrote:
> >
> > The two runs are done like this -> (4 cpu machine)
> > ./reaim -s4 -x -t -i4 -f workfile.new_dbase -r3 -b -lstp.config -> for the
> > maxjobs convergence
> > ./reaim -s4 -q -t -i4 -f workfile.new_dbase -r3 -b -lstp.config -> for the
> > 'quick' convergence
> >
> > stp.config has the poolsizes and path for disk directories:
> > FILESIZE 80k
> > POOLSIZE 1024k
> > DISKDIR /mnt/disk1
> > DISKDIR /mnt/disk2
> > DISKDIR /mnt/disk3
> > DISKDIR /mnt/disk4
>
> Well I spent a few hours running this on the quad xeon (aic7xxx).
>
> There were no hangs, and there was no appreciable performance difference
> between 2.5.69, 2.5.69-mm7++ with AS and 2.5.69-mm7++ with deadline.
>
> Please confirm that the hang only happened with the anticipatory scheduler?
Yes. Those are the only hangs.
>
> It could require a particular device driver to reproduce. Please see if
> you can generate that sysrq-T output. Also if you can try a different
> device driver sometime that would be interesting. There seem to be several
> alternate ISP drivers around - the feral driver perhaps, and the new one in
> the linux-scsi tree.

Okay, I have been using qlogicfc, but there are others.
OSDL is moving this weekend, so it'll be a bit before I have a machine up.
cliffw
