2001-07-27 21:12:21

by Roger Larsson

Subject: 2.4.8-pre1 and dbench -20% throughput

Hi all,

I have done some throughput testing again.
Streaming write, copy, read, and diff are almost identical to earlier 2.4 kernels.
(Note: 2.4.0 was clearly better when reading from two files - i.e. diff -
15.4 MB/s vs. around 11 MB/s with later kernels - but that could be a result of
disk layout too...)

But "dbench 32" (on my 256 MB box) results has are the most interesting:

2.4.0 gave 33 MB/s
2.4.8-pre1 gives 26.1 MB/s (-21%)

Do we now throw away pages that would be reused?

[I have also verified that mmap002 still works as expected]

/RogerL


2001-07-27 21:42:48

by Rik van Riel

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

On Fri, 27 Jul 2001, Roger Larsson wrote:

> But "dbench 32" (on my 256 MB box) results has are the most interesting:
>
> 2.4.0 gave 33 MB/s
> 2.4.8-pre1 gives 26.1 MB/s (-21%)
>
> Do we now throw away pages that would be reused?

Yes. This is pretty much expected behaviour with the use-once
patch, both in how it is currently implemented and in how it works
in principle.

This is because the use-once strategy protects the working
set from streaming IO in a better way than before. One of the
consequences of this is that streaming IO pages get less of a
chance to be reused before they're evicted.
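
Roughly, the idea is that a page goes onto the inactive list the first
time it is touched and only gets promoted to the active list if it is
referenced again before reclaim reaches it. A minimal sketch of that
policy (illustrative C only, not the actual 2.4.8-pre1 code; all names
here are made up):

/* Illustrative sketch of the use-once idea, not the real 2.4 VM code.
 * A freshly read page starts out inactive; only a second reference
 * before reclaim promotes it, so streaming (read-once) pages cannot
 * displace the established working set. */
enum lru_list { INACTIVE, ACTIVE };

struct page {
    enum lru_list lru;
    int referenced;        /* touched since the last scan? */
};

/* First touch: drop the page on the inactive list. */
static void page_first_use(struct page *page)
{
    page->lru = INACTIVE;
    page->referenced = 0;
}

/* Later touches just set the referenced bit. */
static void page_touch(struct page *page)
{
    page->referenced = 1;
}

/* Reclaim scan: rescue referenced inactive pages, evict the rest.
 * Returns 1 if the page should be evicted. */
static int page_reclaim_scan(struct page *page)
{
    if (page->lru == INACTIVE) {
        if (page->referenced) {
            page->lru = ACTIVE;    /* used twice: working set */
            page->referenced = 0;
            return 0;              /* rescued */
        }
        return 1;                  /* used once: evict */
    }
    return 0;
}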

Database systems usually keep a history of recently evicted pages
so they can promote these quickly-evicted pages to the list of more
frequently used pages when they are faulted in again.
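
A toy sketch of such an eviction-history ("ghost") list, purely as an
illustration of the idea rather than code from any real database or
kernel (the names and the fixed-size array are made up):

#include <stdbool.h>

#define GHOST_SLOTS 1024

/* Identities of recently evicted pages (a "ghost" list). */
static unsigned long ghost[GHOST_SLOTS];
static unsigned int ghost_head;

/* Remember a page we are about to evict. */
static void ghost_remember(unsigned long page_id)
{
    ghost[ghost_head++ % GHOST_SLOTS] = page_id;
}

/* Was this page evicted recently? */
static bool ghost_recall(unsigned long page_id)
{
    unsigned int i;

    for (i = 0; i < GHOST_SLOTS; i++)
        if (ghost[i] == page_id)
            return true;
    return false;
}

/* On fault-in: a page we evicted only recently is treated as frequently
 * used, so it would go straight onto the active list rather than the
 * inactive one. */
static int fault_in_wants_active(unsigned long page_id)
{
    return ghost_recall(page_id);
}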

regards,

Rik
--
Executive summary of a recent Microsoft press release:
"we are concerned about the GNU General Public License (GPL)"


http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-07-27 22:31:01

by Daniel Phillips

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

On Friday 27 July 2001 23:08, Roger Larsson wrote:
> Hi all,
>
> I have done some throughput testing again.
> Streaming write, copy, read, and diff are almost identical to earlier 2.4
> kernels. (Note: 2.4.0 was clearly better when reading from two files
> - i.e. diff - 15.4 MB/s vs. around 11 MB/s with later kernels - but that
> could be a result of disk layout too...)
>
> But "dbench 32" (on my 256 MB box) results has are the most
> interesting:
>
> 2.4.0 gave 33 MB/s
> 2.4.8-pre1 gives 26.1 MB/s (-21%)
>
> Do we now throw away pages that would be reused?
>
> [I have also verified that mmap002 still works as expected]

Could you run that test again with /usr/bin/time (the GNU time
function) so we can see what kind of swapping it's doing?

The use-once approach depends on having a fairly stable inactive_dirty
+ inactive_clean queue size, to give use-often pages a fair chance to
be rescued. To see how the sizes of the queues are changing, use
Shift-ScrollLock on your text console.
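
If a text console isn't handy, a small poller over /proc/meminfo can show
the same numbers. This is only a sketch, and it assumes a 2.4-era
/proc/meminfo that exposes Inact_dirty: and Inact_clean: lines - the
field names vary between VM revisions, so adjust as needed:

/* Print the active/inactive queue sizes from /proc/meminfo once a
 * second.  Assumes 2.4-style "Active:", "Inact_dirty:" and
 * "Inact_clean:" fields. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char line[256];

    for (;;) {
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f)
            return 1;
        while (fgets(line, sizeof(line), f)) {
            if (!strncmp(line, "Active:", 7) ||
                !strncmp(line, "Inact_dirty:", 12) ||
                !strncmp(line, "Inact_clean:", 12))
                fputs(line, stdout);
        }
        fclose(f);
        fputs("--\n", stdout);
        sleep(1);
    }
}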

To tell the truth, I don't have a deep understanding of how dbench
works. I should read the code now and see if I can learn more about it
:-/ I have noticed that it tends to be highly variable in performance,
sometimes showing variation of a few tens of percent from run to run.
This variation seems to depend a lot on scheduling. Do you see "*"'s
evenly spaced throughout the tracing output, or do you see most of them
bunched up near the end?

--
Daniel

2001-07-27 23:47:47

by Roger Larsson

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

Hi again,

It might be variations in dbench - but I am not sure since I run
the same script each time.

(When I made a test run in a terminal window - with X running, but not doing
anything actively - I got
[some '.' deleted]
.............++++++++++++++++++++++++++++++++********************************
Throughput 15.8859 MB/sec (NB=19.8573 MB/sec 158.859 MBit/sec)
14.74user 22.92system 4:26.91elapsed 14%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (912major+1430minor)pagefaults 0swaps

I have never seen anything like this - all '+' together!

I logged off and tried again - got a more normal value, 32 MB/s,
and the '+' were spread out.

More testing needed...

/RogerL

On Saturday 28 July 2001 00:34, Daniel Phillips wrote:
> On Friday 27 July 2001 23:08, Roger Larsson wrote:
> > Hi all,
> >
> > I have done some throughput testing again.
> > Streaming write, copy, read, and diff are almost identical to earlier 2.4
> > kernels. (Note: 2.4.0 was clearly better when reading from two files
> > - i.e. diff - 15.4 MB/s vs. around 11 MB/s with later kernels - but that
> > could be a result of disk layout too...)
> >
> > But "dbench 32" (on my 256 MB box) results has are the most
> > interesting:
> >
> > 2.4.0 gave 33 MB/s
> > 2.4.8-pre1 gives 26.1 MB/s (-21%)
> >
> > Do we now throw away pages that would be reused?
> >
> > [I have also verified that mmap002 still works as expected]
>
> Could you run that test again with /usr/bin/time (the GNU time
> function) so we can see what kind of swapping it's doing?
>
> The use-once approach depends on having a fairly stable inactive_dirty
> + inactive_clean queue size, to give use-often pages a fair chance to
> be rescued. To see how the sizes of the queues are changing, use
> Shift-ScrollLock on your text console.
>
> To tell the truth, I don't have a deep understanding of how dbench
> works. I should read the code now and see if I can learn more about it
> :-/ I have noticed that it tends to be highly variable in performance,
> sometimes showing variation of a few tens of percent from run to run.
> This variation seems to depend a lot on scheduling. Do you see "*"'s
> evenly spaced throughout the tracing output, or do you see most of them
> bunched up near the end?
>
> --
> Daniel

--
Roger Larsson
Skellefteå
Sweden

2001-07-28 00:36:54

by Steven Cole

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

On Friday 27 July 2001 16:34, Daniel Phillips wrote:
> On Friday 27 July 2001 23:08, Roger Larsson wrote:
> > Hi all,
> >
> > I have done some throughput testing again.
> > Streaming write, copy, read, and diff are almost identical to earlier 2.4
> > kernels. (Note: 2.4.0 was clearly better when reading from two files
> > - i.e. diff - 15.4 MB/s vs. around 11 MB/s with later kernels - but that
> > could be a result of disk layout too...)
> >
> > But "dbench 32" (on my 256 MB box) results has are the most
> > interesting:
> >
> > 2.4.0 gave 33 MB/s
> > 2.4.8-pre1 gives 26.1 MB/s (-21%)
> >
> > Do we now throw away pages that would be reused?
> >
> > [I have also verified that mmap002 still works as expected]
>
> Could you run that test again with /usr/bin/time (the GNU time
> function) so we can see what kind of swapping it's doing?
>

I also saw a significant drop in dbench 32 results.
Here are a few more data points, this time comparing 2.4.8-pre1 with 2.4.7.

2.4.7: 9.3422 MB/sec vs 2.4.8-pre1: 6.88884 MB/sec (average of 3 runs)

The system under test has 384 MB of memory, and did not go
into swap during the test. I performed a set of three runs immediately after
a boot, with no pauses between individual runs. I used time ./dbench 32
and captured the output in a file using script `uname -r`. The tests were done
with X and KDE running, but no other activity.

Here are the results of the six runs:

Steven
-----------------------------------------------------------------------------
2.4.7 average 9.3422 MB/sec

Throughput 9.2929 MB/sec (NB=11.6161 MB/sec 92.929 MBit/sec)
34.11user 238.89system 7:34.59elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps

Throughput 9.56338 MB/sec (NB=11.9542 MB/sec 95.6338 MBit/sec)
34.07user 262.44system 7:22.72elapsed 66%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps

Throughput 9.17032 MB/sec (NB=11.4629 MB/sec 91.7032 MBit/sec)
33.79user 248.46system 7:41.62elapsed 61%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps

-----------------------------------------------------------------------------
2.4.8-pre1 average 6.88884 MB/sec

Throughput 6.8078 MB/sec (NB=8.50975 MB/sec 68.078 MBit/sec)
34.30user 358.35system 10:21.57elapsed 63%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps

Throughput 6.91993 MB/sec (NB=8.64992 MB/sec 69.1993 MBit/sec)
33.62user 369.55system 10:11.43elapsed 65%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps

Throughput 6.93879 MB/sec (NB=8.67349 MB/sec 69.3879 MBit/sec)
33.33user 341.58system 10:09.77elapsed 61%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1008major+1402minor)pagefaults 0swaps


2001-07-28 01:06:49

by Daniel Phillips

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

On Saturday 28 July 2001 01:43, Roger Larsson wrote:
> Hi again,
>
> It might be variations in dbench - but I am not sure since I run
> the same script each time.
>
> (When I made a test run in a terminal window - with X running, but not
> doing anything actively - I got
> [some '.' deleted]
> .............++++++++++++++++++++++++++++++++************************
>******** Throughput 15.8859 MB/sec (NB=19.8573 MB/sec 158.859
> MBit/sec) 14.74user 22.92system 4:26.91elapsed 14%CPU
> (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs
> (912major+1430minor)pagefaults 0swaps
>
> I have never seen anything like this - all '+' together!
>
> I logged off and tried again - got a more normal value, 32 MB/s,
> and the '+' were spread out.
>
> More testing needed...

Truly wild, truly crazy. OK, this is getting interesting. I'll go
read the dbench source now; I really want to understand how the IO and
thread scheduling are interrelated. I'm not even going to try to
advance a theory just yet ;-)

I'd mentioned that dbench seems to run fastest when threads run and
complete all at different times instead of all together. It's easy to
see why this might be so: if the sum of all working sets is bigger than
memory then the system will thrash and do its work much more slowly.
If the threads *can* all run independently (which I think is true of
dbench because it simulates SMB accesses from a number of unrelated
sources) then the optimal strategy is to suspend enough processes so
that all the working sets do fit in memory. Linux has no mechanism for
detecting or responding to such situations (whereas FreeBSD - our
arch-rival in the mm sweepstakes - does) so we sometimes see what are
essentially random variations in scheduling causing very measurable
differences in throughput. (The "butterfly effect" where the beating
wings of a butterfly in Alberta set in motion a chain of events that
culminates with a hurricane in Florida.)

I am not saying this is the effect we're seeing here (the working set
effect, not the butterfly:-) but it is something to keep in mind when
investigating this. There is such a thing as being too fair, and maybe
that's what we're running into here.
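
As a toy illustration of that suspend-until-it-fits idea (nothing like
this exists in Linux 2.4, and the task list and working-set estimate
below are purely hypothetical):

/* Toy load control: if the sum of estimated working sets exceeds
 * physical memory, suspend tasks until the remainder fits, and resume
 * suspended tasks once there is room again. */
struct task {
    unsigned long working_set_pages;   /* estimated resident need */
    int suspended;
    struct task *next;
};

static void load_control(struct task *tasks, unsigned long mem_pages)
{
    unsigned long demand = 0;
    struct task *t;

    for (t = tasks; t; t = t->next)
        if (!t->suspended)
            demand += t->working_set_pages;

    /* Thrashing likely: suspend tasks until the rest fit in memory. */
    for (t = tasks; t && demand > mem_pages; t = t->next) {
        if (!t->suspended) {
            t->suspended = 1;
            demand -= t->working_set_pages;
        }
    }

    /* Room to spare: resume suspended tasks that fit. */
    for (t = tasks; t; t = t->next) {
        if (t->suspended &&
            demand + t->working_set_pages <= mem_pages) {
            t->suspended = 0;
            demand += t->working_set_pages;
        }
    }
}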

--
Daniel

2001-07-28 01:59:49

by Daniel Phillips

Subject: Re: 2.4.8-pre1 and dbench -20% throughput

On Saturday 28 July 2001 02:35, Steven Cole wrote:
> I also saw a significant drop in dbench 32 results.
> Here are a few more data points, this time comparing 2.4.8-pre1 with
> 2.4.7.
>
> 2.4.7 9.3422 MB/sec vs 2.4.8-pre1 6.88884 MB/sec average of 3
> runs
>
> The system under test has 384 MB of memory, and did not go
> into swap during the test. I performed a set of three runs
> immediately after a boot, and with no pauses in between individual
> runs. I used time ./dbench 32 and captured the output in a file
> using script `uname -r`. The tests were done with X and KDE running,
> but no other activity.

The variation is accounted for almost entirely by the change in system
time. Does this mean more IOs or more scanning? I don't know; more
research is needed.

We need Marcelo's vm statistics patch; I wonder what the status of that
is.

Thanks for the nice clear results, I'll try it here now. ;-)

> Here are the results of the six runs:
>
> Steven
> ---------------------------------------------------------------------
>-------- 2.4.7 average 9.3422 MB/sec
>
> Throughput 9.2929 MB/sec (NB=11.6161 MB/sec 92.929 MBit/sec)
> 34.11user 238.89system 7:34.59elapsed 60%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps
>
> Throughput 9.56338 MB/sec (NB=11.9542 MB/sec 95.6338 MBit/sec)
> 34.07user 262.44system 7:22.72elapsed 66%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps
>
> Throughput 9.17032 MB/sec (NB=11.4629 MB/sec 91.7032 MBit/sec)
> 33.79user 248.46system 7:41.62elapsed 61%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps
>
> ---------------------------------------------------------------------
>-------- 2.4.8-pre1 average 6.88884 MB/sec
>
> Throughput 6.8078 MB/sec (NB=8.50975 MB/sec 68.078 MBit/sec)
> 34.30user 358.35system 10:21.57elapsed 63%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps
>
> Throughput 6.91993 MB/sec (NB=8.64992 MB/sec 69.1993 MBit/sec)
> 33.62user 369.55system 10:11.43elapsed 65%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps
>
> Throughput 6.93879 MB/sec (NB=8.67349 MB/sec 69.3879 MBit/sec)
> 33.33user 341.58system 10:09.77elapsed 61%CPU (0avgtext+0avgdata
> 0maxresident)k 0inputs+0outputs (1008major+1402minor)pagefaults
> 0swaps

2001-07-30 14:45:56

by Marcelo Tosatti

Subject: Re: 2.4.8-pre1 and dbench -20% throughput



On Sat, 28 Jul 2001, Daniel Phillips wrote:

> On Saturday 28 July 2001 02:35, Steven Cole wrote:
> > I also saw a significant drop in dbench 32 results.
> > Here are a few more data points, this time comparing 2.4.8-pre1 with
> > 2.4.7.
> >
> > 2.4.7 9.3422 MB/sec vs 2.4.8-pre1 6.88884 MB/sec average of 3
> > runs
> >
> > The system under test has 384 MB of memory, and did not go
> > into swap during the test. I performed a set of three runs
> > immediately after a boot, and with no pauses in between individual
> > runs. I used time ./dbench 32 and captured the output in a file
> > using script `uname -r`. The tests were done with X and KDE running,
> > but no other activity.
>
> The variation is accounted for almost entirely by the change in system
> time. Does this mean more IO's or more scanning? I don't know, more
> research needed.
>
> We need Marcelo's vm statistics patch, I wonder what the status of that
> is.

Well, I've switched to Andrew Morton's generic stats scheme.

I've also started writing a new userlevel tool (based on cpustat.c from
Zach Brown) to "replace" the old vmstat.c.

Right now I'm busy fixing client problems and kernel RPM bugs, but I hope
to have the new vm stats patch using Andrew's scheme, plus the userlevel
tool, ready by the end of the week.