2003-01-17 19:12:34

by Cliff White

Subject: [OSDL][BENCHMARK] Database results 2.4 versus 2.5


We have found some very nice database performance improvements in the
OSDL-DBT-2 database workload comparing the latest 2.4 kernel with 2.5.49
on an 8-way Profusion Xeon 700MHz Pentium III system with 4GB of memory.
We suspect there will be I/O improvements after moving to the latest
2.5 releases. We would like to optimize our memory utilization before
moving on to those experiments.

OSDL-DBT-2 is transaction intensive. We have implemented two variants on
SAP-DB using raw data files:
 - one "cached", which runs in memory and does very little I/O except for
   the log writes, and
 - another "non-cached", with heavy reads and some writes.

In both variants the database buffer cache is sized to consume most of the
memory on the system. There are five transaction types running during the
test run. For the workload metric, we count how many transactions of one
type, "new-order", complete per minute (NOTPM). We measure this after the
database cache is warm. The new-order transaction represents 45% of all
transactions running. The bigger the number, the better the performance.
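
As a rough illustration of the metric described above, here is a minimal sketch of counting new-order completions per minute over a measurement window that starts after warm-up. The record layout, the function name `notpm`, the "other" transaction label, and the sample timestamps are all assumptions made for this example; this is not how the DBT-2 kit itself computes the number.

```python
def notpm(completions, start_s, end_s, txn_type="new-order"):
    """New-order transactions per minute inside [start_s, end_s).

    completions: iterable of (timestamp_seconds, transaction_type) pairs.
    The record format is hypothetical, chosen only for this sketch.
    """
    if end_s <= start_s:
        raise ValueError("measurement window must have positive length")
    count = sum(
        1
        for ts, kind in completions
        if kind == txn_type and start_s <= ts < end_s
    )
    return count / ((end_s - start_s) / 60.0)


# Made-up completions; "other" stands in for the four non-new-order types.
sample = [
    (5.0, "new-order"),   # falls in the warm-up period, not counted
    (30.0, "other"),
    (70.0, "new-order"),
    (95.0, "new-order"),
    (110.0, "other"),
    (130.0, "new-order"),
    (170.0, "new-order"),
]

# Skip the first 60 s as warm-up, then measure for two minutes.
value = notpm(sample, start_s=60.0, end_s=180.0)
print(value)
```

Only completions inside the window are counted, which mirrors the point that the measurement is taken once the database cache is warm.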

We did several runs of each variant (cached and non-cached) on each of
the two OS versions (2.4.21-pre3 and 2.5.49*). Run variances were low
compared to the differences we saw between OS versions. Results are as
follows (numbers represent average over the runs):


Linux        DBT2        Metric   Wrkld    %memused                 iostats
Version      Workload    (bigger  Speedup  on 4GB    %user   %sys   total
                         better)                                    iops
___________________________________________________________________________
2.4.21-pre3  cached      4479              99.73     74.24   3.64   **
2.5.49 (*)   cached      5040              99.73     85.37   2.85   381
             cached               12.5%
___________________________________________________________________________
2.4.21-pre3  noncached   1407.8            95.11     25.75   9.68   **
2.5.49 (*)   noncached   1667.5            99.68     49.12   7.2    1461
             non-cached           18.4%
___________________________________________________________________________
** iostats is broken on 2.4 due to driver problems.

(If the table above gets distorted, or you want more details, please go to:
http://www.osdl.org/projects/dbt2dev/results/LKML_dbt2_2.4v2.5_both.html )


The results for 2.5 are significantly improved over 2.4, 12.5% for the
cached workload and 18.4% for the non-cached.
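
The percentages quoted above follow directly from the averaged NOTPM figures in the table. The short sketch below reproduces the arithmetic; the function and variable names are ours, chosen for illustration.

```python
def speedup_percent(baseline, candidate):
    """Relative improvement of candidate over baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# Averaged NOTPM from the results table: (2.4.21-pre3, 2.5.49).
results = {
    "cached": (4479.0, 5040.0),
    "non-cached": (1407.8, 1667.5),
}

speedups = {
    name: round(speedup_percent(old, new), 1)
    for name, (old, new) in results.items()
}

for workload, pct in speedups.items():
    print(f"{workload}: {pct}%")
```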

Notice that the metric improves even though the %sys times are not
particularly high on 2.4. Our examination of the statistics shows that both
the cached and non-cached workloads are paging in the 2.4 case but are
not paging in the 2.5 case. Since we use raw data files rather than a file
system, we think that the 2.5 kernel is taking memory away from the mostly
unused file system buffer cache in favor of the database cache, whereas the
2.4 kernel cannot do this. Perhaps someone can confirm this? Any suggestions
for further improvement?

The %sys drops going from 2.4 to 2.5 in both cases. We suspect this is
due to lack of paging in the 2.5 runs.

We saved system and database stats from these runs. The system
configuration details and summarized stats can be found at the URL
given above.



Regards,

Mary Edie Meredith
Mark Wong
cliffw
Open Source Development Lab

OSDL DBT-2 Project information: http://sourceforge.net/projects/osdldbt

OSDL-DBT-2 tests on 4-way systems will be released soon as part of OSDL's
test suite in the Scalable Test Platform (STP) : http://www.osdl.org/stp/


(*) We needed to include Matthew Wilcox's flock patch so we could
stop and restart the database. Note that this patch should not be used
on any system with NFS. The patch can be found at the following URL:
ftp://ftp.linux.org.uk/pub/linux/willy/patches/flock-2.5.49-2.diff




2003-01-17 20:52:13

by Andrew Morton

Subject: Re: [OSDL][BENCHMARK] Database results 2.4 versus 2.5

Cliff White wrote:
>
>
> We have found some very nice database performance improvements in the
> OSDL-DBT-2 database workload comparing the latest 2.4 kernel with 2.5.49
> on an 8-way Profusion Xeon 700MHz Pentium III system with 4GB of memory.
> We suspect there will be I/O improvements after moving to the latest
> 2.5 releases. We would like to optimize our memory utilization before
> moving on to those experiments.

So it sounds like DBT2 is stabilised now, and producing repeatable results?
That's excellent.

> ...
> We did several runs of each variant (cached and non-cached) on each of
> the two OS versions (2.4.21-pre3 and 2.5.49*). Run variances were low
> compared to the differences we saw between OS versions. Results are as
> follows (numbers represent average over the runs):

I notice you're using an extremeraid 2000? I have one of those, and
immediately shelved it when I saw how slow it is ;)

>
> Linux        DBT2        Metric   Wrkld    %memused                 iostats
> Version      Workload    (bigger  Speedup  on 4GB    %user   %sys   total
>                          better)                                    iops
> ___________________________________________________________________________
> 2.4.21-pre3  cached      4479              99.73     74.24   3.64   **
> 2.5.49 (*)   cached      5040              99.73     85.37   2.85   381
>              cached               12.5%
> ___________________________________________________________________________
> 2.4.21-pre3  noncached   1407.8            95.11     25.75   9.68   **
> 2.5.49 (*)   noncached   1667.5            99.68     49.12   7.2    1461
>              non-cached           18.4%
> ___________________________________________________________________________
> ** iostats is broken on 2.4 due to driver problems.

Interesting. All the gains here are due to reduced idle time.

So either the I/O scheduler is doing a better job, or the VM page
replacement decisions are agreeable for this load.
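
The reduced-idle-time observation can be checked with simple arithmetic on the quoted table. The sketch below assumes that whatever CPU time is not %user or %sys was spent idle (including any I/O wait); the table does not report idle time directly, so treat the figures as an estimate.

```python
# (%user, %sys) averages from the quoted results table.
cpu = {
    ("cached", "2.4.21-pre3"): (74.24, 3.64),
    ("cached", "2.5.49"): (85.37, 2.85),
    ("noncached", "2.4.21-pre3"): (25.75, 9.68),
    ("noncached", "2.5.49"): (49.12, 7.2),
}


def not_busy(user, system):
    """Percentage of CPU time that was neither user nor system time.

    Assumption: this remainder is idle time (possibly including I/O wait).
    """
    return round(100.0 - user - system, 2)


remainder = {key: not_busy(*vals) for key, vals in cpu.items()}

for (workload, kernel), pct in remainder.items():
    print(f"{workload:10s} {kernel:12s} ~{pct}% not busy")
```

Under that assumption the remainder drops from about 22% to about 12% for the cached workload and from about 65% to about 44% for the non-cached one, while %sys also falls, so the extra work shows up as additional %user time.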

> The %sys drops going from 2.4 to 2.5 in both cases. We suspect this is
> due to lack of paging in the 2.5 runs.

Yup. Do you have all the vmstat traces and all the other goodies? The
pgpgin/pgpgout numbers, etc seem to be wrong there.
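
For anyone wanting to sanity-check pgpgin/pgpgout figures by hand, here is a minimal sketch that turns two counter snapshots into per-second rates. It assumes the counters are available as "name value" lines, as in /proc/vmstat; the snapshot contents and the 60-second interval are made up for illustration and are not taken from these runs.

```python
def parse_counters(text):
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rate_per_second(before, after, interval_s, names=("pgpgin", "pgpgout")):
    """Per-second rate of each named counter between two snapshots."""
    return {name: (after[name] - before[name]) / interval_s for name in names}


# Made-up snapshots taken 60 seconds apart (illustrative values only).
snap_t0 = "pgpgin 1000\npgpgout 5000\n"
snap_t1 = "pgpgin 1600\npgpgout 9200\n"

rates = rate_per_second(parse_counters(snap_t0), parse_counters(snap_t1), 60.0)
print(rates)
```

Because the counters are cumulative, only the difference between two snapshots is meaningful; a single reading says little about the paging rate during a run.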


This could easily be a complete fluke, and you may find that with
smaller/larger working sets or smaller/larger physical memory, the difference
goes away.


2003-01-17 21:14:15

by Cliff White

Subject: Re: [OSDL][BENCHMARK] Database results 2.4 versus 2.5

> Cliff White wrote:
> >
> >
> > We have found some very nice database performance improvements in the
> > OSDL-DBT-2 database workload comparing the latest 2.4 kernel with 2.5.49
> > on an 8-way Profusion Xeon 700MHz Pentium III system with 4GB of memory.
> > We suspect there will be I/O improvements after moving to the latest
> > 2.5 releases. We would like to optimize our memory utilization before
> > moving on to those experiments.
>
> So it sounds like DBT2 is stabilised now, and producing repeatable results?
> That's excellent.
Thanks. The kit's available off SourceForge now, and we'll have the STP version
up Mondayish.
>
> > ...
> > We did several runs of each variant (cached and non-cached) on each of
> > the two OS versions (2.4.21-pre3 and 2.5.49*). Run variances were low
> > compared to the differences we saw between OS versions. Results are as
> > follows (numbers represent average over the runs):
>
> I notice you're using an extremeraid 2000? I have one of those, and
> immediately shelved it when I saw how slow it is ;)

We'll take that one up with our ops people ;) This is one of the many
reasons we did a 'cached' version of the load. Us database people just
can't get enough IO right now. :)
>
> >
> > Linux        DBT2        Metric   Wrkld    %memused                 iostats
> > Version      Workload    (bigger  Speedup  on 4GB    %user   %sys   total
> >                          better)                                    iops
> > ___________________________________________________________________________
> > 2.4.21-pre3  cached      4479              99.73     74.24   3.64   **
> > 2.5.49 (*)   cached      5040              99.73     85.37   2.85   381
> >              cached               12.5%
> > ___________________________________________________________________________
> > 2.4.21-pre3  noncached   1407.8            95.11     25.75   9.68   **
> > 2.5.49 (*)   noncached   1667.5            99.68     49.12   7.2    1461
> >              non-cached           18.4%
> > ___________________________________________________________________________
> > ** iostats is broken on 2.4 due to driver problems.
>
> Interesting. All the gains here are due to reduced idle time.
>
> So either the I/O scheduler is doing a better job, or the VM page
> replacement decisions are agreeable for this load.

Okay. Is there something we could do that would point at one or the other?

>
> > The %sys drops going from 2.4 to 2.5 in both cases. We suspect this is
> > due to lack of paging in the 2.5 runs.
>
> Yup. Do you have all the vmstat traces and all the other goodies? The
> pgpgin/pgpgout numbers, etc seem to be wrong there.

We didn't have a working vmstat for those runs. We just grabbed the latest
procps, so we should have that data for the next set. What looks wrong to
you on the pgpgin/pgpgout?
>
>
> This could easily be a complete fluke, and you may find that with
> smaller/larger working sets or smaller/larger physical memory, the difference
> goes away.
>
We're doing a series of runs with some slightly different memory sizes. STP
will allow you to do the same. We normally try to tune the run so as to use
all of the physical memory we can get our little hands on (it's a DBA thang).
Would a smaller-memory database (say 2GB instead of 4GB) really show you
anything interesting on a 4GB system, since there's so little pressure?

cliffw
>


2003-01-17 21:28:45

by Andrew Morton

Subject: Re: [OSDL][BENCHMARK] Database results 2.4 versus 2.5

Cliff White wrote:
>
> > So it sounds like DBT2 is stabilised now, and producing repeatable results?
> > That's excellent.
> Thanks. the kit's available off Sourceforge now, and we'll have STP version
> up Mondayish.

OK. My utter database ignorance was an
insurmountable-within-two-hour-attention-span problem when I tried to set up
dbt1.

> > So either the I/O scheduler is doing a better job, or the VM page
> > replacement decisions are agreeable for this load.
>
> Okay. Is there something we could do that would point at one or the other?

Different combinations of working set and physical memory will tell us.

Also, when we have a lot of vmstat/etc traces available we can decide how I/O
bound it is, and whether we need to look at upping the request queue sizes.
Which is something which we can now do, and which could easily make a
difference here.

But we'll have to get you onto at least 2.5.58 for that ;)

> would a smaller memory database (say 2GB instead of 4GB ) really show you
> anything interesting on a 4GB system, since there's so little pressure?

Yes, that would be interesting. We're dealing with single points in
twenty-seven-dimensional space. Tweaking input parameters individually helps
one gain an understanding of what is going on.

2003-01-17 22:04:57

by Timothy D. Witham

Subject: Re: [OSDL][BENCHMARK] Database results 2.4 versus 2.5

On Fri, 2003-01-17 at 13:37, Andrew Morton wrote:
> Cliff White wrote:
> >
> > > So it sounds like DBT2 is stabilised now, and producing repeatable results?
> > > That's excellent.
> > Thanks. the kit's available off Sourceforge now, and we'll have STP version
> > up Mondayish.
>
> OK. My utter database ignorance was an
> insurmountable-within-two-hour-attention-span problem when I tried to set up
> dbt1.

That is what we are trying to do with setting it up on STP. So you
don't have to be a database guy. :-) I guess what we need to explore
is how to change the setup of dbt{123} on STP so that it gives you
the information that you need.

Tim

>
> > > So either the I/O scheduler is doing a better job, or the VM page
> > > replacement decisions are agreeable for this load.
> >
> > Okay. Is there something we could do that would point at one or the other?
>
> Different combinations of working set and physical memory will tell us.
>
> Also, when we have a lot of vmstat/etc traces available we can decide how I/O
> bound it is, and whether we need to look at upping the request queue sizes.
> Which is something which we can now do, and which could easily make a
> difference here.
>
> But we'll have to get you onto at least 2.5.58 for that ;)
>
> > would a smaller memory database (say 2GB instead of 4GB ) really show you
> > anything interesting on a 4GB system, since there's so little pressure?
>
> Yes, that would be interesting. We're dealing with single points in
> twenty-seven-dimensional space. Tweaking input parameters individually helps
> one gain an understanding of what is going on.
--
Timothy D. Witham - Lab Director
Open Source Development Lab Inc - A non-profit corporation
15275 SW Koll Parkway - Suite H - Beaverton OR, 97006
(503)-626-2455 x11 (office) (503)-702-2871 (cell)
(503)-626-2436 (fax)