2001-11-04 07:01:13

by Dan Kegel

Subject: Regression testing of 2.4.x before release?

I get the impression that Alan stress-tests his kernels
more than Linus does before releasing them.

Would it be a Good Thing if Linus decided to make sure
his kernels pass all of Alan's stress tests before
releasing them? (I'm talking e.g. 2.4.14-final, not 2.4.14-preX.)

- Dan


2001-11-04 07:15:47

by Ted Deppner

Subject: Re: Regression testing of 2.4.x before release?

On Sat, Nov 03, 2001 at 11:03:17PM -0800, Dan Kegel wrote:
> Would it be a Good Thing if Linus decided to make sure
> his kernels pass all of Alan's stress tests before
> releasing them? (I'm talking e.g. 2.4.14-final, not 2.4.14-preX.)

Yes it would. It would be a better idea if everyone (including you and
me) stress-tested those pre and final kernels too.

--
Ted Deppner
http://www.psyber.com/~ted/

2001-11-04 12:01:46

by Tahar

Subject: Re: Regression testing of 2.4.x before release?


> Yes it would. It would be a better idea if everyone (including you and
> me) stress test those pre and final kernels too.

Just a newbie question: where can we find such stress tests, and which
parts of the kernel do these tests target?

Thanks,

Tahar

2001-11-04 17:27:48

by Ted Deppner

Subject: Re: Regression testing of 2.4.x before release?

On Sun, Nov 04, 2001 at 01:04:27PM +0100, Tahar wrote:
> Just a newbie question: where can we find such stress tests, and what
> are the kernel parts targeted by these tests ?

A few searches for "linux benchmark", "unix benchmark", or perhaps just
"benchmark" on Google and Freshmeat.net should turn up plenty to keep you
busy.

Linus and others have said in the past, though, that YOUR usage is the
testing they want... So it's best if you install the kernel and use it
normally, whatever you'd use a kernel to do.

I am concerned about lots of I/O and multiprocessing... So I test by doing
CD-RW burns to two drives (12x and a 4x), NFS data moves (using bonnie,
dd, and cat), while listening to MP3 streams, reading my email, and
watching extace, with some of my mysql data loading scripts running.

These are all things I do normally, and I'm the best able to compare new
performance to past performance. Sure, I don't do all of those things all
at the same time _usually_, but that's the main body of my 'test bench'.
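
A mixed load like the one described above can be approximated by running
several jobs at once and timing each one. The following is only a minimal
sketch under stated assumptions: the function names (disk_io, cpu_spin,
run_mixed_load) are hypothetical, and the two jobs are small stand-ins for
real workloads such as CD burns, NFS moves with dd or bonnie, or database
loading scripts.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def disk_io(megabytes=4):
    """Stand-in for a dd/bonnie-style move: write a temp file, read it back."""
    chunk = b"\0" * (1024 * 1024)
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(megabytes):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the disk
        f.seek(0)
        total = 0
        while True:
            data = f.read(1024 * 1024)
            if not data:
                break
            total += len(data)
    return total  # number of bytes read back


def cpu_spin(iterations=200_000):
    """Stand-in for a CPU-bound job such as a data-loading script."""
    acc = 0
    for i in range(iterations):
        acc = (acc + i * i) % 1_000_003
    return acc


def timed(name, fn):
    """Run fn and return (name, elapsed seconds, result)."""
    start = time.perf_counter()
    result = fn()
    return name, time.perf_counter() - start, result


def run_mixed_load(jobs):
    """Run all jobs concurrently; return {name: (seconds, result)}."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(timed, name, fn) for name, fn in jobs.items()]
        return {name: (secs, result)
                for name, secs, result in (f.result() for f in futures)}


if __name__ == "__main__":
    report = run_mixed_load({"disk": disk_io, "cpu": cpu_spin})
    for name, (secs, _) in report.items():
        print(f"{name}: {secs:.3f}s")
```

Running the same set of jobs on the old and the new kernel and comparing the
timings gives a rough version of the "compare new performance to past
performance" step; real workloads would replace the stand-in functions.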

--
Ted Deppner
http://www.psyber.com/~ted/

2001-11-04 17:56:50

by Dan Kegel

Subject: Re: Regression testing of 2.4.x before release?

Mike Galbraith wrote:
>
> On Sat, 3 Nov 2001, Dan Kegel wrote:
>
> > I get the impression that Alan stress-tests his kernels
> > more than Linus does before releasing them.
>
> That's _our_ collective job. We're supposed to beat the snot out
> of the -pre kernels and report. One guy can't cover effectively,
> even if his name is Linus "stress-testing is boring" Torvalds ;-)

I'm not saying Linus should do the testing.

It's good that Linus is asking others to test with cerberus, as he
did in http://marc.theaimsgroup.com/?l=linux-kernel&m=100451768023436&w=2

It would be even better if Linus came out and stated that he would
refuse to call a kernel final if there is an outstanding report of
it failing an agreed-upon set of stress tests.

And it would be *even better* if http://osdl.org/stp/ were used
to do stress testing in a nice, automated way on 1, 4, 8, and 16-cpu
machines on release candidates.

Almost none of this requires any work by Linus. All Linus has to
do is say "The 2.4.x kernels will pass stress tests before release",
and recruit someone to run his kernels through OSDL's STP in a
timely manner.
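
The release rule proposed above (no "final" while there is an outstanding
report of a failure in an agreed-upon test set) could be sketched roughly as
follows. This is an illustration only: the Report type, the function names,
and the rule that a later report for the same test overrides an earlier one
are all assumptions, not anything STP or the kernel maintainers are known to
use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    kernel: str   # kernel version the report is about, e.g. "2.4.14"
    test: str     # name of the stress test, e.g. "cerberus"
    passed: bool  # True if the kernel survived the test


def outstanding_failures(kernel, agreed_tests, reports):
    """Return the agreed-upon tests whose latest report for `kernel` is a failure.

    Assumption: `reports` is in chronological order, so a later report
    for the same test replaces an earlier one.
    """
    latest = {}
    for r in reports:
        if r.kernel == kernel and r.test in agreed_tests:
            latest[r.test] = r.passed
    return sorted(test for test, ok in latest.items() if not ok)


def can_call_final(kernel, agreed_tests, reports):
    """A kernel may be called final only if no agreed-upon test is failing."""
    return not outstanding_failures(kernel, agreed_tests, reports)


if __name__ == "__main__":
    agreed = {"cerberus", "dbench"}
    reports = [
        Report("2.4.14", "cerberus", False),
        Report("2.4.14", "dbench", True),
    ]
    print(outstanding_failures("2.4.14", agreed, reports))
    print(can_call_final("2.4.14", agreed, reports))
```

Reports for tests outside the agreed set are ignored on purpose, so
individual testers can keep running whatever they like without blocking a
release.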

(I'd be happy to help if it weren't for my darn tendinitis, which
makes it hard even to stir up trouble on mailing lists these days.)
- Dan

2001-11-04 18:42:24

by Luigi Genoni

Subject: Re: Regression testing of 2.4.x before release?


Most of us have just developed our own stress tests, depending on what we
need.
Beyond that, many people use dbench, cerberus and so on.

I would suggest you start with cerberus.


Luigi


On Sun, 4 Nov 2001, Tahar wrote:

>
> > Yes it would. It would be a better idea if everyone (including you and
> > me) stress test those pre and final kernels too.
>
> Just a newbie question: where can we find such stress tests, and what
> are the kernel parts targeted by these tests ?
>
> Thanks,
>
> Tahar
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2001-11-04 19:09:53

by Luigi Genoni

Subject: Re: Regression testing of 2.4.x before release?

The problem is that there is a lot of hardware out there, and we should ALL
run stress tests so that we cover a wide base of hardware and test cases.
Basically, it is very hard to agree on a set of stress tests, because we
all have different needs and our tests are based on those needs. That is a
strength, because they tend to be real-life tests.

In my experience, if some default set of tests comes out, software tends
to be optimized for that set. And that is badly wrong.

Our heterogeneous tests are a good thing indeed.

For example, on servers without much memory (128/256 MB of RAM,
256/512 MB of swap) I care about interactive use, because I usually
have 20 to 50 users connected to them running X (from an X session on
localhost and many Xterms), netscape, KDE, a lot of mail software, and TeX.

On bigger iron (sparc64 with several GB of RAM and swap) I am much more
interested in database performance and in web servers running zope and
other application servers.

My stress tests have been developed over the years with my own needs in
mind.
Thanks to these tests, it was possible to see some VM bugs that were very
difficult to see under other conditions. It was then easy to work
with AA to fix them, and I have to say he has always been available and
interested in solving every problem.


Other people will test other cases.

Then it is up to Linus to decide whether he is satisfied with the reports.
He can simply take advantage of the current situation with testers.
This way Linux has a good chance of running well on most hardware
(I am also thinking of alpha, sparc64, mips, etc.) and
in most use cases.



On Sun, 4 Nov 2001, Dan Kegel wrote:

> Mike Galbraith wrote:
> >
> > On Sat, 3 Nov 2001, Dan Kegel wrote:
> >
> > > I get the impression that Alan stress-tests his kernels
> > > more than Linus does before releasing them.
> >
> > That's _our_ collective job. We're supposed to beat the snot out
> > of the -pre kernels and report. One guy can't cover effectively,
> > even if his name is Linus "stress-testing is boring" Torvalds ;-)
>
> I'm not saying Linus should do the testing.
>
> It's good that Linus is asking others to test with cerberus, as he
> did in http://marc.theaimsgroup.com/?l=linux-kernel&m=100451768023436&w=2
>
> It would be even better if Linus came out and stated that he would
> refuse to call a kernel final if there is an outstanding report of
> it failing an agreed-upon set of stress tests.
>
> And it would be *even better* if http://osdl.org/stp/ were used
> to do stress testing in a nice, automated way on 1, 4, 8, and 16-cpu
> machines on release candidates.
>
> Almost none of this requires any work by Linus. All Linus has to
> do is say "The 2.4.x kernels will pass stress tests before release",
> and recruit someone to run his kernels through OSDL's STP in a
> timely manner.
>
> (I'd be happy to help if it weren't for my darn tendinitis, which
> makes it hard even to stir up trouble on mailing lists these days.)
> - Dan

2001-11-05 01:50:02

by Dan Kegel

Subject: Re: Regression testing of 2.4.x before release?

Luigi Genoni wrote:
> Problem is:
> there is a lot of HW out there, and we should ALL do stress tests, to have
> a wide basis for HWs and test cases. Basically it is very hard to agree
> about a set of stress tests, because we all have different needs, and our
> tests are based on our needs. That is a strength, because they tend to be
> real life tests.

Sure, no argument there.

> In my experience, if some default set of tests comes out, then software
> tends to be optimized for this set. And that is badly wrong.

My post was motivated by two observations:

1. Alan Cox complains occasionally that Linus' trees are not well tested,
and can't survive the torture tests that the ac tree goes through before
release. (e.g.
"2.4.8-ac12
I'm trying to make sure I can keep this testable
as 2.4.9 vanilla isnt being stable on my test sets")

2. The STP at OSDLab seems like a great resource that we might be able
to leverage to solve the problem Alan points out.

I'm not suggesting anyone do any less testing. Just the opposite;
if we set things up properly with the STP, we might be able to run
many more tests before each final release.

- Dan

2001-11-05 16:38:59

by Timothy D. Witham

Subject: Re: Regression testing of 2.4.x before release?

On Sun, 2001-11-04 at 17:51, Dan Kegel wrote:
> Luigi Genoni wrote:
> > Problem is:
> > there is a lot of HW out there, and we should ALL do stress tests, to have
> > a wide basis for HWs and test cases. Basically it is very hard to agree
> > about a set of stress tests, because we all have different needs, and our
> > tests are based on our needs. That is a strength, because they tend to be
> > real life tests.
>

I agree that having users run their own applications under their own usage
model is a very good way of testing code drops. Dan, I think that what
you are trying to say is that it might be a good idea to take a group
of tests and make them the standard "pass/fail" set that people
should look to before doing their own testing.

> Sure, no argument there.
>
> > In my experience, if some default set of tests comes out, then software
> > tends to be optimized for this set. And that is badly wrong.
>

Any time you start optimizing for a set of performance tests you
take the chance of doing things that only benefit the single test. The
good part about open source is that if somebody tries to do that
the rest of us can point out what a useless (or even counter productive)
optimization they are trying to implement.

Regression-type pass/fail tests don't tend to have the benchmark
optimization issue, but like any test they usually only find
problems that you have either already had in the past or that are
obvious. They are never complete, but they should be a dynamic set that
tests are being added to all the time. Also, the nice part about a
known series of tests is that if a problem pops up it is much
easier to reproduce for debugging purposes.

> My post was motivated by two observations:
>
> 1. Alan Cox complains occasionally that Linus' trees are not well tested,
> and can't survive the torture tests that the ac tree goes through before
> release. (e.g.
> "2.4.8-ac12
> I'm trying to make sure I can keep this testable
> as 2.4.9 vanilla isnt being stable on my test sets")
>
> 2. The STP at OSDLab seems like a great resource that we might be able
> to leverage to solve the problem Alan points out.
>

The nice part about the way that STP was designed is that it is
extensible. If somebody comes up with another test we can add it.
If we need to add additional equipment to get the run times down
to a usable level then that is easy to do also.

> I'm not suggesting anyone do any less testing. Just the opposite;
> if we set things up properly with the STP, we might be able to run
> many more tests before each final release.
>

We are in the process of setting up the Kernel STP to automatically
grab the Linus and -ac kernels and run the full setup. This will
do part of what Dan is asking for and it will also allow people who
are looking to supply patches a baseline for their patch testing.

Tim


> - Dan
--
Timothy D. Witham - Lab Director - [email protected]
Open Source Development Lab Inc - A non-profit corporation
15275 SW Koll Parkway - Suite H - Beaverton OR, 97006
(503)-626-2455 x11 (office) (503)-702-2871 (cell)
(503)-626-2436 (fax)


2001-11-12 06:22:00

by Dan Kegel

Subject: Re: Regression testing of 2.4.x before release?

"Timothy D. Witham" wrote:
> ...
> I agree having the users run their applications and under their usage
> model is a very good way of testing code drops. Dan, I think that what
> you are trying to say is that it might be a good idea to take a group
> of tests and make them the standard set of "pass/fail" that people
> should look to before doing their own testing.

More like a safety net. It'd help make sure we didn't forget something
obvious.
>...
> Regression type pass/fail tests don't tend to have the benchmark
> optimization issue but like any test they usually only find the
> problems that you either already have had in the past or that are
> obvious. Not complete but they should be dynamic environment that
> things are being added to all the time. Also the nice part about a
> known series of tests is that if a problem pops up it is much
> easier to reproduce for debugging purposes.

Yep.

> > 2. The STP at OSDLab seems like a great resource that we might be able
> > to leverage to solve the problem Alan points out.
>
> The nice part about the way that STP was designed is that it is
> extensible. If somebody comes up with another test we can add it.
> If we need to add additional equipment to get the run times down
> to a usable level then that is easy to do also.
>
> > I'm not suggesting anyone do any less testing. Just the opposite;
> > if we set things up properly with the STP, we might be able to run
> > many more tests before each final release.
>
> We are in the process of setting up the Kernel STP to automatically
> grab the Linus and -ac kernels and run the full setup. This will
> do part of what Dan is asking for and it will also allow people who
> are looking to supply patches a baseline for their patch testing.

That's super! Thanks, Tim!

At some point it might be nice to also use the STP to help
speed gcc 3 development, too. (I personally am really
looking forward to the day when I can use the same compiler
for both c++ and kernel.)

- Dan

2001-11-12 19:07:14

by Timothy D. Witham

Subject: Re: Regression testing of 2.4.x before release?

On Sun, 2001-11-11 at 22:24, Dan Kegel wrote:
> "Timothy D. Witham" wrote:

Snip

>
> At some point it might be nice to also use the STP to help
> speed gcc 3 development, too. (I personally am really
> looking forward to the day when I can use the same compiler
> for both c++ and kernel.)
>

Strange, I was just talking to somebody about compiler
performance and regression issues and what sort of automation
could be done to do that sort of testing.

The STP is really a framework, so just about any piece
of software and testing environment could be worked into it.

So I guess you could have two pieces. One that just ran a bunch
of compile and user level tests and then one that went in and
checked out the compiler on a kernel tree and then ran the
same performance tests that had been run using the "standard"
compiler.
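
The second piece (running the same performance tests on a kernel built with
the candidate compiler and comparing against the "standard" compiler) might
look something like the sketch below. It is only an illustration under
assumptions: the function name compare_runs, the 5% tolerance, and the idea
that each test yields a single elapsed time in seconds are all made up for
the example and are not part of STP.

```python
def compare_runs(baseline, candidate, tolerance=0.05):
    """Flag tests that got slower with the candidate compiler.

    baseline / candidate: dicts mapping test name -> elapsed seconds
    (lower is better). Returns {test: fractional slowdown} for every
    test present in both runs whose slowdown exceeds `tolerance`.
    """
    regressions = {}
    for test, base_secs in baseline.items():
        if test not in candidate:
            continue  # test was not run on the candidate build; nothing to compare
        slowdown = (candidate[test] - base_secs) / base_secs
        if slowdown > tolerance:
            regressions[test] = slowdown
    return regressions


def missing_tests(baseline, candidate):
    """Tests that ran on the baseline build but not on the candidate build."""
    return sorted(set(baseline) - set(candidate))


if __name__ == "__main__":
    standard = {"dbench": 100.0, "kernel-build": 200.0}
    gcc3 = {"dbench": 103.0, "kernel-build": 230.0}
    for test, slowdown in compare_runs(standard, gcc3).items():
        print(f"{test}: {slowdown:.0%} slower than the standard compiler")
```

A test that is missing from the candidate run is reported separately rather
than counted as a regression, since a kernel that fails to boot or finish a
test is a different kind of problem from one that is merely slower.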

Are you stepping forward to integrate this into STP? :-)

> - Dan
--
Timothy D. Witham - Lab Director - [email protected]
Open Source Development Lab Inc - A non-profit corporation
15275 SW Koll Parkway - Suite H - Beaverton OR, 97006
(503)-626-2455 x11 (office) (503)-702-2871 (cell)
(503)-626-2436 (fax)

2001-11-13 04:53:25

by Dan Kegel

Subject: Re: Regression testing of 2.4.x before release?

"Timothy D. Witham" wrote:
>
> On Sun, 2001-11-11 at 22:24, Dan Kegel wrote:
> > At some point it might be nice to also use the STP to help
> > speed gcc 3 development, too. (I personally am really
> > looking forward to the day when I can use the same compiler
> > for both c++ and kernel.)
>
> Strange, I was just talking to somebody about compiler
> performance and regression issues and what sort of automation
> could be done to do that sort of testing.
>
> Since the STP is really a framework and just about any piece
> of software and testing environment could be worked into it.
>
> So I guess you could have two pieces. One that just ran a bunch
> of compile and user level tests and then one that went in and
> checked out the compiler on a kernel tree and then ran the
> same performance tests that had been run using the "standard"
> compiler.

Go/no-go tests, where you make sure a kernel compiled with gcc 3
actually works, might be appropriate for starters. I don't know
if that's been established yet.

> Are you stepping forward to integrate this into STP? :-)

I wish! Alas, tendinitis makes hacking hazardous for me for now.

- Dan

2001-11-13 22:00:46

by Bryce Harrington

Subject: STP for automated GCC testing (was Re: Regression testing of 2.4.x before release?)

On Mon, 12 Nov 2001, Dan Kegel wrote:
> "Timothy D. Witham" wrote:
> >
> > On Sun, 2001-11-11 at 22:24, Dan Kegel wrote:
> > > At some point it might be nice to also use the STP to help
> > > speed gcc 3 development, too. (I personally am really
> > > looking forward to the day when I can use the same compiler
> > > for both c++ and kernel.)
> >
> > Strange, I was just talking to somebody about compiler
> > performance and regression issues and what sort of automation
> > could be done to do that sort of testing.
> >
> > Since the STP is really a framework and just about any piece
> > of software and testing environment could be worked into it.
> >
> > So I guess you could have two pieces. One that just ran a bunch
> > of compile and user level tests and then one that went in and
> > checked out the compiler on a kernel tree and then ran the
> > same performance tests that had been run using the "standard"
> > compiler.
>
> Go/no-go tests, where you make sure a kernel compiled with gcc 3
> actually works, might be appropriate for starters. I don't know
> if that's been established yet.
>
> > Are you stepping forward to integrate this into STP? :-)
>
> I wish! Alas, tendinitis makes hacking hazardous for me for now.
>

Dan, I'd be willing to join in with some of the hacking. I helped
develop STP so am familiar with how it works and how to apply it to
this. I've used/compiled/cursed at gcc for years but am not super
familiar with its internals, so would welcome some advice and help
there.

Would anyone else be interested in joining in on doing this? I think
providing this testing service for gcc would be invaluable for the
community. Basically we need a couple folks with perl scripting and
html form creation skills.

Anyway, if I get a few positive responses and it's okay by Tim, I'll go
ahead and get a mailing list, etc. set up for everyone interested in
joining in on this.

Bryce



2002-01-11 23:14:08

by Daniel Phillips

Subject: Re: Regression testing of 2.4.x before release?

On November 12, 2001 07:24 am, Dan Kegel wrote:
> At some point it might be nice to also use the STP to help
> speed gcc 3 development, too. (I personally am really
> looking forward to the day when I can use the same compiler
> for both c++ and kernel.)

You already can, at least I can because gcc3 builds recent kernels just fine.
IOW, it works for me. Conservatively, it's good to keep the old compiler
around (choose your poison) for those few apps that don't build with gcc, but
I feel quite comfortable at the moment having gcc3 as my default.

--
Daniel

2002-01-12 00:05:31

by M. Edward Borasky

Subject: Re: Regression testing of 2.4.x before release?

On Fri, 11 Jan 2002, Daniel Phillips wrote:

> On November 12, 2001 07:24 am, Dan Kegel wrote:
> > At some point it might be nice to also use the STP to help speed gcc
> > 3 development, too. (I personally am really looking forward to the
> > day when I can use the same compiler for both c++ and kernel.)
>
> You already can, at least I can because gcc3 builds recent kernels
> just fine. IOW, it works for me. Conservatively, it's good to keep
> the old compiler around (choose your poison) for those few apps that
> don't build with gcc, but I feel quite comfortable at the moment
> having gcc3 as my default.

One particular application for which gcc 3.x *and* gcc 2.96.x are
seriously deficient, at least on Intel/AMD 32-bit systems, is the
high-performance linear algebra library Atlas. As a result, *my* default
for compiling numerical applications is the Atlas-recommended one,
2.95.3. For the kernel, I use whatever the Red Hat 7.2 default is.
--
M. Edward Borasky
[email protected]

The COUGAR Project
http://www.borasky-research.com/Cougar.htm

I brought my inner child to "Take Your Child To Work Day."

2002-01-12 00:27:34

by eddantes

Subject: Re: Regression testing of 2.4.x before release?


M. Edward (Ed) Borasky wrote:

[snip]

> One particular application for which gcc 3.x *and* gcc 2.96.x are
> seriously deficient, at least on Intel/AMD 32-bit systems, is the
> high-performance linear algebra library Atlas. As a result, *my* default
> for compiling numerical applications is the Atlas-recommended one,
> 2.95.3. For the kernel, I use whatever the Red Hat 7.2 default is.
>

Mmhh... Just remember gcc 2.96.x is NOT a regular gcc release, you can
check at:
http://www.gnu.org/software/gcc/releases.html
AFAIK, it is a RH-hacked pre-3.0, which is probably not the best thing
to use for anything.

The 3.x series is known to generate pretty slow code, anyway. So I bet
your experience is pretty normal. I still stick with 2.95.[34] for x86
kernel compiles, although I'm using 3.0 for all purposes on Hitachi SH,
as only gcc>=3.0 correctly supports the sh4.

/dantes

2002-01-12 00:35:04

by Kurt Garloff

Subject: [OT] Re: Regression testing of 2.4.x before release?

Hi,

On Fri, Jan 11, 2002 at 04:04:59PM -0800, M. Edward (Ed) Borasky wrote:
> One particular application for which gcc 3.x *and* gcc 2.96.x are
> seriously deficient, at least on Intel/AMD 32-bit systems, is the
> high-performance linear algebra library Atlas. As a result, *my* default
> for compiling numerical applications is the Atlas-recommended one,
> 2.95.3. For the kernel, I use whatever the Red Hat 7.2 default is.

One of the problems with gcc-3 is how it decides when to inline and when
not to. This can hurt numerical code a lot, especially C++.
You may want to use -finline-limit-XXX to tune it.
http://www.garloff.de/kurt/freesoft/gcc/
v1 of my patch went into 3.0.3, some version (don't know which) into
mainline, so 3.0.3 should do better.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE GmbH, Nuernberg, DE SCSI, Security

