2003-01-24 00:01:59

by Austin Gonyou

Subject: Using O(1) scheduler with 600 processes.

I've heard some say that O(1) sched can only really help on systems with
lots and lots of processes.

My systems run about 600 processes max, but they are P4 Xeons with HT,
and we sometimes kick off several hundred processes (sleeping to
running, then back) based on things happening in the system.

I am possibly going to forgo putting O(1)sched in production *right now*
until I've got my patch solid. But I got to thinking, do I need it at
all on an Oracle VLDB?

I think yes, but I wanted to get some opinions/facts before making that
choice to go without O(1) sched.


--
Austin Gonyou
Coremetrics, Inc.


2003-01-24 00:15:18

by Martin J. Bligh

Subject: Re: Using O(1) scheduler with 600 processes.

> I've heard some say that O(1) sched can only really help on systems with
> lots and lots of processes.
>
> But my systems run about 600 processes max, but are P4 Xeons with HT,
> and we kick off several hundred processes sometimes. (sleeping to
> running then back) based on things happening in the system.
>
> I am possibly going to forgo putting O(1)sched in production *right now*
> until I've got my patch solid. But I got to thinking, do I need it at
> all on an Oracle VLDB?
>
> I think yes, but I wanted to get some opinions/facts before making that
> choice to go without O(1) sched.

How many *processors*? Real ones.

M.

2003-01-24 00:15:05

by Perez-Gonzalez, Inaky

Subject: RE: Using O(1) scheduler with 600 processes.


> I think yes, but I wanted to get some opinions/facts before
> making that
> choice to go without O(1) sched.

Just go, it will help. Test it first, though :)

Inaky Perez-Gonzalez -- Not speaking for Intel - opinions are my own [or my
fault]

2003-01-24 01:55:17

by mgross

Subject: Re: Using O(1) scheduler with 600 processes.

You should definitely give it a try.

However, any boost in Oracle throughput from going to the O(1) scheduler may
end up depending on your I/O setup.

I was helping out with a TPC-C benchmark effort last fall for Itanium Oracle
throughput on Red Hat AS. For the longest time the guys with the big-iron
hardware would not move to the newer kernels with the O(1) scheduler. They
had a silly rule of only accepting changes that improved TPC-C throughput.
(Oh, this work was on 4-way Itanium 2s with 32 GB of RAM and a large number
of CLARiiON Fibre Channel disk array towers.)

Anyway, for the longest time the old 2.4.18 kernel with the 4/10/04 ia-64
patch was 10% better than a kernel with the O(1) scheduler. I never quite
figured out what the problem was. I think the difference was in the way
Oracle likes to be on a round-robin scheduler, and the O(1) scheduler tended
to get unlucky more often than the old scheduler for those drive arrays.

However, when we updated the CLARiiON towers to have more drives and to 18K
RPM drives from the 15K drives, all of a sudden the O(1) scheduler beat the
old scheduler.

Your mileage will vary.

Give it a try.

--mgross



On Thursday 23 January 2003 04:10 pm, Austin Gonyou wrote:
> I've heard some say that O(1) sched can only really help on systems with
> lots and lots of processes.
>
> But my systems run about 600 processes max, but are P4 Xeons with HT,
> and we kick off several hundred processes sometimes. (sleeping to
> running then back) based on things happening in the system.
>
> I am possibly going to forgo putting O(1)sched in production *right now*
> until I've got my patch solid. But I got to thinking, do I need it at
> all on an Oracle VLDB?
>
> I think yes, but I wanted to get some opinions/facts before making that
> choice to go without O(1) sched.

2003-01-24 06:01:52

by GrandMasterLee

Subject: Re: Using O(1) scheduler with 600 processes.

On Thu, 2003-01-23 at 18:24, Martin J. Bligh wrote:
> > I've heard some say that O(1) sched can only really help on systems with
> > lots and lots of processes.
> >
> > But my systems run about 600 processes max, but are P4 Xeons with HT,
> > and we kick off several hundred processes sometimes. (sleeping to
> > running then back) based on things happening in the system.
> >
> > I am possibly going to forgo putting O(1)sched in production *right now*
> > until I've got my patch solid. But I got to thinking, do I need it at
> > all on an Oracle VLDB?
> >
> > I think yes, but I wanted to get some opinions/facts before making that
> > choice to go without O(1) sched.
>
> How many *processors*? Real ones.
>
> M.
>

Quad P4 Xeon. Dell 6650

2003-01-24 06:00:25

by GrandMasterLee

Subject: Re: Using O(1) scheduler with 600 processes.

On Thu, 2003-01-23 at 20:05, mgross wrote:
> You should definitely give it a try.
>
> However, any boost in Oracle throughput from going to the O(1) scheduler may
> end up depending on your I/O setup.
>
> I was helping out with a TPC-C benchmark effort last fall for Itanium Oracle
> throughput on Red Hat AS. For the longest time the guys with the big-iron
> hardware would not move to the newer kernels with the O(1) scheduler. They
> had a silly rule of only accepting changes that improved TPC-C throughput.
> (Oh, this work was on 4-way Itanium 2s with 32 GB of RAM and a large number
> of CLARiiON Fibre Channel disk array towers.)

We've got LSI, so it's very similar.

> Anyway, for the longest time the old 2.4.18 kernel with the 4/10/04 ia-64
> patch was 10% better than a kernel with the O(1) scheduler. I never quite
> figured out what the problem was. I think the difference was in the way
> Oracle likes to be on a round-robin scheduler, and the O(1) scheduler tended
> to get unlucky more often than the old scheduler for those drive arrays.
>
> However, when we updated the CLARiiON towers to have more drives and to 18K
> RPM drives from the 15K drives, all of a sudden the O(1) scheduler beat the
> old scheduler.

Well, if I could get a clean patch against 2.4.20, or possibly some help
fixing the one I do have, thanks to Ingo, then we'd have a straight
O(1) sched for 2.4.20. I tried merging the patch that Ingo gave me, and
everything seems OK, but I don't have any menu selection for O(1) stuff
in the kernel config (0 and 100 priority bits).

So I can't tell if it's enabled.


> Your mileage will vary.
>
> Give it a try.
>
> --mgross
>

I agree. In the interest of time, I may have to forgo O(1), but maybe
I'll get lucky. :) *hint*hint* :)

TIA

--
GrandMasterLee

2003-01-24 06:09:34

by Martin J. Bligh

Subject: Re: Using O(1) scheduler with 600 processes.

>> > I've heard some say that O(1) sched can only really help on systems with
>> > lots and lots of processes.
>> >
>> > But my systems run about 600 processes max, but are P4 Xeons with HT,
>> > and we kick off several hundred processes sometimes. (sleeping to
>> > running then back) based on things happening in the system.
>> >
>> > I am possibly going to forgo putting O(1)sched in production *right now*
>> > until I've got my patch solid. But I got to thinking, do I need it at
>> > all on an Oracle VLDB?
>> >
>> > I think yes, but I wanted to get some opinions/facts before making that
>> > choice to go without O(1) sched.
>>
>> How many *processors*? Real ones.
>
> Quad P4 Xeon. Dell 6650

I'd say you definitely want O(1) sched then (or just run -aa or something).
But why don't you just try it and see?

M.


2003-01-24 06:20:18

by GrandMasterLee

Subject: Re: Using O(1) scheduler with 600 processes.

On Fri, 2003-01-24 at 00:18, Martin J. Bligh wrote:
> >> How many *processors*? Real ones.
> >
> > Quad P4 Xeon. Dell 6650
>
> I'd say you definitely want O(1) sched then (or just run -aa or something).
> But why don't you just try it and see?
>
> M.


Heh..Well, I am currently using 2.4.19rc5aa1. We're having some major
stack problems, so I first went through trying to update the XFS
codebase in 2.4.19rc5aa1. That didn't prove very fruitful. I couldn't
even fully reverse the patch for some reason.

So I decided to try 2.4.20aa1 instead, reversing the xfs patches and
then updating with a newer code base, but I hit worse problems reversing
those xfs patches.

So I decided to just roll my own with the known features we use in
production.

2.4.20 + xfs + lvm106 + rmap or aavm + O(1) sched + pte-highmem.

Well, I can easily get rmap+pte-highmem+xfs. Adding O(1) has proven to
be a pain, at least where P4s are concerned. I actually successfully
merged the 2.4.18-o1-p4 optimizations patch, only to have the vmlinux link
fail at the end of the kernel build.

I chased down the problem to an undefined reference to
arch_load_balance, but I can't find anywhere it's actually undefined in
my source. Come to find out that smp_balance.h is only used for P4s
anyway, or so it said, and that's just my target platform.

I'm really close to nailing it, but I don't know where to go from here.

My build errors are here:
http://digitalroadkill.net/public/kernel/

any of the 2.4.20-rmap* error files. The error3 file has the ld error.
And as for building 2.4.20 with the updated patch, I can't even tell if
it's merged right, because there's no menu entry for the prio.

--
GrandMasterLee

2003-01-24 06:39:15

by Martin J. Bligh

Subject: Re: Using O(1) scheduler with 600 processes.

> Heh..Well, I am currently using 2.4.19rc5aa1. We're having some major
> stack problems, so I first went through trying to update the XFS
> codebase in 2.4.19rc5aa1. That didn't prove very fruitful. I couldn't
> even fully reverse the patch for some reason.
>
> So I decided to try 2.4.20aa1 instead, reversing the xfs patches and
> then updating with a newer code base, but I hit worse problems reversing
> those xfs patches.
>
> So I decided to just roll my own with the known features we use in
> production.
>
> 2.4.20 + xfs + lvm106 + rmap or aavm + O(1) sched + pte-highmem.

If you have enough ptes to want pte-highmem, I doubt you want rmap.
pte-chain space consumption will kill you. The calculations are pretty
easy to work out as to what the right solution is for your setup.

M.
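The calculation Martin alludes to can be roughed out in a few lines of Python. This is only a sketch: apart from the 600-process count taken from this thread, every input is an assumption picked for illustration, namely a 2 GiB shared segment mapped in full by every process, 4 KiB pages, 8-byte PTEs (as with PAE), 8 bytes of pte_chain per mapping (one next pointer plus one pte pointer on 32-bit), and roughly 896 MiB of low memory.

```python
# Back-of-the-envelope estimate of page-table and pte_chain overhead.
# All constants are illustrative assumptions, not measurements.
PAGE_SIZE = 4096                 # assumed 4 KiB pages
PTE_BYTES = 8                    # assumed PAE-sized page table entries
PTE_CHAIN_BYTES = 8              # assumed: next pointer + pte pointer on 32-bit
LOWMEM_BYTES = 896 * 1024 ** 2   # assumed low-memory ceiling on 32-bit x86


def mapping_overhead(processes, shared_bytes):
    """Worst case: every process maps every page of the shared segment."""
    pages = shared_bytes // PAGE_SIZE
    mappings = processes * pages
    return {
        "mappings": mappings,
        "pte_bytes": mappings * PTE_BYTES,
        "pte_chain_bytes": mappings * PTE_CHAIN_BYTES,
    }


if __name__ == "__main__":
    result = mapping_overhead(processes=600, shared_bytes=2 * 1024 ** 3)
    gib = 1024 ** 3
    print("mappings:        %d" % result["mappings"])
    print("page tables:     %.2f GiB" % (result["pte_bytes"] / gib))
    print("pte_chains:      %.2f GiB" % (result["pte_chain_bytes"] / gib))
    print("fits in lowmem:  %s" % (result["pte_chain_bytes"] <= LOWMEM_BYTES))
```

Under these assumed inputs the page tables alone come to about 2.3 GiB, which is why they would have to live in highmem, and the pte_chains would need the same amount again in low memory, far more than the assumed 896 MiB. Plug in your own segment size and process count to see where your setup lands.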

2003-01-24 08:42:13

by William Lee Irwin III

Subject: Re: Using O(1) scheduler with 600 processes.

At some point in the past, someone else wrote:
>> So I decided to try 2.4.20aa1 instead, reversing the xfs patches and
>> then updating with a newer code base, but I hit worse problems reversing
>> those xfs patches.
>> So I decided to just roll my own with the known features we use in
>> production.
>> 2.4.20 + xfs + lvm106 + rmap or aavm + O(1) sched + pte-highmem.

On Thu, Jan 23, 2003 at 10:48:19PM -0800, Martin J. Bligh wrote:
> If you have enough ptes to want pte-highmem, I doubt you want rmap.
> pte-chain space consumption will kill you. The calculations are pretty
> easy to work out as to what the right solution is for your setup.

Basically vma-based ptov resolution needs to be implemented for private
anonymous pages, which will require much less ZONE_NORMAL space overhead
as pte_chains may then be chucked.

Dropping physical scanning altogether would be a mistake esp. for boxen
of any appreciable amount of physical locality (NUMA, big highmem
penalties, etc.) or wishing to support any significant number of tasks.


-- wli

2003-01-24 18:11:32

by mgross

Subject: Re: Using O(1) scheduler with 600 processes.

On Thursday 23 January 2003 10:08 pm, GrandMasterLee wrote:
> Well, if I could get a clean patch against 2.4.20, or possibly some help
> fixing the one I do have, thanks to Ingo, then we'd have a straight
> O(1) sched for 2.4.20. I tried merging the patch that Ingo gave me, and
> everything seems OK, but I don't have any menu selection for O(1) stuff
> in the kernel config (0 and 100 priority bits).
>
> So I can't tell if it's enabled.

Do a ps -aux and see if there are any process migration threads. If there
are, then it's running the O(1) scheduler.
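A minimal sketch of that check in Python, in case you want to script it. The thread names in the pattern are an assumption: `migration_CPU0`-style names for 2.4 kernels carrying the O(1) patch and `migration/0`-style names for 2.5 and later. The helper function names are made up for this example.

```python
import re
import subprocess

# Assumed thread naming: "migration_CPU<n>" (2.4 + O(1) patch) or
# "migration/<n>" (2.5 and later). Adjust if your kernel differs.
MIGRATION_RE = re.compile(r"\bmigration(?:_CPU|/)\d+")


def migration_threads(ps_output):
    """Return the distinct migration thread names found in ps-style text."""
    return sorted({m.group(0) for m in MIGRATION_RE.finditer(ps_output)})


def running_o1_scheduler():
    """Run ps and report whether any migration threads are present."""
    out = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return bool(migration_threads(out))


if __name__ == "__main__":
    print("O(1) scheduler detected:", running_o1_scheduler())
```

Finding no migration threads does not prove much on a uniprocessor kernel, since these threads exist to move tasks between CPUs, so treat a negative result with some care.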

>
> > Your mileage will vary.
> >
> > Give it a try.
> >
> > --mgross
> >
>
> I agree. In the interest of time, I may have to forgo O(1), but maybe
> I'll get lucky. :) *hint*hint* :)

You really should try the O(1) scheduler. 600 processes is a lot; we had ~100
for our benchmarks, so it wasn't as big an overhead for the old scheduler.
(Running Itanium 2s didn't hurt either ;)

You're running Xeons with more processes, so you are more likely to see a
benefit from the O(1) scheduler.

--mgross
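To illustrate why the process count matters, here is a toy Python sketch of the two selection strategies. It is not kernel code: the old scheduler is reduced to a linear scan over the runnable tasks, and the O(1) scheduler to one priority array of 140 queues plus a bitmap. The real O(1) scheduler also keeps per-CPU runqueues and separate active and expired arrays, which are left out here, and the class and task names are invented for the example.

```python
from collections import deque

NUM_PRIOS = 140  # priority levels in the O(1) scheduler (0 is highest)


class PrioArray:
    """Toy priority array: one FIFO queue per priority plus a bitmap."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Find the lowest set bit; cost does not grow with the task count."""
        if not self.bitmap:
            return None
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[prio]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << prio)
        return task


def pick_next_by_scan(runnable):
    """Old-style selection: look at every runnable task, so cost is O(n)."""
    if not runnable:
        return None
    return min(runnable, key=lambda item: item[1])[0]


if __name__ == "__main__":
    array = PrioArray()
    for i in range(600):
        array.enqueue("oracle-%d" % i, 120)
    array.enqueue("urgent", 100)
    print(array.pick_next())
```

With 600 runnable tasks the scan touches all 600 entries on every pick, while the bitmap lookup does the same fixed amount of work it would do for 6.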

2003-01-24 21:35:35

by GrandMasterLee

Subject: Re: Using O(1) scheduler with 600 processes.

On Fri, 2003-01-24 at 12:22, mgross wrote:
> On Thursday 23 January 2003 10:08 pm, GrandMasterLee wrote:
> > Well, if I could get a clean patch against 2.4.20, or possibly some help
> > fixing the one I do have, thanks to Ingo, then we'd have a straight
> > O(1) sched for 2.4.20. I tried merging the patch that Ingo gave me, and
> > everything seems OK, but I don't have any menu selection for O(1) stuff
> > in the kernel config (0 and 100 priority bits).
> >
> > So I can't tell if it's enabled.
>
> Do a ps -aux and see if there are any process migration threads. If there
> are, then it's running the O(1) scheduler.
>
> >
> > > Your mileage will vary.
> > >
> > > Give it a try.
> > >
> > > --mgross
> > >
> >
> > I agree. In the interest of time, I may have to forgo O(1), but maybe
> > I'll get lucky. :) *hint*hint* :)
>
> You really should try the O(1) scheduler. 600 processes is a lot; we had ~100
> for our benchmarks, so it wasn't as big an overhead for the old scheduler.
> (Running Itanium 2s didn't hurt either ;)
>
> You're running Xeons with more processes, so you are more likely to see a
> benefit from the O(1) scheduler.
>
> --mgross

OK... that's good feedback. If someone could help me sort out my patch
problems, I'd be happy to integrate it. But as WLI pointed out, 2.5 has
what I need, 2.4 doesn't, and thus, more effort, seemingly, is directed
at fixing O(1) for 2.5, versus backporting to 2.4.


--
GrandMasterLee