2001-04-04 17:03:21

by Hubertus Franke

[permalink] [raw]
Subject: Re: [Lse-tech] Re: a quest for a better scheduler



Kanoj, our cpu-pooling + loadbalancing allows you to do that.
The system administrator can specify at runtime, through a
/proc filesystem interface, the cpu-pool size and whether load
balancing should take place.
We can put the limiting to the local cpu-set during reschedule_idle
back into the code, to make it complete and compatible with
the approach that Andrea has taken.
This way, one can fully isolate or combine cpu-sets.
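
To make the reschedule_idle part concrete, here is a minimal sketch of
what "limiting to the local cpu-set" could look like. The names
cpu_pool_mask[], p->pool_id and reschedule_idle_pooled() are purely
illustrative; the real interface is in the MQpool/loadbalance code
linked below.

    /*
     * Sketch only: restrict the idle-CPU scan during wakeup to the CPUs
     * in the waking task's pool.  cpu_pool_mask[] and p->pool_id are
     * hypothetical helpers, not part of the posted patches.
     */
    static inline unsigned long pool_mask_of(struct task_struct *p)
    {
            return cpu_pool_mask[p->pool_id];  /* bitmask of CPUs in the pool */
    }

    static void reschedule_idle_pooled(struct task_struct *p)
    {
            unsigned long allowed = p->cpus_allowed & pool_mask_of(p);
            int cpu;

            for (cpu = 0; cpu < smp_num_cpus; cpu++) {
                    if (!(allowed & (1UL << cpu)))
                            continue;          /* CPU outside this task's pool */
                    if (cpu_curr(cpu) == idle_task(cpu)) {
                            smp_send_reschedule(cpu);  /* idle CPU found in-pool */
                            return;
                    }
            }
            /* no idle CPU in the pool: fall through to the usual
             * preemption (goodness) logic, restricted to 'allowed' */
    }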

Here is the code for the pooling:
http://lse.sourceforge.net/scheduling/LB/2.4.1-MQpool

Load balancing and the /proc interface are combined in this module:
http://lse.sourceforge.net/scheduling/LB/loadbalance.c

A writeup explaining this concept is available at
http://lse.sourceforge.net/scheduling/LB/poolMQ.html

Prerequisite is the MQ scheduler...
http://lse.sourceforge.net/scheduling/2.4.1.mq1-sched

We need to update these for 2.4.3 .... (coming)

Hubertus Franke
Enterprise Linux Group (Mgr), Linux Technology Center (Member Scalability), OS-PIC (Chair)
email: [email protected]
(w) 914-945-2003 (fax) 914-945-4425 TL: 862-2003



Kanoj Sarcar <[email protected]>@lists.sourceforge.net on
04/04/2001 12:50:58 PM

Sent by: [email protected]


To: [email protected] (Andrea Arcangeli)
cc: [email protected] (Ingo Molnar), Hubertus Franke/Watson/IBM@IBMUS,
[email protected] (Mike Kravetz), [email protected] (Fabio
Riccardi), [email protected] (Linux Kernel List),
[email protected]
Subject: Re: [Lse-tech] Re: a quest for a better scheduler



>
> I didn't see anything from Kanoj but I did something myself for the wildfire:
>
> ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.3aa1/10_numa-sched-1
>
> this is mostly a userspace issue, not really intended as a kernel optimization
> (however it's also partly a kernel optimization). Basically it splits the load
> of the numa machine into per-node load; there can be unbalanced load across the
> nodes, but fairness is guaranteed inside each node. It's not extremely well
> tested, but benchmarks were ok and it is at least certainly stable.
>

Just a quick comment. Andrea, unless your machine has some hardware
that implies per-node runqueues will help (node-level caches etc.), I fail
to understand how this is helping you ... here's a simple theory, though.
If your system is lightly loaded, your per-node queues are actually
implementing some sort of affinity, making sure processes stick to
CPUs on the nodes where they have allocated most of their memory. I am
not sure what the situation will be under huge loads, though.
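
To illustrate what I mean, here is a sketch of the effect only, not
Andrea's actual 10_numa-sched-1 code; NODE_OF() and node_cpu_mask[]
are made-up helpers.

    /*
     * Sketch of per-node wakeup: only CPUs on the node the task last ran
     * on are considered, so under light load the task stays close to the
     * memory it allocated there.  All names here are illustrative.
     */
    static int wakeup_cpu_on_home_node(struct task_struct *p)
    {
            int node = NODE_OF(p->processor);        /* node we last ran on */
            unsigned long candidates = p->cpus_allowed & node_cpu_mask[node];
            int cpu;

            for (cpu = 0; cpu < smp_num_cpus; cpu++) {
                    if (!(candidates & (1UL << cpu)))
                            continue;
                    if (cpu_curr(cpu) == idle_task(cpu))
                            return cpu;              /* idle CPU on the home node */
            }
            return p->processor;                     /* otherwise stay put */
    }

Under heavy load every CPU on the node is busy and the loop falls
through, which is exactly the case I am unsure about.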

As I have mentioned to some people before, per-cpu/per-node/per-cpuset/global
runqueues probably all have their advantages and disadvantages, and their
own sweet spots. Wouldn't it be really neat if a system administrator
or performance expert could pick and choose what scheduler behavior he
wants, based on how the system is going to be used?

Kanoj





2001-04-04 17:15:03

by Kanoj Sarcar

[permalink] [raw]
Subject: Re: [Lse-tech] Re: a quest for a better scheduler

>
>
>
> Kanoj, our cpu-pooling + loadbalancing allows you to do that.
> The system administrator can specify at runtime, through a
> /proc filesystem interface, the cpu-pool size and whether load
> balancing should take place.

Yes, I think this approach can support the various requirements
put on the scheduler.

I think there are two degrees of freedom that are needed in the
scheduler. One, as you say, is for the sysadmin to be able to specify
what overall scheduler behavior he wants.

Secondly, from the kernel standpoint, there need to be per-arch
hooks, to be able to utilize node-level/multilevel caches, NUMA
aspects, etc.
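
Something along these lines is what I have in mind; none of these names
exist in 2.4 today, it is just a shape for the hook.

    /*
     * Hypothetical per-arch hook, sketch only: the generic scheduler
     * would call into this, and each architecture could bias CPU
     * selection by node-level cache / NUMA distance.
     */
    struct sched_arch_ops {
            /* pick the best CPU for p out of the 'allowed' bitmask */
            int  (*best_cpu)(struct task_struct *p, unsigned long allowed);
            /* optional: rebalance load within / across nodes */
            void (*balance)(int this_cpu);
    };

    extern struct sched_arch_ops sched_arch;         /* filled in per arch */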

Kanoj