2002-01-12 19:16:37

by James C. Owens

[permalink] [raw]
Subject: O(1) scheduler ver H6 - more straightforward timeslice macros

Ingo,

I like the new scheduler. It seems like the timeslice macros in sched.h
could be more straighforward - i.e. instead of

#define PRIO_TO_TIMESLICE(p) \
((( (MAX_USER_PRIO-1-USER_PRIO(p))*(MAX_TIMESLICE-MIN_TIMESLICE) + \
MAX_USER_PRIO-1) / MAX_USER_PRIO) + MIN_TIMESLICE)

#define RT_PRIO_TO_TIMESLICE(p) \
((( (MAX_RT_PRIO-(p)-1)*(MAX_TIMESLICE-MIN_TIMESLICE) + \
MAX_RT_PRIO-1) / MAX_RT_PRIO) + MIN_TIMESLICE)

why not

#define PRIO_TO_TIMESLICE(p) \
(MAX_TIMESLICE -
(USER_PRIO(p)/(MAX_USER_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))

#define RT_PRIO_TO_TIMESLICE(p) \
(MAX_TIMESLICE - (p/(MAX_RT_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))


The second way seems simpler to me, and really illustrates what you are
doing in a more straightforward manner.

I also cleaned up some of the comments. The sched.h diff between the H6
version of the scheduler applied to 2.4.18-pre3 and vanilla 2.4.18-pre3
follows: (Note that I changed the min and max timeslices to 20 and 100
for my own use.)


474c474

< * is for SCHED_OTHER tasks.
---
> * is for SCHED_OTHER tasks. (Max Priority is 168.)
481c481
< * to static priority [ 24 ... 63 (MAX_PRIO-1) ]
---
> * to static priority [ 128 ... 167 (MAX_PRIO-1) ]
483,484c483,484
< * User-nice value of -20 == static priority 24, and
< * user-nice value 19 == static priority 63. The lower
---
> * User-nice value of -20 == static priority 128, and
> * user-nice value 19 == static priority 167. The lower
486,488d485
< *
< * Note that while static priority cannot go below 24,
< * the priority of a process can go as low as 0.
495,496c492,493
< * Default timeslice is 90 msecs, maximum is 150 msecs.
< * Minimum timeslice is 30 msecs.
---
> * Default timeslice is 60 msecs; maximum is 100 msecs.
> * Minimum timeslice is 20 msecs.
498,499c495,496
< #define MIN_TIMESLICE ( 30 * HZ / 1000)
< #define MAX_TIMESLICE (150 * HZ / 1000)
---
> #define MIN_TIMESLICE (20 * HZ / 1000)
> #define MAX_TIMESLICE (100 * HZ / 1000)
500a498,503
> /*
> * Scales priority values to user priority values.
> * This means nice of -20 => p of 128 => user priority of 0.
> * This means nice of +19 => p of 167 => user priority of 39.
> * MAX_USER_PRIO is 40 which would be nice of +20.
> */
505,506c508,512
< * PRIO_TO_TIMESLICE scales priority values [ 100 ... 139 ]
< * to initial time slice values [ MAX_TIMESLICE (150 msec) ... 2 ]
---
> * PRIO_TO_TIMESLICE scales priority values [ 128 ... 167 ]
> * to initial time slice values [ MAX_TIMESLICE ... MIN_TIMESLICE ]
> *
> * RT_PRIO_TO_TIMESLICE scales priority values [ 0 ... 127 ]
> * to initial time slice values [ MAX_TIMESLICE ... MIN_TIMESLICE ]
508,509c514,515
< * The higher a process's priority, the bigger timeslices
< * it gets during one round of execution. But even the lowest
---
> * The numerically lower a process's priority, the bigger timeslices
> * it gets during one round of execution. But even the numerically highest
513,514c519
< ((( (MAX_USER_PRIO-1-USER_PRIO(p))*(MAX_TIMESLICE-MIN_TIMESLICE) + \
< MAX_USER_PRIO-1) / MAX_USER_PRIO) + MIN_TIMESLICE)
---
> (MAX_TIMESLICE -
(USER_PRIO(p)/(MAX_USER_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
517,518c522
< ((( (MAX_RT_PRIO-(p)-1)*(MAX_TIMESLICE-MIN_TIMESLICE) + \
< MAX_RT_PRIO-1) / MAX_RT_PRIO) + MIN_TIMESLICE)
---
> (MAX_TIMESLICE - (p/(MAX_RT_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))


To lkml - please cc me on any response, as I do not subscribe to the
lkml - I read it via a news gateway.


Jim Owens
SuSE Linux 6.4 (kernel 2.4.18-pre3)
Tyan Tiger MP 2xAthlon MP 1600+
1.25 GB RAM


2002-01-12 22:00:17

by Matti Aarnio

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros

On Sat, Jan 12, 2002 at 02:16:08PM -0500, James C. Owens wrote:
> Ingo,
>
> I like the new scheduler. It seems like the timeslice macros in sched.h
> could be more straighforward - i.e. instead of

(I quote too much, but to illustriate the point...)

> #define PRIO_TO_TIMESLICE(p) \
> ((( (MAX_USER_PRIO-1-USER_PRIO(p))*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> MAX_USER_PRIO-1) / MAX_USER_PRIO) + MIN_TIMESLICE)
>
> #define RT_PRIO_TO_TIMESLICE(p) \
> ((( (MAX_RT_PRIO-(p)-1)*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> MAX_RT_PRIO-1) / MAX_RT_PRIO) + MIN_TIMESLICE)
>
> why not
>
> #define PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE -
> (USER_PRIO(p)/(MAX_USER_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
>
> #define RT_PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE - (p/(MAX_RT_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
>
>
> The second way seems simpler to me, and really illustrates what you are
> doing in a more straightforward manner.

Except that the math is INTEGER, not floating-point,
which means that this way you loose precission.

You HAVE TO do multiplications first, only then (finally) the division.

Depending the value-spaces, small-enough value-spaces might be
turnable into table mappings. However that has lots of dependencies
in hardware architecture, e.g. memory access speeds, cache pollution,
speed of multiply/divide operations, etc.

If dividers/multipliers are constants, and powers of two, the math
can happen with constant shifts, which are fast at all systems.
If not, things get rather complicated. (And thus a careless -1,
or lack of one, may be costly.)

> I also cleaned up some of the comments. The sched.h diff between the H6
> version of the scheduler applied to 2.4.18-pre3 and vanilla 2.4.18-pre3
> follows: (Note that I changed the min and max timeslices to 20 and 100
> for my own use.)
...

"uni-diff please, use '-u' option for the diff"

> To lkml - please cc me on any response, as I do not subscribe to the
> lkml - I read it via a news gateway.
>
>
> Jim Owens

/Matti Aarnio

2002-01-12 22:19:26

by James C. Owens

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros

> -----Original Message-----
> From: Matti Aarnio [mailto:[email protected]]
> Sent: Saturday, January 12, 2002 5:00 PM
> To: James C. Owens
> Cc: [email protected]; [email protected]
> Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice
> macros
>
>
> On Sat, Jan 12, 2002 at 02:16:08PM -0500, James C. Owens wrote:
> > Ingo,
> >
> > I like the new scheduler. It seems like the timeslice
> macros in sched.h
> > could be more straighforward - i.e. instead of
>
> (I quote too much, but to illustriate the point...)
>
> > #define PRIO_TO_TIMESLICE(p) \
> > (((
> (MAX_USER_PRIO-1-USER_PRIO(p))*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> > MAX_USER_PRIO-1) / MAX_USER_PRIO) + MIN_TIMESLICE)
> >
> > #define RT_PRIO_TO_TIMESLICE(p) \
> > ((( (MAX_RT_PRIO-(p)-1)*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> > MAX_RT_PRIO-1) / MAX_RT_PRIO) + MIN_TIMESLICE)
> >
> > why not
> >
> > #define PRIO_TO_TIMESLICE(p) \
> > (MAX_TIMESLICE -
> > (USER_PRIO(p)/(MAX_USER_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
> >
> > #define RT_PRIO_TO_TIMESLICE(p) \
> > (MAX_TIMESLICE -
> (p/(MAX_RT_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
> >
> >
> > The second way seems simpler to me, and really illustrates
> what you are
> > doing in a more straightforward manner.
>
> Except that the math is INTEGER, not floating-point,
> which means that this way you loose precission.
>
> You HAVE TO do multiplications first, only then (finally)
> the division.
>
> Depending the value-spaces, small-enough value-spaces might be
> turnable into table mappings. However that has lots of
> dependencies
> in hardware architecture, e.g. memory access speeds, cache
> pollution,
> speed of multiply/divide operations, etc.
>
> If dividers/multipliers are constants, and powers of two, the math
> can happen with constant shifts, which are fast at all systems.
> If not, things get rather complicated. (And thus a careless -1,
> or lack of one, may be costly.)
>
[snip]
> /Matti Aarnio
>

Point well made. How about

#define PRIO_TO_TIMESLICE(p) \
(MAX_TIMESLICE -
((USER_PRIO(p)*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_USER_PRIO-1)))

#define RT_PRIO_TO_TIMESLICE(p) \
(MAX_TIMESLICE - ((p*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_RT_PRIO-1)))

If people agree with this, I'll submit a new diff (with the right options).

Jim Owens


2002-01-13 16:11:33

by Ingo Molnar

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros


On Sat, 12 Jan 2002, James C. Owens wrote:

> Point well made. How about
>
> #define PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE -
> ((USER_PRIO(p)*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_USER_PRIO-1)))
>
> #define RT_PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE - ((p*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_RT_PRIO-1)))

the macros are still not equivalent. Try HZ = 100 and nice == -17 for
example.

Ingo

2002-01-13 18:43:17

by James C. Owens

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On
> Behalf Of Ingo Molnar
> Sent: Sunday, January 13, 2002 1:09 PM
> To: James C. Owens
> Cc: 'Matti Aarnio'; [email protected]
> Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice
> macros

[snip]
>
> the macros are still not equivalent. Try HZ = 100 and nice == -17 for
> example.
>
> Ingo
>
>

I agree they are not quite equivalent. I have to point out that your macro
definition never delivers max timeslice for the valid nice range. I wrote a
short program to compare the two. Here is the output. Your macro is the
"orig timeslice", the one I am suggesting is "new timeslice". I used the
same values 150 and 30 for max and min timeslice, respectively, that you do
in the sched.h in -H6.

What I am suggesting is that (1) my suggested expressions actually deliver
the correct timeslice range, (2) they clearly show the decrement from
max-timeslice to min-timeslice as p is numerically increased, and (3) they
have a smaller number of arithmetic operators, five in mine versus eight in
the originals. Since these expressions get executed during a significant
fraction of scheduler_tick(task_t *p) calls, which are at HZ frequency, I
would think it would make sense to make them as simple as possible.

nice -20, orig timeslice 147, new timeslice 150, difference 3
nice -19, orig timeslice 144, new timeslice 147, difference 3
nice -18, orig timeslice 141, new timeslice 144, difference 3
nice -17, orig timeslice 138, new timeslice 141, difference 3
nice -16, orig timeslice 135, new timeslice 138, difference 3
nice -15, orig timeslice 132, new timeslice 135, difference 3
nice -14, orig timeslice 129, new timeslice 132, difference 3
nice -13, orig timeslice 126, new timeslice 129, difference 3
nice -12, orig timeslice 123, new timeslice 126, difference 3
nice -11, orig timeslice 120, new timeslice 123, difference 3
nice -10, orig timeslice 117, new timeslice 120, difference 3
nice -9, orig timeslice 114, new timeslice 117, difference 3
nice -8, orig timeslice 111, new timeslice 114, difference 3
nice -7, orig timeslice 108, new timeslice 110, difference 2
nice -6, orig timeslice 105, new timeslice 107, difference 2
nice -5, orig timeslice 102, new timeslice 104, difference 2
nice -4, orig timeslice 99, new timeslice 101, difference 2
nice -3, orig timeslice 96, new timeslice 98, difference 2
nice -2, orig timeslice 93, new timeslice 95, difference 2
nice -1, orig timeslice 90, new timeslice 92, difference 2
nice 0, orig timeslice 87, new timeslice 89, difference 2
nice 1, orig timeslice 84, new timeslice 86, difference 2
nice 2, orig timeslice 81, new timeslice 83, difference 2
nice 3, orig timeslice 78, new timeslice 80, difference 2
nice 4, orig timeslice 75, new timeslice 77, difference 2
nice 5, orig timeslice 72, new timeslice 74, difference 2
nice 6, orig timeslice 69, new timeslice 70, difference 1
nice 7, orig timeslice 66, new timeslice 67, difference 1
nice 8, orig timeslice 63, new timeslice 64, difference 1
nice 9, orig timeslice 60, new timeslice 61, difference 1
nice 10, orig timeslice 57, new timeslice 58, difference 1
nice 11, orig timeslice 54, new timeslice 55, difference 1
nice 12, orig timeslice 51, new timeslice 52, difference 1
nice 13, orig timeslice 48, new timeslice 49, difference 1
nice 14, orig timeslice 45, new timeslice 46, difference 1
nice 15, orig timeslice 42, new timeslice 43, difference 1
nice 16, orig timeslice 39, new timeslice 40, difference 1
nice 17, orig timeslice 36, new timeslice 37, difference 1
nice 18, orig timeslice 33, new timeslice 34, difference 1
nice 19, orig timeslice 30, new timeslice 30, difference 0

2002-01-13 18:58:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros


On Sun, 13 Jan 2002, James C. Owens wrote:

> I agree they are not quite equivalent. I have to point out that your
> macro definition never delivers max timeslice for the valid nice
> range. [...]

agreed - i took your macro definitions and they are in my tree already.
Good work and thanks!

Ingo

2002-01-14 19:02:53

by George Anzinger

[permalink] [raw]
Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice macros

"James C. Owens" wrote:
>
> > -----Original Message-----
> > From: Matti Aarnio [mailto:[email protected]]
> > Sent: Saturday, January 12, 2002 5:00 PM
> > To: James C. Owens
> > Cc: [email protected]; [email protected]
> > Subject: Re: O(1) scheduler ver H6 - more straightforward timeslice
> > macros
> >
> >
> > On Sat, Jan 12, 2002 at 02:16:08PM -0500, James C. Owens wrote:
> > > Ingo,
> > >
> > > I like the new scheduler. It seems like the timeslice
> > macros in sched.h
> > > could be more straighforward - i.e. instead of
> >
> > (I quote too much, but to illustriate the point...)
> >
> > > #define PRIO_TO_TIMESLICE(p) \
> > > (((
> > (MAX_USER_PRIO-1-USER_PRIO(p))*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> > > MAX_USER_PRIO-1) / MAX_USER_PRIO) + MIN_TIMESLICE)
> > >
> > > #define RT_PRIO_TO_TIMESLICE(p) \
> > > ((( (MAX_RT_PRIO-(p)-1)*(MAX_TIMESLICE-MIN_TIMESLICE) + \
> > > MAX_RT_PRIO-1) / MAX_RT_PRIO) + MIN_TIMESLICE)
> > >
> > > why not
> > >
> > > #define PRIO_TO_TIMESLICE(p) \
> > > (MAX_TIMESLICE -
> > > (USER_PRIO(p)/(MAX_USER_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
> > >
> > > #define RT_PRIO_TO_TIMESLICE(p) \
> > > (MAX_TIMESLICE -
> > (p/(MAX_RT_PRIO-1))*(MAX_TIMESLICE-MIN_TIMESLICE))
> > >
> > >
> > > The second way seems simpler to me, and really illustrates
> > what you are
> > > doing in a more straightforward manner.
> >
> > Except that the math is INTEGER, not floating-point,
> > which means that this way you loose precission.
> >
> > You HAVE TO do multiplications first, only then (finally)
> > the division.
> >
> > Depending the value-spaces, small-enough value-spaces might be
> > turnable into table mappings. However that has lots of
> > dependencies
> > in hardware architecture, e.g. memory access speeds, cache
> > pollution,
> > speed of multiply/divide operations, etc.
> >
> > If dividers/multipliers are constants, and powers of two, the math
> > can happen with constant shifts, which are fast at all systems.
> > If not, things get rather complicated. (And thus a careless -1,
> > or lack of one, may be costly.)
> >
> [snip]
> > /Matti Aarnio
> >
>
> Point well made. How about
>
> #define PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE -
> ((USER_PRIO(p)*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_USER_PRIO-1)))
>
> #define RT_PRIO_TO_TIMESLICE(p) \
> (MAX_TIMESLICE - ((p*(MAX_TIMESLICE-MIN_TIMESLICE))/(MAX_RT_PRIO-1)))
>
> If people agree with this, I'll submit a new diff (with the right options).
>
> Jim Owens

If performance is important here we could eliminate the "/" thusly:

#define SC 24 // Change as needed to get precision and fit in 32 bits
#define SC_USER_FACTOR
((MAX_TIMESLICE-MIN_TIMESLICE)<<SC)/(MAX_USER_PRIO-1) // a constant
#define PRIO_TO_TIMESLICE(p) \
(MAX_TIMESLICE - ((USER_PRIO(p) * (SC_USER_FACTOR))>>SC

To eliminate the ">>" one could go to SC=32 and then just take the high
32-bits of the mpy. Taking the high 32-bits of mpy (which gives
64-bits) requires asm, but SC_USER_FACTOR would be:
#define SC_USER_FACTOR \
(long)((long long) (MAX_TIMESLICE-MIN_TIMESLICE)<<SC)/(long
long)(MAX_USER_PRIO-1)

SC=32 can be used as long as
(MAX_TIMESLICE-MIN_TIMESLICE)<(MAX_USER_PRIO-1)
--
George [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/