2001-03-07 17:40:48

by Oswald Buddenhagen

[permalink] [raw]
Subject: static scheduling - SCHED_IDLE?

hi there,

i found, that linux is missing a static low-priority scheduling class
(or did i miss something? in this case feel free to stomp me into the
ground :). it would be ideal for typical number-crunchers running in
the background like the different distributed.net-like clients.
within this class, static priorities like those for SCHED_RR would be
required, so one could have for example such a priorization:

[-] idle-task/apmd (here never running)
[idle/1] distributed.net client or other toy (when the machine is really idle)
[idle/50] mp3 compressor (when no "real" work has to be done)
[other/nice 10] not time-critical job
[other/nice 0] the usual jobs, like a make process, etc.
[other/nice -10] x-server, etc.
[rr/50] some real-time app, like a computer-controlled toaster :)

what do you recon?

best regards

--
Hi! I'm a .signature virus! Copy me into your ~/.signature, please!
--
Nothing is fool-proof to a sufficiently talented fool.


2001-03-07 18:05:27

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Wed, 7 Mar 2001, Oswald Buddenhagen wrote:

> i found, that linux is missing a static low-priority scheduling class
> (or did i miss something? in this case feel free to stomp me into the
> ground :). it would be ideal for typical number-crunchers running in
> the background like the different distributed.net-like clients.

The problem with these things it that sometimes such a task may
hold a lock, which can prevent higher-priority tasks from running.

A solution would be to make sure that these tasks get at least one
time slice every 3 seconds or so, so they can release any locks
they might be holding and the system as a whole won't livelock.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-07 19:21:06

by Oswald Buddenhagen

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

> The problem with these things it that sometimes such a task may hold
> a lock, which can prevent higher-priority tasks from running.
>
true ... three ideas:
- a sort of temporary priority elevation (the opposite of SCHED_YIELD)
as long as the process holds some lock
- automatically schedule the task, if some higher-priorized task wants
the lock
- preventing the processes from aquiring locks at all (obviously this
is not possible for required locks inside the kernel, but i don't
know enough about this)

> A solution would be to make sure that these tasks get at least one
> time slice every 3 seconds or so, so they can release any locks
> they might be holding and the system as a whole won't livelock.
>
did "these" apply only to the tasks, that actually hold a lock?
if not, then i don't like this idea, as it gives the processes
time for the only reason, that it _might_ hold a lock. this basically
undermines the idea of static classes. in this case, we could actually
just make the "nice" scale incredibly large and possibly nonlinear,
as mark suggested.

best regards

--
Hi! I'm a .signature virus! Copy me into your ~/.signature, please!
--
Nothing is fool-proof to a sufficiently talented fool.

2001-03-08 00:30:21

by ludovic

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Oswald Buddenhagen wrote:
>
> > The problem with these things it that sometimes such a task may hold
> > a lock, which can prevent higher-priority tasks from running.
> >
> true ... three ideas:
> - a sort of temporary priority elevation (the opposite of SCHED_YIELD)
> as long as the process holds some lock
> - automatically schedule the task, if some higher-priorized task wants
> the lock
> - preventing the processes from aquiring locks at all (obviously this
> is not possible for required locks inside the kernel, but i don't
> know enough about this)
>
> > A solution would be to make sure that these tasks get at least one
> > time slice every 3 seconds or so, so they can release any locks
> > they might be holding and the system as a whole won't livelock.
> >
> did "these" apply only to the tasks, that actually hold a lock?
> if not, then i don't like this idea, as it gives the processes
> time for the only reason, that it _might_ hold a lock. this basically
> undermines the idea of static classes. in this case, we could actually
> just make the "nice" scale incredibly large and possibly nonlinear,
> as mark suggested.
>

Since the linux kernel is not preemptive, the problem is a little
bit more complicated; A low priority kernel thread won't lose the
CPU while holding a lock except if it wants to. That simplifies the
locking problem you mention but the idea of background low priority
threads that run when the machine is really idle is also not this
simple.

Ludo.

2001-03-08 11:18:31

by Zdenek Kabelac

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

ludovic wrote:
>
> Oswald Buddenhagen wrote:
> >
> > > The problem with these things it that sometimes such a task may hold
> > > a lock, which can prevent higher-priority tasks from running.
> > >
> > true ... three ideas:
> > - a sort of temporary priority elevation (the opposite of SCHED_YIELD)
> > as long as the process holds some lock
> > - automatically schedule the task, if some higher-priorized task wants
> > the lock
> > - preventing the processes from aquiring locks at all (obviously this
> > is not possible for required locks inside the kernel, but i don't
> > know enough about this)
> >
> > > A solution would be to make sure that these tasks get at least one
> > > time slice every 3 seconds or so, so they can release any locks
> > > they might be holding and the system as a whole won't livelock.
> > >
> > did "these" apply only to the tasks, that actually hold a lock?
> > if not, then i don't like this idea, as it gives the processes
> > time for the only reason, that it _might_ hold a lock. this basically
> > undermines the idea of static classes. in this case, we could actually
> > just make the "nice" scale incredibly large and possibly nonlinear,
> > as mark suggested.
> >
>
> Since the linux kernel is not preemptive, the problem is a little
> bit more complicated; A low priority kernel thread won't lose the
> CPU while holding a lock except if it wants to. That simplifies the
> locking problem you mention but the idea of background low priority
> threads that run when the machine is really idle is also not this
> simple.

You seem to have a sence for black humor right :) ?
As this is purely a complete nonsence
- you were talking about M$Win3.11 right ?
(are you really the employ of Sun ??)

2001-03-08 11:44:02

by Andrew Morton

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Zdenek Kabelac wrote:
>
> > Since the linux kernel is not preemptive, the problem is a little
> > bit more complicated; A low priority kernel thread won't lose the
> > CPU while holding a lock except if it wants to. That simplifies the
> > locking problem you mention but the idea of background low priority
> > threads that run when the machine is really idle is also not this
> > simple.
>
> You seem to have a sence for black humor right :) ?
> As this is purely a complete nonsence
> - you were talking about M$Win3.11 right ?
> (are you really the employ of Sun ??)

awww.. Don't say that. Ludovic is a nice guy.

Look. Suppose you have a SCHED_IDLE task which does this,
in the kernel:

down(&sem1);
down(&sem2); /* This sleeps */

Now, a SCHED_OTHER task does this, in user space:

for ( ; ; )
;

We're dead. The SCHED_IDLE task will never be scheduled,
and hence will never release sem1. The solution to this
problem is well known but, as Ludovic says, "not simple".

-

2001-03-08 13:32:30

by Boris Dragovic

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

> did "these" apply only to the tasks, that actually hold a lock?
> if not, then i don't like this idea, as it gives the processes
> time for the only reason, that it _might_ hold a lock. this basically
> undermines the idea of static classes. in this case, we could actually
> just make the "nice" scale incredibly large and possibly nonlinear,
> as mark suggested.

would it be possible to subqueue tasks that are holding a lock so that
they get some guaranteed amount of cpu and just leave other to be executed
when processor really idle?

lynx

2001-03-08 13:46:50

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Thu, 8 Mar 2001, Boris Dragovic wrote:

> > did "these" apply only to the tasks, that actually hold a lock?
> > if not, then i don't like this idea, as it gives the processes
> > time for the only reason, that it _might_ hold a lock. this basically
> > undermines the idea of static classes. in this case, we could actually
> > just make the "nice" scale incredibly large and possibly nonlinear,
> > as mark suggested.
>
> would it be possible to subqueue tasks that are holding a lock
> so that they get some guaranteed amount of cpu and just leave
> other to be executed when processor really idle?

Of course. Now we just need the code to determine when a task
is holding some kernel-side lock ;)

regrads,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-08 20:23:08

by Boris Dragovic

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

>
> Of course. Now we just need the code to determine when a task
> is holding some kernel-side lock ;)

couldn't it just be indicated on actual locking the resource?

lynx


2001-03-08 20:56:08

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Thu, 8 Mar 2001, Boris Dragovic wrote:

> > Of course. Now we just need the code to determine when a task
> > is holding some kernel-side lock ;)
>
> couldn't it just be indicated on actual locking the resource?

It could, but I doubt we would want this overhead on the locking...

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-09 19:43:23

by George Anzinger

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Rik van Riel wrote:
>
> On Thu, 8 Mar 2001, Boris Dragovic wrote:
>
> > > Of course. Now we just need the code to determine when a task
> > > is holding some kernel-side lock ;)
> >
> > couldn't it just be indicated on actual locking the resource?
>
> It could, but I doubt we would want this overhead on the locking...
>
> Rik

Seems like you are sneaking up on priority inherit mutexes. The locking
over head is not so bad (same as spinlock, except in UP case, where it
is the same as the SMP case). The unlock is, however, the same as the
lock overhead. It is hard to beat the store of zero which is the
spin_unlock.

George

2001-03-09 19:46:32

by Jamie Lokier

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Rik van Riel wrote:
> > > Of course. Now we just need the code to determine when a task
> > > is holding some kernel-side lock ;)
> >
> > couldn't it just be indicated on actual locking the resource?
>
> It could, but I doubt we would want this overhead on the locking...

Just raise the priority whenever the task's in kernel mode. Problem solved.

-- Jamie

2001-03-09 19:56:04

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Fri, 9 Mar 2001, Jamie Lokier wrote:
> Rik van Riel wrote:
> > > > Of course. Now we just need the code to determine when a task
> > > > is holding some kernel-side lock ;)
> > >
> > > couldn't it just be indicated on actual locking the resource?
> >
> > It could, but I doubt we would want this overhead on the locking...
>
> Just raise the priority whenever the task's in kernel mode. Problem solved.

Remember that a task schedules itself out at the timer interrupt,
in kernel/sched.c::schedule() ... which is kernel mode ;)

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-09 19:54:32

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Fri, 9 Mar 2001, george anzinger wrote:
> Rik van Riel wrote:
> > On Thu, 8 Mar 2001, Boris Dragovic wrote:
> >
> > > > Of course. Now we just need the code to determine when a task
> > > > is holding some kernel-side lock ;)
> > >
> > > couldn't it just be indicated on actual locking the resource?
> >
> > It could, but I doubt we would want this overhead on the locking...
>
> Seems like you are sneaking up on priority inherit mutexes.
> The locking over head is not so bad (same as spinlock, except in
> UP case, where it is the same as the SMP case). The unlock is,
> however, the same as the lock overhead. It is hard to beat the
> store of zero which is the spin_unlock.

Hmmm, what would this look like ?

(we need the same code if we want to do decent load
control for the VM, where we suspend tasks when the
load gets too high)

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-09 20:11:12

by Jamie Lokier

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Rik van Riel wrote:
> > Just raise the priority whenever the task's in kernel mode. Problem
> > solved.
>
> Remember that a task schedules itself out at the timer interrupt,
> in kernel/sched.c::schedule() ... which is kernel mode ;)

Even nicer. On x86 change this:

reschedule:
call SYMBOL_NAME(schedule) # test
jmp ret_from_sys_call

to this:

reschedule:
orl $PF_HONOUR_LOW_PRIORITY,flags(%ebx)
call SYMBOL_NAME(schedule) # test
andl $~PF_HONOUR_LOW_PRIORITY,flags(%ebx)
jmp ret_from_sys_call

(You get the idea; this isn't the best implementation).

I think this code can only be reached in two ways:

1. An interrupt, exception, page fault etc. that is returning to user space.
2. A system call, whatever space it's from.

In both these cases, no critical locks will be held, right?

-- Jamie

2001-03-09 20:21:52

by Adrian Cox

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

george anzinger wrote:


> Seems like you are sneaking up on priority inherit mutexes. The locking
> over head is not so bad (same as spinlock, except in UP case, where it
> is the same as the SMP case). The unlock is, however, the same as the
> lock overhead. It is hard to beat the store of zero which is the
> spin_unlock.

Unfortunately the kernel is already full of counting semaphores.
Priority inheritance won't save you, as the task which is going to call
up() need not be the same one that called down().

Jamie Lokier's suggestion of raising priority when in the kernel doesn't
help. You need to raise the priority of the task which is currently in
userspace and will call up() next time it enters the kernel. You don't
know which task that is.

There are three solutions I can think of:
1) don't have SCHED_IDLE
2) promote all SCHED_IDLE tasks into SCHED_OTHER whenever *any* task is
waiting on a semaphore.
3) an audit of semaphores for 2.5

I'm quite fond of option 1.

- Adrian

2001-03-10 14:09:36

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Fri, 9 Mar 2001, Jamie Lokier wrote:
> Rik van Riel wrote:
> > > Just raise the priority whenever the task's in kernel mode. Problem
> > > solved.
> >
> > Remember that a task schedules itself out at the timer interrupt,
> > in kernel/sched.c::schedule() ... which is kernel mode ;)
>
> Even nicer. On x86 change this:
>
> reschedule:
> call SYMBOL_NAME(schedule) # test
> jmp ret_from_sys_call
>
> to this:
>
> reschedule:
> orl $PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> call SYMBOL_NAME(schedule) # test
> andl $~PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> jmp ret_from_sys_call

Wonderful !

I think we'll want to use this, since we can use it for:

1. SCHED_IDLE
2. load control, when the VM starts thrashing we can just
suspend a few processes to make sure the system as a
whole won't thrash to death
3. ... ?

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-03-10 15:27:33

by Pavel Machek

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Hi!

> > did "these" apply only to the tasks, that actually hold a lock?
> > if not, then i don't like this idea, as it gives the processes
> > time for the only reason, that it _might_ hold a lock. this basically
> > undermines the idea of static classes. in this case, we could actually
> > just make the "nice" scale incredibly large and possibly nonlinear,
> > as mark suggested.
>
> would it be possible to subqueue tasks that are holding a lock so that
> they get some guaranteed amount of cpu and just leave other to be executed
> when processor really idle?

There was implementation which promoted SCHED_IDLE task to normal
priority whenever it entered syscall. I liked it.
Pavel
--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]

2001-03-12 18:06:25

by Jamie Lokier

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Adrian Cox wrote:
> Unfortunately the kernel is already full of counting semaphores.
> Priority inheritance won't save you, as the task which is going to call
> up() need not be the same one that called down().
>
> Jamie Lokier's suggestion of raising priority when in the kernel doesn't
> help. You need to raise the priority of the task which is currently in
> userspace and will call up() next time it enters the kernel. You don't
> know which task that is.

Dear oh dear. I was under the impression that kernel semaphores are
supposed to be used as mutexes only -- there are other mechanisms for
signalling between processes.

Do any processes ever enter userspace holding a critical semaphore?

(Things like userspace signalling another userspace don't count -- it's
your own fault and your own problem if _that_ deadlocks).

-- Jamie

2001-03-12 19:40:43

by Adrian Cox

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Jamie Lokier wrote:

> Adrian Cox wrote:

>> Jamie Lokier's suggestion of raising priority when in the kernel doesn't
>> help. You need to raise the priority of the task which is currently in
>> userspace and will call up() next time it enters the kernel. You don't
>> know which task that is.

> Dear oh dear. I was under the impression that kernel semaphores are
> supposed to be used as mutexes only -- there are other mechanisms for
> signalling between processes.

I think most of the kernel semaphores are used as mutexes, with
occasional producer/consumer semaphores. I think the core kernel code is
fine, the risk mostly comes from miscellaneous character devices. I've
written code that does this for a specialised device driver. I wanted
only one process to have the device open at once, and for others to
block on open. Using semaphores meant that multiple shells could do "cat
> /dev/mywidget" and be serialised.

Locking up users of this strange piece of hardware doesn't bring down
the system, so your suggestion could work. We need a big fat warning in
semaphore.h, and a careful examination of the current code.

- Adrian

2001-03-13 09:41:41

by Jamie Lokier

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Adrian Cox wrote:
> >> Jamie Lokier's suggestion of raising priority when in the kernel doesn't
> >> help. You need to raise the priority of the task which is currently in
> >> userspace and will call up() next time it enters the kernel. You don't
> >> know which task that is.
>
> > Dear oh dear. I was under the impression that kernel semaphores are
> > supposed to be used as mutexes only -- there are other mechanisms for
> > signalling between processes.
>
> I think most of the kernel semaphores are used as mutexes, with
> occasional producer/consumer semaphores. I think the core kernel code is
> fine, the risk mostly comes from miscellaneous character devices. I've
> written code that does this for a specialised device driver. I wanted
> only one process to have the device open at once, and for others to
> block on open. Using semaphores meant that multiple shells could do "cat
> > /dev/mywidget" and be serialised.

Oh, it's you :-)

In this case we don't care if process A is blocked waiting for idle
process B to release the device, do we?

> Locking up users of this strange piece of hardware doesn't bring down
> the system, so your suggestion could work. We need a big fat warning in
> semaphore.h, and a careful examination of the current code.

If it weren't for the weight of history, they could be called `struct
mutex' just to rub it in.

-- Jamie

2001-03-14 13:21:00

by Jamie Lokier

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

Rik van Riel wrote:
> > reschedule:
> > orl $PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> > call SYMBOL_NAME(schedule) # test
> > andl $~PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> > jmp ret_from_sys_call
>
> Wonderful !
>
> I think we'll want to use this, since we can use it for:
>
> 1. SCHED_IDLE
> 2. load control, when the VM starts thrashing we can just
> suspend a few processes to make sure the system as a
> whole won't thrash to death

Surely it would be easier, and more appropriate, to make the processes
sleep when they next page fault.

-- Jamie

2001-03-14 14:28:25

by Philipp Rumpf

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Fri, Mar 09, 2001 at 09:09:13PM +0100, Jamie Lokier wrote:
> Rik van Riel wrote:
> > > Just raise the priority whenever the task's in kernel mode. Problem
> > > solved.
> >
> > Remember that a task schedules itself out at the timer interrupt,
> > in kernel/sched.c::schedule() ... which is kernel mode ;)
>
> Even nicer. On x86 change this:
>
> reschedule:
> call SYMBOL_NAME(schedule) # test
> jmp ret_from_sys_call
>
> to this:
>
> reschedule:
> orl $PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> call SYMBOL_NAME(schedule) # test
> andl $~PF_HONOUR_LOW_PRIORITY,flags(%ebx)
> jmp ret_from_sys_call
>
> (You get the idea; this isn't the best implementation).

A few months ago, I implemented preemptible kernel threads (locally; I
tend to think the other patches are superior). Part of the changes was
to separate schedule into __schedule() (common part), schedule_user()
(automatic schedule from entry.S) and schedule() (manual schedule in
kernel space); besides making what Jamie proposed easier, we can also
save a few cycles in the (common) schedule_user case:

- we never release the kernel lock
- we can pass current to schedule_user
- we just handled softirqs

this is 2.5 material though ...

2001-03-14 20:01:27

by Rik van Riel

[permalink] [raw]
Subject: Re: static scheduling - SCHED_IDLE?

On Wed, 14 Mar 2001, Jamie Lokier wrote:

> > 2. load control, when the VM starts thrashing we can just
> > suspend a few processes to make sure the system as a
> > whole won't thrash to death
>
> Surely it would be easier, and more appropriate, to make the
> processes sleep when they next page fault.

This should work ...

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/