2005-05-08 06:12:01

by Haoqiang Zheng

[permalink] [raw]
Subject: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

swap-sched is a patch that solves dynamic priority inversion problem.

Run X at normal priority (nice 0) and keep the system really busy by
running a lot of interactive jobs (with dynamic priority at 115), or
simply run some CPU bound tasks at nice -10. Then start a mpeg player
at a high priority (nice -20). What would you expect? In my machine,
the mpeg player runs at poorly 4 frm/s. Why the tasks running at
dynamic priorities of 115 can have such dramatic impact on the
performance of mpeg player running at nice -20? What happens is the
mpeg player often blocks to wait the normal priority X to render the
frames. Without knowing such dependency between mpeg player and X, the
existing Linux scheduler would select other tasks to run and thus
results in poor video playback quality. This problem is generally
known as priority inversion.

Certainly, this very problem can be solved by setting the priority of
X to nice -10 (like what Redhat etc. does). However, inter-process
communication mechanisms like pipe, socket and signal etc. are widely
used in modern applications, and thus the inter-process dependencies
are everywhere in today's computer systems. It's not possible for a
system administrator to find out all the dependencies and set the
priorities properly. Obviously, we need a system that can dynamically
detects the dependencies among the tasks and take the dependency
information into account when scheduling. swap-sched is such a system.

swap-sched consists of two components: the automatic dependency
detection component and the dependency based scheduling
component. swap-sched detects the dependency among tasks by
monitoring/instrumenting the inter-process
communication/synchronization related system calls. Since all the
inter-process communications/synchronizations (except shared-memory)
are done via system calls, the dynamic dependencies can be effectively
detected by instrumenting these system calls.

In a conventional CPU scheduler, a task is removed from the runqueue
once it's blocked. This is a PROBLEM since a high priority task's
request is ignored once it's blocked, even though it's blocked because
of waiting for the execution of another task. Based on this
observation, swap-sched solves the priority inversion problem by make
two simple changes to the existing CPU scheduler. First, it keeps all
the tasks that are blocked but depends on some other tasks that are
runnable in runqueue. (We call such tasks are virtual runnable
tasks). Second, the existing CPU scheduler is called as usual. But since the
virtual runnable tasks are in runqueue, they may be scheduled. In this
case the swap scheduler is called to choose one of the providers of
the task (the task that the virtual runnable task depends on) to run.

Our results show that SWAP has low overhead, effectively solves the
priority inversion problem and can provide substantial improvements in
system performance in scheduling processes with dependencies. For the
mpeg player + X scenario discussed above, mpeg player can play at 23
frm/s with swap-sched enabled!!!

Please visit our swap-sched project homepage at
http://swap-sched.sourceforge.net/ for details and latest
patches. Suggestions/Comments are welcomed.


Haoqiang


2005-05-08 07:33:22

by Con Kolivas

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On Sun, 8 May 2005 16:11, Haoqiang Zheng wrote:
> swap-sched is a patch that solves dynamic priority inversion problem.
>
> Run X at normal priority (nice 0) and keep the system really busy by
> running a lot of interactive jobs (with dynamic priority at 115), or
> simply run some CPU bound tasks at nice -10. Then start a mpeg player
> at a high priority (nice -20). What would you expect? In my machine,
> the mpeg player runs at poorly 4 frm/s. Why the tasks running at
> dynamic priorities of 115 can have such dramatic impact on the
> performance of mpeg player running at nice -20? What happens is the
> mpeg player often blocks to wait the normal priority X to render the
> frames. Without knowing such dependency between mpeg player and X, the
> existing Linux scheduler would select other tasks to run and thus
> results in poor video playback quality. This problem is generally
> known as priority inversion.
>
> Certainly, this very problem can be solved by setting the priority of
> X to nice -10 (like what Redhat etc. does). However, inter-process
> communication mechanisms like pipe, socket and signal etc. are widely
> used in modern applications, and thus the inter-process dependencies
> are everywhere in today's computer systems. It's not possible for a
> system administrator to find out all the dependencies and set the
> priorities properly. Obviously, we need a system that can dynamically
> detects the dependencies among the tasks and take the dependency
> information into account when scheduling. swap-sched is such a system.
>
> swap-sched consists of two components: the automatic dependency
> detection component and the dependency based scheduling
> component. swap-sched detects the dependency among tasks by
> monitoring/instrumenting the inter-process
> communication/synchronization related system calls. Since all the
> inter-process communications/synchronizations (except shared-memory)
> are done via system calls, the dynamic dependencies can be effectively
> detected by instrumenting these system calls.
>
> In a conventional CPU scheduler, a task is removed from the runqueue
> once it's blocked. This is a PROBLEM since a high priority task's
> request is ignored once it's blocked, even though it's blocked because
> of waiting for the execution of another task. Based on this
> observation, swap-sched solves the priority inversion problem by make
> two simple changes to the existing CPU scheduler. First, it keeps all
> the tasks that are blocked but depends on some other tasks that are
> runnable in runqueue. (We call such tasks are virtual runnable
> tasks). Second, the existing CPU scheduler is called as usual. But since
> the virtual runnable tasks are in runqueue, they may be scheduled. In this
> case the swap scheduler is called to choose one of the providers of the
> task (the task that the virtual runnable task depends on) to run.
>
> Our results show that SWAP has low overhead, effectively solves the
> priority inversion problem and can provide substantial improvements in
> system performance in scheduling processes with dependencies. For the
> mpeg player + X scenario discussed above, mpeg player can play at 23
> frm/s with swap-sched enabled!!!
>
> Please visit our swap-sched project homepage at
> http://swap-sched.sourceforge.net/ for details and latest
> patches. Suggestions/Comments are welcomed.

Very interesting code. How do you prevent a ring of dependent tasks from
DoSing the entire system? eg what happens with process_load in contest -
since you asked about this recently I assume you have already tested it.

Cheers,
Con


Attachments:
(No filename) (3.53 kB)
(No filename) (189.00 B)
Download all attachments

2005-05-08 07:46:07

by Lee Revell

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On Sun, 2005-05-08 at 02:11 -0400, Haoqiang Zheng wrote:
> Certainly, this very problem can be solved by setting the priority of
> X to nice -10 (like what Redhat etc. does).

This just flips the priority inversion around: the nice -10 X server
will consume all available CPU rendering on behalf xscreensaver, running
at nice 20.

I would like to test this code very much, as this has been a real
problem for me.

Lee

2005-05-08 15:57:32

by Haoqiang Zheng

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

I am not quite sure about what do you mean for " a ring of dependent
tasks". Do you mean the situation that A depends on B while at the
same time B depends on A? It shouldn't happen since in swap-sched,
the dependency is generated on the fly. Task A depends on B only when
A blocks on waiting for B. For example, if task A blocks on
"read(pipe_fd,...)" and B is the task that can do
"write(pipe_fd,...)", then A is depending on B. Once A is waked up,
A no longer depends on any other task. So the "ring of dependent
tasks" shouldn't happen, otherwise it's a deadlock.

Haoqiang

On 5/8/05, Con Kolivas <[email protected]> wrote:
> On Sun, 8 May 2005 16:11, Haoqiang Zheng wrote:
> > swap-sched is a patch that solves dynamic priority inversion problem.
> >
> > Run X at normal priority (nice 0) and keep the system really busy by
> > running a lot of interactive jobs (with dynamic priority at 115), or
> > simply run some CPU bound tasks at nice -10. Then start a mpeg player
> > at a high priority (nice -20). What would you expect? In my machine,
> > the mpeg player runs at poorly 4 frm/s. Why the tasks running at
> > dynamic priorities of 115 can have such dramatic impact on the
> > performance of mpeg player running at nice -20? What happens is the
> > mpeg player often blocks to wait the normal priority X to render the
> > frames. Without knowing such dependency between mpeg player and X, the
> > existing Linux scheduler would select other tasks to run and thus
> > results in poor video playback quality. This problem is generally
> > known as priority inversion.
> >
> > Certainly, this very problem can be solved by setting the priority of
> > X to nice -10 (like what Redhat etc. does). However, inter-process
> > communication mechanisms like pipe, socket and signal etc. are widely
> > used in modern applications, and thus the inter-process dependencies
> > are everywhere in today's computer systems. It's not possible for a
> > system administrator to find out all the dependencies and set the
> > priorities properly. Obviously, we need a system that can dynamically
> > detects the dependencies among the tasks and take the dependency
> > information into account when scheduling. swap-sched is such a system.
> >
> > swap-sched consists of two components: the automatic dependency
> > detection component and the dependency based scheduling
> > component. swap-sched detects the dependency among tasks by
> > monitoring/instrumenting the inter-process
> > communication/synchronization related system calls. Since all the
> > inter-process communications/synchronizations (except shared-memory)
> > are done via system calls, the dynamic dependencies can be effectively
> > detected by instrumenting these system calls.
> >
> > In a conventional CPU scheduler, a task is removed from the runqueue
> > once it's blocked. This is a PROBLEM since a high priority task's
> > request is ignored once it's blocked, even though it's blocked because
> > of waiting for the execution of another task. Based on this
> > observation, swap-sched solves the priority inversion problem by make
> > two simple changes to the existing CPU scheduler. First, it keeps all
> > the tasks that are blocked but depends on some other tasks that are
> > runnable in runqueue. (We call such tasks are virtual runnable
> > tasks). Second, the existing CPU scheduler is called as usual. But since
> > the virtual runnable tasks are in runqueue, they may be scheduled. In this
> > case the swap scheduler is called to choose one of the providers of the
> > task (the task that the virtual runnable task depends on) to run.
> >
> > Our results show that SWAP has low overhead, effectively solves the
> > priority inversion problem and can provide substantial improvements in
> > system performance in scheduling processes with dependencies. For the
> > mpeg player + X scenario discussed above, mpeg player can play at 23
> > frm/s with swap-sched enabled!!!
> >
> > Please visit our swap-sched project homepage at
> > http://swap-sched.sourceforge.net/ for details and latest
> > patches. Suggestions/Comments are welcomed.
>
> Very interesting code. How do you prevent a ring of dependent tasks from
> DoSing the entire system? eg what happens with process_load in contest -
> since you asked about this recently I assume you have already tested it.
>
> Cheers,
> Con
>
>
>

2005-05-08 23:26:47

by Con Kolivas

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On Mon, 9 May 2005 01:55, Haoqiang Zheng wrote:
> I am not quite sure about what do you mean for " a ring of dependent
> tasks". Do you mean the situation that A depends on B while at the
> same time B depends on A? It shouldn't happen since in swap-sched,
> the dependency is generated on the fly. Task A depends on B only when
> A blocks on waiting for B. For example, if task A blocks on
> "read(pipe_fd,...)" and B is the task that can do
> "write(pipe_fd,...)", then A is depending on B. Once A is waked up,
> A no longer depends on any other task. So the "ring of dependent
> tasks" shouldn't happen, otherwise it's a deadlock.

Ok so how does it respond to process_load in contest?

Cheers,
Con


Attachments:
(No filename) (706.00 B)
(No filename) (189.00 B)
Download all attachments

2005-05-09 03:56:47

by Haoqiang Zheng

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On 5/8/05, Con Kolivas <[email protected]> wrote:
> On Mon, 9 May 2005 01:55, Haoqiang Zheng wrote:
> > I am not quite sure about what do you mean for " a ring of dependent
> > tasks". Do you mean the situation that A depends on B while at the
> > same time B depends on A? It shouldn't happen since in swap-sched,
> > the dependency is generated on the fly. Task A depends on B only when
> > A blocks on waiting for B. For example, if task A blocks on
> > "read(pipe_fd,...)" and B is the task that can do
> > "write(pipe_fd,...)", then A is depending on B. Once A is waked up,
> > A no longer depends on any other task. So the "ring of dependent
> > tasks" shouldn't happen, otherwise it's a deadlock.
>
> Ok so how does it respond to process_load in contest?

Based on my measurements, the "process_load" processes run at a
dynamic priority of 115--122. Which is also pretty much the dynamic
priority range of the gcc processes. At a certain point, the vanilla
Linux scheduler may select either a process_load process or a gcc/make
process to run, depending on which current runnable task has the
highest dynamic priority.

With swap-sched enabled, the virtual runnable tasks (tasks that are
blocked because of waiting for another task) are kept in runqueue.
For example, if a contest process_load task A with prio 115 blocks on
waiting for another contest task B with prio 122, task A will remain
in runqueue. Task A may be selected by the vanilla scheduler to run
since it has a high priority. On noticing that A is a virtual runnable
task, swap-sched further select B to run in place of A. So in the end,
B will be select to run. Without swap-sched, A will be removed from
the runqueue once it's blocked, and task B can hardly get a chance to
run since it has a low priority. That's why at
http://swap-sched.sourceforge.net/node9.html, process_load has a much
higher LCPU% when swap-sched is enabled.

Haoqiang

2005-05-09 05:57:23

by Con Kolivas

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On Mon, 9 May 2005 13:56, Haoqiang Zheng wrote:
> On 5/8/05, Con Kolivas <[email protected]> wrote:
> > Ok so how does it respond to process_load in contest?
>
> Based on my measurements, the "process_load" processes run at a
> dynamic priority of 115--122. Which is also pretty much the dynamic
> priority range of the gcc processes. At a certain point, the vanilla
> Linux scheduler may select either a process_load process or a gcc/make
> process to run, depending on which current runnable task has the
> highest dynamic priority.
>
> With swap-sched enabled, the virtual runnable tasks (tasks that are
> blocked because of waiting for another task) are kept in runqueue.
> For example, if a contest process_load task A with prio 115 blocks on
> waiting for another contest task B with prio 122, task A will remain
> in runqueue. Task A may be selected by the vanilla scheduler to run
> since it has a high priority. On noticing that A is a virtual runnable
> task, swap-sched further select B to run in place of A. So in the end,
> B will be select to run. Without swap-sched, A will be removed from
> the runqueue once it's blocked, and task B can hardly get a chance to
> run since it has a low priority. That's why at
> http://swap-sched.sourceforge.net/node9.html, process_load has a much
> higher LCPU% when swap-sched is enabled.

While your swap-sched allows tasks waiting on other tasks to run better, it
also introduces a greater degree of unfairness in cpu resource sharing. By
allowing the dependent tasks to stay on the runqueue you increase
substantially their share of the resources they would otherwise have gotten.
The whole point of decrementing priority and runqueue expiration is to
maintain fairness and you're introducing another way to delay that system
from working. process_load is not the ideal task to test this unfairness on
this design but even that shows twice as much cpu usage with your own tests.
How do you propose to ensure we maintain fairness in this model ?

Cheers,
Con


Attachments:
(No filename) (1.98 kB)
(No filename) (189.00 B)
Download all attachments

2005-05-10 08:22:29

by Coywolf Qi Hunt

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

On 5/8/05, Haoqiang Zheng <[email protected]> wrote:
> swap-sched is a patch that solves dynamic priority inversion problem.
>
> Run X at normal priority (nice 0) and keep the system really busy by
> running a lot of interactive jobs (with dynamic priority at 115), or
> simply run some CPU bound tasks at nice -10. Then start a mpeg player
> at a high priority (nice -20). What would you expect? In my machine,
> the mpeg player runs at poorly 4 frm/s. Why the tasks running at
> dynamic priorities of 115 can have such dramatic impact on the
> performance of mpeg player running at nice -20? What happens is the
> mpeg player often blocks to wait the normal priority X to render the
> frames. Without knowing such dependency between mpeg player and X, the
> existing Linux scheduler would select other tasks to run and thus
> results in poor video playback quality. This problem is generally
> known as priority inversion.
>
> Certainly, this very problem can be solved by setting the priority of
> X to nice -10 (like what Redhat etc. does). However, inter-process
> communication mechanisms like pipe, socket and signal etc. are widely
> used in modern applications, and thus the inter-process dependencies
> are everywhere in today's computer systems. It's not possible for a
> system administrator to find out all the dependencies and set the
> priorities properly. Obviously, we need a system that can dynamically
> detects the dependencies among the tasks and take the dependency
> information into account when scheduling. swap-sched is such a system.
>
> swap-sched consists of two components: the automatic dependency
> detection component and the dependency based scheduling
> component. swap-sched detects the dependency among tasks by
> monitoring/instrumenting the inter-process
> communication/synchronization related system calls. Since all the
> inter-process communications/synchronizations (except shared-memory)
> are done via system calls, the dynamic dependencies can be effectively
> detected by instrumenting these system calls.
>
> In a conventional CPU scheduler, a task is removed from the runqueue
> once it's blocked. This is a PROBLEM since a high priority task's
> request is ignored once it's blocked, even though it's blocked because
> of waiting for the execution of another task. Based on this
> observation, swap-sched solves the priority inversion problem by make
> two simple changes to the existing CPU scheduler. First, it keeps all
> the tasks that are blocked but depends on some other tasks that are
> runnable in runqueue. (We call such tasks are virtual runnable
> tasks). Second, the existing CPU scheduler is called as usual. But since the
> virtual runnable tasks are in runqueue, they may be scheduled. In this
> case the swap scheduler is called to choose one of the providers of
> the task (the task that the virtual runnable task depends on) to run.
>
> Our results show that SWAP has low overhead, effectively solves the
> priority inversion problem and can provide substantial improvements in
> system performance in scheduling processes with dependencies. For the
> mpeg player + X scenario discussed above, mpeg player can play at 23
> frm/s with swap-sched enabled!!!
>
> Please visit our swap-sched project homepage at
> http://swap-sched.sourceforge.net/ for details and latest
> patches. Suggestions/Comments are welcomed.
>
> Haoqiang


It is such a misleading name, but it's been in use for a rather long time ...



--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/

2005-05-11 04:23:54

by Haoqiang Zheng

[permalink] [raw]
Subject: Re: [RFC PATCH] swap-sched: schedule with dynamic dependency detection (2.6.12-rc3)

> While your swap-sched allows tasks waiting on other tasks to run better, it
> also introduces a greater degree of unfairness in cpu resource sharing. By
> allowing the dependent tasks to stay on the runqueue you increase
> substantially their share of the resources they would otherwise have gotten.
> The whole point of decrementing priority and runqueue expiration is to
> maintain fairness and you're introducing another way to delay that system
> from working. process_load is not the ideal task to test this unfairness on
> this design but even that shows twice as much cpu usage with your own tests.
> How do you propose to ensure we maintain fairness in this model ?

Sorry for the late response. I was busy with some other projects.

AFAIK, there are basically two task properties that can affect its
long term CPU allocation: the nice value and the interactiveness. The
nice value affects the time slice value. The interactiveness affects
the time_slice recharge frequency (an interactive task doesn't expire
unless there's an ``expire_starving'').

Well, swap-sched doesn't affect a task's time_slice value, but it does
has impact on interactivity detection. Currently, Linux detects task
interactivity based on ``sleep_average''. With swap-sched, a task with
dependency will sleep less than it does in vanilla kernel. I am not
sure if ``different'' necessary means bad. But I am sure
swap-sched can be modified so that the dependency detection part will
work exactly the same way as vanilla kernel.

Best regards,
Haoqiang