The 2.2.21 kernel was behaving incorrectly for SCHED_FIFO and SCHED_RR
scheduling.
The correct behaviour for SCHED_FIFO is pure priority preemption: a
process runs until it completes, blocks in a system call, or is preempted
by a higher-priority process. SCHED_RR behaves the same way with respect
to preemption, except that a process that has used up its time slice goes
to the back of the run queue for its priority.
More details can be found at:
http://www.opengroup.org/onlinepubs/7908799/xsh/realtime.html
This is a small patch, but it fixes the behaviour of SCHED_FIFO and
SCHED_RR scheduling in the 2.2.21 kernel. It also makes the kernel more
efficient by NOT calling schedule() on every tick for a SCHED_FIFO process.
--
Bhavesh P. Davda
Avaya, Inc.
[email protected]
diff -aur linux-2.2.21/kernel/sched.c linux-2.2.21-bpd/kernel/sched.c
--- linux-2.2.21/kernel/sched.c Sun Mar 25 09:37:40 2001
+++ linux-2.2.21-bpd/kernel/sched.c Fri Jun 21 09:54:55 2002
@@ -749,7 +749,16 @@
 	/* Default process to select.. */
 	next = idle_task(this_cpu);
 	c = -1000;
-	if (prev->state == TASK_RUNNING)
+	/*
+	 * If a SCHED_RR task has exhausted its time slice,
+	 * it is at the back of the runqueue, even if it is
+	 * still running. On the other hand, if a SCHED_RR task
+	 * is still running and still has time left in its
+	 * time slice, then it is still the first process in its
+	 * priority band, so it will run next if it is the first
+	 * highest priority SCHED_RR task
+	 */
+	if ((prev->state == TASK_RUNNING) && (prev->policy != SCHED_RR))
 		goto still_running;
 still_running_back:
@@ -1492,7 +1501,12 @@
 		p->counter -= ticks;
 		if (p->counter < 0) {
 			p->counter = 0;
-			p->need_resched = 1;
+			/* SCHED_FIFO is priority preemption, so this is
+			 * not the place to reschedule it
+			 */
+			if (p->policy != SCHED_FIFO) {
+				p->need_resched = 1;
+			}
 		}
 		if (p->priority < DEF_PRIORITY)
 			kstat.cpu_nice += user;
@@ -1785,8 +1799,6 @@
 	retval = 0;
 	p->policy = policy;
 	p->rt_priority = lp.sched_priority;
-	if (p->next_run)
-		move_first_runqueue(p);
 	current->need_resched = 1;
> The 2.2.21 kernel was behaving incorrectly for SCHED_FIFO and SCHED_RR
> scheduling.
Looks fine, but I don't want to apply behaviour-changing, non-critical
stuff to 2.2.
Alan Cox wrote:
>>The 2.2.21 kernel was behaving incorrectly for SCHED_FIFO and SCHED_RR
>>scheduling.
>
>
> Looks fine, but I don't want to apply behaviour-changing, non-critical
> stuff to 2.2.
Oh, no!
What's going on with the kernel community? I posted a similar fix for
the 2.4.18 kernel, and it hasn't been picked up there either.
With the 2.4.18 kernel scheduler, our 86-process application (SCHED_FIFO,
priorities 7-23, System V semaphores for priority preemption) won't even
stay up without my patch.
The 2.2.21 SCHED_FIFO behaviour is correct but slightly inefficient,
while the SCHED_RR behaviour is plain broken. What is an application
that depends on correct SCHED_RR behaviour to do in that case? There are
applications where increased latencies as a result of SCHED_RR being
broken are unacceptable.
--
Bhavesh P. Davda
Avaya Inc
Room B3-B03 E-mail : [email protected]
1300 West 120th Avenue Phone : (303) 538-4438
Westminster, CO 80234 Fax : (303) 538-3155
> What's going on with the kernel community? I posted a similar fix for
> the 2.4.18 kernel, and it hasn't been picked up there either.
I've not seen that one. However, the -ac tree uses a different scheduler
anyway. You should check whether 2.4.19pre has the same problem and, if so,
mail a patch directly to Marcelo.
On Fri, 21 Jun 2002, Alan Cox wrote:
> > What's going on with the kernel community? I posted a similar fix for
> > the 2.4.18 kernel, and it hasn't been picked up there either.
>
> I've not seen that one. However, the -ac tree uses a different scheduler
> anyway. You should check whether 2.4.19pre has the same problem and, if
> so, mail a patch directly to Marcelo.
the O(1) scheduler does not have this problem, so the -ac tree is ok.
for vanilla 2.4.18/2.4.19-pre i've created a compromise patch which
reduces the impact and fixes RT behavior (attached). There was no further
comment from Bhavesh, so i assumed it's all a done deal ... Marcelo,
please apply.
Ingo
--- linux/kernel/sched.c.orig Thu Jun 13 20:14:31 2002
+++ linux/kernel/sched.c Thu Jun 13 23:33:41 2002
@@ -324,7 +324,10 @@
  */
 static inline void add_to_runqueue(struct task_struct * p)
 {
-	list_add(&p->run_list, &runqueue_head);
+	if (p->policy == SCHED_OTHER)
+		list_add(&p->run_list, &runqueue_head);
+	else
+		list_add_tail(&p->run_list, &runqueue_head);
 	nr_running++;
 }
@@ -334,12 +337,6 @@
 	list_add_tail(&p->run_list, &runqueue_head);
 }
-static inline void move_first_runqueue(struct task_struct * p)
-{
-	list_del(&p->run_list);
-	list_add(&p->run_list, &runqueue_head);
-}
-
 /*
  * Wake up a process. Put it on the run-queue if it's not
  * already there. The "current" process is always on the
@@ -955,9 +952,6 @@
 	retval = 0;
 	p->policy = policy;
 	p->rt_priority = lp.sched_priority;
-	if (task_on_runqueue(p))
-		move_first_runqueue(p);
-
 	current->need_resched = 1;
 out_unlock:
--- linux/kernel/timer.c.orig Thu Jun 13 20:17:04 2002
+++ linux/kernel/timer.c Thu Jun 13 20:23:15 2002
@@ -585,7 +585,8 @@
 	if (p->pid) {
 		if (--p->counter <= 0) {
 			p->counter = 0;
-			p->need_resched = 1;
+			if (p->policy != SCHED_FIFO)
+				p->need_resched = 1;
 		}
 		if (p->nice > 0)
 			kstat.per_cpu_nice[cpu] += user_tick;