2003-06-19 05:52:37

by Perez-Gonzalez, Inaky

[permalink] [raw]
Subject: RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks


Hi All

I have another test case that is showing this behavior.
It is very similar to George's, however, it is a simplification
of another one using threads that a co-worker, Adam Li found
a few days ago.

Parent (FIFO 5) forks child that sets itself to FIFO 4 and
busy loops, then it sleeps five seconds and kills the child.

Doing SysRq + T after a while shows the parent'd call trace
to be at sys_rt_sigaction+0xd1, that is just inside the final
copy_to_user() in signal.c:sys_rt_sigaction().

Reprioritizing events/0 to FIFO 5+ fixes the inversion.

If I call nanosleep directly (with system() instead of
glibc's sleep(), so I avoid all the rt_sig calls),
I get the parent process always stuck in work_resched+0x5,
in entry.S:work_resched, just after the call to the
scheduler - however, I cannot trace what is happening
inside the scheduler.

My point here is: I am trying to trace where this program
is making use of workqueues inside of the kernel, and I
can find none. The only place where I need to look some
more is inside the timer code, but in a quick glance,
it seems it is not being used, so why is it affected by
the reprioritization of the events/0 thread? George, can
you help me here?

kernel is 2.5.67, SMP and PREEMPT with maxcpus=1; tomorrow
I will try .72 ...

I?aky P?rez-Gonz?lez -- Not speaking for Intel -- all opinions are my own
(and my fault)



Attachments:
sched-hang.c (0.98 kB)

2003-06-19 05:54:33

by Ingo Molnar

[permalink] [raw]
Subject: RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks


On Wed, 18 Jun 2003, Perez-Gonzalez, Inaky wrote:

> My point here is: I am trying to trace where this program is making use
> of workqueues inside of the kernel, and I can find none. The only place
> where I need to look some more is inside the timer code, but in a quick
> glance, it seems it is not being used, so why is it affected by the
> reprioritization of the events/0 thread? George, can you help me here?

well, printk (console input/output) can already make use of keventd.

Ingo

2003-06-19 15:47:48

by George Anzinger

[permalink] [raw]
Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks

Perez-Gonzalez, Inaky wrote:
> Hi All
>
> I have another test case that is showing this behavior.
> It is very similar to George's, however, it is a simplification
> of another one using threads that a co-worker, Adam Li found
> a few days ago.
>
> Parent (FIFO 5) forks child that sets itself to FIFO 4 and
> busy loops, then it sleeps five seconds and kills the child.
>
> Doing SysRq + T after a while shows the parent'd call trace
> to be at sys_rt_sigaction+0xd1, that is just inside the final
> copy_to_user() in signal.c:sys_rt_sigaction().
>
> Reprioritizing events/0 to FIFO 5+ fixes the inversion.
>
> If I call nanosleep directly (with system() instead of
> glibc's sleep(), so I avoid all the rt_sig calls),
> I get the parent process always stuck in work_resched+0x5,
> in entry.S:work_resched, just after the call to the
> scheduler - however, I cannot trace what is happening
> inside the scheduler.
>
> My point here is: I am trying to trace where this program
> is making use of workqueues inside of the kernel, and I
> can find none. The only place where I need to look some
> more is inside the timer code, but in a quick glance,
> it seems it is not being used, so why is it affected by
> the reprioritization of the events/0 thread? George, can
> you help me here?
>

Hm! I wonder. Robert is working on a fix for schedsetschedule()
where it fails to actually tell the scheduler to switch to a process
that it just made higher priority or away from one it just lowered.

The net result is that the caller keeps running (FIFO for all in this
case) when, in fact it should have been switched out. Next time
schedule() actually switches, it is all sorted out again. Could the
elavation of the events/0 thread cause this needed switch?

-g

> kernel is 2.5.67, SMP and PREEMPT with maxcpus=1; tomorrow
> I will try .72 ...
>
> I?aky P?rez-Gonz?lez -- Not speaking for Intel -- all opinions are my own
> (and my fault)
>
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-06-19 17:07:13

by Joe Korty

[permalink] [raw]
Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks

> Hm! I wonder. Robert is working on a fix for schedsetschedule()
> where it fails to actually tell the scheduler to switch to a process
> that it just made higher priority or away from one it just lowered.
>
> The net result is that the caller keeps running (FIFO for all in this
> case) when, in fact it should have been switched out. Next time
> schedule() actually switches, it is all sorted out again. Could the
> elavation of the events/0 thread cause this needed switch?


I posted a fix for this a month ago that was ignored. Which is a
good thing, since now that I look at it again, I don't care for the
approach I took nor does it appear to be complete.

Joe

----------- original posting

> Date: Wed, 21 May 2003 16:40:26 -0400
> From: Joe Korty <[email protected]>
> To: Ingo Molnar <[email protected]>
> Cc: [email protected]
> Subject: [PATCH] setscheduler resched bug

setscheduler is not forcing a reschedule when needed like set_user_nice
does. It should.

Joe


--- 2.5.69/kernel/sched.c.orig 2003-05-21 14:50:53.000000000 -0400
+++ 2.5.69/kernel/sched.c 2003-05-21 15:01:13.000000000 -0400
@@ -1716,6 +1716,7 @@
unsigned long flags;
runqueue_t *rq;
task_t *p;
+ int oldprio;

if (!param || pid < 0)
goto out_nounlock;
@@ -1778,12 +1779,20 @@
retval = 0;
p->policy = policy;
p->rt_priority = lp.sched_priority;
+ oldprio = p->prio;
if (policy != SCHED_NORMAL)
p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
else
p->prio = p->static_prio;
- if (array)
+ if (array) {
__activate_task(p, task_rq(p));
+ /*
+ * Reschedule if on a CPU and the priority dropped, or not on
+ * a CPU and the priority rose above the currently running task.
+ */
+ if ((rq->curr == p) ? (p->prio > oldprio) : (p->prio < rq->curr->prio))
+ resched_task(rq->curr);
+ }

out_unlock:
task_rq_unlock(rq, &flags);

2003-06-19 17:09:40

by Robert Love

[permalink] [raw]
Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks

On Thu, 2003-06-19 at 10:19, '[email protected]' wrote:

> I posted a fix for this a month ago that was ignored. Which is a
> good thing, since now that I look at it again, I don't care for the
> approach I took nor does it appear to be complete.

Ah, sorry for missing it. Other than that tertiary statement inside an
if ;) my patch is about the same.

Why do you think it is incomplete? It looks correct to me.

Robert Love

2003-06-19 17:15:19

by Joe Korty

[permalink] [raw]
Subject: Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks

On Thu, Jun 19, 2003 at 10:23:30AM -0700, Robert Love wrote:
> On Thu, 2003-06-19 at 10:19, '[email protected]' wrote:
>
> > I posted a fix for this a month ago that was ignored. Which is a
> > good thing, since now that I look at it again, I don't care for the
> > approach I took nor does it appear to be complete.
>
> Ah, sorry for missing it. Other than that tertiary statement inside an
> if ;) my patch is about the same.
>
> Why do you think it is incomplete? It looks correct to me.


It may be better to add it to __activate_task() rather than after the
single activate_task() use. At the time I wrote the patch I did not
think to look at the five __activate_task() calls to see if they each
needed the test. By me not looking, my patch is automatically
incorrect, even if it turns out to be correct.

Joe

2003-06-19 17:33:37

by Robert Love

[permalink] [raw]
Subject: [patch] setscheduler fix

Here is my patch. It is the same idea as Joe's. Is there a better fix?

Basically, the problem is that setscheduler() does not set need_resched
when needed. There are two basic cases where this is needed:

- the task is running, but now it is no longer the highest
priority task on the rq
- the task is not running, but now it is the highest
priority task on the rq

In either case, we need to set need_resched to invoke the scheduler.

Patch is against 2.5.72. Comments?

Robert Love


setschedule() needs to force a reschedule in some situations.

kernel/sched.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletion(-)


diff -urN linux-2.5.72/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.72/kernel/sched.c 2003-06-16 21:20:20.000000000 -0700
+++ linux/kernel/sched.c 2003-06-17 13:44:15.509894276 -0700
@@ -1691,6 +1691,7 @@
{
struct sched_param lp;
int retval = -EINVAL;
+ int oldprio;
prio_array_t *array;
unsigned long flags;
runqueue_t *rq;
@@ -1757,12 +1758,24 @@
retval = 0;
p->policy = policy;
p->rt_priority = lp.sched_priority;
+ oldprio = p->prio;
if (policy != SCHED_NORMAL)
p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
else
p->prio = p->static_prio;
- if (array)
+ if (array) {
__activate_task(p, task_rq(p));
+ /*
+ * Reschedule if we are currently running on this runqueue and
+ * our priority decreased, or if we are not currently running on
+ * this runqueue and our priority is higher than the current's
+ */
+ if (rq->curr == p) {
+ if (p->prio > oldprio)
+ resched_task(rq->curr);
+ } else if (p->prio < rq->curr->prio)
+ resched_task(rq->curr);
+ }

out_unlock:
task_rq_unlock(rq, &flags);


2003-06-19 18:08:20

by Joe Korty

[permalink] [raw]
Subject: Re: [patch] setscheduler fix

> Here is my patch. It is the same idea as Joe's. Is there a better fix?
>
> Basically, the problem is that setscheduler() does not set need_resched
> when needed. There are two basic cases where this is needed:
>
> - the task is running, but now it is no longer the highest
> priority task on the rq
> - the task is not running, but now it is the highest
> priority task on the rq
>
> In either case, we need to set need_resched to invoke the scheduler.
>
> Patch is against 2.5.72. Comments?

Looks good to me.

migration_thread and try_to_wake_up already have a simplier version of
your test that seems to be correct for that environment, so no change
is needed there.

wake_up_forked_process in principle might need your patch, but as it
appears to be called only from boot code it is unimportant that it
have the lowest possible latency, so no change is needed there either.

Joe

2003-06-19 18:24:21

by Robert Love

[permalink] [raw]
Subject: Re: [patch] setscheduler fix

On Thu, 2003-06-19 at 11:20, Joe Korty wrote:

> Looks good to me.

Good.

> migration_thread and try_to_wake_up already have a simplier version of
> your test that seems to be correct for that environment, so no change
> is needed there.
>
> wake_up_forked_process in principle might need your patch, but as it
> appears to be called only from boot code it is unimportant that it
> have the lowest possible latency, so no change is needed there either.

Agreed.

This is worse than just a latency issue, by the way. Imagine if a
FIFO/50 thread promotes a FIFO/40 thread to FIFO/60. The thread should
run immediately (because, at priority 60, it is the highest), but it may
not until the FIFO/50 thread completes.

Robert Love

2003-06-19 18:56:28

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] setscheduler fix


On 19 Jun 2003, Robert Love wrote:

> + if (rq->curr == p) {
> + if (p->prio > oldprio)
> + resched_task(rq->curr);
> + } else if (p->prio < rq->curr->prio)
> + resched_task(rq->curr);
> + }

looks good.

Ingo

2003-06-20 02:25:05

by Perez-Gonzalez, Inaky

[permalink] [raw]
Subject: RE: [patch] setscheduler fix


> From: Robert Love [mailto:[email protected]]
>
> Here is my patch. It is the same idea as Joe's. Is there a better fix?

Fixes my sched-hang.c

I?aky P?rez-Gonz?lez -- Not speaking for Intel -- all opinions are my own
(and my fault)