2008-10-03 12:38:57

by Gregory Haskins

[permalink] [raw]
Subject: [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9

Hi Chirag,

Please try the following patches applied to 26.5-rt9 and let me know what you
find.

Hi Steve,
If these look good to everyone, please consider them for inclusion in -rt10.

Regards,
-Greg

---

Gregory Haskins (2):
RT: remove "paranoid" limit in push_rt_task
RT: Remove comment that is no longer true


kernel/sched_rt.c | 15 +++------------
1 files changed, 3 insertions(+), 12 deletions(-)


2008-10-03 12:38:44

by Gregory Haskins

[permalink] [raw]
Subject: [PATCH 1/2] RT: Remove comment that is no longer true

We fixed the condition noted in the comment with the "pushable_tasks"
logic, but forgot to remove this comment. Lets clean it up.

Signed-off-by: Gregory Haskins <[email protected]>
---

kernel/sched_rt.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 57a0c0d..59ead84 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1125,16 +1125,6 @@ out:
return 1;
}

-/*
- * TODO: Currently we just use the second highest prio task on
- * the queue, and stop when it can't migrate (or there's
- * no more RT tasks). There may be a case where a lower
- * priority RT task has a different affinity than the
- * higher RT task. In this case the lower RT task could
- * possibly be able to migrate where as the higher priority
- * RT task could not. We currently ignore this issue.
- * Enhancements are welcome!
- */
static void push_rt_tasks(struct rq *rq)
{
/* push_rt_task will return true if it moved an RT */

2008-10-03 12:39:18

by Gregory Haskins

[permalink] [raw]
Subject: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task

A panic was discovered by Chirag Jog and investigated by Gilles Carry
to be originating in the fact that a task being pushed away
may get migrated away during a double_lock_balance. The result was
that the pushable_tasks list may become corrupted.

The root cause is that the "paranoid" retry limit could cause us to
bail out of a retry, but still try to remove the item from the (now
potentially incorrect) list. There are numerous ways to correct the
condition, but the paranoid feature is no longer relevant with the new
pushable logic (since pushable naturally limits the loop anyway), so
lets just remove it.

Reported By: Chirag Jog <[email protected]>
Found-by: Gilles Carry <[email protected]>
Signed-off-by: Gregory Haskins <[email protected]>
---

kernel/sched_rt.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 59ead84..5a754fe 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq)
{
struct task_struct *next_task;
struct rq *lowest_rq;
- int paranoid = RT_MAX_TRIES;

if (!rq->rt.overloaded)
return 0;
@@ -1094,12 +1093,14 @@ static int push_rt_task(struct rq *rq)
* If it has, then try again.
*/
task = pick_next_pushable_task(rq);
- if (unlikely(task != next_task) && task && paranoid--) {
+ if (unlikely(task != next_task) && task) {
put_task_struct(next_task);
next_task = task;
goto retry;
}

+ BUG_ON(task_cpu(next_task) != rq->cpu);
+
/*
* Once we have failed to push this task, we will not
* try again, since the other cpus will pull from us

2008-10-03 12:50:54

by Gregory Haskins

[permalink] [raw]
Subject: Re: [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9

Gregory Haskins wrote:
> Hi Chirag,
>
> Please try the following patches applied to 26.5-rt9 and let me know what you
> find.
>
> Hi Steve,
> If these look good to everyone, please consider them for inclusion in -rt10.
>
>

I meant to add: this are build-tested only (now back to vacation time
for me, which means slave labor in the garage!).

I will catch up with you guys on Monday.

-Greg



Attachments:
signature.asc (257.00 B)
OpenPGP digital signature

2008-10-03 13:50:45

by Gilles Carry

[permalink] [raw]
Subject: Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task

Sorry Greg,

Neither PPC64 nor Intel64 make it with this patch.
At boot time, it stops at the BUG_ON you added:
0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102)

I let you do more investigations.
Have a good week-end in you garage ;)

Gilles.


PPC64:
cpu 0x2: Vector: 700 (Program Check) at [c0000000ee2877b0]
pc: c00000000004eca4: .push_rt_task+0x1f4/0x2d0
lr: c00000000004ec24: .push_rt_task+0x174/0x2d0
sp: c0000000ee287a30
msr: 8000000000021032
current = 0xc0000000ee276fe0
paca = 0xc0000000005c3780
pid = 36, comm = sirq-block/2
kernel BUG at kernel/sched_rt.c:1102!
enter ? for help
[c0000000ee287a30] c00000000004ec78 .push_rt_task+0x1c8/0x2d0 (unreliable)
[c0000000ee287ae0] c00000000004eda4 .push_rt_tasks+0x24/0x44
[c0000000ee287b70] c00000000004edf0 .post_schedule_rt+0x2c/0x50
[c0000000ee287c00] c000000000052864 .finish_task_switch+0x100/0x1a8
[c0000000ee287cb0] c0000000002cdbd0 .__schedule+0x6a0/0x75c
[c0000000ee287d90] c0000000002cdedc .schedule+0xf4/0x128
[c0000000ee287e20] c000000000061700 .ksoftirqd+0x124/0x37c
[c0000000ee287f00] c000000000076dc0 .kthread+0x84/0xd4
[c0000000ee287f90] c000000000029368 .kernel_thread+0x4c/0x68
2:mon>

Intel64:
kernel BUG at kernel/sched_rt.c:1102!
invalid opcode: 0000 [1] PREEMPT SMP
CPU 4
Modules linked in: mptsas scsi_transport_sas
Pid: 61, comm: sirq-block/4 Not tainted 2.6.26.5-rt9-00002-g3b27927-dirty #26
RIP: 0010:[<ffffffff8022b307>] [<ffffffff8022b307>] push_rt_task+0x15f/0x20b
RSP: 0018:ffff81007f4d5d70 EFLAGS: 00010097
RAX: 0000000000000000 RBX: ffff81007edf09d0 RCX: 000000000822b765
RDX: 000000000822b765 RSI: 0000000000000000 RDI: ffff81000103f280
RBP: ffff81007f4d5da0 R08: ffff81007f4d4000 R09: ffff81007edcbe20
R10: 00000000ffffffff R11: ffffffff8021fa2c R12: 0000000000000000
R13: ffff810001034280 R14: ffff81007edf09e0 R15: ffff81000103f280
FS: 00007f2f26e776f0(0000) GS:ffff81007fc0ccc0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006b9fb0 CR3: 00000001bf4c9000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sirq-block/4 (pid: 61, threadinfo ffff81007f4d4000, task
ffff81007f4d0e10)
Stack: 000000007f4d5e00 ffff81000103f280 ffff81007edf09d0 ffff8101bf457540
0000000000000001 0000000000000002 ffff81007f4d5dc0 ffffffff8022b3c7
ffff81007f4d5de0 ffff81000103f280 ffff81007f4d5de0 ffffffff8022b3e8
Call Trace:
[<ffffffff8022b3c7>] push_rt_tasks+0x14/0x1c
[<ffffffff8022b3e8>] post_schedule_rt+0x19/0x25
[<ffffffff8022d7ee>] finish_task_switch+0x73/0x121
[<ffffffff805bbe3d>] thread_return+0x4f/0xdc
[<ffffffff805bc066>] schedule+0xd4/0xf0
[<ffffffff80237eeb>] ksoftirqd+0xb3/0x260
[<ffffffff80237e38>] ? ksoftirqd+0x0/0x260
[<ffffffff80245209>] ? kthread+0x47/0x76
[<ffffffff8022e9f9>] ? schedule_tail+0x43/0x97
[<ffffffff8020c3d8>] ? child_rip+0xa/0x12
[<ffffffff802451c2>] ? kthread+0x0/0x76
[<ffffffff8020c3ce>] ? child_rip+0x0/0x12


Code: 48 c7 c6 c0 1d 23 80 e8 83 b3 03 00 e9 ee fe ff ff 4c 89 e7 e8 b1 31 39
00 eb ba 48 8b 43 08 8b 40 18 41 3b 87 90 0e 00 00 74 04 <0f> 0b eb fe 48 89
de 4c 89 ff e8 5b fe ff ff f0 41 ff 0e 0f 94
RIP [<ffffffff8022b307>] push_rt_task+0x15f/0x20b
RSP <ffff81007f4d5d70>


Gregory Haskins wrote:
> A panic was discovered by Chirag Jog and investigated by Gilles Carry
> to be originating in the fact that a task being pushed away
> may get migrated away during a double_lock_balance. The result was
> that the pushable_tasks list may become corrupted.
>
> The root cause is that the "paranoid" retry limit could cause us to
> bail out of a retry, but still try to remove the item from the (now
> potentially incorrect) list. There are numerous ways to correct the
> condition, but the paranoid feature is no longer relevant with the new
> pushable logic (since pushable naturally limits the loop anyway), so
> lets just remove it.
>
> Reported By: Chirag Jog <[email protected]>
> Found-by: Gilles Carry <[email protected]>
> Signed-off-by: Gregory Haskins <[email protected]>
> ---
>
> kernel/sched_rt.c | 5 +++--
> 1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index 59ead84..5a754fe 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq)
> {
> struct task_struct *next_task;
> struct rq *lowest_rq;
> - int paranoid = RT_MAX_TRIES;
>
> if (!rq->rt.overloaded)
> return 0;
> @@ -1094,12 +1093,14 @@ static int push_rt_task(struct rq *rq)
> * If it has, then try again.
> */
> task = pick_next_pushable_task(rq);
> - if (unlikely(task != next_task) && task && paranoid--) {
> + if (unlikely(task != next_task) && task) {
> put_task_struct(next_task);
> next_task = task;
> goto retry;
> }
>
> + BUG_ON(task_cpu(next_task) != rq->cpu);
> +
> /*
> * Once we have failed to push this task, we will not
> * try again, since the other cpus will pull from us
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>

2008-10-03 15:48:36

by Chirag Jog

[permalink] [raw]
Subject: Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task

* Gilles Carry <[email protected]> [2008-10-03 15:46:59]:

> Sorry Greg,
>
> Neither PPC64 nor Intel64 make it with this patch.
> At boot time, it stops at the BUG_ON you added:
> 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102)
I am also confirming this issue gilles reported.

Although, i have a question:
When we enable group scheduling (CONFIG_RT_GROUP_SCHED),
all the problems disappear.
After skimming through the rt scheduler code, I don't feel group
scheduling alters the behavior of push/pull strategies in any way.

So I am wonder whether enabling group scheduling
actually solves the problem or just makes it tough
to recreate?

> I let you do more investigations.
> Have a good week-end in you garage ;)
Have a great weekend :)
>
--
-Thanks,Chirag

2008-10-03 17:21:48

by Gregory Haskins

[permalink] [raw]
Subject: [RT PATCH v2 0/2] Series short description

Gilles Carry wrote:
> Sorry Greg,
>
> Neither PPC64 nor Intel64 make it with this patch.
> At boot time, it stops at the BUG_ON you added:
> 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102)
>


Indeed. Your report has revealed the problem to me.

The issue is that there are three conditions embedded in that if(!lower_rq)
code, but two are buried in the !retry case. This was the mistake I was making.

We basically need to

a) dequeue if the task hasnt moved
b) retry if the task *has* moved AND there are more tasks left
c) stop of the task *has* moved AND there are no more tasks

I was missing logic to handle (c). "v2" should fix this so it is handled.
Please give it a try. Thanks again, Gilles!

(Again, only build-tested)

Regards,
-Greg


---

Gregory Haskins (2):
RT: remove "paranoid" limit in push_rt_task
RT: Remove comment that is no longer true


kernel/sched_rt.c | 44 ++++++++++++++++++++++----------------------
1 files changed, 22 insertions(+), 22 deletions(-)

--
Signature

2008-10-03 17:22:39

by Gregory Haskins

[permalink] [raw]
Subject: [RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task

A panic was discovered by Chirag Jog and investigated by Gilles Carry
to be originating in the fact that a task being pushed away
may get migrated away during a double_lock_balance. The result was
that the pushable_tasks list may become corrupted.

The root cause is that the "paranoid" retry limit could cause us to
bail out of a retry, but still try to remove the item from the (now
potentially incorrect) list. There are numerous ways to correct the
condition, but the paranoid feature is no longer relevant with the new
pushable logic (since pushable naturally limits the loop anyway), so
lets just remove it.

Reported By: Chirag Jog <[email protected]>
Found-by: Gilles Carry <[email protected]>
Signed-off-by: Gregory Haskins <[email protected]>
---

kernel/sched_rt.c | 34 ++++++++++++++++++++++------------
1 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 59ead84..201bd97 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq)
{
struct task_struct *next_task;
struct rq *lowest_rq;
- int paranoid = RT_MAX_TRIES;

if (!rq->rt.overloaded)
return 0;
@@ -1090,23 +1089,34 @@ static int push_rt_task(struct rq *rq)
struct task_struct *task;
/*
* find lock_lowest_rq releases rq->lock
- * so it is possible that next_task has changed.
- * If it has, then try again.
+ * so it is possible that next_task has migrated.
+ *
+ * We need to make sure that the task is still on the same
+ * run-queue and is also still the next task eligible for
+ * pushing.
*/
task = pick_next_pushable_task(rq);
- if (unlikely(task != next_task) && task && paranoid--) {
- put_task_struct(next_task);
- next_task = task;
- goto retry;
+ if (task_cpu(next_task) == rq->cpu && task == next_task) {
+ /*
+ * If we get here, the task hasnt moved it all, but
+ * it has failed to push. We will not try again,
+ * since the other cpus will pull from us when they
+ * are ready.
+ */
+ dequeue_pushable_task(rq, next_task);
+ goto out;
}
+
+ if (!task)
+ /* No more tasks, just exit */
+ goto out;

/*
- * Once we have failed to push this task, we will not
- * try again, since the other cpus will pull from us
- * when they are ready
+ * Something has shifted, try again.
*/
- dequeue_pushable_task(rq, next_task);
- goto out;
+ put_task_struct(next_task);
+ next_task = task;
+ goto retry;
}

deactivate_task(rq, next_task, 0);

2008-10-03 17:22:24

by Gregory Haskins

[permalink] [raw]
Subject: [RT PATCH v2 1/2] RT: Remove comment that is no longer true

We fixed the condition noted in the comment with the "pushable_tasks"
logic, but forgot to remove this comment. Lets clean it up.

Signed-off-by: Gregory Haskins <[email protected]>
---

kernel/sched_rt.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 57a0c0d..59ead84 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1125,16 +1125,6 @@ out:
return 1;
}

-/*
- * TODO: Currently we just use the second highest prio task on
- * the queue, and stop when it can't migrate (or there's
- * no more RT tasks). There may be a case where a lower
- * priority RT task has a different affinity than the
- * higher RT task. In this case the lower RT task could
- * possibly be able to migrate where as the higher priority
- * RT task could not. We currently ignore this issue.
- * Enhancements are welcome!
- */
static void push_rt_tasks(struct rq *rq)
{
/* push_rt_task will return true if it moved an RT */

2008-10-03 17:23:27

by Gregory Haskins

[permalink] [raw]
Subject: Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task

Chirag Jog wrote:
> * Gilles Carry <[email protected]> [2008-10-03 15:46:59]:
>
>
>> Sorry Greg,
>>
>> Neither PPC64 nor Intel64 make it with this patch.
>> At boot time, it stops at the BUG_ON you added:
>> 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102)
>>
> I am also confirming this issue gilles reported.
>
> Although, i have a question:
> When we enable group scheduling (CONFIG_RT_GROUP_SCHED),
> all the problems disappear.
> After skimming through the rt scheduler code, I don't feel group
> scheduling alters the behavior of push/pull strategies in any way.
>
> So I am wonder whether enabling group scheduling
> actually solves the problem or just makes it tough
> to recreate?
>
Hi Chirag,


The issue that Gilles pointed me at is a race condition. I do not
suspect GROUP_SCHED itself has anything to do with the problem other
than it changes the timing. HTH!

-Greg




Attachments:
signature.asc (257.00 B)
OpenPGP digital signature