2006-08-19 15:06:29

by Oleg Nesterov

Subject: [PATCH 1/4] for 2.6.18, revert "Drop tasklist lock in do_sched_setscheduler"

sched_setscheduler() looks at ->signal->rlim[]. It is unsafe to dereference
->signal unless tasklist_lock or ->siglock is held (or p == current). We pin
the task structure, but this can't prevent release_task()->__exit_signal()
from setting ->signal = NULL.

Restore tasklist_lock across the setscheduler call.

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.18-rc4/kernel/sched.c~1_revert 2006-08-19 17:50:56.000000000 +0400
+++ 2.6.18-rc4/kernel/sched.c 2006-08-19 18:15:15.000000000 +0400
@@ -4162,10 +4162,8 @@ do_sched_setscheduler(pid_t pid, int pol
 		read_unlock_irq(&tasklist_lock);
 		return -ESRCH;
 	}
-	get_task_struct(p);
-	read_unlock_irq(&tasklist_lock);
 	retval = sched_setscheduler(p, policy, &lparam);
-	put_task_struct(p);
+	read_unlock_irq(&tasklist_lock);
 
 	return retval;
 }
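
The point about pinning can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct task_model, struct signal_model, and every function below are hypothetical stand-ins invented for illustration, and the -1 return marks the spot where the real kernel would dereference a NULL ->signal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct signal_model {
	unsigned long rlim_rtprio;	/* models ->signal->rlim[RLIMIT_RTPRIO] */
};

struct task_model {
	int usage;			/* models the task_struct reference count */
	struct signal_model *signal;	/* models p->signal */
};

static struct task_model *new_task(unsigned long rtprio)
{
	struct task_model *p = malloc(sizeof(*p));
	struct signal_model *sig = malloc(sizeof(*sig));

	if (!p || !sig)
		abort();
	sig->rlim_rtprio = rtprio;
	p->usage = 1;
	p->signal = sig;
	return p;
}

/* Models get_task_struct(): pins the task memory only. */
static void get_task(struct task_model *p)
{
	p->usage++;
}

/* Models put_task_struct(): frees the task when the last reference goes. */
static void put_task(struct task_model *p)
{
	assert(p->usage > 0);
	if (--p->usage == 0)
		free(p);
}

/* Models release_task()->__exit_signal(): clears ->signal whatever the refcount is. */
static void release_task_model(struct task_model *p)
{
	free(p->signal);
	p->signal = NULL;
	put_task(p);
}

/* Models the rlimit check; returns -1 where the kernel would have oopsed. */
static long read_rtprio_limit(const struct task_model *p)
{
	if (p->signal == NULL)
		return -1;
	return (long)p->signal->rlim_rtprio;
}

/* The task exits in the window between pinning it and reading ->signal. */
static long lookup_after_exit(void)
{
	struct task_model *p = new_task(5);
	long seen;

	get_task(p);			/* pin, as the reverted code did */
	release_task_model(p);		/* task is reaped meanwhile */
	seen = read_rtprio_limit(p);	/* p is valid memory, ->signal is gone */
	put_task(p);
	return seen;
}
```

In this model lookup_after_exit() reports the missing ->signal even though the extra reference keeps the task memory itself alive, which is why the reference count alone is not enough and the lock has to cover the whole call.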


2006-08-19 15:19:15

by Oleg Nesterov

Subject: Re: [PATCH 1/4] for 2.6.18, revert "Drop tasklist lock in do_sched_setscheduler"

On 08/19, Oleg Nesterov wrote:
>
> sched_setscheduler() looks at ->signal->rlim[]. It is unsafe to dereference
> ->signal unless tasklist_lock or ->siglock is held (or p == current). We pin
> the task structure, but this can't prevent release_task()->__exit_signal()
> from setting ->signal = NULL.

See the testcase below; it oopses 2.6.18-rc4. The stable tree is OK: the
problem was introduced during 2.6.18 development.

Oleg.

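#!/usr/bin/perl

pipe R, W;

if (fork) {
	# Parent: read each pid from the pipe and retry sched_setscheduler()
	# on it for as long as the call fails with EPERM.
	while (sysread R, $_, 4) {
		do {
			# 156 is __NR_sched_setscheduler on i386; policy 1 is SCHED_FIFO.
			syscall 156, unpack('i', $_), 1, pack('i', 1);
		} while $! == 1;	# EPERM
	}
} else {
	# Child: keep forking; each new child leaves the loop, reports its
	# pid, and exits, so the parent races against release_task().
	wait while fork;
	syswrite W, pack 'i', $$;
}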