2003-02-16 01:29:02

by Linus Torvalds

Subject: Re: Fw: 2.5.61 oops running SDET


On Sat, 15 Feb 2003, Andrew Morton wrote:
>
> The recent change to fs/proc/array.c:task_sig()?
>
> buffer += sprintf(buffer, "ShdPnd:\t");
> buffer = render_sigset_t(&p->signal->shared_pending.signal, buffer);

Yeah, but I think the bug has existed for much longer.

It looks like "proc_pid_status()" doesn't actually lock the task at all,
nor even bother to test whether the task has signal state. Which has
_always_ been a bug. I don't know why it would start happening now, but it
might just be unlucky timing.

I think the proper fix is to put a

task_lock()
..
task_unlock()

around the whole proc_pid_status() function, _and_ then verify that
"tsk->sighand" is non-NULL.

(Oh, careful, that's already what "get_task_mm()" does internally, so look
out for deadlocks - you'd need to open-code the get_task_mm() in there
too, so the end result is something like

task_lock(task)
if (task->mm) {
.. mm state
}
if (task->sighand) {
.. signal state
}
..
task_unlock(task);

instead).

Linus


2003-02-16 02:00:17

by Martin J. Bligh

Subject: Re: Fw: 2.5.61 oops running SDET

> Yeah, but I think the bug has existed for much longer.
>
> It looks like "proc_pid_status()" doesn't actually lock the task at all,
> nor even bother to test whether the task has signal state. Which has
> _always_ been a bug. I don't know why it would start happening now, but
> it might just be unlucky timing.
>
> I think the proper fix is to put a
>
> task_lock()
> ..
> task_unlock()
>
> around the whole proc_pid_status() function, _and_ then verify that
> "tsk->sighand" is non-NULL.
>
> (Oh, careful, that's already what "get_task_mm()" does internally, so
> look out for deadlocks - you'd need to open-code the get_task_mm() in
> there too, so the end result is something like
>
> task_lock(task)
> if (task->mm) {
> .. mm state
> }
> if (task->sighand) {
> .. signal state
> }
> ..
> task_unlock(task);
>
> instead).

Something like this? Compiles, but not tested yet ...

diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c
proc/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ proc/fs/proc/array.c Sat Feb 15 18:05:10 2003
@@ -243,8 +243,10 @@ extern char *task_mem(struct mm_struct *
int proc_pid_status(struct task_struct *task, char * buffer)
{
char * orig = buffer;
- struct mm_struct *mm = get_task_mm(task);
+ struct mm_struct *mm;

+ task_lock(task);
+ mm = __get_task_mm(task);
buffer = task_name(task, buffer);
buffer = task_state(task, buffer);

@@ -257,6 +259,7 @@ int proc_pid_status(struct task_struct *
#if defined(CONFIG_ARCH_S390)
buffer = task_show_regs(task, buffer);
#endif
+ task_unlock(task);
return buffer - orig;
}

diff -urpN -X /home/fletch/.diff.exclude virgin/include/linux/sched.h
proc/include/linux/sched.h
--- virgin/include/linux/sched.h Sat Feb 15 16:11:47 2003
+++ proc/include/linux/sched.h Sat Feb 15 18:04:42 2003
@@ -706,6 +706,18 @@ static inline struct mm_struct * get_tas

return mm;
}
+
+/* lockless version of get_task_mm */
+static inline struct mm_struct * __get_task_mm(struct task_struct * task)
+{
+ struct mm_struct * mm;
+
+ mm = task->mm;
+ if (mm)
+ atomic_inc(&mm->mm_users);
+
+ return mm;
+}


/* set thread flags in other task's structures

2003-02-16 02:21:02

by Linus Torvalds

Subject: Re: Fw: 2.5.61 oops running SDET


On Sat, 15 Feb 2003, Martin J. Bligh wrote:
>
> Something like this? Compiles, but not tested yet ...

Close, but you also need

- buffer = task_sig(task, buffer);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);

to actually check whether signals exist or not. Otherwise you'll just get
the same oops anyway (well, keeping the task locked may change timings
enough that it doesn't happen, but the bug will continue to be there).

I would also say that since you explicitly take the task lock, there's no
real reason to use "get_task_mm()" at all any more, so instead of doing
that (and then doing the mmput()), just get rid of the mm variable
entirely, and do

if (task->mm)
buffer = task_mem(task->mm, buffer)

too. There's really no downside to just holding on to the task lock over
the whole operation instead of incrementing and decrementing the mm
counts and dropping the lock early.

(There are a few valid reasons for using the "get_task_mm()" function:

- you need to block and thus drop the task lock

- the original code just used "task->mm" directly, and using
"get_task_mm()" made the original conversion to safe mm handling
easier.

Neither reason is really valid any more in this function at least).

Linus

2003-02-16 02:40:00

by Zwane Mwaikambo

Subject: Re: Fw: 2.5.61 oops running SDET

On Sat, 15 Feb 2003, Martin J. Bligh wrote:

> - struct mm_struct *mm = get_task_mm(task);
> + struct mm_struct *mm;
>
> + task_lock(task);
> + mm = __get_task_mm(task);
> buffer = task_name(task, buffer);
> buffer = task_state(task, buffer);

task_state acquires the task_struct lock.

Zwane
--
function.linuxpower.ca

2003-02-16 03:51:16

by Martin J. Bligh

Subject: Re: Fw: 2.5.61 oops running SDET

> Close, but you also need
>
> - buffer = task_sig(task, buffer);
> + if (task->sighand)
> + buffer = task_sig(task, buffer);
>
> to actually check whether signals exist or not. Otherwise you'll just get
> the same oops anyway (well, keeping the task locked may change timings
> enough that it doesn't happen, but the bug will continue to be there).
>
> I would also say that since you explicitly take the task lock, there's no
> real reason to use "get_task_mm()" at all any more, so instead of doing
> that (and then doing the mmput()), just get rid of the mm variable
> entirely, and do
>
> if (task->mm)
> buffer = task_mem(task->mm, buffer)
>
> too. There's really no downside to just holding on to the task lock over
> the whole operation instead of incrementing and decrementing the mm
> counts and dropping the lock early.
>
> (There are a few valid reasons for using the "get_task_mm()" function:
>
> - you need to block and thus drop the task lock
>
> - the original code just used "task->mm" directly, and using
> "get_task_mm()" made the original conversion to safe mm handling
> easier.
>
> Neither reason is really valid any more in this function at least).

OK, I did the following, which is what I think you wanted, plus Zwane's
observation that task_state acquires the task_struct lock (we're the only
caller, so I just removed it), but I still get the same panic and this time
the box hung. No doubt I've just done something stupid in the patch ...
(task_state takes the tasklist_lock ... is that safe to do inside task_lock?)

diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ sdet/fs/proc/array.c Sat Feb 15 19:28:34 2003
@@ -166,12 +166,10 @@ static inline char * task_state(struct t
p->uid, p->euid, p->suid, p->fsuid,
p->gid, p->egid, p->sgid, p->fsgid);
read_unlock(&tasklist_lock);
- task_lock(p);
buffer += sprintf(buffer,
"FDSize:\t%d\n"
"Groups:\t",
p->files ? p->files->max_fds : 0);
- task_unlock(p);

for (g = 0; g < p->ngroups; g++)
buffer += sprintf(buffer, "%d ", p->groups[g]);
@@ -243,20 +241,20 @@ extern char *task_mem(struct mm_struct *
int proc_pid_status(struct task_struct *task, char * buffer)
{
char * orig = buffer;
- struct mm_struct *mm = get_task_mm(task);

+ task_lock(task);
buffer = task_name(task, buffer);
buffer = task_state(task, buffer);

- if (mm) {
- buffer = task_mem(mm, buffer);
- mmput(mm);
- }
- buffer = task_sig(task, buffer);
+ if (task->mm)
+ buffer = task_mem(task->mm, buffer);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);
buffer = task_cap(task, buffer);
#if defined(CONFIG_ARCH_S390)
buffer = task_show_regs(task, buffer);
#endif
+ task_unlock(task);
return buffer - orig;
}




Unable to handle kernel NULL pointer dereference at virtual address 00000014
printing eip:
c0118b13
*pde = 2d4bf001
*pte = 00000000
Oops: 0000
CPU: 6
EIP: 0060:[<c0118b13>] Not tainted
EFLAGS: 00010207
EIP is at render_sigset_t+0x17/0x7c
eax: 00000010 ebx: ed3d909f ecx: ed3d9097 edx: eda60100
esi: 0000003c edi: 00000010 ebp: ed49bf04 esp: ed49bef8
ds: 007b es: 007b ss: 0068
Process ps (pid: 3339, threadinfo=ed49a000 task=ee32cd20)
Stack: ed3d907d 00000002 ed3d909f 00000006 c016d85f 00000010 ed3d909f ed3d9097
c0233f38 eda606bc ed3d9086 ed3d907e c0233f2f ee1d7b70 eda60100 eeb4d0c0
ed3d9000 ee1d7b70 00000000 ed3d9000 000000d0 eeb4d0c0 c01300f6 c17119e8
Call Trace:
[<c016d85f>] proc_pid_status+0x26f/0x334
[<c01300f6>] __get_free_pages+0x4e/0x54
[<c016b517>] proc_info_read+0x53/0x130
[<c0145295>] vfs_read+0xa5/0x128
[<c0145522>] sys_read+0x2a/0x3c
[<c0108b3f>] syscall_call+0x7/0xb

Code: 0f a3 37 19 d2 b8 01 00 00 00 31 c9 85 d2 0f 45 c8 8d 56 01
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000014
printing eip:
c0118b13
*pde = 2d512001
*pte = 00000000
Oops: 0000
CPU: 2
EIP: 0060:[<c0118b13>] Not tainted
EFLAGS: 00010207
EIP is at render_sigset_t+0x17/0x7c
eax: 00000010 ebx: ee34b0a0 ecx: ee34b098 edx: ed3ea7a0
esi: 0000003c edi: 00000010 ebp: ed54ff04 esp: ed54fef8
ds: 007b es: 007b ss: 0068
Process ps (pid: 14710, threadinfo=ed54e000 task=ee430d40)
Stack: ee34b07e 00000002 ee34b0a0 00000006 c016d85f 00000010 ee34b0a0 ee34b098
c0233f38 ed3ead5c ee34b087 ee34b07f c0233f2f ee4960d0 ed3ea7a0 eec66ca0
ee34b000 ee4960d0 00000000 ee34b000 000000d0 eec66ca0 c01300f6 c17383b8
Call Trace:
[<c016d85f>] proc_pid_status+0x26f/0x334
[<c01300f6>] __get_free_pages+0x4e/0x54
[<c016b517>] proc_info_read+0x53/0x130
[<c0145295>] vfs_read+0xa5/0x128
[<c0145522>] sys_read+0x2a/0x3c
[<c0108b3f>] syscall_call+0x7/0xb

Code: 0f a3 37 19 d2 b8 01 00 00 00 31 c9 85 d2 0f 45 c8 8d 56 01

2003-02-16 12:59:59

by Anton Blanchard

Subject: Re: Fw: 2.5.61 oops running SDET


> OK, I did the following, which is what I think you wanted, plus Zwane's
> observation that task_state acquires the task_struct lock (we're the only
> caller, so I just removed it), but I still get the same panic and this time
> the box hung. No doubt I've just done something stupid in the patch ...
> (task_state takes the tasklist_lock ... is that safe to do inside task_lock?)

It looks like you now try and take the tasklist_lock inside a task_lock()
region, but task_lock says no-can-do:

/* Protects ->fs, ->files, ->mm, and synchronises with wait4(). Nests
* inside tasklist_lock */
static inline void task_lock(struct task_struct *p)
{
...

Anton

2003-02-16 16:30:03

by Martin J. Bligh

Subject: Re: Fw: 2.5.61 oops running SDET

>> OK, I did the following, which is what I think you wanted, plus Zwane's
>> observation that task_state acquires the task_struct lock (we're the only
>> caller, so I just removed it), but I still get the same panic and this time
>> the box hung. No doubt I've just done something stupid in the patch ...
>> (task_state takes the tasklist_lock ... is that safe to do inside task_lock?)
>
> It looks like you now try and take the tasklist_lock inside a task_lock()
> region, but task_lock says no-can-do:
>
> /* Protects ->fs, ->files, ->mm, and synchronises with wait4(). Nests
> * inside tasklist_lock */
> static inline void task_lock(struct task_struct *p)
> {
> ...

Right, that's what I expected, and that may well explain the hang, but I
don't see how that explains the oops still happening. What exactly is
tasklist_lock protecting here anyway?

read_lock(&tasklist_lock);
buffer += sprintf(buffer,
"State:\t%s\n"
"Tgid:\t%d\n"
"Pid:\t%d\n"
"PPid:\t%d\n"
"TracerPid:\t%d\n"
"Uid:\t%d\t%d\t%d\t%d\n"
"Gid:\t%d\t%d\t%d\t%d\n",
get_task_state(p), p->tgid,
p->pid, p->pid ? p->real_parent->pid : 0,
p->pid && p->ptrace ? p->parent->pid : 0,
p->uid, p->euid, p->suid, p->fsuid,
p->gid, p->egid, p->sgid, p->fsgid);
read_unlock(&tasklist_lock);

Is it these two accesses:

p->real_parent->pid ?
p->parent->pid ?

Don't see what I can do for this apart from to invert the ordering and take
tasklist_lock around the whole function, and nest task_lock inside that, or
I suppose I could take the task_lock for each of the parents? I seem to
recall Linus reminding people recently that it was only the lock
acquisition order that was important, not release ... does something like
the following look OK?


diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet2/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ sdet2/fs/proc/array.c Sun Feb 16 08:37:45 2003
@@ -147,11 +147,11 @@ static inline const char * get_task_stat
return *p;
}

+/* Call me with the tasklist_lock and task_lock for p held already */
static inline char * task_state(struct task_struct *p, char *buffer)
{
int g;

- read_lock(&tasklist_lock);
buffer += sprintf(buffer,
"State:\t%s\n"
"Tgid:\t%d\n"
@@ -165,13 +165,10 @@ static inline char * task_state(struct t
p->pid && p->ptrace ? p->parent->pid : 0,
p->uid, p->euid, p->suid, p->fsuid,
p->gid, p->egid, p->sgid, p->fsgid);
- read_unlock(&tasklist_lock);
- task_lock(p);
buffer += sprintf(buffer,
"FDSize:\t%d\n"
"Groups:\t",
p->files ? p->files->max_fds : 0);
- task_unlock(p);

for (g = 0; g < p->ngroups; g++)
buffer += sprintf(buffer, "%d ", p->groups[g]);
@@ -243,20 +240,22 @@ extern char *task_mem(struct mm_struct *
int proc_pid_status(struct task_struct *task, char * buffer)
{
char * orig = buffer;
- struct mm_struct *mm = get_task_mm(task);

+ read_lock(&tasklist_lock);
+ task_lock(task);
buffer = task_name(task, buffer);
buffer = task_state(task, buffer);
+ read_unlock(&tasklist_lock);

- if (mm) {
- buffer = task_mem(mm, buffer);
- mmput(mm);
- }
- buffer = task_sig(task, buffer);
+ if (task->mm)
+ buffer = task_mem(task->mm, buffer);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);
buffer = task_cap(task, buffer);
#if defined(CONFIG_ARCH_S390)
buffer = task_show_regs(task, buffer);
#endif
+ task_unlock(task);
return buffer - orig;
}





2003-02-16 18:01:34

by Linus Torvalds

Subject: Re: Fw: 2.5.61 oops running SDET


On Sat, 15 Feb 2003, Martin J. Bligh wrote:
>
> OK, I did the following, which is what I think you wanted, plus Zwane's
> observation that task_state acquires the task_struct lock (we're the only
> caller, so I just removed it), but I still get the same panic and this time
> the box hung.

Yeah, a closer look shows that the exit path doesn't actually take the
task lock at all around any of the signal stuff, so the lock protects
"task->mm", "task->files" and "task->fs", but it does NOT protect
"task->signal" or "task->sighand" (illogical, but true).

Oh, damn.

The code that checks for "p->sighand" takes the tasklist lock for this
reason (see "collect_sigign_sigcatch()").

So the choice seems to be either:

- make the exit path hold the task lock over the whole exit path, not
just over mm exit.

- take the "tasklist_lock" over more of "task_sig()" (not just the
collect_sigign_sigcatch() thing, but the "&p->signal->shared_pending"
rendering too).

The latter is a two-liner. The former is the "right thing" for multiple
reasons.

The reason I'd _like_ to see the task lock held over _all_ of the fields
in the exit() path is:

- right now we actually take it and drop it multiple times in exit. See
__exit_files(), __exit_fs(), __exit_mm(). Which all take it just to
serialize setting of "mm/files/fs" to NULL.

- task_lock is a nice local lock, no global scalability impact.

So it really would be much nicer to do something like this in do_exit():

struct mm_struct *mm;
struct files_struct *files;
struct fs_struct *fs;
struct signal_struct *signal;
struct sighand_struct *sighand;

task_lock(task);
mm = task->mm;
files = task->files;
fs = task->fs;
signal = task->signal;
sighand = task->sighand;

task->mm = NULL;
task->files = NULL;
task->fs = NULL;
task->signal = NULL;
task->sighand = NULL;
task_unlock(task);

.. actually do the "__exit_mm(task, mm)" etc here ..

which would make things a lot more readable, and result in us taking the
lock only _once_ instead of three times, and would properly protect
"signal" and "sighand" so that the /proc code wouldn't need to take the
heavily used "tasklist_lock" just to read the signal state for a single
task.

But fixing up exit to do the above would require several (trivial) calling
convention changes, like changing

static inline void __exit_mm(struct task_struct * tsk)
{
struct mm_struct *mm = tsk->mm;
...

into

static inline void __exit_mm(struct task_struct * tsk,
struct mm_struct *mm)
{
...

instead and updating the callers.

Is anybody willing to do that (hopefully fairly trivial) fixup and test
it, or should we go with the stupid "take the 'tasklist_lock'" approach?

Linus

2003-02-16 18:14:59

by Linus Torvalds

Subject: Re: Fw: 2.5.61 oops running SDET


On Sun, 16 Feb 2003, Martin J. Bligh wrote:
>
> Right, that's what I expected, and that may well explain the hang, but I
> don't see how that explains the oops still happening. What exactly is
> tasklist_lock protecting here anyway?
>
> read_lock(&tasklist_lock);
> buffer += sprintf(buffer,
> "State:\t%s\n"
> "Tgid:\t%d\n"
> "Pid:\t%d\n"
> "PPid:\t%d\n"
> "TracerPid:\t%d\n"
> "Uid:\t%d\t%d\t%d\t%d\n"
> "Gid:\t%d\t%d\t%d\t%d\n",
> get_task_state(p), p->tgid,
> p->pid, p->pid ? p->real_parent->pid : 0,
> p->pid && p->ptrace ? p->parent->pid : 0,
> p->uid, p->euid, p->suid, p->fsuid,
> p->gid, p->egid, p->sgid, p->fsgid);
> read_unlock(&tasklist_lock);
>
> Is it these two accesses:
>
> p->real_parent->pid ?
> p->parent->pid ?

Yup.

> Don't see what I can do for this apart from to invert the ordering and take
> tasklist_lock around the whole function, and nest task_lock inside that, or
> I suppose I could take the task_lock for each of the parents? I seem to
> recall Linus reminding people recently that it was only the lock
> acquisition order that was important, not release ... does something like
> the following look OK?

This patch looks like it should certainly fix the problem, but that is
still some god-awful ugly overkill in locking.

I'd rather make the rule be that you have to take the task lock before
modifying things like the parent pointers (and all the other fundamental
pointers), since that's already the rule for most of it anyway.

And then the tasklist lock would go away _entirely_ from /proc (except for
task lookup in ->readdir/->lookup, of course, where it is fundamentally
needed and proper - and will probably some day be replaced by RCU, I
suspect).

Linus

2003-02-16 18:16:48

by Martin J. Bligh

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET

> So the choice seems to be either:
>
> - make the exit path hold the task lock over the whole exit path, not
> just over mm exit.
>
> - take the "tasklist_lock" over more of "task_sig()" (not just the
> collect_sigign_sigcatch() thing, but the "&p->signal->shared_pending"
> rendering too.
>
> The latter is a two-liner. The former is the "right thing" for multiple
> reasons.
>
> The reason I'd _like_ to see the task lock held over _all_ of the fields
> in the exit() path is:
>
> - right now we actually take it and drop it multiple times in exit. See
> __exit_files(), __exit_fs(), __exit_mm(). Which all take it just to
> serialize setting of "mm/files/fs" to NULL.
>
> - task_lock is a nice local lock, no global scalability impact.

Don't we have to take tasklist_lock anyway for when task_state reads
p->real_parent->pid and p->parent->pid? Or should we have to grab
the task_lock for those too? If so, it sounds like it could get into
rather horrible ordering dependencies to me.

If we move exit under task_lock ... then tasklist_lock doesn't help us
any more there either (presumably we're trying to stop them exiting
whilst we look at them) ... I've coded up the stupid approach for now,
and will check that works first. Should have results very soon.

M.

2003-02-16 18:26:52

by Martin J. Bligh

Subject: Re: Fw: 2.5.61 oops running SDET

>> The reason I'd _like_ to see the task lock held over _all_ of the fields
>> in the exit() path is:
>>
>> - right now we actually take it and drop it multiple times in exit. See
>> __exit_files(), __exit_fs(), __exit_mm(). Which all take it just to
>> serialize setting of "mm/files/fs" to NULL.
>>
>> - task_lock is a nice local lock, no global scalability impact.
>
> Don't we have to take tasklist_lock anyway for when task_state reads
> p->real_parent->pid and p->parent->pid? Or should we have to grab
> the task_lock for those too? If so, it sounds like it could get into
> rather horrible ordering dependencies to me.
>
> If we move exit under task_lock ... then tasklist_lock doesn't help us
> any more there either (presumably we're trying to stop them exiting
> whilst we look at them) ... I've coded up the stupid approach for now,
> and will check that works first. Should have results very soon.

Hmmmm ... I guess we could keep real_parent->pid and parent->pid inside
the child's task_struct ... I can't see those change unless we're changing
real_parent / parent task pointers anyway. Not sure how much that idea
will make people vomit, but it would seem to make it easier to get rid
of the global locking, and it's probably nicer than trying to order the
acquisition of other people's task locks ...

M.

2003-02-16 18:56:52

by Martin J. Bligh

Subject: Re: Fw: 2.5.61 oops running SDET

>> Don't see what I can do for this apart from to invert the ordering and take
>> tasklist_lock around the whole function, and nest task_lock inside that, or
>> I suppose I could take the task_lock for each of the parents? I seem to
>> recall Linus reminding people recently that it was only the lock
>> acquisition order that was important, not release ... does something like
>> the following look OK?
>
> This patch looks like it should certainly fix the problem, but that is
> still some god-awful ugly overkill in locking.
>
> I'd rather make the rule be that you have to take the task lock before
> modifying things like the parent pointers (and all the other fundamental
> pointers), since that's already the rule for most of it anyway.
>
> And then the tasklist lock would go away _entirely_ from /proc (except for
> task lookup in ->readdir/->lookup, of course, where it is fundamentally
> needed and proper - and will probably some day be replaced by RCU, I
> suspect).

Well, I did the stupid safe thing, and it hangs the box once we get up to
a load of 32 with SDET. Below is what I did, the only other issue I can
see in here is that task_mem takes mm->mmap_sem which is now nested inside
the task_lock inside tasklist_lock ... but I can't see anywhere that's a
problem from a quick search.

diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet2/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ sdet2/fs/proc/array.c Sun Feb 16 09:59:24 2003
@@ -147,11 +147,11 @@ static inline const char * get_task_stat
return *p;
}

+/* Call me with the tasklist_lock and task_lock for p held already */
static inline char * task_state(struct task_struct *p, char *buffer)
{
int g;

- read_lock(&tasklist_lock);
buffer += sprintf(buffer,
"State:\t%s\n"
"Tgid:\t%d\n"
@@ -165,13 +165,10 @@ static inline char * task_state(struct t
p->pid && p->ptrace ? p->parent->pid : 0,
p->uid, p->euid, p->suid, p->fsuid,
p->gid, p->egid, p->sgid, p->fsgid);
- read_unlock(&tasklist_lock);
- task_lock(p);
buffer += sprintf(buffer,
"FDSize:\t%d\n"
"Groups:\t",
p->files ? p->files->max_fds : 0);
- task_unlock(p);

for (g = 0; g < p->ngroups; g++)
buffer += sprintf(buffer, "%d ", p->groups[g]);
@@ -243,20 +240,22 @@ extern char *task_mem(struct mm_struct *
int proc_pid_status(struct task_struct *task, char * buffer)
{
char * orig = buffer;
- struct mm_struct *mm = get_task_mm(task);

+ read_lock(&tasklist_lock);
+ task_lock(task);
buffer = task_name(task, buffer);
buffer = task_state(task, buffer);

- if (mm) {
- buffer = task_mem(mm, buffer);
- mmput(mm);
- }
- buffer = task_sig(task, buffer);
+ if (task->mm)
+ buffer = task_mem(task->mm, buffer);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);
buffer = task_cap(task, buffer);
#if defined(CONFIG_ARCH_S390)
buffer = task_show_regs(task, buffer);
#endif
+ task_unlock(task);
+ read_unlock(&tasklist_lock);
return buffer - orig;
}





2003-02-16 19:09:10

by Manfred Spraul

Subject: Re: Fw: 2.5.61 oops running SDET

Martin J. Bligh wrote:

>Well, I did the stupid safe thing, and it hangs the box once we get up to
>a load of 32 with SDET. Below is what I did, the only other issue I can
>see in here is that task_mem takes mm->mmap_sem which is now nested inside
>the task_lock inside tasklist_lock ... but I can't see anywhere that's a
>problem from a quick search
>
>
task_lock() is actually spin_lock(&tsk->alloc_lock); mmap_sem is a
semaphore. Nesting a semaphore inside a spinlock means deadlock.

--
Manfred

2003-02-16 19:10:34

by Linus Torvalds

Subject: Re: Fw: 2.5.61 oops running SDET


On Sun, 16 Feb 2003, Martin J. Bligh wrote:
>
> Well, I did the stupid safe thing, and it hangs the box once we get up to
> a load of 32 with SDET. Below is what I did, the only other issue I can
> see in here is that task_mem takes mm->mmap_sem which is now nested inside
> the task_lock inside tasklist_lock ... but I can't see anywhere that's a
> problem from a quick search

Ho - you can _never_ nest a semaphore inside a spinlock - if the semaphore
sleeps, the spinlock will cause a lockup regardless of any reverse nesting
issues. So I guess the old "get_task_mm()" code is required anyway.

Linus

2003-02-16 19:09:25

by Martin J. Bligh

Subject: more signal locking bugs?

task_lock nests *inside* tasklist_lock but ... :

__do_SAK:

task_lock
...
send_sig(SIGKILL, ...)
    send_sig_info(SIGKILL, ...)
        if (T(sig, SIG_KERNEL_BROADCAST_MASK))
            read_lock(&tasklist_lock);
            ...
            read_unlock(&tasklist_lock);
...
task_unlock


#define SIG_KERNEL_BROADCAST_MASK (\
M(SIGHUP) | M(SIGINT) | M(SIGQUIT) | M(SIGILL) | \
M(SIGTRAP) | M(SIGABRT) | M(SIGBUS) | M(SIGFPE) | \
M(SIGKILL) | M(SIGUSR1) | M(SIGSEGV) | M(SIGUSR2) | \
M(SIGPIPE) | M(SIGALRM) | M(SIGTERM) | M(SIGXCPU) | \
M(SIGXFSZ) | M(SIGVTALRM) | M(SIGPROF) | M(SIGPOLL) | \
M(SIGSYS) | M_SIGSTKFLT | M(SIGPWR) | M(SIGCONT) | \
M(SIGSTOP) | M(SIGTSTP) | M(SIGTTIN) | M(SIGTTOU) | \
M_SIGEMT )

2003-02-16 19:17:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Martin J. Bligh wrote:
>
> task_lock nests *inside* tasklist_lock but ... :

Yeah, this is a problem with any signal, not just SAK.

In fact, in general it is always _wrong_ to nest a non-interrupt-safe
lock inside an interrupt-safe spinlock, because then you can never take
the non-interrupt-safe one on its own (because an interrupt may come in
and take the interrupt-safe lock, at which point that CPU has now
violated ordering).

And if you always nest the interrupt-unsafe one inside the interrupt-
safe one, you might as well not have the interrupt-unsafe one in the first
place, since the outer lock _always_ protects the code in question anyway.

Which implies that either
- the task lock should be _outside_ the tasklist_lock
or
- the task lock should be made interrupt-safe.

If we make the tasklock one interrupt-safe, that should fix the signal
issue, and we can use the tasklock to protect "task->signal" and
"task->sighand".

In short, everything really seems to be pointing that way: the current
task lock simply _is_ broken, and has apparently always been broken (but
the ABBA deadlock is just extremely rare in practice, since you have to
get an interrupt at just the right point on one CPU, while you have the AB
case on another).

Linus

2003-02-16 19:27:40

by Manfred Spraul

Subject: Re: more signal locking bugs?

Linus Torvalds wrote:

>In short, everything really seems to be pointing that way: the current
>task lock simply _is_ broken, and has apparently always been broken (but
>the ABBA deadlock is just extremely rare in practice, since you have to
>get an interrupt at just the right point on one CPU, while you have the AB
>case on another).
>
ABBA is not a deadlock, because linux read_locks permit recursive calls.

read_lock(tasklist_lock);
task_lock(tsk);
read_lock(tasklist_lock);

Does not deadlock, nor any other ordering.

The tasklist_lock is never taken for write from bh or irq context.

--
Manfred

2003-02-16 19:36:41

by Linus Torvalds

Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Manfred Spraul wrote:
>
> ABBA is not a deadlock, because linux read_locks permit recursive calls.
>
> read_lock(tasklist_lock);
> task_lock(tsk);
> read_lock(tasklist_lock);
>
> Does not deadlock, nor any other ordering.
>
> The tasklist_lock is never taken for write from bh or irq context.

Doesn't matter.

CPU1 - do_exit()                        CPU2 - non-nested task_lock()

                                        task_lock(tsk)
write_lock_irq(&tasklist_lock);         IRQ HAPPENS
task_lock(tsk);                         read_lock(&tasklist_lock)

BOOM, you're dead.

See? ABBA _does_ happen with the task lock, it's just that the magic
required to do so is fairly unlikely thanks to the added requirement for
the irq to happen at just the right moment (ie there are no static
code-paths that can cause it).

Linus

2003-02-16 19:55:34

by Linus Torvalds

Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Linus Torvalds wrote:
>
> See? ABBA _does_ happen with the task lock, it's just that the magic
> required to do so is fairly unlikely thanks to the added requirement for
> the irq to happen at just the right moment (ie there are no static
> code-paths that can cause it).

Note: to clarify, this isn't in any way a new situation. As far as I can
tell, this deadlock exists in 2.4.x too. And it's almost certainly pretty
much impossible to trigger in practice, and as such we shouldn't need to
be deeply worried about it.

But it should make us think about the _design_ of locking in this region.
Clearly we got it wrong, and clearly we never noticed for several years.

The simple fix is to make the task-lock be IRQ-safe. That fixes it, and
that's probably the right minimal solution for 2.4.x (unless the "fix"
for 2.4.x is to just ignore it since it's possible to trigger mostly in a
theoretical sense).

But assuming you accept that making the task-lock be irq-safe is the right
solution, then making it solve the signal handling and /proc scalability
issues is likely to be the right solution: since it's irq-safe, there's no
real reason not to use it to protect task->{sighand | signal | parent}
too.

Linus

2003-02-16 19:57:53

by Manfred Spraul

Subject: Re: more signal locking bugs?

On Sun, 16 Feb 2003, Linus Torvalds wrote:

>
> On Sun, 16 Feb 2003, Manfred Spraul wrote:
> >
> > ABBA is not a deadlock, because linux read_locks permit recursive calls.
> >
> > read_lock(tasklist_lock);
> > task_lock(tsk);
> > read_lock(tasklist_lock);
> >
> > Does not deadlock, nor any other ordering.
> >
> > The tasklist_lock is never taken for write from bh or irq context.
>
> Doesn't matter.
>
> CPU1 - do_exit()
> write_lock_irq(&tasklist_lock);
> task_lock(tsk);
>
> BOOM, you're dead.
>
> See? ABBA _does_ happen with the task lock.
>
But these lines are not in 2.4 or 2.5.61.
The current rule to nesting tasklist_lock and task_lock is
- read_lock(&tasklist_lock) and task_lock can be mixed in any order.
- write_lock_irq(&tasklist_lock) and task_lock are incompatible.

What about this minimal patch? The performance critical operation is
signal delivery - we should fix the synchronization between signal
delivery and exec first.

--
Manfred
<<
--- 2.5/fs/proc/array.c 2003-02-15 10:29:22.000000000 +0100
+++ build-2.5/fs/proc/array.c 2003-02-16 20:58:37.000000000 +0100
@@ -208,23 +208,27 @@
{
sigset_t ign, catch;

- buffer += sprintf(buffer, "SigPnd:\t");
- buffer = render_sigset_t(&p->pending.signal, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "ShdPnd:\t");
- buffer = render_sigset_t(&p->signal->shared_pending.signal, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "SigBlk:\t");
- buffer = render_sigset_t(&p->blocked, buffer);
- *buffer++ = '\n';
-
- collect_sigign_sigcatch(p, &ign, &catch);
- buffer += sprintf(buffer, "SigIgn:\t");
- buffer = render_sigset_t(&ign, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "SigCgt:\t"); /* Linux 2.0 uses "SigCgt" */
- buffer = render_sigset_t(&catch, buffer);
- *buffer++ = '\n';
+ read_lock(&tasklist_lock);
+ if (p->signal && p->sighand) {
+ buffer += sprintf(buffer, "SigPnd:\t");
+ buffer = render_sigset_t(&p->pending.signal, buffer);
+ *buffer++ = '\n';
+ buffer += sprintf(buffer, "ShdPnd:\t");
+ buffer = render_sigset_t(&p->signal->shared_pending.signal, buffer);
+ *buffer++ = '\n';
+ buffer += sprintf(buffer, "SigBlk:\t");
+ buffer = render_sigset_t(&p->blocked, buffer);
+ *buffer++ = '\n';
+
+ collect_sigign_sigcatch(p, &ign, &catch);
+ buffer += sprintf(buffer, "SigIgn:\t");
+ buffer = render_sigset_t(&ign, buffer);
+ *buffer++ = '\n';
+ buffer += sprintf(buffer, "SigCgt:\t"); /* Linux 2.0 uses "SigCgt" */
+ buffer = render_sigset_t(&catch, buffer);
+ *buffer++ = '\n';
+ }
+ read_unlock(&tasklist_lock);

return buffer;
}
>>

2003-02-16 20:04:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Manfred Spraul wrote:
> But these lines are not in 2.4 or 2.5.61.
> The current rule for nesting tasklist_lock and task_lock is:
> - read_lock(&tasklist_lock) and task_lock can be mixed in any order.
> - write_lock_irq(&tasklist_lock) and task_lock are incompatible.

Oh, you're right, and you're right exactly _because_ "task->signal" isn't
protected by the task lock right now. Aurgh. I had already mentally done
that protection, which is why I thought we already had the bug.

So never mind. 2.4.x is obviously also ok.

> What about this minimal patch? The performance critical operation is
> signal delivery - we should fix the synchronization between signal
> delivery and exec first.

The patch looks ok, although I'd also remove the locking and testing from
collect_sigign_sigcatch() once it is done at a higher level.

And yeah, what about signal delivery? Put back the same lock there?

Linus

2003-02-16 20:13:50

by Manfred Spraul

[permalink] [raw]
Subject: Re: more signal locking bugs?

Linus Torvalds wrote:

>>What about this minimal patch? The performance critical operation is
>>signal delivery - we should fix the synchronization between signal
>>delivery and exec first.
>>
>>
>
>The patch looks ok, although I'd also remove the locking and testing from
>collect_sigign_sigcatch() once it is done at a higher level.
>
>And yeah, what about signal delivery? Put back the same lock there?
>
>
I don't know.
Taking read_lock(&tasklist_lock) for send_specific_sig_info might hurt
scalability. Ingo?

If we do not want a global lock, then we have two options:
- make task_lock an interrupt spinlock.
- add a second spinlock to the task structure, for the signal stuff.

Design question - which is worse: memory bloat, or a few additional
local_irq_{en,dis}able() calls? I don't care either way - these are not
performance-critical codepaths.
Additionally many task_lock()/task_unlock users could be replaced with
spin_unlock_wait(&task->alloc_lock), which would not need the
local_irq_disable().

--
Manfred

2003-02-16 21:05:28

by Martin J. Bligh

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET

>> Well, I did the stupid safe thing, and it hangs the box once we get up to
>> a load of 32 with SDET. Below is what I did, the only other issue I can
>> see in here is that task_mem takes mm->mmap_sem which is now nested inside
>> the task_lock inside tasklist_lock ... but I can't see anywhere that's a
>> problem from a quick search
>
> Ho - you can _never_ nest a semaphore inside a spinlock - if the semaphore
> sleeps, the spinlock will cause a lockup regardless of any reverse nesting
> issues. So I guess the old "get_task_mm()" code is required anyway.

True ... staring at unfamiliar code made me forget a few things ;-)
OK, the below patch works, no oopses, no hangs. If this is acceptable
(even if it's not the final solution), could you apply it? Seems to
be better than the current situation, at any rate.

-----------------------------------------------------

We can encounter a race condition where task->sighand can be NULL in
proc_pid_status, resulting in an offset NULL pointer dereference in
render_sigset_t. This can easily be demonstrated by running SDET with a
load of 32 to 64 on a large machine (16x NUMA-Q in this case).

This patch makes us take task_lock around the sighand dereferences,
and fixes the oops in testing on the same machine. Thanks to Linus,
Andrew, Manfred and Zwane for lots of guidance ;-)

diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet3/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ sdet3/fs/proc/array.c Sun Feb 16 11:44:24 2003
@@ -252,8 +252,11 @@ int proc_pid_status(struct task_struct *
buffer = task_mem(mm, buffer);
mmput(mm);
}
- buffer = task_sig(task, buffer);
+ task_lock(task);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);
buffer = task_cap(task, buffer);
+ task_unlock(task);
#if defined(CONFIG_ARCH_S390)
buffer = task_show_regs(task, buffer);
#endif

2003-02-16 21:12:07

by Manfred Spraul

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET

Martin J. Bligh wrote:

>diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet3/fs/proc/array.c
>--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
>+++ sdet3/fs/proc/array.c Sun Feb 16 11:44:24 2003
>@@ -252,8 +252,11 @@ int proc_pid_status(struct task_struct *
> buffer = task_mem(mm, buffer);
> mmput(mm);
> }
>- buffer = task_sig(task, buffer);
>+ task_lock(task);
>+ if (task->sighand)
>+ buffer = task_sig(task, buffer);
> buffer = task_cap(task, buffer);
>+ task_unlock(task);
> #if defined(CONFIG_ARCH_S390)
> buffer = task_show_regs(task, buffer);
> #endif
>
>
I think it's needed for 2.4, too.

--
Manfred

2003-02-16 22:28:47

by Linus Torvalds

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET


On Sun, 16 Feb 2003, Manfred Spraul wrote:
> Martin J. Bligh wrote:
>
> >diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet3/fs/proc/array.c
> >--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
> >+++ sdet3/fs/proc/array.c Sun Feb 16 11:44:24 2003
> >@@ -252,8 +252,11 @@ int proc_pid_status(struct task_struct *
> > buffer = task_mem(mm, buffer);
> > mmput(mm);
> > }
> >- buffer = task_sig(task, buffer);
> >+ task_lock(task);
> >+ if (task->sighand)
> >+ buffer = task_sig(task, buffer);
> > buffer = task_cap(task, buffer);
> >+ task_unlock(task);
> > #if defined(CONFIG_ARCH_S390)
> > buffer = task_show_regs(task, buffer);
> > #endif
> >
> >
> I think it's needed for 2.4, too.

It's not wrong, but it shouldn't help. Simply because "task_lock()"
isn't even relevant to "task->sighand" as it stands now. It will be
cleared without holding the task lock, as far as I can see.

So I suspect it fixes things for Martin only because it changes timings
enough not to hit the race.

Linus

2003-02-16 22:59:12

by Martin J. Bligh

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET

>> > diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet3/fs/proc/array.c
>> > --- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
>> > +++ sdet3/fs/proc/array.c Sun Feb 16 11:44:24 2003
>> > @@ -252,8 +252,11 @@ int proc_pid_status(struct task_struct *
>> > buffer = task_mem(mm, buffer);
>> > mmput(mm);
>> > }
>> > - buffer = task_sig(task, buffer);
>> > + task_lock(task);
>> > + if (task->sighand)
>> > + buffer = task_sig(task, buffer);
>> > buffer = task_cap(task, buffer);
>> > + task_unlock(task);
>> > #if defined(CONFIG_ARCH_S390)
>> > buffer = task_show_regs(task, buffer);
>> > #endif
>> >
>> >
>> I think it's needed for 2.4, too.
>
> It's not wrong, but it shouldn't help. Simply because "task_lock()"
> isn't even relevant to "task->sighand" as it stands now. It will be
> cleared without holding the task lock, as far as I can see.
>
> So I suspect it fixes things for Martin only because it changes timings
> enough not to hit the race.

Ah, fair enough ... it's probably the if, rather than the task_lock.

So what does protect sighand? tasklist_lock? It doesn't seem to
as people do things like:

spin_unlock_irq(&current->sighand->siglock);

all the time ... so is it just protected by good faith and the direction
of the wind?

M.

2003-02-16 23:26:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: Fw: 2.5.61 oops running SDET


On Sun, 16 Feb 2003, Martin J. Bligh wrote:
>
> So what does protect sighand? tasklist_lock?

Yup. At least right now. And we do tend to hold tasklist_lock in most
cases simply by virtue of having to get it anyway in order to search for
the process.

> It doesn't seem to
> as people do things like:
>
> spin_unlock_irq(&current->sighand->siglock);
>
> all the time ... so is it just protected by good faith and the direction
> of the wind?

Agreed. That, and the fact that most of the time the stuff _is_ there:
obviously any time the process looks at its own signals it will always be
there.

So we have two cases:
- "normal" signal sending which holds tasklist_lock to find the target.
- signals to the process itself, which is usually safe, since we know
we're there (the exception would be if/when taking a fault in the exit
path, but even that might be ok since signals are torn down not by exit
itself but by "release_task")

So the signal path may actually be ok.

Linus

2003-02-17 00:16:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Manfred Spraul wrote:
>
> What about this minimal patch? The performance critical operation is
> signal delivery - we should fix the synchronization between signal
> delivery and exec first.

Ok, I committed this alternative change, which isn't quite as minimal, but
looks a lot cleaner to me.

Also, looking at execve() and other paths, we do seem to have sufficient
protection from the tasklist_lock that signal delivery should be fine. So
despite a long and confused thread, I think in the end the only real bug
was the one Martin found which should be thus fixed..

Linus

---
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1054 -> 1.1055
# include/linux/signal.h 1.8 -> 1.9
# fs/proc/array.c 1.42 -> 1.43
# kernel/sched.c 1.158 -> 1.159
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/02/16 [email protected] 1.1055
# Clean up and fix locking around signal rendering
# --------------------------------------------
#
diff -Nru a/fs/proc/array.c b/fs/proc/array.c
--- a/fs/proc/array.c Sun Feb 16 16:22:45 2003
+++ b/fs/proc/array.c Sun Feb 16 16:22:45 2003
@@ -180,51 +180,74 @@
return buffer;
}

+static char * render_sigset_t(const char *header, sigset_t *set, char *buffer)
+{
+ int i, len;
+
+ len = strlen(header);
+ memcpy(buffer, header, len);
+ buffer += len;
+
+ i = _NSIG;
+ do {
+ int x = 0;
+
+ i -= 4;
+ if (sigismember(set, i+1)) x |= 1;
+ if (sigismember(set, i+2)) x |= 2;
+ if (sigismember(set, i+3)) x |= 4;
+ if (sigismember(set, i+4)) x |= 8;
+ *buffer++ = (x < 10 ? '0' : 'a' - 10) + x;
+ } while (i >= 4);
+
+ *buffer++ = '\n';
+ *buffer = 0;
+ return buffer;
+}
+
static void collect_sigign_sigcatch(struct task_struct *p, sigset_t *ign,
sigset_t *catch)
{
struct k_sigaction *k;
int i;

- sigemptyset(ign);
- sigemptyset(catch);
+ k = p->sighand->action;
+ for (i = 1; i <= _NSIG; ++i, ++k) {
+ if (k->sa.sa_handler == SIG_IGN)
+ sigaddset(ign, i);
+ else if (k->sa.sa_handler != SIG_DFL)
+ sigaddset(catch, i);
+ }
+}
+
+static inline char * task_sig(struct task_struct *p, char *buffer)
+{
+ sigset_t pending, shpending, blocked, ignored, caught;
+
+ sigemptyset(&pending);
+ sigemptyset(&shpending);
+ sigemptyset(&blocked);
+ sigemptyset(&ignored);
+ sigemptyset(&caught);

+ /* Gather all the data with the appropriate locks held */
read_lock(&tasklist_lock);
if (p->sighand) {
spin_lock_irq(&p->sighand->siglock);
- k = p->sighand->action;
- for (i = 1; i <= _NSIG; ++i, ++k) {
- if (k->sa.sa_handler == SIG_IGN)
- sigaddset(ign, i);
- else if (k->sa.sa_handler != SIG_DFL)
- sigaddset(catch, i);
- }
+ pending = p->pending.signal;
+ shpending = p->signal->shared_pending.signal;
+ blocked = p->blocked;
+ collect_sigign_sigcatch(p, &ignored, &caught);
spin_unlock_irq(&p->sighand->siglock);
}
read_unlock(&tasklist_lock);
-}
-
-static inline char * task_sig(struct task_struct *p, char *buffer)
-{
- sigset_t ign, catch;

- buffer += sprintf(buffer, "SigPnd:\t");
- buffer = render_sigset_t(&p->pending.signal, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "ShdPnd:\t");
- buffer = render_sigset_t(&p->signal->shared_pending.signal, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "SigBlk:\t");
- buffer = render_sigset_t(&p->blocked, buffer);
- *buffer++ = '\n';
-
- collect_sigign_sigcatch(p, &ign, &catch);
- buffer += sprintf(buffer, "SigIgn:\t");
- buffer = render_sigset_t(&ign, buffer);
- *buffer++ = '\n';
- buffer += sprintf(buffer, "SigCgt:\t"); /* Linux 2.0 uses "SigCgt" */
- buffer = render_sigset_t(&catch, buffer);
- *buffer++ = '\n';
+ /* render them all */
+ buffer = render_sigset_t("SigPnd:\t", &pending, buffer);
+ buffer = render_sigset_t("ShdPnd:\t", &shpending, buffer);
+ buffer = render_sigset_t("SigBlk:\t", &blocked, buffer);
+ buffer = render_sigset_t("SigIgn:\t", &ignored, buffer);
+ buffer = render_sigset_t("SigCgt:\t", &caught, buffer);

return buffer;
}
diff -Nru a/include/linux/signal.h b/include/linux/signal.h
--- a/include/linux/signal.h Sun Feb 16 16:22:45 2003
+++ b/include/linux/signal.h Sun Feb 16 16:22:45 2003
@@ -151,8 +151,6 @@
}
}

-extern char * render_sigset_t(sigset_t *set, char *buffer);
-
/* Some extensions for manipulating the low 32 signals in particular. */

static inline void sigaddsetmask(sigset_t *set, unsigned long mask)
diff -Nru a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c Sun Feb 16 16:22:45 2003
+++ b/kernel/sched.c Sun Feb 16 16:22:45 2003
@@ -2095,21 +2095,6 @@
}
}

-char * render_sigset_t(sigset_t *set, char *buffer)
-{
- int i = _NSIG, x;
- do {
- i -= 4, x = 0;
- if (sigismember(set, i+1)) x |= 1;
- if (sigismember(set, i+2)) x |= 2;
- if (sigismember(set, i+3)) x |= 4;
- if (sigismember(set, i+4)) x |= 8;
- *buffer++ = (x < 10 ? '0' : 'a' - 10) + x;
- } while (i >= 4);
- *buffer = 0;
- return buffer;
-}
-
void show_state(void)
{
task_t *g, *p;

2003-02-17 01:55:23

by Martin J. Bligh

[permalink] [raw]
Subject: Re: more signal locking bugs?

> Ok, I committed this alternative change, which isn't quite as minimal, but
> looks a lot cleaner to me.
>
> Also, looking at execve() and other paths, we do seem to have sufficient
> protection from the tasklist_lock that signal delivery should be fine. So
> despite a long and confused thread, I think in the end the only real bug
> was the one Martin found which should be thus fixed..

Ran your patch ... but I get plenty of these now:

Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c016d5a8
*pde = 2e10f001
*pte = 00000000
Oops: 0000
CPU: 2
EIP: 0060:[<c016d5a8>] Not tainted
EFLAGS: 00010202
EIP is at collect_sigign_sigcatch+0x1c/0x44
eax: ed6d72c0 ebx: ed403f50 ecx: 00000004 edx: 00000001
esi: ed403f48 edi: ed6d72c0 ebp: 00000000 esp: ed403eec
ds: 007b es: 007b ss: 0068
Process ps (pid: 19988, threadinfo=ed402000 task=ed9352a0)
Stack: c011d115 c02af820 c016dab8 ed6d72c0 ed403f48 ed403f50 ed6d72c0 edb99a50
ed6d72c0 eebf33c0 ee225000 c034e400 ed403f50 ed403f48 00000000 52000000
00008800 00000125 eebf33c0 eebf33e0 00000000 00000000 00000000 000000d0
Call Trace:
[<c011d115>] do_exit+0x30d/0x31c
[<c016dab8>] proc_pid_stat+0x110/0x324
[<c0130076>] __get_free_pages+0x4e/0x54
[<c016b497>] proc_info_read+0x53/0x130
[<c0145215>] vfs_read+0xa5/0x128
[<c01454a2>] sys_read+0x2a/0x3c
[<c0108b3f>] syscall_call+0x7/0xb

Code: 8b 01 83 f8 01 75 08 8d 42 ff 0f ab 06 eb 0a 85 c0 74 06 8d

2003-02-17 02:29:26

by Martin J. Bligh

[permalink] [raw]
Subject: Re: more signal locking bugs?

>> Ok, I committed this alternative change, which isn't quite as minimal, but
>> looks a lot cleaner to me.
>>
>> Also, looking at execve() and other paths, we do seem to have sufficient
>> protection from the tasklist_lock that signal delivery should be fine. So
>> despite a long and confused thread, I think in the end the only real bug
>> was the one Martin found which should be thus fixed..
>
> Ran your patch ... but I get plenty of these now:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000004
> printing eip:
> c016d5a8
> *pde = 2e10f001
> *pte = 00000000
> Oops: 0000
> CPU: 2
> EIP: 0060:[<c016d5a8>] Not tainted
> EFLAGS: 00010202
> EIP is at collect_sigign_sigcatch+0x1c/0x44
> eax: ed6d72c0 ebx: ed403f50 ecx: 00000004 edx: 00000001
> esi: ed403f48 edi: ed6d72c0 ebp: 00000000 esp: ed403eec
> ds: 007b es: 007b ss: 0068
> Process ps (pid: 19988, threadinfo=ed402000 task=ed9352a0)
> Stack: c011d115 c02af820 c016dab8 ed6d72c0 ed403f48 ed403f50 ed6d72c0 edb99a50
> ed6d72c0 eebf33c0 ee225000 c034e400 ed403f50 ed403f48 00000000 52000000
> 00008800 00000125 eebf33c0 eebf33e0 00000000 00000000 00000000 000000d0
> Call Trace:
> [<c011d115>] do_exit+0x30d/0x31c
> [<c016dab8>] proc_pid_stat+0x110/0x324
> [<c0130076>] __get_free_pages+0x4e/0x54
> [<c016b497>] proc_info_read+0x53/0x130
> [<c0145215>] vfs_read+0xa5/0x128
> [<c01454a2>] sys_read+0x2a/0x3c
> [<c0108b3f>] syscall_call+0x7/0xb
>
> Code: 8b 01 83 f8 01 75 08 8d 42 ff 0f ab 06 eb 0a 85 c0 74 06 8d

Ah, I see what happened, I think .... the locking used to be inside
collect_sigign_sigcatch, and you moved it out into task_sig ... but
there were two callers of collect_sigign_sigcatch, the other one being
proc_pid_stat ... which now needs to have appropriate locking added to
it as well to match the change ... I'll see if I can get something
together that works for that.

M.

2003-02-17 03:46:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: more signal locking bugs?


On Sun, 16 Feb 2003, Martin J. Bligh wrote:
>
> Ah, I see what happened, I think .... the locking used to be inside
> collect_sigign_sigcatch, and you moved it out into task_sig ... but
> there were two callers of collect_sigign_sigcatch, the other one being
> proc_pid_stat

Doh.

This should fix it.

Linus

---
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1055 -> 1.1056
# fs/proc/array.c 1.43 -> 1.44
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 03/02/16 [email protected] 1.1056
# Do proper signal locking for the old-style /proc/stat too.
# --------------------------------------------
#
diff -Nru a/fs/proc/array.c b/fs/proc/array.c
--- a/fs/proc/array.c Sun Feb 16 19:52:30 2003
+++ b/fs/proc/array.c Sun Feb 16 19:52:30 2003
@@ -316,7 +316,13 @@

wchan = get_wchan(task);

- collect_sigign_sigcatch(task, &sigign, &sigcatch);
+ read_lock(&tasklist_lock);
+ if (task->sighand) {
+ spin_lock_irq(&task->sighand->siglock);
+ collect_sigign_sigcatch(task, &sigign, &sigcatch);
+ spin_lock_irq(&task->sighand->siglock);
+ }
+ read_unlock(&tasklist_lock);

/* scale priority and nice values from timeslices to -20..20 */
/* to make it look like a "normal" Unix priority/nice value */

2003-02-17 03:44:30

by Martin J. Bligh

[permalink] [raw]
Subject: [PATCH] fix secondary oops in sighand locking

>> Ran your patch ... but I get plenty of these now:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000004
>> ...
>
> Ah, I see what happened, I think .... the locking used to be inside
> collect_sigign_sigcatch, and you moved it out into task_sig ... but
> there were two callers of collect_sigign_sigcatch, the other one being
> proc_pid_stat ... which now needs to have appropriate locking added to
> it as well to match the change ... I'll see if I can get something
> together that works for that.

OK, with this small extra patch, we now survive a beating with SDET
without oops or freeze. The patch adds the locking that was removed
from collect_sigign_sigcatch to the second caller (proc_pid_stat),
as well as task_sig.

diff -urpN -X /home/fletch/.diff.exclude linus-sdet/fs/proc/array.c linus-sdet-fix/fs/proc/array.c
--- linus-sdet/fs/proc/array.c Sun Feb 16 18:48:04 2003
+++ linus-sdet-fix/fs/proc/array.c Sun Feb 16 18:51:34 2003
@@ -316,7 +316,12 @@ int proc_pid_stat(struct task_struct *ta

wchan = get_wchan(task);

- collect_sigign_sigcatch(task, &sigign, &sigcatch);
+ sigemptyset(&sigign);
+ sigemptyset(&sigcatch);
+ read_lock(&tasklist_lock);
+ if (task->sighand)
+ collect_sigign_sigcatch(task, &sigign, &sigcatch);
+ read_unlock(&tasklist_lock);

/* scale priority and nice values from timeslices to -20..20 */
/* to make it look like a "normal" Unix priority/nice value */

2003-02-17 04:57:28

by Martin J. Bligh

[permalink] [raw]
Subject: Re: more signal locking bugs?

>> Ah, I see what happened, I think .... the locking used to be inside
>> collect_sigign_sigcatch, and you moved it out into task_sig ... but
>> there were two callers of collect_sigign_sigcatch, the other one being
>> proc_pid_stat
>
> Doh.
>
> This should fix it.

Don't you need this bit as well?

+ sigemptyset(&sigign);
+ sigemptyset(&sigcatch);

to replace this bit from your previous patch.

static void collect_sigign_sigcatch(struct task_struct *p, sigset_t *ign,
sigset_t *catch)
{
struct k_sigaction *k;
int i;

- sigemptyset(ign);
- sigemptyset(catch);

That was in the patch I sent you ... but I missed task->sighand->siglock ;-)

M.

2003-02-17 06:07:33

by Martin J. Bligh

[permalink] [raw]
Subject: Re: more signal locking bugs?

>> Ah, I see what happened, I think .... the locking used to be inside
>> collect_sigign_sigcatch, and you moved it out into task_sig ... but
>> there were two callers of collect_sigign_sigcatch, the other one being
>> proc_pid_stat
>
> Doh.
>
> This should fix it.

Oooh, not only does SDET work now in 61, it doesn't freeze the whole box
when I hit ^C any more (like it's been doing since the dawn of time).
Spiffy ;-)


M.

2003-02-17 06:27:25

by Manfred Spraul

[permalink] [raw]
Subject: Re: more signal locking bugs?

Linus Torvalds wrote:

>On Sun, 16 Feb 2003, Manfred Spraul wrote:
>
>
>>What about this minimal patch? The performance critical operation is
>>signal delivery - we should fix the synchronization between signal
>>delivery and exec first.
>>
>>
>
>Ok, I committed this alternative change, which isn't quite as minimal, but
>looks a lot cleaner to me.
>
>Also, looking at execve() and other paths, we do seem to have sufficient
>protection from the tasklist_lock that signal delivery should be fine. So
>despite a long and confused thread, I think in the end the only real bug
>was the one Martin found which should be thus fixed..
>
>
I'm not convinced that exec is correct. Take an app with two threads
and cloned sighand and sig structures: thread one does exec(), thread
two does exit().

Now we can arrive at no_thread_group in de_thread() and
tsk->sig{,hand}->count == 1.

>no_thread_group:
>
> write_lock_irq(&tasklist_lock);
> spin_lock(&oldsighand->siglock);
> spin_lock(&newsighand->siglock);
>
> if (current == oldsig->curr_target)
> oldsig->curr_target = next_thread(current);
>
signal sender: in send_sig_info().
reads tsk->signal. blocks on tsk->sighand->siglock.

> if (newsig)
> current->signal = newsig;
> current->sighand = newsighand;
> init_sigpending(&current->pending);
> recalc_sigpending();
>
> spin_unlock(&newsighand->siglock);
> spin_unlock(&oldsighand->siglock);
> write_unlock_irq(&tasklist_lock);
>
> if (newsig && atomic_dec_and_test(&oldsig->count))
> kmem_cache_free(signal_cachep, oldsig);
>
> if (atomic_dec_and_test(&oldsighand->count))
> kmem_cache_free(sighand_cachep, oldsighand);
>
>
And now the signal sender continues. BOOM.
sighand structure, sig structure already kfreed, etc.

--
Manfred

2003-02-17 16:59:22

by Magnus Naeslund(f)

[permalink] [raw]
Subject: Re: more signal locking bugs?

Linus Torvalds <[email protected]> wrote:
>
> Doh.
>
> This should fix it.

[snip]

> + spin_lock_irq(&task->sighand->siglock);
> + collect_sigign_sigcatch(task, &sigign, &sigcatch);
> + spin_lock_irq(&task->sighand->siglock);

[snip]

You take the lock twice here?

Magnus

2003-02-17 19:00:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: more signal locking bugs?


On Mon, 17 Feb 2003, Manfred Spraul wrote:
>
> I'm not convinced that exec is correct.

No, I think you're right. And I think just fixing send_sig_info() to take
the tasklist lock is the right thing to do.

That still leaves force_sig_info() without the tasklist lock, but since
that is only used for page faults and other synchronous errors, that's fine (if
we get a synchronous error in the kernel, we have bigger problems than
signal locking).

Linus