2003-03-22 16:49:49

by Zwane Mwaikambo

Subject: BUG: Use after free in detach_pid

Hi Manfred, much quicker turnaround if I leave it to you. The kernel/box is SMP.
I triggered it whilst doing a find in a large NFS directory and writing
out 512M to that same directory.

}

static inline int __detach_pid(task_t *task, enum pid_type type)
{
	struct pid_link *link = task->pids + type;
	struct pid *pid = link->pidptr;
	int nr;

	list_del(&link->pid_chain); <==
	if (!atomic_dec_and_test(&pid->count))

Unable to handle kernel paging request at virtual address 6b6b6b6b
printing eip:
c013479c
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0060:[<c013479c>] Not tainted
EFLAGS: 00010046
EIP is at detach_pid+0x1c/0xf0
eax: 6b6b6b6b ebx: 6b6b6b6b ecx: 6b6b6b6b edx: c1bb93d0
esi: 00000000 edi: bfffef74 ebp: 00000000 esp: caaa3f08
ds: 007b es: 007b ss: 0068
Process bash (pid: 1292, threadinfo=caaa2000 task=cbb02560)
Stack: c1bb9380 00000000 bfffef74 00000000 c01232ec c1bb9380 c01233dc c1bb9380
c1bb9380 c1bb9944 c1bb9380 00000526 c01251cb c1bb9380 bfffef74 bfffef74
c1bb9424 c1bb9380 cbb02560 cbb025fc c0125681 c1bb9380 bfffef74 00000000
Call Trace:
[<c01232ec>] __unhash_process+0x10c/0x170
[<c01233dc>] release_task+0x8c/0x200
[<c01251cb>] wait_task_zombie+0x15b/0x1c0
[<c0125681>] sys_wait4+0x241/0x290
[<c011cb10>] default_wake_function+0x0/0x20
[<c011cb10>] default_wake_function+0x0/0x20
[<c0109477>] syscall_call+0x7/0xb

Code: 89 01 89 48 04 f0 ff 4b 04 0f 94 c0 31 f6 84 c0 74 1f 8b 43

--
function.linuxpower.ca


2003-03-22 16:57:43

by Zwane Mwaikambo

Subject: Re: BUG: Use after free in detach_pid

Forgot to mention: the kernel is 2.5.65-pgcl w/ HIGHMEM + 32k PAGE_SIZE. I
haven't encountered any memory stomping other than this one so far.

Thanks,
Zwane
--
function.linuxpower.ca

2003-03-22 17:05:09

by William Lee Irwin III

Subject: Re: BUG: Use after free in detach_pid

On Sat, Mar 22, 2003 at 11:57:15AM -0500, Zwane Mwaikambo wrote:
> EIP is at detach_pid+0x1c/0xf0
> Call Trace:
> [<c01232ec>] __unhash_process+0x10c/0x170
> [<c01233dc>] release_task+0x8c/0x200
> [<c01251cb>] wait_task_zombie+0x15b/0x1c0
> [<c0125681>] sys_wait4+0x241/0x290
> [<c011cb10>] default_wake_function+0x0/0x20
> [<c011cb10>] default_wake_function+0x0/0x20
> [<c0109477>] syscall_call+0x7/0xb

This is highly unusual. I know of what I believe to be most of the
outstanding bugs in pgcl and none are of this form.

I'm hoping manfred's analysis will turn up something; I can chase this,
but he seems to have good leads already.


-- wli

2003-03-22 20:06:58

by Manfred Spraul

Subject: Re: BUG: Use after free in detach_pid

Zwane Mwaikambo wrote:

>Process bash (pid: 1292, threadinfo=caaa2000 task=cbb02560)
>Call Trace:
> [<c01232ec>] __unhash_process+0x10c/0x170
> [<c01233dc>] release_task+0x8c/0x200
> [<c01251cb>] wait_task_zombie+0x15b/0x1c0
> [<c0125681>] sys_wait4+0x241/0x290
> [<c011cb10>] default_wake_function+0x0/0x20
> [<c011cb10>] default_wake_function+0x0/0x20
> [<c0109477>] syscall_call+0x7/0xb
>
>Code: 89 01 89 48 04 f0 ff 4b 04 0f 94 c0 31 f6 84 c0 74 1f 8b 43
>
>
>
0: 89 01 mov %eax,(%ecx)
2: 89 48 04 mov %ecx,0x4(%eax)
list_del(&link->pid_chain):
link->pid_chain->next, prev == 0x6b6b6b6b
5: f0 ff 4b 04 lock decl 0x4(%ebx)
%ebx: link->pidptr == 0x6b6b6b6b
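
For readers decoding the two stores above, here is a minimal userspace sketch of a list unlink, assuming a struct list_head with next at offset 0 and prev at the following pointer slot. The names list_init, list_add and list_del_sketch are illustrative and are not the kernel's own helpers; the register comments assume the mapping implied by the decoded bytes (%eax holding next, %ecx holding prev).

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked list node laid out like a kernel list_head:
 * next first, prev second. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Make head an empty circular list. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from its list (illustrative version of list_del). */
static void list_del_sketch(struct list_head *entry)
{
    struct list_head *next = entry->next;   /* assumed to be %eax */
    struct list_head *prev = entry->prev;   /* assumed to be %ecx */

    assert(next != NULL && prev != NULL);
    prev->next = next;   /* mov %eax,(%ecx): the first store */
    next->prev = prev;   /* mov %ecx,0x4(%eax) */
}
```

If entry has already been overwritten with poison, both loaded pointers are 0x6b6b6b6b and the first store writes through that bogus address, which matches the faulting address in the oops.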


The whole link structure is filled with slab poison. The link structure is embedded in the task structure.
You mentioned that the last detach_pid() within __unhash_process oopsed. That means the reference count of the task structure was off by one, and the
put_task_struct(pid->task)
within
detach_pid(p,PIDTYPE_PGID);
freed the task structure.
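
The failure mode described above can be sketched in userspace. This is only an illustration under stated assumptions: struct fake_task and fake_put() are hypothetical stand-ins, the "free" is simulated by poisoning memory that stays valid, and 0x6b is the byte the slab debugging code writes over freed objects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b  /* byte written over freed slab objects */

/* Hypothetical stand-in for a refcounted object with an embedded link. */
struct fake_task {
    int      usage;        /* reference count */
    uint32_t link_next;    /* stand-ins for 32-bit pointer fields */
    uint32_t link_prev;
    uint32_t link_pidptr;
};

/* Drop one reference. On the last one, simulate the free by poisoning.
 * The memory itself stays valid so reading it back here is safe.
 * Returns 1 if the object was released, 0 otherwise. */
static int fake_put(struct fake_task *t)
{
    assert(t->usage > 0);
    if (--t->usage == 0) {
        memset(t, POISON_FREE, sizeof(*t));
        return 1;
    }
    return 0;
}
```

If the count is one lower than the caller believes, a put that should have been harmless releases the object, and every pointer-sized field read afterwards comes back as 0x6b6b6b6b, the value seen in eax, ebx and ecx in the oops.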

The process was bash - does your bash use anything fancy, or is it a plain, boring single-threaded app?

--
Manfred


2003-03-22 20:36:39

by Zwane Mwaikambo

Subject: Re: BUG: Use after free in detach_pid

On Sat, 22 Mar 2003, Andrew Morton wrote:

> Manfred Spraul wrote:
> >
> > You mentioned that the last detach_pid() within __unhash_process oopsed. That means the reference count of the task structure was off by one, and the
> > put_task_struct(pid->task)
> > within
> > detach_pid(p,PIDTYPE_PGID);
> > freed the task structure.
> >
>
> Might be related to http://bugme.osdl.org/show_bug.cgi?id=482
> in which someone did put_task_struct() on an already-freed task_struct.
>
> And that was a uniprocessor without pgcl gunk.
>
> It is not known whether preemption was enabled?

CONFIG_PREEMPT=y on 3way P133

--
function.linuxpower.ca

2003-03-22 20:34:40

by Andrew Morton

Subject: Re: BUG: Use after free in detach_pid

Manfred Spraul wrote:
>
> You mentioned that the last detach_pid() within __unhash_process oopsed. That means the reference count of the task structure was off by one, and the
> put_task_struct(pid->task)
> within
> detach_pid(p,PIDTYPE_PGID);
> freed the task structure.
>

Might be related to http://bugme.osdl.org/show_bug.cgi?id=482
in which someone did put_task_struct() on an already-freed task_struct.

And that was a uniprocessor without pgcl gunk.

It is not known whether preemption was enabled?

2003-03-22 20:33:33

by Zwane Mwaikambo

Subject: Re: BUG: Use after free in detach_pid

On Sat, 22 Mar 2003, Manfred Spraul wrote:

> >Code: 89 01 89 48 04 f0 ff 4b 04 0f 94 c0 31 f6 84 c0 74 1f 8b 43
> >
> >
> >
> 0: 89 01 mov %eax,(%ecx)
> 2: 89 48 04 mov %ecx,0x4(%eax)
> list_del(&link->pid_chain):
> link->pid_chain->next, prev == 0x6b6b6b6b
> 5: f0 ff 4b 04 lock decl 0x4(%ebx)
> %ebx: link->pidptr == 0x6b6b6b6b

Yep, sorry, I should have given this to you earlier.

0xc01232d0 <__unhash_process+240>: push $0xc0505400
0xc01232d5 <__unhash_process+245>: call 0xc011ef20 <__preempt_spin_lock>
0xc01232da <__unhash_process+250>: pop %eax
0xc01232db <__unhash_process+251>: jmp 0xc0123259 <__unhash_process+121>
0xc01232e0 <__unhash_process+256>: mov $0x2,%edx
0xc01232e5 <__unhash_process+261>: mov %ebx,%eax
0xc01232e7 <__unhash_process+263>: call 0xc0134780 <detach_pid> <==
0xc01232ec <__unhash_process+268>: mov %ebx,%eax
0xc01232ee <__unhash_process+270>: mov $0x3,%edx
0xc01232f3 <__unhash_process+275>: call 0xc0134780 <detach_pid>
0xc01232f8 <__unhash_process+280>: mov 0x7c(%ebx),%edx
0xc01232fb <__unhash_process+283>: test %edx,%edx
0xc01232fd <__unhash_process+285>: je 0xc012331f <__unhash_process+319>
0xc01232ff <__unhash_process+287>: mov $0xffffe000,%edx
0xc0123304 <__unhash_process+292>: mov $0xc057b434,%eax

0xc0134780 <detach_pid>: lea (%edx,%edx,4),%edx
0xc0134783 <detach_pid+3>: push %ebp
0xc0134784 <detach_pid+4>: push %edi
0xc0134785 <detach_pid+5>: lea (%eax,%edx,8),%edx
0xc0134788 <detach_pid+8>: push %esi
0xc0134789 <detach_pid+9>: push %ebx
0xc013478a <detach_pid+10>: lea 0xb0(%edx),%eax
0xc0134790 <detach_pid+16>: mov 0x4(%eax),%ecx
0xc0134793 <detach_pid+19>: mov 0x8(%eax),%ebx
0xc0134796 <detach_pid+22>: mov 0xb0(%edx),%eax
0xc013479c <detach_pid+28>: mov %eax,(%ecx) <===
0xc013479e <detach_pid+30>: mov %ecx,0x4(%eax)
0xc01347a1 <detach_pid+33>: lock decl 0x4(%ebx)
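
The address arithmetic at the top of detach_pid can be checked with a small sketch. The two constants below are read off the disassembly above rather than taken from kernel headers, and pid_link_offset() is a hypothetical helper name. Within each link, the loads at detach_pid+16 through +22 suggest the list next pointer at +0, prev at +4 and pidptr at +8.

```c
#include <assert.h>
#include <stdint.h>

/* Constants inferred from the disassembly, not from kernel headers. */
#define PIDS_OFFSET 0xb0u  /* displacement in "lea 0xb0(%edx),%eax" */
#define LINK_STRIDE 40u    /* type * 5 * 8, from the two lea instructions */

/* Byte offset of task->pids[type] from the start of the task structure. */
static uint32_t pid_link_offset(uint32_t type)
{
    uint32_t scaled = type + type * 4;   /* lea (%edx,%edx,4),%edx: type * 5 */

    assert(scaled * 8 == type * LINK_STRIDE);
    return PIDS_OFFSET + scaled * 8;     /* lea (%eax,%edx,8),%edx, then +0xb0 */
}
```

With these constants, a type argument of 2 (the value loaded into %edx before the marked call) gives offset 0x100, and 3 gives 0x128.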


> The whole link structure is filled with slab poison. The link structure is embedded in the task structure.
> You mentioned that the last detach_pid() within __unhash_process oopsed.
> That means the reference count of the task structure was off by one, and the
> put_task_struct(pid->task)
> within
> detach_pid(p,PIDTYPE_PGID);
> freed the task structure.

Yes, that corresponds with where it oopsed.

0xc01232ec is in __unhash_process (kernel/exit.c:43).
38 nr_threads--;
39 detach_pid(p, PIDTYPE_PID);
40 detach_pid(p, PIDTYPE_TGID);
41 if (thread_group_leader(p)) {
42 detach_pid(p, PIDTYPE_PGID);
43 detach_pid(p, PIDTYPE_SID);

> The process was bash - does your bash use anything fancy, or is it a plain, boring single-threaded app?

No, nothing special, it's a default RH install type thing.

Thanks,
Zwane
--
function.linuxpower.ca

2003-03-23 01:58:53

by William Lee Irwin III

Subject: Re: BUG: Use after free in detach_pid

On Sat, Mar 22, 2003 at 12:44:47PM -0800, Andrew Morton wrote:
> And that was a uniprocessor without pgcl gunk.

I'd rather not hide my WIP's until they're "perfect".


-- wli