2007-12-19 05:54:50

by David Chinner

Subject: [ia64] BUG: sleeping in atomic

Just saw this again:

[ 5667.086055] BUG: sleeping function called from invalid context at kernel/fork.c:401
[ 5667.087314] in_atomic():1, irqs_disabled():0
[ 5667.088210]
[ 5667.088212] Call Trace:
[ 5667.089104] [<a000000100015e00>] show_stack+0x80/0xa0
[ 5667.089106] sp=e0000038f6507a00 bsp=e0000038f6500f48
[ 5667.116025] [<a000000100015e50>] dump_stack+0x30/0x60
[ 5667.116028] sp=e0000038f6507bd0 bsp=e0000038f6500f30
[ 5667.118317] [<a0000001000af8a0>] __might_sleep+0x1e0/0x200
[ 5667.118320] sp=e0000038f6507bd0 bsp=e0000038f6500f08
[ 5667.142316] [<a0000001000c55a0>] mmput+0x20/0x220
[ 5667.142319] sp=e0000038f6507bd0 bsp=e0000038f6500ee0
[ 5667.164201] [<a000000100033ae0>] sys_ptrace+0x460/0x15c0
[ 5667.164203] sp=e0000038f6507bd0 bsp=e0000038f6500dd0
[ 5667.175488] [<a00000010000b4a0>] ia64_ret_from_syscall+0x0/0x20
[ 5667.175490] sp=e0000038f6507e30 bsp=e0000038f6500dd0
[ 5667.199324] [<a000000000010620>] __kernel_syscall_via_break+0x0/0x20
[ 5667.199327] sp=e0000038f6508000 bsp=e0000038f6500dd0
[ 5682.626704] BUG: sleeping function called from invalid context at kernel/fork.c:401

This happens when stracing a process on 2.6.24-rc3 on ia64. Command line to reproduce:

# strace ls -l /mnt/scratch/test

ISTR reporting this some time ago (maybe 2.6.22?)....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group


2007-12-19 16:42:18

by Kyle McMartin

Subject: Re: [ia64] BUG: sleeping in atomic

On Wed, Dec 19, 2007 at 04:54:30PM +1100, David Chinner wrote:
> [ 5667.086055] BUG: sleeping function called from invalid context at kernel/fork.c:401
>

The problem is that mmput(), which may sleep, is called under the
read_lock by find_thread_for_addr()... The comment above seems to
indicate that gdb needs to be able to access any child task's register
backing store memory... This seems pretty broken.

cheers, Kyle

---

Who knows, maybe gdb is saner now?

diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
index 2e96f17..b609704 100644
--- a/arch/ia64/kernel/ptrace.c
+++ b/arch/ia64/kernel/ptrace.c
@@ -1418,7 +1418,7 @@ asmlinkage long
 sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
 {
 	struct pt_regs *pt;
-	unsigned long urbs_end, peek_or_poke;
+	unsigned long urbs_end;
 	struct task_struct *child;
 	struct switch_stack *sw;
 	long ret;
@@ -1430,23 +1430,12 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
 		goto out;
 	}
 
-	peek_or_poke = (request == PTRACE_PEEKTEXT
-			|| request == PTRACE_PEEKDATA
-			|| request == PTRACE_POKETEXT
-			|| request == PTRACE_POKEDATA);
-	ret = -ESRCH;
-	read_lock(&tasklist_lock);
-	{
-		child = find_task_by_pid(pid);
-		if (child) {
-			if (peek_or_poke)
-				child = find_thread_for_addr(child, addr);
-			get_task_struct(child);
-		}
-	}
-	read_unlock(&tasklist_lock);
-	if (!child)
+	child = ptrace_get_task_struct(pid);
+	if (IS_ERR(child)) {
+		ret = PTR_ERR(child);
 		goto out;
+	}
+
 	ret = -EPERM;
 	if (pid == 1) /* no messing around with init! */
 		goto out_tsk;

2007-12-20 04:34:54

by David Chinner

Subject: Re: [ia64] BUG: sleeping in atomic

On Wed, Dec 19, 2007 at 11:42:04AM -0500, Kyle McMartin wrote:
> On Wed, Dec 19, 2007 at 04:54:30PM +1100, David Chinner wrote:
> > [ 5667.086055] BUG: sleeping function called from invalid context at kernel/fork.c:401
> >
>
> The problem is that mmput is called under the read_lock by
> find_thread_for_addr... The comment above seems to indicate that gdb
> needs to be able to access any child task's register backing store
> memory... This seems pretty broken.
>
> cheers, Kyle
>
> ---
>
> Who knows, maybe gdb is saner now?
>
> diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
> index 2e96f17..b609704 100644
> --- a/arch/ia64/kernel/ptrace.c
> +++ b/arch/ia64/kernel/ptrace.c
> @@ -1418,7 +1418,7 @@ asmlinkage long
> sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
> {
> struct pt_regs *pt;
> - unsigned long urbs_end, peek_or_poke;
> + unsigned long urbs_end;
> struct task_struct *child;
> struct switch_stack *sw;
> long ret;
> @@ -1430,23 +1430,12 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
> goto out;
> }
>
> - peek_or_poke = (request == PTRACE_PEEKTEXT
> - || request == PTRACE_PEEKDATA
> - || request == PTRACE_POKETEXT
> - || request == PTRACE_POKEDATA);
> - ret = -ESRCH;
> - read_lock(&tasklist_lock);
> - {
> - child = find_task_by_pid(pid);
> - if (child) {
> - if (peek_or_poke)
> - child = find_thread_for_addr(child, addr);
> - get_task_struct(child);
> - }
> - }
> - read_unlock(&tasklist_lock);
> - if (!child)
> + child = ptrace_get_task_struct(pid);
> + if (IS_ERR(child)) {
> + ret = PTR_ERR(child);
> goto out;
> + }
> +
> ret = -EPERM;
> if (pid == 1) /* no messing around with init! */
> goto out_tsk;

Yes, this patch fixes the problem (though I haven't tried to
use gdb yet).

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-12-21 08:11:47

by Petr Tesařík

Subject: Re: [ia64] BUG: sleeping in atomic

David Chinner wrote:
> On Wed, Dec 19, 2007 at 11:42:04AM -0500, Kyle McMartin wrote:
>
>> On Wed, Dec 19, 2007 at 04:54:30PM +1100, David Chinner wrote:
>>
>>> [ 5667.086055] BUG: sleeping function called from invalid context at kernel/fork.c:401
>>>
>>>
>> The problem is that mmput is called under the read_lock by
>> find_thread_for_addr... The comment above seems to indicate that gdb
>> needs to be able to access any child task's register backing store
>> memory... This seems pretty broken.
>>
>> cheers, Kyle
>>
>> ---
>>
>> Who knows, maybe gdb is saner now?
>>

Well, gdb is saner, but the bug you're talking about will be fixed in a
clean way (even for insane debuggers) with Shaohua's, Roland's and my
fixes to handling the RSE, because they make it possible to get rid of
find_thread_for_addr(). I already sent a mail about it to linux-ia64
some time ago, and Roland suggested we might even change everything to
the generic sys_ptrace(), which is the correct solution (TM), and I'm
planning to do so as soon as the RSE patches are in.

Regards,
Petr Tesarik
>> diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
>> index 2e96f17..b609704 100644
>> --- a/arch/ia64/kernel/ptrace.c
>> +++ b/arch/ia64/kernel/ptrace.c
>> @@ -1418,7 +1418,7 @@ asmlinkage long
>> sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
>> {
>> struct pt_regs *pt;
>> - unsigned long urbs_end, peek_or_poke;
>> + unsigned long urbs_end;
>> struct task_struct *child;
>> struct switch_stack *sw;
>> long ret;
>> @@ -1430,23 +1430,12 @@ sys_ptrace (long request, pid_t pid, unsigned long addr, unsigned long data)
>> goto out;
>> }
>>
>> - peek_or_poke = (request == PTRACE_PEEKTEXT
>> - || request == PTRACE_PEEKDATA
>> - || request == PTRACE_POKETEXT
>> - || request == PTRACE_POKEDATA);
>> - ret = -ESRCH;
>> - read_lock(&tasklist_lock);
>> - {
>> - child = find_task_by_pid(pid);
>> - if (child) {
>> - if (peek_or_poke)
>> - child = find_thread_for_addr(child, addr);
>> - get_task_struct(child);
>> - }
>> - }
>> - read_unlock(&tasklist_lock);
>> - if (!child)
>> + child = ptrace_get_task_struct(pid);
>> + if (IS_ERR(child)) {
>> + ret = PTR_ERR(child);
>> goto out;
>> + }
>> +
>> ret = -EPERM;
>> if (pid == 1) /* no messing around with init! */
>> goto out_tsk;
>>
>
> Yes, this patch fixes the problem (though I haven't tried to
> use gdb yet).
>
> Cheers,
>
> Dave.
>