Hi,
We are seeing a crash in do_task_stat while it accesses the stack pointer. It
seems the same task has already completed its do_exit call,
so this looks like a race between the two.
Below is the crash trace:
[49750.534377] Kernel BUG at ffffff8e7a4c53a8 [verbose debug info unavailable]
[49750.534394] task: ffffffe7b4475580 task.stack: ffffffe7a5f0c000
[49750.534400] PC is at do_task_stat+0x740/0x908
[49750.534402] LR is at do_task_stat+0xa4/0x908
[49750.534403] pc : [<ffffff8e7a4c53a8>] lr : [<ffffff8e7a4c4d0c>] pstate: 80400145
[49750.534404] sp : ffffffe7a5f0fbd0
Here is the stack trace on that core:
-000|user_stack_pointer(inline)
-000|do_task_stat(
| m = 0xFFFFFFE7A5CD7380,
| ns = 0xFFFFFF8E7C43C748,
| ?,
| task = 0xFFFFFFE80D8C2280,
| ?)
| tty_pgrp = 0
| ppid = 2084696064
| sid = 0
| mm = 0xFFFFFFE7B4424140
| tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
| flags = 18446743969119403392
-001|proc_tgid_stat(
| m = 0xFFFFFFE7A5CD7380,
| ?,
Below are the task stats, which show that the process has completed its do_exit call:
struct task_struct.flags -x 0xFFFFFFE80D8C2280
flags = 0x40870c
crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
exit_code = 0x6
struct task_struct.state -x 0xFFFFFFE80D8C2280
state = 0x40
In our build both patches are present:
fs/proc: report eip/esp in /proc/PID/stat for coredumping
Also, task.flags already has PF_DUMPCORE set, as the task received the SIGABRT
signal.
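As a reading aid, here is a minimal sketch that decodes the exit-related bits of the reported flags value 0x40870c. The bit values are an assumption taken from a 4.9-era include/linux/sched.h (PF_EXITPIDONE, for one, does not exist in all kernel versions), and decode_pf_flags is a hypothetical helper, not a kernel or crash-utility function.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed values from a 4.9-era include/linux/sched.h; verify against your tree. */
#define PF_EXITING    0x00000004UL  /* task is on its exit path */
#define PF_EXITPIDONE 0x00000008UL  /* PI exit work done on shutdown */
#define PF_DUMPCORE   0x00000200UL  /* task dumped core */
#define PF_SIGNALED   0x00000400UL  /* task was killed by a signal */

struct pf_name {
    unsigned long bit;
    const char *name;
};

static const struct pf_name pf_names[] = {
    { PF_EXITING,    "PF_EXITING"    },
    { PF_EXITPIDONE, "PF_EXITPIDONE" },
    { PF_DUMPCORE,   "PF_DUMPCORE"   },
    { PF_SIGNALED,   "PF_SIGNALED"   },
};

/*
 * Hypothetical helper: write the names of the known bits set in 'flags'
 * into 'buf', separated by '|'. Returns the number of names written.
 * Bits not listed in pf_names are ignored.
 */
static int decode_pf_flags(unsigned long flags, char *buf, size_t len)
{
    int matched = 0;
    size_t used = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(pf_names) / sizeof(pf_names[0]); i++) {
        if (!(flags & pf_names[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         matched ? "|" : "", pf_names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* output buffer too small, stop here */
        used += (size_t)n;
        matched++;
    }
    return matched;
}
```

Under those assumed values, 0x40870c has both PF_DUMPCORE and PF_EXITING set, which is consistent with a core-dumping task that has already gone through do_exit.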
Regards
Gaurav
-- Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation
Collaborative Project.
> We are seeing crash in do_task_stat while accessing stack pointer, It
> seems same task has already completed do_exit call.
> So it seems a race between them:
Please post the exact kernel version and struct task_struct::usage if you
still have that kernel core (or even the full task_struct).
Hi John, Ingo,
As we are still seeing the race between do_task_stat and do_exit of the task,
can't we add a stricter check, for example for a NULL stack pointer, in the
code below?
	if (permitted && (task->flags & PF_DUMPCORE)) {
		eip = KSTK_EIP(task);
		esp = KSTK_ESP(task);
	}
Regards
Gaurav
Hello Gaurav.
On 2018-01-09, Kohli, Gaurav <[email protected]> wrote:
> struct task_struct.state -x 0xFFFFFFE80D8C2280
> state = 0x40
I am confused why this task is in the TASK_PARKED state. What kind of
task is this?
John Ogness
On 1/15/2018 4:32 PM, John Ogness wrote:
> I am confused why this task is in the TASK_PARKED state. What kind of
> task is this?
Hi John,
This is an Android HAL-layer service. Before the bug I also see many services exiting in the logs,
although I do not see an exit for this pid, 6807:
.452202: <2> init: starting service 'limits-hal-1-0'...
49749.460039: <2> init: property_set("ro.boottime.limits-hal-1-0", "61591320967789") failed: property already set
49749.607496: <6> sh (2422): drop_caches: 3
49750.281635: <6> sh (2422): drop_caches: 3
49750.533853: <2> init: Untracked pid 6811 exited with status 0
Why it is parked is not clear, as the task's state has already been updated.
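One possible explanation, offered as an assumption rather than a confirmed answer: the numeric task state values were renumbered between kernel versions, so 0x40 may not mean "parked" on a 4.9-based tree. The sketch below encodes only the two values relevant here, as I recall them from include/linux/sched.h in 4.9 and in 4.14; task_state_name and the layout enum are hypothetical names, and the values should be checked against the actual source tree.

```c
#include <stddef.h>

/* Which numbering of task->state bits to assume (hypothetical enum). */
enum state_layout {
    LAYOUT_4_9,   /* assumed: TASK_DEAD = 0x40, TASK_PARKED = 0x200 */
    LAYOUT_4_14   /* assumed: TASK_PARKED = 0x40, TASK_DEAD = 0x80 */
};

/*
 * Hypothetical helper: name a single task->state value under the
 * assumed layout. Only TASK_DEAD and TASK_PARKED are covered.
 */
static const char *task_state_name(unsigned long state, enum state_layout layout)
{
    switch (layout) {
    case LAYOUT_4_9:
        if (state == 0x40)
            return "TASK_DEAD";
        if (state == 0x200)
            return "TASK_PARKED";
        break;
    case LAYOUT_4_14:
        if (state == 0x40)
            return "TASK_PARKED";
        if (state == 0x80)
            return "TASK_DEAD";
        break;
    }
    return "other";
}
```

If that numbering is right, state = 0x40 on a 4.9 kernel would read as TASK_DEAD, which matches a task that has finished do_exit.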
Regards
Gaurav
On 1/10/2018 10:50 AM, Alexey Dobriyan wrote:
>> We are seeing crash in do_task_stat while accessing stack pointer, It
>> seems same task has already completed do_exit call.
>> So it seems a race between them:
> Please, post exact kernel version and struct task_struct::usage if you
> still have that kernel core (or even full task_struct)
Hi Alexey,
We are working on 4.9.65. Please find the usage value and the other task_struct values below,
and let me know if any other data is required.
crash_64> struct task_struct.usage -x 0xFFFFFFE80D8C2280
usage = {
counter = 0x4
}
struct task_struct.flags -x 0xFFFFFFE80D8C2280
flags = 0x40870c
crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
exit_code = 0x6
struct task_struct.state -x 0xFFFFFFE80D8C2280
state = 0x40
Please find the crash stack below:
-000|user_stack_pointer(inline)
-000|do_task_stat(
| m = 0xFFFFFFE7A5CD7380,
| ns = 0xFFFFFF8E7C43C748,
| ?,
| task = 0xFFFFFFE80D8C2280,
| ?)
| tty_pgrp = 0
| ppid = 2084696064
| sid = 0
| mm = 0xFFFFFFE7B4424140
| tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
| flags = 18446743969119403392
-001|proc_tgid_stat(
| m = 0xFFFFFFE7A5CD7380,
| ?,
| ?,
| ?)
-002|atomic_sub_return(inline)
Regards
Gaurav
On Tue, Jan 16, 2018 at 11:06:47AM +0530, Kohli, Gaurav wrote:
> We are working on 4.9.65 and Please find below usage value and other task_struct value,
> please let me know if some other data required as well.
Kernel stacks live their own lives nowadays, the code needs try_get_task_stack().
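To illustrate the pattern this points at, here is a minimal userspace sketch: take a reference on the stack only if its count is still nonzero, read from it, then drop the reference. The names model_task, model_try_get_stack, model_put_stack and model_read_sp are hypothetical and only imitate the try_get_task_stack()/put_task_stack() pairing with C11 atomics; this is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a task whose stack is refcounted separately. */
struct model_task {
    atomic_int stack_refcount;  /* 0 means the stack is gone */
    unsigned long *stack;       /* heap buffer standing in for the stack */
};

/* Increment the refcount unless it is already zero; NULL on failure. */
static unsigned long *model_try_get_stack(struct model_task *t)
{
    int old = atomic_load(&t->stack_refcount);

    while (old != 0) {
        /* On failure 'old' is reloaded and the zero check is repeated. */
        if (atomic_compare_exchange_weak(&t->stack_refcount, &old, old + 1))
            return t->stack;
    }
    return NULL;
}

/* Drop one reference; the last holder frees the stack. */
static void model_put_stack(struct model_task *t)
{
    if (atomic_fetch_sub(&t->stack_refcount, 1) == 1) {
        free(t->stack);
        t->stack = NULL;
    }
}

/* Reader side: only touch the stack while holding a reference. */
static bool model_read_sp(struct model_task *t, unsigned long *sp)
{
    unsigned long *stack = model_try_get_stack(t);

    if (!stack)
        return false;   /* stack already released, report nothing */
    *sp = stack[0];
    model_put_stack(t);
    return true;
}
```

In this model the exiting side drops its own reference with model_put_stack(), after which a late reader gets false instead of dereferencing freed memory.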
On 1/16/2018 12:50 PM, Alexey Dobriyan wrote:
> Kernel stacks live their own lives nowadays, the code needs try_get_task_stack().
>
Hi Alexey,
Yes, agreed, we need to add a check like the one below:
	if (permitted && (task->flags & PF_DUMPCORE)) {
		if (try_get_task_stack(task)) {
			eip = KSTK_EIP(task);
			esp = KSTK_ESP(task);
			/* drop the reference taken by try_get_task_stack() */
			put_task_stack(task);
		}
	}
Alternatively, can't we check whether the task is on its exit path by testing a flag such as PF_EXITING?
Regards
Gaurav