Subject: Re: Query: Crash is coming during /prod/PID/stat and do_exit of same task
From: "Kohli, Gaurav"
To: John Ogness
Cc: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org
Date: Mon, 15 Jan 2018 18:00:14 +0530
Message-ID: <959b7b1e-3f93-5792-c613-d23b21c46246@codeaurora.org>
In-Reply-To: <87zi5fxu4g.fsf@linutronix.de>
References: <36ea8b88-4786-dbb2-6b89-15f9801e9c86@codeaurora.org> <87zi5fxu4g.fsf@linutronix.de>

On 1/15/2018 4:32 PM, John Ogness wrote:
> Hello Gaurav.
>
> On 2018-01-09, Kohli, Gaurav wrote:
>> We are seeing crash in do_task_stat while accessing stack pointer, It
>> seems same task has already completed do_exit call.
>> So it seems a race between them:
>>
>> Below is the crash trace:
>> [49750.534377] Kernel BUG at ffffff8e7a4c53a8 [verbose debug info
>> unavailable]
>> [49750.534394] task: ffffffe7b4475580 task.stack: ffffffe7a5f0c000
>> [49750.534400] PC is at do_task_stat+0x740/0x908
>> [49750.534402] LR is at do_task_stat+0xa4/0x908
>> [49750.534403] pc : [] lr : []
>> pstate: 80400145
>> [49750.534404] sp : ffffffe7a5f0fbd0
>>
>> and here is stack trace on that core:
>>
>> -000|user_stack_pointer(inline)
>> -000|do_task_stat(
>>     |    m = 0xFFFFFFE7A5CD7380,
>>     |    ns = 0xFFFFFF8E7C43C748,
>>     |  ?,
>>     |    task = 0xFFFFFFE80D8C2280,
>>     |  ?)
>>     |  tty_pgrp = 0
>>     |  ppid = 2084696064
>>     |  sid = 0
>>     |  mm = 0xFFFFFFE7B4424140
>>     |  tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165,
>> 231, 255, 255, 255)
>>     |  flags = 18446743969119403392
>> -001|proc_tgid_stat(
>>     |    m = 0xFFFFFFE7A5CD7380,
>>     |  ?,
>>
>> Below are task stats which shows , process completed the do_exit call:
>> struct task_struct.flags -x 0xFFFFFFE80D8C2280
>>     flags = 0x40870c
>>
>> crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
>>     exit_code = 0x6
>>
>> struct task_struct.state -x 0xFFFFFFE80D8C2280
>>     state = 0x40
> I am confused why this task is in the TASK_PARKED state. What kind of
> task is this?

Hi John,

This is an Android HAL layer service. Before the BUG, I also see a lot of
services exiting in the logs, although not this pid 6807:

.452202: <2> init: starting service 'limits-hal-1-0'...
49749.460039: <2> init: property_set("ro.boottime.limits-hal-1-0", "61591320967789") failed: property already set
49749.607496: <6> sh (2422): drop_caches: 3
49750.281635: <6> sh (2422): drop_caches: 3
49750.533853: <2> init: Untracked pid 6811 exited with status 0

Why the task shows as parked is not clear to me, since its state has
already been updated.
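
For reference, the three task_struct values quoted above (flags = 0x40870c, exit_code = 0x6, state = 0x40) can be decoded with a short script. This is only a sketch under stated assumptions: the PF_* table is a partial list taken from a 4.9-era include/linux/sched.h, and the two TASK_* layouts assume that TASK_PARKED and TASK_DEAD were renumbered in v4.14. Neither has been checked against the tree that produced this dump, so both should be verified there. The function names are hypothetical helpers, not kernel or crash-utility APIs.

```python
# Assumed subset of PF_* bits from a 4.9-era include/linux/sched.h.
# Verify against the kernel tree that produced the dump.
PF_FLAGS = {
    0x00000004: "PF_EXITING",
    0x00000008: "PF_EXITPIDONE",
    0x00000100: "PF_SUPERPRIV",
    0x00000200: "PF_DUMPCORE",
    0x00000400: "PF_SIGNALED",
    0x00008000: "PF_NOFREEZE",
    0x00200000: "PF_KTHREAD",
    0x00400000: "PF_RANDOMIZE",
}

# Assumed meaning of selected task->state values in two kernel generations.
STATE_LAYOUTS = {
    "before v4.14 (assumed)": {0x40: "TASK_DEAD", 0x200: "TASK_PARKED"},
    "v4.14 and later (assumed)": {0x40: "TASK_PARKED", 0x80: "TASK_DEAD"},
}


def decode_flags(flags):
    """Return names of the known set bits, plus hex of any unknown remainder."""
    names = [name for bit, name in sorted(PF_FLAGS.items()) if flags & bit]
    known = sum(bit for bit in PF_FLAGS if flags & bit)
    rest = flags & ~known
    if rest:
        names.append(hex(rest))
    return names


def decode_exit_code(code):
    """Split a wait-status style exit_code into (signal, core_dumped, exit_status)."""
    return code & 0x7F, bool(code & 0x80), (code >> 8) & 0xFF


def decode_state(state):
    """Look the state value up in each assumed layout."""
    return {layout: table.get(state, hex(state))
            for layout, table in STATE_LAYOUTS.items()}


if __name__ == "__main__":
    print(decode_flags(0x40870C))
    print(decode_exit_code(0x6))   # signal 6 is SIGABRT on Linux
    print(decode_state(0x40))
```

If those assumptions hold, flags includes PF_EXITING, PF_DUMPCORE and PF_SIGNALED but not PF_KTHREAD, exit_code reports termination by signal 6 with the core-dump bit clear, and state = 0x40 reads as TASK_DEAD on the older layout but as TASK_PARKED on the newer one.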

Regards
Gaurav

>
>> In our build both patches are there ,
>> fs/proc: report eip/esp in /prod/PID/stat for coredumping
>>
>> and also task.state has already set PF_DUMPCORE as it got the sigabrt
>> signal.
> John Ogness
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation
Collaborative Project.