Subject: Re: [PATCH] arm64: panic on synchronous external abort in kernel context
To: James Morse
CC: Mark Rutland
References: <20200410015245.23230-1-xiexiuqi@huawei.com>
 <20200414105923.GA2486@C02TD0UTHF1T.local>
 <21997719-c521-c39a-f521-54feae16fc45@huawei.com>
 <60131aba-4e09-7824-17b2-a6fc711c150b@arm.com>
From: Xie XiuQi
Message-ID: <483792fd-5b01-8cf7-7687-abe513cd3474@huawei.com>
Date: Thu, 30 Apr 2020 17:44:00 +0800
In-Reply-To: <60131aba-4e09-7824-17b2-a6fc711c150b@arm.com>
Hi James,

Sorry for the late reply.

On 2020/4/14 22:53, James Morse wrote:
> Hi Xie,
>
> On 14/04/2020 13:39, Xie XiuQi wrote:
>> On 2020/4/14 20:16, James Morse wrote:
>>> On 14/04/2020 11:59, Mark Rutland wrote:
>>>> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>>>>> We should panic even panic_on_oops is not set, when we can't recover
>>>>> from synchronous external abort in kernel context.
>>>
>>> Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
>>> reading or writing from user-space memory for a syscall, it's the user-space memory that is
>>> affected. This thread can't make progress, so we kill it.
>>> If it's a kernel thread or we were in irq context, we panic().
>>>
>>> I don't think you really want all faults that happen as a result of a kernel access to be
>>> fatal!
>
>> Yes, you're right. I just want to fix a hang hit during RAS error-injection testing.
>>
>> panic_on_oops is not set in the kernel image for testing. When receiving a SEA in kernel
>> context, the PE trap in do_exit(), and can't return any more.
>
> trap? gets trapped, (or gets stuck, to prevent confusion with the architecture's use of the
> word 'trap'!)
>
>
>> I analyze the source code, the call trace might look like this:
>> do_mem_abort
>>   -> arm64_notify_die
>>      -> die              # kernel context, call die() directly;
>>         -> do_exit       # kernel process context, call do_exit(SIGSEGV);
>>            -> do_task_dead()  # call do_task_dead(), and hung up this core;
>
> Thanks for the trace. This describes a corrected error in your I-cache, that occurred
> while the kernel was executing a kernel thread. These shouldn't be fatal, because it was
> corrected ... but the kernel doesn't know that because it doesn't know how to parse those
> records.
>
> There are two things wrong here:
> 1. it locks up while trying to kill the thread.
> 2. it tried to kill the thread in the first place!
>
> For 1, does your l1l2_inject module take any spinlocks or tinker with the pre-empt counter?
>
> I suspect this is some rarely-tested path in do_task_dead() that sleeps, but can't from
> your l1l2_inject module because the pre-empt counter has been raised.
>
> CONFIG_DEBUG_ATOMIC_SLEEP may point at the function to blame.
>
> It may be accessing some thread data that kthreads don't have, taking a second exception
> and blocking on the die_lock. LOCKDEP should catch this one.
>
> We should fix this one first.
>

I analyzed the l1l2_inject module: it uses a kworker thread to inject the error.
do_sea() tries to kill that kworker thread, so the work items queued on it are blocked.
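To make the context concrete, the part of the module that matters here is just a work item
running on a kworker, roughly like the heavily simplified sketch below. The real
lsu_inj_ue() programs implementation-specific injection registers, and whether the work is
queued with schedule_work() or on a private workqueue doesn't change the point:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

/*
 * Heavily simplified sketch of the injection path -- only the structure
 * matters here, not the real implementation.
 */
static void lsu_inj_ue(void)
{
	/*
	 * Program the injection registers, then touch the poisoned line so
	 * the synchronous external abort is taken right here, i.e. in
	 * kernel (kworker) context.
	 */
}

static void error_inject(struct work_struct *work)
{
	lsu_inj_ue();
	/* not reached if do_sea() decides to kill this kworker */
}

static DECLARE_WORK(inject_work, error_inject);

static int __init l1l2_inject_init(void)
{
	schedule_work(&inject_work);	/* runs on a system kworker */
	return 0;
}

static void __exit l1l2_inject_exit(void)
{
	cancel_work_sync(&inject_work);
}

module_init(l1l2_inject_init);
module_exit(l1l2_inject_exit);
MODULE_LICENSE("GPL");

So the SEA is taken while error_inject() runs on the kworker, and if do_sea() ends up in
do_exit() for that kworker, anything still queued behind the work item never runs.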
The panic_on_oops option is usually set in production environments, so anyone who unsets it
for testing should accept that risk themselves. Is that right?

>
> 2 is a bit more complicated. Today, this is fatal because the arch code assumes this was
> probably a memory error, and if it returns to user-space it can't avoid getting stuck in a
> loop until the scheduled memory_failure() work runs. Instead it unconditionally signals
> the process.
>
> [0] fixes this up for memory errors. But in this case it will assume all the work has been
> done by APEI, (or will be before we get back to user-space). APEI ignored the processor
> error you fed it, because it doesn't know what they are, they are just printed out.
>
> This is fine for corrected errors, but we are reliant on your firmware describing
> uncorrected errors with a 'fatal' severity... which might be too heavy a hammer. (Ideally
> that would mean 'uncontained', and the kernel should handle, or detect it can't, any other
> error...)
>
> We can discuss the right thing to do here when support for parsing these 'arm processor
> errors' is posted.
> (If you think I need to do something different in [0] because of this, please shout!)

For errors that are recoverable or already corrected, do_sea() should not kill the process
or die() unconditionally. It would be better to detect that situation first (a rough sketch
of what I mean is at the end of this mail).

>
>
>> [ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
>> [ 387.748837] {1}[Hardware Error]: event severity: recoverable
>> [ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
>
>> [ 387.760103] {1}[Hardware Error]: section_type: ARM processor error
>
> et voila! Case 2. Linux doesn't handle these 'ARM processor error' things, because it
> doesn't know what they are. It just prints them out.
>
>
>> [ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
>> [ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
>> [ 387.780628] {1}[Hardware Error]: error affinity level: 0
>> [ 387.786088] {1}[Hardware Error]: running state: 0x1
>> [ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
>> [ 387.798301] {1}[Hardware Error]: Error info structure 0:
>> [ 387.803761] {1}[Hardware Error]: num errors: 1
>> [ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
>> [ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
>> [ 387.820311] {1}[Hardware Error]: transaction type: Instruction
>> [ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
>> [ 387.835031] {1}[Hardware Error]: cache level: 1
>
>> [ 387.839878] {1}[Hardware Error]: the error has been corrected
>
> As this is corrected, I think the bug is a deadlock somewhere in do_task_dead().
>
>
>> [ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770
>
> (and your firmware gives you the physical address, excellent, the kernel can do something
> with this!)
>
>
>> [ 388.021037] Call trace:
>> [ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
>> [ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
>> [ 388.033707]  process_one_work+0x158/0x4b8
>> [ 388.037699]  worker_thread+0x50/0x498
>> [ 388.041348]  kthread+0xfc/0x128
>> [ 388.044480]  ret_from_fork+0x10/0x1c
>> [ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
>> [ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---
>>
>> core 0 hung up at here.
>
> DEBUG_ATOMIC_SLEEP and maybe LOCKDEP should help you pin down where the kernel is getting
> stuck. It looks like a bug in the core code.
>
>
> Thanks,
>
> James
>
> [0] https://lore.kernel.org/linux-acpi/20200228174817.74278-4-james.morse@arm.com/
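To illustrate what I mean by "detect that situation" above, the shape could be something
like the sketch below. apei_claim_sea() already exists; ghes_sea_was_corrected() does not,
it is made up here only to stand for "APEI parsed the CPER record and reported the error as
corrected (or already queued for recovery)":

/*
 * Illustrative sketch only, NOT a proposed patch. ghes_sea_was_corrected()
 * is invented for this mail; the FAR-not-valid handling and other details
 * of the real do_sea() are omitted.
 */
static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
{
	const struct fault_info *inf = esr_to_fault_info(esr);

	if (apei_claim_sea(regs) == 0 && ghes_sea_was_corrected()) {
		/*
		 * Firmware-first notification covered this fault and no
		 * further action is needed: don't kill the task and don't
		 * die(), just return to the interrupted context.
		 */
		return 0;
	}

	/* Unknown or unhandled error: keep today's behaviour. */
	arm64_notify_die(inf->name, regs, inf->sig, inf->code,
			 (void __user *)addr, esr);
	return 0;
}

The only point is that do_sea() needs a yes/no answer back from APEI before it chooses
between quietly returning and killing the context it interrupted.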