Subject: Re: [PATCH] arm64: panic on synchronous external abort in kernel context
To: Xie XiuQi
Cc: Mark Rutland, catalin.marinas@arm.com, will@kernel.org, tglx@linutronix.de,
    tanxiaofei@huawei.com, wangxiongfeng2@huawei.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
From: James Morse
Date: Tue, 14 Apr 2020 15:53:06 +0100
Message-ID: <60131aba-4e09-7824-17b2-a6fc711c150b@arm.com>
References: <20200410015245.23230-1-xiexiuqi@huawei.com>
 <20200414105923.GA2486@C02TD0UTHF1T.local>
 <21997719-c521-c39a-f521-54feae16fc45@huawei.com>
In-Reply-To: <21997719-c521-c39a-f521-54feae16fc45@huawei.com>
Hi Xie,

On 14/04/2020 13:39, Xie XiuQi wrote:
> On 2020/4/14 20:16, James Morse wrote:
>> On 14/04/2020 11:59, Mark Rutland wrote:
>>> On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
>>>> We should panic even panic_on_oops is not set, when we can't recover
>>>> from synchronous external abort in kernel context.
>>
>> Hmm, fault-from-kernel-context doesn't mean the fault affects the kernel. If the kernel is
>> reading or writing from user-space memory for a syscall, its the user-space memory that is
>> affected. This thread can't make progress, so we kill it.
>> If its a kernel thread or we were in irq context, we panic().
>>
>> I don't think you really want all faults that happen as a result of a kernel access to be
>> fatal!

> Yes, you're right. I just want to fix a hung up when ras error inject testing.
>
> panic_on_oops is not set in the kernel image for testing. When receiving a sea in kernel
> context, the PE trap in do_exit(), and can't return any more.

trap? gets trapped, (or gets stuck, to prevent confusion with the architecture's use of the
word 'trap'!)

> I analyze the source code, the call trace might like this:
> do_mem_abort
>   -> arm64_notify_die
>     -> die                # kernel context, call die() directly;
>       -> do_exit          # kernel process context, call do_exit(SIGSEGV);
>         -> do_task_dead() # call do_task_dead(), and hung up this core;

Thanks for the trace. This describes a corrected error in your I-cache that occurred while
the kernel was executing a kernel thread. These shouldn't be fatal, because the error was
corrected ... but the kernel doesn't know that, because it doesn't know how to parse those
records.

There are two things wrong here:
1. it locks up while trying to kill the thread.
2. it tried to kill the thread in the first place!

For 1, does your l1l2_inject module take any spinlocks or tinker with the pre-empt counter?
I suspect this is some rarely-tested path in do_task_dead() that sleeps, but can't from your
l1l2_inject module because the pre-empt counter has been raised. CONFIG_DEBUG_ATOMIC_SLEEP
may point at the function to blame. (A sketch of the sort of pattern I mean is further down,
just above your log.)

It may also be accessing some thread data that kthreads don't have, taking a second
exception and blocking on the die_lock. LOCKDEP should catch this one.

We should fix this one first.

2 is a bit more complicated. Today, this is fatal because the arch code assumes this was
probably a memory error, and if it returns to user-space it can't avoid getting stuck in a
loop until the scheduled memory_failure() work runs. Instead it unconditionally signals the
process.

[0] fixes this up for memory errors. But in this case it will assume all the work has been
done by APEI (or will be before we get back to user-space). APEI ignored the processor error
you fed it, because it doesn't know what those records are; they are just printed out.

This is fine for corrected errors, but we are reliant on your firmware describing uncorrected
errors with a 'fatal' severity... which might be too heavy a hammer. (Ideally that would mean
'uncontained', and the kernel should handle, or detect that it can't handle, any other
error...)

We can discuss the right thing to do here when support for parsing these 'arm processor
errors' is posted. (If you think I need to do something different in [0] because of this,
please shout!)
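Coming back to point 1, here is a purely hypothetical sketch (not taken from the real
l1l2_inject module) of the pattern the pre-empt counter question is about: if the injection
worker raises the preempt count (or holds a spinlock) around the access that takes the
synchronous external abort, the whole die() -> do_exit() -> do_task_dead() path then runs in
atomic context, and the first function that tries to sleep there will hang or splat;
CONFIG_DEBUG_ATOMIC_SLEEP should name it. The mapping name below is made up.

#include <linux/preempt.h>
#include <linux/io.h>

/* Hypothetical: a mapping of the cache line the error was injected into. */
static void __iomem *poisoned_line;

/* Called from the module's workqueue worker in this sketch. */
static void consume_injected_error(void)
{
	preempt_disable();		/* preempt count raised here... */
	(void)readq(poisoned_line);	/* synchronous external abort taken on this access */
	preempt_enable();		/* ...never reached if the kernel kills this thread */
}

If the real module does something similar around lsu_inj_ue(), that would be the first thing
to rule out.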
> [ 387.740609] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 9
> [ 387.748837] {1}[Hardware Error]: event severity: recoverable
> [ 387.754470] {1}[Hardware Error]: Error 0, type: recoverable
> [ 387.760103] {1}[Hardware Error]: section_type: ARM processor error

et voila! Case 2. Linux doesn't handle these 'ARM processor error' things, because it
doesn't know what they are. It just prints them out.

> [ 387.766425] {1}[Hardware Error]: MIDR: 0x00000000481fd010
> [ 387.771972] {1}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000081080000
> [ 387.780628] {1}[Hardware Error]: error affinity level: 0
> [ 387.786088] {1}[Hardware Error]: running state: 0x1
> [ 387.791115] {1}[Hardware Error]: Power State Coordination Interface state: 0
> [ 387.798301] {1}[Hardware Error]: Error info structure 0:
> [ 387.803761] {1}[Hardware Error]: num errors: 1
> [ 387.808356] {1}[Hardware Error]: error_type: 0, cache error
> [ 387.814160] {1}[Hardware Error]: error_info: 0x0000000024400017
> [ 387.820311] {1}[Hardware Error]: transaction type: Instruction
> [ 387.826461] {1}[Hardware Error]: operation type: Generic error (type cannot be determined)
> [ 387.835031] {1}[Hardware Error]: cache level: 1
> [ 387.839878] {1}[Hardware Error]: the error has been corrected

As this is corrected, I think the bug is a deadlock somewhere in do_task_dead().

> [ 387.845942] {1}[Hardware Error]: physical fault address: 0x00000027caf50770

(and your firmware gives you the physical address, excellent, the kernel can do something
with this!)

> [ 388.021037] Call trace:
> [ 388.023475]  lsu_inj_ue+0x58/0x70 [l1l2_inject]
> [ 388.029019]  error_inject+0x64/0xb0 [l1l2_inject]
> [ 388.033707]  process_one_work+0x158/0x4b8
> [ 388.037699]  worker_thread+0x50/0x498
> [ 388.041348]  kthread+0xfc/0x128
> [ 388.044480]  ret_from_fork+0x10/0x1c
> [ 388.048042] Code: b2790000 d519f780 f9800020 d5033f9f (58001001)
> [ 388.054109] ---[ end trace 39d51c21b0e42ba6 ]---
>
> core 0 hung up at here.

DEBUG_ATOMIC_SLEEP and maybe LOCKDEP should help you pin down where the kernel is getting
stuck. It looks like a bug in the core code.


Thanks,

James

[0] https://lore.kernel.org/linux-acpi/20200228174817.74278-4-james.morse@arm.com/
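P.S. The debug options mentioned above, as a .config fragment (option names as in mainline;
assuming your tree hasn't renamed them):

CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y

CONFIG_PROVE_LOCKING selects LOCKDEP, so enabling it is enough to get the lock dependency
checks.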