Date: Sat, 18 Jun 2022 13:52:24 +0100
From: Mark Rutland
To: Tong Tiangen
Cc: James Morse, Andrew Morton, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Robin Murphy, Dave Hansen, Catalin Marinas,
    Will Deacon, Alexander Viro, Michael Ellerman,
    Benjamin Herrenschmidt, Paul Mackerras, x86@kernel.org,
    "H . Peter Anvin", linuxppc-dev@lists.ozlabs.org,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Kefeng Wang, Xie XiuQi, Guohanjun
Subject: Re: [PATCH -next v5 6/8] arm64: add support for machine check error safe
References: <20220528065056.1034168-1-tongtiangen@huawei.com>
    <20220528065056.1034168-7-tongtiangen@huawei.com>
    <4aa8b109-c79b-8da0-db89-85ca128f1049@huawei.com>
In-Reply-To: <4aa8b109-c79b-8da0-db89-85ca128f1049@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 18, 2022 at 05:18:55PM +0800, Tong Tiangen wrote:
> On 2022/6/17 16:55, Mark Rutland wrote:
> > On Sat, May 28, 2022 at 06:50:54AM +0000, Tong Tiangen wrote:
> > > +static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
> > > +				     struct pt_regs *regs, int sig, int code)
> > > +{
> > > +	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
> > > +		return false;
> > > +
> > > +	if (user_mode(regs) || !current->mm)
> > > +		return false;
> >
> > What's the `!current->mm` check for?
>
> At first, I considered that only user processes have the opportunity to
> recover when they trigger a memory error.
>
> But it seems that this restriction is unreasonable. When a kernel thread
> triggers a memory error, it can also be recovered. For instance:
>
> https://lore.kernel.org/linux-mm/20220527190731.322722-1-jiaqiyan@google.com/
>
> And I think if(!current->mm) should be added below:
>
> if(!current->mm) {
> 	set_thread_esr(0, esr);
> 	arm64_force_sig_fault(...);
> }
> return true;

Why does 'current->mm' have anything to do with this, though?

There can be kernel threads with `current->mm` set in unusual circumstances
(and there's a lot of kernel code out there which handles that wrong), so if
you want to treat user tasks differently, we should be doing something like
checking PF_KTHREAD, or adding something like an is_user_task() helper.

[...]

> > > +
> > > +	if (apei_claim_sea(regs) < 0)
> > > +		return false;
> > > +
> > > +	if (!fixup_exception_mc(regs))
> > > +		return false;
> >
> > I thought we still wanted to signal the task in this case? Or do you expect to
> > add that into `fixup_exception_mc()` ?
>
> Yeah, here return false and will signal to task in do_sea() ->
> arm64_notify_die().

I mean when we do the fixup.

I thought the idea was to apply the fixup (to stop the kernel from crashing),
but still to deliver a fatal signal to the user task since we can't do what the
user task asked us to.

> > > +
> > > +	set_thread_esr(0, esr);
> >
> > Why are we not setting the address? Is that deliberate, or an oversight?
>
> Here set fault_address to 0, I refer to the logic of arm64_notify_die().
>
> void arm64_notify_die(...)
> {
> 	if (user_mode(regs)) {
> 		WARN_ON(regs != current_pt_regs());
> 		current->thread.fault_address = 0;
> 		current->thread.fault_code = err;
>
> 		arm64_force_sig_fault(signo, sicode, far, str);
> 	} else {
> 		die(str, regs, err);
> 	}
> }
>
> I don't know exactly why and do you know why arm64_notify_die() did this? :)

To be honest, I don't know, and that looks equally suspicious to me.

Looking at the git history, that was added in commit:

  9141300a5884b57c ("arm64: Provide read/write fault information in compat signal handlers")

... so maybe Catalin recalls why.

Perhaps the assumption is just that this will be fatal and so unimportant? ...
but in that case the same logic would apply to the ESR value, so it's not clear
to me.

Mark.
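
For illustration only, the two points raised in this reply (classify tasks by
PF_KTHREAD rather than by `current->mm`, and still signal a user task after a
successful fixup) could be modelled in plain user-space C roughly as below.
This is a sketch under stated assumptions, not kernel code and not the patch's
implementation: the structs are toy stand-ins, the flag value is only a
placeholder bit, and `is_user_task()`, `enum sea_action` and
`kernel_sea_action()` are hypothetical names. Leaving kernel threads
unsignalled after a fixup is likewise an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
#define PF_KTHREAD 0x00200000u	/* placeholder bit meaning "kernel thread" */

struct mm_struct { int unused; };

struct task_struct {
	unsigned int flags;
	struct mm_struct *mm;	/* may be non-NULL even for a kernel thread */
};

/*
 * Hypothetical helper: decide "user task" from PF_KTHREAD only.
 * Deliberately ignores ->mm, since a kernel thread can have one set.
 */
static bool is_user_task(const struct task_struct *tsk)
{
	return !(tsk->flags & PF_KTHREAD);
}

/* Hypothetical outcome of handling a kernel-mode synchronous external abort. */
enum sea_action {
	SEA_UNHANDLED,		/* caller falls back to its usual die/notify path */
	SEA_FIXED_UP,		/* fixup applied, no signal sent */
	SEA_FIXED_UP_SIGNAL,	/* fixup applied AND fatal signal for the user task */
};

/*
 * Hypothetical decision function.
 *   from_user_mode: the abort was taken from user mode
 *   claimed:        firmware-first handling claimed the error
 *   fixup_found:    an exception-table fixup exists for the faulting access
 */
static enum sea_action kernel_sea_action(const struct task_struct *tsk,
					 bool from_user_mode,
					 bool claimed,
					 bool fixup_found)
{
	if (from_user_mode || !claimed || !fixup_found)
		return SEA_UNHANDLED;

	/* The fixup keeps the kernel alive; the user request still failed. */
	return is_user_task(tsk) ? SEA_FIXED_UP_SIGNAL : SEA_FIXED_UP;
}
```

With this model, a kernel thread that happens to have an mm is still treated
as a kernel thread, which an `!current->mm` test would get wrong.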