Date: Tue, 3 Oct 2023 17:28:58 +0900
From: Naoya Horiguchi
To: Shuai Xue
Cc: rafael@kernel.org, wangkefeng.wang@huawei.com, tanxiaofei@huawei.com,
 mawupeng1@huawei.com, tony.luck@intel.com, linmiaohe@huawei.com,
 naoya.horiguchi@nec.com, james.morse@arm.com, gregkh@linuxfoundation.org,
 will@kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
 linux-edac@vger.kernel.org, acpica-devel@lists.linuxfoundation.org,
 stable@vger.kernel.org, x86@kernel.org, justin.he@arm.com, ardb@kernel.org,
 ying.huang@intel.com, ashish.kalra@amd.com, baolin.wang@linux.alibaba.com,
 bp@alien8.de, tglx@linutronix.de, mingo@redhat.com,
 dave.hansen@linux.intel.com, jarkko@kernel.org, lenb@kernel.org,
 hpa@zytor.com, robert.moore@intel.com, lvying6@huawei.com,
 xiexiuqi@huawei.com, zhuo.song@linux.alibaba.com
Subject: Re: [RESEND PATCH v8 2/2] ACPI: APEI: handle synchronous exceptions in task work
Message-ID: <20231003082858.GA750796@ik1-406-35019.vs.sakura.ne.jp>
References: <20221027042445.60108-1-xueshuai@linux.alibaba.com>
 <20230919022127.69732-3-xueshuai@linux.alibaba.com>
In-Reply-To: <20230919022127.69732-3-xueshuai@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 19, 2023 at 10:21:27AM +0800, Shuai Xue wrote:
> Hardware errors could be signaled by
> asynchronous interrupt, e.g. when an
> error is detected by a background scrubber, or signaled by synchronous
> exception, e.g. when an uncorrected error is consumed. Both synchronous and
> asynchronous errors are queued and handled by a dedicated kthread in
> workqueue.
>
> commit 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for
> synchronous errors") keeps track of whether memory_failure() work was
> queued, and makes task_work pending to flush out the workqueue so that the
> work for synchronous error is processed before returning to user-space.
> The trick ensures that the corrupted page is unmapped and poisoned. And
> after returning to user-space, the task starts at current instruction which
> triggers a page fault in which kernel will send SIGBUS to current process
> due to VM_FAULT_HWPOISON.
>
> However, the memory failure recovery for hwpoison-aware mechanisms does not
> work as expected. For example, hwpoison-aware user-space processes like
> QEMU register their customized SIGBUS handler and enable early kill mode by
> setting PF_MCE_EARLY at initialization. Then the kernel will directly notify
> the process by sending a SIGBUS signal in memory failure with wrong
> si_code: the actual user-space process accesses the corrupt memory
> location, but its memory failure work is handled in a kthread context, so
> it will send SIGBUS with BUS_MCEERR_AO si_code to the actual user-space
> process instead of BUS_MCEERR_AR in kill_proc().
>
> To this end, separate synchronous and asynchronous error handling into
> different paths like X86 platform does:
>
> - valid synchronous errors: queue a task_work to synchronously send SIGBUS
>   before ret_to_user.
> - valid asynchronous errors: queue a work into workqueue to asynchronously
>   handle memory failure.
> - abnormal branches such as invalid PA, unexpected severity, no memory
>   failure config support, invalid GUID section, OOM, etc.
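
For readers following the si_code discussion above, here is a minimal
user-space sketch of why the AR/AO distinction matters to a hwpoison-aware
process. It is not code from the patch. The names classify_mce(), on_sigbus()
and setup_mce_handling() are hypothetical, and it assumes Linux with glibc
exposing BUS_MCEERR_AR and BUS_MCEERR_AO via <signal.h> under _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical classification of a SIGBUS si_code (names are illustrative). */
enum mce_kind { MCE_NONE, MCE_ACTION_REQUIRED, MCE_ACTION_OPTIONAL };

enum mce_kind classify_mce(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:	/* the faulting thread consumed poisoned data */
		return MCE_ACTION_REQUIRED;
	case BUS_MCEERR_AO:	/* poison was detected asynchronously */
		return MCE_ACTION_OPTIONAL;
	default:		/* an ordinary SIGBUS, not a memory error */
		return MCE_NONE;
	}
}

/* Last classification seen by the handler, for the main loop to inspect. */
volatile sig_atomic_t last_mce_kind = MCE_NONE;

void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	/* info->si_addr carries the address of the affected memory. */
	last_mce_kind = classify_mce(info->si_code);
}

/*
 * Install the SIGBUS handler and opt in to early kill.
 * Returns 0 on success, -1 on failure.
 */
int setup_mce_handling(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, NULL) != 0)
		return -1;

	/* Request early kill for this thread (sets PF_MCE_EARLY). */
	return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
```

A process that receives BUS_MCEERR_AO for an error it actually consumed may
treat the event as advisory and resume, which is the mismatch the commit
message describes.
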
>
> Then for valid synchronous errors, the current context in memory failure
> belongs exactly to the task consuming poison data and it will send SIGBUS
> with proper si_code.
>
> Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
> Signed-off-by: Shuai Xue
> Tested-by: Ma Wupeng
> Reviewed-by: Kefeng Wang
> Reviewed-by: Xiaofei Tan
> Reviewed-by: Baolin Wang
> ---
>  arch/x86/kernel/cpu/mce/core.c |  9 +---
>  drivers/acpi/apei/ghes.c       | 84 +++++++++++++++++++++-------------
>  include/acpi/ghes.h            |  3 --
>  mm/memory-failure.c            | 17 ++-----
>  4 files changed, 56 insertions(+), 57 deletions(-)
>
...
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 4d6e43c88489..80e1ea1cc56d 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2163,7 +2163,9 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>   *
>   * Return: 0 for successfully handled the memory error,
>   *         -EOPNOTSUPP for hwpoison_filter() filtered the error event,
> - *         < 0(except -EOPNOTSUPP) on failure.
> + *         -EHWPOISON for already sent SIGBUS to the current process with
> + *         the proper error info,

The meaning of this comment is understood, but the sentence seems to be a
little too long. Could you sort this out with bullet points (like below)?

  * Return values:
  *   0                     - success
  *   -EOPNOTSUPP           - hwpoison_filter() filtered the error event.
  *   -EHWPOISON            - sent SIGBUS to the current process with the
  *                           proper error info by kill_accessing_process().
  *   other negative values - failure

> + *         other negative error code on failure.
>   */
>  int memory_failure(unsigned long pfn, int flags)
>  {
> @@ -2445,19 +2447,6 @@ static void memory_failure_work_func(struct work_struct *work)
>  	}
>  }
>
> -/*
> - * Process memory_failure work queued on the specified CPU.
> - * Used to avoid return-to-userspace racing with the memory_failure workqueue.
> - */
> -void memory_failure_queue_kick(int cpu)
> -{
> -	struct memory_failure_cpu *mf_cpu;
> -
> -	mf_cpu = &per_cpu(memory_failure_cpu, cpu);
> -	cancel_work_sync(&mf_cpu->work);
> -	memory_failure_work_func(&mf_cpu->work);
> -}
> -

The declaration of memory_failure_queue_kick() still remains in
include/linux/mm.h, so please remove it as well.

Thanks,
Naoya Horiguchi