From: Xiaofei Tan
CC: Xiaofei Tan
Subject: [PATCH] ACPI / APEI: do memory failure on the physical address reported by ARM processor error section
Date: Thu, 30 Jul 2020 15:32:28 +0800
Message-ID: <1596094348-10230-1-git-send-email-tanxiaofei@huawei.com>
X-Mailer: git-send-email 2.8.1
X-Mailing-List: linux-kernel@vger.kernel.org

After the commit listed under Fixes below was applied, a user-mode SEA is
processed preferentially by APEI, which performs memory failure handling to
recover. However, there are some problems:

1) The fact that apei_claim_sea() has processed a CPER record does not mean
that memory failure handling has been done.
This is because a firmware-first RAS error is reported by both the producer
and the consumer. An SEA is mostly reported by the consumer, using the ARM
processor error section. (The producer could be the DDRC or a cache, which
report using the memory error section or other error sections.) Memory
failure handling is not yet supported for the ARM processor error section,
so add it.

2) Some hardware platforms cannot record the physical address every time,
yet they may still report a firmware-first RAS error using the ARM processor
error section each time. Such platforms should update their firmware. Do not
report the RAS error when the physical address is not recorded.

Fixes: 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work")
Signed-off-by: Xiaofei Tan
---
 drivers/acpi/apei/ghes.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 81bf71b..07bfa28 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -466,6 +466,44 @@ static bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata,
 	return false;
 }
 
+static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, int sev)
+{
+	struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
+	struct cper_arm_err_info *err_info;
+	bool queued = false;
+	int sec_sev, i;
+
+	log_arm_hw_error(err);
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))
+		return false;
+
+	sec_sev = ghes_severity(gdata->error_severity);
+	if (sev != GHES_SEV_RECOVERABLE || sec_sev != GHES_SEV_RECOVERABLE)
+		return false;
+
+	err_info = (struct cper_arm_err_info *) (err + 1);
+	for (i = 0; i < err->err_info_num; i++, err_info++) {
+		unsigned long pfn;
+
+		if (!(err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR))
+			continue;
+
+		pfn = PHYS_PFN(err_info->physical_fault_addr);
+		if (!pfn_valid(pfn)) {
+			pr_warn(FW_WARN GHES_PFX
+				"Invalid address in generic error data: %#llx\n",
+				err_info->physical_fault_addr);
+			continue;
+		}
+
+		memory_failure_queue(pfn, 0);
+		queued = true;
+	}
+
+	return queued;
+}
+
 /*
  * PCIe AER errors need to be sent to the AER driver for reporting and
  * recovery. The GHES severities map to the following AER severities and
@@ -543,9 +581,7 @@ static bool ghes_do_proc(struct ghes *ghes,
 			ghes_handle_aer(gdata);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
-			struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
-
-			log_arm_hw_error(err);
+			queued = ghes_handle_arm_hw_error(gdata, sev);
 		} else {
 			void *err = acpi_hest_get_payload(gdata);

-- 
2.8.1
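
For illustration only, the per-entry filtering done by ghes_handle_arm_hw_error() can be sketched in user space. This is not kernel code: every mock_* name, MOCK_VALID_PHYSICAL_ADDR (with an assumed bit position), MOCK_PAGE_SHIFT and the PFN limit are hypothetical stand-ins for cper_arm_err_info, CPER_ARM_INFO_VALID_PHYSICAL_ADDR, PHYS_PFN(), pfn_valid() and memory_failure_queue(). The sketch only shows the order of the checks: skip entries without a valid physical address, skip addresses outside known memory, and queue the rest.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real definitions live in the kernel headers. */
#define MOCK_VALID_PHYSICAL_ADDR (1u << 4)	/* assumed bit position */
#define MOCK_PAGE_SHIFT 12			/* assumes 4 KiB pages */

struct mock_err_info {
	uint16_t validation_bits;
	uint64_t physical_fault_addr;
};

/* Pretend that only PFNs below this arbitrary limit are backed by memory. */
static bool mock_pfn_valid(uint64_t pfn)
{
	return pfn < 0x100000;
}

/*
 * Walk n error-info entries and store the PFN of every entry that has a
 * valid, in-range physical address into queued_pfns[]. Returns the number
 * of PFNs stored; the kernel function would instead call
 * memory_failure_queue() and return whether anything was queued.
 */
static int mock_queue_failures(const struct mock_err_info *info, int n,
			       uint64_t *queued_pfns)
{
	int queued = 0;

	for (int i = 0; i < n; i++, info++) {
		uint64_t pfn;

		/* No physical address recorded: nothing to recover. */
		if (!(info->validation_bits & MOCK_VALID_PHYSICAL_ADDR))
			continue;

		pfn = info->physical_fault_addr >> MOCK_PAGE_SHIFT;
		if (!mock_pfn_valid(pfn)) {
			/* "%#" already emits the 0x prefix. */
			fprintf(stderr, "Invalid address: %#" PRIx64 "\n",
				info->physical_fault_addr);
			continue;
		}

		queued_pfns[queued++] = pfn;
	}

	return queued;
}
```

A caller would pass an array of entries plus an output buffer of at least n elements; a return value of 0 corresponds to the "nothing queued" case in which the patch returns false.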