Date: Sat, 7 Oct 2023 15:15:45 +0800
Subject: Re: [RFC PATCH v2 0/9] Use ERST for persistent storage of MCE and APEI errors
To: Borislav Petkov
Cc: keescook@chromium.org, tony.luck@intel.com, gpiccoli@igalia.com,
 rafael@kernel.org, lenb@kernel.org, james.morse@arm.com,
 tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, ardb@kernel.org, robert.moore@intel.com,
 linux-hardening@vger.kernel.org, linux-acpi@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org,
 linux-efi@vger.kernel.org, acpica-devel@lists.linuxfoundation.org,
 baolin.wang@linux.alibaba.com
References: <20230925074426.97856-1-xueshuai@linux.alibaba.com>
 <20230928144345.GAZRWRIXH1Tfgn5EpO@fat_crate.local>
From: Shuai Xue
In-Reply-To: <20230928144345.GAZRWRIXH1Tfgn5EpO@fat_crate.local>

On 2023/9/28 22:43, Borislav Petkov wrote:
> On Mon, Sep 25, 2023 at 03:44:17PM +0800, Shuai Xue wrote:
>> After /dev/mcelog character device deprecated by commit 5de97c9f6d85
>> ("x86/mce: Factor out and deprecate the /dev/mcelog driver"), the
>> serialized MCE error record, of previous boot in persistent storage is not
>> collected via APEI ERST.
>
> You lost me here. /dev/mcelog is deprecated but you can still use it and
> apei_write_mce() still happens.

Yes, you are right.
apei_write_mce() still happens, so MCE records are written to persistent
storage and can be retrieved by apei_read_mce(). Previously, that
retrieval was performed by the mcelog package. However, mcelog has been
deprecated, and some distributions, such as Arch, do not even build
their kernels with the required configuration option
CONFIG_X86_MCELOG_LEGACY. [1]

So, IMHO, it is better to add a way to retrieve MCE records by switching
to the new-generation rasdaemon solution.

>
> Looking at your patches, you're adding this to ghes so how about you sit
> down first and explain your exact use case and what exactly you wanna
> do?
>
> Thx.
>

Sorry for the poor cover letter. I hope the following response clarifies
the matter.

Q1: What is the exact problem?

Traditionally, a fatal hardware error causes Linux to print an error log
to the console, e.g. via print_mce() or __ghes_print_estatus(), and then
reboot. On Linux, the primary method for obtaining debugging information
about a serious error or fault is the kdump mechanism. Kdump captures a
wealth of kernel and machine state and writes it to a file for
post-mortem debugging.

In certain scenarios, e.g. hosts/guests with root filesystems on
NFS/iSCSI where the networking software and/or hardware fails, kdump
fails to collect the hardware error context, leaving us unaware of what
actually occurred.

In the public cloud scenario, multiple virtual machines run on a single
physical server, and if that server fails, it can impact multiple
tenants. It is crucial for us to thoroughly analyze the root cause of
each instance failure in order to:

- Provide customers with a detailed explanation of the outage to
  reassure them.
- Collect the characteristics of the failures, such as ECC syndrome, to
  enable fault prediction.
- Explore potential solutions to prevent widespread outages.

In short, it is necessary to serialize hardware error information so
that it is available for post-mortem debugging.
Q2: What exactly do I want to do?

The MCE handler, do_machine_check(), saves the MCE record to persistent
storage, and the record is later retrieved by mcelog. Mcelog was
deprecated when kernel 4.12 was released in 2017, and the help text of
the configuration option CONFIG_X86_MCELOG_LEGACY suggests considering a
switch to the new-generation rasdaemon solution.

The GHES handler currently has no such support for APEI error records.

To serialize hardware error information so that it is available for
post-mortem debugging, the plan is to:

- add support for saving APEI error records to flash via ERST before
  panicking,
- add support for retrieving MCE or APEI error records from flash and
  emitting the related tracepoints after the system boots successfully
  again, so that rasdaemon can collect them.

Best Regards,
Shuai

[1] https://wiki.archlinux.org/title/Machine-check_exception
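
For illustration only, the flow described in Q2 (save a record before
panic, replay it on the next boot, then clear it) can be modelled in
user space roughly as follows. This is a toy sketch, not kernel code and
not taken from the patches: PersistentStore, save_before_panic() and
replay_after_boot() are hypothetical names, the dictionary merely stands
in for the ERST-backed flash, and the emit callback stands in for a
tracepoint that rasdaemon would consume.

```python
# Toy user-space model of the Q2 flow. Nothing here is a kernel API;
# every name below is hypothetical and exists only for illustration.
import json


class PersistentStore:
    """Stands in for ERST-backed flash: records outlive a 'reboot'."""

    def __init__(self):
        self._records = {}
        self._next_id = 1

    def write(self, payload: bytes) -> int:
        record_id = self._next_id
        self._next_id += 1
        self._records[record_id] = payload
        return record_id

    def record_ids(self):
        # Return a sorted copy so callers may clear records while iterating.
        return sorted(self._records)

    def read(self, record_id: int) -> bytes:
        return self._records[record_id]

    def clear(self, record_id: int) -> None:
        del self._records[record_id]


def save_before_panic(store, source, detail):
    """Serialize one error record (MCE or APEI) before the simulated panic."""
    payload = json.dumps({"source": source, "detail": detail}).encode()
    return store.write(payload)


def replay_after_boot(store, emit):
    """On the next boot, pass each saved record to `emit`, then clear it.

    Returns the number of records replayed.
    """
    replayed = 0
    for record_id in store.record_ids():
        record = json.loads(store.read(record_id).decode())
        emit(record)
        store.clear(record_id)
        replayed += 1
    return replayed


if __name__ == "__main__":
    store = PersistentStore()
    # The record contents are made up for the example.
    save_before_panic(store, "MCE", "uncorrected error (example)")
    save_before_panic(store, "APEI", "fatal error (example)")
    # ... simulated panic and reboot happen here ...
    replay_after_boot(store, print)
```

Clearing each record only after it has been emitted means a second
replay finds nothing, so the same error is not reported twice across
boots in this model.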