From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Zhiquan Li, Youquan Song, Borislav Petkov, Naoya Horiguchi, Sasha Levin, tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org, linux-edac@vger.kernel.org
Subject: [PATCH AUTOSEL 6.1 14/14] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
Date: Mon, 15 Jan 2024 18:25:48 -0500
Message-ID: <20240115232611.209265-14-sashal@kernel.org>
In-Reply-To: <20240115232611.209265-1-sashal@kernel.org>
References: <20240115232611.209265-1-sashal@kernel.org>
X-stable-base: Linux 6.1.73

From: Zhiquan Li

[ Upstream commit 9f3b130048bfa2e44a8cfb1b616f826d9d5d8188 ]

Memory errors don't happen very often, especially fatal ones. However,
in large-scale scenarios such as data centers, that probability
increases with the amount of machines present.

When a fatal machine check happens, mce_panic() is called based on the
severity grading of that error. The page containing the error is not
marked as poison.

However, when kexec is enabled, tools like makedumpfile understand when
pages are marked as poison and do not touch them so as not to cause
a fatal machine check exception again while dumping the previous
kernel's memory.

Therefore, mark the page containing the error as poisoned so that the
kexec'ed kernel can avoid accessing the page.

  [ bp: Rewrite commit message and comment. ]

Co-developed-by: Youquan Song
Signed-off-by: Youquan Song
Signed-off-by: Zhiquan Li
Signed-off-by: Borislav Petkov (AMD)
Reviewed-by: Naoya Horiguchi
Link: https://lore.kernel.org/r/20231014051754.3759099-1-zhiquan1.li@intel.com
Signed-off-by: Sasha Levin
---
 arch/x86/kernel/cpu/mce/core.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f1a748da5fab..cad6ea1911e9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -44,6 +44,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -239,6 +240,7 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	struct llist_node *pending;
 	struct mce_evt_llist *l;
 	int apei_err = 0;
+	struct page *p;
 
 	/*
 	 * Allow instrumentation around external facilities usage. Not that it
@@ -292,6 +294,20 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	if (!fake_panic) {
 		if (panic_timeout == 0)
 			panic_timeout = mca_cfg.panic_timeout;
+
+		/*
+		 * Kdump skips the poisoned page in order to avoid
+		 * touching the error bits again. Poison the page even
+		 * if the error is fatal and the machine is about to
+		 * panic.
+		 */
+		if (kexec_crash_loaded()) {
+			if (final && (final->status & MCI_STATUS_ADDRV)) {
+				p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
+				if (p)
+					SetPageHWPoison(p);
+			}
+		}
 		panic(msg);
 	} else
 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
-- 
2.43.0
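
For readers who want to trace the new checks outside the kernel, here is a
minimal user-space sketch of the logic the last hunk adds, plus the dump-side
behaviour the commit message describes. It is not kernel code. The mock_*
names, NR_MOCK_PAGES, and the boolean flag standing in for PG_hwpoison are
all hypothetical and exist only for illustration. Only PAGE_SHIFT (12 on x86)
and MCI_STATUS_ADDRV (bit 58 of the MCi_STATUS value) mirror real kernel
definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrored from the x86 kernel headers; everything else below is a mock. */
#define PAGE_SHIFT        12             /* 4 KiB pages */
#define MCI_STATUS_ADDRV  (1ULL << 58)   /* MCi_ADDR holds a valid address */

#define NR_MOCK_PAGES     8              /* hypothetical tiny "physical memory" */

struct mock_mce {                        /* only the fields the patch reads */
	uint64_t status;
	uint64_t addr;
};

struct mock_page {
	bool hwpoison;                   /* stands in for the PG_hwpoison flag */
};

static struct mock_page mock_mem_map[NR_MOCK_PAGES];

/* Stand-in for pfn_to_online_page(): NULL when no page backs the PFN. */
struct mock_page *mock_pfn_to_page(uint64_t pfn)
{
	return pfn < NR_MOCK_PAGES ? &mock_mem_map[pfn] : NULL;
}

/*
 * Crashing-kernel side: the same chain of checks as the hunk added to
 * mce_panic(). Returns true only if a page was actually marked.
 */
bool mock_poison_mce_page(const struct mock_mce *final, bool crash_kernel_loaded)
{
	struct mock_page *p;

	if (!crash_kernel_loaded)        /* kexec_crash_loaded() in the patch */
		return false;
	if (!final || !(final->status & MCI_STATUS_ADDRV))
		return false;            /* no record, or no valid address */

	p = mock_pfn_to_page(final->addr >> PAGE_SHIFT);
	if (!p)
		return false;            /* address not backed by a page */

	p->hwpoison = true;              /* SetPageHWPoison(p) in the patch */
	return true;
}

/* Dump side: count the pages a dump tool would copy, skipping poisoned ones. */
size_t mock_count_dumpable_pages(void)
{
	size_t n = 0;

	for (size_t pfn = 0; pfn < NR_MOCK_PAGES; pfn++)
		if (!mock_mem_map[pfn].hwpoison)
			n++;
	return n;
}
```

Calling mock_poison_mce_page() with an address such as 0x3abc and the ADDRV
bit set marks mock page 3, after which mock_count_dumpable_pages() reports
one page fewer. With the bit clear, a NULL record, no crash kernel loaded, or
an address outside the mock range, nothing is marked, which matches the
guards in the patch.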