From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Hari Bathini, Michael Ellerman, Sasha Levin, sxwjean@gmail.com,
    sfr@canb.auug.org.au, aneesh.kumar@linux.ibm.com, nick.child@ibm.com,
    nathan@kernel.org, srikar@linux.vnet.ibm.com, ego@linux.vnet.ibm.com,
    nathanl@linux.ibm.com, parth@linux.ibm.com, clg@kaod.org,
    npiggin@gmail.com, robh@kernel.org, yukuai3@huawei.com,
    linuxppc-dev@lists.ozlabs.org
Subject: [PATCH AUTOSEL 5.16 30/52] powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic
Date: Mon, 17 Jan 2022 11:58:31 -0500
Message-Id: <20220117165853.1470420-30-sashal@kernel.org>
In-Reply-To: <20220117165853.1470420-1-sashal@kernel.org>
References: <20220117165853.1470420-1-sashal@kernel.org>
X-Mailer: git-send-email 2.34.1
X-stable: review
X-Patchwork-Hint: Ignore
X-Mailing-List: linux-kernel@vger.kernel.org

From: Hari Bathini

[ Upstream commit 06e629c25daa519be620a8c17359ae8fc7a2e903 ]

In panic path, fadump is triggered via a panic notifier function.
Before calling panic notifier functions, smp_send_stop() gets called,
which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc
("powerpc: stop_this_cpu: remove the cpu from the online map.") and
again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
started marking CPUs as offline while stopping them. So, if a kernel
has either of the above commits, vmcore captured with fadump via panic
path would not process register data for all CPUs except the panic'ing
CPU. Sample output of crash-utility with such vmcore:

  # crash vmlinux vmcore
  ...
        KERNEL: vmlinux
      DUMPFILE: vmcore  [PARTIAL DUMP]
          CPUS: 1
          DATE: Wed Nov 10 09:56:34 EST 2021
        UPTIME: 00:00:42
  LOAD AVERAGE: 2.27, 0.69, 0.24
         TASKS: 183
      NODENAME: XXXXXXXXX
       RELEASE: 5.15.0+
       VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021
       MACHINE: ppc64le  (2500 Mhz)
        MEMORY: 8 GB
         PANIC: "Kernel panic - not syncing: sysrq triggered crash"
           PID: 3394
       COMMAND: "bash"
          TASK: c0000000150a5f80  [THREAD_INFO: c0000000150a5f80]
           CPU: 1
         STATE: TASK_RUNNING (PANIC)

  crash> p -x __cpu_online_mask
  __cpu_online_mask = $1 = {
    bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>
  crash>
  crash> p -x __cpu_active_mask
  __cpu_active_mask = $2 = {
    bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>

While this has been the case since fadump was introduced, the issue
was not identified for two probable reasons:

  - In general, the bulk of the vmcores analyzed were from crash
    due to exception.

  - The above did change since commit 8341f2f222d7 ("sysrq: Use
    panic() to force a crash") started using panic() instead of
    dereferencing NULL pointer to force a kernel crash. But then
    commit de6e5d38417e ("powerpc: smp_send_stop do not offline
    stopped CPUs") stopped marking CPUs as offline till kernel
    commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
    reverted that change.

To ensure post processing register data of all other CPUs happens
as intended, let panic() function take the crash friendly path (read
crash_smp_send_stop()) with the help of crash_kexec_post_notifiers
option. Also, as register data for all CPUs is captured by f/w, skip
IPI callbacks here for fadump, to avoid any complications in finding
the right backtraces.

Signed-off-by: Hari Bathini
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20211207103719.91117-2-hbathini@linux.ibm.com
Signed-off-by: Sasha Levin
---
 arch/powerpc/kernel/fadump.c |  8 ++++++++
 arch/powerpc/kernel/smp.c    | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index b7ceb041743c9..60f5fc14aa235 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1641,6 +1641,14 @@ int __init setup_fadump(void)
 	else if (fw_dump.reserve_dump_area_size)
 		fw_dump.ops->fadump_init_mem_struct(&fw_dump);
 
+	/*
+	 * In case of panic, fadump is triggered via ppc_panic_event()
+	 * panic notifier. Setting crash_kexec_post_notifiers to 'true'
+	 * lets panic() function take crash friendly path before panic
+	 * notifiers are invoked.
+	 */
+	crash_kexec_post_notifiers = true;
+
 	return 1;
 }
 subsys_initcall(setup_fadump);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 7201fdcf02f1c..c338f9d8ab37a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -61,6 +61,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef DEBUG
 #include
@@ -638,6 +639,15 @@ void crash_smp_send_stop(void)
 {
 	static bool stopped = false;
 
+	/*
+	 * In case of fadump, register data for all CPUs is captured by f/w
+	 * on ibm,os-term rtas call. Skip IPI callbacks to other CPUs before
+	 * this rtas call to avoid tricky post processing of those CPUs'
+	 * backtraces.
+	 */
+	if (should_fadump_crash())
+		return;
+
 	if (stopped)
 		return;
 
-- 
2.34.1
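
Reviewer's note on the crash output quoted in the commit message: the two
masks are easier to compare once decoded into CPU numbers. The following is
a minimal userspace sketch, not kernel code. The helper names mask_weight(),
mask_first() and mask_print() are made up for illustration, and it assumes
only the first word of each mask matters, since the remaining words in the
quoted output are 0x0.

```c
#include <assert.h>
#include <stdio.h>

/* Number of CPUs whose bit is set in one cpumask word. */
int mask_weight(unsigned long word)
{
	int n = 0;

	for (; word; word &= word - 1)	/* clear the lowest set bit */
		n++;
	return n;
}

/* Lowest CPU number set in the word; the word must be non-zero. */
int mask_first(unsigned long word)
{
	int cpu = 0;

	assert(word != 0);
	while (!(word & 1UL)) {
		word >>= 1;
		cpu++;
	}
	return cpu;
}

/* Print the CPU numbers encoded in one cpumask word. */
void mask_print(const char *name, unsigned long word)
{
	printf("%s:", name);
	for (int cpu = 0; word; cpu++, word >>= 1)
		if (word & 1UL)
			printf(" %d", cpu);
	printf("\n");
}
```

With the values from the output, mask_weight(0x2) is 1 and mask_first(0x2)
is 1, so only CPU 1 (the panic'ing CPU) is still marked online, while
mask_weight(0xff) is 8, so CPUs 0-7 are all marked active. That gap matches
the "CPUS: 1" line reported by crash.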