From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Hari Bathini, Michael Ellerman, Sasha Levin
Subject: [PATCH 5.10 430/563] powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic
Date: Mon, 24 Jan 2022 19:43:15 +0100
Message-Id: <20220124184039.325859101@linuxfoundation.org>
In-Reply-To: <20220124184024.407936072@linuxfoundation.org>
References: <20220124184024.407936072@linuxfoundation.org>

From: Hari Bathini

[ Upstream commit 06e629c25daa519be620a8c17359ae8fc7a2e903 ]

In panic path, fadump is triggered via a panic notifier function.
Before calling panic notifier functions, smp_send_stop() gets called,
which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc
("powerpc: stop_this_cpu: remove the cpu from the online map.") and
again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()")
started marking CPUs as offline while stopping them. So, if a kernel
has either of the above commits, a vmcore captured with fadump via the
panic path would not have register data processed for any CPU other
than the panic'ing CPU. Sample output of crash-utility with such vmcore:

  # crash vmlinux vmcore
  ...
        KERNEL: vmlinux
      DUMPFILE: vmcore  [PARTIAL DUMP]
          CPUS: 1
          DATE: Wed Nov 10 09:56:34 EST 2021
        UPTIME: 00:00:42
  LOAD AVERAGE: 2.27, 0.69, 0.24
         TASKS: 183
      NODENAME: XXXXXXXXX
       RELEASE: 5.15.0+
       VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021
       MACHINE: ppc64le  (2500 Mhz)
        MEMORY: 8 GB
         PANIC: "Kernel panic - not syncing: sysrq triggered crash"
           PID: 3394
       COMMAND: "bash"
          TASK: c0000000150a5f80  [THREAD_INFO: c0000000150a5f80]
           CPU: 1
         STATE: TASK_RUNNING (PANIC)

  crash> p -x __cpu_online_mask
  __cpu_online_mask = $1 = {
    bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>
  crash>
  crash> p -x __cpu_active_mask
  __cpu_active_mask = $2 = {
    bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>

While this has been the case since fadump was introduced, the issue
was not identified for two probable reasons:

  - In general, the bulk of the vmcores analyzed were from crashes
    due to an exception.

  - The above did change since commit 8341f2f222d7 ("sysrq: Use
    panic() to force a crash") started using panic() instead of
    dereferencing a NULL pointer to force a kernel crash. But then
    commit de6e5d38417e ("powerpc: smp_send_stop do not offline
    stopped CPUs") stopped marking CPUs as offline till kernel commit
    bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") reverted
    that change.

To ensure post processing register data of all other CPUs happens
as intended, let panic() function take the crash friendly path (read
crash_smp_send_stop()) with the help of crash_kexec_post_notifiers
option. Also, as register data for all CPUs is captured by f/w, skip
IPI callbacks here for fadump, to avoid any complications in finding
the right backtraces.

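The two masks in the crash output above differ only in their first word, so they can be decoded by hand. As a minimal sketch, assuming 64-bit mask words and looking only at bits[0], the helpers below count and list the CPUs set in such a word. The names mask_weight(), mask_to_cpus() and print_mask() are made up for this illustration and are not kernel or crash-utility APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Count the CPUs set in one 64-bit word of a CPU mask. */
unsigned int mask_weight(uint64_t word)
{
	unsigned int n = 0;

	while (word) {
		word &= word - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Store the CPU numbers set in 'word' into 'cpus'; return how many were stored. */
unsigned int mask_to_cpus(uint64_t word, unsigned int *cpus, unsigned int max)
{
	unsigned int n = 0;

	assert(cpus != NULL);
	for (unsigned int cpu = 0; cpu < 64 && n < max; cpu++)
		if (word & (UINT64_C(1) << cpu))
			cpus[n++] = cpu;
	return n;
}

/* Print the mask name, the CPU count and the CPU numbers on one line. */
void print_mask(const char *name, uint64_t word)
{
	unsigned int cpus[64];
	unsigned int n = mask_to_cpus(word, cpus, 64);

	printf("%s: %u CPU(s):", name, n);
	for (unsigned int i = 0; i < n; i++)
		printf(" %u", cpus[i]);
	printf("\n");
}
```

Fed the values from the output, 0x2 decodes to CPU 1 alone, which matches the "CPUS: 1" line, while 0xff decodes to CPUs 0 through 7. The other seven CPUs were active at the time of the panic but had been marked offline by the time the dump was taken.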
Signed-off-by: Hari Bathini
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20211207103719.91117-2-hbathini@linux.ibm.com
Signed-off-by: Sasha Levin
---
 arch/powerpc/kernel/fadump.c |  8 ++++++++
 arch/powerpc/kernel/smp.c    | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index eddf362caedce..c3bb800dc4352 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1641,6 +1641,14 @@ int __init setup_fadump(void)
 	else if (fw_dump.reserve_dump_area_size)
 		fw_dump.ops->fadump_init_mem_struct(&fw_dump);
 
+	/*
+	 * In case of panic, fadump is triggered via ppc_panic_event()
+	 * panic notifier. Setting crash_kexec_post_notifiers to 'true'
+	 * lets panic() function take crash friendly path before panic
+	 * notifiers are invoked.
+	 */
+	crash_kexec_post_notifiers = true;
+
 	return 1;
 }
 subsys_initcall(setup_fadump);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index d993f28107afa..cf99f57aed822 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -60,6 +60,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef DEBUG
 #include
@@ -612,6 +613,15 @@ void crash_smp_send_stop(void)
 {
 	static bool stopped = false;
 
+	/*
+	 * In case of fadump, register data for all CPUs is captured by f/w
+	 * on ibm,os-term rtas call. Skip IPI callbacks to other CPUs before
+	 * this rtas call to avoid tricky post processing of those CPUs'
+	 * backtraces.
+	 */
+	if (should_fadump_crash())
+		return;
+
 	if (stopped)
 		return;
 
-- 
2.34.1
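
For readers who want to see the effect of the two hunks together, here is a rough user-space model of the ordering described in the commit message. It is only a sketch under simplifying assumptions: struct model_sys and every model_* function are hypothetical names invented for this illustration, the online mask is a single 64-bit word, and the kexec and non-fadump crash-stop details are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state; none of these names are kernel APIs. */
struct model_sys {
	uint64_t online_mask;	/* bit n set: CPU n is marked online */
	bool post_notifiers;	/* stands in for crash_kexec_post_notifiers */
	bool fadump_active;	/* stands in for should_fadump_crash() */
};

/* Regular stop path: every CPU except the panicking one ends up offline. */
void model_smp_send_stop(struct model_sys *s, unsigned int panic_cpu)
{
	s->online_mask &= UINT64_C(1) << panic_cpu;
}

/* Crash-friendly stop path: with fadump, return before touching other CPUs. */
void model_crash_smp_send_stop(struct model_sys *s, unsigned int panic_cpu)
{
	(void)panic_cpu;
	if (s->fadump_active)
		return;
	/* The non-fadump crash stop (IPIs to other CPUs) is not modelled. */
}

/* Return the online mask that the panic notifiers would observe. */
uint64_t model_panic(struct model_sys *s, unsigned int panic_cpu)
{
	assert(s != NULL && panic_cpu < 64);

	if (!s->post_notifiers)
		model_smp_send_stop(s, panic_cpu);
	else
		model_crash_smp_send_stop(s, panic_cpu);

	/* Panic notifiers, where fadump is triggered, would run here. */
	return s->online_mask;
}
```

Starting from an online mask of 0xff with CPU 1 panicking, the model returns 0x2 when post_notifiers is false, which is the situation shown in the crash output, and leaves 0xff intact when both post_notifiers and fadump_active are true, which is the behaviour the patch aims for.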