From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Nathan Lynch, Michael Ellerman
Subject: [PATCH 5.10 089/593] powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
Date: Mon, 12 Jul 2021 08:04:09 +0200
Message-Id: <20210712060853.000563414@linuxfoundation.org>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210712060843.180606720@linuxfoundation.org>
References: <20210712060843.180606720@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Michael Ellerman

commit 7c6986ade69e3c81bac831645bc72109cd798a80 upstream.

In raise_backtrace_ipi() we iterate through the cpumask of CPUs, sending
each an IPI asking them to do a backtrace, but we don't wait for the
backtrace to happen.

We then iterate through the CPU mask again, and if any CPU hasn't done
the backtrace and cleared itself from the mask, we print a trace on its
behalf, noting that the trace may be "stale".

This works well enough when a CPU is not responding, because in that
case it doesn't receive the IPI and the sending CPU is left to print the
trace. But when all CPUs are responding we are left with a race between
the sending and receiving CPUs, if the sending CPU wins the race then it
will erroneously print a trace.

This leads to spurious "stale" traces from the sending CPU, which can
then be interleaved messily with the receiving CPU, note the CPU
numbers, eg:

  [ 1658.929157][ C7] rcu: Stack dump where RCU GP kthread last ran:
  [ 1658.929223][ C7] Sending NMI from CPU 7 to CPUs 1:
  [ 1658.929303][ C1] NMI backtrace for cpu 1
  [ 1658.929303][ C7] CPU 1 didn't respond to backtrace IPI, inspecting paca.
  [ 1658.929362][ C1] CPU: 1 PID: 325 Comm: kworker/1:1H Tainted: G W E 5.13.0-rc2+ #46
  [ 1658.929405][ C7] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 325 (kworker/1:1H)
  [ 1658.929465][ C1] Workqueue: events_highpri test_work_fn [test_lockup]
  [ 1658.929549][ C7] Back trace of paca->saved_r1 (0xc0000000057fb400) (possibly stale):
  [ 1658.929592][ C1] NIP: c00000000002cf50 LR: c008000000820178 CTR: c00000000002cfa0

To fix it, change the logic so that the sending CPU waits 5s for the
receiving CPU to print its trace. If the receiving CPU prints its trace
successfully then the sending CPU just continues, avoiding any spurious
"stale" trace.

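As an illustration only (not part of the patch), the difference between
"send and check immediately" and "send, then wait with a timeout" can be
modelled in user space. The sketch below is single-threaded and
deterministic; every name in it (struct sim, sim_tick(), NEVER, the
latency table) is hypothetical and merely stands in for the cpumask, the
IPI handler and udelay(1). Unlike the kernel loop, it re-checks the
pending flag after the wait rather than the remaining delay, which is a
choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPUS 4
#define NEVER UINT64_MAX /* hypothetical marker: this CPU never responds */

/* Simulated state; none of these names come from the kernel. */
struct sim {
	bool pending[NCPUS];        /* stands in for the bits of the cpumask */
	uint64_t latency_us[NCPUS]; /* time each CPU needs to handle the IPI */
	int own_traces;             /* traces printed by the receivers */
	int stale_traces;           /* fallback traces printed by the sender */
};

/* Let one target CPU make progress; it responds once enough time has passed. */
void sim_tick(struct sim *s, unsigned int cpu, uint64_t elapsed_us)
{
	assert(cpu < NCPUS);
	if (s->pending[cpu] && s->latency_us[cpu] != NEVER &&
	    elapsed_us >= s->latency_us[cpu]) {
		s->own_traces++;         /* receiver prints its own trace */
		s->pending[cpu] = false; /* and clears itself from the mask */
	}
}

/* Old behaviour: check right after sending, without waiting. */
void raise_backtrace_nowait(struct sim *s)
{
	for (unsigned int cpu = 0; cpu < NCPUS; cpu++) {
		if (!s->pending[cpu])
			continue;
		sim_tick(s, cpu, 0);
		if (s->pending[cpu]) {
			s->stale_traces++; /* sender wins the race */
			s->pending[cpu] = false;
		}
	}
}

/* New behaviour: wait up to timeout_us per CPU before falling back. */
void raise_backtrace_wait(struct sim *s, uint64_t timeout_us)
{
	for (unsigned int cpu = 0; cpu < NCPUS; cpu++) {
		uint64_t delay_us = timeout_us;

		if (!s->pending[cpu])
			continue;

		/* Poll in 1us steps, as the udelay(1) loop does. */
		while (s->pending[cpu] && delay_us) {
			sim_tick(s, cpu, timeout_us - delay_us + 1);
			delay_us--;
		}

		/* Receiver cleared itself: no fallback trace needed. */
		if (!s->pending[cpu])
			continue;

		s->stale_traces++; /* genuine timeout: print on its behalf */
		s->pending[cpu] = false;
	}
}
```

With three pending CPUs whose simulated latencies are 0us, 10us and
NEVER, and a 100us timeout, the waiting variant records two receiver
traces and one fallback trace, while the non-waiting variant records one
receiver trace and two fallback traces, the second of which is the
spurious one.
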
This has the added benefit of allowing all CPUs to print their traces in
order and avoids any interleaving of their output.

Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Cc: stable@vger.kernel.org # v4.18+
Reported-by: Nathan Lynch
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20210625140408.3351173-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/kernel/stacktrace.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -230,17 +230,31 @@ static void handle_backtrace_ipi(struct
 
 static void raise_backtrace_ipi(cpumask_t *mask)
 {
+	struct paca_struct *p;
 	unsigned int cpu;
+	u64 delay_us;
 
 	for_each_cpu(cpu, mask) {
-		if (cpu == smp_processor_id())
+		if (cpu == smp_processor_id()) {
 			handle_backtrace_ipi(NULL);
-		else
-			smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, 5 * USEC_PER_SEC);
-	}
+			continue;
+		}
 
-	for_each_cpu(cpu, mask) {
-		struct paca_struct *p = paca_ptrs[cpu];
+		delay_us = 5 * USEC_PER_SEC;
+
+		if (smp_send_safe_nmi_ipi(cpu, handle_backtrace_ipi, delay_us)) {
+			// Now wait up to 5s for the other CPU to do its backtrace
+			while (cpumask_test_cpu(cpu, mask) && delay_us) {
+				udelay(1);
+				delay_us--;
+			}
+
+			// Other CPU cleared itself from the mask
+			if (delay_us)
+				continue;
+		}
+
+		p = paca_ptrs[cpu];
 
 		cpumask_clear_cpu(cpu, mask);