Date: Mon, 10 Feb 2020 14:51:18 -0800
From: Andrew Morton
To: Konstantin Khlebnikov
Cc: Petr Mladek, Peter Zijlstra, linux-kernel@vger.kernel.org, Steven Rostedt, Sergey Senozhatsky, Dmitry Monakhov
Subject: Re: [PATCH] kernel/watchdog: flush all printk nmi buffers when hardlockup detected
Message-Id: <20200210145118.1d80e248c9206aeafd5baae6@linux-foundation.org>
In-Reply-To:
<158132813726.1980.17382047082627699898.stgit@buzz>
References: <158132813726.1980.17382047082627699898.stgit@buzz>

On Mon, 10 Feb 2020 12:48:57 +0300 Konstantin Khlebnikov wrote:

> In NMI context printk() could save messages into per-cpu buffers and
> schedule flush by irq_work when IRQ are unblocked. This means message
> about hardlockup appears in kernel log only when/if lockup is gone.

I think I understand what this means.  The hard lockup detector runs at
NMI time but if it detects a lockup within IRQ context it cannot call
printk, because it's within NMI context, where synchronous printk
doesn't work.  Yes?

> Comment in irq_work_queue_on() states that remote IPI aren't NMI safe
> thus printk() cannot schedule flush work to another cpu.
>
> This patch adds simple atomic counter of detected hardlockups and
> flushes all per-cpu printk buffers in context softlockup watchdog
> at any other cpu when it sees changes of this counter.

And I think this works because the softlockup detector runs within irq
context?

>
> ...
>
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -92,6 +92,26 @@ static int __init hardlockup_all_cpu_backtrace_setup(char *str)
>  }
>  __setup("hardlockup_all_cpu_backtrace=", hardlockup_all_cpu_backtrace_setup);
>  # endif /* CONFIG_SMP */
> +
> +atomic_t hardlockup_detected = ATOMIC_INIT(0);
> +
> +static inline void flush_hardlockup_messages(void)

I don't think this needs to be inlined?

> +{
> +	static atomic_t flushed = ATOMIC_INIT(0);
> +
> +	/* flush messages from hard lockup detector */
> +	if (atomic_read(&hardlockup_detected) != atomic_read(&flushed)) {
> +		atomic_set(&flushed, atomic_read(&hardlockup_detected));
> +		printk_safe_flush();
> +	}
> +}

Could we add some explanatory comments here?  Explain to the reader why
this code exists, what purpose it serves?  Basically a micro version of
the above changelog.

>
> ...
>
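
To make the request concrete, here is a rough user-space sketch of the counter scheme with the kind of comments I have in mind. It is not the patch itself, and it makes several assumptions: C11 atomics stand in for the kernel's atomic_t, fake_printk_safe_flush() is a hypothetical stand-in for printk_safe_flush(), report_hardlockup() is a hypothetical name for wherever the hard lockup detector bumps the counter (that hunk is elided above), and the counter is read once per call rather than twice.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Bumped each time the (simulated) hard lockup detector reports a lockup.
 * The detector runs in NMI context, where its messages may end up in
 * per-cpu buffers instead of the log, and it cannot ask another CPU to
 * flush them. All it does is bump this counter.
 */
static atomic_int hardlockup_detected;

/* Hypothetical stand-in for printk_safe_flush(); only counts its calls. */
static int flush_calls;

static void fake_printk_safe_flush(void)
{
	flush_calls++;
}

/* Hypothetical name: the NMI side, recording that a lockup was reported. */
static void report_hardlockup(void)
{
	atomic_fetch_add(&hardlockup_detected, 1);
}

/*
 * Softlockup watchdog side, run periodically on every CPU that is still
 * alive. If the counter has moved since the last flush, some CPU reported
 * a hard lockup and may be unable to flush its own buffered messages, so
 * flush all per-cpu buffers on its behalf.
 *
 * Returns true if a flush was performed.
 */
static bool flush_hardlockup_messages(void)
{
	static atomic_int flushed;
	int seen = atomic_load(&hardlockup_detected);

	if (seen == atomic_load(&flushed))
		return false;

	atomic_store(&flushed, seen);
	fake_printk_safe_flush();
	return true;
}
```

Calling flush_hardlockup_messages() before any report does nothing; one call after one or more reports flushes once, and further calls stay quiet until the counter moves again.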