Date: Thu, 30 Sep 2021 12:59:03 -0400
From: Steven Rostedt
To: Petr Mladek
Cc: "Paul E.
McKenney", Alexander Popov, Jonathan Corbet, Andrew Morton,
    Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
    Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
    Kees Cook, Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
    Alexey Kardashevskiy, Christophe Leroy, Jann Horn, Greg Kroah-Hartman,
    Mark Rutland, Andy Lutomirski, Dave Hansen, Thomas Garnier, Will Deacon,
    Ard Biesheuvel, Laura Abbott, David S Miller, Borislav Petkov,
    kernel-hardening@lists.openwall.com, linux-hardening@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    notify@kernel.org, Linus Torvalds
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter
Message-ID: <20210930125903.0783b06e@oasis.local.home>
References: <20210929185823.499268-1-alex.popov@linux.com> <20210929194924.GA880162@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 30 Sep 2021 11:15:41 +0200
Petr Mladek wrote:

> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > warnings:
> > > >  1. Do nothing (by default),
> > > >  2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > >     so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.

Several people use it, as I see reports all the time when someone can
trigger a WARN_ON() from user space, and it's listed as a DoS of the
system.

>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.

WARN() should never be used just because of laziness. If it is, then
that's a bug. Let's not use that as an excuse to shoot down this
proposal.

WARN() should only be used to test assumptions where you do not believe
something can happen. I use it all the time when the logic prevents some
state, and have the WARN() fire if that state is hit. Because to me, it
shows that something that shouldn't happen happened, and I need to fix
the code.

Basically, WARN() should be used just like BUG(). But Linus hates BUG(),
because in most cases these bad areas shouldn't take down the entire
kernel. Some people, though, WANT it to take down the system.

>
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?

pr_warn() can be used for things that the user can hit. I'll use
pr_warn() for memory failures and such: something that says "we ran out
of resources, this will not work the way you expect". That is perfect
for pr_warn(). It is not something that requires a stack dump.

>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?

Because really, WARN() == BUG(), but like I said, Linus doesn't like
taking down the entire system in these areas.

>
>
> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> anyone really enables this behavior. They will enable it only
> when it works the right way most of the time.
panic_on_warn has been used for years now. I do not think this is an
issue.

>
> I wonder if kernel could survive killing of any kthread. I have never
> seen a code that would check whether a kthread was killed and
> restart it.

We can easily check if the thread is a kernel thread or a user thread,
and make the decision on that.

-- Steve
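
For illustration only, the kernel-thread vs. user-thread decision could
look roughly like the userspace mock below. This is a sketch, not the
code from the patch: struct mock_task, MOCK_PF_KTHREAD (mirroring the
kernel's PF_KTHREAD flag bit), enum warn_reaction and
choose_warn_reaction() are all hypothetical names, and sparing kthreads
under pkill_on_warn is an assumption about how the decision might be
made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace mock; the value mirrors the kernel's PF_KTHREAD flag bit. */
#define MOCK_PF_KTHREAD 0x00200000u

/* Hypothetical stand-in for the one task_struct field used here. */
struct mock_task {
	unsigned int flags;
};

/* The three reactions discussed in this thread. */
enum warn_reaction {
	WARN_CONTINUE,	/* default: print the warning and carry on */
	WARN_KILL_TASK,	/* pkill_on_warn: kill the offending user task */
	WARN_PANIC,	/* panic_on_warn: take the whole system down */
};

/* True if the task is marked as a kernel thread. */
static inline bool mock_is_kthread(const struct mock_task *task)
{
	assert(task != NULL);
	return (task->flags & MOCK_PF_KTHREAD) != 0;
}

/*
 * Assumed policy: panic_on_warn wins; pkill_on_warn only kills user
 * tasks, since nothing restarts a killed kthread; otherwise continue.
 */
static inline enum warn_reaction
choose_warn_reaction(const struct mock_task *task,
		     bool panic_on_warn, bool pkill_on_warn)
{
	if (panic_on_warn)
		return WARN_PANIC;
	if (pkill_on_warn && !mock_is_kthread(task))
		return WARN_KILL_TASK;
	return WARN_CONTINUE;
}
```

With this policy, a kthread that hits a WARN() under pkill_on_warn just
keeps running, which is the same as today's default behavior.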