References: <20211027233215.306111-1-alex.popov@linux.com> <77b79f0c-48f2-16dd-1d00-22f3a1b1f5a6@linux.com>
In-Reply-To: <77b79f0c-48f2-16dd-1d00-22f3a1b1f5a6@linux.com>
From: Linus Torvalds
Date: Sat, 13 Nov 2021 11:58:13 -0800
Subject: Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
To: Alexander Popov
Cc: Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
 Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
 Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
 Kees Cook, Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
 Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
 Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
 Steven Rostedt, Will Deacon, Ard Biesheuvel, Laura Abbott,
 David S Miller, Borislav Petkov, Arnd Bergmann, Andrew Scull,
 Marc Zyngier, Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
 Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
 Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
 Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport, Bjorn Andersson,
 Kernel Hardening, linux-hardening@vger.kernel.org,
 "open list:DOCUMENTATION", linux-arch, Linux Kernel Mailing List,
 linux-fsdevel, notify@kernel.org, main@lists.elisa.tech,
 safety-architecture@lists.elisa.tech, devel@lists.elisa.tech,
 Shuah Khan, Lukas Bulwahn
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 13, 2021 at 10:14 AM Alexander Popov wrote:
>
> Killing the process that hit a kernel warning complies with the Fail-Fast
> principle [1].

The thing is a WARNING.

It's not even clear that the warning has anything to do with the
process that triggered it. It could happen in an interrupt, or in some
async context (kernel threads, whatever), or the warning could just be
something that is detected by a different user than the thing that
actually caused the warning to become an issue.

If you want to reboot the machine on a kernel warning, you get that
fail-fast thing you want. There are two situations:

 - kernel testing (pretty much universally done in a virtual machine,
   or simply just checking 'dmesg' afterwards)

 - hyperscalers like google etc that just want to take any suspect
   machines offline asap

But sending a signal to a random process is just voodoo programming,
and as likely to cause other very odd failures as anything else.

I really don't see the point of that signal. I'm happy to be proven
wrong, but that will require some major installation actually using it
first and having a lot of strong arguments to counter-act the above.

Seriously, WARN_ON() can happen in situations where sending a signal
may be a REALLY BAD idea, never mind the issue that it's not even
clear who the signal should be sent to.

Yes, yes, your patches have some random "safety guards", in that it
won't send the signal to a PF_KTHREAD or the global init process.
But those safety guards literally make my argument for me: sending a
signal to whoever randomly triggered a warning is simply _wrong_.
Adding random "don't do it in this case" doesn't make it right, it
only shows that "yes, it happens to the wrong person, and here's a
hack to avoid generating obvious problems".

Honestly, if the intent is to not have to parse the dmesg output, then
I think it would be much better to introduce a new /proc file to read
the kernel tainting state, and then some test manager process could be
able to poll() that file or something. Not sending a signal to random
targets, but have a much more explicit model.

That said, I'm not convinced that "just read the kernel message log"
is in any way wrong either.

               Linus
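
For illustration, a test manager along the lines suggested above might look roughly like the sketch below. It does not use the new /proc file proposed in the mail. As a stand-in it reads the existing /proc/sys/kernel/tainted mask, where bit 9 (value 512, flag 'W') records that the kernel has issued a warning. The function names are hypothetical, and the sketch re-reads the file on a timer rather than calling poll(), because it does not assume the existing file signals changes that way.

```python
#!/usr/bin/env python3
"""Sketch: watch the kernel taint mask for the 'kernel issued warning' bit."""
import time

# Existing sysctl file that exposes the taint mask as a decimal integer.
TAINTED_PATH = "/proc/sys/kernel/tainted"
# Bit 9 (value 512, flag 'W'): the kernel has issued a warning.
TAINT_WARN = 1 << 9


def parse_taint(text):
    """Convert the file contents (for example '512\\n') to an integer mask."""
    return int(text.strip())


def has_warned(mask):
    """Return True if the warning bit is set in the taint mask."""
    return bool(mask & TAINT_WARN)


def read_taint(path=TAINTED_PATH):
    """Read and parse the current taint mask."""
    with open(path) as f:
        return parse_taint(f.read())


def wait_for_warning(interval=1.0, path=TAINTED_PATH):
    """Block until the warning bit is set, re-reading the file periodically."""
    while not has_warned(read_taint(path)):
        time.sleep(interval)


if __name__ == "__main__":
    wait_for_warning()
    print("kernel has issued a warning; check dmesg for details")
```

Taint bits stay set until reboot, so this only detects the first warning since boot. That fits the "take the suspect machine offline" case better than per-warning notification, for which the message log remains the source of detail.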