Date: Wed, 26 Feb 2020 17:08:18 +0100
From: Peter Zijlstra
To: Andy Lutomirski
Cc: Frederic Weisbecker, Thomas Gleixner, LKML, X86 ML, Steven Rostedt,
 Brian Gerst, Juergen Gross, Paolo Bonzini, Arnd Bergmann
Subject: Re: [patch 02/10] x86/mce: Disable tracing and kprobes on do_machine_check()
Message-ID: <20200226160818.GY18400@hirez.programming.kicks-ass.net>
References: <20200225213636.689276920@linutronix.de>
 <20200225220216.315548935@linutronix.de>
 <20200226011349.GH9599@lenoir>
 <20200226132850.GX18400@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 26, 2020 at 07:10:01AM -0800, Andy Lutomirski wrote:
> On Wed, Feb 26, 2020 at 5:28 AM Peter Zijlstra wrote:
> > On Tue, Feb 25, 2020 at 09:29:00PM -0800, Andy Lutomirski wrote:
> >
> > > >> +void notrace do_machine_check(struct pt_regs *regs, long error_code)
> > > >>  {
> > > >>      DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
> > > >>      DECLARE_BITMAP(toclear, MAX_NR_BANKS);
> > > >> @@ -1360,6 +1366,7 @@ void do_machine_check(struct pt_regs *re
> > > >>      ist_exit(regs);
> > > >>  }
> > > >>  EXPORT_SYMBOL_GPL(do_machine_check);
> > > >> +NOKPROBE_SYMBOL(do_machine_check);
> > > >
> > > > That won't protect all the function called by do_machine_check(), right?
> > > > There are lots of them.
> > > >
> > >
> > > It at least means we can survive to run actual C code in
> > > do_machine_check(), which lets us try to mitigate this issue further.
> > > PeterZ has patches for that, and maybe this series fixes it later on.
> > > (I'm reading in order!)
> >
> > Yeah, I don't cover that either. Making the kernel completely kprobe
> > safe is _lots_ more work I think.
> >
> > We really need some form of automation for this :/ The current situation
> > is completely nonsatisfactory.
>
> I've looked at too many patches lately and lost track a bit of which
> is which. Shouldn't a simple tracing_disable() or similar in
> do_machine_check() be sufficient?

It entirely depends on what the goal is :-/ On the one hand I see why
people might want function tracing / kprobes enabled, OTOH it's all
mighty frigging scary. Any tracing/probing/whatever on an MCE has the
potential to make a bad situation worse -- not unlike the same on #DF.

The same with that compiler instrumentation crap; allowing kprobes on
*SAN code has the potential to inject probes in arbitrary random code.
At the same time, if you're running a kernel with that on and injecting
kprobes in it, you're welcome to own the remaining pieces.

How far do we want to go? At some point I think we'll have to give
people rope, show them the knot and leave them be.

> We'd maybe want automation to check
> everything before it. We still need to survive hitting a kprobe int3,
> but that shouldn't have recursion issues.

Right, so I think avoiding the obvious recursion issues is a more
tractable problem and yes some 'safe' spot annotation should be enough
to get automation working for that -- mostly.
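
For illustration only, and not taken from this patch series: the
"obvious recursion" case discussed above can be modelled in plain
userspace C with a re-entrancy guard. Every name here (in_probe,
probe_hit, handler, handled, skipped) is hypothetical, and the sketch
assumes a single probe whose handler ends up in code that is itself
probed. It says nothing about how the kernel implements this.

```c
#include <stdbool.h>

/*
 * Userspace model of a probe re-entrancy guard. All names are
 * hypothetical; this is not kernel code and not part of the patches.
 */
static _Thread_local bool in_probe;	/* true while a handler runs */
static int handled;			/* probe hits that ran the handler */
static int skipped;			/* nested hits dropped by the guard */

static void probe_hit(void);

/* The handler calls into code that is itself probed. */
static void handler(void)
{
	probe_hit();
}

/* Entry point taken whenever a probed location is executed. */
static void probe_hit(void)
{
	if (in_probe) {
		/* Nested hit: count it and bail out instead of recursing. */
		skipped++;
		return;
	}

	in_probe = true;
	handled++;
	handler();
	in_probe = false;
}
```

Calling probe_hit() once from outside leaves handled at 1 and skipped
at 1: the outer hit runs the handler and the nested hit is dropped.
Without the in_probe check the same call would recurse until the stack
overflows.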