From: Linus Torvalds
Date: Wed, 5 Oct 2016 15:29:50 -0700
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers
To: Kees Cook
Cc: Willy Tarreau, Paul Gortmaker, Johannes Weiner, Andrew Morton, Antonio SJ Musumeci, Miklos Szeredi, Linux Kernel Mailing List, stable
References: <20161005054407.GC7297@1wt.eu> <20161005190604.GA8116@1wt.eu>

On Wed, Oct 5, 2016 at 3:17 PM, Kees Cook wrote:
>
> With my more paranoid desires, I would prefer to keep "stop kernel
> execution with the state set up by this process", not just "make the
> process never return to user-space".

Quite honestly, I think the answer to that is: "No. Not by default".

So with some kind of kernel command line option, yes, kind of like
"reboot_on_oops" (or whatever it is - I've never used it ;)

>> And *if* we make BUG() actually do something sane (non-trapping), we
>> can easily make it be generic, not arch-specific. In fact, I'd
>> implement it by just adding a "handle_bug()" in kernel/panic.c...
>
> Yeah, I'm not sure what the right next step would be. Do we need a new
> set of functions between WARN and BUG? Or maybe extract the
> process-killing logic on a per-arch level and make it a specific API
> so that it can be explicitly called as part of error-handling?

Hmm.

So the process-killing logic actually used to historically just be
"call do_exit()".
In fact, that's what most architectures still do in their error paths.
And it's what a lot of people who just want to kill the current code
do. So calling "do_exit()" is actually perfectly fine.

It's just that calling do_exit() from BUG_ON() is a major pain, because
of the asynchronous nature of BUG_ON(). But if you are in a regular
system call and don't hold any locks, do_exit() is still fine.

In fact, all that x86 really does differently from do_exit() in the
fault path is to reset the stack pointer first, so that you don't get
stack smashers when you have recursive faults (which used to be one
really nasty failure case, not just with BUG_ON(), but with any kernel
oops in general).

So on x86, the crash code actually calls a function called
"rewind_stack_do_exit()" instead. But the name gives it away: it's the
exact same thing.

So you can actually do a generic BUG_ON() (even with the current
semantics) pretty much today by just having a config option that the
architecture can set to specify whether you should just call
"do_exit()" or "rewind_stack_do_exit()" to do that final killing
action.

There are a few other possible gotchas (the code is hard to follow
because the normal implementation uses a trapping instruction and hides
the BUG() information in the text, so you get the whole fault path),
but on the whole I think it should be fairly straightforward to just
get rid of all the arch code, and replace it with a generic function
that can then decide internally whether it wants to just warn, whether
it wants to SIGKILL, or whether it wants to do the traditional thing
and just force do_exit(). Or do new things like reboot or just halt.

But it really would be very nice to never have do_exit() have to worry
about odd callers. We've had a *lot* of trouble over the years with
deadlocks on critical locks in do_exit(), for example.

                Linus
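
The generic handler described above could be sketched roughly as
follows. This is a user-space illustration in C, not kernel code and
not a patch from this thread: the enums, the "arch_rewinds_stack"
parameter (standing in for the per-architecture config option) and
"handle_bug_sketch()" are all hypothetical names, and the function only
reports which action it would take instead of actually killing
anything.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical policy knob; nothing here is an existing kernel API. */
enum bug_policy {
	BUG_POLICY_WARN,	/* just warn and keep going */
	BUG_POLICY_SIGKILL,	/* kill the task via SIGKILL */
	BUG_POLICY_EXIT,	/* traditional: force the task to exit */
	BUG_POLICY_REBOOT,
	BUG_POLICY_HALT,
};

/* What the sketch decides to do; a real handler would act, not return. */
enum bug_action {
	BUG_ACT_CONTINUE,
	BUG_ACT_SEND_SIGKILL,
	BUG_ACT_DO_EXIT,
	BUG_ACT_REWIND_STACK_DO_EXIT,
	BUG_ACT_REBOOT,
	BUG_ACT_HALT,
};

/*
 * One generic decision point instead of per-arch trap handling.
 * arch_rewinds_stack is an assumed stand-in for a config option that an
 * architecture such as x86 would set to get rewind_stack_do_exit().
 */
enum bug_action handle_bug_sketch(enum bug_policy policy,
				  int arch_rewinds_stack,
				  const char *file, int line)
{
	fprintf(stderr, "BUG at %s:%d\n", file, line);

	switch (policy) {
	case BUG_POLICY_WARN:
		return BUG_ACT_CONTINUE;
	case BUG_POLICY_SIGKILL:
		return BUG_ACT_SEND_SIGKILL;
	case BUG_POLICY_EXIT:
		return arch_rewinds_stack ? BUG_ACT_REWIND_STACK_DO_EXIT
					  : BUG_ACT_DO_EXIT;
	case BUG_POLICY_REBOOT:
		return BUG_ACT_REBOOT;
	case BUG_POLICY_HALT:
		return BUG_ACT_HALT;
	}

	/* Not reached for valid policies; halt is the conservative fallback. */
	assert(0 && "unknown bug policy");
	return BUG_ACT_HALT;
}
```

With the exit policy, passing a non-zero "arch_rewinds_stack" selects
the rewind_stack_do_exit() style action and zero selects plain
do_exit(); the other policies ignore that flag.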