From: Kees Cook
Date: Thu, 6 Oct 2016 16:05:59 -0700
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers
To: Linus Torvalds
Cc: Willy Tarreau, Paul Gortmaker, Johannes Weiner, Andrew Morton, Antonio SJ Musumeci, Miklos Szeredi, Linux Kernel Mailing List, stable
References: <20161005054407.GC7297@1wt.eu> <20161005190604.GA8116@1wt.eu>

On Thu, Oct 6, 2016 at 3:29 PM, Linus Torvalds wrote:
> On Thu, Oct 6, 2016 at 3:07 PM, Kees Cook wrote:
>> The "cleanest" way to handle it seemed to be the lock-busting logic
>> already built into BUG, so I moved to that.
>
> Heh. The lock-busting logic in BUG() has always been broken. It's been
> random hacks. It doesn't actually work in any general case, it just
> occasionally happens to get things right. Mostly it tries to handle
> the console locking (the whole "oops_in_progress" magic) so that if
> you have a BUG_ON() in bad areas, at least you still end up getting
> output.

It seems to handle other things too, file descriptors, I think? Some
giant warning, I think about fds, went away when I switched from
do_exit() to BUG(). I'd have to go look more closely.

> But no, it's not reliable in any way, shape or form. That's really why
> you want to continue after a BUG().

Yeah, agreed about the unreliability. It's why I'm a fan of panic_on_oops.
:P (Except when doing lots of tests under lkdtm, then I like having
multiple Oopses without rebooting, but perhaps that is literally the
only use-case...)

>> By far the most problematic is "stop kernel execution from
>> continuing", but that's currently the behavior that BUG depends on, so
>> replacing BUG with anything needs to either fix the surrounding logic
>> to fail sanely or we have the keep the feature.
>
> Well, I'm not sure how much we actually end up depending on it,
> considering that we now have two examples of BUG() implementations
> that actually do _not_ depend on stopping execution: both the sound
> subsystem and the XFS version of BUG_ON() end up not actually doing
> the BUG() thing.

Yeah, for sure. I didn't mean to imply they all depended on it, just
that finding those that do will require manual inspection. We'll not
be able to do a flag-day on BUG until we fix everything.

-Kees

-- 
Kees Cook
Nexus Security