Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755427AbcJFCAD (ORCPT ); Wed, 5 Oct 2016 22:00:03 -0400 Received: from ipmail06.adl6.internode.on.net ([150.101.137.145]:3023 "EHLO ipmail06.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752774AbcJFCAC (ORCPT ); Wed, 5 Oct 2016 22:00:02 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CqKAAMr/VXEJqYLHleHAEBBAEBCgEBgz0BAQEBAR5XfIJ5oBIBAQEBAQEGgRqLe4NxhkgihXgCAgEBAoFwTQECAQEBAQECBgEBAQEBAQEBN0CEYQEBAQMBOhwjBQsIAxgJJQ8FJQMHGhMbiCsHD7x7AQEBAQEBAQMBAQEBAR4FHoVUhR+CY4UUgi8FjjeFaoVZhieJRo9/SYwtg36BEwUHhG4qNIg2AQEB Date: Thu, 6 Oct 2016 12:59:57 +1100 From: Dave Chinner To: Linus Torvalds Cc: Paul Gortmaker , Johannes Weiner , Andrew Morton , Antonio SJ Musumeci , Miklos Szeredi , Linux Kernel Mailing List , stable Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers Message-ID: <20161006015957.GB9806@dastard> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2906 Lines: 71 On Tue, Oct 04, 2016 at 08:29:00PM -0700, Linus Torvalds wrote: > On Tue, Oct 4, 2016 at 7:43 PM, Paul Gortmaker > wrote: > > > > A couple years ago Ingo had an idea to kill BUG_ON abuse that I made > > a 1st pass at. Back then it seemed nobody cared. Maybe that has since > > changed? > > > > https://lkml.org/lkml/2014/4/30/359 > > So we actually have the checkpatch warning already: > > # avoid BUG() or BUG_ON() > if ($line =~ /\b(?:BUG|BUG_ON)\b/) { > my $msg_type = \&WARN; > $msg_type = \&CHK if ($file); > &{$msg_type}("AVOID_BUG", > "Avoid crashing the kernel - try > using WARN_ON & recovery code rather than BUG() or BUG_ON()\n" . > $herecurr); > } > > but it doesn't trigger on VM_BUG_ON(). > > And I'm not convinced about replacing things with BUG_ON_AND_HALT(), > it simply doesn't fix the existing issue we have: people use BUG_ON(), > and worse, _when_ they use BUG_ON(), they use it instead of error > handling, so the code _around_ the BUG_ON() tends to then very much > depend on what the BUG_ON() checks. > > This is actually one way that VM_BUG_ON() is better: it's very much by > design something that can be compiled away, so at least hopefully > nobody thinks of it as a security measure. So we could just say that > we will treat VM_BUG_ON() as a WARN_ON_ONCE(), and just not kill the > machine. In XFS, we use ASSERT() (could be XFS_BUG_ON() for all that the name matters) but we only define that to BUG_ON if CONFIG_XFS_DEBUG=y. For "production debug" kernels we have CONFIG_XFS_WARN=y, which turns ASSERT() into WARN_ON(). We get the warnings, but none of the crashiness that are desirable in a development context. This is what distro debug kernels should be using, as it also ensures we don't build in the real debug code that does things that would affect prodution systems adversely, like randomly take different allocator paths to ensure we get code coverage of all the allocator algorithms... i.e. production kernels ship with neither set, the debug kernel ships with CONFIG_XFS_WARN=y, and we do all our development with CONFIG_XFS_DEBUG=y. I think this case falls into the "production debug" classification; we want a warning, but we don't want the system to be taken down.... > But apart from the checkpatch thing, it's actually a pretty big change. Yeah, that's why we added CONFIG_XFS_WARN=y to do this - it was a 20 line change to add XFS_CONFIG_WARN instead of having to audit and modify ~1800 call sites to do something differently. And because we know that ASSERT() is not present in all kernels, it isn't ever used as a replacement for error handling. Perhaps that's the simplest solution here as well.... Just my 2c worth. -Dave. -- Dave Chinner david@fromorbit.com