Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753522AbcJDJc1 (ORCPT ); Tue, 4 Oct 2016 05:32:27 -0400 Received: from gum.cmpxchg.org ([85.214.110.215]:57442 "EHLO gum.cmpxchg.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752605AbcJDJcZ (ORCPT ); Tue, 4 Oct 2016 05:32:25 -0400 Date: Tue, 4 Oct 2016 11:32:16 +0200 From: Johannes Weiner To: Linus Torvalds Cc: Andrew Morton , Antonio SJ Musumeci , Miklos Szeredi , Linux Kernel Mailing List , stable Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers Message-ID: <20161004093216.GA21170@cmpxchg.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.7.0 (2016-08-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3543 Lines: 83 Hi Linus, On Mon, Oct 03, 2016 at 09:00:55PM -0700, Linus Torvalds wrote: > Johannes? Please make this your first priority. And in the meantime I > will make that VM_BUG_ON() be a VM_WARN_ON_ONCE(). I'm sorry about the trouble. I'll look into where the underflow comes from that triggers this new check, and thanks for fixing it up in the meantime. > And dammit, if anybody else feels that they had done "debugging > messages with BUG_ON()", I would suggest you > > (a) rethink your approach to programming > > (b) send me patches to remove the crap entirely, or make them real > *DEBUGGING* messages, not "kill the whole machine" messages. Yeah, it's what CONFIG_DEBUG_VM effectively is :( There are a few instances where it adds additional output to dmesg or a stat file, but the vast majority of uses is VM_BUG_ON(), VM_BUG_ON_PAGE() etc., which all mean "if anything is unexpected, stop in your tracks and drop me into a kdump kernel." People do think of it as a development tool rather than verbose diagnostics in production kernels, so I assume none of the conditions protected by that config option are curated for whether the machine could or should continue to limp along after they trigger, or at least much less so than the bare BUG vs WARN instances. But I agree that most, if not all of these checks could warn under a ratelimit and be more useful for bleeding edge situations like the fedora kernel than the binary thinking of development kernels and production kernels as separate things with no overlap. And kernel people can set panic_on_warn if they want quick death and/or kdump. In the workingset code, if we detect radix tree nodes in a state in which they shouldn't be on the shadow node LRU, we could simply warn, abort the reclaim process and leave them off the LRU. Something like the below patch. I hate what it does to the codeflow, though, I don't want half of the code to be clutter that handles shouldnt-bes. But it also feels wrong to warn about a corrupted node and then keep working on it against who knows what state. I wonder if that's the fundamental dilemma that keeps us sniffing that assert() glue... diff --git a/mm/workingset.c b/mm/workingset.c index 617475f529f4..5f07db171c03 100644 --- a/mm/workingset.c +++ b/mm/workingset.c @@ -418,23 +418,28 @@ static enum lru_status shadow_lru_isolate(struct list_head *item, * no pages, so we expect to be able to remove them all and * delete and free the empty node afterwards. */ - BUG_ON(!workingset_node_shadows(node)); - BUG_ON(workingset_node_pages(node)); + if (WARN_ON_ONCE(!workingset_node_shadows(node))) + goto out_invalid; + if (WARN_ON_ONCE(workingset_node_pages(node))) + goto out_invalid; for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) { if (node->slots[i]) { - BUG_ON(!radix_tree_exceptional_entry(node->slots[i])); + if (WARN_ON_ONCE(!radix_tree_exceptional_entry(node->slots[i]))) + goto out_invalid; node->slots[i] = NULL; workingset_node_shadows_dec(node); - BUG_ON(!mapping->nrexceptional); + if (WARN_ON_ONCE(!mapping->nrexceptional)) + goto out_invalid; mapping->nrexceptional--; } } - BUG_ON(workingset_node_shadows(node)); + if (WARN_ON_ONCE(workingset_node_shadows(node))) + goto out_invalid; inc_node_state(page_pgdat(virt_to_page(node)), WORKINGSET_NODERECLAIM); - if (!__radix_tree_delete_node(&mapping->page_tree, node)) - BUG(); + __radix_tree_delete_node(&mapping->page_tree, node); +out_invalid: spin_unlock(&mapping->tree_lock); ret = LRU_REMOVED_RETRY; out: