From: Linus Torvalds
Date: Wed, 5 Oct 2016 09:10:56 -0700
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers
To: Johannes Weiner
Cc: Andrew Morton, Antonio SJ Musumeci, Miklos Szeredi, Dave Jones,
    Oleg Nesterov, Dave Chinner, Michal Hocko, Jan Kara,
    Linux Kernel Mailing List, stable

On Wed, Oct 5, 2016 at 2:25 AM, Johannes Weiner wrote:
>
> Here is a reproducer that triggers the warning instantly for me:

Yup, confirmed. With the VM_WARN_ON_ONCE() it just gets a big nice
splat and the machine happily stays up.

> That radix tree node management needs some cleaning up. It probably
> makes sense to split node->count into actually separate members for
> clarity, and then add a root tag to distinguish shadows from regular
> entries in root->rnode. I have to think about this more, the current
> situation is too fragile and ugly.

Ugh. I even looked at the "node->count = 1" initialization in
radix_tree_extend(), and didn't react to it at all, it looked
obviously correct. This code is too subtle.

> But in the meantime, there is an obvious fix: don't ever store shadow
> entries in root->rnode, seeing as we need nodes for proper accounting.
>
> It means we temporarily lose the ability to detect refaults from
> single-page files, but it's probably better to keep the stable fix
> small and restore that functionality in a new release.
>
> Patch below. NOTE: I'm traveling without access to my test rig right
> now and so I have only lightly tested this on my laptop. I'm also
> jetlagged like crazy, so please triple check my thinking. The patch
> does fix the reproducer case and has otherwise been stable here.

Hmm. I'm inclined to just apply it and mark it for stable, along with
your other patch. But yes, this needs more thinking about (and
obviously testing). The interactions with the radix tree are too
subtle.

              Linus