Date: Wed, 2 Dec 2020 03:43:08 +0000
From: Matthew Wilcox
To: Dan Williams
Cc: "Shutemov, Kirill", Linux Kernel Mailing List, Linux MM, linux-nvdimm, Vlastimil Babka, Yi Zhang
Subject: Re: mapcount corruption regression
Message-ID: <20201202034308.GD11935@casper.infradead.org>
References: <20201201022412.GG4327@casper.infradead.org> <20201201204900.GC11935@casper.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 01, 2020 at 06:28:45PM -0800, Dan Williams wrote:
> On Tue, Dec 1, 2020 at 12:49 PM Matthew Wilcox wrote:
> >
> > On Tue, Dec 01, 2020 at 12:42:39PM -0800, Dan Williams wrote:
> > > On Mon, Nov 30, 2020 at 6:24 PM Matthew Wilcox wrote:
> > > >
> > > > On Mon, Nov 30, 2020 at 05:20:25PM -0800, Dan Williams wrote:
> > > > > Kirill, Willy, compound page experts,
> > > > >
> > > > > I am seeking some debug ideas about the following splat:
> > > > >
> > > > > BUG: Bad page state in process lt-pmem-ns pfn:121a12
> > > > > page:0000000051ef73f7 refcount:0 mapcount:-1024
> > > > > mapping:0000000000000000 index:0x0 pfn:0x121a12
> > > >
> > > > Mapcount of -1024 is the signature of:
> > > >
> > > > #define PG_guard 0x00000400
> > >
> > > Oh, thanks for that. I overlooked how mapcount is overloaded. Although
> > > in v5.10-rc4 that value is:
> > >
> > > #define PG_table 0x00000400
> >
> > Ah, I was looking at -next, where Roman renumbered it.
> >
> > I know UML had a problem where it was not clearing PG_table, but you
> > seem to be running on bare metal. SuperH did too, but again, you're
> > not using SuperH.
> >
> > > >
> > > > (the bits are inverted, so this turns into 0xfffffbff which is reported
> > > > as -1024)
> > > >
> > > > I assume you have debug_pagealloc enabled?
> > >
> > > Added it, but no extra spew. I'll dig a bit more on how PG_table is
> > > not being cleared in this case.
> >
> > I only asked about debug_pagealloc because that sets PG_guard. Since
> > the problem is actually PG_table, it's not relevant.
>
> As a shot in the dark I reverted:
>
> b2b29d6d0119 mm: account PMD tables like PTE tables
>
> ...and the test passed.

That's not really surprising ... you're still freeing PMD tables without
calling the destructor, which means that you're leaking ptlocks on
configs that can't embed the ptlock in the struct page.

I suppose it shows that you're leaking a PMD table rather than a PTE
table, so that might help track it down. Checking for PG_table in
free_unref_page() and calling show_stack() will probably help more.