Date: Mon, 26 Oct 2020 10:49:48 +0100
From: Jan Kara
To: Matthew Wilcox
Cc: Qian Cai, Christoph Hellwig, "Darrick J. Wong",
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jens Axboe, linux-mm@kvack.org
Subject: Re: kernel BUG at mm/page-writeback.c:2241 [ BUG_ON(PageWriteback(page); ]
Message-ID: <20201026094948.GA29758@quack2.suse.cz>
References: <645a3f332f37e09057c10bc32f4f298ce56049bb.camel@lca.pw>
	<20201022004906.GQ20115@casper.infradead.org>
In-Reply-To: <20201022004906.GQ20115@casper.infradead.org>

On Thu 22-10-20 01:49:06, Matthew Wilcox wrote:
> On Wed, Oct 21, 2020 at 08:30:18PM -0400, Qian Cai wrote:
> > Today's linux-next starts to trigger this wondering if anyone has any clue.
>
> I've seen that occasionally too.  I changed that BUG_ON to VM_BUG_ON_PAGE
> to try to get a clue about it.  Good to know it's not the THP patches
> since they aren't in linux-next.
>
> I don't understand how it can happen.  We have the page locked, and then we do:
>
> 	if (PageWriteback(page)) {
> 		if (wbc->sync_mode != WB_SYNC_NONE)
> 			wait_on_page_writeback(page);
> 		else
> 			goto continue_unlock;
> 	}
>
> 	VM_BUG_ON_PAGE(PageWriteback(page), page);
>
> Nobody should be able to put this page under writeback while we have it
> locked ... right?  The page can be redirtied by the code that's supposed
> to be writing it back, but I don't see how anyone can make PageWriteback
> true while we're holding the page lock.

FWIW here's a very similar report for ext4 [1], and I strongly suspect this
started happening after Linus' rewrite of the page bit waiting logic. Linus
thinks it's a preexisting bug which just got exposed by his changes (which
is possible). I've been searching for a culprit for some time, but so far I
have failed. It's good to know it isn't ext4-specific, so we should be
searching in the generic code ;). So far I was concentrating more on the
ext4 bits...

								Honza

[1] https://lore.kernel.org/lkml/000000000000d3a33205add2f7b2@google.com/

-- 
Jan Kara
SUSE Labs, CR