From: Linus Torvalds
Date: Mon, 31 Aug 2020 11:21:50 -0700
Subject: Re: kernel BUG at fs/ext4/inode.c:LINE!
To: Jan Kara
Cc: syzbot, Andreas Dilger, Ext4 Developers List,
    Linux Kernel Mailing List, syzkaller-bugs, "Theodore Ts'o",
    Linux-MM, Oleg Nesterov
In-Reply-To: <20200831100340.GA26519@quack2.suse.cz>
X-Mailing-List: linux-ext4@vger.kernel.org

On Mon, Aug 31, 2020 at 3:03 AM Jan Kara wrote:
>
> On Fri 28-08-20 12:07:55, Jan Kara wrote:
> >
> > Doh, so this is:
> >
> >         wait_on_page_writeback(page);
> > >>>     BUG_ON(PageWriteback(page));
> >
> > in mpage_prepare_extent_to_map(). So we have PageWriteback() page after we
> > have called wait_on_page_writeback() on a locked page.
> > Not sure how this
> > could ever happen even less how ext4 could cause this...
>
> I was poking a bit into this and there were actually recent changes into
> page bit waiting logic by Linus. Linus, any idea?

So the main change is that now if somebody does a wake_up_page(), the
page waiter will be released - even if somebody else then set the bit
again (or possibly if the waker never cleared it!).

It used to be that the waiter went back to sleep.

Which really shouldn't matter, but if we had any code that did
something like

        end_page_writeback();
        .. something does set_page_writeback() on the page again ..

then the old BUG_ON() would likely never have triggered (because the
waiter would have seen the writeback bit being set again and gone back
to sleep), but now it will.

So I would suspect a pre-existing issue that was just hidden by the
old behavior and was basically impossible to trigger unless you hit
*just* the right timing.

And now it's easy to trigger, because the first time somebody clears
PG_writeback, the wait_on_page_writeback() will just return *without*
re-testing and *without* going back to sleep.

Could there be somebody who does set_page_writeback() without holding
the page lock?

Maybe adding a

        WARN_ON_ONCE(!PageLocked(page));

at the top of __test_set_page_writeback() might find something?

Note that it looks like this problem has been reported on Android
before according to that syzbot thing. I.e., this thing:

    https://groups.google.com/g/syzkaller-android-bugs/c/2CfEdQd4EE0/m/xk_GRJEHBQAJ

looks very similar, and predates the wake_up_page() changes.

So it was probably just much _harder_ to hit before, and got easier to hit.

Hmm. In fact, googling for

    mpage_prepare_extent_to_map "kernel BUG"

seems to find stuff going back years.
Here's a patchwork discussion where you had a debug patch to try to
figure it out back in 2016:

    https://patchwork.ozlabs.org/project/linux-ext4/patch/20161122133452.GF3973@quack2.suse.cz/

although that one seems to be a different BUG_ON() in the same area.
Maybe entirely unrelated, but the fact that this function shows up a
fair amount is perhaps a sign of some long-running issue..

              Linus
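
The waiter change described in the message above can be sketched as a
toy userspace model. This is not kernel code and is not from the
thread: struct toy_page, enum toy_waiter and
toy_waiter_returns_with_bit_set() are hypothetical names, and the
single hard-coded interleaving (waker clears the bit, somebody sets it
again, then the waiter runs) is an assumption chosen to match the
end_page_writeback() / set_page_writeback() sequence above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one page's PG_writeback bit; not kernel code. */
struct toy_page {
    bool writeback;
};

/* The two waiter behaviours contrasted in the message (hypothetical names). */
enum toy_waiter {
    TOY_RECHECK_AFTER_WAKE, /* old: re-test the bit, sleep again if it is set */
    TOY_RETURN_AFTER_WAKE   /* new: a single wake-up releases the waiter */
};

/*
 * Replays one fixed interleaving and reports whether the waiter returns
 * while the writeback bit is set, which is the condition under which
 * BUG_ON(PageWriteback(page)) would fire in the caller.
 */
bool toy_waiter_returns_with_bit_set(enum toy_waiter policy)
{
    /* The waiter is asleep because writeback is in progress. */
    struct toy_page page = { .writeback = true };
    bool woken = false;

    /* Waker: writeback ends, the bit is cleared and the waiter is woken. */
    page.writeback = false;
    woken = true;

    /* Somebody sets the bit again before the waiter gets to run. */
    page.writeback = true;

    /* The waiter finally runs; it only runs because it was woken. */
    assert(woken);
    if (policy == TOY_RECHECK_AFTER_WAKE) {
        /* Old behaviour: the bit is set, so go back to sleep. */
        while (page.writeback) {
            /* Model the second writeback completing while we sleep. */
            page.writeback = false;
        }
    }
    /* New behaviour falls through without re-testing the bit. */
    return page.writeback;
}
```

In this model, calling the function with TOY_RETURN_AFTER_WAKE yields
true (the waiter comes back with the bit still set), while
TOY_RECHECK_AFTER_WAKE yields false, which is consistent with the
suspicion that the old behaviour was hiding a pre-existing race rather
than preventing it.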