Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp403057pxu; Tue, 5 Jan 2021 14:34:08 -0800 (PST) X-Google-Smtp-Source: ABdhPJzdzahGYoAnudyjy6/lqqE0czcoKSmR0+/Gbw5I43Y6QtvwOB5/XcOHwgEvsOFQwRRbKdn4 X-Received: by 2002:a17:906:b79a:: with SMTP id dt26mr980399ejb.337.1609886048359; Tue, 05 Jan 2021 14:34:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1609886048; cv=none; d=google.com; s=arc-20160816; b=vVAl+8B3dyV8VKzEs3TStj2e8XfPR3vTp46U0xKbg19i3yFw3YfKliVPJgUZCshIAD 811bcRcjeASvl4+ANtyDEFvhu0w8sca4Kql1aQ1VfvT4TA33V05wSg0nhNkPzTeqMBC7 HYqq40tMPhWBHcKf4vCBqcCVSU6Z/taxUKz2GMLyrIuGWFElUR3svVXTZiFSTmr/FGTf UYRr5lv4IkW96QGcuzFIf8/87VxPewgpPE78yQnActzCZUIpRP8t5jGgCNFF9vuTDh9x tsUmDw570SCl7UjT1J3yj3rcIROwgusXIs61xez/ON8lMBUvL4CgoVBNn39DnYLv/rQB NYbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=z1FwgA8Ww/7IgT5bjT946IYpX2R2TC69Q6b+o2dVbrM=; b=o/ZoOegz5l0s9+DZ75spHI53l8AQQbPl4O+308eAQ9w6DPk7SJXYvdc4hNLo4fNiT/ CcUKKgx/O7gxFNbm59jhG9VQuKk/TDl4AmHA78ICXnAgncLaAcfd9va66YpmUm30nN2I hFL0aY98wXIMliSFrVkb8zJdql8NUnQwUbMvtG7keJa5dxbNHFUUjdgN7EVlsXIUjB2x Ne4h9Foe82cDS0loP+XvqDGonxnojaJiWgn1SamvOIyN5HS2SPNWQPMpvpeAE19zFuDQ 1B3uPqEFFNZ2wudc1VWilPEZobpfyQciQVo4UXNss1Tj+DMOarglrLxWddpq3j7jBzVG gndA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=hA6nC4lK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id d1si186952edy.296.2021.01.05.14.33.44; Tue, 05 Jan 2021 14:34:08 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=hA6nC4lK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730456AbhAETci (ORCPT + 99 others); Tue, 5 Jan 2021 14:32:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48138 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727742AbhAETch (ORCPT ); Tue, 5 Jan 2021 14:32:37 -0500 Received: from mail-lf1-x12f.google.com (mail-lf1-x12f.google.com [IPv6:2a00:1450:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C86AC061574 for ; Tue, 5 Jan 2021 11:31:57 -0800 (PST) Received: by mail-lf1-x12f.google.com with SMTP id h205so1122121lfd.5 for ; Tue, 05 Jan 2021 11:31:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=z1FwgA8Ww/7IgT5bjT946IYpX2R2TC69Q6b+o2dVbrM=; b=hA6nC4lKLxKrUs7MpCZCbI/RF0tI9dTZsLbOuov9YMyi2/eXblwkza4XzkyzrEciUj crhUOeNHz05e8YTS0MYujgw/1O+8VFW3/DqDYUINeFKBBLNL9kILdj3grIay+vTfxBP1 dk/siDfsguJmdFkH8uAtals9Ndiz3nqLvVNcw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=z1FwgA8Ww/7IgT5bjT946IYpX2R2TC69Q6b+o2dVbrM=; b=tWn25izFKA+7mhLFlaDo+SZjTF25giI4HBtwhTz+2yha3FSIPbEPMeuZ40PZ+6R2jw CCBnMX4kPgaRQzUR5uIVbp7jO5duP0AI9VJLV7wY5p9tGPRzelWnl2/g3ihXPRgoA0I7 ui69mDzLWm39Hv6kpIJNSY5CR5OqmlveWmM0QSG0JJdkJni++KGeEi8zcVi5XCGJLP+u ywvIVTP9MWVGc/NaTLelQg+IO4iMoJKlz7aO6/+1d1PMOvvoyBZ1nj2rYB6A2J3nBFjX +6dQyjZIE8O0Mys8bdLuxeZ0ZEBZIwMfAp7JXFGxVp2lQ33uu2rM7Rmgqk8Wz3AIHJnx IUhw== X-Gm-Message-State: AOAM5331mGRDIZYUnM7kUiWf7EOUu2vWn8i7nMStyeKmCRqrrsHf5sTt 4+kJ+0as1CbYclcmjONCxhtsLThK48ki6g== X-Received: by 2002:a19:c508:: with SMTP id w8mr365434lfe.658.1609875115516; Tue, 05 Jan 2021 11:31:55 -0800 (PST) Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com. [209.85.167.53]) by smtp.gmail.com with ESMTPSA id l17sm9423lfg.205.2021.01.05.11.31.53 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 05 Jan 2021 11:31:54 -0800 (PST) Received: by mail-lf1-f53.google.com with SMTP id 23so1051416lfg.10 for ; Tue, 05 Jan 2021 11:31:53 -0800 (PST) X-Received: by 2002:a2e:b4af:: with SMTP id q15mr470264ljm.507.1609875113609; Tue, 05 Jan 2021 11:31:53 -0800 (PST) MIME-Version: 1.0 References: <000000000000886dbd05b7ffa8db@google.com> <20210104124153.0992b1f7fd1a145e193a333f@linux-foundation.org> In-Reply-To: From: Linus Torvalds Date: Tue, 5 Jan 2021 11:31:37 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: kernel BUG at mm/page-writeback.c:LINE! To: Hugh Dickins Cc: Andrew Morton , Matthew Wilcox , syzbot , Linux Kernel Mailing List , Linux-MM , syzkaller-bugs Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jan 4, 2021 at 7:29 PM Hugh Dickins wrote: > > > But I feel it's really that end_page_writeback() itself is > > fundamentally buggy, because the "wakeup" is not atomic with the bit > > clearing _and_ it doesn't actually hold the page lock that is > > allegedly serializing this all. > > And we won't be adding a lock_page() into end_page_writeback()! Right. However, the more I think about this, the more I feel end_page_writeback() is kind of ok, and the real problem is that we should have had a "lock/unlock" pattern for the PG_writeback bit instead. Instead, what we have is basically a special "wait for bit to clear", and then external sychronization - even if lacking - for set/clear bit. And the "wait for bit to clear" and "set bit" aren't even adjacent - the waiting happens in write_cache_pages(), but then the actual setting happens later in the actual "(*writepage)()" function. What _would_ be nicer, I feel, is if write_cache_pages() simply did the equivalent of "lock_page()" except for setting the PG_writeback bit. The whole "wait for bit to clear" is fundamentally a much more ambiguous thing for that whole "what if somebody else got it" reason, but it's also nasty because it can be very unfair (the same way our "lock_page()" itself used to be very unfair). IOW, you can have that situation where one actor continually waits for the bit to clear, but then somebody else comes in and gets the bit instead. Of course, writeback is much less likely than lock_page(), so it probably doesn't happen much in practice. And honestly, while I think it would be good to make the rule be "writepage function was entered with the writeback bit already held", that kind of change would be a major pain. We have at leastr 49 different 'writepage' implementations, and while I think a few of them end up just being block_write_full_page(), it means that we have a lot of that BUG_ON(PageWriteback(page)); set_page_writeback(page); pattern in various random filesystem code that all would have to be changed. So I think I know what I'd _like_ the code to be like, but I don't think it's worth it. > > So the one-liner of changing the "if" to "while" in > > wait_on_page_writeback() should get us back to what we used to do. > > I think that is the realistic way to go. Yeah, that's what I'll do. > > Which makes me think that the best option would actually be to move > > the bit clearing _into_ wake_up_page(). But that looks like a very big > > change. See above - having thought about this overnight, I don't think this is even the best change. > But I expect you to take one look and quickly opt instead for the "while" > in wait_on_page_writeback(): which is not so bad now that you've shown a > scenario which justifies it. [ patch removed - I agree, this pain isn't worth it ] Linus