Received: by 2002:a05:6a10:2785:0:0:0:0 with SMTP id ia5csp2261702pxb; Mon, 11 Jan 2021 05:27:51 -0800 (PST) X-Google-Smtp-Source: ABdhPJx1CEHxYvBJPTDPEH6dFHa66/gRt9cdQMBdkPwFJUkmWGuMTPDbgL0bjCjC/D0Z0g1a0IAh X-Received: by 2002:a05:6402:8d9:: with SMTP id d25mr13878602edz.278.1610371670865; Mon, 11 Jan 2021 05:27:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1610371670; cv=none; d=google.com; s=arc-20160816; b=v/sfD/YcmsRGapKiqS0uujszgJIsqt6AE9ycuuC6RMHiT+jm7ML3wHuJYPuN538VAt gLYOLVHfZ5e71OB5SqAEyb/QWbfVd+Bvl24KinxKeQk1iCPCIZgAQI2FuHZvdC5ZmJcN 9t+1PJVON6KBczgAEZON0t3subwvvKRuCojGo0xhrpNLLCKTbJC1z1uIMTAjBk93/iVx kBmGYABnuV5GOGOGJA5EndLq7MK1lwqUCWJwaPVkKxn3g4DjFZ3uC8NOKw1KEli5nx2Q MwiCsJ9q/cVETcdVlmHGnbRLB9GTwpOlqxZmlpeeIgfCUQTeSrdg12Am6SsYZoLRBnB8 Vf1A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=vWeHZFXE08Dq7yKNRUvRJBp3yyrVV/xAgttu4h7GWFg=; b=QPQbTJWpZpZL6PYT1VCcnl0z2+G6xOeRgexFigeFMwYeND7uak8OIPp975hQMkxvyw H5ZxhGgzS78ne+j3fX8TqMwRIKa0GIBjRoUyYcy2hQFsj00nVJHaB0a/njL6qauF0icc DYF0tbhaWLbRzDi8Vkf4K3XxAclUcOZ1sJRBimtht4TbyE60wWpf0/KmTJ/2Fbix5zba +E4dHa79e2FbDs4+we+x3yAh2B9m4ak5AcLTDHT+O4Q2Js9yWelpk5EIoLbQQqaN4hsG T0rvYIAFXWYsau2IcspVYoLW1JSIrspjMqygj2J3GF8omR09n28A3ImCuK28uLVR+Q6o ifPA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=DeinzU2c; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l26si6913798edv.316.2021.01.11.05.27.26; Mon, 11 Jan 2021 05:27:50 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=DeinzU2c; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732702AbhAKNRu (ORCPT + 99 others); Mon, 11 Jan 2021 08:17:50 -0500 Received: from mail.kernel.org ([198.145.29.99]:36406 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732677AbhAKNRq (ORCPT ); Mon, 11 Jan 2021 08:17:46 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 5660922C7E; Mon, 11 Jan 2021 13:17:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1610371024; bh=Icqj3NnDquF0UPI8M7/V2n9LlkLHdSpbKK/t5/0sNfM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DeinzU2c66RDnkqgt19zPypHy7ZrlXJFzO5KLb0R5bTw7ZRqMtnmuHUQ104AIvUPG 8GwM6n7CdqicfePJ9h0GrLhA6PqaW+jJF++YRwqEpZVEuJEsvqlDIkD9Z++ccs69VX FOYeuut9ftHFGjYNffbOc2ap1O7XQVCAdItUwre8= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com, Hugh Dickins , Andrew Morton , Matthew Wilcox , stable@kernel.org, Linus Torvalds Subject: [PATCH 5.10 109/145] mm: make wait_on_page_writeback() wait for multiple pending writebacks Date: Mon, 11 Jan 2021 14:02:13 +0100 Message-Id: <20210111130053.764396270@linuxfoundation.org> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210111130048.499958175@linuxfoundation.org> References: <20210111130048.499958175@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Linus Torvalds commit c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 upstream. Ever since commit 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") we've had some very occasional reports of BUG_ON(PageWriteback) in write_cache_pages(), which we thought we already fixed in commit 073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)"). But syzbot just reported another one, even with that commit in place. And it turns out that there's a simpler way to trigger the BUG_ON() than the one Hugh found with page re-use. It all boils down to the fact that the page writeback is ostensibly serialized by the page lock, but that isn't actually really true. Yes, the people _setting_ writeback all do so under the page lock, but the actual clearing of the bit - and waking up any waiters - happens without any page lock. This gives us this fairly simple race condition: CPU1 = end previous writeback CPU2 = start new writeback under page lock CPU3 = write_cache_pages() CPU1 CPU2 CPU3 ---- ---- ---- end_page_writeback() test_clear_page_writeback(page) ... delayed... lock_page(); set_page_writeback() unlock_page() lock_page() wait_on_page_writeback(); wake_up_page(page, PG_writeback); .. wakes up CPU3 .. BUG_ON(PageWriteback(page)); where the BUG_ON() happens because we woke up the PG_writeback bit becasue of the _previous_ writeback, but a new one had already been started because the clearing of the bit wasn't actually atomic wrt the actual wakeup or serialized by the page lock. The reason this didn't use to happen was that the old logic in waiting on a page bit would just loop if it ever saw the bit set again. The nice proper fix would probably be to get rid of the whole "wait for writeback to clear, and then set it" logic in the writeback path, and replace it with an atomic "wait-to-set" (ie the same as we have for page locking: we set the page lock bit with a single "lock_page()", not with "wait for lock bit to clear and then set it"). However, out current model for writeback is that the waiting for the writeback bit is done by the generic VFS code (ie write_cache_pages()), but the actual setting of the writeback bit is done much later by the filesystem ".writepages()" function. IOW, to make the writeback bit have that same kind of "wait-to-set" behavior as we have for page locking, we'd have to change our roughly ~50 different writeback functions. Painful. Instead, just make "wait_on_page_writeback()" loop on the very unlikely situation that the PG_writeback bit is still set, basically re-instating the old behavior. This is very non-optimal in case of contention, but since we only ever set the bit under the page lock, that situation is controlled. Reported-by: syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com Fixes: 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") Acked-by: Hugh Dickins Cc: Andrew Morton Cc: Matthew Wilcox Cc: stable@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- mm/page-writeback.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2826,7 +2826,7 @@ EXPORT_SYMBOL(__test_set_page_writeback) */ void wait_on_page_writeback(struct page *page) { - if (PageWriteback(page)) { + while (PageWriteback(page)) { trace_wait_on_page_writeback(page, page_mapping(page)); wait_on_page_bit(page, PG_writeback); }