From: Linus Torvalds
Date: Wed, 25 Nov 2020 13:30:20 -0800
Subject: Re: kernel BUG at fs/ext4/inode.c:LINE!
To: Hugh Dickins
Cc: Matthew Wilcox, Jan Kara, syzbot, Andreas Dilger,
    Ext4 Developers List, Linux Kernel Mailing List, syzkaller-bugs,
    "Theodore Ts'o", Linux-MM, Oleg Nesterov, Andrew Morton,
    "Kirill A. Shutemov", Nicholas Piggin, Alex Shi, Qian Cai,
    Christoph Hellwig, "Darrick J. Wong", William Kucharski, Jens Axboe,
    linux-fsdevel, linux-xfs
X-Mailing-List: linux-ext4@vger.kernel.org
References: <000000000000d3a33205add2f7b2@google.com>
    <20200828100755.GG7072@quack2.suse.cz>
    <20200831100340.GA26519@quack2.suse.cz>
    <20201124121912.GZ4327@casper.infradead.org>
    <20201124183351.GD4327@casper.infradead.org>
    <20201124201552.GE4327@casper.infradead.org>

On Tue, Nov 24, 2020 at 3:24 PM Linus Torvalds wrote:
>
> I've applied your second patch (the smaller one that just takes a ref
> around the critical section). If somebody comes up with some great
> alternative, we can always revisit this.

Hmm.

I'm not sure about "great alternative", but it strikes me that we
*could* move the clearing of the PG_writeback bit _into_
wake_up_page_bit(), under the page waitqueue lock.

IOW, we could make the rule be that the bit isn't actually cleared
before calling wake_up_page() at all, and we'd clear it with something
like

        unsigned long flags = READ_ONCE(page->flags);

        // We can clear PG_writeback directly if PG_waiters isn't set
        while (!(flags & (1ul << PG_waiters))) {
                unsigned long new = flags & ~(1ul << PG_writeback);
                // PG_writeback was already clear??!!?
                if (WARN_ON_ONCE(new == flags))
                        return;
                new = cmpxchg(&page->flags, flags, new);
                if (likely(flags == new))
                        return;
                flags = new;
        }
        // Otherwise, clear the bit at the end - but under the
        // page waitqueue lock - inside wake_up_page_bit()
        return wake_up_page_bit(..);

instead.

That would basically make the bit clearing atomic wrt the PG_waiters
flags - either using that atomic cmpxchg, or by doing it under the
page queue lock so that it's atomic wrt any new waiters.

This seems conceptually like the right thing to do - and it would also
make the (fair) exclusive lock hand-off case atomic too, because the
bit we're waking up on would never be cleared if it gets handed off
directly.

The above is entirely untested crap written in my MUA, and obviously
requires that all callers of wake_up_page() be moved to that new world
order, but I think we only have two cases: unlock_page() and
end_page_writeback().
And unlock_page() already has that
"clear_bit_unlock_is_negative_byte()" special case that is an ugly
special case of PG_waiters atomicity. So we'd get rid of that, because
the cmpxchg loop would be the better model.

I'm not sure I'm willing to write and test the real patch, but it
doesn't look _too_ nasty from just looking at the code. The bookmark
thing makes it important to only actually clear the bit at the end (as
does the handoff case anyway), but the way wake_up_page_bit() is
written, that's actually very straightforward - just after the
while-loop. That's when we've woken up everybody.

So I'm sending this idea out to see if somebody can shoot it down, or
even wants to possibly even try to do it..

               Linus
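
For anyone who wants to poke at the cmpxchg-loop idea outside the kernel,
here is a rough userspace model of it. This is only a sketch under stated
assumptions, not a patch: it swaps the kernel's cmpxchg() for C11
<stdatomic.h>, the bit numbers and every model_* name are made up for
illustration, and a trivial spinlock stands in for the page waitqueue lock.
No waiters are actually queued or woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers for this model; not the kernel's values. */
enum { MODEL_PG_WAITERS = 0, MODEL_PG_WRITEBACK = 1 };

#define MODEL_BIT(nr) (1ul << (nr))

/* Stand-in for the page waitqueue lock (assumption: a plain spinlock). */
static atomic_flag model_waitqueue_lock = ATOMIC_FLAG_INIT;

/*
 * Fast path: clear the writeback bit with a compare-and-swap loop,
 * but only while the waiters bit is not set. Returns true if no
 * waiter was seen, false if the caller must take the locked path.
 */
static bool model_clear_writeback_fast(atomic_ulong *word)
{
        unsigned long flags = atomic_load(word);

        while (!(flags & MODEL_BIT(MODEL_PG_WAITERS))) {
                unsigned long new = flags & ~MODEL_BIT(MODEL_PG_WRITEBACK);

                /* Already clear: the kernel sketch would warn here. */
                if (new == flags)
                        return true;
                /* On failure, 'flags' is refreshed with the current value. */
                if (atomic_compare_exchange_weak(word, &flags, new))
                        return true;
        }
        return false;
}

/*
 * Slow path: a waiter exists, so clear the bit only while holding
 * the lock that new waiters would also have to take.
 */
static void model_clear_writeback_locked(atomic_ulong *word)
{
        while (atomic_flag_test_and_set(&model_waitqueue_lock))
                ; /* spin until the lock is free */

        /* The bit must still be set when we get here. */
        assert(atomic_load(word) & MODEL_BIT(MODEL_PG_WRITEBACK));
        /* A real implementation would wake the waiters first. */
        atomic_fetch_and(word, ~MODEL_BIT(MODEL_PG_WRITEBACK));

        atomic_flag_clear(&model_waitqueue_lock);
}

/* Returns true if the fast path was enough, false if the lock was taken. */
bool model_end_writeback(atomic_ulong *word)
{
        if (model_clear_writeback_fast(word))
                return true;
        model_clear_writeback_locked(word);
        return false;
}
```

Calling model_end_writeback() on a word with only the writeback bit set
should take the fast path; with the waiters bit also set it should fall
through to the locked path and leave the waiters bit alone. Whether the
waiters bit itself gets cleared afterwards is left out of this model on
purpose.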