Date: Tue, 14 Jun 2022 15:58:37 +0200
From: Jan Kara
To: Petr Mladek
Cc: Alexandru Elisei, jack@suse.cz, sunjunchao2870@gmail.com,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, senozhatsky@chromium.org,
	rostedt@goodmis.org, john.ogness@linutronix.de, keescook@chromium.org,
	anton@enomsg.org, ccross@android.com, tony.luck@intel.com,
	heiko@sntech.de, linux-arm-kernel@lists.infradead.org,
	linux-rockchip@lists.infradead.org, maco@android.com, hch@lst.de,
	gregkh@linuxfoundation.org, jirislaby@kernel.org
Subject: Re: [BUG] rockpro64 board hangs in console_init() after commit 10e14073107d
Message-ID: <20220614135837.3doyrnekzja6grzc@quack3.lan>

On Tue 14-06-22 14:23:32, Petr Mladek wrote:
> On Mon 2022-06-13 17:54:35, Alexandru Elisei wrote:
> > Config can be found at [1] (expires after 6 months). I've also built the
> > kernel with gcc 10.3.1 [2] (aarch64-none-linux-gnu), same issue.
> >
> > I've bisected the build failure to commit 10e14073107d ("writeback: Fix
> > inode->i_io_list not be protected by inode->i_lock error"); I've confirmed
> > that that commit is responsible by successfully booting the board with a
> > kernel built from v5.19-rc2 + the above commit reverted.
>
> It is strange. I can't see how consoles are related to filesystem
> writeback.
>
> Anyway, the commit 10e14073107d ("writeback: Fix inode->i_io_list not
> be protected by inode->i_lock error") modifies some locking and
> might be source of possible deadlocks.

Yes, I've got other reports from ARM people that this commit causes issues
for them (kernel oops or so) so the locking changes are likely at fault...

> I am not familiar with the fs code. But I noticed the following.
> The patch adds:
>
> +	if (!was_dirty) {
> +		wb = locked_inode_to_wb_and_lock_list(inode);
> +		spin_lock(&inode->i_lock);
>
> And locked_inode_to_wb_and_lock_list() is defined this way:
>
> /**
>  * locked_inode_to_wb_and_lock_list - determine a locked inode's wb and lock it
>  * @inode: inode of interest with i_lock held
>  *
>  * Returns @inode's wb with its list_lock held.  @inode->i_lock must be
>  * held on entry and is released on return.  The returned wb is guaranteed
>  * to stay @inode's associated wb until its list_lock is released.
>  */
> static struct bdi_writeback *
> locked_inode_to_wb_and_lock_list(struct inode *inode)
> 	__releases(&inode->i_lock)
> 	__acquires(&wb->list_lock)
> {
> 	while (true) {
> 		struct bdi_writeback *wb = inode_to_wb(inode);
>
> 		/*
> 		 * inode_to_wb() association is protected by both
> 		 * @inode->i_lock and @wb->list_lock but list_lock nests
> 		 * outside i_lock.  Drop i_lock and verify that the
> 		 * association hasn't changed after acquiring list_lock.
> 		 */
> 		wb_get(wb);
> 		spin_unlock(&inode->i_lock);
>
> It expects that inode->i_lock is taken before. But the problematic
> commit takes it later. It might mess the lock and cause a deadlock.

No. AFAICS inode->i_lock is held on entry to
locked_inode_to_wb_and_lock_list(). The function releases it so we have to
grab it again. The locking is ugly here but correct in this regard. More
likely the problem has something to do with reordering the checks and
running locked_inode_to_wb_and_lock_list() on inodes for which we
previously didn't do it, but I have yet to fully understand why things
crash...

								Honza
-- 
Jan Kara
SUSE Labs, CR
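
For readers following the lock hand-off discussed above, here is a minimal
userspace sketch of the documented contract only: i_lock held on entry and
released on return, list_lock held on return, and the caller re-taking i_lock
afterwards. It is not the kernel code. Every toy_* name is hypothetical, the
"locks" are plain flags whose misuse trips an assert (a crude stand-in for
lock debugging), and the retry loop that re-checks the inode-to-wb association
is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a flag instead of a real spinlock, so misuse trips an assert. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* a double acquire would deadlock a real spinlock */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);	/* releasing a lock that is not held is a bug */
	l->held = false;
}

/* Hypothetical stand-ins for struct bdi_writeback and struct inode. */
struct toy_wb {
	struct toy_lock list_lock;
};

struct toy_inode {
	struct toy_lock i_lock;
	struct toy_wb *wb;
};

/*
 * Models only the documented contract: i_lock must be held on entry and is
 * released on return; the returned wb has its list_lock held.
 */
static struct toy_wb *toy_inode_to_wb_and_lock_list(struct toy_inode *inode)
{
	struct toy_wb *wb = inode->wb;

	assert(inode->i_lock.held);		/* precondition from the kerneldoc */
	toy_lock_release(&inode->i_lock);	/* list_lock nests outside i_lock */
	toy_lock_acquire(&wb->list_lock);
	return wb;
}

/*
 * Caller pattern under discussion: i_lock is already held when the helper is
 * called, the helper drops it, so the caller grabs it again.
 * Returns 1 if both locks were held after the re-acquire, 0 otherwise.
 */
static int toy_mark_dirty(struct toy_inode *inode)
{
	struct toy_wb *wb;
	int ok;

	toy_lock_acquire(&inode->i_lock);
	wb = toy_inode_to_wb_and_lock_list(inode);	/* drops i_lock */
	toy_lock_acquire(&inode->i_lock);		/* re-take it */

	ok = wb->list_lock.held && inode->i_lock.held;

	toy_lock_release(&inode->i_lock);
	toy_lock_release(&wb->list_lock);
	return ok;
}
```

In this model the second toy_lock_acquire() on i_lock succeeds precisely
because the helper released it. A helper that kept i_lock held would trip the
assert in toy_lock_acquire(), which is the deadlock the question above was
worried about.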