From: "zhangyi (F)"
Subject: [PATCH] block_dump: don't put the last refcount when marking inode dirty
Date: Fri, 26 Feb 2021 18:31:03 +0800
Message-ID: <20210226103103.3048803-1-yi.zhang@huawei.com>
X-Mailing-List: linux-ext4@vger.kernel.org

There is an AA deadlock when block_dump is used on an ext4 file system
mounted in data=journal mode.

  watchdog: BUG: soft lockup - CPU#19 stuck for 22s! [jbd2/pmem0-8:1002]
  CPU: 19 PID: 1002 Comm: jbd2/pmem0-8
  RIP: 0010:queued_spin_lock_slowpath+0x60/0x3b0
  ...
  Call Trace:
   _raw_spin_lock+0x57/0x70
   jbd2_journal_invalidatepage+0x166/0x680
   __ext4_journalled_invalidatepage+0x8c/0x120
   ext4_journalled_invalidatepage+0x12/0x40
   truncate_cleanup_page+0x10e/0x1c0
   truncate_inode_pages_range+0x2c8/0xec0
   truncate_inode_pages_final+0x41/0x90
   ext4_evict_inode+0x254/0xac0
   evict+0x11c/0x2f0
   iput+0x20e/0x3a0
   dentry_unlink_inode+0x1bf/0x1d0
   __dentry_kill+0x14c/0x2c0
   dput+0x2bc/0x630
   block_dump___mark_inode_dirty.cold+0x5c/0x111
   __mark_inode_dirty+0x678/0x6b0
   mark_buffer_dirty+0x16e/0x1d0
   __jbd2_journal_temp_unlink_buffer+0x127/0x1f0
   __jbd2_journal_unfile_buffer+0x24/0x80
   __jbd2_journal_refile_buffer+0x12f/0x1b0
   jbd2_journal_commit_transaction+0x244b/0x3030

The problem is a race between jbd2 committing a data buffer and a user
concurrently unlinking the file. jbd2 takes jh->b_state_lock and
redirties the inode's data buffer and the inode itself. If block_dump is
enabled, it tries to find the inode's dentry and ends up invoking the
last dput() after the inode has been unlinked. The evict procedure then
unmaps the buffer and takes jh->b_state_lock again in
journal_unmap_buffer(), which leads to the deadlock. Without block_dump
this works fine, because the final evict procedure is not invoked from
the jbd2 thread, and jh->b_state_lock also prevents a use-after-free of
the inode.

jbd2                                xxx
                                    vfs_unlink
                                     ext4_unlink
jbd2_journal_commit_transaction
**get jh->b_state_lock**
jbd2_journal_refile_buffer
 mark_buffer_dirty
  __mark_inode_dirty
   block_dump___mark_inode_dirty
    d_find_alias
                                    d_delete
                                     unhash
    dput  //put the last refcount
     evict
      journal_unmap_buffer
       **get jh->b_state_lock again**

In most cases the caller of mark_inode_dirty() holds a reference on the
inode, so the last iput() may not happen here, but that is not safe to
rely on. The block_dump code only wants to print the file name of the
dirty inode, so there is no need to get and put the dentry, and dumping
an unhashed dentry is also fine. This patch removes the dget() and
dput() and prints the dentry name directly.

Signed-off-by: zhangyi (F)
Signed-off-by: yebin (H)
---
 fs/fs-writeback.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index c41cb887eb7d..e9b0952fe236 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2199,21 +2199,29 @@ int dirtytime_interval_handler(struct ctl_table *table, int write,
 static noinline void block_dump___mark_inode_dirty(struct inode *inode)
 {
 	if (inode->i_ino || strcmp(inode->i_sb->s_id, "bdev")) {
-		struct dentry *dentry;
+		struct dentry *dentry = NULL;
 		const char *name = "?";
 
-		dentry = d_find_alias(inode);
-		if (dentry) {
-			spin_lock(&dentry->d_lock);
-			name = (const char *) dentry->d_name.name;
+		if (!hlist_empty(&inode->i_dentry)) {
+			spin_lock(&inode->i_lock);
+			if (!hlist_empty(&inode->i_dentry)) {
+				dentry = hlist_entry(inode->i_dentry.first,
+						     struct dentry, d_u.d_alias);
+				spin_lock(&dentry->d_lock);
+				name = (const char *) dentry->d_name.name;
+			} else {
+				spin_unlock(&inode->i_lock);
+			}
 		}
+
 		printk(KERN_DEBUG
 		       "%s(%d): dirtied inode %lu (%s) on %s\n",
 		       current->comm, task_pid_nr(current), inode->i_ino,
 		       name, inode->i_sb->s_id);
+
 		if (dentry) {
 			spin_unlock(&dentry->d_lock);
-			dput(dentry);
+			spin_unlock(&inode->i_lock);
 		}
 	}
 }
-- 
2.25.4
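
For readers who want to see the lock recursion in isolation, here is a
rough userspace model of the race described in the commit message. It is
only a sketch under stated assumptions: every name in it (model_inode,
model_evict, commit_path, and so on) is hypothetical and not a kernel
API, the concurrent unlink is simulated inline rather than run in a
second thread, and an error-checking pthread mutex stands in for
jh->b_state_lock so that the second acquisition returns EDEADLK instead
of spinning forever. The "fixed" path only models the absence of the
get/put pair; it does not model the i_lock/d_lock pinning that the patch
uses to keep the name stable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* All names below are hypothetical userspace stand-ins, not kernel APIs. */
struct model_inode {
	pthread_mutex_t state_lock; /* stands in for jh->b_state_lock */
	int refcount;               /* stands in for the dentry refcount */
	const char *name;           /* stands in for dentry->d_name.name */
	int evict_result;           /* -1: evict never ran; else lock return code */
};

static void model_init(struct model_inode *ino, const char *name)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* Error-checking mutex: a relock by the owner returns EDEADLK
	 * instead of hanging the way a kernel spinlock would. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&ino->state_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	ino->refcount = 1;          /* the reference held via the hashed dentry */
	ino->name = name;
	ino->evict_result = -1;
}

/* Evict path: needs state_lock, like journal_unmap_buffer() in the trace. */
static void model_evict(struct model_inode *ino)
{
	int rc = pthread_mutex_lock(&ino->state_lock);

	ino->evict_result = rc;
	if (rc == 0)
		pthread_mutex_unlock(&ino->state_lock);
}

/* Dropping the last reference triggers eviction in the caller's context. */
static void model_put(struct model_inode *ino)
{
	if (--ino->refcount == 0)
		model_evict(ino);
}

/* Old behaviour: take a reference, print the name, drop the reference. */
static void dump_with_get_put(struct model_inode *ino)
{
	ino->refcount++;            /* models d_find_alias() taking a reference */
	ino->refcount--;            /* models the concurrent unlink dropping its own */
	printf("dirtied inode (%s)\n", ino->name);
	model_put(ino);             /* now the last put, so evict runs here */
}

/* New behaviour: print the name without touching the refcount. */
static void dump_without_ref(struct model_inode *ino)
{
	printf("dirtied inode (%s)\n", ino->name);
}

/* Commit path: holds state_lock while the inode is redirtied and dumped. */
static int commit_path(struct model_inode *ino, int use_fix)
{
	pthread_mutex_lock(&ino->state_lock);
	if (use_fix)
		dump_without_ref(ino);
	else
		dump_with_get_put(ino);
	pthread_mutex_unlock(&ino->state_lock);
	return ino->evict_result;
}
```

After model_init() on a fresh model_inode, commit_path(&ino, 0) returns
EDEADLK because the modelled evict path tries to retake the lock its own
caller holds, while commit_path(&ino, 1) returns -1 because no reference
is taken or dropped under the lock and evict never runs in that context.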