Subject: Re: possible deadlock in start_this_handle (2)
To: Jan Kara
Cc: jack@suse.com, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com, tytso@mit.edu, mhocko@suse.cz, linux-mm@kvack.org, syzbot
References: <000000000000563a0205bafb7970@google.com> <20210211104947.GL19070@quack2.suse.cz> <20210215124519.GA22417@quack2.suse.cz>
From: Tetsuo Handa
Date: Mon, 15 Feb 2021 23:06:15 +0900
In-Reply-To: <20210215124519.GA22417@quack2.suse.cz>
X-Mailing-List: linux-ext4@vger.kernel.org

On 2021/02/15 21:45, Jan Kara wrote:
> On Sat 13-02-21 23:26:37, Tetsuo Handa wrote:
>> Excuse me, but it seems to me that nothing prevents
>> ext4_xattr_set_handle() from reaching ext4_xattr_inode_lookup_create()
>> without memalloc_nofs_save() when hitting ext4_get_nojournal() path.
>> Will you explain when ext4_get_nojournal() path is executed?
>
> That's a good question but sadly I don't think that's it.
> ext4_get_nojournal() is called when the filesystem is created without a
> journal. In that case we also don't acquire jbd2_handle lockdep map. In the
> syzbot report we can see:

Since syzbot can test filesystem images, it might have tested filesystem
images created both with and without a journal within this boot.

>
> kswapd0/2246 is trying to acquire lock:
> ffff888041a988e0 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf81/0x1380 fs/jbd2/transaction.c:444
>
> but task is already holding lock:
> ffffffff8be892c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195
>
> So this filesystem has very clearly been created with a journal. Also the
> journal lockdep tracking machinery uses:

While the locks held by kswapd0/2246 are fs_reclaim, shrinker_rwsem,
&type->s_umount_key#38 and jbd2_handle, isn't the dependency that lockdep
considers problematic the following:

  Chain exists of:
    jbd2_handle --> &ei->xattr_sem --> fs_reclaim

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(fs_reclaim);
                                 lock(&ei->xattr_sem);
                                 lock(fs_reclaim);
    lock(jbd2_handle);

where CPU0 is kswapd0/2246 and CPU1 is the ext4_get_nojournal() path?
If someone has taken jbd2_handle and then &ei->xattr_sem, in that order,
doesn't this dependency hold?

>
> rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
>
> so a lockdep key is per-filesystem. Thus it is not possible that lockdep
> would combine lock dependencies from two different filesystems.
>
> But I guess we could narrow the search for this problem by adding WARN_ONs
> to ext4_xattr_set_handle() and ext4_xattr_inode_lookup_create() like:
>
> WARN_ON(ext4_handle_valid(handle) && !(current->flags & PF_MEMALLOC_NOFS));
>
> It would narrow down a place in which PF_MEMALLOC_NOFS flag isn't set
> properly... At least that seems like the most plausible way forward to me.

You can use CONFIG_DEBUG_AID_FOR_SYZBOT to add such WARN_ONs on linux-next.
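
As a rough illustration of what that WARN_ON would catch, here is a userspace
model of the check. It is only a sketch, not kernel code: the model_* names,
the "journalled" field standing in for ext4_handle_valid(), and the flag value
are all assumptions made for the example. The save/restore helpers merely
mimic the shape of memalloc_nofs_save()/memalloc_nofs_restore().

```c
/* Userspace model only; none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative bit value; the real flag lives in include/linux/sched.h. */
#define PF_MEMALLOC_NOFS 0x00040000u

/* Hypothetical stand-ins for "current" and for a journal handle. */
struct model_task { unsigned int flags; };
struct model_handle { bool journalled; }; /* false models the ext4_get_nojournal() case */

static struct model_task model_current;
static unsigned int model_warn_count;

/* Enter a NOFS scope: remember the old bit, then set it. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_current.flags & PF_MEMALLOC_NOFS;

	model_current.flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope: put back whatever the bit was before the save. */
static void model_nofs_restore(unsigned int old)
{
	assert((old & ~PF_MEMALLOC_NOFS) == 0);
	model_current.flags = (model_current.flags & ~PF_MEMALLOC_NOFS) | old;
}

/*
 * Models:
 *   WARN_ON(ext4_handle_valid(handle) && !(current->flags & PF_MEMALLOC_NOFS));
 * Returns true (and counts a warning) when a journalled handle is used
 * outside a NOFS scope.
 */
static bool model_check(const struct model_handle *handle, const char *where)
{
	bool bad = handle->journalled &&
		   !(model_current.flags & PF_MEMALLOC_NOFS);

	if (bad) {
		model_warn_count++;
		fprintf(stderr, "WARNING: %s: journalled handle without NOFS scope\n",
			where);
	}
	return bad;
}
```

In this model the check stays silent for a handle that is not journalled and
for a journalled handle used between model_nofs_save() and
model_nofs_restore(); it fires only for a journalled handle used outside such
a scope, which is the situation the proposed WARN_ON is meant to locate.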