From: Hillf Danton
To: Amir Goldstein
Cc: syzbot, linux-fsdevel@vger.kernel.org, Al Viro, linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com, "Rafael J. Wysocki", Pavel Machek, linux-pm@vger.kernel.org
Subject: Re: [syzbot] [kernfs?]
 possible deadlock in kernfs_seq_start
Date: Thu, 9 May 2024 18:48:48 +0800
Message-Id: <20240509104848.2403-1-hdanton@sina.com>
References: <00000000000091228c0617eaae32@google.com> <20240508231904.2259-1-hdanton@sina.com>

On Thu, 9 May 2024 09:37:24 +0300 Amir Goldstein
> On Thu, May 9, 2024 at 2:19 AM Hillf Danton wrote:
> > On Tue, 07 May 2024 22:36:18 -0700
> > > syzbot has found a reproducer for the following issue on:
> > >
> > > HEAD commit: dccb07f2914c Merge tag 'for-6.9-rc7-tag' of git://git.kern..
> > > git tree: upstream
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=137daa6c980000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=9d7ea7de0cb32587
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=4c493dcd5a68168a94b2
> > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1134f3c0980000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1367a504980000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/ea1961ce01fe/disk-dccb07f2.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/445a00347402/vmlinux-dccb07f2.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/461aed7c4df3/bzImage-dccb07f2.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+4c493dcd5a68168a94b2@syzkaller.appspotmail.com
> > >
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 6.9.0-rc7-syzkaller-00012-gdccb07f2914c #0 Not tainted
> > > ------------------------------------------------------
> > > syz-executor149/5078 is trying to acquire lock:
> > > ffff88802a978888 (&of->mutex){+.+.}-{3:3}, at: kernfs_seq_start+0x53/0x3b0 fs/kernfs/file.c:154
> > >
> > > but task is already holding lock:
> > > ffff88802d80b540 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0xb7/0xd60 fs/seq_file.c:182
> > >
> > > which lock already depends on the new lock.
> > >
> > >
> > > the existing dependency chain (in reverse order) is:
> > >
> > > -> #4 (&p->lock){+.+.}-{3:3}:
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> > >        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> > >        seq_read_iter+0xb7/0xd60 fs/seq_file.c:182
> > >        call_read_iter include/linux/fs.h:2104 [inline]
> > >        copy_splice_read+0x662/0xb60 fs/splice.c:365
> > >        do_splice_read fs/splice.c:985 [inline]
> > >        splice_file_to_pipe+0x299/0x500 fs/splice.c:1295
> > >        do_sendfile+0x515/0xdc0 fs/read_write.c:1301
> > >        __do_sys_sendfile64 fs/read_write.c:1362 [inline]
> > >        __se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1348
> > >        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
> > >        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > -> #3 (&pipe->mutex){+.+.}-{3:3}:
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> > >        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> > >        iter_file_splice_write+0x335/0x14e0 fs/splice.c:687
> > >        backing_file_splice_write+0x2bc/0x4c0 fs/backing-file.c:289
> > >        ovl_splice_write+0x3cf/0x500 fs/overlayfs/file.c:379
> > >        do_splice_from fs/splice.c:941 [inline]
> > >        do_splice+0xd77/0x1880 fs/splice.c:1354

	file_start_write(out);
	ret = do_splice_from(ipipe, out, &offset, len, flags);
	file_end_write(out);

The correct locking order is

	sb_writers
	inode lock

> > >        __do_splice fs/splice.c:1436 [inline]
> > >        __do_sys_splice fs/splice.c:1652 [inline]
> > >        __se_sys_splice+0x331/0x4a0 fs/splice.c:1634
> > >        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
> > >        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > -> #2 (sb_writers#4){.+.+}-{0:0}:
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
> > >        __sb_start_write include/linux/fs.h:1664 [inline]
> > >        sb_start_write+0x4d/0x1c0 include/linux/fs.h:1800
> > >        mnt_want_write+0x3f/0x90 fs/namespace.c:409

but inverse order occurs here.

> > >        ovl_create_object+0x13b/0x370 fs/overlayfs/dir.c:629
> > >        lookup_open fs/namei.c:3497 [inline]
> > >        open_last_lookups fs/namei.c:3566 [inline]
> > >        path_openat+0x1425/0x3240 fs/namei.c:3796
> > >        do_filp_open+0x235/0x490 fs/namei.c:3826
> > >        do_sys_openat2+0x13e/0x1d0 fs/open.c:1406
> > >        do_sys_open fs/open.c:1421 [inline]
> > >        __do_sys_open fs/open.c:1429 [inline]
> > >        __se_sys_open fs/open.c:1425 [inline]
> > >        __x64_sys_open+0x225/0x270 fs/open.c:1425
> > >        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
> > >        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > -> #1 (&ovl_i_mutex_dir_key[depth]){++++}-{3:3}:
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        down_read+0xb1/0xa40 kernel/locking/rwsem.c:1526
> > >        inode_lock_shared include/linux/fs.h:805 [inline]
> > >        lookup_slow+0x45/0x70 fs/namei.c:1708
> > >        walk_component+0x2e1/0x410 fs/namei.c:2004
> > >        lookup_last fs/namei.c:2461 [inline]
> > >        path_lookupat+0x16f/0x450 fs/namei.c:2485
> > >        filename_lookup+0x256/0x610 fs/namei.c:2514
> > >        kern_path+0x35/0x50 fs/namei.c:2622
> > >        lookup_bdev+0xc5/0x290 block/bdev.c:1136
> > >        resume_store+0x1a0/0x710 kernel/power/hibernate.c:1235
> > >        kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
> > >        call_write_iter include/linux/fs.h:2110 [inline]
> > >        new_sync_write fs/read_write.c:497 [inline]
> > >        vfs_write+0xa84/0xcb0 fs/read_write.c:590
> > >        ksys_write+0x1a0/0x2c0 fs/read_write.c:643
> > >        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
> > >        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > -> #0 (&of->mutex){+.+.}-{3:3}:
> > >        check_prev_add kernel/locking/lockdep.c:3134 [inline]
> > >        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> > >        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> > >        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> > >        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> > >        kernfs_seq_start+0x53/0x3b0 fs/kernfs/file.c:154
> > >        traverse+0x14f/0x550 fs/seq_file.c:106
> > >        seq_read_iter+0xc5e/0xd60 fs/seq_file.c:195
> > >        call_read_iter include/linux/fs.h:2104 [inline]
> > >        copy_splice_read+0x662/0xb60 fs/splice.c:365
> > >        do_splice_read fs/splice.c:985 [inline]
> > >        splice_file_to_pipe+0x299/0x500 fs/splice.c:1295
> > >        do_sendfile+0x515/0xdc0 fs/read_write.c:1301
> > >        __do_sys_sendfile64 fs/read_write.c:1362 [inline]
> > >        __se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1348
> > >        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
> > >        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > other info that might help us debug this:
> > >
> > > Chain exists of:
> > >   &of->mutex --> &pipe->mutex --> &p->lock
> > >
> > >  Possible unsafe locking scenario:
> > >
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(&p->lock);
> > >                                lock(&pipe->mutex);
> > >                                lock(&p->lock);
> > >   lock(&of->mutex);
> > >
> > >  *** DEADLOCK ***
> >
> > This shows 16b52bbee482 ("kernfs: annotate different lockdep class for
> > of->mutex of writable files") is a bandaid.
>
> Well, nobody said that it fixes the root cause.
> But the annotation fix is correct, because the former report was
> really false positive one.
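
For what it is worth, the chain in the report can be modelled as a small dependency graph. A rough userspace sketch in C follows. The enum, struct and function names are made up for illustration, and this is a toy reachability check, not how lockdep itself is implemented. Each edge is read off the stack traces above: the lock on the right was taken while the one on the left was held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Lock classes named in the report; the enum itself is only illustrative. */
enum lock_class {
	OF_MUTEX,	/* &of->mutex */
	OVL_DIR_INODE,	/* &ovl_i_mutex_dir_key[depth] */
	SB_WRITERS,	/* sb_writers#4 */
	PIPE_MUTEX,	/* &pipe->mutex */
	SEQ_LOCK,	/* &p->lock */
	NR_CLASSES
};

/* held -> acquired: "acquired" was taken while "held" was already held. */
struct dep {
	enum lock_class held;
	enum lock_class acquired;
};

/* Dependencies #1..#4 as read from the stack traces in the report. */
static const struct dep recorded_deps[] = {
	{ OF_MUTEX,      OVL_DIR_INODE },	/* #1: resume_store() path lookup */
	{ OVL_DIR_INODE, SB_WRITERS },		/* #2: ovl_create_object() */
	{ SB_WRITERS,    PIPE_MUTEX },		/* #3: do_splice() into overlayfs */
	{ PIPE_MUTEX,    SEQ_LOCK },		/* #4: sendfile() from a seq_file */
};

/* Depth-first search: is "to" reachable from "from" via recorded deps? */
static bool reachable(const struct dep *deps, size_t n,
		      enum lock_class from, enum lock_class to,
		      bool *visited)
{
	if (from == to)
		return true;
	if (visited[from])
		return false;
	visited[from] = true;
	for (size_t i = 0; i < n; i++) {
		if (deps[i].held == from &&
		    reachable(deps, n, deps[i].acquired, to, visited))
			return true;
	}
	return false;
}

/*
 * Adding held -> acquired closes a cycle if "held" is already reachable
 * from "acquired" through the existing dependencies.
 */
static bool would_deadlock(const struct dep *deps, size_t n,
			   enum lock_class held, enum lock_class acquired)
{
	bool visited[NR_CLASSES] = { false };

	return reachable(deps, n, acquired, held, visited);
}
```

With all four recorded dependencies, would_deadlock(recorded_deps, 4, SEQ_LOCK, OF_MUTEX) reports a cycle, which corresponds to #0 in the report. Skipping the first entry (the path lookup in resume_store()) makes the same check come back clean.
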
>
> The root cause is resume_store() doing vfs path lookup.

resume_store() looks innocent until the locking order above is explained.

> If we could deprecate this allegedly unneeded UAPI we should.
>
> That said, all those lockdep warnings indicate a possible deadlock
> if someone tries to hibernate into an overlayfs file.
>
> If root tries to do that then, this is either an attack or stupidity.
> Either Way the news flash from this report is "root may be able
> to deadlock kernel on purpose"
> Not very exciting and not likely to happen in the real world.
>
> The remaining question is what to do about the lockdep reports.
>
> Questions to PM maintainers:
> Any chance to deprecate writing path to /sys/power/resume?
> Userspace should have no problem getting the same done
> with writing dev number.
>
> Thanks,
> Amir.
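
On the dev number route: a minimal userspace sketch, assuming /sys/power/resume accepts a "major:minor" string as suggested above. The helper names format_devnum() and path_to_devnum() are hypothetical, and the code only builds the string; it deliberately does not write it anywhere.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Format a device number as "major:minor".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int format_devnum(dev_t dev, char *buf, size_t size)
{
	int n = snprintf(buf, size, "%u:%u", major(dev), minor(dev));

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Resolve a block device node (for example a swap partition) to its
 * "major:minor" string. The path lookup happens here in userspace, so
 * the kernel would not need to do one on behalf of the sysfs write.
 * Returns -1 if the path cannot be stat()ed or is not a block device.
 */
static int path_to_devnum(const char *path, char *buf, size_t size)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISBLK(st.st_mode))
		return -1;
	return format_devnum(st.st_rdev, buf, size);
}
```

The point of the split is that only path_to_devnum() touches the filesystem; whatever ends up written to sysfs is a plain number pair.
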