2022-09-30 23:27:35

by syzbot

[permalink] [raw]
Subject: [syzbot] WARNING in nilfs_dat_commit_end

Hello,

syzbot found the following issue on:

HEAD commit: c3e0e1e23c70 Merge tag 'irq_urgent_for_v6.0' of git://git...
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=10f98c04880000
kernel config: https://syzkaller.appspot.com/x/.config?x=ba0d23aa7e1ffaf5
dashboard link: https://syzkaller.appspot.com/bug?extid=cbff7a52b6f99059e67f
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15d224b8880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13d3e4d0880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e7f1f925f94e/disk-c3e0e1e2.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/830dabeedf0d/vmlinux-c3e0e1e2.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3609 at fs/nilfs2/dat.c:186 nilfs_dat_commit_end+0x56b/0x610
Modules linked in:
CPU: 0 PID: 3609 Comm: segctord Not tainted 6.0.0-rc7-syzkaller-00081-gc3e0e1e23c70 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
RIP: 0010:nilfs_dat_commit_end+0x56b/0x610 fs/nilfs2/dat.c:186
Code: 48 89 ee 48 83 c4 38 5b 41 5c 41 5d 41 5e 41 5f 5d e9 69 77 03 00 e8 14 49 40 fe e8 0f 46 b9 fd e9 24 fd ff ff e8 05 49 40 fe <0f> 0b e9 75 fc ff ff e8 f9 48 40 fe e8 f4 45 b9 fd 43 80 7c 25 00
RSP: 0018:ffffc9000389f2a8 EFLAGS: 00010293
RAX: ffffffff8347407b RBX: ffff888079c311a0 RCX: ffff8880775a9d80
RDX: 0000000000000000 RSI: 0000000000000003 RDI: 000023d50af30002
RBP: 0000000000000003 R08: ffffffff83473ce9 R09: ffffed100f211f9b
R10: ffffed100f211f9b R11: 1ffff1100f211f9a R12: ffff8880723fda40
R13: ffffc9000389f3b8 R14: ffff888074fb8180 R15: 000023d50af30002
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fec40ab81d0 CR3: 000000000c88e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
nilfs_dat_commit_update+0x25/0x40 fs/nilfs2/dat.c:236
nilfs_direct_propagate+0x215/0x390 fs/nilfs2/direct.c:277
nilfs_bmap_propagate+0x6d/0x120 fs/nilfs2/bmap.c:337
nilfs_collect_file_data+0x49/0xc0 fs/nilfs2/segment.c:568
nilfs_segctor_apply_buffers+0x192/0x380 fs/nilfs2/segment.c:1012
nilfs_segctor_scan_file+0x842/0xaf0 fs/nilfs2/segment.c:1061
nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1170 [inline]
nilfs_segctor_collect fs/nilfs2/segment.c:1497 [inline]
nilfs_segctor_do_construct+0x1cce/0x6fe0 fs/nilfs2/segment.c:2039
nilfs_segctor_construct+0x143/0x8d0 fs/nilfs2/segment.c:2375
nilfs_segctor_thread_construct fs/nilfs2/segment.c:2483 [inline]
nilfs_segctor_thread+0x534/0x1180 fs/nilfs2/segment.c:2566
kthread+0x266/0x300 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2023-01-27 13:23:27

by Ryusuke Konishi

[permalink] [raw]
Subject: [PATCH] nilfs2: prevent WARNING in nilfs_dat_commit_end()

If nilfs2 reads a corrupted disk image and its DAT metadata file contains
invalid lifetime data for a virtual block number, a kernel warning can be
generated by the WARN_ON check in nilfs_dat_commit_end() and can panic
if the kernel is booted with panic_on_warn.

This patch avoids the issue with a sanity check that treats it as an
error.

Since error return is not allowed in the execution phase of
nilfs_dat_commit_end(), this inserts that sanity check in
nilfs_dat_prepare_end(), which prepares for nilfs_dat_commit_end().

As the error code, -EINVAL is returned to notify bmap layer of the
metadata corruption. When the bmap layer sees this code, it handles the
abnormal situation and replaces the return code with -EIO as it should.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Tested-by: Ryusuke Konishi <[email protected]>
---
Andrew, please add this patch to the queue. This fixes another
WARN_ON hit in fs/nilfs2/dat.c for a corrupted disk image pattern.

Thanks,
Ryusuke Konishi

fs/nilfs2/dat.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 1e7f653c1df7..9cf6ba58f585 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -158,6 +158,7 @@ void nilfs_dat_commit_start(struct inode *dat, struct nilfs_palloc_req *req,
int nilfs_dat_prepare_end(struct inode *dat, struct nilfs_palloc_req *req)
{
struct nilfs_dat_entry *entry;
+ __u64 start;
sector_t blocknr;
void *kaddr;
int ret;
@@ -169,6 +170,7 @@ int nilfs_dat_prepare_end(struct inode *dat, struct nilfs_palloc_req *req)
kaddr = kmap_atomic(req->pr_entry_bh->b_page);
entry = nilfs_palloc_block_get_entry(dat, req->pr_entry_nr,
req->pr_entry_bh, kaddr);
+ start = le64_to_cpu(entry->de_start);
blocknr = le64_to_cpu(entry->de_blocknr);
kunmap_atomic(kaddr);

@@ -179,6 +181,15 @@ int nilfs_dat_prepare_end(struct inode *dat, struct nilfs_palloc_req *req)
return ret;
}
}
+ if (unlikely(start > nilfs_mdt_cno(dat))) {
+ nilfs_err(dat->i_sb,
+ "vblocknr = %llu has abnormal lifetime: start cno (= %llu) > current cno (= %llu)",
+ (unsigned long long)req->pr_entry_nr,
+ (unsigned long long)start,
+ (unsigned long long)nilfs_mdt_cno(dat));
+ nilfs_dat_abort_entry(dat, req);
+ return -EINVAL;
+ }

return 0;
}
--
2.34.1