LinuxLists.cc - kernel BUG at fs/buffer.c:LINE!

2018-04-19 16:05:24

Subject: kernel BUG at fs/buffer.c:LINE!

Hello,

syzbot hit the following crash on upstream commit
86bbbebac1933e6e95e8234c4f7d220c5ddd38bc (Mon Apr 2 18:47:07 2018 +0000)
Merge branch 'ras-core-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e

So far this crash happened 2 times on upstream.
Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=4808221449519104
Kernel config:
https://syzkaller.appspot.com/x/.config?id=6801295859785128502
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

FAT-fs (loop0): Directory bread(block 14) failed
FAT-fs (loop0): Directory bread(block 15) failed
netlink: 2 bytes leftover after parsing attributes in process `syz-executor6'.
------------[ cut here ]------------
kernel BUG at fs/buffer.c:3058!
invalid opcode: 0000 [#1] SMP KASAN
VFS: Can't find a Minix filesystem V1 | V2 | V3 on device loop3.
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 16389 Comm: syz-executor0 Not tainted 4.16.0+ #11
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:submit_bh_wbc+0x5a0/0x710 fs/buffer.c:3058
RSP: 0018:ffff8801c737f6c0 EFLAGS: 00010212
RAX: 0000000000040000 RBX: ffff88017247f3f0 RCX: ffffffff81bfe2b0
RDX: 000000000001b35c RSI: ffffc90001ea8000 RDI: 0000000000000001
RBP: ffff8801c737f708 R08: 0000000000000000 R09: ffffed002e48fe8b
R10: 0000000000000001 R11: ffffed002e48fe8a R12: 1ffff10038e6fee4
R13: 0000000000000800 R14: 0000000000000000 R15: 0000000000000001
FS: 00007fb154769700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000ddce80 CR3: 00000001c4034004 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
submit_bh fs/buffer.c:3105 [inline]
__sync_dirty_buffer+0x175/0x350 fs/buffer.c:3191
sync_dirty_buffer+0x1a/0x20 fs/buffer.c:3204
fat_set_state+0x1ec/0x300 fs/fat/inode.c:696
fat_fill_super+0x2cf9/0x4940 fs/fat/inode.c:1855
vfat_fill_super+0x31/0x40 fs/fat/namei_vfat.c:1059
mount_bdev+0x2b7/0x370 fs/super.c:1119
vfat_mount+0x34/0x40 fs/fat/namei_vfat.c:1066
mount_fs+0x66/0x2d0 fs/super.c:1222
vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
vfs_kern_mount fs/namespace.c:2509 [inline]
do_new_mount fs/namespace.c:2512 [inline]
do_mount+0xea4/0x2bb0 fs/namespace.c:2842
SYSC_mount fs/namespace.c:3058 [inline]
SyS_mount+0xab/0x120 fs/namespace.c:3035
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x457d0a
RSP: 002b:00007fb154768bb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 0000000000457d0a
RDX: 0000000020000000 RSI: 0000000020000140 RDI: 00007fb154768c00
RBP: 0000000000000001 R08: 0000000020000240 R09: 0000000020000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000013
R13: 000000000000066e R14: 00000000006fcaf0 R15: 0000000000000000
Code: 89 45 d0 48 8d 43 10 48 89 45 c0 e9 52 fc ff ff e8 46 46 b1 ff f0 80 63 01 f7 e9 34 fb ff ff e8 37 46 b1 ff 0f 0b e8 30 46 b1 ff <0f> 0b e8 29 46 b1 ff 0f 0b e8 22 46 b1 ff 0f 0b e8 1b 46 b1 ff
RIP: submit_bh_wbc+0x5a0/0x710 fs/buffer.c:3058 RSP: ffff8801c737f6c0
---[ end trace 56c650f20e19f6bf ]---

---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to [email protected].

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.
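
Note on reading the oops: the byte marked with <..> in the "Code:" line is
the one at the faulting RIP. Here it is 0f followed by 0b, which is the x86
ud2 instruction that BUG()/BUG_ON() compiles to, and that is why the log
says "invalid opcode". The following is a minimal sketch of that check in
plain userspace C. The helper name code_line_faults_on_ud2() is made up for
illustration and is not part of any kernel tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if the byte marked with <..> in an oops "Code:" line, together
 * with the byte that follows it, is the x86 UD2 opcode (0f 0b).
 * Return 0 if it is some other byte pair, and -1 if the line has no
 * well-formed marker. Hypothetical helper, for illustration only.
 */
int code_line_faults_on_ud2(const char *code_line)
{
	const char *open, *next;
	char *end;
	unsigned long first, second;

	assert(code_line != NULL);

	/* Locate the marker that flags the byte at the faulting RIP. */
	open = strchr(code_line, '<');
	if (open == NULL)
		return -1;

	/* Parse the marked byte; it must be closed by '>'. */
	first = strtoul(open + 1, &end, 16);
	if (end == open + 1 || *end != '>')
		return -1;

	/* Parse the byte after the marker (strtoul skips the blank). */
	next = end + 1;
	second = strtoul(next, &end, 16);
	if (end == next)
		return -1;

	return (first == 0x0f && second == 0x0b) ? 1 : 0;
}
```

Passing the tail of the Code: line from this report,
"e8 30 46 b1 ff <0f> 0b e8 29", makes the helper report a ud2 at the
faulting address.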

2019-12-19 05:01:43

by Bart Van Assche

Subject: Re: kernel BUG at fs/buffer.c:LINE!

On 2019-12-18 08:21, syzbot wrote:
> syzbot has bisected this bug to:
>
> commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3
> Author: Jaegeuk Kim <[email protected]>
> Date:   Thu Jan 10 03:17:14 2019 +0000
>
>     loop: drop caches if offset or block_size are changed
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13f3ca8ee00000
> start commit:   2187f215 Merge tag 'for-5.5-rc2-tag' of
> git://git.kernel.o..
> git tree:       upstream
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=100bca8ee00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=17f3ca8ee00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=dcf10bf83926432a
> dashboard link:
> https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1171ba8ee00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=107440aee00000

Hi Jaegeuk,

Since syzbot has identified a reproducer I think that it's easy to test
whether your new patch fixes what syzbot discovered. Have you already
had the chance to test this?

Thanks,

Bart.

2019-12-19 19:18:04

by Jaegeuk Kim

Subject: Re: kernel BUG at fs/buffer.c:LINE!

On 12/18, Bart Van Assche wrote:
> On 2019-12-18 08:21, syzbot wrote:
> > syzbot has bisected this bug to:
> >
> > commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3
> > Author: Jaegeuk Kim <[email protected]>
> > Date:   Thu Jan 10 03:17:14 2019 +0000
> >
> >     loop: drop caches if offset or block_size are changed
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13f3ca8ee00000
> > start commit:   2187f215 Merge tag 'for-5.5-rc2-tag' of
> > git://git.kernel.o..
> > git tree:       upstream
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=100bca8ee00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=17f3ca8ee00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=dcf10bf83926432a
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1171ba8ee00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=107440aee00000
>
> Hi Jaegeuk,
>
> Since syzbot has identified a reproducer I think that it's easy to test
> whether your new patch fixes what syzbot discovered. Have you already
> had the chance to test this?

Hi Bart,

Let me try to reproduce this.

Thanks,

>
> Thanks,
>
> Bart.

2019-12-19 23:32:08

by Jaegeuk Kim

Subject: Re: kernel BUG at fs/buffer.c:LINE!

On 12/18, Bart Van Assche wrote:
> On 2019-12-18 08:21, syzbot wrote:
> > syzbot has bisected this bug to:
> >
> > commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3
> > Author: Jaegeuk Kim <[email protected]>
> > Date:   Thu Jan 10 03:17:14 2019 +0000
> >
> >     loop: drop caches if offset or block_size are changed
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13f3ca8ee00000
> > start commit:   2187f215 Merge tag 'for-5.5-rc2-tag' of
> > git://git.kernel.o..
> > git tree:       upstream
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=100bca8ee00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=17f3ca8ee00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=dcf10bf83926432a
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1171ba8ee00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=107440aee00000
>
> Hi Jaegeuk,
>
> Since syzbot has identified a reproducer I think that it's easy to test
> whether your new patch fixes what syzbot discovered. Have you already
> had the chance to test this?
>

I can't reproduce it with the C reproducer. Hmmm...

> Thanks,
>
> Bart.

2024-03-13 10:58:44

by Ryusuke Konishi

Subject: [PATCH 0/2] nilfs2: fix kernel bug at submit_bh_wbc()

Hi Andrew,

Please apply this series as a bug fix.

This resolves a kernel BUG reported by syzbot. Since there are two
flaws involved, I've made each one a separate patch.

The first patch alone resolves the syzbot-reported bug, but I think
both fixes should be sent to stable, so I've tagged them as such.

This series does not conflict with the currently queued kmap_local
conversion series, etc., and can be applied independently.

Thanks,
Ryusuke Konishi

Ryusuke Konishi (2):
nilfs2: fix failure to detect DAT corruption in btree and direct
mappings
nilfs2: prevent kernel bug at submit_bh_wbc()

fs/nilfs2/btree.c | 9 +++++++--
fs/nilfs2/direct.c | 9 +++++++--
fs/nilfs2/inode.c | 2 +-
3 files changed, 15 insertions(+), 5 deletions(-)

--
2.34.1

2024-03-13 10:59:08

by Ryusuke Konishi

Subject: [PATCH 1/2] nilfs2: fix failure to detect DAT corruption in btree and direct mappings

Syzbot has reported a kernel bug in submit_bh_wbc() when writing file
data to a nilfs2 file system whose metadata is corrupted.

There are two flaws involved in this issue.

The first flaw is that when nilfs_get_block() locates a data block
using btree or direct mapping, if the disk address translation routine
nilfs_dat_translate() fails with internal code -ENOENT due to DAT
metadata corruption, it can be passed back to nilfs_get_block(). This
causes nilfs_get_block() to misidentify an existing block as
non-existent, causing both data block lookup and insertion to fail
inconsistently.

The second flaw is that nilfs_get_block() returns a successful status
in this inconsistent state. This causes the caller
__block_write_begin_int() or others to request a read even though the
buffer is not mapped, resulting in a BUG_ON check for the BH_Mapped
flag in submit_bh_wbc() failing.

This fixes the first issue by changing the return value to code
-EINVAL when a conversion using DAT fails with code -ENOENT, avoiding
the conflicting condition that leads to the kernel bug described
above. Here, code -EINVAL indicates that metadata corruption was
detected during the block lookup, which will be properly handled as a
file system error and converted to -EIO when passing through the
nilfs2 bmap layer.

Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e
Fixes: c3a7abf06ce7 ("nilfs2: support contiguous lookup of blocks")
Tested-by: Ryusuke Konishi <[email protected]>
Cc: [email protected]
---
fs/nilfs2/btree.c | 9 +++++++--
fs/nilfs2/direct.c | 9 +++++++--
2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index 13592e82eaf6..65659fa0372e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -724,7 +724,7 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree,
 		dat = nilfs_bmap_get_dat(btree);
 		ret = nilfs_dat_translate(dat, ptr, &blocknr);
 		if (ret < 0)
-			goto out;
+			goto dat_error;
 		ptr = blocknr;
 	}
 	cnt = 1;
@@ -743,7 +743,7 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree,
 			if (dat) {
 				ret = nilfs_dat_translate(dat, ptr2, &blocknr);
 				if (ret < 0)
-					goto out;
+					goto dat_error;
 				ptr2 = blocknr;
 			}
 			if (ptr2 != ptr + cnt || ++cnt == maxblocks)
@@ -781,6 +781,11 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree,
  out:
 	nilfs_btree_free_path(path);
 	return ret;
+
+ dat_error:
+	if (ret == -ENOENT)
+		ret = -EINVAL;  /* Notify bmap layer of metadata corruption */
+	goto out;
 }

 static void nilfs_btree_promote_key(struct nilfs_bmap *btree,
diff --git a/fs/nilfs2/direct.c b/fs/nilfs2/direct.c
index 4c85914f2abc..893ab36824cc 100644
--- a/fs/nilfs2/direct.c
+++ b/fs/nilfs2/direct.c
@@ -66,7 +66,7 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct,
 		dat = nilfs_bmap_get_dat(direct);
 		ret = nilfs_dat_translate(dat, ptr, &blocknr);
 		if (ret < 0)
-			return ret;
+			goto dat_error;
 		ptr = blocknr;
 	}

@@ -79,7 +79,7 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct,
 		if (dat) {
 			ret = nilfs_dat_translate(dat, ptr2, &blocknr);
 			if (ret < 0)
-				return ret;
+				goto dat_error;
 			ptr2 = blocknr;
 		}
 		if (ptr2 != ptr + cnt)
@@ -87,6 +87,11 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct,
 	}
 	*ptrp = ptr;
 	return cnt;
+
+ dat_error:
+	if (ret == -ENOENT)
+		ret = -EINVAL;  /* Notify bmap layer of metadata corruption */
+	return ret;
 }

 static __u64
--
2.34.1
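
Note on patch 1/2: the change above amounts to one rule. If DAT translation
fails with -ENOENT, report it as -EINVAL, so that a corrupted DAT entry is
not mistaken for "block does not exist". The following is a minimal
userspace sketch of that rule, not the kernel code itself.
fake_dat_translate(), lookup_before() and lookup_after() are hypothetical
stand-ins, and the +1000 mapping is arbitrary.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for nilfs_dat_translate(): fails with fail_code if
 * it is nonzero, otherwise "translates" the virtual block number.
 */
static int fake_dat_translate(int fail_code, unsigned long vblocknr,
			      unsigned long *blocknr)
{
	if (fail_code)
		return fail_code;
	*blocknr = vblocknr + 1000; /* arbitrary mapping for the demo */
	return 0;
}

/* Behaviour before the patch: the translation error is passed back as is. */
int lookup_before(int fail_code, unsigned long ptr, unsigned long *blocknr)
{
	return fake_dat_translate(fail_code, ptr, blocknr);
}

/* Behaviour after the patch: -ENOENT from the DAT becomes -EINVAL. */
int lookup_after(int fail_code, unsigned long ptr, unsigned long *blocknr)
{
	int ret = fake_dat_translate(fail_code, ptr, blocknr);

	if (ret == -ENOENT)
		ret = -EINVAL; /* signal metadata corruption, not a hole */
	return ret;
}
```

With fail_code set to -ENOENT, lookup_before() returns -ENOENT, which a
caller cannot tell apart from a missing block, while lookup_after() returns
-EINVAL. Other error codes and the success path are unchanged.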

2024-03-13 10:59:23

by Ryusuke Konishi

Subject: [PATCH 2/2] nilfs2: prevent kernel bug at submit_bh_wbc()

Fix a bug where nilfs_get_block() returns a successful status when
searching and inserting the specified block both fail inconsistently.
If this inconsistent behavior is not due to a previously fixed bug,
then an unexpected race is occurring, so return a temporary error
-EAGAIN instead.

This prevents callers such as __block_write_begin_int() from
requesting a read into a buffer that is not mapped, which would cause
the BUG_ON check for the BH_Mapped flag in submit_bh_wbc() to fail.

Signed-off-by: Ryusuke Konishi <[email protected]>
Fixes: 1f5abe7e7dbc ("nilfs2: replace BUG_ON and BUG calls triggerable from ioctl")
Cc: [email protected]
---
fs/nilfs2/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 9c334c722fc1..5a888b2c1803 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -112,7 +112,7 @@ int nilfs_get_block(struct inode *inode, sector_t blkoff,
 					   "%s (ino=%lu): a race condition while inserting a data block at offset=%llu",
 					   __func__, inode->i_ino,
 					   (unsigned long long)blkoff);
-				err = 0;
+				err = -EAGAIN;
 			}
 			nilfs_transaction_abort(inode->i_sb);
 			goto out;
--
2.34.1
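
Note on patch 2/2: the failure mode described above can be modelled in a few
lines of userspace C. This is a simplified sketch under stated assumptions,
not the kernel implementation. struct fake_bh, get_block_model() and
caller_would_hit_bug() are hypothetical names, the model assumes the
"create" path is always taken, and the caller check is reduced to "success
was reported but the buffer is still unmapped".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer head; only the mapped flag matters. */
struct fake_bh {
	bool mapped;
};

/*
 * Simplified model of the nilfs_get_block() flow. lookup_ret and insert_ret
 * are the results the block lookup and block insertion would produce, and
 * patched selects the behaviour after this patch.
 */
int get_block_model(struct fake_bh *bh, int lookup_ret, int insert_ret,
		    bool patched)
{
	if (lookup_ret >= 0) {
		bh->mapped = true; /* existing block found */
		return 0;
	}
	if (lookup_ret != -ENOENT)
		return lookup_ret; /* real lookup error */

	/* Lookup says "no such block", so try to insert one. */
	if (insert_ret == 0) {
		bh->mapped = true; /* new block inserted */
		return 0;
	}
	if (insert_ret == -EEXIST)
		/* Lookup and insertion disagree; the buffer stays unmapped. */
		return patched ? -EAGAIN : 0;

	return insert_ret;
}

/* A caller that trusts a zero return would submit I/O on an unmapped buffer. */
bool caller_would_hit_bug(const struct fake_bh *bh, int err)
{
	return err == 0 && !bh->mapped;
}
```

In this model, a lookup result of -ENOENT combined with an insertion result
of -EEXIST returns 0 with the buffer unmapped before the patch, which is the
state that trips the BH_Mapped check. After the patch the same inputs return
-EAGAIN, so the caller stops before submitting the buffer.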