2021-04-13 15:25:49

by Wangli (T)

Subject: [BUG && Question] question of BUG_ON when a device is hot-removed while a file system is writing

Hello,

We hit a BUG_ON in submit_bh_wbc() when a device is hot-removed while a
file system is writing to it.

Code path:

ext4_write_begin()

    __block_write_begin()

        __block_write_begin_int()  <== if 'buffer_mapped(bh)' is false,
                                       call get_block() and continue; the
                                       device is still alive at this point.

            ll_rw_block(REQ_OP_READ)  <== bh is not uptodate, so read it
                                          from the device.

                submit_bh_wbc()  <== BUG_ON() fires if 'buffer_mapped(bh)'
                                     is false. The block device has died
                                     by now and the buffer head's BH_Mapped
                                     flag has been cleared.
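
As a rough illustration of the sequence above, here is a minimal user-space C model. It is not kernel code: every toy_* name is hypothetical, only the mapped and uptodate bits are modelled, and the hot-removal is injected deterministically rather than occurring concurrently. Where the kernel would hit BUG_ON(), the model returns -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head (two state bits only). */
struct toy_bh {
    bool mapped;    /* models buffer_mapped(bh) */
    bool uptodate;  /* models buffer_uptodate(bh) */
};

/* Models get_block(): maps the buffer while the device is present. */
static void toy_get_block(struct toy_bh *bh) { bh->mapped = true; }

/* Models device teardown clearing the mapped state (assumption). */
static void toy_hot_remove(struct toy_bh *bh) { bh->mapped = false; }

/* Models submit_bh_wbc(): -1 marks the point where BUG_ON() would fire. */
static int toy_submit(const struct toy_bh *bh) { return bh->mapped ? 0 : -1; }

/* Replays the reported path; remove_device injects the hot-removal
 * after the buffer was mapped but before the read is submitted. */
static int toy_write_begin(struct toy_bh *bh, bool remove_device)
{
    if (!bh->mapped)
        toy_get_block(bh);
    if (remove_device)
        toy_hot_remove(bh);
    if (!bh->uptodate)
        return toy_submit(bh);
    return 0;
}

/* Usage: the same call succeeds or trips the check depending only on
 * whether the device vanished in between. */
static void toy_replay(void)
{
    struct toy_bh alive = { false, false };
    struct toy_bh removed = { false, false };

    assert(toy_write_begin(&alive, false) == 0);
    assert(toy_write_begin(&removed, true) == -1);
}
```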

The stack trace is below:

[41253.006160] kernel BUG at fs/buffer.c:3015!  <== 'BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc()
[41253.006767] invalid opcode: 0000 [#1] SMP
[41253.007293] CPU: 0 PID: 22157 Comm: dd Not tainted 5.10.0-01679-ge46e150e09e0 #2
[41253.008257] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[41253.009778] RIP: 0010:submit_bh_wbc+0x231/0x2e0
[41253.010403] Code: 48 83 05 11 26 7e 0b 01 0f 0b 48 83 05 0f 26 7e 0b 01 48 83 05 0f 26 7e 0b 01 48 83 05 0f 26 7e 0b 01 48 83 05 0f 26 7e 0b 01 <0f> 0b 48 83 05 0d 26 7e 0b 01 48 83 05 0d 26 7e 0b 01 48 8d
[41253.012831] RSP: 0018:ffffc90003c17b20 EFLAGS: 00010202
[41253.013526] RAX: 0000000000000004 RBX: ffff88803d083af8 RCX: 0000000000000000
[41253.014470] RDX: ffff88803d083af8 RSI: 0000000000000000 RDI: 0000000000000000
[41253.015403] RBP: ffffc90003c17bc8 R08: 0000000000000000 R09: ffff888100041800
[41253.016357] R10: 0000000000000000 R11: ffff88810fd71000 R12: 0000000000000000
[41253.017297] R13: ffffffff9acbda20 R14: ffffffff8f4d9c50 R15: ffffc90003c17bc0
[41253.018242] FS: 00007f2d41f0c4c0(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[41253.019308] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41253.020046] CR2: 000055d54d154000 CR3: 00000000ba850000 CR4: 00000000000006f0
[41253.020962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[41253.021888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[41253.022805] Call Trace:
[41253.023146] ? end_buffer_read_nobh+0x20/0x20
[41253.023712] ll_rw_block+0x114/0x140
[41253.024183] __block_write_begin_int+0x343/0x8e0
[41253.024780] ? ext4_block_zero_page_range+0x580/0x580
[41253.025441] ? _cond_resched+0x25/0x70
[41253.025933] ? ext4_journal_check_start+0x16/0xe0
[41253.026548] __block_write_begin+0x15/0x20
[41253.027081] ext4_write_begin+0x5f3/0x970
[41253.027624] ext4_da_write_begin+0x15d/0x720
[41253.028186] generic_perform_write+0xd3/0x240
[41253.028754] ext4_buffered_write_iter+0x107/0x1f0
[41253.029364] ext4_file_write_iter+0x78/0xae0
[41253.029918] ? asm_exc_page_fault+0x1e/0x30
[41253.030466] new_sync_write+0x17e/0x220
[41253.030966] vfs_write+0x32f/0x3d0
[41253.031413] ksys_write+0xdd/0x170
[41253.031857] __x64_sys_write+0x1e/0x30
[41253.032350] do_syscall_64+0x4d/0x70
[41253.032817] entry_SYSCALL_64_after_hwframe+0x44/0xa9

This code path is the common write path, shared by other file systems as
well. To address this, we are thinking of adding a 'buffer_mapped(bh)'
check just before ll_rw_block():

@@ -2036,7 +2036,7 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
                        continue;
                }
                if (!buffer_uptodate(bh) && !buffer_delay(bh) &&
-                   !buffer_unwritten(bh) &&
+                  !buffer_unwritten(bh) && buffer_mapped(bh) &&
                     (block_start < from || block_end > to)) {
                        ll_rw_block(REQ_OP_READ, 0, 1, &bh);
                        *wait_bh++=bh;

However, it is still possible to hit BUG_ON(!buffer_mapped(bh)) if the
device dies between the check before ll_rw_block() and the point where
submit_bh_wbc() is finally called.
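
To make the remaining window concrete, here is a small user-space C sketch of the patched check. It is only a model under stated assumptions: the toy_* names and the enums are hypothetical, and the removal is injected at a chosen point instead of racing on another CPU. A removal that lands before the new check causes the read to be skipped, while a removal that lands after the check still reaches the point where BUG_ON() would fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head (mapped bit only). */
struct toy_bh { bool mapped; };

/* Where the simulated hot-removal lands relative to the new check. */
enum toy_removal { NO_REMOVAL, BEFORE_CHECK, AFTER_CHECK };

enum toy_outcome { READ_SUBMITTED, READ_SKIPPED, WOULD_BUG };

/* Models submit_bh_wbc(): WOULD_BUG marks the BUG_ON() condition. */
static enum toy_outcome toy_submit(const struct toy_bh *bh)
{
    return bh->mapped ? READ_SUBMITTED : WOULD_BUG;
}

/* Models the patched condition followed by ll_rw_block(). */
static enum toy_outcome toy_patched_read(struct toy_bh *bh,
                                         enum toy_removal when)
{
    if (when == BEFORE_CHECK)
        bh->mapped = false;
    if (!bh->mapped)            /* the added buffer_mapped(bh) test */
        return READ_SKIPPED;
    if (when == AFTER_CHECK)
        bh->mapped = false;     /* removal lands inside the window */
    return toy_submit(bh);
}

/* Usage: one buffer per scenario, each starting out mapped. */
static void toy_check_window(void)
{
    struct toy_bh a = { true }, b = { true }, c = { true };

    assert(toy_patched_read(&a, NO_REMOVAL) == READ_SUBMITTED);
    assert(toy_patched_read(&b, BEFORE_CHECK) == READ_SKIPPED);
    assert(toy_patched_read(&c, AFTER_CHECK) == WOULD_BUG);
}
```

In this model the extra check only moves the point at which the removal must occur; it does not make the check and the submission atomic.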


Could you give some suggestions?


Thanks,

Wangli.