2021-03-31 08:17:28

by kernel test robot

[permalink] [raw]
Subject: [block] 4c95131e3c: kernel_BUG_at_block/bio.c



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 4c95131e3cb8bbb08c3b370b4ae3611874b48be0 ("block: enable use of bio allocation cache")
https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git io_uring-bio-cache


in testcase: ocfs2test
version: ocfs2test-x86_64-d802bf7-1_20210329
with following parameters:

disk: 1HDD
test: test-reserve_space
ucode: 0x21



on test machine: 8 threads Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz with 16G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 47.859997] kernel BUG at block/bio.c:628!
[ 47.860086] invalid opcode: 0000 [#1] SMP PTI
[ 47.860167] CPU: 5 PID: 5042 Comm: mkfs.ocfs2 Not tainted 5.12.0-rc5-00060-g4c95131e3cb8 #1
[ 47.860317] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[ 47.860445] RIP: 0010:__bio_put (kbuild/src/consumer/block/bio.c:628 kbuild/src/consumer/block/bio.c:623)
[ 47.860514] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 47 14 40 74 0f 8b 47 64 85 c0 74 0e f0 ff 4f 64 0f 94 c0 c3 b8 01 00 00 00 c3 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00
All code
========
0: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
7: 00 00 00
a: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
f: f6 47 14 40 testb $0x40,0x14(%rdi)
13: 74 0f je 0x24
15: 8b 47 64 mov 0x64(%rdi),%eax
18: 85 c0 test %eax,%eax
1a: 74 0e je 0x2a
1c: f0 ff 4f 64 lock decl 0x64(%rdi)
20: 0f 94 c0 sete %al
23: c3 retq
24: b8 01 00 00 00 mov $0x1,%eax
29: c3 retq
2a:* 0f 0b ud2 <-- trapping instruction
2c: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
33: 00 00 00 00
37: 66 data16
38: 66 data16
39: 2e cs
3a: 0f .byte 0xf
3b: 1f (bad)
3c: 84 00 test %al,(%rax)
...

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
9: 00 00 00 00
d: 66 data16
e: 66 data16
f: 2e cs
10: 0f .byte 0xf
11: 1f (bad)
12: 84 00 test %al,(%rax)
...
[ 47.860801] RSP: 0018:ffffc9000d42fc88 EFLAGS: 00010246
[ 47.860885] RAX: 0000000000000000 RBX: ffffc9000d42fe80 RCX: 0000000000000002
[ 47.860998] RDX: ffff88841a9a4f00 RSI: ffff88841a9a4f00 RDI: ffff88841b721318
[ 47.861109] RBP: ffff88841b721318 R08: ffff88841b721318 R09: 000000000002ab30
[ 47.861221] R10: 00000000000002c8 R11: ffff88841f36a144 R12: ffff88841b721318
[ 47.861352] R13: ffffffff81395875 R14: 00000000ffffffff R15: ffff88841a9a4f00
[ 47.861464] FS: 00007f5c66517280(0000) GS:ffff88841f340000(0000) knlGS:0000000000000000
[ 47.861590] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 47.861681] CR2: 000055d13280c000 CR3: 0000000414e82006 CR4: 00000000001706e0
[ 47.861793] Call Trace:
[ 47.861836] bio_put (kbuild/src/consumer/block/bio.c:646)
[ 47.861888] __blkdev_direct_IO (kbuild/src/consumer/fs/block_dev.c:508)
[ 47.861959] generic_file_direct_write (kbuild/src/consumer/mm/filemap.c:3473)
[ 47.862037] __generic_file_write_iter (kbuild/src/consumer/mm/filemap.c:3661)
[ 47.862113] blkdev_write_iter (kbuild/src/consumer/fs/block_dev.c:1719 kbuild/src/consumer/fs/block_dev.c:1693)
[ 47.862181] new_sync_write (kbuild/src/consumer/fs/read_write.c:519 (discriminator 1))
[ 47.862265] vfs_write (kbuild/src/consumer/fs/read_write.c:605)
[ 47.862322] ksys_pwrite64 (kbuild/src/consumer/include/linux/file.h:45 kbuild/src/consumer/fs/read_write.c:713)
[ 47.862381] do_syscall_64 (kbuild/src/consumer/arch/x86/entry/common.c:46)
[ 47.862443] entry_SYSCALL_64_after_hwframe (kbuild/src/consumer/arch/x86/entry/entry_64.S:112)
[ 47.862526] RIP: 0033:0x7f5c66692a37
[ 47.862586] Code: ff ff eb b6 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 c9 7c 0d 00 49 89 ca 8b 00 85 c0 75 10 b8 12 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 41 55 49 89 cd 41 54 49 89 d4 55 48 89
All code
========
0: ff (bad)
1: ff (bad)
2: eb b6 jmp 0xffffffffffffffba
4: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
b: 00 00 00
e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: 48 8d 05 c9 7c 0d 00 lea 0xd7cc9(%rip),%rax # 0xd7ce3
1a: 49 89 ca mov %rcx,%r10
1d: 8b 00 mov (%rax),%eax
1f: 85 c0 test %eax,%eax
21: 75 10 jne 0x33
23: b8 12 00 00 00 mov $0x12,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 59 ja 0x8b
32: c3 retq
33: 41 55 push %r13
35: 49 89 cd mov %rcx,%r13
38: 41 54 push %r12
3a: 49 89 d4 mov %rdx,%r12
3d: 55 push %rbp
3e: 48 rex.W
3f: 89 .byte 0x89

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 59 ja 0x61
8: c3 retq
9: 41 55 push %r13
b: 49 89 cd mov %rcx,%r13
e: 41 54 push %r12
10: 49 89 d4 mov %rdx,%r12
13: 55 push %rbp
14: 48 rex.W
15: 89 .byte 0x89
[ 47.862872] RSP: 002b:00007fff50508cd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
[ 47.862990] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c66692a37
[ 47.863102] RDX: 0000000000100000 RSI: 000055d13270d200 RDI: 0000000000000004
[ 47.863213] RBP: 0000000000100000 R08: 00007fff50508d54 R09: 0000000000000000
[ 47.863344] R10: 0000000000060000 R11: 0000000000000246 R12: 0000000000060000
[ 47.863456] R13: 0000000000000000 R14: 000055d130f9bbb0 R15: 000055d13270d200
[ 47.863568] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 ocfs2_nodemanager ocfs2_stackglue netconsole btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c intel_rapl_msr intel_rapl_common sd_mod t10_pi x86_pkg_temp_thermal intel_powerclamp sg i915 coretemp kvm_intel ipmi_devintf ipmi_msghandler intel_gtt drm_kms_helper kvm irqbypass crct10dif_pclmul crc32_pclmul syscopyarea crc32c_intel firewire_ohci ghash_clmulni_intel sysfillrect sysimgblt rapl fb_sys_fops ahci libahci intel_cstate mxm_wmi firewire_core libata crc_itu_t drm intel_uncore video wmi ip_tables
[ 47.864367] ---[ end trace 497211d09e7fac70 ]---
[ 47.864475] RIP: 0010:__bio_put (kbuild/src/consumer/block/bio.c:628 kbuild/src/consumer/block/bio.c:623)
[ 47.864548] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 47 14 40 74 0f 8b 47 64 85 c0 74 0e f0 ff 4f 64 0f 94 c0 c3 b8 01 00 00 00 c3 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00
All code
========
0: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
7: 00 00 00
a: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
f: f6 47 14 40 testb $0x40,0x14(%rdi)
13: 74 0f je 0x24
15: 8b 47 64 mov 0x64(%rdi),%eax
18: 85 c0 test %eax,%eax
1a: 74 0e je 0x2a
1c: f0 ff 4f 64 lock decl 0x64(%rdi)
20: 0f 94 c0 sete %al
23: c3 retq
24: b8 01 00 00 00 mov $0x1,%eax
29: c3 retq
2a:* 0f 0b ud2 <-- trapping instruction
2c: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
33: 00 00 00 00
37: 66 data16
38: 66 data16
39: 2e cs
3a: 0f .byte 0xf
3b: 1f (bad)
3c: 84 00 test %al,(%rax)
...

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
9: 00 00 00 00
d: 66 data16
e: 66 data16
f: 2e cs
10: 0f .byte 0xf
11: 1f (bad)
12: 84 00 test %al,(%rax)


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (9.33 kB)
config-5.12.0-rc5-00060-g4c95131e3cb8 (175.56 kB)
job-script (5.68 kB)
dmesg.xz (3.93 kB)
job.yaml (4.67 kB)
Download all attachments

2021-03-31 14:59:54

by Jens Axboe

[permalink] [raw]
Subject: Re: [block] 4c95131e3c: kernel_BUG_at_block/bio.c

On 3/31/21 2:13 AM, kernel test robot wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 4c95131e3cb8bbb08c3b370b4ae3611874b48be0 ("block: enable use of bio allocation cache")
> https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git io_uring-bio-cache
>
>
> in testcase: ocfs2test
> version: ocfs2test-x86_64-d802bf7-1_20210329
> with following parameters:
>
> disk: 1HDD
> test: test-reserve_space
> ucode: 0x21
>
>
>
> on test machine: 8 threads Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz with 16G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
> [ 47.859997] kernel BUG at block/bio.c:628!
> [ 47.860086] invalid opcode: 0000 [#1] SMP PTI
> [ 47.860167] CPU: 5 PID: 5042 Comm: mkfs.ocfs2 Not tainted 5.12.0-rc5-00060-g4c95131e3cb8 #1
> [ 47.860317] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
> [ 47.860445] RIP: 0010:__bio_put (kbuild/src/consumer/block/bio.c:628 kbuild/src/consumer/block/bio.c:623)
> [ 47.860514] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 47 14 40 74 0f 8b 47 64 85 c0 74 0e f0 ff 4f 64 0f 94 c0 c3 b8 01 00 00 00 c3 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00
> All code
> ========
> 0: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> 7: 00 00 00
> a: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> f: f6 47 14 40 testb $0x40,0x14(%rdi)
> 13: 74 0f je 0x24
> 15: 8b 47 64 mov 0x64(%rdi),%eax
> 18: 85 c0 test %eax,%eax
> 1a: 74 0e je 0x2a
> 1c: f0 ff 4f 64 lock decl 0x64(%rdi)
> 20: 0f 94 c0 sete %al
> 23: c3 retq
> 24: b8 01 00 00 00 mov $0x1,%eax
> 29: c3 retq
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
> 33: 00 00 00 00
> 37: 66 data16
> 38: 66 data16
> 39: 2e cs
> 3a: 0f .byte 0xf
> 3b: 1f (bad)
> 3c: 84 00 test %al,(%rax)
> ...
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
> 9: 00 00 00 00
> d: 66 data16
> e: 66 data16
> f: 2e cs
> 10: 0f .byte 0xf
> 11: 1f (bad)
> 12: 84 00 test %al,(%rax)
> ...
> [ 47.860801] RSP: 0018:ffffc9000d42fc88 EFLAGS: 00010246
> [ 47.860885] RAX: 0000000000000000 RBX: ffffc9000d42fe80 RCX: 0000000000000002
> [ 47.860998] RDX: ffff88841a9a4f00 RSI: ffff88841a9a4f00 RDI: ffff88841b721318
> [ 47.861109] RBP: ffff88841b721318 R08: ffff88841b721318 R09: 000000000002ab30
> [ 47.861221] R10: 00000000000002c8 R11: ffff88841f36a144 R12: ffff88841b721318
> [ 47.861352] R13: ffffffff81395875 R14: 00000000ffffffff R15: ffff88841a9a4f00
> [ 47.861464] FS: 00007f5c66517280(0000) GS:ffff88841f340000(0000) knlGS:0000000000000000
> [ 47.861590] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 47.861681] CR2: 000055d13280c000 CR3: 0000000414e82006 CR4: 00000000001706e0
> [ 47.861793] Call Trace:
> [ 47.861836] bio_put (kbuild/src/consumer/block/bio.c:646)
> [ 47.861888] __blkdev_direct_IO (kbuild/src/consumer/fs/block_dev.c:508)
> [ 47.861959] generic_file_direct_write (kbuild/src/consumer/mm/filemap.c:3473)
> [ 47.862037] __generic_file_write_iter (kbuild/src/consumer/mm/filemap.c:3661)
> [ 47.862113] blkdev_write_iter (kbuild/src/consumer/fs/block_dev.c:1719 kbuild/src/consumer/fs/block_dev.c:1693)
> [ 47.862181] new_sync_write (kbuild/src/consumer/fs/read_write.c:519 (discriminator 1))
> [ 47.862265] vfs_write (kbuild/src/consumer/fs/read_write.c:605)
> [ 47.862322] ksys_pwrite64 (kbuild/src/consumer/include/linux/file.h:45 kbuild/src/consumer/fs/read_write.c:713)
> [ 47.862381] do_syscall_64 (kbuild/src/consumer/arch/x86/entry/common.c:46)
> [ 47.862443] entry_SYSCALL_64_after_hwframe (kbuild/src/consumer/arch/x86/entry/entry_64.S:112)
> [ 47.862526] RIP: 0033:0x7f5c66692a37
> [ 47.862586] Code: ff ff eb b6 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 c9 7c 0d 00 49 89 ca 8b 00 85 c0 75 10 b8 12 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 c3 41 55 49 89 cd 41 54 49 89 d4 55 48 89

Thanks for the report, I've fixed this up in the current branch and
pushed a new one out.

--
Jens Axboe