2018-11-07 03:24:53

by Chen, Rong A

[permalink] [raw]
Subject: [LKP] a9f38e1dec [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit a9f38e1dec107af70d81338332494bf0a1e76597
Author: Omar Sandoval <[email protected]>
AuthorDate: Mon Oct 15 09:21:34 2018 -0600
Commit: Jens Axboe <[email protected]>
CommitDate: Tue Oct 16 09:50:14 2018 -0600

floppy: convert to blk-mq

This driver likes to fetch requests from all over the place, so make
queue_rq put requests on a list so that the logic stays the same. Tested
with QEMU.

Signed-off-by: Omar Sandoval <[email protected]>

Converted to blk_mq_init_sq_queue() and fixed a few spots where the
tag_set leaked on cleanup.

Signed-off-by: Jens Axboe <[email protected]>

6ec3938cff ataflop: convert to blk-mq
a9f38e1dec floppy: convert to blk-mq
8053e5b93e Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
337734cbca Add linux-next specific files for 20181106
+---------------------------------------------------------+------------+------------+------------+---------------+
| | 6ec3938cff | a9f38e1dec | 8053e5b93e | next-20181106 |
+---------------------------------------------------------+------------+------------+------------+---------------+
| boot_successes | 251 | 68 | 69 | 74 |
| boot_failures | 48 | 27 | 25 | 23 |
| Mem-Info | 48 | 12 | 15 | 7 |
| invoked_oom-killer:gfp_mask=0x | 6 | 4 | 3 | 2 |
| EIP:iov_iter_fault_in_readable | 4 | 1 | | |
| Out_of_memory:Kill_process | 3 | 2 | 2 | |
| Out_of_memory_and_no_killable_processes | 3 | 2 | 2 | |
| Kernel_panic-not_syncing:System_is_deadlocked_on_memory | 3 | 2 | 2 | |
| EIP:__put_user_4 | 1 | | | |
| INFO:task_blocked_for_more_than#seconds | 0 | 15 | 10 | 16 |
| EIP:lock_is_held_type | 0 | 1 | 0 | 1 |
| EIP:native_safe_halt | 0 | 3 | | |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 15 | 10 | 16 |
| EIP:strnlen_user | 0 | 1 | | |
| EIP:_local_bh_enable | 0 | 1 | | |
| EIP:__lock_acquire | 0 | 1 | 1 | 1 |
| EIP:vprintk_emit | 0 | 1 | | |
| EIP:do_softirq_own_stack | 0 | 0 | 1 | |
| EIP:default_idle | 0 | 0 | 2 | 1 |
| EIP:reschedule_interrupt | 0 | 0 | 1 | |
| EIP:delay_tsc | 0 | 0 | 0 | 1 |
| EIP:copy_page_from_iter | 0 | 0 | 0 | 1 |
| EIP:_cond_resched | 0 | 0 | 0 | 1 |
+---------------------------------------------------------+------------+------------+------------+---------------+

[ 244.196551] random: get_random_u32 called from arch_align_stack+0x2d/0x40 with crng_init=1
[ 245.254485] _warn_unseeded_randomness: 54 callbacks suppressed
[ 245.254491] random: get_random_u32 called from copy_process+0x1c8/0x17b0 with crng_init=1
[ 245.257366] random: get_random_u32 called from arch_pick_mmap_layout+0x110/0x140 with crng_init=1
[ 245.258675] random: get_random_u32 called from load_elf_binary+0x2b6/0xb48 with crng_init=1
[ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
[ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
[ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 245.682181] mount D 6372 580 1 0x00000004
[ 245.683023] Call Trace:
[ 245.683425] __schedule+0x2df/0x570
[ 245.683975] schedule+0x2d/0x80
[ 245.684476] schedule_timeout+0x19d/0x330
[ 245.685090] ? wait_for_common+0xa5/0x170
[ 245.685735] wait_for_common+0xac/0x170
[ 245.686339] ? do_sched_yield+0x90/0x90
[ 245.686935] wait_for_completion+0x12/0x20
[ 245.687571] __floppy_read_block_0+0xfb/0x150
[ 245.688244] ? floppy_resume+0x40/0x40
[ 245.688844] floppy_revalidate+0x20f/0x240
[ 245.689486] check_disk_change+0x43/0x60
[ 245.690087] floppy_open+0x1ea/0x360
[ 245.690653] __blkdev_get+0xb4/0x4d0
[ 245.691212] ? blkdev_get+0x1db/0x370
[ 245.691777] blkdev_get+0x1f3/0x370
[ 245.692351] ? path_put+0x15/0x20
[ 245.692871] ? lookup_bdev+0x4b/0x90
[ 245.693539] blkdev_get_by_path+0x3d/0x80
[ 245.694165] mount_bdev+0x2a/0x190
[ 245.694695] squashfs_mount+0x10/0x20
[ 245.695271] ? squashfs_alloc_inode+0x30/0x30
[ 245.695960] mount_fs+0xf/0x90
[ 245.696451] vfs_kern_mount+0x43/0x130
[ 245.697036] do_mount+0x187/0xc40
[ 245.697563] ? memdup_user+0x28/0x50
[ 245.698124] ksys_mount+0x60/0xc0
[ 245.698639] sys_mount+0x19/0x20
[ 245.699167] do_int80_syscall_32+0x61/0x130
[ 245.699813] entry_INT80_32+0xc7/0xc7
[ 245.700391] EIP: 0x47f46fee
[ 245.700829] Code: 11 83 c8 ff eb e2 90 90 90 90 90 90 57 56 53 8b 7c 24 20 8b 74 24 1c 8b 54 24 18 8b 4c 24 14 8b 5c 24 10 b8 15 00 00 00 cd 80 <5b> 5e 5f 3d 01 f0 ff ff 73 01 c3 e8 e8 80 03 00 81 c1 f6 5f 09 00
[ 245.703680] EAX: ffffffda EBX: affa4efa ECX: affa4f03 EDX: 080d3630
[ 245.704642] ESI: 00008010 EDI: 00000000 EBP: 00000000 ESP: affa45c0
[ 245.705637] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
[ 245.706679]
[ 245.706679] Showing all locks held in the system:
[ 245.707636] 1 lock held by khungtaskd/22:
[ 245.708258] #0: (ptrval) (rcu_read_lock){....}, at: debug_show_all_locks+0x13/0x110
[ 245.709474] 3 locks held by mount/580:
[ 245.710056] #0: (ptrval) (&bdev->bd_mutex){....}, at: __blkdev_get+0x136/0x4d0
[ 245.711186] #1: (ptrval) (floppy_mutex){....}, at: floppy_open+0x26/0x360
[ 245.712265] #2: (ptrval) (open_lock){....}, at: floppy_open+0x32/0x360
[ 245.713289]
[ 245.713539] =============================================
[ 245.713539]
[ 245.714587] NMI backtrace for cpu 0
[ 245.715132] CPU: 0 PID: 22 Comm: khungtaskd Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
[ 245.716507] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 245.717763] Call Trace:
[ 245.718153] dump_stack+0x79/0xa6
[ 245.718663] nmi_cpu_backtrace+0x8c/0x90
[ 245.719269] nmi_trigger_cpumask_backtrace+0xaa/0xe0
[ 245.720019] ? irq_force_complete_move+0x80/0x80
[ 245.720731] arch_trigger_cpumask_backtrace+0x13/0x20
[ 245.721504] watchdog+0x2d9/0x3f0
[ 245.722010] kthread+0xdb/0x110
[ 245.722499] ? hung_task_panic+0x20/0x20
[ 245.723093] ? __kthread_create_on_node+0x140/0x140
[ 245.723842] ret_from_fork+0x19/0x30
[ 245.724419] Sending NMI from CPU 0 to CPUs 1:
[ 245.725109] NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0x5/0x10
[ 245.726094] Kernel panic - not syncing: hung_task: blocked tasks
[ 245.727154] CPU: 0 PID: 22 Comm: khungtaskd Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
[ 245.728537] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 245.729790] Call Trace:
[ 245.730185] dump_stack+0x79/0xa6
[ 245.730693] panic+0x98/0x1d0
[ 245.731166] watchdog+0x2e3/0x3f0
[ 245.731672] kthread+0xdb/0x110
[ 245.732158] ? hung_task_panic+0x20/0x20
[ 245.732755] ? __kthread_create_on_node+0x140/0x140
[ 245.733504] ret_from_fork+0x19/0x30
[ 245.734093] Kernel Offset: 0x5a00000 from 0xb1000000 (relocation range: 0xb0000000-0xc07d1fff)


# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 384a8a0d3699415091a01a3c7e7d40fe232bc32f 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d --
git bisect bad 08b6b9a0ecf1807ed20f24a3dd1284d707563ad1 # 22:15 B 29 2 20 20 Merge 'linux-review/robbieko/Btrfs-fix-cur_offset-in-the-error-case-for-nocow/20181030-181552' into devel-catchup-201810311741
git bisect bad bb451bbc477cc8aa24b61bcc50db56b126838e4b # 22:41 B 17 2 9 9 Merge 'tip/master' into devel-catchup-201810311741
git bisect bad f4e4b76f16e81d604c111e0fe740434fda6cfd08 # 22:57 B 21 1 8 8 Merge 'linux-review/Yuantian-Tang/clk-qoriq-add-more-chips-support/20181031-170545' into devel-catchup-201810311741
git bisect bad e7248e5b707330ae6d6e4c45061a74faf428faa7 # 23:19 B 22 1 15 16 Merge 'jpirko-mlxsw/net_queue' into devel-catchup-201810311741
git bisect good 1e68fdfc37c7dc0e53a5ff23888c6f7284d53511 # 23:57 G 157 0 97 97 0day base guard for 'devel-catchup-201810311741'
git bisect bad 62606c224d72a98c35d21a849f95cccf95b0a252 # 00:17 B 18 1 12 13 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect good 3f80e08f40cdb308589a49077c87632fa4508b21 # 00:58 G 161 0 96 96 tcp: add tcp_reset_xmit_timer() helper
git bisect bad 382d72a9aa525b56ab8453ce61751fa712414d3d # 01:12 B 33 2 18 19 Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad b0b6a28bc4b265aa56cbf4fa8fd27c0a4fa3a49c # 01:29 B 13 1 7 7 Merge tag 'pinctrl-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect bad a36cf6865120d7534fcb132d311f03e5159f2da7 # 01:45 B 5 1 4 4 Merge tag 'mtd/for-4.20' of git://git.infradead.org/linux-mtd
git bisect bad 52990a5fb0c991ecafebdab43138b5ed41376852 # 02:00 B 12 1 6 6 block: setup bounce bio_sets properly
git bisect good ae14cc044b5988148d819c377fd0cc1c7504bc3c # 03:29 G 155 0 90 90 lightnvm: pblk: refactor put line fn on read completion
git bisect good 9316a9ed6895c4ad2f0cde171d486f80c55d8283 # 04:11 G 157 0 86 86 blk-mq: provide helper for setting up an SQ queue and tag set
git bisect bad bb2a1d4e804aa41eef0003a192a674f844dbca23 # 04:26 B 31 1 19 19 nvme-core: rework a NQN copying operation
git bisect good 21b07f35544af5e2c11f079057e8fb4263d35dd3 # 05:20 G 159 0 92 92 amiflop: convert to blk-mq
git bisect bad 8535fd6f702a4b3e225258a86dcbf5dd93b73d51 # 05:41 B 38 1 19 19 z2ram: convert to blk-mq
git bisect good 6ec3938cff95fe792a05fdf4ba14f92977e5e31d # 06:40 G 174 0 106 106 ataflop: convert to blk-mq
git bisect bad ad5fc6bb72214615f300af1f4ed57f71bc3be510 # 07:02 B 25 2 16 17 gdrom: convert to blk-mq
git bisect bad a9f38e1dec107af70d81338332494bf0a1e76597 # 07:27 B 31 1 21 21 floppy: convert to blk-mq
# first bad commit: [a9f38e1dec107af70d81338332494bf0a1e76597] floppy: convert to blk-mq
git bisect good 6ec3938cff95fe792a05fdf4ba14f92977e5e31d # 08:44 G 498 0 250 357 ataflop: convert to blk-mq
# extra tests with debug options
git bisect bad a9f38e1dec107af70d81338332494bf0a1e76597 # 09:00 B 30 2 14 15 floppy: convert to blk-mq
# extra tests on HEAD of linux-devel/devel-catchup-201810311741
git bisect bad 384a8a0d3699415091a01a3c7e7d40fe232bc32f # 09:01 B 136 24 0 194 0day head guard for 'devel-catchup-201810311741'
# extra tests on tree/branch linus/master
git bisect bad 8053e5b93eca9b011f7b79bb019bf1eeaaf96c4b # 09:28 B 37 1 19 19 Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
# extra tests with first bad commit reverted
git bisect good e63db4a6aaac124f2fd7700827d4f2566d80d01e # 10:17 G 202 0 119 119 Revert "floppy: convert to blk-mq"
# extra tests on tree/branch linux-next/master
git bisect bad 337734cbca7453b1d57049bda1b9879adde2a5f8 # 10:41 B 17 1 11 12 Add linux-next specific files for 20181106

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation


Attachments:
(No filename) (12.94 kB)
dmesg-yocto-ivb41-5:20181107080728:i386-randconfig-h1-10311702:4.19.0-rc6-00172-ga9f38e1:1.gz (21.51 kB)
dmesg-yocto-ivb41-10:20181107061345:i386-randconfig-h1-10311702:4.19.0-rc6-00171-g6ec3938c:1.gz (42.62 kB)
reproduce-yocto-ivb41-5:20181107080728:i386-randconfig-h1-10311702:4.19.0-rc6-00172-ga9f38e1:1 (986.00 B)
config-4.19.0-rc6-00172-ga9f38e1 (120.38 kB)
Download all attachments

2018-11-07 08:23:03

by jianchao.wang

[permalink] [raw]
Subject: Re: [LKP] a9f38e1dec [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.



On 11/7/18 11:16 AM, kernel test robot wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git&d=DwIBaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=8qsZc_c8QfZOGoITiebvlt5BO9GBK61cAY4xLOCegFU&s=ZGRhkdRCPvSKvbZzx-HAQtWcDZfzce60e2TMvBQyORo&e= master
>
> commit a9f38e1dec107af70d81338332494bf0a1e76597
> Author: Omar Sandoval <[email protected]>
> AuthorDate: Mon Oct 15 09:21:34 2018 -0600
> Commit: Jens Axboe <[email protected]>
> CommitDate: Tue Oct 16 09:50:14 2018 -0600
>
> floppy: convert to blk-mq
>
> This driver likes to fetch requests from all over the place, so make
> queue_rq put requests on a list so that the logic stays the same. Tested
> with QEMU.
>
> Signed-off-by: Omar Sandoval <[email protected]>
>
> Converted to blk_mq_init_sq_queue() and fixed a few spots where the
> tag_set leaked on cleanup.
>
> Signed-off-by: Jens Axboe <[email protected]>
>
> 6ec3938cff ataflop: convert to blk-mq
> a9f38e1dec floppy: convert to blk-mq
> 8053e5b93e Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
> 337734cbca Add linux-next specific files for 20181106
> +---------------------------------------------------------+------------+------------+------------+---------------+
> | | 6ec3938cff | a9f38e1dec | 8053e5b93e | next-20181106 |
> +---------------------------------------------------------+------------+------------+------------+---------------+
> | boot_successes | 251 | 68 | 69 | 74 |
> | boot_failures | 48 | 27 | 25 | 23 |
> | Mem-Info | 48 | 12 | 15 | 7 |
> | invoked_oom-killer:gfp_mask=0x | 6 | 4 | 3 | 2 |
> | EIP:iov_iter_fault_in_readable | 4 | 1 | | |
> | Out_of_memory:Kill_process | 3 | 2 | 2 | |
> | Out_of_memory_and_no_killable_processes | 3 | 2 | 2 | |
> | Kernel_panic-not_syncing:System_is_deadlocked_on_memory | 3 | 2 | 2 | |
> | EIP:__put_user_4 | 1 | | | |
> | INFO:task_blocked_for_more_than#seconds | 0 | 15 | 10 | 16 |
> | EIP:lock_is_held_type | 0 | 1 | 0 | 1 |
> | EIP:native_safe_halt | 0 | 3 | | |
> | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 15 | 10 | 16 |
> | EIP:strnlen_user | 0 | 1 | | |
> | EIP:_local_bh_enable | 0 | 1 | | |
> | EIP:__lock_acquire | 0 | 1 | 1 | 1 |
> | EIP:vprintk_emit | 0 | 1 | | |
> | EIP:do_softirq_own_stack | 0 | 0 | 1 | |
> | EIP:default_idle | 0 | 0 | 2 | 1 |
> | EIP:reschedule_interrupt | 0 | 0 | 1 | |
> | EIP:delay_tsc | 0 | 0 | 0 | 1 |
> | EIP:copy_page_from_iter | 0 | 0 | 0 | 1 |
> | EIP:_cond_resched | 0 | 0 | 0 | 1 |
> +---------------------------------------------------------+------------+------------+------------+---------------+
>
> [ 244.196551] random: get_random_u32 called from arch_align_stack+0x2d/0x40 with crng_init=1
> [ 245.254485] _warn_unseeded_randomness: 54 callbacks suppressed
> [ 245.254491] random: get_random_u32 called from copy_process+0x1c8/0x17b0 with crng_init=1
> [ 245.257366] random: get_random_u32 called from arch_pick_mmap_layout+0x110/0x140 with crng_init=1
> [ 245.258675] random: get_random_u32 called from load_elf_binary+0x2b6/0xb48 with crng_init=1
> [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
> [ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
> [ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 245.682181] mount D 6372 580 1 0x00000004
> [ 245.683023] Call Trace:
> [ 245.683425] __schedule+0x2df/0x570
> [ 245.683975] schedule+0x2d/0x80
> [ 245.684476] schedule_timeout+0x19d/0x330
> [ 245.685090] ? wait_for_common+0xa5/0x170
> [ 245.685735] wait_for_common+0xac/0x170
> [ 245.686339] ? do_sched_yield+0x90/0x90
> [ 245.686935] wait_for_completion+0x12/0x20
> [ 245.687571] __floppy_read_block_0+0xfb/0x150
> [ 245.688244] ? floppy_resume+0x40/0x40
> [ 245.688844] floppy_revalidate+0x20f/0x240
> [ 245.689486] check_disk_change+0x43/0x60
> [ 245.690087] floppy_open+0x1ea/0x360
> [ 245.690653] __blkdev_get+0xb4/0x4d0
> [ 245.691212] ? blkdev_get+0x1db/0x370
> [ 245.691777] blkdev_get+0x1f3/0x370
> [ 245.692351] ? path_put+0x15/0x20
> [ 245.692871] ? lookup_bdev+0x4b/0x90
> [ 245.693539] blkdev_get_by_path+0x3d/0x80
> [ 245.694165] mount_bdev+0x2a/0x190
> [ 245.694695] squashfs_mount+0x10/0x20
> [ 245.695271] ? squashfs_alloc_inode+0x30/0x30
> [ 245.695960] mount_fs+0xf/0x90
> [ 245.696451] vfs_kern_mount+0x43/0x130
> [ 245.697036] do_mount+0x187/0xc40
> [ 245.697563] ? memdup_user+0x28/0x50
> [ 245.698124] ksys_mount+0x60/0xc0
> [ 245.698639] sys_mount+0x19/0x20
> [ 245.699167] do_int80_syscall_32+0x61/0x130
> [ 245.699813] entry_INT80_32+0xc7/0xc7
> [ 245.700391] EIP: 0x47f46fee
> [ 245.700829] Code: 11 83 c8 ff eb e2 90 90 90 90 90 90 57 56 53 8b 7c 24 20 8b 74 24 1c 8b 54 24 18 8b 4c 24 14 8b 5c 24 10 b8 15 00 00 00 cd 80 <5b> 5e 5f 3d 01 f0 ff ff 73 01 c3 e8 e8 80 03 00 81 c1 f6 5f 09 00
> [ 245.703680] EAX: ffffffda EBX: affa4efa ECX: affa4f03 EDX: 080d3630
> [ 245.704642] ESI: 00008010 EDI: 00000000 EBP: 00000000 ESP: affa45c0
> [ 245.705637] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282

There is not a q->queue_lock as blk-legacy to serialize floppy_queue_rq from different contexts.

Thanks
Jianchao

2018-11-07 15:18:40

by Jens Axboe

[permalink] [raw]
Subject: Re: [LKP] a9f38e1dec [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.

On 11/7/18 1:22 AM, jianchao.wang wrote:
>
>
> On 11/7/18 11:16 AM, kernel test robot wrote:
>> Greetings,
>>
>> 0day kernel testing robot got the below dmesg and the first bad commit is
>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git&d=DwIBaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=8qsZc_c8QfZOGoITiebvlt5BO9GBK61cAY4xLOCegFU&s=ZGRhkdRCPvSKvbZzx-HAQtWcDZfzce60e2TMvBQyORo&e= master
>>
>> commit a9f38e1dec107af70d81338332494bf0a1e76597
>> Author: Omar Sandoval <[email protected]>
>> AuthorDate: Mon Oct 15 09:21:34 2018 -0600
>> Commit: Jens Axboe <[email protected]>
>> CommitDate: Tue Oct 16 09:50:14 2018 -0600
>>
>> floppy: convert to blk-mq
>>
>> This driver likes to fetch requests from all over the place, so make
>> queue_rq put requests on a list so that the logic stays the same. Tested
>> with QEMU.
>>
>> Signed-off-by: Omar Sandoval <[email protected]>
>>
>> Converted to blk_mq_init_sq_queue() and fixed a few spots where the
>> tag_set leaked on cleanup.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>>
>> 6ec3938cff ataflop: convert to blk-mq
>> a9f38e1dec floppy: convert to blk-mq
>> 8053e5b93e Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>> 337734cbca Add linux-next specific files for 20181106
>> +---------------------------------------------------------+------------+------------+------------+---------------+
>> | | 6ec3938cff | a9f38e1dec | 8053e5b93e | next-20181106 |
>> +---------------------------------------------------------+------------+------------+------------+---------------+
>> | boot_successes | 251 | 68 | 69 | 74 |
>> | boot_failures | 48 | 27 | 25 | 23 |
>> | Mem-Info | 48 | 12 | 15 | 7 |
>> | invoked_oom-killer:gfp_mask=0x | 6 | 4 | 3 | 2 |
>> | EIP:iov_iter_fault_in_readable | 4 | 1 | | |
>> | Out_of_memory:Kill_process | 3 | 2 | 2 | |
>> | Out_of_memory_and_no_killable_processes | 3 | 2 | 2 | |
>> | Kernel_panic-not_syncing:System_is_deadlocked_on_memory | 3 | 2 | 2 | |
>> | EIP:__put_user_4 | 1 | | | |
>> | INFO:task_blocked_for_more_than#seconds | 0 | 15 | 10 | 16 |
>> | EIP:lock_is_held_type | 0 | 1 | 0 | 1 |
>> | EIP:native_safe_halt | 0 | 3 | | |
>> | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 15 | 10 | 16 |
>> | EIP:strnlen_user | 0 | 1 | | |
>> | EIP:_local_bh_enable | 0 | 1 | | |
>> | EIP:__lock_acquire | 0 | 1 | 1 | 1 |
>> | EIP:vprintk_emit | 0 | 1 | | |
>> | EIP:do_softirq_own_stack | 0 | 0 | 1 | |
>> | EIP:default_idle | 0 | 0 | 2 | 1 |
>> | EIP:reschedule_interrupt | 0 | 0 | 1 | |
>> | EIP:delay_tsc | 0 | 0 | 0 | 1 |
>> | EIP:copy_page_from_iter | 0 | 0 | 0 | 1 |
>> | EIP:_cond_resched | 0 | 0 | 0 | 1 |
>> +---------------------------------------------------------+------------+------------+------------+---------------+
>>
>> [ 244.196551] random: get_random_u32 called from arch_align_stack+0x2d/0x40 with crng_init=1
>> [ 245.254485] _warn_unseeded_randomness: 54 callbacks suppressed
>> [ 245.254491] random: get_random_u32 called from copy_process+0x1c8/0x17b0 with crng_init=1
>> [ 245.257366] random: get_random_u32 called from arch_pick_mmap_layout+0x110/0x140 with crng_init=1
>> [ 245.258675] random: get_random_u32 called from load_elf_binary+0x2b6/0xb48 with crng_init=1
>> [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
>> [ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
>> [ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 245.682181] mount D 6372 580 1 0x00000004
>> [ 245.683023] Call Trace:
>> [ 245.683425] __schedule+0x2df/0x570
>> [ 245.683975] schedule+0x2d/0x80
>> [ 245.684476] schedule_timeout+0x19d/0x330
>> [ 245.685090] ? wait_for_common+0xa5/0x170
>> [ 245.685735] wait_for_common+0xac/0x170
>> [ 245.686339] ? do_sched_yield+0x90/0x90
>> [ 245.686935] wait_for_completion+0x12/0x20
>> [ 245.687571] __floppy_read_block_0+0xfb/0x150
>> [ 245.688244] ? floppy_resume+0x40/0x40
>> [ 245.688844] floppy_revalidate+0x20f/0x240
>> [ 245.689486] check_disk_change+0x43/0x60
>> [ 245.690087] floppy_open+0x1ea/0x360
>> [ 245.690653] __blkdev_get+0xb4/0x4d0
>> [ 245.691212] ? blkdev_get+0x1db/0x370
>> [ 245.691777] blkdev_get+0x1f3/0x370
>> [ 245.692351] ? path_put+0x15/0x20
>> [ 245.692871] ? lookup_bdev+0x4b/0x90
>> [ 245.693539] blkdev_get_by_path+0x3d/0x80
>> [ 245.694165] mount_bdev+0x2a/0x190
>> [ 245.694695] squashfs_mount+0x10/0x20
>> [ 245.695271] ? squashfs_alloc_inode+0x30/0x30
>> [ 245.695960] mount_fs+0xf/0x90
>> [ 245.696451] vfs_kern_mount+0x43/0x130
>> [ 245.697036] do_mount+0x187/0xc40
>> [ 245.697563] ? memdup_user+0x28/0x50
>> [ 245.698124] ksys_mount+0x60/0xc0
>> [ 245.698639] sys_mount+0x19/0x20
>> [ 245.699167] do_int80_syscall_32+0x61/0x130
>> [ 245.699813] entry_INT80_32+0xc7/0xc7
>> [ 245.700391] EIP: 0x47f46fee
>> [ 245.700829] Code: 11 83 c8 ff eb e2 90 90 90 90 90 90 57 56 53 8b 7c 24 20 8b 74 24 1c 8b 54 24 18 8b 4c 24 14 8b 5c 24 10 b8 15 00 00 00 cd 80 <5b> 5e 5f 3d 01 f0 ff ff 73 01 c3 e8 e8 80 03 00 81 c1 f6 5f 09 00
>> [ 245.703680] EAX: ffffffda EBX: affa4efa ECX: affa4f03 EDX: 080d3630
>> [ 245.704642] ESI: 00008010 EDI: 00000000 EBP: 00000000 ESP: affa45c0
>> [ 245.705637] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
>
> There is not a q->queue_lock as blk-legacy to serialize floppy_queue_rq from different contexts.

Doesn't look immediately racy, the internal queue list is protected by
the floppy_lock, and the busy start is marked by atomic bit setting.
Do you see an issue with it, or are you just pontificating?

FWIW, just ran qemu with two floppy drives and concurrent access,
didn't see an issue.

--
Jens Axboe


2018-11-09 01:39:08

by jianchao.wang

[permalink] [raw]
Subject: Re: [LKP] a9f38e1dec [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.



On 11/7/18 11:17 PM, Jens Axboe wrote:
> On 11/7/18 1:22 AM, jianchao.wang wrote:
>>
>>
>> On 11/7/18 11:16 AM, kernel test robot wrote:
>>> Greetings,
>>>
>>> 0day kernel testing robot got the below dmesg and the first bad commit is
>>>
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git&d=DwIBaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=8qsZc_c8QfZOGoITiebvlt5BO9GBK61cAY4xLOCegFU&s=ZGRhkdRCPvSKvbZzx-HAQtWcDZfzce60e2TMvBQyORo&e= master
>>>
>>> commit a9f38e1dec107af70d81338332494bf0a1e76597
>>> Author: Omar Sandoval <[email protected]>
>>> AuthorDate: Mon Oct 15 09:21:34 2018 -0600
>>> Commit: Jens Axboe <[email protected]>
>>> CommitDate: Tue Oct 16 09:50:14 2018 -0600
>>>
>>> floppy: convert to blk-mq
>>>
>>> This driver likes to fetch requests from all over the place, so make
>>> queue_rq put requests on a list so that the logic stays the same. Tested
>>> with QEMU.
>>>
>>> Signed-off-by: Omar Sandoval <[email protected]>
>>>
>>> Converted to blk_mq_init_sq_queue() and fixed a few spots where the
>>> tag_set leaked on cleanup.
>>>
>>> Signed-off-by: Jens Axboe <[email protected]>
>>>
>>> 6ec3938cff ataflop: convert to blk-mq
>>> a9f38e1dec floppy: convert to blk-mq
>>> 8053e5b93e Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>> 337734cbca Add linux-next specific files for 20181106
>>> +---------------------------------------------------------+------------+------------+------------+---------------+
>>> | | 6ec3938cff | a9f38e1dec | 8053e5b93e | next-20181106 |
>>> +---------------------------------------------------------+------------+------------+------------+---------------+
>>> | boot_successes | 251 | 68 | 69 | 74 |
>>> | boot_failures | 48 | 27 | 25 | 23 |
>>> | Mem-Info | 48 | 12 | 15 | 7 |
>>> | invoked_oom-killer:gfp_mask=0x | 6 | 4 | 3 | 2 |
>>> | EIP:iov_iter_fault_in_readable | 4 | 1 | | |
>>> | Out_of_memory:Kill_process | 3 | 2 | 2 | |
>>> | Out_of_memory_and_no_killable_processes | 3 | 2 | 2 | |
>>> | Kernel_panic-not_syncing:System_is_deadlocked_on_memory | 3 | 2 | 2 | |
>>> | EIP:__put_user_4 | 1 | | | |
>>> | INFO:task_blocked_for_more_than#seconds | 0 | 15 | 10 | 16 |
>>> | EIP:lock_is_held_type | 0 | 1 | 0 | 1 |
>>> | EIP:native_safe_halt | 0 | 3 | | |
>>> | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 15 | 10 | 16 |
>>> | EIP:strnlen_user | 0 | 1 | | |
>>> | EIP:_local_bh_enable | 0 | 1 | | |
>>> | EIP:__lock_acquire | 0 | 1 | 1 | 1 |
>>> | EIP:vprintk_emit | 0 | 1 | | |
>>> | EIP:do_softirq_own_stack | 0 | 0 | 1 | |
>>> | EIP:default_idle | 0 | 0 | 2 | 1 |
>>> | EIP:reschedule_interrupt | 0 | 0 | 1 | |
>>> | EIP:delay_tsc | 0 | 0 | 0 | 1 |
>>> | EIP:copy_page_from_iter | 0 | 0 | 0 | 1 |
>>> | EIP:_cond_resched | 0 | 0 | 0 | 1 |
>>> +---------------------------------------------------------+------------+------------+------------+---------------+
>>>
>>> [ 244.196551] random: get_random_u32 called from arch_align_stack+0x2d/0x40 with crng_init=1
>>> [ 245.254485] _warn_unseeded_randomness: 54 callbacks suppressed
>>> [ 245.254491] random: get_random_u32 called from copy_process+0x1c8/0x17b0 with crng_init=1
>>> [ 245.257366] random: get_random_u32 called from arch_pick_mmap_layout+0x110/0x140 with crng_init=1
>>> [ 245.258675] random: get_random_u32 called from load_elf_binary+0x2b6/0xb48 with crng_init=1
>>> [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
>>> [ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
>>> [ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [ 245.682181] mount D 6372 580 1 0x00000004
>>> [ 245.683023] Call Trace:
>>> [ 245.683425] __schedule+0x2df/0x570
>>> [ 245.683975] schedule+0x2d/0x80
>>> [ 245.684476] schedule_timeout+0x19d/0x330
>>> [ 245.685090] ? wait_for_common+0xa5/0x170
>>> [ 245.685735] wait_for_common+0xac/0x170
>>> [ 245.686339] ? do_sched_yield+0x90/0x90
>>> [ 245.686935] wait_for_completion+0x12/0x20
>>> [ 245.687571] __floppy_read_block_0+0xfb/0x150
>>> [ 245.688244] ? floppy_resume+0x40/0x40
>>> [ 245.688844] floppy_revalidate+0x20f/0x240
>>> [ 245.689486] check_disk_change+0x43/0x60
>>> [ 245.690087] floppy_open+0x1ea/0x360
>>> [ 245.690653] __blkdev_get+0xb4/0x4d0
>>> [ 245.691212] ? blkdev_get+0x1db/0x370
>>> [ 245.691777] blkdev_get+0x1f3/0x370
>>> [ 245.692351] ? path_put+0x15/0x20
>>> [ 245.692871] ? lookup_bdev+0x4b/0x90
>>> [ 245.693539] blkdev_get_by_path+0x3d/0x80
>>> [ 245.694165] mount_bdev+0x2a/0x190
>>> [ 245.694695] squashfs_mount+0x10/0x20
>>> [ 245.695271] ? squashfs_alloc_inode+0x30/0x30
>>> [ 245.695960] mount_fs+0xf/0x90
>>> [ 245.696451] vfs_kern_mount+0x43/0x130
>>> [ 245.697036] do_mount+0x187/0xc40
>>> [ 245.697563] ? memdup_user+0x28/0x50
>>> [ 245.698124] ksys_mount+0x60/0xc0
>>> [ 245.698639] sys_mount+0x19/0x20
>>> [ 245.699167] do_int80_syscall_32+0x61/0x130
>>> [ 245.699813] entry_INT80_32+0xc7/0xc7
>>> [ 245.700391] EIP: 0x47f46fee
>>> [ 245.700829] Code: 11 83 c8 ff eb e2 90 90 90 90 90 90 57 56 53 8b 7c 24 20 8b 74 24 1c 8b 54 24 18 8b 4c 24 14 8b 5c 24 10 b8 15 00 00 00 cd 80 <5b> 5e 5f 3d 01 f0 ff ff 73 01 c3 e8 e8 80 03 00 81 c1 f6 5f 09 00
>>> [ 245.703680] EAX: ffffffda EBX: affa4efa ECX: affa4f03 EDX: 080d3630
>>> [ 245.704642] ESI: 00008010 EDI: 00000000 EBP: 00000000 ESP: affa45c0
>>> [ 245.705637] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
>>
>> There is not a q->queue_lock as blk-legacy to serialize floppy_queue_rq from different contexts.
>
> Doesn't look immediately racy, the internal queue list is protected by
> the floppy_lock, and the busy start is marked by atomic bit setting.

Oops, I missed the fdc_busy bit set.
I just looked at the patch and found the biggest change was this.
Sorry for the noise.

> Do you see an issue with it, or are you just pontificating?
>
> FWIW, just ran qemu with two floppy drives and concurrent access,
> didn't see an issue.
>

Thanks
Jianchao

2018-11-09 23:01:38

by Jens Axboe

[permalink] [raw]
Subject: Re: [LKP] a9f38e1dec [ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.

On 11/6/18 8:16 PM, kernel test robot wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> commit a9f38e1dec107af70d81338332494bf0a1e76597
> Author: Omar Sandoval <[email protected]>
> AuthorDate: Mon Oct 15 09:21:34 2018 -0600
> Commit: Jens Axboe <[email protected]>
> CommitDate: Tue Oct 16 09:50:14 2018 -0600
>
> floppy: convert to blk-mq
>
> This driver likes to fetch requests from all over the place, so make
> queue_rq put requests on a list so that the logic stays the same. Tested
> with QEMU.
>
> Signed-off-by: Omar Sandoval <[email protected]>
>
> Converted to blk_mq_init_sq_queue() and fixed a few spots where the
> tag_set leaked on cleanup.
>
> Signed-off-by: Jens Axboe <[email protected]>

I don't think this is the reason for the hang, I think it just exposed
it. I posted a patch to fix this, hopefully, it's included below as
well. Can you give it a try?


From fd85da23e51d8c055592c2fde5bc36d43eb29074 Mon Sep 17 00:00:00 2001
From: Jens Axboe <[email protected]>
Date: Fri, 9 Nov 2018 15:53:04 -0700
Subject: [PATCH] floppy: fix race condition in __floppy_read_block_0()

LKP recently reported a hang at bootup in the floppy code:

[ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
[ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
[ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 245.682181] mount D 6372 580 1 0x00000004
[ 245.683023] Call Trace:
[ 245.683425] __schedule+0x2df/0x570
[ 245.683975] schedule+0x2d/0x80
[ 245.684476] schedule_timeout+0x19d/0x330
[ 245.685090] ? wait_for_common+0xa5/0x170
[ 245.685735] wait_for_common+0xac/0x170
[ 245.686339] ? do_sched_yield+0x90/0x90
[ 245.686935] wait_for_completion+0x12/0x20
[ 245.687571] __floppy_read_block_0+0xfb/0x150
[ 245.688244] ? floppy_resume+0x40/0x40
[ 245.688844] floppy_revalidate+0x20f/0x240
[ 245.689486] check_disk_change+0x43/0x60
[ 245.690087] floppy_open+0x1ea/0x360
[ 245.690653] __blkdev_get+0xb4/0x4d0
[ 245.691212] ? blkdev_get+0x1db/0x370
[ 245.691777] blkdev_get+0x1f3/0x370
[ 245.692351] ? path_put+0x15/0x20
[ 245.692871] ? lookup_bdev+0x4b/0x90
[ 245.693539] blkdev_get_by_path+0x3d/0x80
[ 245.694165] mount_bdev+0x2a/0x190
[ 245.694695] squashfs_mount+0x10/0x20
[ 245.695271] ? squashfs_alloc_inode+0x30/0x30
[ 245.695960] mount_fs+0xf/0x90
[ 245.696451] vfs_kern_mount+0x43/0x130
[ 245.697036] do_mount+0x187/0xc40
[ 245.697563] ? memdup_user+0x28/0x50
[ 245.698124] ksys_mount+0x60/0xc0
[ 245.698639] sys_mount+0x19/0x20
[ 245.699167] do_int80_syscall_32+0x61/0x130
[ 245.699813] entry_INT80_32+0xc7/0xc7

showing that we never complete that read request. The reason is that
the completion setup is racy - it initializes the completion event
AFTER submitting the IO, which means that the IO could complete
before/during the init. If it does, we are passing garbage to
complete() and we may sleep forever waiting for the event to
occur.

Fixes: 7b7b68bba5ef ("floppy: bail out in open() if drive is not responding to block0 read")
Signed-off-by: Jens Axboe <[email protected]>
---
drivers/block/floppy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index a8cfa011c284..fb23578e9a41 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4148,10 +4148,11 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
bio.bi_end_io = floppy_rb0_cb;
bio_set_op_attrs(&bio, REQ_OP_READ, 0);

+ init_completion(&cbdata.complete);
+
submit_bio(&bio);
process_fd_request();

- init_completion(&cbdata.complete);
wait_for_completion(&cbdata.complete);

__free_page(page);
--
2.17.1


--
Jens Axboe