2017-07-29 15:35:40

by Oleksandr Natalenko

[permalink] [raw]
Subject: blk-mq breaks suspend even with runtime PM patch

Hello Jens, Christoph.

Unfortunately, even with the "block: disable runtime-pm for blk-mq" patch applied,
blk-mq still breaks suspend to RAM for me. It is reproducible both on my laptop
and in a VM.

I use a complex disk layout involving MD, LUKS and LVM, and managed to capture
these warnings from the VM via the serial console when suspend fails:

===
[ 245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
[ 245.520025] Not tainted 4.12.0-pf4 #1
[ 245.521836] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 245.525612] kworker/0:1 D 0 49 2 0x00000000
[ 245.527515] Workqueue: events vmstat_shepherd
[ 245.528685] Call Trace:
[ 245.529296] __schedule+0x459/0xe40
[ 245.530115] ? kvm_clock_read+0x25/0x40
[ 245.531003] ? ktime_get+0x40/0xa0
[ 245.531819] schedule+0x3d/0xb0
[ 245.532542] ? schedule+0x3d/0xb0
[ 245.533299] schedule_preempt_disabled+0x15/0x20
[ 245.534367] __mutex_lock.isra.5+0x295/0x530
[ 245.535351] __mutex_lock_slowpath+0x13/0x20
[ 245.536362] ? __mutex_lock_slowpath+0x13/0x20
[ 245.537334] mutex_lock+0x25/0x30
[ 245.538118] get_online_cpus.part.14+0x15/0x30
[ 245.539588] get_online_cpus+0x20/0x30
[ 245.540560] vmstat_shepherd+0x21/0xc0
[ 245.541538] process_one_work+0x1de/0x430
[ 245.542364] worker_thread+0x47/0x3f0
[ 245.543042] kthread+0x125/0x140
[ 245.543649] ? process_one_work+0x430/0x430
[ 245.544417] ? kthread_create_on_node+0x70/0x70
[ 245.545737] ret_from_fork+0x25/0x30
[ 245.546490] INFO: task md0_raid10:459 blocked for more than 120 seconds.
[ 245.547668] Not tainted 4.12.0-pf4 #1
[ 245.548769] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 245.550133] md0_raid10 D 0 459 2 0x00000000
[ 245.551092] Call Trace:
[ 245.551539] __schedule+0x459/0xe40
[ 245.552163] schedule+0x3d/0xb0
[ 245.552728] ? schedule+0x3d/0xb0
[ 245.553344] md_super_wait+0x6e/0xa0 [md_mod]
[ 245.554118] ? wake_bit_function+0x60/0x60
[ 245.554854] md_update_sb.part.60+0x3df/0x840 [md_mod]
[ 245.555771] md_check_recovery+0x215/0x4b0 [md_mod]
[ 245.556732] raid10d+0x62/0x13c0 [raid10]
[ 245.557456] ? schedule+0x3d/0xb0
[ 245.558169] ? schedule+0x3d/0xb0
[ 245.558803] ? schedule_timeout+0x21f/0x330
[ 245.559593] md_thread+0x120/0x160 [md_mod]
[ 245.560380] ? md_thread+0x120/0x160 [md_mod]
[ 245.561202] ? wake_bit_function+0x60/0x60
[ 245.561975] kthread+0x125/0x140
[ 245.562601] ? find_pers+0x70/0x70 [md_mod]
[ 245.563394] ? kthread_create_on_node+0x70/0x70
[ 245.564516] ret_from_fork+0x25/0x30
[ 245.565669] INFO: task dmcrypt_write:487 blocked for more than 120 seconds.
[ 245.567151] Not tainted 4.12.0-pf4 #1
[ 245.567946] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 245.569390] dmcrypt_write D 0 487 2 0x00000000
[ 245.570352] Call Trace:
[ 245.570801] __schedule+0x459/0xe40
[ 245.571601] ? preempt_schedule_common+0x1f/0x30
[ 245.572421] ? ___preempt_schedule+0x16/0x18
[ 245.573168] schedule+0x3d/0xb0
[ 245.573733] ? schedule+0x3d/0xb0
[ 245.574378] md_write_start+0xe3/0x270 [md_mod]
[ 245.575180] ? wake_bit_function+0x60/0x60
[ 245.575915] raid10_make_request+0x3f/0x140 [raid10]
[ 245.576827] md_make_request+0xa9/0x2a0 [md_mod]
[ 245.577659] generic_make_request+0x11e/0x2f0
[ 245.578464] dmcrypt_write+0x22d/0x250 [dm_crypt]
[ 245.579288] ? dmcrypt_write+0x22d/0x250 [dm_crypt]
[ 245.580183] ? wake_up_process+0x20/0x20
[ 245.580875] kthread+0x125/0x140
[ 245.581455] ? kthread+0x125/0x140
[ 245.582064] ? crypt_iv_essiv_dtr+0x70/0x70 [dm_crypt]
[ 245.582971] ? kthread_create_on_node+0x70/0x70
[ 245.583772] ret_from_fork+0x25/0x30
[ 245.584424] INFO: task btrfs-transacti:553 blocked for more than 120
seconds.
[ 245.585830] Not tainted 4.12.0-pf4 #1
[ 245.586596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 245.587947] btrfs-transacti D 0 553 2 0x00000000
[ 245.588922] Call Trace:
[ 245.589377] __schedule+0x459/0xe40
[ 245.590062] ? btrfs_submit_bio_hook+0x8c/0x180 [btrfs]
[ 245.590976] schedule+0x3d/0xb0
[ 245.591538] ? schedule+0x3d/0xb0
[ 245.592175] io_schedule+0x16/0x40
[ 245.592794] wait_on_page_bit_common+0xe3/0x180
[ 245.593589] ? page_cache_tree_insert+0xc0/0xc0
[ 245.594387] __filemap_fdatawait_range+0x12a/0x1a0
[ 245.595229] filemap_fdatawait_range+0x14/0x30
[ 245.596033] btrfs_wait_ordered_range+0x6b/0x110 [btrfs]
[ 245.596998] ? _raw_spin_unlock+0x10/0x30
[ 245.597735] __btrfs_wait_cache_io+0x45/0x1c0 [btrfs]
[ 245.599091] btrfs_wait_cache_io+0x2c/0x30 [btrfs]
[ 245.600789] btrfs_write_dirty_block_groups+0x206/0x390 [btrfs]
[ 245.602068] commit_cowonly_roots+0x221/0x2c0 [btrfs]
[ 245.602981] btrfs_commit_transaction+0x3c4/0x900 [btrfs]
[ 245.603940] transaction_kthread+0x190/0x1c0 [btrfs]
[ 245.604806] kthread+0x125/0x140
[ 245.605398] ? btrfs_cleanup_transaction+0x500/0x500 [btrfs]
[ 245.606414] ? kthread_create_on_node+0x70/0x70
[ 245.607213] ret_from_fork+0x25/0x30
[ 245.607846] INFO: task systemd-sleep:677 blocked for more than 120 seconds.
[ 245.609061] Not tainted 4.12.0-pf4 #1
[ 245.609858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 245.611202] systemd-sleep D 0 677 1 0x00000000
[ 245.612160] Call Trace:
[ 245.612604] __schedule+0x459/0xe40
[ 245.613231] ? blk_mq_queue_reinit_work+0x100/0x100
[ 245.614080] schedule+0x3d/0xb0
[ 245.614637] ? schedule+0x3d/0xb0
[ 245.615222] blk_mq_freeze_queue_wait+0x46/0xb0
[ 245.616015] ? wake_bit_function+0x60/0x60
[ 245.616778] blk_mq_queue_reinit_work+0x6e/0x100
[ 245.617592] blk_mq_queue_reinit_dead+0x3d/0x50
[ 245.618420] ? __flow_cache_shrink+0x170/0x170
[ 245.619199] cpuhp_invoke_callback+0x84/0x430
[ 245.619965] ? cpuhp_invoke_callback+0x84/0x430
[ 245.620754] ? __flow_cache_shrink+0x170/0x170
[ 245.621530] cpuhp_down_callbacks+0x42/0x80
[ 245.622267] _cpu_down+0xc5/0x100
[ 245.622856] freeze_secondary_cpus+0x9e/0x230
[ 245.623629] suspend_devices_and_enter+0x2e6/0x800
[ 245.624471] pm_suspend+0x349/0x3c0
[ 245.625090] state_store+0x49/0xa0
[ 245.625699] kobj_attr_store+0xf/0x20
[ 245.626379] sysfs_kf_write+0x37/0x40
[ 245.627038] kernfs_fop_write+0x11c/0x1a0
[ 245.627800] __vfs_write+0x37/0x140
[ 245.628448] ? handle_mm_fault+0xde/0x220
[ 245.629155] vfs_write+0xb1/0x1a0
[ 245.629745] SyS_write+0x55/0xc0
[ 245.630337] ? trace_do_page_fault+0x37/0xf0
[ 245.631088] entry_SYSCALL_64_fastpath+0x1a/0xa5
[ 245.632109] RIP: 0033:0x7f810459dbf0
[ 245.632821] RSP: 002b:00007ffec9ba2e58 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[ 245.634127] RAX: ffffffffffffffda RBX: 00000000000000c6 RCX:
00007f810459dbf0
[ 245.635353] RDX: 0000000000000004 RSI: 000000b716c73390 RDI:
0000000000000004
[ 245.636778] RBP: 000000000000270f R08: 000000b716c73240 R09:
00007f8104a718c0
[ 245.638254] R10: 00007f810485ead8 R11: 0000000000000246 R12:
00007f810485ead8
[ 245.639500] R13: 0000000000001010 R14: 000000b716c73380 R15:
00007f810485ea80
===

With blk-mq disabled, everything works okay.
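For completeness, "blk-mq disabled" here means booting with the multiqueue paths
turned off; the fragment below is a sketch of the relevant kernel command line,
assuming the stock module parameters of this kernel generation:

```
# kernel command line (e.g. set in the bootloader); 0 = legacy request path
scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0
```

The active value can be checked at runtime under
/sys/module/scsi_mod/parameters/use_blk_mq.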

How can I help you solve this issue?


2017-07-29 21:17:16

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

Recompiling the kernel with lockdep enabled gives me this:

===
[ 368.655051] Showing all locks held in the system:
[ 368.656387] 1 lock held by khungtaskd/37:
[ 368.657171] #0: (tasklist_lock){.+.+..}, at: [<ffffffffa90c801d>]
debug_show_all_locks+0x3d/0x1a0
[ 368.658725] 1 lock held by md0_raid10/458:
[ 368.659455] #0: (&mddev->reconfig_mutex){+.+.+.}, at:
[<ffffffffc036790f>] md_check_recovery+0xaf/0x4d0 [md_mod]
[ 368.661403] 3 locks held by btrfs-transacti/550:
[ 368.662754] #0: (&fs_info->transaction_kthread_mutex){+.+...}, at:
[<ffffffffc03eeb79>] transaction_kthread+0x69/0x1c0 [btrfs]
[ 368.664797] #1: (&fs_info->reloc_mutex){+.+...}, at: [<ffffffffc03f4571>]
btrfs_commit_transaction+0x2e1/0x9b0 [btrfs]
[ 368.666669] #2: (&fs_info->tree_log_mutex){+.+...}, at:
[<ffffffffc03f45e1>] btrfs_commit_transaction+0x351/0x9b0 [btrfs]
[ 368.668644] 4 locks held by kworker/0:2/888:
[ 368.669384] #0: ("events"){.+.+.+}, at: [<ffffffffa90abd0b>]
process_one_work+0x1fb/0x6e0
[ 368.670916] #1: ((shepherd).work){+.+...}, at: [<ffffffffa90abd0b>]
process_one_work+0x1fb/0x6e0
[ 368.672592] #2: (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908c245>]
get_online_cpus.part.14+0x5/0x50
[ 368.674742] #3: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908c27a>]
get_online_cpus.part.14+0x3a/0x50
[ 368.677494] 10 locks held by systemd-sleep/889:
[ 368.678650] #0: (sb_writers#5){.+.+.+}, at: [<ffffffffa924779b>]
vfs_write+0x17b/0x1a0
[ 368.680483] #1: (&of->mutex){+.+.+.}, at: [<ffffffffa92db703>]
kernfs_fop_write+0x123/0x1e0
[ 368.682412] #2: (s_active#257){.+.+.+}, at: [<ffffffffa92db70c>]
kernfs_fop_write+0x12c/0x1e0
[ 368.684440] #3: (autosleep_lock){+.+.+.}, at: [<ffffffffa90da3b7>]
pm_autosleep_lock+0x17/0x20
[ 368.686707] #4: (pm_mutex){+.+.+.}, at: [<ffffffffa90d28f8>] pm_suspend
+0x88/0x490
[ 368.688086] #5: (acpi_scan_lock){+.+.+.}, at: [<ffffffffa941c907>]
acpi_scan_lock_acquire+0x17/0x20
[ 368.690213] #6: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffffa908e160>]
freeze_secondary_cpus+0x30/0x3c0
[ 368.692016] #7: (cpu_hotplug.dep_map){++++++}, at: [<ffffffffa908da75>]
cpu_hotplug_begin+0x5/0xe0
[ 368.694347] #8: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffa908daf3>]
cpu_hotplug_begin+0x83/0xe0
[ 368.696010] #9: (all_q_mutex){+.+...}, at: [<ffffffffa933e05a>]
blk_mq_queue_reinit_work+0x1a/0x110
[ 368.698624]
[ 368.698990] =============================================
[ 368.698990]
===

Deadlock with CPU hotplug?
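To make the suspected stall concrete, here is a toy userspace model (plain
Python, not kernel code; which kernel actor maps to which piece is my
assumption): the suspend path freezes the queue and waits for the in-flight
counter to drain, but the completion that would drain it was routed to a
per-CPU context that went away when the secondary CPUs were taken offline.

```python
import threading

in_flight = 1                     # one outstanding request on the queue
drained = threading.Event()       # set once all in-flight I/O has finished

def complete_request():
    """Completion handler that would let the freeze finish."""
    global in_flight
    in_flight -= 1
    if in_flight == 0:
        drained.set()

# The completion is queued onto a per-CPU context...
offline_cpu_queue = [complete_request]
# ...but that context is torn down by CPU hotplug before it runs,
# so nobody ever invokes the queued completion.
offline_cpu_queue.clear()

def freeze_queue_wait():
    # Analogue of blk_mq_freeze_queue_wait(): block until the queue drains.
    # Times out here instead of hanging forever like the real suspend did.
    return drained.wait(timeout=1.0)

print("freeze completed:", freeze_queue_wait())  # freeze completed: False
```

Under this model, 4b855ad37194f ("blk-mq: Create hctx for each present CPU")
amounts to making sure requests are never routed to a context that disappears
on hotplug.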

On sobota 29. července 2017 17:27:41 CEST Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
>
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch
> applied blk-mq breaks suspend to RAM for me. It is reproducible on my
> laptop as well as in a VM.
>
> I use complex disk layout involving MD, LUKS and LVM, and managed to get
> these warnings from VM via serial console when suspend fails:
>
> === [lengthy call traces snipped; identical to those in the original message above] ===
>
> With blk-mq disabled everything works okay.
>
> How can I help you in solving this issue?



2017-07-30 05:12:39

by Mike Galbraith

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Sat, 2017-07-29 at 17:27 +0200, Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
>
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch applied
> blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as well
> as in a VM.
>
> I use complex disk layout involving MD, LUKS and LVM, and managed to get these
> warnings from VM via serial console when suspend fails:
>
> ===
> [ 245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [ 245.520025] Not tainted 4.12.0-pf4 #1

FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
stable fixed it.  If not, I'd find these two commits irresistible.

5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
4b855ad37194f blk-mq: Create hctx for each present CPU

'course applying random upstream bits does come with some risk; trying
a kernel already containing them has less "entertainment" potential.

-Mike

2017-07-30 13:50:20

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

Hello Mike et al.

On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> stable fixed it.

My build already includes v4.12.4.

> If not, I'd find these two commits irresistible.
>
> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> 4b855ad37194f blk-mq: Create hctx for each present CPU

I've applied these 2 commits, and cannot reproduce the issue anymore. Looks
like a perfect hit, thanks!

> 'course applying random upstream bits does come with some risk, trying
> a kernel already containing them has less "entertainment" potential.

Could you consider applying them to the v4.12.x stable series? CC'ing Greg just
in case.

2017-08-08 16:22:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> Hello Mike et al.
>
> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > stable fixed it.
>
> My build already includes v4.12.4.
>
> > If not, I'd find these two commits irresistible.
> >
> > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > 4b855ad37194f blk-mq: Create hctx for each present CPU
>
> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks
> like a perfect hit, thanks!
>
> > 'course applying random upstream bits does come with some risk, trying
> > a kernel already containing them has less "entertainment" potential.
>
> Should you consider applying them to v4.12.x stable series? CC'ing Greg just
> in case.

I can queue these up if I get an ack from the developers/maintainers
that it is ok to do so...

{hint}

thanks,

greg k-h

2017-08-08 16:33:34

by Jens Axboe

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On 08/08/2017 10:22 AM, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
>> Hello Mike et al.
>>
>> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
>>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
>>> stable fixed it.
>>
>> My build already includes v4.12.4.
>>
>>> If not, I'd find these two commits irresistible.
>>>
>>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
>>> 4b855ad37194f blk-mq: Create hctx for each present CPU
>>
>> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks
>> like a perfect hit, thanks!
>>
>>> 'course applying random upstream bits does come with some risk, trying
>>> a kernel already containing them has less "entertainment" potential.
>>
>> Should you consider applying them to v4.12.x stable series? CC'ing Greg just
>> in case.
>
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...

You can add those two commits to stable.

--
Jens Axboe

2017-08-08 16:34:12

by Mike Galbraith

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > Hello Mike et al.
> >
> > On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > stable fixed it.
> >
> > My build already includes v4.12.4.
> >
> > > If not, I'd find these two commits irresistible.
> > >
> > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> >
> > I've applied these 2 commits, and cannot reproduce the issue anymore. Looks
> > like a perfect hit, thanks!
> >
> > > 'course applying random upstream bits does come with some risk, trying
> > > a kernel already containing them has less "entertainment" potential.
> >
> > Should you consider applying them to v4.12.x stable series? CC'ing Greg just
> > in case.
>
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...
>
> {hint}

{hint++}

Those commits take Steven Rostedt's hotplug stress script runtime down
from 4 _minutes_ to 7 seconds for my RT tree, so I'm rather hoping
you hear an "ACK" too.

-Mike

2017-08-08 16:36:05

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

Could you queue "block: disable runtime-pm for blk-mq" too, please? It is also
related to suspend/resume freezes that were observed by multiple users.

Thanks.

On úterý 8. srpna 2017 18:33:29 CEST Jens Axboe wrote:
> On 08/08/2017 10:22 AM, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> >> Hello Mike et al.
> >>
> >> On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> >>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> >>> stable fixed it.
> >>
> >> My build already includes v4.12.4.
> >>
> >>> If not, I'd find these two commits irresistible.
> >>>
> >>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue
> >>> mapping
> >>> 4b855ad37194f blk-mq: Create hctx for each present CPU
> >>
> >> I've applied these 2 commits, and cannot reproduce the issue anymore.
> >> Looks
> >> like a perfect hit, thanks!
> >>
> >>> 'course applying random upstream bits does come with some risk, trying
> >>> a kernel already containing them has less "entertainment" potential.
> >>
> >> Should you consider applying them to v4.12.x stable series? CC'ing Greg
> >> just in case.
> >
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
>
> You can add those two commits to stable.



2017-08-08 16:43:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> Could you queue "block: disable runtime-pm for blk-mq" too please? It is also
> related to suspend-resume freezes that were observed by multiple users.

What is the git commit id of that patch?

thanks,

greg k-h

2017-08-08 16:44:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Tue, Aug 08, 2017 at 06:34:01PM +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > > Hello Mike et al.
> > >
> > > On neděle 30. července 2017 7:12:31 CEST Mike Galbraith wrote:
> > > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > > stable fixed it.
> > >
> > > My build already includes v4.12.4.
> > >
> > > > If not, I'd find these two commits irresistible.
> > > >
> > > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> > >
> > > I've applied these 2 commits, and cannot reproduce the issue anymore. Looks
> > > like a perfect hit, thanks!
> > >
> > > > 'course applying random upstream bits does come with some risk, trying
> > > > a kernel already containing them has less "entertainment" potential.
> > >
> > > Should you consider applying them to v4.12.x stable series? CC'ing Greg just
> > > in case.
> >
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
> >
> > {hint}
>
> {hint++}
>
> Those commits take Steven Rostedt's hotplug stress script runtime down
> from 4 _minutes_ to 7 seconds for my RT tree, so I'm rather hoping
> you hear an "ACK" too.

Oh, nice!

Should these go back farther than 4.12? Looks like they apply cleanly
to 4.9, didn't look older than that...

thanks,

greg k-h

2017-08-08 16:46:14

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

Greg,

this is 765e40b675a9566459ddcb8358ad16f3b8344bbe.

On úterý 8. srpna 2017 18:43:33 CEST Greg KH wrote:
> On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> > Could you queue "block: disable runtime-pm for blk-mq" too please? It is
> > also related to suspend-resume freezes that were observed by multiple
> > users.
> What is the git commit id of that patch?
>
> thanks,
>
> greg k-h



2017-08-08 16:51:02

by Mike Galbraith

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
>
> Should these go back farther than 4.12? Looks like they apply cleanly
> to 4.9, didn't look older than that...

I met prerequisites at 4.11, but I wasn't patching anything remotely
resembling virgin source.

-Mike

2017-08-08 18:34:07

by Mike Galbraith

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> >
> > Should these go back farther than 4.12? Looks like they apply cleanly
> > to 4.9, didn't look older than that...
>
> I met prerequisites at 4.11...

FWIW, I took/modified 2d0364c8c1a9.  Dunno if the suspend regression
exists in 4.11 though, without which you'll likely want nothing.

-Mike

2017-08-08 18:35:18

by Jens Axboe

[permalink] [raw]
Subject: Re: blk-mq breaks suspend even with runtime PM patch

On 08/08/2017 12:33 PM, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
>> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
>>>
>>> Should these go back farther than 4.12? Looks like they apply cleanly
>>> to 4.9, didn't look older than that...
>>
>> I met prerequisites at 4.11...
>
> FWIW, I took/modified 2d0364c8c1a9. Dunno if the suspend regression
> exists in 4.11 though, without which you'll likely want nothing.

It does exist; the only change here is that we default people to
scsi-mq in 4.12+. Honestly, nobody has complained since we've had scsi-mq,
so I could go either way on whether we really need the changes in
earlier versions or not.

--
Jens Axboe