2013-08-30 00:29:56

by kernel neophyte

[permalink] [raw]
Subject: Bcache sleeps forever on random writes

We are evaluating bcache for use on our production systems, where the
caching devices are insanely fast. In this scenario, under a moderate load
of random 4k writes, bcache fails miserably :-(

[ 3588.513638] bcache: bch_cached_dev_attach() Caching sda4 as bcache0
on set b082ce66-04c6-43d5-8207-ebf39840191d
[ 4442.163661] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
[ 4442.163671] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.163678] kworker/0:0 D ffffffff81813d40 0 4 2 0x00000000
[ 4442.163695] Workqueue: bcache bch_data_insert_keys
[ 4442.163699] ffff882fa6ac93c8 0000000000000046 ffff882fa6ac93e8
0000000000000151
[ 4442.163705] ffff882fa6a84cb0 ffff882fa6ac9fd8 ffff882fa6ac9fd8
ffff882fa6ac9fd8
[ 4442.163711] ffff882fa6ad6640 ffff882fa6a84cb0 ffff882fa6a84cb0
ffff8822ca2c0d98
[ 4442.163716] Call Trace:
[ 4442.163729] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.163735] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 4442.163741] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 4442.163746] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 4442.163752] [<ffffffff815112e5>] bch_mca_shrink+0x1b5/0x2f0
[ 4442.163759] [<ffffffff8117fc32>] ? prune_super+0x162/0x1b0
[ 4442.163769] [<ffffffff8112ebb4>] shrink_slab+0x154/0x300
[ 4442.163776] [<ffffffff81076828>] ? resched_task+0x68/0x70
[ 4442.163782] [<ffffffff81077165>] ? check_preempt_curr+0x75/0xa0
[ 4442.163788] [<ffffffff8113a379>] ? fragmentation_index+0x19/0x70
[ 4442.163794] [<ffffffff8113140f>] do_try_to_free_pages+0x20f/0x4b0
[ 4442.163800] [<ffffffff81131864>] try_to_free_pages+0xe4/0x1a0
[ 4442.163810] [<ffffffff81126e9c>] __alloc_pages_nodemask+0x60c/0x9b0
[ 4442.163818] [<ffffffff8116062a>] alloc_pages_current+0xba/0x170
[ 4442.163824] [<ffffffff8112240e>] __get_free_pages+0xe/0x40
[ 4442.163829] [<ffffffff8150ebb3>] mca_data_alloc+0x73/0x1d0
[ 4442.163834] [<ffffffff8150ee5a>] mca_bucket_alloc+0x14a/0x1f0
[ 4442.163838] [<ffffffff81511020>] mca_alloc+0x360/0x470
[ 4442.163843] [<ffffffff81511d1c>] bch_btree_node_alloc+0x8c/0x1c0
[ 4442.163849] [<ffffffff81513020>] btree_split+0x110/0x5c0
[ 4442.163854] [<ffffffff81515fc7>] ? bch_keylist_pop_front+0x47/0x50
[ 4442.163859] [<ffffffff8150fed6>] ? bch_btree_insert_keys+0x56/0x250
[ 4442.163867] [<ffffffff81325abc>] ? cpumask_next_and+0x3c/0x50
[ 4442.163872] [<ffffffff81513582>] bch_btree_insert_node+0xb2/0x2f0
[ 4442.163877] [<ffffffff815137e8>] btree_insert_fn+0x28/0x50
[ 4442.163881] [<ffffffff81511b8c>] bch_btree_map_nodes_recurse+0x6c/0x170
[ 4442.163886] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 4442.163891] [<ffffffff816bcc26>] ? down_write+0x16/0x40
[ 4442.163896] [<ffffffff815117a1>] ? bch_btree_node_get+0x71/0x280
[ 4442.163901] [<ffffffff81511c30>] bch_btree_map_nodes_recurse+0x110/0x170
[ 4442.163905] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 4442.163915] [<ffffffff811b51ea>] ? dio_bio_end_io+0x5a/0x90
[ 4442.163921] [<ffffffff8107f991>] ? update_curr+0x141/0x1f0
[ 4442.163926] [<ffffffff81514dce>] __bch_btree_map_nodes+0x13e/0x1c0
[ 4442.163931] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 4442.163936] [<ffffffff81514f04>] bch_btree_insert+0xb4/0x120
[ 4442.163942] [<ffffffff8151e1be>] bch_data_insert_keys+0x3e/0x160
[ 4442.163949] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 4442.163954] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 4442.163959] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 4442.163965] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 4442.163970] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.163978] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 4442.163982] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.163994] INFO: task kswapd0:80 blocked for more than 120 seconds.
[ 4442.163998] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.164003] kswapd0 D 0000000000000001 0 80 2 0x00000000
[ 4442.164007] ffff882fa4b17ba8 0000000000000046 ffff882fa4b17bc8
ffff882fa60ff000
[ 4442.164013] ffff882fa593e640 ffff882fa4b17fd8 ffff882fa4b17fd8
ffff882fa4b17fd8
[ 4442.164018] ffff882f8a278000 ffff882fa593e640 ffff882fa6a84cb0
ffff8822ca2c0d98
[ 4442.164023] Call Trace:
[ 4442.164029] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.164034] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 4442.164039] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 4442.164044] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 4442.164049] [<ffffffff815112e5>] bch_mca_shrink+0x1b5/0x2f0
[ 4442.164054] [<ffffffff8117fc32>] ? prune_super+0x162/0x1b0
[ 4442.164059] [<ffffffff8112ebb4>] shrink_slab+0x154/0x300
[ 4442.164065] [<ffffffff81131f54>] kswapd+0x634/0x9b0
[ 4442.164071] [<ffffffff8106a720>] ? add_wait_queue+0x60/0x60
[ 4442.164076] [<ffffffff81131920>] ? try_to_free_pages+0x1a0/0x1a0
[ 4442.164080] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 4442.164085] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164090] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 4442.164094] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164101] INFO: task kworker/1:1:201 blocked for more than 120 seconds.
[ 4442.164105] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.164110] kworker/1:1 D ffffffff81813a60 0 201 2 0x00000000
[ 4442.164117] Workqueue: bcache bch_data_insert_keys
[ 4442.164119] ffff882f894c9be0 0000000000000046 0000000000000002
0000000000000002
[ 4442.164124] ffff882f89974cb0 ffff882f894c9fd8 ffff882f894c9fd8
ffff882f894c9fd8
[ 4442.164129] ffff882fa6ae8000 ffff882f89974cb0 0000000000000000
ffff882f89974cb0
[ 4442.164134] Call Trace:
[ 4442.164140] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.164145] [<ffffffff816bf0fd>] rwsem_down_read_failed+0x9d/0xe5
[ 4442.164152] [<ffffffff81332c64>] call_rwsem_down_read_failed+0x14/0x30
[ 4442.164157] [<ffffffff816bcc74>] ? down_read+0x24/0x2b
[ 4442.164162] [<ffffffff81514d75>] __bch_btree_map_nodes+0xe5/0x1c0
[ 4442.164166] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 4442.164171] [<ffffffff81514f04>] bch_btree_insert+0xb4/0x120
[ 4442.164177] [<ffffffff8151e1be>] bch_data_insert_keys+0x3e/0x160
[ 4442.164182] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 4442.164187] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 4442.164192] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 4442.164196] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 4442.164200] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164206] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 4442.164210] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164215] INFO: task kworker/u64:2:377 blocked for more than 120 seconds.
[ 4442.164219] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.164224] kworker/u64:2 D ffffffff81813a60 0 377 2 0x00000000
[ 4442.164231] Workqueue: bch_btree_io btree_node_write_work
[ 4442.164233] ffff882f87cbbcc8 0000000000000046 000003e146257be1
0029002f87cbbc98
[ 4442.164238] ffff882f88053320 ffff882f87cbbfd8 ffff882f87cbbfd8
ffff882f87cbbfd8
[ 4442.164243] ffff882fa6ae9990 ffff882f88053320 ffff882f87cbbd18
ffff882f88053320
[ 4442.164249] Call Trace:
[ 4442.164254] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.164259] [<ffffffff816befb5>] rwsem_down_write_failed+0xf5/0x1a0
[ 4442.164264] [<ffffffff8150f000>] ? __btree_node_write_done+0x100/0x100
[ 4442.164269] [<ffffffff81332c93>] call_rwsem_down_write_failed+0x13/0x20
[ 4442.164274] [<ffffffff816bcc41>] ? down_write+0x31/0x40
[ 4442.164279] [<ffffffff8151144f>] btree_node_write_work+0x2f/0x80
[ 4442.164283] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 4442.164288] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 4442.164293] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 4442.164297] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 4442.164302] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164307] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 4442.164311] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164325] INFO: task bcache_allocato:2256 blocked for more than
120 seconds.
[ 4442.164329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.164334] bcache_allocato D 0000000000000001 0 2256 2 0x00000000
[ 4442.164337] ffff881004e3dd88 0000000000000046 ffff881004e3dda8
ffffffff810808ad
[ 4442.164343] ffff882fa3d64cb0 ffff881004e3dfd8 ffff881004e3dfd8
ffff881004e3dfd8
[ 4442.164348] ffff882f89ea0000 ffff882fa3d64cb0 ffff882fa6a84cb0
ffff8822ca2c0d98
[ 4442.164353] Call Trace:
[ 4442.164358] [<ffffffff810808ad>] ? dequeue_task_fair+0x2cd/0x530
[ 4442.164363] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.164368] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 4442.164373] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 4442.164378] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 4442.164383] [<ffffffff8150cdbf>] bch_allocator_thread+0x10f/0xe20
[ 4442.164388] [<ffffffff8150ccb0>] ? bch_bucket_add_unused+0xe0/0xe0
[ 4442.164392] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 4442.164398] [<ffffffff811ad3c0>] ? end_buffer_async_read+0x130/0x130
[ 4442.164402] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164407] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 4442.164411] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 4442.164417] INFO: task iozone:2565 blocked for more than 120 seconds.
[ 4442.164421] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 4442.164426] iozone D 0000000000000001 0 2565 1660 0x00000000
[ 4442.164429] ffff882fa3ae1978 0000000000000086 ffff882fa3ae1938
ffffffff81301d7a
[ 4442.164435] ffff882f8a420000 ffff882fa3ae1fd8 ffff882fa3ae1fd8
ffff882fa3ae1fd8
[ 4442.164440] ffff882fa6a84cb0 ffff882f8a420000 ffff882fa3ae1978
ffff882fbf2139f8
[ 4442.164445] Call Trace:
[ 4442.164451] [<ffffffff81301d7a>] ? generic_make_request+0xca/0x100
[ 4442.164456] [<ffffffff816be299>] schedule+0x29/0x70
[ 4442.164461] [<ffffffff816be36f>] io_schedule+0x8f/0xd0
[ 4442.164467] [<ffffffff811b797c>] do_blockdev_direct_IO+0x1a7c/0x1fb0
[ 4442.164477] [<ffffffffa020ea80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 4442.164484] [<ffffffff811b7f05>] __blockdev_direct_IO+0x55/0x60
[ 4442.164490] [<ffffffffa020ea80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 4442.164497] [<ffffffffa020f349>] ext2_direct_IO+0x79/0xe0 [ext2]
[ 4442.164502] [<ffffffffa020ea80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 4442.164509] [<ffffffff8104ade6>] ? current_fs_time+0x16/0x60
[ 4442.164516] [<ffffffff8111f126>] generic_file_direct_write+0xc6/0x180
[ 4442.164521] [<ffffffff8111f4bd>] __generic_file_aio_write+0x2dd/0x3b0
[ 4442.164526] [<ffffffff8111f5f9>] generic_file_aio_write+0x69/0xd0
[ 4442.164532] [<ffffffff8117b88a>] do_sync_write+0x7a/0xb0
[ 4442.164537] [<ffffffff8117c63e>] vfs_write+0xce/0x1e0
[ 4442.164542] [<ffffffff8117cb22>] SyS_write+0x52/0xa0
[ 4442.164548] [<ffffffff816c7a02>] system_call_fastpath+0x16/0x1b


-Neo


2013-08-30 01:16:40

by kernel neophyte

[permalink] [raw]
Subject: Re: Bcache sleeps forever on random writes

Another one:

[ 3243.199791] kworker/u64:0 D ffffffff81813a60 0 1780 2 0x00000000
[ 3243.199806] Workqueue: bch_btree_io btree_node_write_work
[ 3243.199810] ffff882e15ed9778 0000000000000046 ffff882e15ed9738
ffff882f88757058
[ 3243.199816] ffff882f884cb320 ffff882e15ed9fd8 ffff882e15ed9fd8
ffff882e15ed9fd8
[ 3243.199821] ffff882fa6ae8000 ffff882f884cb320 ffff882f88757000
ffff882c9b560d98
[ 3243.199827] Call Trace:
[ 3243.199840] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.199845] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 3243.199851] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.199857] [<ffffffff81484090>] ? ata_scsiop_mode_sense+0x380/0x380
[ 3243.199862] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 3243.199867] [<ffffffff815112e5>] bch_mca_shrink+0x1b5/0x2f0
[ 3243.199874] [<ffffffff8117fc32>] ? prune_super+0x162/0x1b0
[ 3243.199884] [<ffffffff8112ebb4>] shrink_slab+0x154/0x300
[ 3243.199891] [<ffffffff81076828>] ? resched_task+0x68/0x70
[ 3243.199897] [<ffffffff81077165>] ? check_preempt_curr+0x75/0xa0
[ 3243.199903] [<ffffffff8113a379>] ? fragmentation_index+0x19/0x70
[ 3243.199910] [<ffffffff8113140f>] do_try_to_free_pages+0x20f/0x4b0
[ 3243.199915] [<ffffffff81131864>] try_to_free_pages+0xe4/0x1a0
[ 3243.199925] [<ffffffff81126e9c>] __alloc_pages_nodemask+0x60c/0x9b0
[ 3243.199934] [<ffffffff8116062a>] alloc_pages_current+0xba/0x170
[ 3243.199940] [<ffffffff8112240e>] __get_free_pages+0xe/0x40
[ 3243.199946] [<ffffffff81517fc8>] __btree_sort+0x48/0x230
[ 3243.199951] [<ffffffff8151765c>] ? __bch_btree_iter_init+0x7c/0xc0
[ 3243.199957] [<ffffffff81518301>] bch_btree_sort_partial+0x101/0x120
[ 3243.199962] [<ffffffff8150f000>] ? __btree_node_write_done+0x100/0x100
[ 3243.199967] [<ffffffff81518468>] bch_btree_sort_lazy+0x68/0x90
[ 3243.199971] [<ffffffff815109ba>] bch_btree_node_write+0x36a/0x4a0
[ 3243.199979] [<ffffffff8108458b>] ? idle_balance+0xeb/0x150
[ 3243.199986] [<ffffffff8106212c>] ? pwq_activate_delayed_work+0x4c/0xb0
[ 3243.199991] [<ffffffff81511477>] btree_node_write_work+0x57/0x80
[ 3243.199995] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 3243.200000] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 3243.200006] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 3243.200011] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 3243.200016] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200024] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 3243.200028] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200034] INFO: task bcache_allocato:1868 blocked for more than
120 seconds.
[ 3243.200039] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200043] bcache_allocato D 0000000000000001 0 1868 2 0x00000000
[ 3243.200048] ffff882f89f7fd88 0000000000000046 ffff882f89f7fda8
ffffffff810808ad
[ 3243.200053] ffff882fa3328000 ffff882f89f7ffd8 ffff882f89f7ffd8
ffff882f89f7ffd8
[ 3243.200058] ffff882f89e13320 ffff882fa3328000 ffff882f865eb320
ffff882c9b560d98
[ 3243.200064] Call Trace:
[ 3243.200069] [<ffffffff810808ad>] ? dequeue_task_fair+0x2cd/0x530
[ 3243.200075] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200080] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 3243.200085] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.200090] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 3243.200095] [<ffffffff8150cdbf>] bch_allocator_thread+0x10f/0xe20
[ 3243.200100] [<ffffffff8150ccb0>] ? bch_bucket_add_unused+0xe0/0xe0
[ 3243.200104] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 3243.200108] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200114] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 3243.200118] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200123] INFO: task kworker/3:2:2548 blocked for more than 120 seconds.
[ 3243.200128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200132] kworker/3:2 D ffffffff81813d40 0 2548 2 0x00000000
[ 3243.200142] Workqueue: bcache bch_data_insert_keys
[ 3243.200144] ffff882fa310b3d8 0000000000000046 ffff882fa310b3f8
9000001080000000
[ 3243.200150] ffff882f865eb320 ffff882fa310bfd8 ffff882fa310bfd8
ffff882fa310bfd8
[ 3243.200155] ffff882fa66f9990 ffff882f865eb320 ffff882f865eb320
ffff882c9b560d98
[ 3243.200160] Call Trace:
[ 3243.200166] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200171] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
[ 3243.200175] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
[ 3243.200180] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
[ 3243.200185] [<ffffffff815112e5>] bch_mca_shrink+0x1b5/0x2f0
[ 3243.200190] [<ffffffff8117fc32>] ? prune_super+0x162/0x1b0
[ 3243.200195] [<ffffffff8112ebb4>] shrink_slab+0x154/0x300
[ 3243.200200] [<ffffffff81076828>] ? resched_task+0x68/0x70
[ 3243.200205] [<ffffffff81077165>] ? check_preempt_curr+0x75/0xa0
[ 3243.200210] [<ffffffff8113a379>] ? fragmentation_index+0x19/0x70
[ 3243.200215] [<ffffffff8113140f>] do_try_to_free_pages+0x20f/0x4b0
[ 3243.200221] [<ffffffff81131864>] try_to_free_pages+0xe4/0x1a0
[ 3243.200226] [<ffffffff8113a339>] ? zone_statistics+0x99/0xc0
[ 3243.200232] [<ffffffff81126e9c>] __alloc_pages_nodemask+0x60c/0x9b0
[ 3243.200239] [<ffffffff8116062a>] alloc_pages_current+0xba/0x170
[ 3243.200244] [<ffffffff8112240e>] __get_free_pages+0xe/0x40
[ 3243.200249] [<ffffffff8150ebb3>] mca_data_alloc+0x73/0x1d0
[ 3243.200253] [<ffffffff81510f37>] mca_alloc+0x277/0x470
[ 3243.200258] [<ffffffff81511d1c>] bch_btree_node_alloc+0x8c/0x1c0
[ 3243.200263] [<ffffffff81517031>] ? __bch_bset_search+0x1d1/0x480
[ 3243.200270] [<ffffffff81511e7d>] btree_node_alloc_replacement+0x2d/0x60
[ 3243.200275] [<ffffffff81512f8b>] btree_split+0x7b/0x5c0
[ 3243.200282] [<ffffffff81080078>] ? dequeue_entity+0x1a8/0x5c0
[ 3243.200287] [<ffffffff81515fc7>] ? bch_keylist_pop_front+0x47/0x50
[ 3243.200293] [<ffffffff81513582>] bch_btree_insert_node+0xb2/0x2f0
[ 3243.200297] [<ffffffff815137e8>] btree_insert_fn+0x28/0x50
[ 3243.200302] [<ffffffff81511b8c>] bch_btree_map_nodes_recurse+0x6c/0x170
[ 3243.200307] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 3243.200312] [<ffffffff816bcc26>] ? down_write+0x16/0x40
[ 3243.200317] [<ffffffff815117a1>] ? bch_btree_node_get+0x71/0x280
[ 3243.200322] [<ffffffff81511c30>] bch_btree_map_nodes_recurse+0x110/0x170
[ 3243.200326] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 3243.200334] [<ffffffff81332c93>] ? call_rwsem_down_write_failed+0x13/0x20
[ 3243.200339] [<ffffffff81514dce>] __bch_btree_map_nodes+0x13e/0x1c0
[ 3243.200344] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 3243.200349] [<ffffffff81514f04>] bch_btree_insert+0xb4/0x120
[ 3243.200355] [<ffffffff8151e1be>] bch_data_insert_keys+0x3e/0x160
[ 3243.200360] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 3243.200365] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 3243.200370] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 3243.200374] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 3243.200378] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200384] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 3243.200388] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200392] INFO: task kworker/2:2:2571 blocked for more than 120 seconds.
[ 3243.200396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200401] kworker/2:2 D ffffffff81813a60 0 2571 2 0x00000000
[ 3243.200407] Workqueue: bcache bch_data_insert_keys
[ 3243.200409] ffff880267b99be0 0000000000000046 ffffffff81077165
0000000000000004
[ 3243.200415] ffff882f89e14cb0 ffff880267b99fd8 ffff880267b99fd8
ffff880267b99fd8
[ 3243.200420] ffff882fa6ae9990 ffff882f89e14cb0 ffff882fa2e8ee3c
ffff882f89e14cb0
[ 3243.200425] Call Trace:
[ 3243.200430] [<ffffffff81077165>] ? check_preempt_curr+0x75/0xa0
[ 3243.200435] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200441] [<ffffffff816bf0fd>] rwsem_down_read_failed+0x9d/0xe5
[ 3243.200446] [<ffffffff81332c64>] call_rwsem_down_read_failed+0x14/0x30
[ 3243.200451] [<ffffffff816bcc74>] ? down_read+0x24/0x2b
[ 3243.200456] [<ffffffff81514d75>] __bch_btree_map_nodes+0xe5/0x1c0
[ 3243.200460] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 3243.200465] [<ffffffff81514f04>] bch_btree_insert+0xb4/0x120
[ 3243.200471] [<ffffffff8151e1be>] bch_data_insert_keys+0x3e/0x160
[ 3243.200476] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 3243.200480] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 3243.200486] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 3243.200490] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 3243.200494] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200499] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 3243.200504] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200507] INFO: task kworker/0:0:2573 blocked for more than 120 seconds.
[ 3243.200512] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200516] kworker/0:0 D ffffffff81813a60 0 2573 2 0x00000000
[ 3243.200523] Workqueue: bcache bch_data_insert_keys
[ 3243.200525] ffff880783857be0 0000000000000046 0000000000000002
0000000000000002
[ 3243.200530] ffff882fa593b320 ffff880783857fd8 ffff880783857fd8
ffff880783857fd8
[ 3243.200535] ffffffff81c10440 ffff882fa593b320 0000000000000000
ffff882fa593b320
[ 3243.200540] Call Trace:
[ 3243.200546] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200551] [<ffffffff816bf0fd>] rwsem_down_read_failed+0x9d/0xe5
[ 3243.200556] [<ffffffff81332c64>] call_rwsem_down_read_failed+0x14/0x30
[ 3243.200561] [<ffffffff816bcc74>] ? down_read+0x24/0x2b
[ 3243.200565] [<ffffffff81514d75>] __bch_btree_map_nodes+0xe5/0x1c0
[ 3243.200570] [<ffffffff815137c0>] ? bch_btree_insert_node+0x2f0/0x2f0
[ 3243.200575] [<ffffffff81514f04>] bch_btree_insert+0xb4/0x120
[ 3243.200580] [<ffffffff8151e1be>] bch_data_insert_keys+0x3e/0x160
[ 3243.200585] [<ffffffff810624d4>] process_one_work+0x174/0x490
[ 3243.200590] [<ffffffff8106368b>] worker_thread+0x11b/0x370
[ 3243.200595] [<ffffffff81063570>] ? manage_workers.isra.23+0x2d0/0x2d0
[ 3243.200599] [<ffffffff81069f40>] kthread+0xc0/0xd0
[ 3243.200604] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200609] [<ffffffff816c795c>] ret_from_fork+0x7c/0xb0
[ 3243.200613] [<ffffffff81069e80>] ? flush_kthread_worker+0xb0/0xb0
[ 3243.200619] INFO: task iozone:2588 blocked for more than 120 seconds.
[ 3243.200623] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200627] iozone D 0000000000000001 0 2588 2287 0x00000000
[ 3243.200631] ffff882ec708d978 0000000000000082 ffff882ec708d938
ffffffff81301d7a
[ 3243.200636] ffff882fa2e8e640 ffff882ec708dfd8 ffff882ec708dfd8
ffff882ec708dfd8
[ 3243.200642] ffff882f865eb320 ffff882fa2e8e640 ffff882ec708d978
ffff882fbf2739f8
[ 3243.200647] Call Trace:
[ 3243.200653] [<ffffffff81301d7a>] ? generic_make_request+0xca/0x100
[ 3243.200658] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200663] [<ffffffff816be36f>] io_schedule+0x8f/0xd0
[ 3243.200673] [<ffffffff811b797c>] do_blockdev_direct_IO+0x1a7c/0x1fb0
[ 3243.200684] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200691] [<ffffffff811b7f05>] __blockdev_direct_IO+0x55/0x60
[ 3243.200697] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200701] [<ffffffff8107f991>] ? update_curr+0x141/0x1f0
[ 3243.200708] [<ffffffffa0230349>] ext2_direct_IO+0x79/0xe0 [ext2]
[ 3243.200713] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200718] [<ffffffff8107e775>] ? set_next_entity+0xa5/0xc0
[ 3243.200724] [<ffffffff8104ade6>] ? current_fs_time+0x16/0x60
[ 3243.200730] [<ffffffff8111f126>] generic_file_direct_write+0xc6/0x180
[ 3243.200736] [<ffffffff8111f4bd>] __generic_file_aio_write+0x2dd/0x3b0
[ 3243.200741] [<ffffffff8111f5f9>] generic_file_aio_write+0x69/0xd0
[ 3243.200747] [<ffffffff8117b88a>] do_sync_write+0x7a/0xb0
[ 3243.200752] [<ffffffff8117c63e>] vfs_write+0xce/0x1e0
[ 3243.200759] [<ffffffff81197ef7>] ? fget_light+0x67/0xd0
[ 3243.200763] [<ffffffff8117cb22>] SyS_write+0x52/0xa0
[ 3243.200769] [<ffffffff816c7a02>] system_call_fastpath+0x16/0x1b
[ 3243.200773] INFO: task iozone:2589 blocked for more than 120 seconds.
[ 3243.200777] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3243.200784] iozone D 0000000000000001 0 2589 2287 0x00000000
[ 3243.200788] ffff88191fc75978 0000000000000082 ffff88191fc75938
ffffffff81301d7a
[ 3243.200793] ffff882fa2e88000 ffff88191fc75fd8 ffff88191fc75fd8
ffff88191fc75fd8
[ 3243.200798] ffff882f89e14cb0 ffff882fa2e88000 ffff88191fc75978
ffff882fbf2539f8
[ 3243.200803] Call Trace:
[ 3243.200808] [<ffffffff81301d7a>] ? generic_make_request+0xca/0x100
[ 3243.200813] [<ffffffff816be299>] schedule+0x29/0x70
[ 3243.200818] [<ffffffff816be36f>] io_schedule+0x8f/0xd0
[ 3243.200823] [<ffffffff811b797c>] do_blockdev_direct_IO+0x1a7c/0x1fb0
[ 3243.200831] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200838] [<ffffffff811b7f05>] __blockdev_direct_IO+0x55/0x60
[ 3243.200843] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200849] [<ffffffffa0230349>] ext2_direct_IO+0x79/0xe0 [ext2]
[ 3243.200855] [<ffffffffa022fa80>] ? ext2_get_blocks+0xa60/0xa60 [ext2]
[ 3243.200859] [<ffffffff8104ade6>] ? current_fs_time+0x16/0x60
[ 3243.200864] [<ffffffff8111f126>] generic_file_direct_write+0xc6/0x180
[ 3243.200869] [<ffffffff8111f4bd>] __generic_file_aio_write+0x2dd/0x3b0
[ 3243.200875] [<ffffffff8111f5f9>] generic_file_aio_write+0x69/0xd0
[ 3243.200879] [<ffffffff8117b88a>] do_sync_write+0x7a/0xb0
[ 3243.200884] [<ffffffff8117c63e>] vfs_write+0xce/0x1e0
[ 3243.200888] [<ffffffff8117cb22>] SyS_write+0x52/0xa0
[ 3243.200894] [<ffffffff816c7a02>] system_call_fastpath+0x16/0x1b

On Thu, Aug 29, 2013 at 5:29 PM, kernel neophyte
<[email protected]> wrote:
> We are evaluating bcache for use on our production systems, where the
> caching devices are insanely fast. In this scenario, under a moderate load
> of random 4k writes, bcache fails miserably :-(
>
> [ 3588.513638] bcache: bch_cached_dev_attach() Caching sda4 as bcache0
> on set b082ce66-04c6-43d5-8207-ebf39840191d
> -Neo

2013-08-30 21:14:59

by Kent Overstreet

[permalink] [raw]
Subject: [PATCH] bcache: Fix a shrinker deadlock

GFP_NOIO means we could be getting called recursively - mca_alloc() ->
mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
Whoops.

Signed-off-by: Kent Overstreet <[email protected]>
---

On Thu, Aug 29, 2013 at 05:29:54PM -0700, kernel neophyte wrote:
> We are evaluating bcache for use on our production systems, where the
> caching devices are insanely fast. In this scenario, under a moderate load
> of random 4k writes, bcache fails miserably :-(
>
> [ 3588.513638] bcache: bch_cached_dev_attach() Caching sda4 as bcache0
> on set b082ce66-04c6-43d5-8207-ebf39840191d
> [ 4442.163661] INFO: task kworker/0:0:4 blocked for more than 120 seconds.
> [ 4442.163671] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 4442.163678] kworker/0:0 D ffffffff81813d40 0 4 2 0x00000000
> [ 4442.163695] Workqueue: bcache bch_data_insert_keys
> [ 4442.163699] ffff882fa6ac93c8 0000000000000046 ffff882fa6ac93e8
> 0000000000000151
> [ 4442.163705] ffff882fa6a84cb0 ffff882fa6ac9fd8 ffff882fa6ac9fd8
> ffff882fa6ac9fd8
> [ 4442.163711] ffff882fa6ad6640 ffff882fa6a84cb0 ffff882fa6a84cb0
> ffff8822ca2c0d98
> [ 4442.163716] Call Trace:
> [ 4442.163729] [<ffffffff816be299>] schedule+0x29/0x70
> [ 4442.163735] [<ffffffff816be57e>] schedule_preempt_disabled+0xe/0x10
> [ 4442.163741] [<ffffffff816bc862>] __mutex_lock_slowpath+0x112/0x1b0
> [ 4442.163746] [<ffffffff816bc3da>] mutex_lock+0x2a/0x50
> [ 4442.163752] [<ffffffff815112e5>] bch_mca_shrink+0x1b5/0x2f0
> [ 4442.163759] [<ffffffff8117fc32>] ? prune_super+0x162/0x1b0
> [ 4442.163769] [<ffffffff8112ebb4>] shrink_slab+0x154/0x300
> [ 4442.163776] [<ffffffff81076828>] ? resched_task+0x68/0x70
> [ 4442.163782] [<ffffffff81077165>] ? check_preempt_curr+0x75/0xa0
> [ 4442.163788] [<ffffffff8113a379>] ? fragmentation_index+0x19/0x70
> [ 4442.163794] [<ffffffff8113140f>] do_try_to_free_pages+0x20f/0x4b0
> [ 4442.163800] [<ffffffff81131864>] try_to_free_pages+0xe4/0x1a0
> [ 4442.163810] [<ffffffff81126e9c>] __alloc_pages_nodemask+0x60c/0x9b0
> [ 4442.163818] [<ffffffff8116062a>] alloc_pages_current+0xba/0x170
> [ 4442.163824] [<ffffffff8112240e>] __get_free_pages+0xe/0x40
> [ 4442.163829] [<ffffffff8150ebb3>] mca_data_alloc+0x73/0x1d0
> [ 4442.163834] [<ffffffff8150ee5a>] mca_bucket_alloc+0x14a/0x1f0
> [ 4442.163838] [<ffffffff81511020>] mca_alloc+0x360/0x470
> [ 4442.163843] [<ffffffff81511d1c>] bch_btree_node_alloc+0x8c/0x1c0
> [ 4442.163849] [<ffffffff81513020>] btree_split+0x110/0x5c0

Ohhh, that definitely isn't supposed to happen.

Wonder why I hadn't seen this before; looking at the backtrace, it's
pretty obvious what's broken though - try this patch:

drivers/md/bcache/btree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 60908de..55e8666 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -617,7 +617,7 @@ static int bch_mca_shrink(struct shrinker *shrink, struct shrink_control *sc)
return mca_can_free(c) * c->btree_pages;

/* Return -1 if we can't do anything right now */
- if (sc->gfp_mask & __GFP_WAIT)
+ if (sc->gfp_mask & __GFP_IO)
mutex_lock(&c->bucket_lock);
else if (!mutex_trylock(&c->bucket_lock))
return -1;
--
1.8.4.rc3
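
The failure mode above is a shrinker re-entering a lock that its own caller
already holds. On kernels of this era GFP_NOIO is defined as just __GFP_WAIT,
while GFP_KERNEL is __GFP_WAIT | __GFP_IO | __GFP_FS, so the old test
(sc->gfp_mask & __GFP_WAIT) was true even for bcache's own GFP_NOIO
allocations and the shrinker could sleep forever on bucket_lock. The sketch
below illustrates the pattern under those assumptions; the function names are
made up for illustration and this is not the actual drivers/md/bcache source:

/*
 * Sketch only: why a GFP_NOIO allocation made while holding c->bucket_lock
 * can re-enter the btree-cache shrinker and deadlock on the same mutex.
 */

/* Runs with c->bucket_lock held, as in the bch_btree_node_alloc() path. */
static void *alloc_node_data_sketch(struct cache_set *c, int order)
{
	/*
	 * GFP_NOIO still allows direct reclaim (__GFP_WAIT), so this call
	 * may invoke shrink_slab() on the current thread...
	 */
	return (void *)__get_free_pages(GFP_NOIO, order);
}

static int shrink_btree_cache_sketch(struct shrinker *shrink,
				     struct shrink_control *sc)
{
	struct cache_set *c = container_of(shrink, struct cache_set, shrink);

	/*
	 * ...and reclaim lands here on the same thread.  Testing __GFP_IO
	 * instead of __GFP_WAIT sends GFP_NOIO callers (the only ones that
	 * can be re-entrant here) to the non-blocking branch, while
	 * GFP_KERNEL and GFP_NOFS reclaim still wait for the lock.
	 */
	if (sc->gfp_mask & __GFP_IO)
		mutex_lock(&c->bucket_lock);
	else if (!mutex_trylock(&c->bucket_lock))
		return -1;		/* "can't do anything right now" */

	/* ... evict some cached btree nodes ... */

	mutex_unlock(&c->bucket_lock);
	return 0;
}

With that flag changed, kswapd and GFP_KERNEL direct reclaim keep their
blocking behaviour, and only the potentially recursive GFP_NOIO path backs
off instead of hanging.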

2013-08-31 04:20:23

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] bcache: Fix a shrinker deadlock

On Fri, Aug 30 2013, Kent Overstreet wrote:
> GFP_NOIO means we could be getting called recursively - mca_alloc() ->
> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
> Whoops.

Kent, can you provide an updated repo with the pending patches? There's
been some churn here lately (and good fixes), and I'd like to ensure I
don't miss anything.

--
Jens Axboe

Subject: Re: [PATCH] bcache: Fix a shrinker deadlock

Thanks, applied to my local kernel git.

Stefan

On 30.08.2013 23:15, Kent Overstreet wrote:
> GFP_NOIO means we could be getting called recursively - mca_alloc() ->
> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
> Whoops.
>
> Signed-off-by: Kent Overstreet <[email protected]>
> ---
>
> Ohhh, that definitely isn't supposed to happen.
>
> Wonder why I hadn't seen this before; looking at the backtrace, it's
> pretty obvious what's broken though - try this patch:
>
> drivers/md/bcache/btree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 60908de..55e8666 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -617,7 +617,7 @@ static int bch_mca_shrink(struct shrinker *shrink, struct shrink_control *sc)
> return mca_can_free(c) * c->btree_pages;
>
> /* Return -1 if we can't do anything right now */
> - if (sc->gfp_mask & __GFP_WAIT)
> + if (sc->gfp_mask & __GFP_IO)
> mutex_lock(&c->bucket_lock);
> else if (!mutex_trylock(&c->bucket_lock))
> return -1;
>

Subject: Re: [PATCH] bcache: Fix a shrinker deadlock

Thanks! No crashes since your fix.

Stefan

This mail was sent with my iPhone.

On 30.08.2013, at 23:15, Kent Overstreet <[email protected]> wrote:

> GFP_NOIO means we could be getting called recursively - mca_alloc() ->
> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
> Whoops.
>
> Signed-off-by: Kent Overstreet <[email protected]>
> ---
>
> Ohhh, that definitely isn't supposed to happen.
>
> Wonder why I hadn't seen this before; looking at the backtrace, it's
> pretty obvious what's broken though - try this patch:
>
> drivers/md/bcache/btree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 60908de..55e8666 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -617,7 +617,7 @@ static int bch_mca_shrink(struct shrinker *shrink, struct shrink_control *sc)
> return mca_can_free(c) * c->btree_pages;
>
> /* Return -1 if we can't do anything right now */
> - if (sc->gfp_mask & __GFP_WAIT)
> + if (sc->gfp_mask & __GFP_IO)
> mutex_lock(&c->bucket_lock);
> else if (!mutex_trylock(&c->bucket_lock))
> return -1;
> --
> 1.8.4.rc3
>

2013-09-04 23:35:53

by kernel neophyte

[permalink] [raw]
Subject: Re: [PATCH] bcache: Fix a shrinker deadlock

On Fri, Aug 30, 2013 at 2:15 PM, Kent Overstreet <[email protected]> wrote:
> GFP_NOIO means we could be getting called recursively - mca_alloc() ->
> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
> Whoops.
>
> Signed-off-by: Kent Overstreet <[email protected]>

Awesome! I tested the fix... no crashes/deadlocks.
But I see lower benchmark numbers for random writes... is this expected
for this change?

Thanks Kent.
-Suhas

> ---
>
> Ohhh, that definitely isn't supposed to happen.
>
> Wonder why I hadn't seen this before; looking at the backtrace, it's
> pretty obvious what's broken though - try this patch:
>
> drivers/md/bcache/btree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 60908de..55e8666 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -617,7 +617,7 @@ static int bch_mca_shrink(struct shrinker *shrink, struct shrink_control *sc)
> return mca_can_free(c) * c->btree_pages;
>
> /* Return -1 if we can't do anything right now */
> - if (sc->gfp_mask & __GFP_WAIT)
> + if (sc->gfp_mask & __GFP_IO)
> mutex_lock(&c->bucket_lock);
> else if (!mutex_trylock(&c->bucket_lock))
> return -1;
> --
> 1.8.4.rc3
>

2013-09-04 23:44:48

by Kent Overstreet

[permalink] [raw]
Subject: Re: [PATCH] bcache: Fix a shrinker deadlock

On Wed, Sep 04, 2013 at 04:35:50PM -0700, kernel neophyte wrote:
> On Fri, Aug 30, 2013 at 2:15 PM, Kent Overstreet <[email protected]> wrote:
> > GFP_NOIO means we could be getting called recursively - mca_alloc() ->
> > mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
> > Whoops.
> >
> > Signed-off-by: Kent Overstreet <[email protected]>
>
> Awesome! I tested the fix... no crashes/deadlocks.
> But I see lower benchmark numbers for random writes... is this expected
> for this change?

No... how much lower?