A rescuer thread exiting in TASK_INTERRUPTIBLE can lead to a task
scheduling off, never to be seen again. In the case where this occurred,
an exiting thread hit the reiserfs homebrew conditional resched while
holding a mutex, bringing the box to its knees.
PID: 18105 TASK: ffff8807fd412180 CPU: 5 COMMAND: "kdmflush"
#0 [ffff8808157e7670] schedule at ffffffff8143f489
#1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
#2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
#3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
#4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
#5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
#6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
#7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
#8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
#9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
#10 [ffff8808157e7e78] acct_process at ffffffff810a27ba
#11 [ffff8808157e7e98] do_exit at ffffffff8105e29a
#12 [ffff8808157e7ee8] kthread at ffffffff8107afee
#13 [ffff8808157e7f48] kernel_thread_helper at ffffffff8144a5c4
[exception RIP: kernel_thread_helper]
RIP: ffffffff8144a5c0 RSP: ffff8808157e7f58 RFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8107af60 RDI: ffff8803ee491d18
RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
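To make the failure mode concrete, here is a small userspace toy model of
the sequence in the trace above. It is a sketch only, not kernel code: the
struct, the enum values and every model_* / exit_path_survives name are
made up for illustration, and the scheduler is reduced to the single rule
that a task calling schedule() while not TASK_RUNNING is taken off the
runqueue until something wakes it. Nothing wakes an exiting thread, so it
stays gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's task states; values are illustrative only. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

/* Hypothetical, heavily reduced task: just a state and a runqueue flag. */
struct task {
	enum task_state state;
	bool on_runqueue;	/* true while the scheduler can still pick it */
};

/*
 * Simplified model of a voluntary schedule(): a task that calls it while
 * not TASK_RUNNING is dequeued and needs a wakeup to come back.
 */
static void model_schedule(struct task *t)
{
	if (t->state != TASK_RUNNING)
		t->on_runqueue = false;
}

/* Model of the rescuer's stop check, with or without the fix applied. */
static void model_rescuer_stop(struct task *t, bool restore_running)
{
	/* mirrors set_current_state(TASK_INTERRUPTIBLE) at the top of the loop */
	t->state = TASK_INTERRUPTIBLE;

	/* kthread_should_stop() is assumed to have returned true here */
	if (restore_running)
		t->state = TASK_RUNNING;	/* the added __set_current_state() */
}

/* Returns true if the task is still schedulable after the exit path yields. */
bool exit_path_survives(bool restore_running)
{
	struct task t = { TASK_RUNNING, true };

	model_rescuer_stop(&t, restore_running);

	/* exit path: do_exit -> acct_process -> ... -> conditional resched */
	model_schedule(&t);

	return t.on_runqueue;
}
```

In this model exit_path_survives(false) returns false (the thread is
dequeued while still holding whatever it held), and exit_path_survives(true)
returns true.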
Signed-off-by: Mike Galbraith <[email protected]>
Cc: [email protected]
---
 kernel/workqueue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 042d221..ac25db1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2407,8 +2407,10 @@ static int rescuer_thread(void *__wq)
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);
 
-	if (kthread_should_stop())
+	if (kthread_should_stop()) {
+		__set_current_state(TASK_RUNNING);
 		return 0;
+	}
 
 	/*
 	 * See whether any cpu is asking for help.  Unbounded
On Wed, Nov 28, 2012 at 07:17:18AM +0100, Mike Galbraith wrote:
>
> A rescuer thread exiting in TASK_INTERRUPTIBLE can lead to a task
> scheduling off, never to be seen again. In the case where this occurred,
> an exiting thread hit the reiserfs homebrew conditional resched while
> holding a mutex, bringing the box to its knees.
>
> PID: 18105 TASK: ffff8807fd412180 CPU: 5 COMMAND: "kdmflush"
> #0 [ffff8808157e7670] schedule at ffffffff8143f489
> #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
> #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
> #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
> #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
> #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
> #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
> #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
> #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
> #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
> #10 [ffff8808157e7e78] acct_process at ffffffff810a27ba
> #11 [ffff8808157e7e98] do_exit at ffffffff8105e29a
> #12 [ffff8808157e7ee8] kthread at ffffffff8107afee
> #13 [ffff8808157e7f48] kernel_thread_helper at ffffffff8144a5c4
> [exception RIP: kernel_thread_helper]
> RIP: ffffffff8144a5c0 RSP: ffff8808157e7f58 RFLAGS: 00000202
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8107af60 RDI: ffff8803ee491d18
> RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>
> Signed-off-by: Mike Galbraith <[email protected]>
> Cc: [email protected]
Applied to wq/for-3.8. Thanks!
--
tejun