2021-09-28 17:38:25

by Wen Yang

[permalink] [raw]
Subject: [PATCH] fs/fs-writeback.c: add a preemption point to move_expired_inodes

We encountered an unrecovered_softlockup issue on !PREEMPT
kernel config with 4.9 based kernel.

PID: 185895 TASK: ffff880455dac280 CPU: 8 COMMAND: "kworker/u449:39"
#0 [ffff883f7e803c08] machine_kexec at ffffffff81061578
#1 [ffff883f7e803c68] __crash_kexec at ffffffff81127c19
#2 [ffff883f7e803d30] panic at ffffffff811b2255
#3 [ffff883f7e803db8] unrecovered_softlockup_detect at ffffffff811b2d57
#4 [ffff883f7e803ee0] watchdog_timer_fn at ffffffff8115827e
#5 [ffff883f7e803f18] __hrtimer_run_queues at ffffffff811085e3
#6 [ffff883f7e803f70] hrtimer_interrupt at ffffffff81108d8a
#7 [ffff883f7e803fc0] local_apic_timer_interrupt at ffffffff810580f8
#8 [ffff883f7e803fd8] smp_apic_timer_interrupt at ffffffff81745405
#9 [ffff883f7e803ff0] apic_timer_interrupt at ffffffff81743b90
--- <IRQ stack> ---
#10 [ffffc90086a93b88] apic_timer_interrupt at ffffffff81743b90
[exception RIP: __list_del_entry+44]
RIP: ffffffff813be22c RSP: ffffc90086a93c30 RFLAGS: 00000202
RAX: ffff88522b8f8418 RBX: ffff88522b8f8418 RCX: dead000000000200
RDX: ffff8816fab00e68 RSI: ffffc90086a93c60 RDI: ffff8816fab01af8
RBP: ffffc90086a93c30 R8: ffff8816fab01af8 R9: 0000000100400018
R10: ffff885ae5ed8280 R11: 0000000000000000 R12: ffff8816fab01af8
R13: ffffc90086a93c60 R14: ffffc90086a93d08 R15: ffff883f631d2000
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0000
#11 [ffffc90086a93c38] move_expired_inodes at ffffffff8127c74c
#12 [ffffc90086a93ca8] queue_io at ffffffff8127cde6
#13 [ffffc90086a93cd8] wb_writeback at ffffffff8128121f
#14 [ffffc90086a93d80] wb_workfn at ffffffff812819f4
#15 [ffffc90086a93e18] process_one_work at ffffffff810a5dc9
#16 [ffffc90086a93e60] worker_thread at ffffffff810a60ae
#17 [ffffc90086a93ec0] kthread at ffffffff810ac696
#18 [ffffc90086a93f50] ret_from_fork at ffffffff81741dd9

crash> set
PID: 185895
COMMAND: "kworker/u449:39"
TASK: ffff880455dac280 [THREAD_INFO: ffff880455dac280]
CPU: 8
STATE: TASK_RUNNING (PANIC)

It has been running continuously for 53.052, as follows:
crash> ps -m | grep 185895
[ 0 00:00:53.052] [RU] PID: 185895 TASK: ffff880455dac280 CPU: 8
COMMAND: "kworker/u449:39"

And the TIF_NEED_RESCHED flag has been set, as follows:
crash> struct thread_info -x ffff880455dac280
struct thread_info {
flags = 0x88,
status = 0x0
}

Let's just add cond_resched() within move_expired_inodes()'s list-moving loop in
order to avoid the watchdog splats.

Signed-off-by: Wen Yang <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
fs/fs-writeback.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 06d04a7..1546121 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1404,6 +1404,7 @@ static int move_expired_inodes(struct list_head *delaying_queue,
if (sb && sb != inode->i_sb)
do_sb_sort = 1;
sb = inode->i_sb;
+ cond_resched();
}

/* just one sb in list, splice to dispatch_queue and we're done */
@@ -1420,6 +1421,7 @@ static int move_expired_inodes(struct list_head *delaying_queue,
if (inode->i_sb == sb)
list_move(&inode->i_io_list, dispatch_queue);
}
+ cond_resched();
}
out:
return moved;
--
1.8.3.1


2021-10-20 16:15:53

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/fs-writeback.c: add a preemption point to move_expired_inodes

On Wed 29-09-21 01:34:04, Wen Yang wrote:
> We encountered an unrecovered_softlockup issue on !PREEMPT
> kernel config with 4.9 based kernel.
>
> PID: 185895 TASK: ffff880455dac280 CPU: 8 COMMAND: "kworker/u449:39"
> #0 [ffff883f7e803c08] machine_kexec at ffffffff81061578
> #1 [ffff883f7e803c68] __crash_kexec at ffffffff81127c19
> #2 [ffff883f7e803d30] panic at ffffffff811b2255
> #3 [ffff883f7e803db8] unrecovered_softlockup_detect at ffffffff811b2d57
> #4 [ffff883f7e803ee0] watchdog_timer_fn at ffffffff8115827e
> #5 [ffff883f7e803f18] __hrtimer_run_queues at ffffffff811085e3
> #6 [ffff883f7e803f70] hrtimer_interrupt at ffffffff81108d8a
> #7 [ffff883f7e803fc0] local_apic_timer_interrupt at ffffffff810580f8
> #8 [ffff883f7e803fd8] smp_apic_timer_interrupt at ffffffff81745405
> #9 [ffff883f7e803ff0] apic_timer_interrupt at ffffffff81743b90
> --- <IRQ stack> ---
> #10 [ffffc90086a93b88] apic_timer_interrupt at ffffffff81743b90
> [exception RIP: __list_del_entry+44]
> RIP: ffffffff813be22c RSP: ffffc90086a93c30 RFLAGS: 00000202
> RAX: ffff88522b8f8418 RBX: ffff88522b8f8418 RCX: dead000000000200
> RDX: ffff8816fab00e68 RSI: ffffc90086a93c60 RDI: ffff8816fab01af8
> RBP: ffffc90086a93c30 R8: ffff8816fab01af8 R9: 0000000100400018
> R10: ffff885ae5ed8280 R11: 0000000000000000 R12: ffff8816fab01af8
> R13: ffffc90086a93c60 R14: ffffc90086a93d08 R15: ffff883f631d2000
> ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0000
> #11 [ffffc90086a93c38] move_expired_inodes at ffffffff8127c74c
> #12 [ffffc90086a93ca8] queue_io at ffffffff8127cde6
> #13 [ffffc90086a93cd8] wb_writeback at ffffffff8128121f
> #14 [ffffc90086a93d80] wb_workfn at ffffffff812819f4
> #15 [ffffc90086a93e18] process_one_work at ffffffff810a5dc9
> #16 [ffffc90086a93e60] worker_thread at ffffffff810a60ae
> #17 [ffffc90086a93ec0] kthread at ffffffff810ac696
> #18 [ffffc90086a93f50] ret_from_fork at ffffffff81741dd9
>
> crash> set
> PID: 185895
> COMMAND: "kworker/u449:39"
> TASK: ffff880455dac280 [THREAD_INFO: ffff880455dac280]
> CPU: 8
> STATE: TASK_RUNNING (PANIC)
>
> It has been running continuously for 53.052, as follows:
> crash> ps -m | grep 185895
> [ 0 00:00:53.052] [RU] PID: 185895 TASK: ffff880455dac280 CPU: 8
> COMMAND: "kworker/u449:39"
>
> And the TIF_NEED_RESCHED flag has been set, as follows:
> crash> struct thread_info -x ffff880455dac280
> struct thread_info {
> flags = 0x88,
> status = 0x0
> }
>
> Let's just add cond_resched() within move_expired_inodes()'s list-moving loop in
> order to avoid the watchdog splats.
>
> Signed-off-by: Wen Yang <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/fs-writeback.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 06d04a7..1546121 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1404,6 +1404,7 @@ static int move_expired_inodes(struct list_head *delaying_queue,
> if (sb && sb != inode->i_sb)
> do_sb_sort = 1;
> sb = inode->i_sb;
> + cond_resched();

Thanks for the patch but you certainly cannot do this since we are holding
wb->list_lock during the whole move_expired_inodes() duration. It is not
trivial to implement safe dropping of the lock in move_expired_inodes() (or
queue_io() for that matter). How many inodes were there on b_dirty, b_io,
b_more_io, and tmp lists and from how many superblocks?

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR