From: "Paul E. McKenney" <[email protected]>
[ Upstream commit 9f47eb5461aaeb6cb8696f9d11503ae90e4d5cb0 ]
Very large I/Os can cause the following RCU CPU stall warning:
RIP: 0010:rb_prev+0x8/0x50
Code: 49 89 c0 49 89 d1 48 89 c2 48 89 f8 e9 e5 fd ff ff 4c 89 48 10 c3 4c 89 06 c3 4c 89 40 10 c3 0f 1f 00 48 8b 0f 48 39 cf 74 38 <48> 8b 47 10 48 85 c0 74 22 48 8b 50 08 48 85 d2 74 0c 48 89 d0 48
RSP: 0018:ffffc9002212bab0 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13
RAX: ffff888821f93630 RBX: ffff888821f93630 RCX: ffff888821f937e0
RDX: 0000000000000000 RSI: 0000000000102000 RDI: ffff888821f93630
RBP: 0000000000103000 R08: 000000000006c000 R09: 0000000000000238
R10: 0000000000102fff R11: ffffc9002212bac8 R12: 0000000000000001
R13: ffffffffffffffff R14: 0000000000102000 R15: ffff888821f937e0
__lookup_extent_mapping+0xa0/0x110
try_release_extent_mapping+0xdc/0x220
btrfs_releasepage+0x45/0x70
shrink_page_list+0xa39/0xb30
shrink_inactive_list+0x18f/0x3b0
shrink_lruvec+0x38e/0x6b0
shrink_node+0x14d/0x690
do_try_to_free_pages+0xc6/0x3e0
try_to_free_mem_cgroup_pages+0xe6/0x1e0
reclaim_high.constprop.73+0x87/0xc0
mem_cgroup_handle_over_high+0x66/0x150
exit_to_usermode_loop+0x82/0xd0
do_syscall_64+0xd4/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
On a PREEMPT=n kernel, the try_release_extent_mapping() function's
"while" loop might run for a very long time on a large I/O. This commit
therefore adds a cond_resched() to this loop, providing RCU any needed
quiescent states.
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/extent_io.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 42b7409d4cc55..2f9f738ecf84a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4437,6 +4437,8 @@ int try_release_extent_mapping(struct extent_map_tree *map,
 			/* once for us */
 			free_extent_map(em);
+
+			cond_resched(); /* Allow large-extent preemption. */
 		}
 	}
 	return try_release_extent_state(map, tree, page, mask);
--
2.25.1
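As a rough userspace illustration of the change in the patch above (not
kernel code): the sketch below walks a byte range one page-sized step at a
time and offers to give up the CPU after each step. sched_yield() stands in
for cond_resched() only by analogy, and release_range_sketch(),
SKETCH_PAGE_SIZE, and the yield counter are hypothetical names invented for
this example.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed step size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical stand-in for the kernel loop: walk [start, end] one
 * page-sized step at a time and add a voluntary scheduling point at
 * the bottom of every iteration. Returns the number of steps taken
 * and counts the scheduling points in *yields.
 */
static uint64_t release_range_sketch(uint64_t start, uint64_t end,
				     uint64_t *yields)
{
	uint64_t released = 0;

	assert(yields != NULL);

	while (start <= end) {
		/* Pretend this iteration looked up and freed one extent map. */
		released++;
		start += SKETCH_PAGE_SIZE;

		/* Userspace analogue of cond_resched(): let others run. */
		sched_yield();
		(*yields)++;
	}
	return released;
}
```

Called on a 16-page range, the loop body runs 16 times with one scheduling
point per iteration. In the kernel, cond_resched() reschedules only when
needed and, per the commit message, is what provides RCU with its quiescent
states on a PREEMPT=n build.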
On Mon, Aug 10, 2020 at 03:14:30PM -0400, Sasha Levin wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> [ Upstream commit 9f47eb5461aaeb6cb8696f9d11503ae90e4d5cb0 ]
>
> Very large I/Os can cause the following RCU CPU stall warning:
>
> RIP: 0010:rb_prev+0x8/0x50
> Code: 49 89 c0 49 89 d1 48 89 c2 48 89 f8 e9 e5 fd ff ff 4c 89 48 10 c3 4c 89 06 c3 4c 89 40 10 c3 0f 1f 00 48 8b 0f 48 39 cf 74 38 <48> 8b 47 10 48 85 c0 74 22 48 8b 50 08 48 85 d2 74 0c 48 89 d0 48
> RSP: 0018:ffffc9002212bab0 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13
> RAX: ffff888821f93630 RBX: ffff888821f93630 RCX: ffff888821f937e0
> RDX: 0000000000000000 RSI: 0000000000102000 RDI: ffff888821f93630
> RBP: 0000000000103000 R08: 000000000006c000 R09: 0000000000000238
> R10: 0000000000102fff R11: ffffc9002212bac8 R12: 0000000000000001
> R13: ffffffffffffffff R14: 0000000000102000 R15: ffff888821f937e0
> __lookup_extent_mapping+0xa0/0x110
> try_release_extent_mapping+0xdc/0x220
> btrfs_releasepage+0x45/0x70
> shrink_page_list+0xa39/0xb30
> shrink_inactive_list+0x18f/0x3b0
> shrink_lruvec+0x38e/0x6b0
> shrink_node+0x14d/0x690
> do_try_to_free_pages+0xc6/0x3e0
> try_to_free_mem_cgroup_pages+0xe6/0x1e0
> reclaim_high.constprop.73+0x87/0xc0
> mem_cgroup_handle_over_high+0x66/0x150
> exit_to_usermode_loop+0x82/0xd0
> do_syscall_64+0xd4/0x100
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> On a PREEMPT=n kernel, the try_release_extent_mapping() function's
> "while" loop might run for a very long time on a large I/O. This commit
> therefore adds a cond_resched() to this loop, providing RCU any needed
> quiescent states.
>
> Signed-off-by: Paul E. McKenney <[email protected]>
Paul,
this patch was well hidden in some huge RCU pile
(https://lore.kernel.org/lkml/[email protected]/)
I wonder why you haven't CCed linux-btrfs, I spotted the patch queued
for stable only incidentally. The timestamp is from June, that's quite
some time ago. We can deal with one more patch and I tend to reply with
acks quickly for easy patches like this to not block other people's work,
but I'm a bit disappointed by sidestepping maintained subsystems. It's
not just this patch, it happens from time to time, only to increase the
disappointment.
On Tue, Aug 11, 2020 at 09:57:20AM +0200, David Sterba wrote:
> Paul,
>
> this patch was well hidden in some huge RCU pile
> (https://lore.kernel.org/lkml/[email protected]/)
>
> I wonder why you haven't CCed linux-btrfs, I spotted the patch queued
> for stable only incidentally. The timestamp is from June, that's quite
> some time ago. We can deal with one more patch and I tend to reply with
> acks quickly for easy patches like this to not block other people's work,
> but I'm a bit disappointed by sidestepping maintained subsystems. It's
> not just this patch, it happens from time to time, only to increase the
> disappointment.
My bad, and please accept my apologies. I clearly left out the
step of adding proper Cc: lines. :-/
Thanx, Paul