2018-10-22 11:32:48

by Olof Johansson

[permalink] [raw]
Subject: Circular lock dep in btrfs triggered by shrinker

Hi,

I hit the below circular locking dependency. Seems like the assumption made in
712e36c5f2a7fa56 ("btrfs: use GFP_KERNEL in btrfs_alloc_inode") either isn't
true, or has since changed?


======================================================
WARNING: possible circular locking dependency detected
4.19.0 #25 Not tainted
------------------------------------------------------
kswapd0/414 is trying to acquire lock:
000000008b8f1971 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node+0x35/0x1e0
but task is already holding lock:
00000000032f657e (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:
-> #3 (fs_reclaim){+.+.}:
kmem_cache_alloc+0x24/0x270
btrfs_alloc_inode+0x1f/0x260
alloc_inode+0x13/0x80
iget5_locked+0x3f/0x80
btrfs_iget+0x52/0x680
__lookup_free_space_inode+0xd9/0x110
lookup_free_space_inode+0x5e/0xd0
load_free_space_cache+0x63/0x190
cache_block_group+0x1b7/0x3e0
find_free_extent+0x747/0x12d0
btrfs_reserve_extent+0x96/0x170
btrfs_alloc_tree_block+0xf6/0x540
__btrfs_cow_block+0x110/0x4e0
btrfs_cow_block+0xd3/0x1f0
btrfs_search_slot+0x20d/0x9c0
btrfs_insert_empty_items+0x62/0xb0
btrfs_new_inode+0x1d3/0x6f0
btrfs_create+0xc9/0x1c0
lookup_open+0x69b/0x770
path_openat+0x39b/0xbb0
do_filp_open+0x85/0xf0
do_sys_open+0x11f/0x1f0
do_syscall_64+0x6b/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&caching_ctl->mutex){+.+.}:
cache_block_group+0x1ac/0x3e0
find_free_extent+0x747/0x12d0
btrfs_reserve_extent+0x96/0x170
btrfs_alloc_tree_block+0xf6/0x540
__btrfs_cow_block+0x110/0x4e0
btrfs_cow_block+0xd3/0x1f0
btrfs_search_slot+0x20d/0x9c0
btrfs_insert_empty_items+0x62/0xb0
btrfs_new_inode+0x1d3/0x6f0
btrfs_create+0xc9/0x1c0
lookup_open+0x69b/0x770
path_openat+0x39b/0xbb0
do_filp_open+0x85/0xf0
do_sys_open+0x11f/0x1f0
do_syscall_64+0x6b/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&space_info->groups_sem){++++}:
find_free_extent+0xc95/0x12d0
btrfs_reserve_extent+0x96/0x170
btrfs_alloc_tree_block+0xf6/0x540
__btrfs_cow_block+0x110/0x4e0
btrfs_cow_block+0xd3/0x1f0
btrfs_search_slot+0x20d/0x9c0
btrfs_lookup_inode+0x25/0x90
__btrfs_update_delayed_inode+0x5c/0x1f0
btrfs_commit_inode_delayed_inode+0x111/0x120
btrfs_evict_inode+0x3d3/0x540
evict+0xbc/0x180
d_delete+0xb4/0xc0
vfs_rmdir+0x10d/0x130
do_rmdir+0x1f9/0x210
do_syscall_64+0x6b/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&delayed_node->mutex){+.+.}:
__mutex_lock+0x7c/0x940
__btrfs_release_delayed_node+0x35/0x1e0
btrfs_evict_inode+0x223/0x540
evict+0xbc/0x180
dispose_list+0x38/0x60
prune_icache_sb+0x3d/0x50
super_cache_scan+0x136/0x180
do_shrink_slab+0x13a/0x3e0
shrink_slab+0x20d/0x2a0
shrink_node+0xdb/0x450
kswapd+0x3d2/0x8b0
kthread+0xfb/0x130
ret_from_fork+0x24/0x30

other info that might help us debug this:

Chain exists of:
&delayed_node->mutex --> &caching_ctl->mutex --> fs_reclaim

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(&caching_ctl->mutex);
lock(fs_reclaim);
lock(&delayed_node->mutex);

*** DEADLOCK ***

3 locks held by kswapd0/414:
#0: 00000000032f657e (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
#1: 0000000041eb3025 (shrinker_rwsem){++++}, at: shrink_slab+0x125/0x2a0
#2: 00000000c96e13d0 (&type->s_umount_key#23){++++}, at: trylock_super+0x11/0x50

stack backtrace:
CPU: 38 PID: 414 Comm: kswapd0 Not tainted 4.19.0 #25
Hardware name: System manufacturer System Product Name/PRIME X399-A, BIOS 0807 08/03/2018
Call Trace:
dump_stack+0x5f/0x8b
print_circular_bug.isra.35+0x198/0x1f0
__lock_acquire+0x12d5/0x16e0
? __lock_acquire+0x4a6/0x16e0
? lock_acquire+0xb4/0x1b0
lock_acquire+0xb4/0x1b0
? __btrfs_release_delayed_node+0x35/0x1e0
? __btrfs_release_delayed_node+0x35/0x1e0
__mutex_lock+0x7c/0x940
? __btrfs_release_delayed_node+0x35/0x1e0
? __btrfs_release_delayed_node+0x35/0x1e0
? lock_acquire+0xb4/0x1b0
? __btrfs_release_delayed_node+0x35/0x1e0
__btrfs_release_delayed_node+0x35/0x1e0
btrfs_evict_inode+0x223/0x540
evict+0xbc/0x180
dispose_list+0x38/0x60
prune_icache_sb+0x3d/0x50
super_cache_scan+0x136/0x180
do_shrink_slab+0x13a/0x3e0
shrink_slab+0x20d/0x2a0
shrink_node+0xdb/0x450
kswapd+0x3d2/0x8b0
kthread+0xfb/0x130
? mem_cgroup_shrink_node+0x240/0x240
? kthread_create_worker_on_cpu+0x40/0x40
ret_from_fork+0x24/0x30


2018-10-22 18:54:35

by David Sterba

[permalink] [raw]
Subject: Re: Circular lock dep in btrfs triggered by shrinker

On Mon, Oct 22, 2018 at 03:07:26AM -0700, Olof Johansson wrote:
> I hit the below circular locking dependency. Seems like the assumption made in
> 712e36c5f2a7fa56 ("btrfs: use GFP_KERNEL in btrfs_alloc_inode") either isn't
> true, or has since changed?

I think it must have been there from the beginning. There were reports
of this lockdep warning like the below and IIRC a few more
(https://lkml.kernel.org/r/[email protected]), but without a
resolution.

Incidentally, there was a fix that's now in the 4.20 pull and only after
I had seen your report I realized that it was the fix for the warning:

https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/commit/?h=for-4.20-part1&id=84de76a2fb217dc1b6bc2965cc397d1648aa1404

It disables the filesystem allocations using the memalloc_nofs mechanism
around free space inode allocation, while my original patch expected
only regular inodes created by VFS.

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.19.0 #25 Not tainted
> ------------------------------------------------------
> kswapd0/414 is trying to acquire lock:
> 000000008b8f1971 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node+0x35/0x1e0
> but task is already holding lock:
> 00000000032f657e (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
> -> #3 (fs_reclaim){+.+.}:
> kmem_cache_alloc+0x24/0x270
> btrfs_alloc_inode+0x1f/0x260
> alloc_inode+0x13/0x80
> iget5_locked+0x3f/0x80
> btrfs_iget+0x52/0x680
> __lookup_free_space_inode+0xd9/0x110

The patch adds memalloc_nofs into ^^^^. It's scheduled for stable so
the warning will disappear eventually, thanks for the report.

2018-10-23 07:54:08

by Michal Hocko

[permalink] [raw]
Subject: Re: Circular lock dep in btrfs triggered by shrinker

On Mon 22-10-18 20:22:43, David Sterba wrote:
> On Mon, Oct 22, 2018 at 03:07:26AM -0700, Olof Johansson wrote:
> > I hit the below circular locking dependency. Seems like the assumption made in
> > 712e36c5f2a7fa56 ("btrfs: use GFP_KERNEL in btrfs_alloc_inode") either isn't
> > true, or has since changed?
>
> I think it must have been there from the beginning. There were reports
> of this lockdep warning like the below and IIRC a few more
> (https://lkml.kernel.org/r/[email protected]), but without a
> resolution.
>
> Incidentally, there was a fix that's now in the 4.20 pull and only after
> I had seen your report I realized that it was the fix for the warning:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/commit/?h=for-4.20-part1&id=84de76a2fb217dc1b6bc2965cc397d1648aa1404
>
> It disables the filesystem allocations using the memalloc_nofs mechanism
> around free space inode allocation, while my original patch expected
> only regular inodes created by VFS.

Wouldn't it be better/possible to take the mark the scope at the higher
level where you take the trans handle?
--
Michal Hocko
SUSE Labs

2018-10-23 08:42:37

by Olof Johansson

[permalink] [raw]
Subject: Re: Circular lock dep in btrfs triggered by shrinker

On Mon, Oct 22, 2018 at 7:22 PM David Sterba <[email protected]> wrote:
>
> On Mon, Oct 22, 2018 at 03:07:26AM -0700, Olof Johansson wrote:
> > I hit the below circular locking dependency. Seems like the assumption made in
> > 712e36c5f2a7fa56 ("btrfs: use GFP_KERNEL in btrfs_alloc_inode") either isn't
> > true, or has since changed?
>
> I think it must have been there from the beginning. There were reports
> of this lockdep warning like the below and IIRC a few more
> (https://lkml.kernel.org/r/[email protected]), but without a
> resolution.
>
> Incidentally, there was a fix that's now in the 4.20 pull and only after
> I had seen your report I realized that it was the fix for the warning:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/commit/?h=for-4.20-part1&id=84de76a2fb217dc1b6bc2965cc397d1648aa1404
>
> It disables the filesystem allocations using the memalloc_nofs mechanism
> around free space inode allocation, while my original patch expected
> only regular inodes created by VFS.

Ah, great. I'll take that branch for a spin later today and confirm at
my end too.


-Olof