2020-12-11 11:43:37

by Sergey Senozhatsky

Subject: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

Hi,

I got the following lockdep splat the other day while running some
tests on 4.19. I didn't test other stable kernels, but it seems that
5.4 should also have a similar problem.

As far as I can tell, ext4_dir_open() has been removed quite recently:
https://www.mail-archive.com/[email protected]/msg18727.html

Eric, Ted, Jaegeuk, how difficult would it be to remove ext4_dir_open()
from the stable kernels? (I'm interested in ext4 in particular; I
understand that other filesystems may also need similar patches.)



[ 133.454721] kswapd0/79 is trying to acquire lock:
[ 133.454724] 00000000a815a55f (jbd2_handle){++++}, at: start_this_handle+0x1f9/0x859
[ 133.454730]
but task is already holding lock:
[ 133.454731] 00000000106bd5a3 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x2f
[ 133.454736]
which lock already depends on the new lock.

[ 133.454739]
the existing dependency chain (in reverse order) is:
[ 133.454740]
-> #2 (fs_reclaim){+.+.}:
[ 133.454745] kmem_cache_alloc_trace+0x44/0x28b
[ 133.454748] mempool_create_node+0x46/0x92
[ 133.454750] fscrypt_initialize+0xa0/0xbf
[ 133.454752] fscrypt_get_encryption_info+0xa4/0x774
[ 133.454754] ext4_dir_open+0x1b/0x2d
[ 133.454757] do_dentry_open+0x144/0x36d
[ 133.454759] path_openat+0x2d7/0x156d
[ 133.454762] do_filp_open+0x97/0x13e
[ 133.454764] do_sys_open+0x128/0x3a3
[ 133.454766] do_syscall_64+0x6f/0x22a
[ 133.454769] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 133.454771]
-> #1 (fscrypt_init_mutex){+.+.}:
[ 133.454774] mutex_lock_nested+0x20/0x26
[ 133.454776] fscrypt_initialize+0x20/0xbf
[ 133.454778] fscrypt_get_encryption_info+0xa4/0x774
[ 133.454780] fscrypt_inherit_context+0xbe/0xe6
[ 133.454782] __ext4_new_inode+0x11ee/0x1631
[ 133.454785] ext4_mkdir+0x112/0x416
[ 133.454787] vfs_mkdir2+0x135/0x1c6
[ 133.454789] do_mkdirat+0xc3/0x138
[ 133.454791] do_syscall_64+0x6f/0x22a
[ 133.454793] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 133.454795]
-> #0 (jbd2_handle){++++}:
[ 133.454798] start_this_handle+0x21c/0x859
[ 133.454800] jbd2__journal_start+0xa2/0x282
[ 133.454802] ext4_release_dquot+0x58/0x93
[ 133.454804] dqput+0x196/0x1ec
[ 133.454806] __dquot_drop+0x8d/0xb2
[ 133.454809] ext4_clear_inode+0x22/0x8c
[ 133.454811] ext4_evict_inode+0x127/0x662
[ 133.454813] evict+0xc0/0x241
[ 133.454816] dispose_list+0x36/0x54
[ 133.454818] prune_icache_sb+0x56/0x76
[ 133.454820] super_cache_scan+0x13a/0x19c
[ 133.454822] shrink_slab+0x39a/0x572
[ 133.454824] shrink_node+0x3f8/0x63b
[ 133.454826] balance_pgdat+0x1bd/0x326
[ 133.454828] kswapd+0x2ad/0x510
[ 133.454831] kthread+0x14d/0x155
[ 133.454833] ret_from_fork+0x24/0x50
[ 133.454834]
other info that might help us debug this:

[ 133.454836] Chain exists of:
jbd2_handle --> fscrypt_init_mutex --> fs_reclaim

[ 133.454840] Possible unsafe locking scenario:

[ 133.454841]        CPU0                    CPU1
[ 133.454843]        ----                    ----
[ 133.454844]   lock(fs_reclaim);
[ 133.454846]                                lock(fscrypt_init_mutex);
[ 133.454848]                                lock(fs_reclaim);
[ 133.454850]   lock(jbd2_handle);
[ 133.454851]
*** DEADLOCK ***

[ 133.454854] 3 locks held by kswapd0/79:
[ 133.454855] #0: 00000000106bd5a3 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x2f
[ 133.454859] #1: 00000000c230047b (shrinker_rwsem){++++}, at: shrink_slab+0x3b/0x572
[ 133.454862] #2: 00000000ce797452 (&type->s_umount_key#45){++++}, at: trylock_super+0x1b/0x47
[ 133.454866]
stack backtrace:
[ 133.454869] CPU: 6 PID: 79 Comm: kswapd0 Not tainted 4.19.161 #43
[ 133.454872] Call Trace:
[ 133.454877] dump_stack+0xbd/0x11d
[ 133.454880] ? print_circular_bug+0x2c1/0x2d4
[ 133.454883] __lock_acquire+0x1977/0x1981
[ 133.454886] ? start_this_handle+0x1f9/0x859
[ 133.454888] lock_acquire+0x1b7/0x202
[ 133.454890] ? start_this_handle+0x1f9/0x859
[ 133.454893] start_this_handle+0x21c/0x859
[ 133.454895] ? start_this_handle+0x1f9/0x859
[ 133.454897] ? kmem_cache_alloc+0x1d1/0x27d
[ 133.454900] jbd2__journal_start+0xa2/0x282
[ 133.454902] ? __ext4_journal_start_sb+0x10b/0x208
[ 133.454905] ext4_release_dquot+0x58/0x93
[ 133.454907] dqput+0x196/0x1ec
[ 133.454909] __dquot_drop+0x8d/0xb2
[ 133.454912] ? dquot_drop+0x27/0x43
[ 133.454914] ext4_clear_inode+0x22/0x8c
[ 133.454917] ext4_evict_inode+0x127/0x662
[ 133.454920] evict+0xc0/0x241
[ 133.454923] dispose_list+0x36/0x54
[ 133.454926] prune_icache_sb+0x56/0x76
[ 133.454928] super_cache_scan+0x13a/0x19c
[ 133.454931] shrink_slab+0x39a/0x572
[ 133.454934] shrink_node+0x3f8/0x63b
[ 133.454938] balance_pgdat+0x1bd/0x326
[ 133.454941] kswapd+0x2ad/0x510
[ 133.454946] ? init_wait_entry+0x2e/0x2e
[ 133.454949] kthread+0x14d/0x155
[ 133.454951] ? wakeup_kswapd+0x20d/0x20d
[ 133.454953] ? kthread_destroy_worker+0x62/0x62
[ 133.454956] ret_from_fork+0x24/0x50
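
For readers less used to lockdep output: the scenario above is a classic
ABBA inversion, with fscrypt_init_mutex as the middle link of the chain.
A minimal userspace analogy (just a sketch -- two pthread mutexes standing
in for the fs_reclaim and jbd2_handle lock classes, nothing fscrypt-specific;
build with cc -pthread) that hangs deterministically:

/* abba.c - toy model of the lockdep scenario above; both threads
 * grab their first lock, then block forever on the other's. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t fs_reclaim  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t jbd2_handle = PTHREAD_MUTEX_INITIALIZER;

/* "CPU0": reclaim path -- holds fs_reclaim, then needs jbd2_handle. */
static void *cpu0(void *arg)
{
	pthread_mutex_lock(&fs_reclaim);
	sleep(1);                         /* let cpu1 take jbd2_handle */
	pthread_mutex_lock(&jbd2_handle); /* blocks forever */
	pthread_mutex_unlock(&jbd2_handle);
	pthread_mutex_unlock(&fs_reclaim);
	return NULL;
}

/* "CPU1": transaction path -- holds jbd2_handle, then allocates,
 * which can recurse into reclaim (fs_reclaim). */
static void *cpu1(void *arg)
{
	pthread_mutex_lock(&jbd2_handle);
	sleep(1);                         /* let cpu0 take fs_reclaim */
	pthread_mutex_lock(&fs_reclaim);  /* blocks forever */
	pthread_mutex_unlock(&fs_reclaim);
	pthread_mutex_unlock(&jbd2_handle);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_join(t0, NULL);  /* never returns once the ABBA hits */
	pthread_join(t1, NULL);
	puts("no deadlock this run");
	return 0;
}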

-ss


2020-12-11 11:46:13

by Eric Biggers

Subject: Re: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

On Fri, Dec 11, 2020 at 12:36:57PM +0900, Sergey Senozhatsky wrote:
> Hi,
>
> I got the following lockdep splat the other day while running some
> tests on 4.19. I didn't test other stable kernels, but it seems that
> 5.4 should also have a similar problem.
>
> As far as I can tell, ext4_dir_open() has been removed quite recently:
> https://www.mail-archive.com/[email protected]/msg18727.html
>
> Eric, Ted, Jaegeuk, how difficult would it be to remove ext4_dir_open()
> from the stable kernels? (I'm interested in ext4 in particular; I
> understand that other filesystems may also need similar patches.)
>
>
>
> [ 133.454721] kswapd0/79 is trying to acquire lock:
> [ 133.454724] 00000000a815a55f (jbd2_handle){++++}, at: start_this_handle+0x1f9/0x859
> [ 133.454730]
> but task is already holding lock:
> [ 133.454731] 00000000106bd5a3 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x2f
> [ 133.454736]
> which lock already depends on the new lock.
>
> [ 133.454739]
> the existing dependency chain (in reverse order) is:
> [ 133.454740]
> -> #2 (fs_reclaim){+.+.}:
> [ 133.454745] kmem_cache_alloc_trace+0x44/0x28b
> [ 133.454748] mempool_create_node+0x46/0x92
> [ 133.454750] fscrypt_initialize+0xa0/0xbf
> [ 133.454752] fscrypt_get_encryption_info+0xa4/0x774
> [ 133.454754] ext4_dir_open+0x1b/0x2d
> [ 133.454757] do_dentry_open+0x144/0x36d
> [ 133.454759] path_openat+0x2d7/0x156d
> [ 133.454762] do_filp_open+0x97/0x13e
> [ 133.454764] do_sys_open+0x128/0x3a3
> [ 133.454766] do_syscall_64+0x6f/0x22a
> [ 133.454769] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 133.454771]
> -> #1 (fscrypt_init_mutex){+.+.}:
> [ 133.454774] mutex_lock_nested+0x20/0x26
> [ 133.454776] fscrypt_initialize+0x20/0xbf
> [ 133.454778] fscrypt_get_encryption_info+0xa4/0x774
> [ 133.454780] fscrypt_inherit_context+0xbe/0xe6
> [ 133.454782] __ext4_new_inode+0x11ee/0x1631
> [ 133.454785] ext4_mkdir+0x112/0x416
> [ 133.454787] vfs_mkdir2+0x135/0x1c6
> [ 133.454789] do_mkdirat+0xc3/0x138
> [ 133.454791] do_syscall_64+0x6f/0x22a
> [ 133.454793] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 133.454795]
> -> #0 (jbd2_handle){++++}:
> [ 133.454798] start_this_handle+0x21c/0x859
> [ 133.454800] jbd2__journal_start+0xa2/0x282
> [ 133.454802] ext4_release_dquot+0x58/0x93
> [ 133.454804] dqput+0x196/0x1ec
> [ 133.454806] __dquot_drop+0x8d/0xb2
> [ 133.454809] ext4_clear_inode+0x22/0x8c
> [ 133.454811] ext4_evict_inode+0x127/0x662
> [ 133.454813] evict+0xc0/0x241
> [ 133.454816] dispose_list+0x36/0x54
> [ 133.454818] prune_icache_sb+0x56/0x76
> [ 133.454820] super_cache_scan+0x13a/0x19c
> [ 133.454822] shrink_slab+0x39a/0x572
> [ 133.454824] shrink_node+0x3f8/0x63b
> [ 133.454826] balance_pgdat+0x1bd/0x326
> [ 133.454828] kswapd+0x2ad/0x510
> [ 133.454831] kthread+0x14d/0x155
> [ 133.454833] ret_from_fork+0x24/0x50
> [ 133.454834]
> other info that might help us debug this:
>
> [ 133.454836] Chain exists of:
> jbd2_handle --> fscrypt_init_mutex --> fs_reclaim
>
> [ 133.454840] Possible unsafe locking scenario:
>
> [ 133.454841]        CPU0                    CPU1
> [ 133.454843]        ----                    ----
> [ 133.454844]   lock(fs_reclaim);
> [ 133.454846]                                lock(fscrypt_init_mutex);
> [ 133.454848]                                lock(fs_reclaim);
> [ 133.454850]   lock(jbd2_handle);
> [ 133.454851]

This actually got fixed by the patch series
https://lkml.kernel.org/linux-fscrypt/[email protected]/
which went into 5.10. The more recent patch to remove ext4_dir_open() isn't
related.

It's a hard patch series to backport. Backporting it to 5.4 would be somewhat
feasible, while 4.19 would be very difficult as there have been a lot of other
fscrypt commits which would heavily conflict with cherry-picks.

How interested are you in having this fixed? Did you encounter an actual
deadlock or just the lockdep report?

- Eric

2020-12-11 11:46:45

by Sergey Senozhatsky

Subject: Re: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

On (20/12/10 19:48), Eric Biggers wrote:
> >
> > [ 133.454836] Chain exists of:
> > jbd2_handle --> fscrypt_init_mutex --> fs_reclaim
> >
> > [ 133.454840] Possible unsafe locking scenario:
> >
> > [ 133.454841]        CPU0                    CPU1
> > [ 133.454843]        ----                    ----
> > [ 133.454844]   lock(fs_reclaim);
> > [ 133.454846]                                lock(fscrypt_init_mutex);
> > [ 133.454848]                                lock(fs_reclaim);
> > [ 133.454850]   lock(jbd2_handle);
> > [ 133.454851]
>
> This actually got fixed by the patch series
> https://lkml.kernel.org/linux-fscrypt/[email protected]/
> which went into 5.10. The more recent patch to remove ext4_dir_open() isn't
> related.
>
> It's a hard patch series to backport. Backporting it to 5.4 would be somewhat
> feasible, while 4.19 would be very difficult as there have been a lot of other
> fscrypt commits which would heavily conflict with cherry-picks.
>
> How interested are you in having this fixed? Did you encounter an actual
> deadlock or just the lockdep report?

Difficult to say. On one hand, 'yes': I do see lockups on my devices
(4.19 kernel), but I can't tell at the moment what the root cause is.
So, on the other hand, 'no': I can't say that it's because of
ext4_dir_open().

What I've seen so far involved ext4, kswapd, khugepaged, and lots of other things.

[ 1598.655901] INFO: task khugepaged:66 blocked for more than 122 seconds.
[ 1598.655914] Call Trace:
[ 1598.655920] __schedule+0x506/0x1240
[ 1598.655924] ? kvm_zap_rmapp+0x52/0x69
[ 1598.655927] schedule+0x3f/0x78
[ 1598.655929] __rwsem_down_read_failed_common+0x186/0x201
[ 1598.655933] call_rwsem_down_read_failed+0x14/0x30
[ 1598.655936] down_read+0x2e/0x45
[ 1598.655939] rmap_walk_file+0x73/0x1ce
[ 1598.655941] page_referenced+0x10d/0x154
[ 1598.655948] shrink_active_list+0x1d4/0x475

[..]

[ 1598.655986] INFO: task kswapd0:79 blocked for more than 122 seconds.
[ 1598.655993] Call Trace:
[ 1598.655995] __schedule+0x506/0x1240
[ 1598.655998] schedule+0x3f/0x78
[ 1598.656000] __rwsem_down_read_failed_common+0x186/0x201
[ 1598.656003] call_rwsem_down_read_failed+0x14/0x30
[ 1598.656006] down_read+0x2e/0x45
[ 1598.656008] rmap_walk_file+0x73/0x1ce
[ 1598.656010] page_referenced+0x10d/0x154
[ 1598.656015] shrink_active_list+0x1d4/0x475

[..]

[ 1598.658233] __rwsem_down_read_failed_common+0x186/0x201
[ 1598.658235] call_rwsem_down_read_failed+0x14/0x30
[ 1598.658238] down_read+0x2e/0x45
[ 1598.658240] rmap_walk_file+0x73/0x1ce
[ 1598.658242] page_referenced+0x10d/0x154
[ 1598.658247] shrink_active_list+0x1d4/0x475
[ 1598.658250] shrink_node+0x27e/0x661
[ 1598.658254] try_to_free_pages+0x425/0x7ec
[ 1598.658258] __alloc_pages_nodemask+0x80b/0x1514
[ 1598.658279] __do_page_cache_readahead+0xd4/0x1a9
[ 1598.658282] filemap_fault+0x346/0x573
[ 1598.658287] ext4_filemap_fault+0x31/0x44

-ss

2020-12-11 11:55:42

by Sergey Senozhatsky

Subject: Re: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

On (20/12/11 13:08), Sergey Senozhatsky wrote:
> >
> > How interested are you in having this fixed? Did you encounter an actual
> > deadlock or just the lockdep report?
>

Got one more. fscrypt_get_encryption_info() again, but from ext4_lookup().

[ 162.840909] kswapd0/80 is trying to acquire lock:
[ 162.840912] 0000000078ea628f (jbd2_handle){++++}, at: start_this_handle+0x1f9/0x859
[ 162.840919]
but task is already holding lock:
[ 162.840922] 00000000314ed5a0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x2f
[ 162.840929]
which lock already depends on the new lock.

[ 162.840932]
the existing dependency chain (in reverse order) is:
[ 162.840934]
-> #2 (fs_reclaim){+.+.}:
[ 162.840940] kmem_cache_alloc_trace+0x44/0x28b
[ 162.840944] mempool_create_node+0x46/0x92
[ 162.840947] fscrypt_initialize+0xa0/0xbf
[ 162.840950] fscrypt_get_encryption_info+0xa4/0x774
[ 162.840953] fscrypt_setup_filename+0x99/0x2d1
[ 162.840956] __fscrypt_prepare_lookup+0x25/0x6b
[ 162.840960] ext4_lookup+0x1b2/0x323
[ 162.840963] path_openat+0x9a5/0x156d
[ 162.840966] do_filp_open+0x97/0x13e
[ 162.840970] do_sys_open+0x128/0x3a3
[ 162.840973] do_syscall_64+0x6f/0x22a
[ 162.840977] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 162.840979]
-> #1 (fscrypt_init_mutex){+.+.}:
[ 162.840983] mutex_lock_nested+0x20/0x26
[ 162.840986] fscrypt_initialize+0x20/0xbf
[ 162.840989] fscrypt_get_encryption_info+0xa4/0x774
[ 162.840992] fscrypt_inherit_context+0xbe/0xe6
[ 162.840995] __ext4_new_inode+0x11ee/0x1631
[ 162.840999] ext4_mkdir+0x112/0x416
[ 162.841002] vfs_mkdir2+0x135/0x1c6
[ 162.841004] do_mkdirat+0xc3/0x138
[ 162.841007] do_syscall_64+0x6f/0x22a
[ 162.841011] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 162.841012]
-> #0 (jbd2_handle){++++}:
[ 162.841017] start_this_handle+0x21c/0x859
[ 162.841019] jbd2__journal_start+0xa2/0x282
[ 162.841022] ext4_release_dquot+0x58/0x93
[ 162.841025] dqput+0x196/0x1ec
[ 162.841028] __dquot_drop+0x8d/0xb2
[ 162.841032] ext4_clear_inode+0x22/0x8c
[ 162.841035] ext4_evict_inode+0x127/0x662
[ 162.841038] evict+0xc0/0x241
[ 162.841041] dispose_list+0x36/0x54
[ 162.841045] prune_icache_sb+0x56/0x76
[ 162.841048] super_cache_scan+0x13a/0x19c
[ 162.841051] shrink_slab+0x39a/0x572
[ 162.841054] shrink_node+0x3f8/0x63b
[ 162.841056] balance_pgdat+0x1bd/0x326
[ 162.841059] kswapd+0x2ad/0x510
[ 162.841062] kthread+0x14d/0x155
[ 162.841066] ret_from_fork+0x24/0x50
[ 162.841068]
other info that might help us debug this:

[ 162.841070] Chain exists of:
jbd2_handle --> fscrypt_init_mutex --> fs_reclaim

[ 162.841075] Possible unsafe locking scenario:

[ 162.841077]        CPU0                    CPU1
[ 162.841079]        ----                    ----
[ 162.841081]   lock(fs_reclaim);
[ 162.841084]                                lock(fscrypt_init_mutex);
[ 162.841086]                                lock(fs_reclaim);
[ 162.841089]   lock(jbd2_handle);
[ 162.841091]
*** DEADLOCK ***

[ 162.841095] 3 locks held by kswapd0/80:
[ 162.841097] #0: 00000000314ed5a0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x2f
[ 162.841102] #1: 00000000be0d2066 (shrinker_rwsem){++++}, at: shrink_slab+0x3b/0x572
[ 162.841107] #2: 000000007c23fde5 (&type->s_umount_key#45){++++}, at: trylock_super+0x1b/0x47
[ 162.841111]
stack backtrace:
[ 162.841115] CPU: 0 PID: 80 Comm: kswapd0 Not tainted 4.19.161 #44
[ 162.841121] Call Trace:
[ 162.841127] dump_stack+0xbd/0x11d
[ 162.841131] ? print_circular_bug+0x2c1/0x2d4
[ 162.841135] __lock_acquire+0x1977/0x1981
[ 162.841139] ? start_this_handle+0x1f9/0x859
[ 162.841142] lock_acquire+0x1b7/0x202
[ 162.841145] ? start_this_handle+0x1f9/0x859
[ 162.841149] start_this_handle+0x21c/0x859
[ 162.841151] ? start_this_handle+0x1f9/0x859
[ 162.841155] ? kmem_cache_alloc+0x1d1/0x27d
[ 162.841159] jbd2__journal_start+0xa2/0x282
[ 162.841162] ? __ext4_journal_start_sb+0x10b/0x208
[ 162.841165] ext4_release_dquot+0x58/0x93
[ 162.841169] dqput+0x196/0x1ec
[ 162.841172] __dquot_drop+0x8d/0xb2
[ 162.841175] ? dquot_drop+0x27/0x43
[ 162.841179] ext4_clear_inode+0x22/0x8c
[ 162.841183] ext4_evict_inode+0x127/0x662
[ 162.841187] evict+0xc0/0x241
[ 162.841191] dispose_list+0x36/0x54
[ 162.841195] prune_icache_sb+0x56/0x76
[ 162.841198] super_cache_scan+0x13a/0x19c
[ 162.841202] shrink_slab+0x39a/0x572
[ 162.841206] shrink_node+0x3f8/0x63b
[ 162.841212] balance_pgdat+0x1bd/0x326
[ 162.841217] kswapd+0x2ad/0x510
[ 162.841223] ? init_wait_entry+0x2e/0x2e
[ 162.841227] kthread+0x14d/0x155
[ 162.841230] ? wakeup_kswapd+0x20d/0x20d
[ 162.841233] ? kthread_destroy_worker+0x62/0x62
[ 162.841237] ret_from_fork+0x24/0x50

-ss

2020-12-11 19:07:53

by Eric Biggers

Subject: Re: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

On Fri, Dec 11, 2020 at 01:08:07PM +0900, Sergey Senozhatsky wrote:
> On (20/12/10 19:48), Eric Biggers wrote:
> > >
> > > [ 133.454836] Chain exists of:
> > > jbd2_handle --> fscrypt_init_mutex --> fs_reclaim
> > >
> > > [ 133.454840] Possible unsafe locking scenario:
> > >
> > > [ 133.454841]        CPU0                    CPU1
> > > [ 133.454843]        ----                    ----
> > > [ 133.454844]   lock(fs_reclaim);
> > > [ 133.454846]                                lock(fscrypt_init_mutex);
> > > [ 133.454848]                                lock(fs_reclaim);
> > > [ 133.454850]   lock(jbd2_handle);
> > > [ 133.454851]
> >
> > This actually got fixed by the patch series
> > https://lkml.kernel.org/linux-fscrypt/[email protected]/
> > which went into 5.10. The more recent patch to remove ext4_dir_open() isn't
> > related.
> >
> > It's a hard patch series to backport. Backporting it to 5.4 would be somewhat
> > feasible, while 4.19 would be very difficult as there have been a lot of other
> > fscrypt commits which would heavily conflict with cherry-picks.
> >
> > How interested are you in having this fixed? Did you encounter an actual
> > deadlock or just the lockdep report?
>
> Difficult to say. On one hand, 'yes': I do see lockups on my devices
> (4.19 kernel), but I can't tell at the moment what the root cause is.
> So, on the other hand, 'no': I can't say that it's because of
> ext4_dir_open().
>
> What I've seen so far involved ext4, kswapd, khugepaged, and lots of other things.
>
> [ 1598.655901] INFO: task khugepaged:66 blocked for more than 122 seconds.
> [ 1598.655914] Call Trace:
> [ 1598.655920] __schedule+0x506/0x1240
> [ 1598.655924] ? kvm_zap_rmapp+0x52/0x69
> [ 1598.655927] schedule+0x3f/0x78
> [ 1598.655929] __rwsem_down_read_failed_common+0x186/0x201
> [ 1598.655933] call_rwsem_down_read_failed+0x14/0x30
> [ 1598.655936] down_read+0x2e/0x45
> [ 1598.655939] rmap_walk_file+0x73/0x1ce
> [ 1598.655941] page_referenced+0x10d/0x154
> [ 1598.655948] shrink_active_list+0x1d4/0x475
>
> [..]
>
> [ 1598.655986] INFO: task kswapd0:79 blocked for more than 122 seconds.
> [ 1598.655993] Call Trace:
> [ 1598.655995] __schedule+0x506/0x1240
> [ 1598.655998] schedule+0x3f/0x78
> [ 1598.656000] __rwsem_down_read_failed_common+0x186/0x201
> [ 1598.656003] call_rwsem_down_read_failed+0x14/0x30
> [ 1598.656006] down_read+0x2e/0x45
> [ 1598.656008] rmap_walk_file+0x73/0x1ce
> [ 1598.656010] page_referenced+0x10d/0x154
> [ 1598.656015] shrink_active_list+0x1d4/0x475
>
> [..]
>
> [ 1598.658233] __rwsem_down_read_failed_common+0x186/0x201
> [ 1598.658235] call_rwsem_down_read_failed+0x14/0x30
> [ 1598.658238] down_read+0x2e/0x45
> [ 1598.658240] rmap_walk_file+0x73/0x1ce
> [ 1598.658242] page_referenced+0x10d/0x154
> [ 1598.658247] shrink_active_list+0x1d4/0x475
> [ 1598.658250] shrink_node+0x27e/0x661
> [ 1598.658254] try_to_free_pages+0x425/0x7ec
> [ 1598.658258] __alloc_pages_nodemask+0x80b/0x1514
> [ 1598.658279] __do_page_cache_readahead+0xd4/0x1a9
> [ 1598.658282] filemap_fault+0x346/0x573
> [ 1598.658287] ext4_filemap_fault+0x31/0x44

Could you provide some more information about what is causing these actual
lockups for you? Are there more stack traces?

I'd be surprised if it's related to the fscrypt lockdep reports you're
getting, as the fscrypt bug seemed unlikely to cause deadlocks in practice:
most of the time, fscrypt_get_encryption_info() does *not* do or wait for a
GFP_KERNEL allocation. It's only in certain cases (generally, when things are
being initialized, as opposed to already running) that it could.

See, e.g., how the lockdep reports assume a GFP_KERNEL allocation done under
fscrypt_init_mutex, but that only happens the first time an encrypted directory
is accessed after boot. So that path can't cause a deadlock after that.
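
For reference, fscrypt_initialize() follows the usual once-only init
pattern. Roughly (a from-memory sketch, not the verbatim 4.19 source;
the pool size is a placeholder):

#include <linux/mutex.h>
#include <linux/mempool.h>

static DEFINE_MUTEX(fscrypt_init_mutex);
static mempool_t *fscrypt_bounce_page_pool;

static int fscrypt_initialize_sketch(void)
{
	int err = 0;

	/* Fast path after the first success: no mutex, no allocation,
	 * hence no fscrypt_init_mutex -> fs_reclaim edge anymore. */
	if (fscrypt_bounce_page_pool)
		return 0;

	mutex_lock(&fscrypt_init_mutex);
	if (!fscrypt_bounce_page_pool) {
		/* GFP_KERNEL allocation under fscrypt_init_mutex: this
		 * is what lockdep sees, but it only happens once. */
		fscrypt_bounce_page_pool =
			mempool_create_page_pool(32 /* placeholder */, 0);
		if (!fscrypt_bounce_page_pool)
			err = -ENOMEM;
	}
	mutex_unlock(&fscrypt_init_mutex);
	return err;
}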

This is also a 5-year-old bug, so it's unclear why it would suddenly be
causing problems just now...

Maybe something changed that exposed the bug more. I don't know what it would
be, though.

As I said, the fix would be difficult to backport. It required a redesign of
how encrypted files get created, as there were lots of different ways in which
fscrypt_get_encryption_info() could be called during a filesystem transaction.
There's a nontrivial risk of regressions by backporting it. (In fact I already
found and fixed two regressions in it upstream...)
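
To give an idea of the shape of the redesign, here's a simplified sketch
(using the helpers the series introduced, fscrypt_prepare_new_inode()
and fscrypt_set_context(); the structure and credits value are
illustrative, not the actual ext4 diff):

#include <linux/err.h>
#include <linux/fscrypt.h>
#include "ext4_jbd2.h"	/* assumes in-tree ext4 context */

static int create_encrypted_inode_sketch(struct inode *dir,
					 struct inode *inode)
{
	handle_t *handle;
	bool encrypt = false;
	int err;

	/* Allocation-heavy key setup happens *before* a journal handle
	 * exists, so GFP_KERNEL reclaim can no longer end up waiting on
	 * jbd2_handle while fscrypt locks are held. */
	err = fscrypt_prepare_new_inode(dir, inode, &encrypt);
	if (err)
		return err;

	handle = ext4_journal_start(dir, EXT4_HT_DIR, 1 /* placeholder */);
	if (IS_ERR(handle))
		return PTR_ERR(handle);

	/* Inside the handle only the cheap on-disk context write
	 * remains; no reclaim-prone allocations. */
	if (encrypt)
		err = fscrypt_set_context(inode, handle);

	ext4_journal_stop(handle);
	return err;
}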

So it would be helpful to know if this is important enough to you that you would
be willing to accept a risk of regressions above that of a typical stable
backport in order to get the re-design that fixes this issue backported.

- Eric

2021-01-13 11:36:46

by Sergey Senozhatsky

Subject: Re: [stable] ext4 fscrypt_get_encryption_info() circular locking dependency

Hello,

Eric, sorry for the delay.

On (20/12/11 10:03), Eric Biggers wrote:
> > [..]
> >
> > [ 1598.658233] __rwsem_down_read_failed_common+0x186/0x201
> > [ 1598.658235] call_rwsem_down_read_failed+0x14/0x30
> > [ 1598.658238] down_read+0x2e/0x45
> > [ 1598.658240] rmap_walk_file+0x73/0x1ce
> > [ 1598.658242] page_referenced+0x10d/0x154
> > [ 1598.658247] shrink_active_list+0x1d4/0x475
> > [ 1598.658250] shrink_node+0x27e/0x661
> > [ 1598.658254] try_to_free_pages+0x425/0x7ec
> > [ 1598.658258] __alloc_pages_nodemask+0x80b/0x1514
> > [ 1598.658279] __do_page_cache_readahead+0xd4/0x1a9
> > [ 1598.658282] filemap_fault+0x346/0x573
> > [ 1598.658287] ext4_filemap_fault+0x31/0x44
>
> Could you provide some more information about what is causing these actual
> lockups for you? Are there more stack traces?

I think I have some leads, and, just as you said, this does not appear
to be ext4-related.

A likely root cause for the lockups I'm observing is that kswapd
and virtio_balloon have reverse locking orders for THP pages:

down_write(mapping->i_mmap_rwsem) --> page->PG_locked
vs
page->PG_locked --> down_read(mapping->i_mmap_rwsem)
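
In userspace terms the inversion reduces to roughly the following
(a pthread analogy only, nothing THP-specific, and without attributing
which path belongs to kswapd vs virtio_balloon: a rwlock stands in for
i_mmap_rwsem and a mutex for the page lock; build with cc -pthread).
Note that the read side deadlocks just as readily once a writer holds
the lock:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  pg_locked    = PTHREAD_MUTEX_INITIALIZER;

/* Path A: write-locks the rwsem, then needs the page lock. */
static void *path_a(void *arg)
{
	pthread_rwlock_wrlock(&i_mmap_rwsem);
	sleep(1);                        /* let path_b take the page lock */
	pthread_mutex_lock(&pg_locked);  /* blocks forever */
	pthread_mutex_unlock(&pg_locked);
	pthread_rwlock_unlock(&i_mmap_rwsem);
	return NULL;
}

/* Path B: holds the page lock, then read-locks the rwsem. */
static void *path_b(void *arg)
{
	pthread_mutex_lock(&pg_locked);
	sleep(1);                             /* let path_a take the rwsem */
	pthread_rwlock_rdlock(&i_mmap_rwsem); /* blocks: writer holds it */
	pthread_rwlock_unlock(&i_mmap_rwsem);
	pthread_mutex_unlock(&pg_locked);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, path_a, NULL);
	pthread_create(&b, NULL, path_b, NULL);
	pthread_join(a, NULL);  /* hangs if the inversion hits */
	pthread_join(b, NULL);
	puts("no deadlock this run");
	return 0;
}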

-ss