2024-02-04 23:06:27

by Chuck Lever

[permalink] [raw]
Subject: deadlock in RELEASE_LOCKOWNER path

Hi Neil-

I'm testing v6.8-rc3 + nfsd-next. This series includes:

nfsd: fix RELEASE_LOCKOWNER

and

nfsd: don't call locks_release_private() twice concurrently


======================================================
WARNING: possible circular locking dependency detected
6.8.0-rc3-00052-gc20ad5c2d60c #1 Not tainted
------------------------------------------------------
nfsd/1218 is trying to acquire lock:
ffff88814d754198 (&ctx->flc_lock){+.+.}-{2:2}, at: check_for_locks+0xf6/0x1d0 [nfsd]

but task is already holding lock:
ffff8881210e61f0 (&fp->fi_lock){+.+.}-{2:2}, at: check_for_locks+0x2d/0x1d0 [nfsd]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&fp->fi_lock){+.+.}-{2:2}:
_raw_spin_lock+0x2f/0x42
nfsd_break_deleg_cb+0x295/0x2f6 [nfsd]
__break_lease+0x38b/0x864
break_lease+0x87/0xc2
do_dentry_open+0x5df/0xc8d
vfs_open+0xbb/0xc4
dentry_open+0x5a/0x7a
__nfsd_open.isra.0+0x1ed/0x2a3 [nfsd]
nfsd_open_verified+0x16/0x1c [nfsd]
nfsd_file_do_acquire+0x149c/0x1650 [nfsd]
nfsd_file_acquire+0x16/0x1c [nfsd]
nfsd4_commit+0x72/0x10c [nfsd]
nfsd4_proc_compound+0x12c5/0x17a8 [nfsd]
nfsd_dispatch+0x32f/0x590 [nfsd]
svc_process_common+0xb64/0x13b8 [sunrpc]
svc_process+0x3b8/0x40c [sunrpc]
svc_recv+0x1478/0x1803 [sunrpc]
nfsd+0x2a1/0x2e3 [nfsd]
kthread+0x2cb/0x2da
ret_from_fork+0x2a/0x62
ret_from_fork_asm+0x1b/0x30

-> #0 (&ctx->flc_lock){+.+.}-{2:2}:
__lock_acquire+0x1e49/0x27f7
lock_acquire+0x25c/0x3df
_raw_spin_lock+0x2f/0x42
check_for_locks+0xf6/0x1d0 [nfsd]
nfsd4_release_lockowner+0x2b9/0x3a4 [nfsd]
nfsd4_proc_compound+0x12c5/0x17a8 [nfsd]
nfsd_dispatch+0x32f/0x590 [nfsd]
svc_process_common+0xb64/0x13b8 [sunrpc]
svc_process+0x3b8/0x40c [sunrpc]
svc_recv+0x1478/0x1803 [sunrpc]
nfsd+0x2a1/0x2e3 [nfsd]
kthread+0x2cb/0x2da
ret_from_fork+0x2a/0x62
ret_from_fork_asm+0x1b/0x30

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&fp->fi_lock);
lock(&ctx->flc_lock);
lock(&fp->fi_lock);
lock(&ctx->flc_lock);

*** DEADLOCK ***

2 locks held by nfsd/1218:
#0: ffff88823a3ccf90 (&clp->cl_lock#2){+.+.}-{2:2}, at: nfsd4_release_lockowner+0x22d/0x3a4 [nfsd]
#1: ffff8881210e61f0 (&fp->fi_lock){+.+.}-{2:2}, at: check_for_locks+0x2d/0x1d0 [nfsd]

stack backtrace:
CPU: 2 PID: 1218 Comm: nfsd Not tainted 6.8.0-rc3-00052-gc20ad5c2d60c #1
Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.3 10/28/2020
Call Trace:
<TASK>
dump_stack_lvl+0x70/0xa4
dump_stack+0x10/0x16
print_circular_bug+0x37c/0x38f
check_noncircular+0x16d/0x19a
? __pfx_mark_lock+0x10/0x10
? __pfx_check_noncircular+0x10/0x10
? add_chain_block+0x142/0x19c
__lock_acquire+0x1e49/0x27f7
? __pfx___lock_acquire+0x10/0x10
? check_for_locks+0xf6/0x1d0 [nfsd]
lock_acquire+0x25c/0x3df
? check_for_locks+0xf6/0x1d0 [nfsd]
? __pfx_lock_acquire+0x10/0x10
? __kasan_check_write+0x14/0x1a
? do_raw_spin_lock+0x146/0x1ea
_raw_spin_lock+0x2f/0x42
? check_for_locks+0xf6/0x1d0 [nfsd]
check_for_locks+0xf6/0x1d0 [nfsd]
nfsd4_release_lockowner+0x2b9/0x3a4 [nfsd]
? __pfx_nfsd4_release_lockowner+0x10/0x10 [nfsd]
nfsd4_proc_compound+0x12c5/0x17a8 [nfsd]
? nfsd_read_splice_ok+0xe/0x1f [nfsd]
nfsd_dispatch+0x32f/0x590 [nfsd]
? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
? svc_reserve+0xed/0xfc [sunrpc]
svc_process_common+0xb64/0x13b8 [sunrpc]
? __pfx_svc_process_common+0x10/0x10 [sunrpc]
? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
? xdr_inline_decode+0x92/0x1cb [sunrpc]
svc_process+0x3b8/0x40c [sunrpc]
svc_recv+0x1478/0x1803 [sunrpc]
nfsd+0x2a1/0x2e3 [nfsd]
? __kthread_parkme+0xcc/0x19c
? __pfx_nfsd+0x10/0x10 [nfsd]
kthread+0x2cb/0x2da
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2a/0x62
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>

I get a very similar splat while testing v5.15.y and v5.10.y
with "nfsd: fix RELEASE_LOCKOWNER" applied. I'm circling back
to v6.1.y as well, but I expect I will see the same there.


--
Chuck Lever




2024-02-05 00:56:43

by NeilBrown

[permalink] [raw]
Subject: Re: deadlock in RELEASE_LOCKOWNER path

On Mon, 05 Feb 2024, Chuck Lever III wrote:
> Hi Neil-
>
> I'm testing v6.8-rc3 + nfsd-next. This series includes:
>
> nfsd: fix RELEASE_LOCKOWNER
>
> and
>
> nfsd: don't call locks_release_private() twice concurrently
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.8.0-rc3-00052-gc20ad5c2d60c #1 Not tainted
> ------------------------------------------------------
> nfsd/1218 is trying to acquire lock:
> ffff88814d754198 (&ctx->flc_lock){+.+.}-{2:2}, at: check_for_locks+0xf6/0x1d0 [nfsd]
>
> but task is already holding lock:
> ffff8881210e61f0 (&fp->fi_lock){+.+.}-{2:2}, at: check_for_locks+0x2d/0x1d0 [nfsd]
>
> which lock already depends on the new lock.

I should have found this when I was checking on flc_lock... sorry.

I think this could actually deadlock if a lease was being broken on a
file at the same time that some interesting file locking was happening
... though that might not actually be possible as conflicting locks and
delegations rarely mix. Still - we should fix it.

One approach would be to split flc_lock into two locks, one for leases
and one for posix+flock locks. That would avoid this conflict, but
isn't particularly elegant.

I'm not at all certain that nfsd_break_deleg_cb() needs to take fi_lock.
In earlier code it needed to walk ->fi_delegations and that would need
the lock - but that was removed in v4.17-rc1~110^2~22.

The only remaining possible need for fi_lock is fi_had_conflict.
nfsd_break_deleg_cb() take the lock when setting the flag, and
nfsd_set_delegation() takes the lock when testing the flag. I cannot
see why the strong exclusion is needed.

We test fi_had_conflict early in nfsd_set_delegation as an optimisation,
and that makes sense.
Test it again after calling vfs_setlease() to get the lease makes sense
in case there was some other grant/revoke before the early test and the
vfs_setlease(). But once vfs_setlease has succeed and we confirmed no
conflict yet, I cannot see why we would abort the least. If a revoke
came in while nfsd_set_deletation() is running the result doesn't need
to be different to if a revoke comes in just aster nfsd_set_delegation()
completes.... Does it?

So I think we can drop the lock frome break_deleg_cb() and make the
importance of fi_had_conflict explicit by moving the test out of the
lock.
e.g.

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 12534e12dbb3..8b112673d389 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5154,10 +5154,8 @@ nfsd_break_deleg_cb(struct file_lock *fl)
*/
fl->fl_break_time = 0;

- spin_lock(&fp->fi_lock);
fp->fi_had_conflict = true;
nfsd_break_one_deleg(dp);
- spin_unlock(&fp->fi_lock);
return false;
}

@@ -5771,13 +5769,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
if (status)
goto out_unlock;

+ status = -EAGAIN;
+ if (fp->fi_had_conflict)
+ goto out_unlock;
+
spin_lock(&state_lock);
spin_lock(&clp->cl_lock);
spin_lock(&fp->fi_lock);
- if (fp->fi_had_conflict)
- status = -EAGAIN;
- else
- status = hash_delegation_locked(dp, fp);
+ status = hash_delegation_locked(dp, fp);
spin_unlock(&fp->fi_lock);
spin_unlock(&clp->cl_lock);
spin_unlock(&state_lock);


fi_had_conflict is set under flc_lock, and vfs_setlease() will take
flc_lock, so while the set and test may appear lockless, they aren't.

I'll do some basic testing and send a proper patch for your
consideration.

Thanks
NeilBrown