2019-12-10 17:34:04

by David Howells

[permalink] [raw]
Subject: [PATCH] rxrpc: Mutexes are unusable from softirq context, so use rwsem instead

rxrpc_call::user_mutex is of type struct mutex, but it's required to start
off locked on an incoming call as it is being set up in softirq context to
prevent sendmsg and recvmsg interfering with it until it is ready. It is
then unlocked in rxrpc_input_packet() to make the call live.

Unfortunately, commit a0855d24fc22d49cdc25664fb224caee16998683
("locking/mutex: Complain upon mutex API misuse in IRQ contexts") causes
big warnings to be splashed in dmesg for each a new call that comes in from
the server.

It *seems* like it should be okay, since the accept path trylocks the mutex
when no one else can see it and drops the mutex before it leaves softirq
context.

Fix this by switching to using an rw_semaphore instead as that is permitted
to be used in softirq context.

Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Signed-off-by: David Howells <[email protected]>
cc: Peter Zijlstra <[email protected]>
cc: Ingo Molnar <[email protected]>
cc: Will Deacon <[email protected]>
---

net/rxrpc/af_rxrpc.c | 10 +++++-----
net/rxrpc/ar-internal.h | 2 +-
net/rxrpc/call_accept.c | 4 ++--
net/rxrpc/call_object.c | 6 +++---
net/rxrpc/input.c | 2 +-
net/rxrpc/recvmsg.c | 14 +++++++-------
net/rxrpc/sendmsg.c | 16 ++++++++--------
7 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index d72ddb67bb74..17e55e12376c 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -322,7 +322,7 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *sock,
/* The socket has been unlocked. */
if (!IS_ERR(call)) {
call->notify_rx = notify_rx;
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
}

rxrpc_put_peer(cp.peer);
@@ -351,7 +351,7 @@ void rxrpc_kernel_end_call(struct socket *sock, struct rxrpc_call *call)
{
_enter("%d{%d}", call->debug_id, atomic_read(&call->usage));

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);
rxrpc_release_call(rxrpc_sk(sock->sk), call);

/* Make sure we're not going to call back into a kernel service */
@@ -361,7 +361,7 @@ void rxrpc_kernel_end_call(struct socket *sock, struct rxrpc_call *call)
spin_unlock_bh(&call->notify_lock);
}

- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
rxrpc_put_call(call, rxrpc_call_put_kernel);
}
EXPORT_SYMBOL(rxrpc_kernel_end_call);
@@ -456,14 +456,14 @@ void rxrpc_kernel_set_max_life(struct socket *sock, struct rxrpc_call *call,
{
unsigned long now;

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

now = jiffies;
hard_timeout += now;
WRITE_ONCE(call->expect_term_by, hard_timeout);
rxrpc_reduce_call_timer(call, hard_timeout, now, rxrpc_timer_set_for_hard);

- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
}
EXPORT_SYMBOL(rxrpc_kernel_set_max_life);

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 7c7d10f2e0c1..1701f2406463 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -557,7 +557,7 @@ struct rxrpc_call {
struct rxrpc_sock __rcu *socket; /* socket responsible */
struct rxrpc_net *rxnet; /* Network namespace to which call belongs */
const struct rxrpc_security *security; /* applied security module */
- struct mutex user_mutex; /* User access mutex */
+ struct rw_semaphore user_mutex; /* User access mutex */
unsigned long ack_at; /* When deferred ACK needs to happen */
unsigned long ack_lost_at; /* When ACK is figured as lost */
unsigned long resend_at; /* When next resend needs to happen */
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 135bf5cd8dd5..fb9a751e3c35 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -378,7 +378,7 @@ struct rxrpc_call *rxrpc_new_incoming_call(struct rxrpc_local *local,
* event and userspace is prevented from doing so until the state is
* appropriate.
*/
- if (!mutex_trylock(&call->user_mutex))
+ if (!down_write_trylock(&call->user_mutex))
BUG();

/* Make the call live. */
@@ -493,7 +493,7 @@ struct rxrpc_call *rxrpc_accept_call(struct rxrpc_sock *rx,
* We are, however, still holding the socket lock, so other accepts
* must wait for us and no one can add the user ID behind our backs.
*/
- if (mutex_lock_interruptible(&call->user_mutex) < 0) {
+ if (down_write_killable(&call->user_mutex) < 0) {
release_sock(&rx->sk);
kleave(" = -ERESTARTSYS");
return ERR_PTR(-ERESTARTSYS);
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index a31c18c09894..7a5d48b23fce 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -115,7 +115,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
if (!call->rxtx_annotations)
goto nomem_2;

- mutex_init(&call->user_mutex);
+ init_rwsem(&call->user_mutex);

/* Prevent lockdep reporting a deadlock false positive between the afs
* filesystem and sys_sendmsg() via the mmap sem.
@@ -247,7 +247,7 @@ struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
/* We need to protect a partially set up call against the user as we
* will be acting outside the socket lock.
*/
- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

/* Publish the call, even though it is incompletely set up as yet */
write_lock(&rx->call_lock);
@@ -317,7 +317,7 @@ struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
trace_rxrpc_call(call->debug_id, rxrpc_call_error,
atomic_read(&call->usage), here, ERR_PTR(ret));
rxrpc_release_call(rx, call);
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
rxrpc_put_call(call, rxrpc_call_put);
_leave(" = %d", ret);
return ERR_PTR(ret);
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 157be1ff8697..6955ad66433b 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -1397,7 +1397,7 @@ int rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
if (!call)
goto reject_packet;
rxrpc_send_ping(call, skb);
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
}

/* Process a call packet; this either discards or passes on the ref
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 8578c39ec839..e6bf52fb0ad8 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -511,12 +511,12 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
/* We're going to drop the socket lock, so we need to lock the call
* against interference by sendmsg.
*/
- if (!mutex_trylock(&call->user_mutex)) {
+ if (!down_write_trylock(&call->user_mutex)) {
ret = -EWOULDBLOCK;
if (flags & MSG_DONTWAIT)
goto error_requeue_call;
ret = -ERESTARTSYS;
- if (mutex_lock_interruptible(&call->user_mutex) < 0)
+ if (down_write_killable(&call->user_mutex) < 0)
goto error_requeue_call;
}

@@ -591,7 +591,7 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
ret = copied;

error_unlock_call:
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
rxrpc_put_call(call, rxrpc_call_put);
trace_rxrpc_recvmsg(call, rxrpc_recvmsg_return, 0, 0, 0, ret);
return ret;
@@ -651,7 +651,7 @@ int rxrpc_kernel_recv_data(struct socket *sock, struct rxrpc_call *call,

ASSERTCMP(call->state, !=, RXRPC_CALL_SERVER_ACCEPTING);

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

switch (READ_ONCE(call->state)) {
case RXRPC_CALL_CLIENT_RECV_REPLY:
@@ -704,7 +704,7 @@ int rxrpc_kernel_recv_data(struct socket *sock, struct rxrpc_call *call,

if (_service)
*_service = call->service_id;
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
_leave(" = %d [%zu,%d]", ret, iov_iter_count(iter), *_abort);
return ret;

@@ -744,7 +744,7 @@ bool rxrpc_kernel_get_reply_time(struct socket *sock, struct rxrpc_call *call,
rxrpc_seq_t hard_ack, top, seq;
bool success = false;

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

if (READ_ONCE(call->state) != RXRPC_CALL_CLIENT_RECV_REPLY)
goto out;
@@ -766,7 +766,7 @@ bool rxrpc_kernel_get_reply_time(struct socket *sock, struct rxrpc_call *call,
success = true;

out:
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
return success;
}
EXPORT_SYMBOL(rxrpc_kernel_get_reply_time);
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index 813fd6888142..d3a4749a2f8a 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -38,9 +38,9 @@ static int rxrpc_wait_for_tx_window_intr(struct rxrpc_sock *rx,
return sock_intr_errno(*timeo);

trace_rxrpc_transmit(call, rxrpc_transmit_wait);
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
*timeo = schedule_timeout(*timeo);
- if (mutex_lock_interruptible(&call->user_mutex) < 0)
+ if (down_write_killable(&call->user_mutex) < 0)
return sock_intr_errno(*timeo);
}
}
@@ -668,7 +668,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
break;
}

- ret = mutex_lock_interruptible(&call->user_mutex);
+ ret = down_write_killable_nested(&call->user_mutex, 1);
release_sock(&rx->sk);
if (ret < 0) {
ret = -ERESTARTSYS;
@@ -737,7 +737,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
}

out_put_unlock:
- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
error_put:
rxrpc_put_call(call, rxrpc_call_put);
_leave(" = %d", ret);
@@ -772,7 +772,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
ASSERTCMP(msg->msg_name, ==, NULL);
ASSERTCMP(msg->msg_control, ==, NULL);

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

_debug("CALL %d USR %lx ST %d on CONN %p",
call->debug_id, call->user_call_ID, call->state, call->conn);
@@ -796,7 +796,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
break;
}

- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
_leave(" = %d", ret);
return ret;
}
@@ -820,13 +820,13 @@ bool rxrpc_kernel_abort_call(struct socket *sock, struct rxrpc_call *call,

_enter("{%d},%d,%d,%s", call->debug_id, abort_code, error, why);

- mutex_lock(&call->user_mutex);
+ down_write(&call->user_mutex);

aborted = rxrpc_abort_call(why, call, 0, abort_code, error);
if (aborted)
rxrpc_send_abort_packet(call);

- mutex_unlock(&call->user_mutex);
+ up_write(&call->user_mutex);
return aborted;
}
EXPORT_SYMBOL(rxrpc_kernel_abort_call);


2019-12-10 19:12:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rxrpc: Mutexes are unusable from softirq context, so use rwsem instead

On Tue, Dec 10, 2019 at 05:32:58PM +0000, David Howells wrote:
> rxrpc_call::user_mutex is of type struct mutex, but it's required to start
> off locked on an incoming call as it is being set up in softirq context to
> prevent sendmsg and recvmsg interfering with it until it is ready. It is
> then unlocked in rxrpc_input_packet() to make the call live.
>
> Unfortunately, commit a0855d24fc22d49cdc25664fb224caee16998683
> ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") causes
> big warnings to be splashed in dmesg for each a new call that comes in from
> the server.
>
> It *seems* like it should be okay, since the accept path trylocks the mutex
> when no one else can see it and drops the mutex before it leaves softirq
> context.
>
> Fix this by switching to using an rw_semaphore instead as that is permitted
> to be used in softirq context.

This really has the very same problem. It just avoids the WARN. We do PI
boosting for rwsem write side identical to what we do for mutexes.

I would rather we revert David's patch for now and more carefully
consider what to do about this.

2019-12-10 19:31:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rxrpc: Mutexes are unusable from softirq context, so use rwsem instead

On Tue, Dec 10, 2019 at 08:10:09PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 10, 2019 at 05:32:58PM +0000, David Howells wrote:
> > rxrpc_call::user_mutex is of type struct mutex, but it's required to start
> > off locked on an incoming call as it is being set up in softirq context to
> > prevent sendmsg and recvmsg interfering with it until it is ready. It is
> > then unlocked in rxrpc_input_packet() to make the call live.
> >
> > Unfortunately, commit a0855d24fc22d49cdc25664fb224caee16998683
> > ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") causes
> > big warnings to be splashed in dmesg for each a new call that comes in from
> > the server.
> >
> > It *seems* like it should be okay, since the accept path trylocks the mutex
> > when no one else can see it and drops the mutex before it leaves softirq
> > context.
> >
> > Fix this by switching to using an rw_semaphore instead as that is permitted
> > to be used in softirq context.
>
> This really has the very same problem. It just avoids the WARN. We do PI
> boosting for rwsem write side identical to what we do for mutexes.
>
> I would rather we revert David's patch for now and more carefully
> consider what to do about this.

To clarify (I only just reliazed David is a bit ambiguous here), take
this patch out for now:

a0855d24fc22 ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")

The RXRPC code has been there for a while... and like I wrote, both
mutex and rwsem have the exact same issue, the rwsem code just doesn't
have a WARN on it.

2019-12-10 22:15:18

by Davidlohr Bueso

[permalink] [raw]
Subject: [PATCH] Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts"

This ended up causing some noise in places such as rxrpc running in softirq.

The warning is misleading in this case as the mutex trylock and unlock
operations are done within the same context; and therefore we need not
worry about the PI-boosting issues that comes along with no single-owner
lock guarantees.

While we don't want to support this in mutexes, there is no way out of
this yet; so lets get rid of the WARNs for now, as it is only fair to
code that has historically relied on non-preemptible softirq guarantees.
In addition, changing the lock type is also unviable: exclusive rwsems
have the same issue (just not the WARN_ON) and counting semaphores
would introduce a performance hit as mutexes are a lot more optimized.

This reverts commit 5d4ebaa87329ef226e74e52c80ac1c62e4948987.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
kernel/locking/mutex.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 54cc5f9286e9..5352ce50a97e 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -733,9 +733,6 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
*/
void __sched mutex_unlock(struct mutex *lock)
{
-#ifdef CONFIG_DEBUG_MUTEXES
- WARN_ON(in_interrupt());
-#endif
#ifndef CONFIG_DEBUG_LOCK_ALLOC
if (__mutex_unlock_fast(lock))
return;
@@ -1416,7 +1413,6 @@ int __sched mutex_trylock(struct mutex *lock)

#ifdef CONFIG_DEBUG_MUTEXES
DEBUG_LOCKS_WARN_ON(lock->magic != lock);
- WARN_ON(in_interrupt());
#endif

locked = __mutex_trylock(lock);
--
2.16.4

2019-12-10 22:34:27

by David Howells

[permalink] [raw]
Subject: Re: [PATCH] Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts"

Davidlohr Bueso <[email protected]> wrote:

> This ended up causing some noise in places such as rxrpc running in softirq.
>
> The warning is misleading in this case as the mutex trylock and unlock
> operations are done within the same context; and therefore we need not
> worry about the PI-boosting issues that comes along with no single-owner
> lock guarantees.
>
> While we don't want to support this in mutexes, there is no way out of
> this yet; so lets get rid of the WARNs for now, as it is only fair to
> code that has historically relied on non-preemptible softirq guarantees.
> In addition, changing the lock type is also unviable: exclusive rwsems
> have the same issue (just not the WARN_ON) and counting semaphores
> would introduce a performance hit as mutexes are a lot more optimized.
>
> This reverts commit 5d4ebaa87329ef226e74e52c80ac1c62e4948987.
>
> Signed-off-by: Davidlohr Bueso <[email protected]>

Tested-by: David Howells <[email protected]>

2019-12-10 23:27:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts"


* Davidlohr Bueso <[email protected]> wrote:

> This ended up causing some noise in places such as rxrpc running in softirq.
>
> The warning is misleading in this case as the mutex trylock and unlock
> operations are done within the same context; and therefore we need not
> worry about the PI-boosting issues that comes along with no single-owner
> lock guarantees.
>
> While we don't want to support this in mutexes, there is no way out of
> this yet; so lets get rid of the WARNs for now, as it is only fair to
> code that has historically relied on non-preemptible softirq guarantees.
> In addition, changing the lock type is also unviable: exclusive rwsems
> have the same issue (just not the WARN_ON) and counting semaphores
> would introduce a performance hit as mutexes are a lot more optimized.
>
> This reverts commit 5d4ebaa87329ef226e74e52c80ac1c62e4948987.

Not sure where that SHA1 came from (it's not in Linus's tree), the right
one is:

a0855d24fc22: ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")

I've fixed the changelog accordingly.

Thanks,

Ingo

Subject: [tip: locking/urgent] Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts"

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID: c571b72e2b845ca0519670cb7c4b5fe5f56498a5
Gitweb: https://git.kernel.org/tip/c571b72e2b845ca0519670cb7c4b5fe5f56498a5
Author: Davidlohr Bueso <[email protected]>
AuthorDate: Tue, 10 Dec 2019 14:05:23 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 11 Dec 2019 00:27:43 +01:00

Revert "locking/mutex: Complain upon mutex API misuse in IRQ contexts"

This ended up causing some noise in places such as rxrpc running in softirq.

The warning is misleading in this case as the mutex trylock and unlock
operations are done within the same context; and therefore we need not
worry about the PI-boosting issues that comes along with no single-owner
lock guarantees.

While we don't want to support this in mutexes, there is no way out of
this yet; so lets get rid of the WARNs for now, as it is only fair to
code that has historically relied on non-preemptible softirq guarantees.
In addition, changing the lock type is also unviable: exclusive rwsems
have the same issue (just not the WARN_ON) and counting semaphores
would introduce a performance hit as mutexes are a lot more optimized.

This reverts:

a0855d24fc22: ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")

Fixes: a0855d24fc22: ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
Signed-off-by: Davidlohr Bueso <[email protected]>
Tested-by: David Howells <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/mutex.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 54cc5f9..5352ce5 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -733,9 +733,6 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
*/
void __sched mutex_unlock(struct mutex *lock)
{
-#ifdef CONFIG_DEBUG_MUTEXES
- WARN_ON(in_interrupt());
-#endif
#ifndef CONFIG_DEBUG_LOCK_ALLOC
if (__mutex_unlock_fast(lock))
return;
@@ -1416,7 +1413,6 @@ int __sched mutex_trylock(struct mutex *lock)

#ifdef CONFIG_DEBUG_MUTEXES
DEBUG_LOCKS_WARN_ON(lock->magic != lock);
- WARN_ON(in_interrupt());
#endif

locked = __mutex_trylock(lock);