2021-02-26 17:55:01

by Davidlohr Bueso

[permalink] [raw]
Subject: [PATCH 1/4] kernel/futex: Kill rt_mutex_next_owner()

Update wake_futex_pi() and kill the call altogether. This is possible because:

(i) The case of fixup_owner() in which the pi_mutex was stolen from the
signaled enqueued top-waiter which fails to trylock and doesn't see a
current owner of the rtmutex but needs to acknowledge an non-enqueued
higher priority waiter, which is the other alternative. This used to be
handled by rt_mutex_next_owner(), which guaranteed fixup_pi_state_owner('newowner')
never to be nil. Nowadays the logic is handled by an EAGAIN loop, without
the need of rt_mutex_next_owner(). Specifically:

c1e2f0eaf015 (futex: Avoid violating the 10th rule of futex)
9f5d1c336a10 (futex: Handle transient "ownerless" rtmutex state correctly)

(ii) rt_mutex_next_owner() and rt_mutex_top_waiter() are semantically
equivalent, as of:

c28d62cf52d7 (locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter())

So instead of keeping the call around, just use the good ole rt_mutex_top_waiter().
This patch implies no change in semantics.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
kernel/futex.c | 7 +++++--
kernel/locking/rtmutex.c | 20 --------------------
kernel/locking/rtmutex_common.h | 1 -
3 files changed, 5 insertions(+), 23 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index e68db7745039..db8002dbca7a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1494,13 +1494,14 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
{
u32 curval, newval;
+ struct rt_mutex_waiter *top_waiter;
struct task_struct *new_owner;
bool postunlock = false;
DEFINE_WAKE_Q(wake_q);
int ret = 0;

- new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
- if (WARN_ON_ONCE(!new_owner)) {
+ top_waiter = rt_mutex_top_waiter(&pi_state->pi_mutex);
+ if (WARN_ON_ONCE(!top_waiter)) {
/*
* As per the comment in futex_unlock_pi() this should not happen.
*
@@ -1513,6 +1514,8 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_
goto out_unlock;
}

+ new_owner = top_waiter->task;
+
/*
* We pass it to the next owner. The WAITERS bit is always kept
* enabled while there is PI state around. We cleanup the owner
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 03b21135313c..919391c604f1 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1792,26 +1792,6 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
return ret;
}

-/**
- * rt_mutex_next_owner - return the next owner of the lock
- *
- * @lock: the rt lock query
- *
- * Returns the next owner of the lock or NULL
- *
- * Caller has to serialize against other accessors to the lock
- * itself.
- *
- * Special API call for PI-futex support
- */
-struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock)
-{
- if (!rt_mutex_has_waiters(lock))
- return NULL;
-
- return rt_mutex_top_waiter(lock)->task;
-}
-
/**
* rt_mutex_wait_proxy_lock() - Wait for lock acquisition
* @lock: the rt_mutex we were woken on
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index ca6fb489007b..a5007f00c1b7 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -130,7 +130,6 @@ enum rtmutex_chainwalk {
/*
* PI-futex support (proxy locking functions, etc.):
*/
-extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
struct task_struct *proxy_owner);
extern void rt_mutex_proxy_unlock(struct rt_mutex *lock);
--
2.26.2


2021-02-26 17:55:01

by Davidlohr Bueso

[permalink] [raw]
Subject: [PATCH 4/4] kernel/futex: Explicitly document pi_lock for pi_state owner fixup

This seems to belong in the serialization and lifetime rules section.
pi_state_update_owner() will take the pi_mutex's owner's pi_lock to
do whatever fixup, successful or not.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
kernel/futex.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/futex.c b/kernel/futex.c
index dcd6132485e1..475055715371 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -981,6 +981,7 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
* p->pi_lock:
*
* p->pi_state_list -> pi_state->list, relation
+ * pi_mutex->owner -> pi_state->owner, relation
*
* pi_state->refcount:
*
--
2.26.2

2021-03-01 15:24:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 1/4] kernel/futex: Kill rt_mutex_next_owner()



For lack of a 0/n, these patches look good to me.

Acked-by: Peter Zijlstra (Intel) <[email protected]>

Subject: [tip: locking/core] kernel/futex: Kill rt_mutex_next_owner()

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 9a4b99fce659c03699f1cb5003ebe7c57c084d49
Gitweb: https://git.kernel.org/tip/9a4b99fce659c03699f1cb5003ebe7c57c084d49
Author: Davidlohr Bueso <[email protected]>
AuthorDate: Fri, 26 Feb 2021 09:50:26 -08:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 11 Mar 2021 19:19:17 +01:00

kernel/futex: Kill rt_mutex_next_owner()

Update wake_futex_pi() and kill the call altogether. This is possible because:

(i) The case of fixup_owner() in which the pi_mutex was stolen from the
signaled enqueued top-waiter which fails to trylock and doesn't see a
current owner of the rtmutex but needs to acknowledge an non-enqueued
higher priority waiter, which is the other alternative. This used to be
handled by rt_mutex_next_owner(), which guaranteed fixup_pi_state_owner('newowner')
never to be nil. Nowadays the logic is handled by an EAGAIN loop, without
the need of rt_mutex_next_owner(). Specifically:

c1e2f0eaf015 (futex: Avoid violating the 10th rule of futex)
9f5d1c336a10 (futex: Handle transient "ownerless" rtmutex state correctly)

(ii) rt_mutex_next_owner() and rt_mutex_top_waiter() are semantically
equivalent, as of:

c28d62cf52d7 (locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter())

So instead of keeping the call around, just use the good ole rt_mutex_top_waiter().
No change in semantics.

Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/futex.c | 7 +++++--
kernel/locking/rtmutex.c | 20 --------------------
kernel/locking/rtmutex_common.h | 1 -
3 files changed, 5 insertions(+), 23 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index e68db77..db8002d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1494,13 +1494,14 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
{
u32 curval, newval;
+ struct rt_mutex_waiter *top_waiter;
struct task_struct *new_owner;
bool postunlock = false;
DEFINE_WAKE_Q(wake_q);
int ret = 0;

- new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
- if (WARN_ON_ONCE(!new_owner)) {
+ top_waiter = rt_mutex_top_waiter(&pi_state->pi_mutex);
+ if (WARN_ON_ONCE(!top_waiter)) {
/*
* As per the comment in futex_unlock_pi() this should not happen.
*
@@ -1513,6 +1514,8 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_
goto out_unlock;
}

+ new_owner = top_waiter->task;
+
/*
* We pass it to the next owner. The WAITERS bit is always kept
* enabled while there is PI state around. We cleanup the owner
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 48fff64..29f09d0 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1793,26 +1793,6 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
}

/**
- * rt_mutex_next_owner - return the next owner of the lock
- *
- * @lock: the rt lock query
- *
- * Returns the next owner of the lock or NULL
- *
- * Caller has to serialize against other accessors to the lock
- * itself.
- *
- * Special API call for PI-futex support
- */
-struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock)
-{
- if (!rt_mutex_has_waiters(lock))
- return NULL;
-
- return rt_mutex_top_waiter(lock)->task;
-}
-
-/**
* rt_mutex_wait_proxy_lock() - Wait for lock acquisition
* @lock: the rt_mutex we were woken on
* @to: the timeout, null if none. hrtimer should already have
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index ca6fb48..a5007f0 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -130,7 +130,6 @@ enum rtmutex_chainwalk {
/*
* PI-futex support (proxy locking functions, etc.):
*/
-extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
struct task_struct *proxy_owner);
extern void rt_mutex_proxy_unlock(struct rt_mutex *lock);

Subject: [tip: locking/core] kernel/futex: Explicitly document pi_lock for pi_state owner fixup

The following commit has been merged into the locking/core branch of tip:

Commit-ID: c2e4bfe0eef313eeb1c4c9d921be7a9d91d5d71a
Gitweb: https://git.kernel.org/tip/c2e4bfe0eef313eeb1c4c9d921be7a9d91d5d71a
Author: Davidlohr Bueso <[email protected]>
AuthorDate: Fri, 26 Feb 2021 09:50:29 -08:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 11 Mar 2021 19:19:17 +01:00

kernel/futex: Explicitly document pi_lock for pi_state owner fixup

This seems to belong in the serialization and lifetime rules section.
pi_state_update_owner() will take the pi_mutex's owner's pi_lock to
do whatever fixup, successful or not.

Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/futex.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/futex.c b/kernel/futex.c
index dcd6132..4750557 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -981,6 +981,7 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
* p->pi_lock:
*
* p->pi_state_list -> pi_state->list, relation
+ * pi_mutex->owner -> pi_state->owner, relation
*
* pi_state->refcount:
*