2021-03-09 03:00:58

by Zheng Yejian

Subject: [PATCH 4.4 0/3] Backport patch series to update Futex from 4.9

Lee sent a patchset to update futex for 4.9, see https://www.spinics.net/lists/stable/msg443081.html.
Xiaoming then sent a follow-up patch for it, see https://lore.kernel.org/lkml/20210225093120.GD641347@dell/.

These patchsets may also resolve the following issues in 4.4.260, which have already been reported for 4.9;
see https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-4.4.y&id=319f66f08de1083c1fe271261665c209009dd65a
> /*
> * The task is on the way out. When the futex state is
> * FUTEX_STATE_DEAD, we know that the task has finished
> * the cleanup:
> */
> int ret = (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;

This should probably be:
int ret = (p->futex_state == FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;

> raw_spin_unlock_irq(&p->pi_lock);
> /*
> * If the owner task is between FUTEX_STATE_EXITING and
> * FUTEX_STATE_DEAD then store the task pointer and keep
> * the reference on the task struct. The calling code will
> * drop all locks, wait for the task to reach
> * FUTEX_STATE_DEAD and then drop the refcount. This is
> * required to prevent a live lock when the current task
> * preempted the exiting task between the two states.
> */
> if (ret == -EBUSY)

And here the variable "ret" can only be "-ESRCH" or "-EAGAIN", never "-EBUSY".

> *exiting = p;
> else
> put_task_struct(p);

Since 074e7d515783 ("futex: Ensure the correct return value from futex_lock_pi()") has
already been merged in 4.4.260, I am sending the remaining 3 patches.

Peter Zijlstra (1):
futex: Change locking rules

Thomas Gleixner (2):
futex: Cure exit race
futex: fix dead code in attach_to_pi_owner()

kernel/futex.c | 209 +++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 177 insertions(+), 32 deletions(-)

--
2.25.4


2021-03-09 03:01:23

by Zheng Yejian

Subject: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

From: Thomas Gleixner <[email protected]>

The handle_exit_race() function, introduced in commit 9c3f39860367
("futex: Cure exit race"), never returns -EBUSY. This results
in a small piece of dead code in the attach_to_pi_owner() function:

int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
...
if (ret == -EBUSY)
*exiting = p; /* dead code */

The -EBUSY return value was added to handle_exit_race() in upstream
commit ac31c7ff8624409 ("futex: Provide distinct return value when
owner is exiting"). That commit was incorporated into v4.9.255, before
the function handle_exit_race() was introduced, and therefore without
the modification to handle_exit_race().

To fix the dead code, extract the handle_exit_race() change from
commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
is exiting") and re-apply it.

Lee writes:

This commit takes the remaining functional snippet of:

ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")

... and is the correct fix for this issue.

Fixes: 9c3f39860367 ("futex: Cure exit race")
Cc: [email protected] # v4.9.258
Signed-off-by: Xiaoming Ni <[email protected]>
Reviewed-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
---
kernel/futex.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 116766ef7de6..98c65b3c3a00 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1202,11 +1202,11 @@ static int handle_exit_race(u32 __user *uaddr, u32 uval,
u32 uval2;

/*
- * If the futex exit state is not yet FUTEX_STATE_DEAD, wait
- * for it to finish.
+ * If the futex exit state is not yet FUTEX_STATE_DEAD, tell the
+ * caller that the alleged owner is busy.
*/
if (tsk && tsk->futex_state != FUTEX_STATE_DEAD)
- return -EAGAIN;
+ return -EBUSY;

/*
* Reread the user space value to handle the following situation:
--
2.25.4

2021-03-09 03:02:31

by Zheng Yejian

Subject: [PATCH 4.4 2/3] futex: Cure exit race

From: Thomas Gleixner <[email protected]>

commit da791a667536bf8322042e38ca85d55a78d3c273 upstream.

Stefan reported that the glibc tst-robustpi4 test case fails
occasionally. That case creates the following race between
sys_exit() and sys_futex_lock_pi():

CPU0                            CPU1

sys_exit()                      sys_futex()
 do_exit()                       futex_lock_pi()
  exit_signals(tsk)               No waiters:
   tsk->flags |= PF_EXITING;      *uaddr == 0x00000PID
  mm_release(tsk)                 Set waiter bit
   exit_robust_list(tsk) {        *uaddr = 0x80000PID;
      Set owner died              attach_to_pi_owner() {
    *uaddr = 0xC0000000;           tsk = get_task(PID);
   }                               if (!tsk->flags & PF_EXITING) {
  ...                                attach();
  tsk->flags |= PF_EXITPIDONE;     } else {
                                     if (!(tsk->flags & PF_EXITPIDONE))
                                       return -EAGAIN;
                                     return -ESRCH; <--- FAIL
                                   }

ESRCH is returned all the way to user space, which triggers the glibc test
case assert. Returning ESRCH unconditionally is wrong here because the user
space value has been changed by the exiting task to 0xC0000000, i.e. the
FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
is a valid state and the kernel has to handle it, i.e. taking the futex.

Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
are set in the task which 'owns' the futex. If the value has changed, let
the kernel retry the operation, which includes all regular sanity checks
and correctly handles the FUTEX_OWNER_DIED case.

If it hasn't changed, then return ESRCH as there is no way to distinguish
this case from malfunctioning user space. This happens when the exiting
task did not have a robust list, the robust list was corrupted or the user
space value in the futex was simply bogus.

Reported-by: Stefan Liebler <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: [email protected]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[Lee: Required to satisfy functional dependency from futex back-port.
Re-add the missing handle_exit_race() parts from:
3d4775df0a89 ("futex: Replace PF_EXITPIDONE with a state")]
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
---
kernel/futex.c | 71 +++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 65 insertions(+), 6 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index b410752f5ad1..116766ef7de6 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1196,11 +1196,67 @@ static void wait_for_owner_exiting(int ret, struct task_struct *exiting)
put_task_struct(exiting);
}

+static int handle_exit_race(u32 __user *uaddr, u32 uval,
+ struct task_struct *tsk)
+{
+ u32 uval2;
+
+ /*
+ * If the futex exit state is not yet FUTEX_STATE_DEAD, wait
+ * for it to finish.
+ */
+ if (tsk && tsk->futex_state != FUTEX_STATE_DEAD)
+ return -EAGAIN;
+
+ /*
+ * Reread the user space value to handle the following situation:
+ *
+ * CPU0 CPU1
+ *
+ * sys_exit() sys_futex()
+ * do_exit() futex_lock_pi()
+ * futex_lock_pi_atomic()
+ * exit_signals(tsk) No waiters:
+ * tsk->flags |= PF_EXITING; *uaddr == 0x00000PID
+ * mm_release(tsk) Set waiter bit
+ * exit_robust_list(tsk) { *uaddr = 0x80000PID;
+ * Set owner died attach_to_pi_owner() {
+ * *uaddr = 0xC0000000; tsk = get_task(PID);
+ * } if (!tsk->flags & PF_EXITING) {
+ * ... attach();
+ * tsk->futex_state = } else {
+ * FUTEX_STATE_DEAD; if (tsk->futex_state !=
+ * FUTEX_STATE_DEAD)
+ * return -EAGAIN;
+ * return -ESRCH; <--- FAIL
+ * }
+ *
+ * Returning ESRCH unconditionally is wrong here because the
+ * user space value has been changed by the exiting task.
+ *
+ * The same logic applies to the case where the exiting task is
+ * already gone.
+ */
+ if (get_futex_value_locked(&uval2, uaddr))
+ return -EFAULT;
+
+ /* If the user space value has changed, try again. */
+ if (uval2 != uval)
+ return -EAGAIN;
+
+ /*
+ * The exiting task did not have a robust list, the robust list was
+ * corrupted or the user space value in *uaddr is simply bogus.
+ * Give up and tell user space.
+ */
+ return -ESRCH;
+}
+
/*
* Lookup the task for the TID provided from user space and attach to
* it after doing proper sanity checks.
*/
-static int attach_to_pi_owner(u32 uval, union futex_key *key,
+static int attach_to_pi_owner(u32 __user *uaddr, u32 uval, union futex_key *key,
struct futex_pi_state **ps,
struct task_struct **exiting)
{
@@ -1211,12 +1267,15 @@ static int attach_to_pi_owner(u32 uval, union futex_key *key,
/*
* We are the first waiter - try to look up the real owner and attach
* the new pi_state to it, but bail out when TID = 0 [1]
+ *
+ * The !pid check is paranoid. None of the call sites should end up
+ * with pid == 0, but better safe than sorry. Let the caller retry
*/
if (!pid)
- return -ESRCH;
+ return -EAGAIN;
p = futex_find_get_task(pid);
if (!p)
- return -ESRCH;
+ return handle_exit_race(uaddr, uval, NULL);

if (unlikely(p->flags & PF_KTHREAD)) {
put_task_struct(p);
@@ -1235,7 +1294,7 @@ static int attach_to_pi_owner(u32 uval, union futex_key *key,
* FUTEX_STATE_DEAD, we know that the task has finished
* the cleanup:
*/
- int ret = (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
+ int ret = handle_exit_race(uaddr, uval, p);

raw_spin_unlock_irq(&p->pi_lock);
/*
@@ -1301,7 +1360,7 @@ static int lookup_pi_state(u32 __user *uaddr, u32 uval,
* We are the first waiter - try to look up the owner based on
* @uval and attach to it.
*/
- return attach_to_pi_owner(uval, key, ps, exiting);
+ return attach_to_pi_owner(uaddr, uval, key, ps, exiting);
}

static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
@@ -1417,7 +1476,7 @@ static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb,
* attach to the owner. If that fails, no harm done, we only
* set the FUTEX_WAITERS bit in the user space variable.
*/
- return attach_to_pi_owner(uval, key, ps, exiting);
+ return attach_to_pi_owner(uaddr, newval, key, ps, exiting);
}

/**
--
2.25.4

2021-03-09 03:03:05

by Zheng Yejian

Subject: [PATCH 4.4 1/3] futex: Change locking rules

From: Peter Zijlstra <[email protected]>

Currently futex-pi relies on hb->lock to serialize everything. But hb->lock
creates another set of problems, especially priority inversions on RT where
hb->lock becomes a rt_mutex itself.

The rt_mutex::wait_lock is the most obvious protection for keeping the
futex user space value and the kernel internal pi_state in sync.

Rework and document the locking so rt_mutex::wait_lock is held across all
operations which modify the user space value and the pi state.

This allows invoking rt_mutex_unlock() (including deboost) without holding
hb->lock as a next step.

Nothing yet relies on the new locking rules.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
[Lee: Back-ported in support of a previous futex back-port attempt]
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
---
kernel/futex.c | 138 +++++++++++++++++++++++++++++++++++++++----------
1 file changed, 112 insertions(+), 26 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index a14b7ef90e5c..b410752f5ad1 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1014,6 +1014,39 @@ static void exit_pi_state_list(struct task_struct *curr)
* [10] There is no transient state which leaves owner and user space
* TID out of sync. Except one error case where the kernel is denied
* write access to the user address, see fixup_pi_state_owner().
+ *
+ *
+ * Serialization and lifetime rules:
+ *
+ * hb->lock:
+ *
+ * hb -> futex_q, relation
+ * futex_q -> pi_state, relation
+ *
+ * (cannot be raw because hb can contain arbitrary amount
+ * of futex_q's)
+ *
+ * pi_mutex->wait_lock:
+ *
+ * {uval, pi_state}
+ *
+ * (and pi_mutex 'obviously')
+ *
+ * p->pi_lock:
+ *
+ * p->pi_state_list -> pi_state->list, relation
+ *
+ * pi_state->refcount:
+ *
+ * pi_state lifetime
+ *
+ *
+ * Lock order:
+ *
+ * hb->lock
+ * pi_mutex->wait_lock
+ * p->pi_lock
+ *
*/

/*
@@ -1021,10 +1054,12 @@ static void exit_pi_state_list(struct task_struct *curr)
* the pi_state against the user space value. If correct, attach to
* it.
*/
-static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
+static int attach_to_pi_state(u32 __user *uaddr, u32 uval,
+ struct futex_pi_state *pi_state,
struct futex_pi_state **ps)
{
pid_t pid = uval & FUTEX_TID_MASK;
+ int ret, uval2;

/*
* Userspace might have messed up non-PI and PI futexes [3]
@@ -1032,8 +1067,33 @@ static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
if (unlikely(!pi_state))
return -EINVAL;

+ /*
+ * We get here with hb->lock held, and having found a
+ * futex_top_waiter(). This means that futex_lock_pi() of said futex_q
+ * has dropped the hb->lock in between queue_me() and unqueue_me_pi(),
+ * which in turn means that futex_lock_pi() still has a reference on
+ * our pi_state.
+ */
WARN_ON(!atomic_read(&pi_state->refcount));

+ /*
+ * Now that we have a pi_state, we can acquire wait_lock
+ * and do the state validation.
+ */
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+
+ /*
+ * Since {uval, pi_state} is serialized by wait_lock, and our current
+ * uval was read without holding it, it can have changed. Verify it
+ * still is what we expect it to be, otherwise retry the entire
+ * operation.
+ */
+ if (get_futex_value_locked(&uval2, uaddr))
+ goto out_efault;
+
+ if (uval != uval2)
+ goto out_eagain;
+
/*
* Handle the owner died case:
*/
@@ -1049,11 +1109,11 @@ static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
* is not 0. Inconsistent state. [5]
*/
if (pid)
- return -EINVAL;
+ goto out_einval;
/*
* Take a ref on the state and return success. [4]
*/
- goto out_state;
+ goto out_attach;
}

/*
@@ -1065,14 +1125,14 @@ static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
* Take a ref on the state and return success. [6]
*/
if (!pid)
- goto out_state;
+ goto out_attach;
} else {
/*
* If the owner died bit is not set, then the pi_state
* must have an owner. [7]
*/
if (!pi_state->owner)
- return -EINVAL;
+ goto out_einval;
}

/*
@@ -1081,11 +1141,29 @@ static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
* user space TID. [9/10]
*/
if (pid != task_pid_vnr(pi_state->owner))
- return -EINVAL;
-out_state:
+ goto out_einval;
+
+out_attach:
atomic_inc(&pi_state->refcount);
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
*ps = pi_state;
return 0;
+
+out_einval:
+ ret = -EINVAL;
+ goto out_error;
+
+out_eagain:
+ ret = -EAGAIN;
+ goto out_error;
+
+out_efault:
+ ret = -EFAULT;
+ goto out_error;
+
+out_error:
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+ return ret;
}

/**
@@ -1178,6 +1256,9 @@ static int attach_to_pi_owner(u32 uval, union futex_key *key,

/*
* No existing pi state. First waiter. [2]
+ *
+ * This creates pi_state, we have hb->lock held, this means nothing can
+ * observe this state, wait_lock is irrelevant.
*/
pi_state = alloc_pi_state();

@@ -1202,7 +1283,8 @@ static int attach_to_pi_owner(u32 uval, union futex_key *key,
return 0;
}

-static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+static int lookup_pi_state(u32 __user *uaddr, u32 uval,
+ struct futex_hash_bucket *hb,
union futex_key *key, struct futex_pi_state **ps,
struct task_struct **exiting)
{
@@ -1213,7 +1295,7 @@ static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
* attach to the pi_state when the validation succeeds.
*/
if (match)
- return attach_to_pi_state(uval, match->pi_state, ps);
+ return attach_to_pi_state(uaddr, uval, match->pi_state, ps);

/*
* We are the first waiter - try to look up the owner based on
@@ -1232,7 +1314,7 @@ static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
return -EFAULT;

- /*If user space value changed, let the caller retry */
+ /* If user space value changed, let the caller retry */
return curval != uval ? -EAGAIN : 0;
}

@@ -1296,7 +1378,7 @@ static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb,
*/
match = futex_top_waiter(hb, key);
if (match)
- return attach_to_pi_state(uval, match->pi_state, ps);
+ return attach_to_pi_state(uaddr, uval, match->pi_state, ps);

/*
* No waiter and user TID is 0. We are here because the
@@ -1436,6 +1518,7 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,

if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)) {
ret = -EFAULT;
+
} else if (curval != uval) {
/*
* If a unconditional UNLOCK_PI operation (user space did not
@@ -1969,7 +2052,7 @@ retry_private:
* rereading and handing potential crap to
* lookup_pi_state.
*/
- ret = lookup_pi_state(ret, hb2, &key2,
+ ret = lookup_pi_state(uaddr2, ret, hb2, &key2,
&pi_state, &exiting);
}

@@ -2247,7 +2330,6 @@ static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
int err = 0;

oldowner = pi_state->owner;
-
/*
* We are here because either:
*
@@ -2266,11 +2348,10 @@ static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
* because we can fault here. Imagine swapped out pages or a fork
* that marked all the anonymous memory readonly for cow.
*
- * Modifying pi_state _before_ the user space value would
- * leave the pi_state in an inconsistent state when we fault
- * here, because we need to drop the hash bucket lock to
- * handle the fault. This might be observed in the PID check
- * in lookup_pi_state.
+ * Modifying pi_state _before_ the user space value would leave the
+ * pi_state in an inconsistent state when we fault here, because we
+ * need to drop the locks to handle the fault. This might be observed
+ * in the PID check in lookup_pi_state.
*/
retry:
if (!argowner) {
@@ -2331,21 +2412,26 @@ retry:
return argowner == current;

/*
- * To handle the page fault we need to drop the hash bucket
- * lock here. That gives the other task (either the highest priority
- * waiter itself or the task which stole the rtmutex) the
- * chance to try the fixup of the pi_state. So once we are
- * back from handling the fault we need to check the pi_state
- * after reacquiring the hash bucket lock and before trying to
- * do another fixup. When the fixup has been done already we
- * simply return.
+ * To handle the page fault we need to drop the locks here. That gives
+ * the other task (either the highest priority waiter itself or the
+ * task which stole the rtmutex) the chance to try the fixup of the
+ * pi_state. So once we are back from handling the fault we need to
+ * check the pi_state after reacquiring the locks and before trying to
+ * do another fixup. When the fixup has been done already we simply
+ * return.
+ *
+ * Note: we hold both hb->lock and pi_mutex->wait_lock. We can safely
+ * drop hb->lock since the caller owns the hb -> futex_q relation.
+ * Dropping the pi_mutex->wait_lock requires the state revalidate.
*/
handle_fault:
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(q->lock_ptr);

err = fault_in_user_writeable(uaddr);

spin_lock(q->lock_ptr);
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);

/*
* Check if someone else fixed it for us:
--
2.25.4

2021-03-09 10:42:46

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
> From: Thomas Gleixner <[email protected]>
>
> The handle_exit_race() function is defined in commit 9c3f39860367
> ("futex: Cure exit race"), which never returns -EBUSY. This results
> in a small piece of dead code in the attach_to_pi_owner() function:
>
> int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
> ...
> if (ret == -EBUSY)
> *exiting = p; /* dead code */
>
> The return value -EBUSY is added to handle_exit_race() in upsteam
> commit ac31c7ff8624409 ("futex: Provide distinct return value when
> owner is exiting"). This commit was incorporated into v4.9.255, before
> the function handle_exit_race() was introduced, whitout Modify
> handle_exit_race().
>
> To fix dead code, extract the change of handle_exit_race() from
> commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
> is exiting"), re-incorporated.
>
> Lee writes:
>
> This commit takes the remaining functional snippet of:
>
> ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
>
> ... and is the correct fix for this issue.
>
> Fixes: 9c3f39860367 ("futex: Cure exit race")
> Cc: [email protected] # v4.9.258
> Signed-off-by: Xiaoming Ni <[email protected]>
> Reviewed-by: Lee Jones <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> Signed-off-by: Zheng Yejian <[email protected]>
> ---
> kernel/futex.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)

Same here, what is the upstream git id?

thanks,

greg k-h

2021-03-09 10:43:51

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.4 1/3] futex: Change locking rules

On Tue, Mar 09, 2021 at 11:06:03AM +0800, Zheng Yejian wrote:
> From: Peter Zijlstra <[email protected]>
>
> Currently futex-pi relies on hb->lock to serialize everything. But hb->lock
> creates another set of problems, especially priority inversions on RT where
> hb->lock becomes a rt_mutex itself.
>
> The rt_mutex::wait_lock is the most obvious protection for keeping the
> futex user space value and the kernel internal pi_state in sync.
>
> Rework and document the locking so rt_mutex::wait_lock is held accross all
> operations which modify the user space value and the pi state.
>
> This allows to invoke rt_mutex_unlock() (including deboost) without holding
> hb->lock as a next step.
>
> Nothing yet relies on the new locking rules.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Thomas Gleixner <[email protected]>
> [Lee: Back-ported in support of a previous futex back-port attempt]
> Signed-off-by: Lee Jones <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> Signed-off-by: Zheng Yejian <[email protected]>
> ---
> kernel/futex.c | 138 +++++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 112 insertions(+), 26 deletions(-)

What is the git commit id of this patch in Linus's tree?

thanks,

greg k-h

2021-03-09 10:43:56

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.4 0/3] Backport patch series to update Futex from 4.9

On Tue, Mar 09, 2021 at 11:06:02AM +0800, Zheng Yejian wrote:
> Lee sent a patchset to update Futex for 4.9, see https://www.spinics.net/lists/stable/msg443081.html,
> Then Xiaoming sent a follow-up patch for it, see https://lore.kernel.org/lkml/20210225093120.GD641347@dell/.
>
> These patchsets may also resolve following issues in 4.4.260 which have been reported in 4.9,
> see https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-4.4.y&id=319f66f08de1083c1fe271261665c209009dd65a
> > /*
> > * The task is on the way out. When the futex state is
> > * FUTEX_STATE_DEAD, we know that the task has finished
> > * the cleanup:
> > */
> > int ret = (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
>
> Here may be:
> int ret = (p->futex_state == FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
>
> > raw_spin_unlock_irq(&p->pi_lock);
> > /*
> > * If the owner task is between FUTEX_STATE_EXITING and
> > * FUTEX_STATE_DEAD then store the task pointer and keep
> > * the reference on the task struct. The calling code will
> > * drop all locks, wait for the task to reach
> > * FUTEX_STATE_DEAD and then drop the refcount. This is
> > * required to prevent a live lock when the current task
> > * preempted the exiting task between the two states.
> > */
> > if (ret == -EBUSY)
>
> And here, the variable "ret" may only be "-ESRCH" or "-EAGAIN", but not "-EBUSY".
>
> > *exiting = p;
> > else
> > put_task_struct(p);
>
> Since 074e7d515783 ("futex: Ensure the correct return value from futex_lock_pi()") has
> been merged in 4.4.260, I send the remain 3 patches.

There already are 2 futex patches in the 4.4.y stable queue, do those
not resolve these issues for you?

If not, please resend this series with the needed git commit ids added to
them.

thanks,

greg k-h

2021-03-09 18:18:35

by Lee Jones

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

On Tue, 09 Mar 2021, Greg KH wrote:

> On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
> > From: Thomas Gleixner <[email protected]>
> >
> > The handle_exit_race() function is defined in commit 9c3f39860367
> > ("futex: Cure exit race"), which never returns -EBUSY. This results
> > in a small piece of dead code in the attach_to_pi_owner() function:
> >
> > int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
> > ...
> > if (ret == -EBUSY)
> > *exiting = p; /* dead code */
> >
> > The return value -EBUSY is added to handle_exit_race() in upsteam
> > commit ac31c7ff8624409 ("futex: Provide distinct return value when
> > owner is exiting"). This commit was incorporated into v4.9.255, before
> > the function handle_exit_race() was introduced, whitout Modify
> > handle_exit_race().
> >
> > To fix dead code, extract the change of handle_exit_race() from
> > commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
> > is exiting"), re-incorporated.
> >
> > Lee writes:
> >
> > This commit takes the remaining functional snippet of:
> >
> > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
> >
> > ... and is the correct fix for this issue.
> >
> > Fixes: 9c3f39860367 ("futex: Cure exit race")
> > Cc: [email protected] # v4.9.258
> > Signed-off-by: Xiaoming Ni <[email protected]>
> > Reviewed-by: Lee Jones <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > Signed-off-by: Zheng Yejian <[email protected]>
> > ---
> > kernel/futex.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
>
> Same here, what is the upstream git id?

It doesn't have one as such - it's a part-patch:

> > This commit takes the remaining functional snippet of:
> >
> > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2021-03-10 12:03:04

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

On Tue, Mar 09, 2021 at 06:14:37PM +0000, Lee Jones wrote:
> On Tue, 09 Mar 2021, Greg KH wrote:
>
> > On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
> > > From: Thomas Gleixner <[email protected]>
> > >
> > > The handle_exit_race() function is defined in commit 9c3f39860367
> > > ("futex: Cure exit race"), which never returns -EBUSY. This results
> > > in a small piece of dead code in the attach_to_pi_owner() function:
> > >
> > > int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
> > > ...
> > > if (ret == -EBUSY)
> > > *exiting = p; /* dead code */
> > >
> > > The return value -EBUSY is added to handle_exit_race() in upsteam
> > > commit ac31c7ff8624409 ("futex: Provide distinct return value when
> > > owner is exiting"). This commit was incorporated into v4.9.255, before
> > > the function handle_exit_race() was introduced, whitout Modify
> > > handle_exit_race().
> > >
> > > To fix dead code, extract the change of handle_exit_race() from
> > > commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
> > > is exiting"), re-incorporated.
> > >
> > > Lee writes:
> > >
> > > This commit takes the remaining functional snippet of:
> > >
> > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
> > >
> > > ... and is the correct fix for this issue.
> > >
> > > Fixes: 9c3f39860367 ("futex: Cure exit race")
> > > Cc: [email protected] # v4.9.258
> > > Signed-off-by: Xiaoming Ni <[email protected]>
> > > Reviewed-by: Lee Jones <[email protected]>
> > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > > Signed-off-by: Zheng Yejian <[email protected]>
> > > ---
> > > kernel/futex.c | 6 +++---
> > > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > Same here, what is the upstream git id?
>
> It doesn't have one as such - it's a part-patch:
>
> > > This commit takes the remaining functional snippet of:
> > >
> > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")

That wasn't obvious :(

Is this a backport of another patch in the stable tree somewhere?

confused,

greg k-h

2021-03-10 13:33:21

by Lee Jones

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

On Wed, 10 Mar 2021, Greg KH wrote:

> On Tue, Mar 09, 2021 at 06:14:37PM +0000, Lee Jones wrote:
> > On Tue, 09 Mar 2021, Greg KH wrote:
> >
> > > On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
> > > > From: Thomas Gleixner <[email protected]>
> > > >
> > > > The handle_exit_race() function is defined in commit 9c3f39860367
> > > > ("futex: Cure exit race"), which never returns -EBUSY. This results
> > > > in a small piece of dead code in the attach_to_pi_owner() function:
> > > >
> > > > int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
> > > > ...
> > > > if (ret == -EBUSY)
> > > > *exiting = p; /* dead code */
> > > >
> > > > The return value -EBUSY is added to handle_exit_race() in upsteam
> > > > commit ac31c7ff8624409 ("futex: Provide distinct return value when
> > > > owner is exiting"). This commit was incorporated into v4.9.255, before
> > > > the function handle_exit_race() was introduced, whitout Modify
> > > > handle_exit_race().
> > > >
> > > > To fix dead code, extract the change of handle_exit_race() from
> > > > commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
> > > > is exiting"), re-incorporated.
> > > >
> > > > Lee writes:
> > > >
> > > > This commit takes the remaining functional snippet of:
> > > >
> > > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
> > > >
> > > > ... and is the correct fix for this issue.
> > > >
> > > > Fixes: 9c3f39860367 ("futex: Cure exit race")
> > > > Cc: [email protected] # v4.9.258
> > > > Signed-off-by: Xiaoming Ni <[email protected]>
> > > > Reviewed-by: Lee Jones <[email protected]>
> > > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > > > Signed-off-by: Zheng Yejian <[email protected]>
> > > > ---
> > > > kernel/futex.c | 6 +++---
> > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > Same here, what is the upstream git id?
> >
> > It doesn't have one as such - it's a part-patch:
> >
> > > > This commit takes the remaining functional snippet of:
> > > >
> > > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
>
> That wasn't obvious :(

This was also my thinking, which is why I replied to the original
patch in an attempt to clarify what I thought was happening.

> Is this a backport of another patch in the stable tree somewhere?

Yes, it looks like it.

The full patch was back-ported to v4.14 as:

e6e00df182908f34360c3c9f2d13cc719362e9c0

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2021-03-10 14:12:06

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()

On Wed, Mar 10, 2021 at 01:28:02PM +0000, Lee Jones wrote:
> On Wed, 10 Mar 2021, Greg KH wrote:
>
> > On Tue, Mar 09, 2021 at 06:14:37PM +0000, Lee Jones wrote:
> > > On Tue, 09 Mar 2021, Greg KH wrote:
> > >
> > > > On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
> > > > > From: Thomas Gleixner <[email protected]>
> > > > >
> > > > > The handle_exit_race() function is defined in commit 9c3f39860367
> > > > > ("futex: Cure exit race"), which never returns -EBUSY. This results
> > > > > in a small piece of dead code in the attach_to_pi_owner() function:
> > > > >
> > > > > int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
> > > > > ...
> > > > > if (ret == -EBUSY)
> > > > > *exiting = p; /* dead code */
> > > > >
> > > > > > The return value -EBUSY is added to handle_exit_race() in upstream
> > > > > > commit ac31c7ff8624409 ("futex: Provide distinct return value when
> > > > > > owner is exiting"). This commit was incorporated into v4.9.255, before
> > > > > > the function handle_exit_race() was introduced, without modifying
> > > > > > handle_exit_race().
> > > > > >
> > > > > > To fix the dead code, extract the handle_exit_race() change from
> > > > > > commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
> > > > > > is exiting") and re-apply it.
> > > > >
> > > > > Lee writes:
> > > > >
> > > > > This commit takes the remaining functional snippet of:
> > > > >
> > > > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
> > > > >
> > > > > ... and is the correct fix for this issue.
> > > > >
> > > > > Fixes: 9c3f39860367 ("futex: Cure exit race")
> > > > > Cc: [email protected] # v4.9.258
> > > > > Signed-off-by: Xiaoming Ni <[email protected]>
> > > > > Reviewed-by: Lee Jones <[email protected]>
> > > > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> > > > > Signed-off-by: Zheng Yejian <[email protected]>
> > > > > ---
> > > > > kernel/futex.c | 6 +++---
> > > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > Same here, what is the upstream git id?
> > >
> > > It doesn't have one as such - it's a part-patch:
> > >
> > > > > This commit takes the remaining functional snippet of:
> > > > >
> > > > > ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
> >
> > That wasn't obvious :(
>
> This was also my thinking, which is why I replied to the original
> patch in an attempt to clarify what I thought was happening.
>
> > Is this a backport of another patch in the stable tree somewhere?
>
> Yes, it looks like it.
>
> The full patch was back-ported to v4.14 as:
>
> e6e00df182908f34360c3c9f2d13cc719362e9c0

Ok, Zheng, can you put this information in the patch and resend the
whole series?

thanks,

greg k-h

2021-03-11 01:41:51

by Zheng Yejian

Subject: Re: [PATCH 4.4 3/3] futex: fix dead code in attach_to_pi_owner()



On 2021/3/10 22:10, Greg KH wrote:
> On Wed, Mar 10, 2021 at 01:28:02PM +0000, Lee Jones wrote:
>> On Wed, 10 Mar 2021, Greg KH wrote:
>>
>>> On Tue, Mar 09, 2021 at 06:14:37PM +0000, Lee Jones wrote:
>>>> On Tue, 09 Mar 2021, Greg KH wrote:
>>>>
>>>>> On Tue, Mar 09, 2021 at 11:06:05AM +0800, Zheng Yejian wrote:
>>>>>> From: Thomas Gleixner <[email protected]>
>>>>>>
>>>>>> The handle_exit_race() function is defined in commit 9c3f39860367
>>>>>> ("futex: Cure exit race"), which never returns -EBUSY. This results
>>>>>> in a small piece of dead code in the attach_to_pi_owner() function:
>>>>>>
>>>>>> int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */
>>>>>> ...
>>>>>> if (ret == -EBUSY)
>>>>>> *exiting = p; /* dead code */
>>>>>>
>>>>>> The return value -EBUSY is added to handle_exit_race() in upstream
>>>>>> commit ac31c7ff8624409 ("futex: Provide distinct return value when
>>>>>> owner is exiting"). This commit was incorporated into v4.9.255, before
>>>>>> the function handle_exit_race() was introduced, without modifying
>>>>>> handle_exit_race().
>>>>>>
>>>>>> To fix the dead code, extract the handle_exit_race() change from
>>>>>> commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
>>>>>> is exiting") and re-apply it.
>>>>>>
>>>>>> Lee writes:
>>>>>>
>>>>>> This commit takes the remaining functional snippet of:
>>>>>>
>>>>>> ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
>>>>>>
>>>>>> ... and is the correct fix for this issue.
>>>>>>
>>>>>> Fixes: 9c3f39860367 ("futex: Cure exit race")
>>>>>> Cc: [email protected] # v4.9.258
>>>>>> Signed-off-by: Xiaoming Ni <[email protected]>
>>>>>> Reviewed-by: Lee Jones <[email protected]>
>>>>>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>>>>>> Signed-off-by: Zheng Yejian <[email protected]>
>>>>>> ---
>>>>>> kernel/futex.c | 6 +++---
>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> Same here, what is the upstream git id?
>>>>
>>>> It doesn't have one as such - it's a part-patch:
>>>>
>>>>>> This commit takes the remaining functional snippet of:
>>>>>>
>>>>>> ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")
>>>
>>> That wasn't obvious :(
>>
>> This was also my thinking, which is why I replied to the original
>> patch in an attempt to clarify what I thought was happening.
>>
>>> Is this a backport of another patch in the stable tree somewhere?
>>
>> Yes, it looks like it.
>>
>> The full patch was back-ported to v4.14 as:
>>
>> e6e00df182908f34360c3c9f2d13cc719362e9c0
>
> Ok, Zheng, can you put this information in the patch and resend the
> whole series?
>

Sure, I'll send a "v2" patchset soon.
Thanks for your suggestions,

Zheng Yejian
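
For readers following the thread, the dead branch discussed above can be
modelled in plain user-space C. This is only a sketch, not the kernel code:
struct task, exit_race_before_fix(), exit_race_after_fix() and
caller_keeps_task() are hypothetical names, and it assumes the only relevant
difference is which errno is reported for an owner that has not yet reached
FUTEX_STATE_DEAD.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's futex exit states. */
enum futex_state { FUTEX_STATE_OK, FUTEX_STATE_EXITING, FUTEX_STATE_DEAD };

/* Hypothetical, trimmed-down task; the kernel uses struct task_struct. */
struct task {
	enum futex_state futex_state;
};

/*
 * Model of the helper before the fix: an owner that is still exiting is
 * reported as -EAGAIN, so -EBUSY is never produced.
 */
static int exit_race_before_fix(const struct task *tsk)
{
	if (tsk != NULL && tsk->futex_state != FUTEX_STATE_DEAD)
		return -EAGAIN;
	return -ESRCH;
}

/* Model of the helper after the fix: the same case now yields -EBUSY. */
static int exit_race_after_fix(const struct task *tsk)
{
	if (tsk != NULL && tsk->futex_state != FUTEX_STATE_DEAD)
		return -EBUSY;
	return -ESRCH;
}

/*
 * Caller-side check as quoted in the patch description: the task pointer
 * is only kept when ret == -EBUSY. Returns 1 if the task would be kept.
 */
static int caller_keeps_task(int ret)
{
	return ret == -EBUSY;
}
```

With the "before" model, caller_keeps_task() can never return 1, which is
the dead code the patch description refers to. With the "after" model, the
branch becomes reachable for an owner that is still exiting.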

2021-03-11 03:50:24

by Zheng Yejian

Subject: Re: [PATCH 4.4 0/3] Backport patch series to update Futex from 4.9



On 2021/3/9 18:41, Greg KH wrote:
> On Tue, Mar 09, 2021 at 11:06:02AM +0800, Zheng Yejian wrote:
>> Lee sent a patchset to update Futex for 4.9, see https://www.spinics.net/lists/stable/msg443081.html.
>> Xiaoming then sent a follow-up patch for it, see https://lore.kernel.org/lkml/20210225093120.GD641347@dell/.
>>
>> These patchsets may also resolve the following issues in 4.4.260, which have been reported in 4.9,
>> see https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/?h=linux-4.4.y&id=319f66f08de1083c1fe271261665c209009dd65a
>> > /*
>> > * The task is on the way out. When the futex state is
>> > * FUTEX_STATE_DEAD, we know that the task has finished
>> > * the cleanup:
>> > */
>> > int ret = (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
>>
>> This should probably be:
>> int ret = (p->futex_state == FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
>>
>> > raw_spin_unlock_irq(&p->pi_lock);
>> > /*
>> > * If the owner task is between FUTEX_STATE_EXITING and
>> > * FUTEX_STATE_DEAD then store the task pointer and keep
>> > * the reference on the task struct. The calling code will
>> > * drop all locks, wait for the task to reach
>> > * FUTEX_STATE_DEAD and then drop the refcount. This is
>> > * required to prevent a live lock when the current task
>> > * preempted the exiting task between the two states.
>> > */
>> > if (ret == -EBUSY)
>>
>> And here, the variable "ret" may only be "-ESRCH" or "-EAGAIN", but not "-EBUSY".
>>
>> > *exiting = p;
>> > else
>> > put_task_struct(p);
>>
>> Since 074e7d515783 ("futex: Ensure the correct return value from futex_lock_pi()") has
>> been merged in 4.4.260, I am sending the remaining 3 patches.
>
> There already are 2 futex patches in the 4.4.y stable queue, do those
> not resolve these issues for you?

I think the 2 futex patches in the 4.4 stable queue fix other issues:
futex-fix-irq-self-deadlock-and-satisfy-assertion.patch
futex-fix-spin_lock-spin_unlock_irq-imbalance.patch
But I am not sure whether there are any lock conflicts between those 2
patches and these 3 patches.

>
> If not, please resend this series with the needed git commit ids added to
> them.

I have added that information and sent a "v2" patchset.
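
As a side note on the "=" versus "==" line quoted from the cover letter, here
is a small user-space sketch of why the assignment form is a problem. It is
not the kernel code: struct task and the two helper names are hypothetical,
and it assumes FUTEX_STATE_DEAD has a non-zero value (it is the third
enumerator in this sketch).

```c
#include <errno.h>

/* Hypothetical stand-ins; FUTEX_STATE_DEAD is assumed to be non-zero. */
enum futex_state { FUTEX_STATE_OK, FUTEX_STATE_EXITING, FUTEX_STATE_DEAD };

/* Hypothetical, trimmed-down task; the kernel uses struct task_struct. */
struct task {
	enum futex_state futex_state;
};

/*
 * Form with '=': the state is overwritten with FUTEX_STATE_DEAD and the
 * condition is that non-zero value, so the result is always -ESRCH.
 */
static int ret_with_assignment(struct task *p)
{
	return (p->futex_state = FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
}

/*
 * Form with '==': the state is only read, and -EAGAIN is returned while
 * the task has not yet reached FUTEX_STATE_DEAD.
 */
static int ret_with_comparison(const struct task *p)
{
	return (p->futex_state == FUTEX_STATE_DEAD) ? -ESRCH : -EAGAIN;
}
```

In this model the assignment form both hides the -EAGAIN path and silently
changes the task's state, while the comparison form does neither.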