2015-02-17 04:46:43

by Davidlohr Bueso

[permalink] [raw]
Subject: [PATCH] futex: Robustify wake_futex()

Current code assumes that wake_futex() will never fail, thus
we are rather sloppy when incrementing the return value in wake
related calls, accounting for the newly woken task. Of course
this will never occur, thus not a problem. This bug is as real
as the need for the redundant pi checks in wake_futex().

These redundant checks are fine and past discussion indicates
that they will stay. However, it does introduce this mismatch,
thus it is better to robustify the function and avoid any
assumptions that could bite us in the arse the future.

Signed-off-by: Davidlohr Bueso <[email protected]>
---

This got my attention this while looking at Micheal's futex manpage
changes. The downside of this patch is that the return error from
wake_futex() is even more redundant now... however it still seems
like the right thing to do.

kernel/futex.c | 45 +++++++++++++++++++++++++++++++++------------
1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 2a5e383..a01843a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1091,13 +1091,16 @@ static void __unqueue_futex(struct futex_q *q)
/*
* The hash bucket lock must be held when this is called.
* Afterwards, the futex_q must not be accessed.
+ *
+ * Returns true only when the task was woken, otherwise false.
*/
-static void wake_futex(struct futex_q *q)
+static bool wake_futex(struct futex_q *q)
{
struct task_struct *p = q->task;

- if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
- return;
+ if (unlikely(WARN(q->pi_state || q->rt_waiter,
+ "refusing to wake PI futex\n")))
+ return false;

/*
* We set q->lock_ptr = NULL _before_ we wake up the task. If
@@ -1120,6 +1123,7 @@ static void wake_futex(struct futex_q *q)

wake_up_state(p, TASK_NORMAL);
put_task_struct(p);
+ return true;
}

static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this)
@@ -1244,7 +1248,10 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
if (!(this->bitset & bitset))
continue;

- wake_futex(this);
+ if (!wake_futex(this)) {
+ ret = -EINVAL;
+ break;
+ }
if (++ret >= nr_wake)
break;
}
@@ -1320,7 +1327,11 @@ retry_private:
ret = -EINVAL;
goto out_unlock;
}
- wake_futex(this);
+
+ if (!wake_futex(this)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
if (++ret >= nr_wake)
break;
}
@@ -1334,7 +1345,11 @@ retry_private:
ret = -EINVAL;
goto out_unlock;
}
- wake_futex(this);
+
+ if (!wake_futex(this)) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
if (++op_ret >= nr_wake2)
break;
}
@@ -1674,13 +1689,19 @@ retry_private:
}

/*
- * Wake nr_wake waiters. For requeue_pi, if we acquired the
- * lock, we already woke the top_waiter. If not, it will be
- * woken by futex_unlock_pi().
+ * For requeue_pi, if we acquired the lock, we already woke
+ * the top_waiter. If not, it will be woken by futex_unlock_pi.
+ *
+ * The regular (non-pi) case is much simpler: Wake the top
+ * waiter (next in line) and repeat.
*/
- if (++task_count <= nr_wake && !requeue_pi) {
- wake_futex(this);
- continue;
+ if (!requeue_pi) {
+ if (!wake_futex(this)) {
+ ret = -EINVAL;
+ break;
+ }
+ if (++task_count <= nr_wake)
+ continue;
}

/* Ensure we requeue to the expected futex for requeue_pi. */
--
2.1.4



2015-02-18 17:07:12

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] futex: Robustify wake_futex()


* Davidlohr Bueso <[email protected]> wrote:

> Current code assumes that wake_futex() will never fail,
> thus we are rather sloppy when incrementing the return
> value in wake related calls, accounting for the newly
> woken task. Of course this will never occur, thus not a
> problem. This bug is as real as the need for the
> redundant pi checks in wake_futex().
>
> These redundant checks are fine and past discussion
> indicates that they will stay. However, it does introduce
> this mismatch, thus it is better to robustify the
> function and avoid any assumptions that could bite us in
> the arse the future.

So can the current code crash or hang if the WARN()
triggers?

> kernel/futex.c | 45 +++++++++++++++++++++++++++++++++------------
> 1 file changed, 33 insertions(+), 12 deletions(-)

My counter argument is that we add quite a bit of pointless
complexity:

> - if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
> - return;
> + if (unlikely(WARN(q->pi_state || q->rt_waiter,
> + "refusing to wake PI futex\n")))
> + return false;

> - wake_futex(this);
> + if (!wake_futex(this)) {
> + ret = -EINVAL;
> + break;
> + }

+ [ 4 more usage sites ]

while the WARN() already told the user that the kernel is
broken.

So what's the point? Does it avoid any real badness, state
corruption, crash, hang, etc.?

Thanks,

Ingo

2015-02-19 01:28:34

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH] futex: Robustify wake_futex()

On Wed, 2015-02-18 at 18:07 +0100, Ingo Molnar wrote:
> while the WARN() already told the user that the kernel is
> broken.
>
> So what's the point? Does it avoid any real badness, state
> corruption, crash, hang, etc.?

Right, I mentioned the warn (redundant pi checks) being completely
redundant - thus this is not a real issue... feel free to toss, this was
just a wtf when reading the code ;)

2015-02-19 09:56:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] futex: Robustify wake_futex()


* Davidlohr Bueso <[email protected]> wrote:

> @@ -1674,13 +1689,19 @@ retry_private:
> }
>
> /*
> - * Wake nr_wake waiters. For requeue_pi, if we acquired the
> - * lock, we already woke the top_waiter. If not, it will be
> - * woken by futex_unlock_pi().
> + * For requeue_pi, if we acquired the lock, we already woke
> + * the top_waiter. If not, it will be woken by futex_unlock_pi.
> + *
> + * The regular (non-pi) case is much simpler: Wake the top
> + * waiter (next in line) and repeat.
> */
> - if (++task_count <= nr_wake && !requeue_pi) {
> - wake_futex(this);
> - continue;
> + if (!requeue_pi) {
> + if (!wake_futex(this)) {
> + ret = -EINVAL;
> + break;
> + }
> + if (++task_count <= nr_wake)
> + continue;
> }
>

Hm, so at a first glance this change appears to go beyond
the scope of adding a return value to wake_futex()?

For example before the change in the !requeue_pi case we'd
only call wake_futex() if ++task_count <= nr_wake, after
the change we always call it.

What's the intention here?

Thanks,

Ingo