2003-08-26 03:19:44

by Rusty Russell

[permalink] [raw]
Subject: [PATCH 1/2] Futex minor fixes

Hi Andrew, Ingo,

This was posted before, but dropped.

Name: Minor futex comment tweaks and cleanups
Author: Rusty Russell
Status: Tested on 2.6.0-test4-bk2

D: Changes:
D:
D: (1) don't return 0 from futex_wait if we are somehow
D: spuriously woken up, return -EINTR on any such case,
D:
D: (2) remove bogus comment about address no longer being in this
D: address space: we hold the mm lock, and __pin_page succeeded, so it
D: can't be true,
D:
D: (3) remove bogus comment about "get_user might fault and schedule",
D:
D: (4) remove list_empty check: we still hold the lock, so it can
D: never happen.
D:
D: (5) Single error exit path, and move __queue_me to the end (order
D: doesn't matter since we're inside the futex lock).

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .6150-linux-2.5.70-bk1/kernel/futex.c .6150-linux-2.5.70-bk1.updated/kernel/futex.c
--- .6150-linux-2.5.70-bk1/kernel/futex.c 2003-05-27 15:02:23.000000000 +1000
+++ .6150-linux-2.5.70-bk1.updated/kernel/futex.c 2003-05-28 16:42:45.000000000 +1000
@@ -312,7 +312,7 @@ static inline int futex_wait(unsigned lo
unsigned long time)
{
DECLARE_WAITQUEUE(wait, current);
- int ret = 0, curval;
+ int ret, curval;
struct page *page;
struct futex_q q;

@@ -322,55 +322,50 @@ static inline int futex_wait(unsigned lo

page = __pin_page(uaddr - offset);
if (!page) {
- unlock_futex_mm();
- return -EFAULT;
+ ret = -EFAULT;
+ goto unlock;
}
- __queue_me(&q, page, uaddr, offset, -1, NULL);
-
/*
- * Page is pinned, but may no longer be in this address space.
+ * Page is pinned, but may be a kernel address.
* It cannot schedule, so we access it with the spinlock held.
*/
if (get_user(curval, (int *)uaddr) != 0) {
- unlock_futex_mm();
ret = -EFAULT;
- goto out;
+ goto unlock;
}
+
if (curval != val) {
- unlock_futex_mm();
ret = -EWOULDBLOCK;
- goto out;
+ goto unlock;
}
- /*
- * The get_user() above might fault and schedule so we
- * cannot just set TASK_INTERRUPTIBLE state when queueing
- * ourselves into the futex hash. This code thus has to
- * rely on the futex_wake() code doing a wakeup after removing
- * the waiter from the list.
- */
+
+ __queue_me(&q, page, uaddr, offset, -1, NULL);
add_wait_queue(&q.waiters, &wait);
set_current_state(TASK_INTERRUPTIBLE);
- if (!list_empty(&q.list)) {
- unlock_futex_mm();
- time = schedule_timeout(time);
- }
+ unlock_futex_mm();
+
+ time = schedule_timeout(time);
+
set_current_state(TASK_RUNNING);
/*
* NOTE: we don't remove ourselves from the waitqueue because
* we are the only user of it.
*/
- if (time == 0) {
- ret = -ETIMEDOUT;
- goto out;
- }
- if (signal_pending(current))
- ret = -EINTR;
-out:
- /* Were we woken up anyway? */
+
+ /* Were we woken up (and removed from queue)? Always return
+ * success when this happens. */
if (!unqueue_me(&q))
ret = 0;
+ else if (time == 0)
+ ret = -ETIMEDOUT;
+ else
+ ret = -EINTR;
+
put_page(q.page);
+ return ret;

+unlock:
+ unlock_futex_mm();
return ret;
}

--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.


2003-08-26 09:26:40

by Alex Riesen

[permalink] [raw]
Subject: Re: [PATCH 1/2] Futex minor fixes

Rusty Russell, Tue, Aug 26, 2003 05:05:56 +0200:
> Hi Andrew, Ingo,
>
> This was posted before, but dropped.
>
> Name: Minor futex comment tweaks and cleanups
> Author: Rusty Russell
> Status: Tested on 2.6.0-test4-bk2
>
> D: Changes:
> D:
> D: (1) don't return 0 from futex_wait if we are somehow
> D: spuriously woken up, return -EINTR on any such case,

But the code below does not mean there actually was a signal,
unless I'm missing something.

> - if (time == 0) {
> - ret = -ETIMEDOUT;
> - goto out;
> - }
> - if (signal_pending(current))
> - ret = -EINTR;
> -out:
> - /* Were we woken up anyway? */
> +
> + /* Were we woken up (and removed from queue)? Always return
> + * success when this happens. */
> if (!unqueue_me(&q))
> ret = 0;
> + else if (time == 0)
> + ret = -ETIMEDOUT;
> + else
> + ret = -EINTR;
> +

Here. EINTR is often (if not always) assumed to be caused by a signal.
And someone may rightfully depend on it being that way.

-alex

2003-08-27 05:18:58

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/2] Futex minor fixes

In message <[email protected]> you write:
> Rusty Russell, Tue, Aug 26, 2003 05:05:56 +0200:
> > Hi Andrew, Ingo,
> >
> > This was posted before, but dropped.
> >
> > Name: Minor futex comment tweaks and cleanups
> > Author: Rusty Russell
> > Status: Tested on 2.6.0-test4-bk2
> >
> > D: Changes:
> > D:
> > D: (1) don't return 0 from futex_wait if we are somehow
> > D: spuriously woken up, return -EINTR on any such case,
>
> Here. EINTR is often (if not always) assumed to be caused by a signal.
> And someone may rightfully depend on it being that way.

Yes. Changed code to loop in this case. I don't know of anyone who
actually randomly wakes processes, but just in case. Returning "0"
always means as "you were woken up by someone using FUTEX_WAKE", and
some callers *need to know*.

How's this?

Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: Minor futex comment tweaks and cleanups
Author: Rusty Russell
Status: Tested on 2.6.0-test4-bk2

D: Changes:
D:
D: (1) don't return 0 from futex_wait if we are somehow
D: spuriously woken up, loop in that case.
D:
D: (2) remove bogus comment about address no longer being in this
D: address space: we hold the mm lock, and __pin_page succeeded, so it
D: can't be true,
D:
D: (3) remove bogus comment about "get_user might fault and schedule",
D:
D: (4) clarify comment about hashing: we hash address of struct page,
D: not page itself,
D:
D: (4) remove list_empty check: we still hold the lock, so it can
D: never happen, and
D:
D: (5) single error exit path, and move __queue_me to the end (order
D: doesn't matter since we're inside the futex lock).

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9416-linux-2.6.0-test4-bk2/kernel/futex.c .9416-linux-2.6.0-test4-bk2.updated/kernel/futex.c
--- .9416-linux-2.6.0-test4-bk2/kernel/futex.c 2003-05-27 15:02:23.000000000 +1000
+++ .9416-linux-2.6.0-test4-bk2.updated/kernel/futex.c 2003-08-27 12:34:53.000000000 +1000
@@ -84,9 +84,7 @@ static inline void unlock_futex_mm(void)
spin_unlock(&current->mm->page_table_lock);
}

-/*
- * The physical page is shared, so we can hash on its address:
- */
+/* The struct page is shared, so we can hash on its address. */
static inline struct list_head *hash_futex(struct page *page, int offset)
{
return &futex_queues[hash_long((unsigned long)page + offset,
@@ -311,67 +309,68 @@ static inline int futex_wait(unsigned lo
int val,
unsigned long time)
{
- DECLARE_WAITQUEUE(wait, current);
- int ret = 0, curval;
+ wait_queue_t wait;
+ int ret, curval;
struct page *page;
struct futex_q q;

+again:
init_waitqueue_head(&q.waiters);
+ init_waitqueue_entry(wait, current);

lock_futex_mm();

page = __pin_page(uaddr - offset);
if (!page) {
- unlock_futex_mm();
- return -EFAULT;
+ ret = -EFAULT;
+ goto unlock;
}
- __queue_me(&q, page, uaddr, offset, -1, NULL);

/*
- * Page is pinned, but may no longer be in this address space.
+ * Page is pinned, but may be a kernel address.
* It cannot schedule, so we access it with the spinlock held.
*/
if (get_user(curval, (int *)uaddr) != 0) {
- unlock_futex_mm();
ret = -EFAULT;
- goto out;
+ goto unlock;
}
+
if (curval != val) {
- unlock_futex_mm();
ret = -EWOULDBLOCK;
- goto out;
+ goto unlock;
}
- /*
- * The get_user() above might fault and schedule so we
- * cannot just set TASK_INTERRUPTIBLE state when queueing
- * ourselves into the futex hash. This code thus has to
- * rely on the futex_wake() code doing a wakeup after removing
- * the waiter from the list.
- */
+
+ __queue_me(&q, page, uaddr, offset, -1, NULL);
add_wait_queue(&q.waiters, &wait);
set_current_state(TASK_INTERRUPTIBLE);
- if (!list_empty(&q.list)) {
- unlock_futex_mm();
- time = schedule_timeout(time);
- }
+ unlock_futex_mm();
+
+ time = schedule_timeout(time);
+
set_current_state(TASK_RUNNING);
/*
* NOTE: we don't remove ourselves from the waitqueue because
* we are the only user of it.
*/
- if (time == 0) {
- ret = -ETIMEDOUT;
- goto out;
- }
- if (signal_pending(current))
- ret = -EINTR;
-out:
- /* Were we woken up anyway? */
+ put_page(q.page);
+
+ /* Were we woken up (and removed from queue)? Always return
+ * success when this happens. */
if (!unqueue_me(&q))
ret = 0;
- put_page(q.page);
+ else if (time == 0)
+ ret = -ETIMEDOUT;
+ else if (signal_pending(current))
+ ret = -EINTR;
+ else
+ /* Spurious wakeup somehow. Loop. */
+ goto again;

return ret;
+
+unlock:
+ unlock_futex_mm();
+ return ret;
}

static int futex_close(struct inode *inode, struct file *filp)

2003-08-27 06:50:34

by Alex Riesen

[permalink] [raw]
Subject: Re: [PATCH 1/2] Futex minor fixes

Rusty Russell, Wed, Aug 27, 2003 04:40:14 +0200:
> In message <[email protected]> you write:
> > Rusty Russell, Tue, Aug 26, 2003 05:05:56 +0200:
> > > Hi Andrew, Ingo,
> > >
> > > This was posted before, but dropped.
> > >
> > > Name: Minor futex comment tweaks and cleanups
> > > Author: Rusty Russell
> > > Status: Tested on 2.6.0-test4-bk2
> > >
> > > D: Changes:
> > > D:
> > > D: (1) don't return 0 from futex_wait if we are somehow
> > > D: spuriously woken up, return -EINTR on any such case,
> >
> > Here. EINTR is often (if not always) assumed to be caused by a signal.
> > And someone may rightfully depend on it being that way.
>
> Yes. Changed code to loop in this case. I don't know of anyone who
> actually randomly wakes processes, but just in case. Returning "0"
> always means as "you were woken up by someone using FUTEX_WAKE", and
> some callers *need to know*.
>
> How's this?

Now it's consistent with what EINTR conventionally mean :)

> Rusty.
> --
...
> +
> + /* Were we woken up (and removed from queue)? Always return
> + * success when this happens. */
> if (!unqueue_me(&q))
> ret = 0;
> - put_page(q.page);
> + else if (time == 0)
> + ret = -ETIMEDOUT;
> + else if (signal_pending(current))
> + ret = -EINTR;
> + else
> + /* Spurious wakeup somehow. Loop. */
> + goto again;
>
> return ret;

Btw, what could that spurious wakeups be?
It set to loop unconditionally, so if the source of wakeup insists on
wakeing up the code could result in endless loop, right?

-alex

2003-08-28 08:43:28

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/2] Futex minor fixes

In message <[email protected]> you write:
> Btw, what could that spurious wakeups be?

I don't know, but every other primitive in the kernel handles it, so
we should too.

> It set to loop unconditionally, so if the source of wakeup insists on
> wakeing up the code could result in endless loop, right?

Yes. If someone wakes you up, you will wake up. If someone wakes you
up an infinite number of times, you will wake up an infinite number of
times. cf. waitqueues.

Hope that clarifies,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-08-28 08:43:25

by Alex Riesen

[permalink] [raw]
Subject: Re: [PATCH 1/2] Futex minor fixes

Rusty Russell, Thu, Aug 28, 2003 02:49:10 +0200:
> In message <[email protected]> you write:
> > It set to loop unconditionally, so if the source of wakeup insists on
> > wakeing up the code could result in endless loop, right?
>
> Yes. If someone wakes you up, you will wake up. If someone wakes you
> up an infinite number of times, you will wake up an infinite number of
> times. cf. waitqueues.
>

Right. If only we could know what that wakers could be...