2003-09-12 18:18:36

by Felipe W Damasio

[permalink] [raw]
Subject: [PATCH] kernel/futex.c: Uneeded memory barrier

Hi Rusty,

Patch against 2.6-test5.

Kills an unneeded set_current_state after schedule_timeout, since it
already guarantees that the task will be TASK_RUNNING.

Also, when setting the state to TASK_RUNNING, isn't that memory
barrier unneeded? Patch removes this memory barrier too.

If it looks good, please consider applying.

Thanks.

Felipe


2003-09-12 18:32:58

by Felipe W Damasio

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

--- linux-2.6.0-test5/kernel/futex.c.orig 2003-09-12 15:14:42.000000000 -0300
+++ linux-2.6.0-test5/kernel/futex.c 2003-09-12 15:14:56.000000000 -0300
@@ -381,13 +381,12 @@
* We were woken already.
*/
spin_unlock(&futex_lock);
- set_current_state(TASK_RUNNING);
+ __set_current_state(TASK_RUNNING);
return 0;
}

spin_unlock(&futex_lock);
time = schedule_timeout(time);
- set_current_state(TASK_RUNNING);

/*
* NOTE: we don't remove ourselves from the waitqueue because


Attachments:
futex-state.patch (498.00 B)

2003-09-13 19:02:53

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

The patch looks fine to me.

Felipe W Damasio wrote:
> Kills an unneeded set_current_state after schedule_timeout, since it
> already guarantees that the task will be TASK_RUNNING.
>
> Also, when setting the state to TASK_RUNNING, isn't that memory
> barrier unneeded? Patch removes this memory barrier too.

If _all_ instances in the kernel of

set_current_state(TASK_RUNNING)

can be validly turned into

__set_current_state(TASK_RUNNING)

it would be good to make the barrier in set_current_state() itself
conditional on the state being state.

-- Jamie

2003-09-14 11:40:56

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

In message <[email protected]> you write:
> Kills an unneeded set_current_state after schedule_timeout, since it
> already guarantees that the task will be TASK_RUNNING.

I thought we already got rid of that once: damn thing won't die...

> Also, when setting the state to TASK_RUNNING, isn't that memory
> barrier unneeded? Patch removes this memory barrier too.

I personally *HATE* the set_task_state()/__set_task_state() macros.
Simple assignments shouldn't be hidden behind macros, unless there's
something really subtle involved.

Personally, when there's a normal and a __ version of a function, I
use the normal version unless there's a real (performance or
correctness) reason. (ie. I prefer the "think less" version 8).

I don't mind either way. I'll roll it in the next update.

Cheers,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-09-14 14:08:50

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

Rusty Russell wrote:
> I personally *HATE* the set_task_state()/__set_task_state() macros.
> Simple assignments shouldn't be hidden behind macros, unless there's
> something really subtle involved.

There _is_ something subtle involved. Back in ye olde days, folk
would set current->state directly. Andrea Arcangeli noticed a subtle
race condition, where a memory barrier is needed after assigning the
state and before testing whatever condition the task is waiting on,
otherwise the state could be written after the condition was tested.

That's when all assignments to current->state were changed to
set_task_state().

Sprinkling special kinds of memory barrier into all the drivers is not
the kind of thing driver writers get right. Also if you look at the
details, the barrier is quite an unusual kind on i386, using the
"xchg" instruction, and it only needs to be a CPU barrier on SMP
systems.

> Personally, when there's a normal and a __ version of a function, I
> use the normal version unless there's a real (performance or
> correctness) reason. (ie. I prefer the "think less" version 8).

It's always correct to use the "normal" version of set_task_state().

The places where __set_task_state() is used are performance tweaks,
because the memory barrier is not for free. That's why you see it
throughout "core" kernel code, where the authors devote more energy to
thinking about the SMP subtleties.

-- Jamie

2003-09-15 05:43:02

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

In message <[email protected]> you write:
> If _all_ instances in the kernel of
>
> set_current_state(TASK_RUNNING)
>
> can be validly turned into
>
> __set_current_state(TASK_RUNNING)
>
> it would be good to make the barrier in set_current_state() itself
> conditional on the state being state.

Or eliminate the macro altogether, and create set_current_interruptible()
and set_current_uninterruptible() which have the barrier, and let the normal
users do the assignment the way God intended 8)

In practice, those who care are using __, the rest aren't critical.

Cheers,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-09-15 05:43:05

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

In message <[email protected]> you write:
> Rusty Russell wrote:
> > I personally *HATE* the set_task_state()/__set_task_state() macros.
> > Simple assignments shouldn't be hidden behind macros, unless there's
> > something really subtle involved.
>
> There _is_ something subtle involved. Back in ye olde days, folk

This is what I hate about EMail. You had two choices here: either I
don't understand you, or you don't understand me. You chose wrong,
and wasted a lot of time on an (excellent, BTW) explanation.

I wasn't clear: __set_task_state() and __set_current_state() should
not exist, they are assignments. set_task_state() should not exist,
since it's only used for current anyway. set_current_state should be
split into set_current_interruptible() and
set_current_uninterruptible(), except...

> Sprinkling special kinds of memory barrier into all the drivers is not
> the kind of thing driver writers get right. Also if you look at the

....hiding the subtlety in wrapper functions is the wrong approach. We
have excellent wait_event, wait_event_interruptible and
wait_event_interruptible_timeout macros in wait.h which these drivers
should be using, which would make them simpler, less buggy and
smaller.

Hope that clarifies?
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-09-15 09:23:16

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

On Mon, Sep 15, 2003 at 01:41:30PM +1000, Rusty Russell wrote:
> ....hiding the subtlety in wrapper functions is the wrong approach. We
> have excellent wait_event, wait_event_interruptible and
> wait_event_interruptible_timeout macros in wait.h which these drivers
> should be using, which would make them simpler, less buggy and
> smaller.

"smaller and simpler" hmm. And _more_ buggy. Let's take this case:

add_wait_queue(&wq, &wait);
for (;;) {
set_current_state(TASK_INTERRUPTIBLE);
if (condition)
break;
if (file->f_flags & O_NONBLOCK) {
ret = -EAGAIN;
break;
}
if (signal_pending(current)) {
ret = -ERESTARTSYS;
break;
}
schedule();
}
__set_current_state(TASK_RUNNING);
remove_wait_queue(&wq, &wait);

There are cases like the above which make the wait_event*() macros
inappropriate:

- needing to test for extra conditions to set "ret" accordingly (eg,
non-blocking IO)
- needing to atomically dequeue some data

I've yet to see anyone using wait_event*() in these circumstances -
they're great for your simple "did something happen" case which the
majority of drivers use, but there are use cases where wait_event*()
is not appropriate.

--
Russell King ([email protected]) http://www.arm.linux.org.uk/personal/
Linux kernel maintainer of:
2.6 ARM Linux - http://www.arm.linux.org.uk/
2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2003-09-15 14:35:49

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

In message <[email protected]> you write:
> > Kills an unneeded set_current_state after schedule_timeout, since it
> > already guarantees that the task will be TASK_RUNNING.

In fact, furthur cleanups are possible.

Thanks!
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: Clean up futex task state setting
Author: Rusty Russell, Felipe W Damasio <[email protected]>
Depends: Misc/futex-jamie-plus1.patch.gz
Status: Trivial

D: Felipe points out that set_task_state is overkill. In fact,
D: futex_lock protects us so we can set if after the queued test,
D: simplifying the whole function slightly

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .7327-2.6.0-test5-bk3-futex-task_state.pre/kernel/futex.c .7327-2.6.0-test5-bk3-futex-task_state/kernel/futex.c
--- .7327-2.6.0-test5-bk3-futex-task_state.pre/kernel/futex.c 2003-09-15 19:37:14.000000000 +1000
+++ .7327-2.6.0-test5-bk3-futex-task_state/kernel/futex.c 2003-09-15 19:37:14.000000000 +1000
@@ -374,20 +374,16 @@ static int futex_wait(unsigned long uadd
*/
add_wait_queue(&q.waiters, &wait);
spin_lock(&futex_lock);
- set_current_state(TASK_INTERRUPTIBLE);

if (unlikely(list_empty(&q.list))) {
- /*
- * We were woken already.
- */
+ /* We were woken already. */
spin_unlock(&futex_lock);
- set_current_state(TASK_RUNNING);
return 0;
}

+ __set_current_state(TASK_INTERRUPTIBLE);
spin_unlock(&futex_lock);
time = schedule_timeout(time);
- set_current_state(TASK_RUNNING);

/*
* NOTE: we don't remove ourselves from the waitqueue because

2003-09-16 01:04:32

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

In message <[email protected]> you write:
> On Mon, Sep 15, 2003 at 01:41:30PM +1000, Rusty Russell wrote:
> > ....hiding the subtlety in wrapper functions is the wrong approach. We
> > have excellent wait_event, wait_event_interruptible and
> > wait_event_interruptible_timeout macros in wait.h which these drivers
> > should be using, which would make them simpler, less buggy and
> > smaller.
>
> "smaller and simpler" hmm. And _more_ buggy. Let's take this case:
>
> add_wait_queue(&wq, &wait);
> for (;;) {
> set_current_state(TASK_INTERRUPTIBLE);
> if (condition)
> break;
> if (file->f_flags & O_NONBLOCK) {
> ret = -EAGAIN;
> break;
> }
> if (signal_pending(current)) {
> ret = -ERESTARTSYS;
> break;
> }
> schedule();
> }
> __set_current_state(TASK_RUNNING);
> remove_wait_queue(&wq, &wait);
>
> There are cases like the above which make the wait_event*() macros
> inappropriate:

There are, sure, but I'm not sure this is really one:

ret = 0;
if (file->f_flags & O_NONBLOCK) {
if (!condition)
ret = -EAGAIN;
} else
wait_event_interruptible(&wq, condition, &ret);

AFAICT this is equivalent, and clearer?

> - needing to atomically dequeue some data

Definitely: the futex code is a good example of this. But the
presence of spinlocks actually makes the barrier issue vanish anyway.

> I've yet to see anyone using wait_event*() in these circumstances -
> they're great for your simple "did something happen" case which the
> majority of drivers use, but there are use cases where wait_event*()
> is not appropriate.

I don't mind someone doing something weird and needing to use lower
primitives, but I still feel the wait_event* family are badly
underused. Perhaps it's just lack of publicity...

Cheers!
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2003-09-16 12:05:22

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] kernel/futex.c: Uneeded memory barrier

Rusty Russell wrote:
> In message <[email protected]> you write:
> > > Kills an unneeded set_current_state after schedule_timeout, since it
> > > already guarantees that the task will be TASK_RUNNING.
>
> In fact, furthur cleanups are possible.

I agree, nice cleanups :)

-- Jamie