2004-09-15 23:00:13

by Albert Cahalan

[permalink] [raw]
Subject: get_current is __pure__, maybe __const__ even

Andi Kleen writes:

> Please CSE "current" manually. It generates
> much better code on some architectures
> because the compiler cannot do it for you.

This looks fixable.

At the very least, __attribute__((__pure__))
will apply to your get_current function.

I think __attribute__((__const__)) will too,
even though it's technically against the
documentation. While you do indeed read from
memory, you don't read from memory that could
be seen as changing. Nothing done during the
lifetime of a task will change "current" as
viewed from within that task.



2004-09-15 23:17:20

by Jakub Jelinek

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, Sep 15, 2004 at 06:50:00PM -0400, Albert Cahalan wrote:
> Andi Kleen writes:
>
> > Please CSE "current" manually. It generates
> > much better code on some architectures
> > because the compiler cannot do it for you.
>
> This looks fixable.
>
> At the very least, __attribute__((__pure__))
> will apply to your get_current function.
>
> I think __attribute__((__const__)) will too,
> even though it's technically against the
> documentation. While you do indeed read from
> memory, you don't read from memory that could
> be seen as changing. Nothing done during the
> lifetime of a task will change "current" as
> viewed from within that task.

current will certainly change in schedule (),
so either you'd need to avoid using current
in schedule() and use some other accessor
for the same without such attribute, or
#ifdef the attribute out when compiling sched.c.

Jakub

2004-09-15 23:37:41

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, Sep 15, 2004 at 06:50:00PM -0400, Albert Cahalan wrote:
>> This looks fixable.
>> At the very least, __attribute__((__pure__))
>> will apply to your get_current function.
>> I think __attribute__((__const__)) will too,
>> even though it's technically against the
>> documentation. While you do indeed read from
>> memory, you don't read from memory that could
>> be seen as changing. Nothing done during the
>> lifetime of a task will change "current" as
>> viewed from within that task.

On Wed, Sep 15, 2004 at 07:15:18PM -0400, Jakub Jelinek wrote:
> current will certainly change in schedule (),
> so either you'd need to avoid using current
> in schedule() and use some other accessor
> for the same without such attribute, or
> #ifdef the attribute out when compiling sched.c.

Why would barrier() not suffice?


-- wli

2004-09-15 23:33:17

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

Andi Kleen writes:
>> Please CSE "current" manually. It generates
>> much better code on some architectures
>> because the compiler cannot do it for you.

On Wed, Sep 15, 2004 at 06:50:00PM -0400, Albert Cahalan wrote:
> This looks fixable.
> At the very least, __attribute__((__pure__))
> will apply to your get_current function.
> I think __attribute__((__const__)) will too,
> even though it's technically against the
> documentation. While you do indeed read from
> memory, you don't read from memory that could
> be seen as changing. Nothing done during the
> lifetime of a task will change "current" as
> viewed from within that task.

The fix needs to be made to gcc, not the kernel.


-- wli

2004-09-16 02:15:27

by Albert Cahalan

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, 2004-09-15 at 19:29, William Lee Irwin III wrote:
> On Wed, Sep 15, 2004 at 06:50:00PM -0400, Albert Cahalan wrote:
> >> This looks fixable.
> >> At the very least, __attribute__((__pure__))
> >> will apply to your get_current function.
> >> I think __attribute__((__const__)) will too,
> >> even though it's technically against the
> >> documentation. While you do indeed read from
> >> memory, you don't read from memory that could
> >> be seen as changing. Nothing done during the
> >> lifetime of a task will change "current" as
> >> viewed from within that task.
>
> On Wed, Sep 15, 2004 at 07:15:18PM -0400, Jakub Jelinek wrote:
> > current will certainly change in schedule (),

Not really!

>From the viewpoint of a single task looking
at current, it does not change. The task is
paused, and may well start up again on a
different CPU, but current doesn't change.

Any state gcc might keep would be stored on
the kernel stack or in a register, which will
be preserved because tasks don't share these.

AFAIK, gcc generates thread-safe code. It won't
convert code to something like this:

int foo(int bar){
static task_struct *__L131241 = get_current();
// blah, blah...
}

> > so either you'd need to avoid using current
> > in schedule() and use some other accessor
> > for the same without such attribute, or
> > #ifdef the attribute out when compiling sched.c.
>
> Why would barrier() not suffice?

I don't think even barrier() is needed.
Suppose gcc were to cache the value of
current over a schedule. Who cares? It'll
be the same after schedule() as it was
before.


2004-09-16 02:36:38

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, Sep 15, 2004 at 07:15:18PM -0400, Jakub Jelinek wrote:
>>> current will certainly change in schedule (),

On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> Not really!

Yes it does. The interior of schedule() is C and must be compiled also.


At some point in the past, I wrote:
>> Why would barrier() not suffice?

On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> I don't think even barrier() is needed.
> Suppose gcc were to cache the value of
> current over a schedule. Who cares? It'll
> be the same after schedule() as it was
> before.

Not over a call to schedule(). In the midst of schedule().


-- wli

2004-09-16 02:48:04

by Nick Piggin

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

William Lee Irwin III wrote:

> On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
>
>>I don't think even barrier() is needed.
>>Suppose gcc were to cache the value of
>>current over a schedule. Who cares? It'll
>>be the same after schedule() as it was
>>before.
>
>
> Not over a call to schedule(). In the midst of schedule().
>

In a way, it is. Because after context_switch, the stack
and registers have been replaced by the new task. So if
current was cached somewhere before that task had scheduled
off, then it still would be correct now that it is scheduled
back on.

At points *within* context_switch, current won't be right,
but AFAIKS current is never used in there.

2004-09-16 03:23:12

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
>>> I don't think even barrier() is needed. Suppose gcc were to cache
>>> the value of current over a schedule. Who cares? It'll be the same
>>> after schedule() as it was before.

William Lee Irwin III wrote:
>> Not over a call to schedule(). In the midst of schedule().

On Thu, Sep 16, 2004 at 12:47:50PM +1000, Nick Piggin wrote:
> In a way, it is. Because after context_switch, the stack
> and registers have been replaced by the new task. So if
> current was cached somewhere before that task had scheduled
> off, then it still would be correct now that it is scheduled
> back on.
> At points *within* context_switch, current won't be right,
> but AFAIKS current is never used in there.

We have a clobber in inline asm and a manual barrier() so we don't need
to change anything if current were to be properly CSE'd. It's somewhat
academic; if current were (mis)used there, it would change.


-- wli

2004-09-16 03:53:14

by Albert Cahalan

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, 2004-09-15 at 22:36, William Lee Irwin III wrote:
> On Wed, Sep 15, 2004 at 07:15:18PM -0400, Jakub Jelinek wrote:
> >>> current will certainly change in schedule (),
>
> On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> > Not really!
>
> Yes it does. The interior of schedule() is C and must be compiled also.

Sure. It doesn't matter. The part that matters is
all arch-specific assembly.

Hey, the arm port already uses __const__.
Unless the arm port is broken, there's proof.

> At some point in the past, I wrote:
> >> Why would barrier() not suffice?
>
> On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> > I don't think even barrier() is needed.
> > Suppose gcc were to cache the value of
> > current over a schedule. Who cares? It'll
> > be the same after schedule() as it was
> > before.
>
> Not over a call to schedule(). In the midst of schedule().

OK, let's look.

First, there's fork/vfork/clone. At no point does
"current" change. A process comes into existance
with a ready-made current.

Second, there's sched.c with context_switch().
That does everything via switch_to, like so:

/* Here we just switch the register state and the stack. */
switch_to(prev, next, prev);

No problem. Now I only need to show that switch_to()
is safe. Unfortunately, it's arch-specific code.
I'll look at a few examples...

x86_64: assembly, and thus OK
i386: assembly, and thus OK
ppc: assembly, and thus OK
arm: already uses __const__ :-)

To find a problem, you need to find an arch which
runs C code with current being inconsistent with
the stack or registers. That would be wild and evil.
In any case, adding __attribute__((__const__)) is
an arch-specific change. A truly evil arch running
C code with an inconsistent current can just leave
off the attribute or, better yet, stop being evil.


2004-09-16 04:00:00

by William Lee Irwin III

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, 2004-09-15 at 22:36, William Lee Irwin III wrote:
>> Not over a call to schedule(). In the midst of schedule().

On Wed, Sep 15, 2004 at 11:49:38PM -0400, Albert Cahalan wrote:
> OK, let's look.
> First, there's fork/vfork/clone. At no point does
> "current" change. A process comes into existance
> with a ready-made current.
> Second, there's sched.c with context_switch().
> That does everything via switch_to, like so:
> /* Here we just switch the register state and the stack. */
> switch_to(prev, next, prev);
> No problem. Now I only need to show that switch_to()
> is safe. Unfortunately, it's arch-specific code.
> I'll look at a few examples...

gcc has to compile more than Linux. There has to be a rule for how this
works, not "gee, Linux will still work and get faster if we change this".


-- wli

2004-09-16 06:58:43

by Andi Kleen

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

Albert Cahalan <[email protected]> writes:

> Andi Kleen writes:
>
>> Please CSE "current" manually. It generates
>> much better code on some architectures
>> because the compiler cannot do it for you.
>
> This looks fixable.

I tried it some years ago, but I ran into problems with the scheduler
and some other code and dropped it.

One problem is that gcc doesn't have a "drop all const/pure
cache values" barrier. Without this I don't think it can be
safely implemented.

-Andi

2004-09-16 09:04:40

by Russell King

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Wed, Sep 15, 2004 at 07:36:04PM -0700, William Lee Irwin III wrote:
> On Wed, Sep 15, 2004 at 07:15:18PM -0400, Jakub Jelinek wrote:
> >>> current will certainly change in schedule (),
>
> On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> > Not really!
>
> Yes it does. The interior of schedule() is C and must be compiled also.
>
>
> At some point in the past, I wrote:
> >> Why would barrier() not suffice?
>
> On Wed, Sep 15, 2004 at 10:10:20PM -0400, Albert Cahalan wrote:
> > I don't think even barrier() is needed.
> > Suppose gcc were to cache the value of
> > current over a schedule. Who cares? It'll
> > be the same after schedule() as it was
> > before.
>
> Not over a call to schedule(). In the midst of schedule().

Actually, I find myself agreeing with Albert here. Consider the
following points:

- "current_thread()" depends on the kernel stack pointer.

- the kernel stack pointer is changed when we switch threads.
- the rest of the register set is changed when we switch threads.

Therefore, if we have current_thread() cached in a register, and we
away from thread A to thread B, and back to A, has the cached copy
become invalid for thread A ? No.

Now look at the same thing from thread B's perspective. Has anything
changed because thread A has run? No.

IOW, think from a tasks point of view. It gets into the scheduler,
and switch_to() is just a normal function which just happens to sleep
for some time.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-09-16 09:11:31

by Andi Kleen

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

> IOW, think from a tasks point of view. It gets into the scheduler,
> and switch_to() is just a normal function which just happens to sleep
> for some time.

On x86/x86-64 the stack switch is inlined into schedule()

-Andi

2004-09-16 09:33:31

by Russell King

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Thu, Sep 16, 2004 at 11:11:28AM +0200, Andi Kleen wrote:
> > IOW, think from a tasks point of view. It gets into the scheduler,
> > and switch_to() is just a normal function which just happens to sleep
> > for some time.
>
> On x86/x86-64 the stack switch is inlined into schedule()

Yes, and does it not save the whole CPU context and restore it, or tell
the compiler that certain registers which you don't preserve are
clobbered? If it didn't, I think you'd find that you have a bug
there.

The scheduler quite rightly expects, for any thread, that any variable
which may be stored in a CPU register before the context switch has the
same value as after the context switch.

(note - "preserve" above - I don't mean from one thread to another,
I mean preserved within the context of one thread.)

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-09-16 11:03:58

by Andi Kleen

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

> The scheduler quite rightly expects, for any thread, that any variable
> which may be stored in a CPU register before the context switch has the
> same value as after the context switch.

Just current isn't a varible, it's some inline assembly in a inline
function. The question was if that function could be marked const/pure.

-Andi

2004-09-16 13:47:14

by Albert Cahalan

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Thu, 2004-09-16 at 02:58, Andi Kleen wrote:
> Albert Cahalan <[email protected]> writes:
>
> > Andi Kleen writes:
> >
> >> Please CSE "current" manually. It generates
> >> much better code on some architectures
> >> because the compiler cannot do it for you.
> >
> > This looks fixable.
>
> I tried it some years ago, but I ran into problems with the scheduler
> and some other code and dropped it.

Since then, have you changed:

a. the minimum required compiler version?
b. the clobbers on your switch_to asm?

If not, maybe you should.

> One problem is that gcc doesn't have a "drop all const/pure
> cache values" barrier. Without this I don't think it can be
> safely implemented.

That's only if gcc can find some place to
hide state which will contaminate another task.
Where could that be? Remember, arm works.

Also, glibc works, with per-thread errno:

/* Function to get address of global `errno' variable. */
extern int *__errno_location (void) __THROW __attribute__ ((__const__));

/* Function to get address of global `h_errno' variable. */
extern int *__h_errno_location (void) __THROW __attribute__ ((__const__));


2004-09-16 14:18:31

by Albert Cahalan

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Thu, 2004-09-16 at 02:58, Andi Kleen wrote:
> Albert Cahalan <[email protected]> writes:
>
> > Andi Kleen writes:
> >
> >> Please CSE "current" manually. It generates
> >> much better code on some architectures
> >> because the compiler cannot do it for you.
> >
> > This looks fixable.
>
> I tried it some years ago, but I ran into problems with the scheduler
> and some other code and dropped it.

Right now, I'm thinking the switch_to assembly
has bad asm constraints.

I count 8 items passed in or out of the asm.
I count 11 clobbers. You don't have 19 registers.

If you pass something in and then destroy it,
you're supposed to use a dummy output. Look at
the i386 version, where esi and edi are dummy
outputs for this reason.

I recall seeing i386 compilers complain about
clobbered inputs. I guess the x86-64 gcc needs
to have this warning added?


2004-09-16 14:40:10

by Russell King

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

On Thu, Sep 16, 2004 at 01:03:55PM +0200, Andi Kleen wrote:
> > The scheduler quite rightly expects, for any thread, that any variable
> > which may be stored in a CPU register before the context switch has the
> > same value as after the context switch.
>
> Just current isn't a varible, it's some inline assembly in a inline
> function. The question was if that function could be marked const/pure.

Sigh. What does that matter?

current_thread, which is what we should be talking about here, is
indeed a function, but its returned value depends wholely on the
stack pointer value.

Since the stack pointer and thread info are invariably coupled
together, it is right to mark it const/pure.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-09-16 19:27:33

by Andi Kleen

[permalink] [raw]
Subject: Re: get_current is __pure__, maybe __const__ even

> I count 8 items passed in or out of the asm.
> I count 11 clobbers. You don't have 19 registers.

Read it again. Many of the input arguments are not
registers, but "i" or "m"

> I recall seeing i386 compilers complain about
> clobbered inputs. I guess the x86-64 gcc needs
> to have this warning added?

All gcc ports use the same code for this, no difference.

-Andi