2001-02-07 18:01:07

by Hugh Dickins

Subject: [PATCH] micro-opt DEBUG_ADD_PAGE

On Tue, 6 Feb 2001, Linus Torvalds wrote:
> > - if (bh->b_size % correct_size) {
> > + if (bh->b_size != correct_size) {
>
> Actually, I'd rather leave it in, but speed it up with the saner and
> faster if (bh->b_size & (correct_size-1)) {
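
The two forms agree only when correct_size is a power of two, as buffer block sizes normally are. A minimal sketch of that equivalence in plain C follows; the helper names are hypothetical and not taken from the kernel.

```c
#include <stdbool.h>

/* Hypothetical helper: true when n is a non-zero power of two. */
static bool is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Misalignment test using the modulo operator. */
static bool misaligned_mod(unsigned long size, unsigned long unit)
{
    return (size % unit) != 0;
}

/* Same test using a mask; valid only when unit is a power of two. */
static bool misaligned_mask(unsigned long size, unsigned long unit)
{
    return (size & (unit - 1)) != 0;
}
```

For a unit such as 1536, which is not a power of two, the mask form gives different answers, so it is only a drop-in replacement when the power-of-two assumption holds.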

Micro-optimization season?

--- linux-2.4.2-pre1/include/linux/swap.h Wed Feb 7 15:21:13 2001
+++ linux/include/linux/swap.h Wed Feb 7 17:21:25 2001
@@ -200,8 +200,8 @@
* with the pagemap_lru_lock held!
*/
#define DEBUG_ADD_PAGE \
- if (PageActive(page) || PageInactiveDirty(page) || \
- PageInactiveClean(page)) BUG();
+ if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+ (1<<PG_inactive_clean))) BUG();

#define ZERO_PAGE_BUG \
if (page_count(page) == 0) BUG();
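
A minimal sketch of why the two forms of the check give the same answer, compilable outside the kernel. The struct page stand-in, the bit numbers and the function names are assumptions for illustration only; the real definitions live in the kernel headers.

```c
/* Stand-in for the kernel's struct page: only the flags word matters here. */
struct page {
    unsigned long flags;
};

/* Assumed bit numbers, for illustration only. */
enum {
    PG_active         = 6,
    PG_inactive_dirty = 7,
    PG_inactive_clean = 11
};

/* Original form: three separate single-bit tests joined by ||. */
static int on_lru_list_separate(const struct page *page)
{
    return ((page->flags >> PG_active) & 1) ||
           ((page->flags >> PG_inactive_dirty) & 1) ||
           ((page->flags >> PG_inactive_clean) & 1);
}

/* Patched form: one test against the combined mask. */
static int on_lru_list_masked(const struct page *page)
{
    return (page->flags & ((1UL << PG_active) |
                           (1UL << PG_inactive_dirty) |
                           (1UL << PG_inactive_clean))) != 0;
}
```

Because each sub-test only reads the same word and has no side effects, short-circuit evaluation cannot change the result, and both functions return the same value for every flags word.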


2001-02-07 18:18:31

by Linus Torvalds

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE



On Wed, 7 Feb 2001, Hugh Dickins wrote:
>
> Micro-optimization season?

I'd rather not do these kinds of things that the compiler should be able
to trivially do for us.

(gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it
here? Did you check the code? Have you asked the gcc lists?)

Linus

2001-02-07 20:43:34

by Hugh Dickins

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Wed, 7 Feb 2001, Linus Torvalds wrote:
>
> I'd rather not do these kinds of things that the compiler should be able
> to trivially do for us.
>
> (gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it
> here? Did you check the code? Have you asked the gcc lists?)

The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked
activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0
kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).

None of those optimizes this: I believe the semantics of "||" (don't
try next test if first succeeds) forbid the optimization "|" gives?

2.91 and 2.96 give three movs (two unnecessary), three tests,
three jumps (first two not usually taken):

232: 8b 43 18 mov 0x18(%ebx),%eax
235: a8 40 test $0x40,%al
237: 75 0f jne 248 <activate_page_nolock+0x4c>
239: 8b 43 18 mov 0x18(%ebx),%eax
23c: a8 80 test $0x80,%al
23e: 75 08 jne 248 <activate_page_nolock+0x4c>
240: 8b 43 18 mov 0x18(%ebx),%eax
243: f6 c4 08 test $0x8,%ah
246: 74 19 je 261 <activate_page_nolock+0x65>

2.97 is jumpier: mov and je mov test jne mov test jne jmp.
That looks worse to me: David, earlier on you advertized
http://www.codesourcery.com/gcc-snapshots/
Is this something worth your pursuing with the gcc guys?

Hugh

--- linux-2.4.2-pre1/include/linux/swap.h Wed Feb 7 15:21:13 2001
+++ linux/include/linux/swap.h Wed Feb 7 17:21:25 2001
@@ -200,8 +200,8 @@
* with the pagemap_lru_lock held!
*/
#define DEBUG_ADD_PAGE \
- if (PageActive(page) || PageInactiveDirty(page) || \
- PageInactiveClean(page)) BUG();
+ if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+ (1<<PG_inactive_clean))) BUG();

#define ZERO_PAGE_BUG \
if (page_count(page) == 0) BUG();

2001-02-07 21:40:42

by Linus Torvalds

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE



On Wed, 7 Feb 2001, Hugh Dickins wrote:
>
> The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked
> activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0
> kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).
>
> None of those optimizes this: I believe the semantics of "||" (don't
> try next test if first succeeds) forbid the optimization "|" gives?

No. The optimization is entirely legal - but the fact that
"constant_test_bit()" uses a "volatile unsigned int *" is the reason why
gcc thinks it can't optimize it.

Oh, well. That "volatile" is really totally bogus. But it's there because
there are probably drivers that do

while (test_bit(...))
/* nothing */;

and the compiler would optimize it away a bit too much without the
volatile. Dang.

You could try to remove the volatile from test_bit, and see if that fixes
it - but then we'd have to find and add the proper "rmb()" calls to people
who do the endless loop kind of thing like above.
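
A sketch of that trade-off in user-space C with GCC-style inline assembly. All names here (plain_test_bit, compiler_barrier, struct fake_device, wait_until_idle) are hypothetical; compiler_barrier() only stops the compiler from caching the status word and stands in for the rmb() a real driver loop would need.

```c
/* Non-volatile bit test: the compiler is free to combine or hoist these loads. */
static int plain_test_bit(int nr, const unsigned long *addr)
{
    return (int)((*addr >> nr) & 1UL);
}

/* Compiler-only barrier (GCC syntax): memory must be re-read after it. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Simulated hardware: bit 0 of status means "busy". */
struct fake_device {
    unsigned long status;
    int ticks_until_idle;
};

/* Advance the simulated device; it clears the busy bit when the count hits 0. */
static void fake_device_tick(struct fake_device *dev)
{
    if (dev->ticks_until_idle > 0 && --dev->ticks_until_idle == 0)
        dev->status &= ~1UL;
}

/* Poll until the busy bit clears; returns the number of loop iterations. */
static int wait_until_idle(struct fake_device *dev)
{
    int polls = 0;

    while (plain_test_bit(0, &dev->status)) {
        fake_device_tick(dev);   /* real hardware would change status on its own */
        compiler_barrier();      /* force a fresh load for the next test */
        polls++;
    }
    return polls;
}
```

With the barrier inside the loop body, the bit test itself needs no volatile, so back-to-back tests elsewhere remain candidates for merging.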

Linus

2001-02-08 01:24:49

by Kai Germaschewski

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Wed, 7 Feb 2001, Linus Torvalds wrote:

> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

This thing did attract me somewhat and I decided to learn a little about
compilers.

Result: Unfortunately it's not just the volatile, there's a bunch of
conditions you have to fulfill to have the compiler optimize this. (Sounds
like work for the compiler guys).

Test program is attached; inspecting the code (egcs 2.91.66 and
gcc-2.96 (-69) generate the same code) gives the following conclusions:

- f1(unsigned long f): manually optimized

if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {

-> optimized code (of course)


- f2(unsigned long f): leave some work to the compiler

if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {

-> optimized code (good)


- f3(unsigned int f): use constant_test_bit macro

if (constant_test_bit(1, &f) || constant_test_bit(2, &f) ||
constant_test_bit(4, &f)) {

-> optimized code

where

#define constant_test_bit(nr, addr) \
(((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)

(doesn't optimize when putting *const* unsigned int there)

- f4: same thing as f3, but use (unsigned long f) instead of
(unsigned int f)

-> no optimization

- f5: same thing as f3, but use inline function for constant_test_bit

-> no optimization

- f6: same thing as f3, but use test_bit instead of constant_test_bit,
where

#define test_bit(nr,addr) \
(__builtin_constant_p(nr) ? \
constant_test_bit((nr),(addr)) : \
variable_test_bit((nr),(addr)))

-> no optimization


Conclusion: With the compilers tested, lots of cases are not optimized
although they could be in theory:
- casting even from unsigned int to unsigned long breaks optimization
- macros are better than inline
- Even though evaluated at compile time, __builtin_constant_p breaks
optimization here, too.

BTW: volatile makes optimization impossible as well, of course, it leads
to repeated reloads of the variable, whereas otherwise it's cached in a
register in the above "no optimization" cases. That's expected behavior.

--Kai

Test code:
----------

#define ADDR (*(volatile long *) addr)

static __inline__ int inl_constant_test_bit(int nr, const void * addr)
{
	return ((1UL << (nr & 31)) & (((unsigned int *) addr)[nr >> 5])) != 0;
}

#define constant_test_bit(nr, addr) (((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)

static __inline__ int variable_test_bit(int nr, volatile void * addr)
{
	int oldbit;

	__asm__ __volatile__(
		"btl %2,%1\n\tsbbl %0,%0"
		:"=r" (oldbit)
		:"m" (ADDR),"Ir" (nr));
	return oldbit;
}

#define test_bit(nr,addr) \
	(__builtin_constant_p(nr) ? \
	 constant_test_bit((nr),(addr)) : \
	 variable_test_bit((nr),(addr)))


int f1(unsigned long f)
{
	if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {
		return 1;
	}
	return 0;
}

int f2(unsigned long f)
{
	if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {
		return 1;
	}
	return 0;
}

int f3(unsigned int f)
{
	if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
		return 1;
	}
	return 0;
}

int f4(unsigned long f)
{
	if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
		return 1;
	}
	return 0;
}

int f5(unsigned int f)
{
	if (inl_constant_test_bit(1, &f) || inl_constant_test_bit(2, &f) || inl_constant_test_bit(4, &f)) {
		return 1;
	}
	return 0;
}

int f6(unsigned int f)
{
	if (test_bit(1, &f) || test_bit(2, &f) || test_bit(4, &f)) {
		return 1;
	}
	return 0;
}

2001-02-08 16:25:42

by Hugh Dickins

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Wed, 7 Feb 2001, Linus Torvalds wrote:
> On Wed, 7 Feb 2001, Hugh Dickins wrote:
> >
> > None of those optimizes this: I believe the semantics of "||" (don't
> > try next test if first succeeds) forbid the optimization "|" gives?
>
> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
with three "mov"s. But take the "volatile"s out of constant_test_bit(),
and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
jumps - which is what originally offended me.

But Mark (in test program in private mail) shows gcc combining bits
into one test and one jump, just as we'd hope (and I wrongly thought
forbidden). Perhaps the inline function nature of constant_test_bit()
(which Mark didn't use) gets in the way of combining those tests.

> You could try to remove the volatile from test_bit, and see if that fixes
> it - but then we'd have to find and add the proper "rmb()" calls to people
> who do the endless loop kind of thing like above.

That is not an inviting path to me, at least not any time soon!

I think this all argues for the little patch I suggested - just avoid
test_bit() here. But it was only intended as a quick little suggestion:
looks like our tastes differ, and you prefer taking the _tiny_ hit of
using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Hugh

2001-02-08 16:38:24

by David Weinehall

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, Feb 08, 2001 at 04:24:23PM +0000, Hugh Dickins wrote:
> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > >
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> >
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
>
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s. But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
>
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden). Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
>
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
>
> That is not an inviting path to me, at least not any time soon!
>
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here. But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Well, after all, it's debugging code, and the code now is easy to read.
Your code, while more efficient, isn't. I think that clarity takes
priority over efficiency in non-critical code such as debugging
code. Of course, this is my personal opinion...


/David
_ _
// David Weinehall <[email protected]> /> Northern lights wander \\
// Project MCA Linux hacker // Dance across the winter sky //
\> http://www.acc.umu.se/~tao/ </ Full colour fire </

2001-02-08 16:57:38

by Hugh Dickins

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, 8 Feb 2001, David Weinehall wrote:
>
> Well, after all, it's debugging code, and the code now is easy to read.
> Your code, while more efficient, isn't. I think that clarity takes
> priority over efficiency in non-critical code such as debugging
> code. Of course, this is my personal opinion...

I agree my version isn't _as_ easy, and if this code only got built
into DEBUG kernels, I would never have bothered about it; but it's
built into every kernel, on executed paths, so it's no less critical.

Hugh

2001-02-08 17:04:18

by Rik van Riel

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, 8 Feb 2001, Hugh Dickins wrote:
> On Thu, 8 Feb 2001, David Weinehall wrote:
> >
> > Well, after all, it's debugging code, and the code now is easy to read.
> > Your code, while more efficient, isn't. I think that clarity takes
> > priority over efficiency in non-critical code such as debugging
> > code. Of course, this is my personal opinion...
>
> I agree my version isn't _as_ easy, and if this code only got built
> into DEBUG kernels, I would never have bothered about it; but it's
> built into every kernel, on executed paths, so it's no less critical.

Since it's DEBUG code only and nicely "hidden" in a .h file,
why not have the efficient code with a well-written comment
documenting what the code does and why it is there ?

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-02-08 17:03:28

by Richard B. Johnson

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, 8 Feb 2001, Hugh Dickins wrote:

> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > >
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> >
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
>
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s. But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
>
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden). Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
>
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
>
> That is not an inviting path to me, at least not any time soon!
>
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here. But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.
>

The use of the key word 'volatile' has gone just a bit too far in
some cases.

given:
funct()
{
	volatile unsigned int x;
}

Is plain dumb. There is nobody else that can touch that local
variable except the code in funct(). Even if it's recursive,
the Nth invocation still can't (using legal 'C' code) touch
that variable. Therefore, it should not be declared volatile.


Another problem with 'volatile' has to do with pointers. When
it's possible for some object to be modified by some external
influence, we see:

volatile struct whatever *ptr;

Now, it's unclear if gcc knows that we don't give a damn about
the address contained in 'ptr'. We know that it's not going to
change. What we are concerned with are the items within the
'struct whatever'. From what I've seen, gcc just reloads the
pointer.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-02-08 17:23:18

by Stephen Wille Padnos

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

"Richard B. Johnson" wrote:
[snip]
> Another problem with 'volatile' has to do with pointers. When
> it's possible for some object to be modified by some external
> influence, we see:
>
> volatile struct whatever *ptr;
>
> Now, it's unclear if gcc knows that we don't give a damn about
> the address contained in 'ptr'. We know that it's not going to
> change. What we are concerned with are the items within the
> 'struct whatever'. From what I've seen, gcc just reloads the
> pointer.
>
> Cheers,
> Dick Johnson
>
gcc should treat
volatile struct whatever *ptr;

as a different case than
struct whatever * volatile ptr;

which is also different from
volatile struct whatever * volatile ptr;

I think (but can't find my K&R C book to confirm :) that the first case
declares the struct as volatile, and the second case declares the
pointer volatile (the third case declares a volatile pointer to a
structure with volatile parts). So, the programmer should have the
choice, if gcc is dealing with volatile correctly.
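
That reading matches the C declarator rules: a qualifier to the left of the `*` applies to the pointed-to object, one to the right applies to the pointer itself. A small sketch, assuming a C11 compiler for _Generic; the macro, the enum and the variable names are hypothetical.

```c
struct whatever { int member; };

enum volatile_where { POINTEE_ONLY = 1, POINTER_ONLY = 2, BOTH = 3 };

/* Classify a pointer variable by the type of its address. */
#define VOLATILE_WHERE(var) _Generic(&(var),                   \
    volatile struct whatever **:          POINTEE_ONLY,        \
    struct whatever *volatile *:          POINTER_ONLY,        \
    volatile struct whatever *volatile *: BOTH)

/* First case: the struct is volatile, the pointer is an ordinary variable. */
volatile struct whatever *data_is_volatile;

/* Second case: the pointer is volatile, the struct it points to is not. */
struct whatever *volatile pointer_is_volatile;

/* Third case: both the pointer and the pointed-to struct are volatile. */
volatile struct whatever *volatile both_are_volatile;
```

Here VOLATILE_WHERE(data_is_volatile) selects POINTEE_ONLY, so a compiler may keep that pointer in a register while still reloading the members it points to.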

Of course, that doesn't mean that the authors have made the right choice
:)

--
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
[email protected]

2001-02-08 17:59:27

by Richard B. Johnson

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> [snip]
> > Another problem with 'volatile' has to do with pointers. When
> > it's possible for some object to be modified by some external
> > influence, we see:
> >
> > volatile struct whatever *ptr;
> >
> > Now, it's unclear if gcc knows that we don't give a damn about
> > the address contained in 'ptr'. We know that it's not going to
> > change. What we are concerned with are the items within the
> > 'struct whatever'. From what I've seen, gcc just reloads the
> > pointer.
> >
> > Cheers,
> > Dick Johnson
> >
> gcc should treat
> volatile struct whatever *ptr;
>
> as a different case than
> struct whatever * volatile ptr;
>
> which is also different from
> volatile struct whatever * volatile ptr;
>
> I think (but can't find my K&R C book to confirm :) that the first case
> declares the struct as volatile, and the second case declares the
> pointer volatile (the third case declares a volatile pointer to a
> structure with volatile parts). So, the programmer should have the
> choice, if gcc is dealing with volatile correctly.
>
> Of course, that doesn't mean that the authors have made the right choice
> :)
>

Yes. My point is that a lot of authors have declared just about everything
'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
be "safe". It's likely that there are many hundreds of thousands of
unneeded register-reloads because of this.

It might be useful for somebody who has a lot of time on his/her
hands to go through some of these drivers.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.


2001-02-08 18:19:59

by Stephen Wille Padnos

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

"Richard B. Johnson" wrote:
>
> On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
>
> > "Richard B. Johnson" wrote:
> > [snip]
> > > Another problem with 'volatile' has to do with pointers. When
> > > it's possible for some object to be modified by some external
> > > influence, we see:
> > >
> > > volatile struct whatever *ptr;
> > >
> > > Now, it's unclear if gcc knows that we don't give a damn about
> > > the address contained in 'ptr'. We know that it's not going to
> > > change. What we are concerned with are the items within the
> > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > pointer.
> > >
[snip]

> Yes. My point is that a lot of authors have declared just about everything
> 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> be "safe". It's likely that there are many hundreds of thousands of
> unneeded register-reloads because of this.
>
> It might be useful for somebody who has a lot of time on his/her
> hands to go through some of these drivers.

I would be willing to do this (on the slow boat - I don't have THAT much
spare time :), but only if we can be sure that the gcc optimizer will
correctly handle a normal pointer to volatile data. Your experiences
would seem to indicate that the optimizer needs fixing before much
effort should be spent on this.

--
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
[email protected]

2001-02-08 19:34:39

by Richard B. Johnson

Subject: Re: [PATCH] micro-opt DEBUG_ADD_PAGE

On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> >
> > On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
> >
> > > "Richard B. Johnson" wrote:
> > > [snip]
> > > > Another problem with 'volatile' has to do with pointers. When
> > > > it's possible for some object to be modified by some external
> > > > influence, we see:
> > > >
> > > > volatile struct whatever *ptr;
> > > >
> > > > Now, it's unclear if gcc knows that we don't give a damn about
> > > > the address contained in 'ptr'. We know that it's not going to
> > > > change. What we are concerned with are the items within the
> > > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > > pointer.
> > > >
> [snip]
>
> > Yes. My point is that a lot of authors have declared just about everything
> > 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> > be "safe". It's likely that there are many hundreds of thousands of
> > unneeded register-reloads because of this.
> >
> > It might be useful for somebody who has a lot of time on his/her
> > hands to go through some of these drivers.
>
> I would be willing to do this (on the slow boat - I don't have THAT much
> spare time :), but only if we can be sure that the gcc optimizer will
> correctly handle a normal pointer to volatile data. Your experiences
> would seem to indicate that the optimizer needs fixing before much
> effort should be spent on this.
>

Well, the question for that is: "What compiler?". I'm currently
using egcs-2.91.66, one of the "approved" versions for compiling
the kernel. It treats all volatiles about the same:


volatile int i;
volatile int *p;
int volatile *q;
volatile int * volatile r;

void foo()
{
	while(*p == i)
		;
	while(*q == i)
		;
	while(*r == i)
		;
}
...makes :


.file "main.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl foo
.type foo,@function
foo:
pushl %ebp
movl %esp,%ebp
nop
.align 4
.L2:
movl p,%eax
movl (%eax),%edx
movl i,%eax
cmpl %eax,%edx
je .L4
jmp .L3
.align 4
.L4:
jmp .L2
.align 4
.L3:
nop
.align 4
.L5:
movl q,%eax
movl (%eax),%edx
movl i,%eax
cmpl %eax,%edx
je .L7
jmp .L6
.align 4
.L7:
jmp .L5
.align 4
.L6:
nop
.align 4
.L8:
movl r,%eax
movl (%eax),%edx
movl i,%eax
cmpl %eax,%edx
je .L10
jmp .L9
.align 4
.L10:
jmp .L8
.align 4
.L9:
.L1:
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size foo,.Lfe1-foo
.comm i,4,4
.comm p,4,4
.comm q,4,4
.comm r,4,4
.ident "GCC: (GNU) egcs-2.91.66 19990314 (egcs-1.1.2 release)"


Since there seems to be a rather big difference between what is
expected to be done, and what happens to be the result, this
certainly contributes to the possible over-use of 'volatile' in
some kernel code.
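
One detail of the foo() test: `volatile int *p` and `int volatile *q` are two spellings of the same type, so identical code for the first two loops is what the language calls for, and only `r` adds a volatile pointer. A sketch assuming a C11 compiler for _Generic; the macro name is hypothetical.

```c
volatile int *p;            /* pointer to volatile int */
int volatile *q;            /* same type as p, qualifier written after "int" */
volatile int *volatile r;   /* volatile pointer to volatile int */

/* 1 when the argument is the address of a plain pointer to volatile int. */
#define IS_PLAIN_PTR_TO_VOLATILE_INT(pp) \
    _Generic((pp), volatile int **: 1, default: 0)
```

IS_PLAIN_PTR_TO_VOLATILE_INT(&p) and IS_PLAIN_PTR_TO_VOLATILE_INT(&q) both select the first branch, while &r falls through to the default.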

It's certainly better to be safe than sorry, but in some cases "safe"
is just a bit "strange". FYI, ../linux/drivers/net/atp.c doesn't use
'volatile' at all. However, ../linux/drivers/net/bmac.c uses it 40
times. I'll bet a buck that both of the drivers work and the one
without 'volatile' keywords does the work with fewer instructions.

These are just two drivers chosen at random. The driver I've been
working on to make 'bullet proof', pcnet32.c, uses 'volatile' twice.
And, on at least one occasion, the wrong thing is declared volatile
(the value in a pointer to a structure); however, gcc doesn't seem
to care because it reloads the values of the structure members every time,
anyway. So, in this case, the address-value in the pointer will never
change, but gcc reloads all the pointed-to members anyway, so the
'volatile' keyword is not useful.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.