> > s = --s >> PAGE_SHIFT;
>
> This code has undefined behaviour.
Do you mean that this:
s = (s-1) >> PAGE_SHIFT
is more deterministic? If so, I agree -- if you mean
something else, I am kind of lost.
> > if (likely (s) < s)
>
> What is that supposed to do?
This is doing the "you are looking at the obvious
and not noticing" [I mean me, not you]. I started with
another algorithm, crappier, that required this in order
to work properly, and I never took it out; not to mention
it is wrong - it should read 'if (likely ((1 << exp) < s))'
... McFly, anybody home?
I also realized I was not calling get_order() -the old version-
statically in the test case, so that slowed it down ... in any
case, it does not seem to affect much [probably because of the
tight loop].
> BTW, I just noticed
>
> #define likely(x) __builtin_expect((x),1)
>
> I think this should rather be
>
> #define likely(x) __builtin_expect((x)!=0,1)
Yep, for NGPT what I did was to cast it to unsigned long,
although your way might be more elegant. Care to submit that?
Okay, so now I got it fixed, I will be submitting a new patch
right now with the whole thing against 2.5.47. The changes versus
the previous one are:
--- page.h 13 Nov 2002 00:49:22 -0000 1.1.2.1
+++ page.h 13 Nov 2002 22:58:51 -0000 1.1.2.2
@@ -10,17 +10,10 @@
static __inline__
int generic_get_order (unsigned long s)
{
- int exp;
-
- s = --s >> PAGE_SHIFT;
+ s = (s - 1) >> PAGE_SHIFT;
if (s == 0)
return 0;
-
- exp = fls (s);
- s = 1 << exp;
- if (likely (s) < s)
- exp++;
- return exp;
+ return fls (s);
}
#endif _GENERIC_PAGE_H
Inaky Perez-Gonzalez -- Not speaking for Intel - opinions are my own [or my
fault]
"Perez-Gonzalez, Inaky" <[email protected]> writes:
> > > s = --s >> PAGE_SHIFT;
> >
> > This code has undefined behaviour.
>
> Do you mean that this:
>
> s = (s-1) >> PAGE_SHIFT
>
> is more deterministic? If so, I agree -- if you mean something else,
> I am kind of lost.
I mean that this code violates the rule that you may modify a value
only once between two sequence points. Newer gccs have a warning for
this (-Wsequence-point), the info page tells more.
Also, did I understand it right that you want to use fls even on
architectures that don't have it as a builtin? I would guess that will
actually be noticeably slower, since generic_fls is so complicated.
--
Falk