LinuxLists.cc - Runaway cron task on 2.5.63/4 bk?

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Kevin Brosius <[email protected]> wrote:
>
> Second attempt to send this after not seeing it post after about a day.
> Anyone else have kernel posting problems?
>
> I started seeing the cron task runaway, using 100% CPU continuously on a
> single CPU with
> 2.5.63+bk and now with 2.5.64 (about two weeks now.) No other
> apps/tasks seem to be affected, that I've noticed. It seems to take
> upwards of 8 hours running the kernel for this to occur.
>
> top shows:
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 594 root 25 0 1428 620 1364 R 49.9 0.1 195:23 cron
>

Yes I've seen this four times over maybe three weeks. Three times on dual
CPU, once on a different UP machine.

In all cases, crond is stuck in a loop calling nanosleep with a tv_sec value
of a bit over 4,000,000 and a tv_nsec value of zero. nanosleep keeps
returning EINVAL immediately.

I'm not sure why crond is trying to sleep for so long. Maybe it has set an
alarm.

errr, OK. This returns -EINVAL:

#include <time.h>

main()
{
struct timespec req;
struct timespec rem;
int ret;

req.tv_sec = 5000000;
req.tv_nsec = 0;

ret = nanosleep(&req, &rem);
if (ret)
perror("nanosleep");
}

I shall take a look....

2003-03-09 08:06:09

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Andrew Morton <[email protected]> wrote:
>
> errr, OK. This returns -EINVAL:
>
> #include <time.h>
>
> main()
> {
> struct timespec req;
> struct timespec rem;
> int ret;
>
> req.tv_sec = 5000000;
> req.tv_nsec = 0;
>
> ret = nanosleep(&req, &rem);
> if (ret)
> perror("nanosleep");
> }
>

OK, I give up.

/*
* This is a considered response, not exactly in
* line with the standard (in fact it is silent on
* possible overflows). We assume such a large
* value is ALMOST always a programming error and
* try not to compound it by setting a really dumb
* value.
*/
return -EINVAL;

George, RH7.3 and RH8.0 cron daemons are triggering this (trying to sleep
for 4,500,000 seconds) and it causes them to go into a busy loop.

I think we need to just sleep for as long as we can and return an
appropriate partial result.

2003-03-09 16:18:46

by Todd Mokros

[permalink] [raw]

Subject: [PATCH] Re: Runaway cron task on 2.5.63/4 bk?

On Sun, 2003-03-09 at 03:17, Andrew Morton wrote:
> Andrew Morton <[email protected]> wrote:
> >
> > errr, OK. This returns -EINVAL:
> >
> > #include <time.h>
> >
> > main()
> > {
> > struct timespec req;
> > struct timespec rem;
> > int ret;
> >
> > req.tv_sec = 5000000;
> > req.tv_nsec = 0;
> >
> > ret = nanosleep(&req, &rem);
> > if (ret)
> > perror("nanosleep");
> > }
> >
>
> OK, I give up.
>
> /*
> * This is a considered response, not exactly in
> * line with the standard (in fact it is silent on
> * possible overflows). We assume such a large
> * value is ALMOST always a programming error and
> * try not to compound it by setting a really dumb
> * value.
> */
> return -EINVAL;
>
> George, RH7.3 and RH8.0 cron daemons are triggering this (trying to sleep
> for 4,500,000 seconds) and it causes them to go into a busy loop.
>
> I think we need to just sleep for as long as we can and return an
> appropriate partial result.

Cron really isn't at fault, I saw sleep(52) return 4500000, which it
just passed into another sleep call.
The problem is a bug in do_clock_nanosleep. If it gets interrupted by a
signal, when it calculates the amount of time left, it doesn't check if
jiffies has advanced past the expire time, and can pass a negative value
to jiffies_to_timespec, which results in values around 4,500,000
((unsigned int)-1)/HZ, which ends up as sleep's return value. The
following trivial patch appears to have fixed the problem on my system.
Hopefully this isn't wrapped.

--- 2.5-merge/kernel/posix-timers.c Sun Mar 9 08:49:11 2003
+++ 2.5-snapshot/kernel/posix-timers.c Sun Mar 9 08:49:11 2003
@@ -1282,6 +1282,9 @@
if (abs)
return -ERESTARTNOHAND;

+ if (time_after_eq(jiffies_f, new_timer.expires))
+ return 0;
+
jiffies_to_timespec(new_timer.expires - jiffies_f, tsave);

while (tsave->tv_nsec < 0) {

--
Todd Mokros <[email protected]>

2003-03-10 19:31:59

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Andrew Morton wrote:
> Andrew Morton <[email protected]> wrote:
>
>>errr, OK. This returns -EINVAL:
>>
>>#include <time.h>
>>
>>main()
>>{
>> struct timespec req;
>> struct timespec rem;
>> int ret;
>>
>> req.tv_sec = 5000000;
>> req.tv_nsec = 0;
>>
>> ret = nanosleep(&req, &rem);
>> if (ret)
>> perror("nanosleep");
>>}
>>
>
>
> OK, I give up.
>
> /*
> * This is a considered response, not exactly in
> * line with the standard (in fact it is silent on
> * possible overflows). We assume such a large
> * value is ALMOST always a programming error and
> * try not to compound it by setting a really dumb
> * value.
> */
> return -EINVAL;
>
> George, RH7.3 and RH8.0 cron daemons are triggering this (trying to sleep
> for 4,500,000 seconds) and it causes them to go into a busy loop.
>
> I think we need to just sleep for as long as we can and return an
> appropriate partial result.
>
>
Linus has fixed the problem cron showed up, so.

Lets consider this one on its own merits. What SHOULD sleep do when
asked to sleep for MAX_INT number of jiffies or more, i.e. when
jiffies overflows? My notion, above, it that it is clearly an error.
I suppose as HZ gets bigger, this argument will carry less weight,
but, still:

We have, I think, three choices:
1.) Error out as it does now,
2.) Sleep for MAX_INT and return ?????
3.) Sleep for MAX_INT and then sleep some more until the actual time
is reached.

2.) Requires, if we are to return other than OK, some way to flag that
the error happened.

3.) Likewise, requires more bits in the timer. If we went to a 64-bit
expire count, we could do the "right" thing, however it adds an int to
the size of the timer_struct.

So, folks, what is the _right_ thing to do here?

-g

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-03-10 19:40:53

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Mon, 10 Mar 2003, george anzinger wrote:
>
> Lets consider this one on its own merits. What SHOULD sleep do when
> asked to sleep for MAX_INT number of jiffies or more, i.e. when
> jiffies overflows? My notion, above, it that it is clearly an error.

My suggestion (in order of preference):
- sleep the max amount, and then restart as if a signal had happened
- sleep the max amount (old behaviour)
- consider it an error (new behaviour)

In this case the error case actually helped find the other unrelated bug,
so in this case the error actually _helped_ us. However, that was only
"help" from a kernel perspective, from a user perspective I definitely
think that it makes no sense to have "sleep(largenum)" return -EINVAL.

And in the end it's the user that matters.

Linus

2003-03-10 22:10:58

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Linus Torvalds wrote:
> On Mon, 10 Mar 2003, george anzinger wrote:
>
>>Lets consider this one on its own merits. What SHOULD sleep do when
>>asked to sleep for MAX_INT number of jiffies or more, i.e. when
>>jiffies overflows? My notion, above, it that it is clearly an error.
>
>
> My suggestion (in order of preference):
> - sleep the max amount, and then restart as if a signal had happened

I think this will require a 64-bit expire in the timer_struct
(actually it would not be treated as such, but the struct would still
need the added bits). Is this ok?

I will look at the problem in detail and see if there might be another
way without the need of the added bits.

> - sleep the max amount (old behavior)
> - consider it an error (new behavior)
>
> In this case the error case actually helped find the other unrelated bug,
> so in this case the error actually _helped_ us. However, that was only
> "help" from a kernel perspective, from a user perspective I definitely
> think that it makes no sense to have "sleep(largenum)" return -EINVAL.
>
> And in the end it's the user that matters.
>
Hm... I changed it to what it is to make it easier to track down
problems in the test code... and this was user code. My thinking was
that such large values are clear errors, and having the code "hang" in
the sleep just hides the problem. But then, I NEVER make a system
call without checking for errors.... And, I was making a LOT of sleep
calls and wanted to know which one(s) were wrong.

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-03-10 22:24:00

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

george anzinger <[email protected]> wrote:
>
> Linus Torvalds wrote:
> > On Mon, 10 Mar 2003, george anzinger wrote:
> >
> >>Lets consider this one on its own merits. What SHOULD sleep do when
> >>asked to sleep for MAX_INT number of jiffies or more, i.e. when
> >>jiffies overflows? My notion, above, it that it is clearly an error.
> >
> >
> > My suggestion (in order of preference):
> > - sleep the max amount, and then restart as if a signal had happened
>
> I think this will require a 64-bit expire in the timer_struct
> (actually it would not be treated as such, but the struct would still
> need the added bits). Is this ok?
>
> I will look at the problem in detail and see if there might be another
> way without the need of the added bits.

Is it not possible to just sit in a loop, sleeping for 0x7fffffff jiffies
on each iteration? (Until the final partial bit of course)

> Hm... I changed it to what it is to make it easier to track down
> problems in the test code... and this was user code. My thinking was
> that such large values are clear errors, and having the code "hang" in
> the sleep just hides the problem. But then, I NEVER make a system
> call without checking for errors.... And, I was making a LOT of sleep
> calls and wanted to know which one(s) were wrong.

If an app wants to sleep forever, calling

while (1)
sleep(MAX_INT);

seems like a reasonable approach. I'd expect quite a lot of applications
would be doing that.

2003-03-10 22:36:44

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Andrew Morton wrote:
> george anzinger <[email protected]> wrote:
>
>>Linus Torvalds wrote:
>>
>>>On Mon, 10 Mar 2003, george anzinger wrote:
>>>
>>>
>>>>Lets consider this one on its own merits. What SHOULD sleep do when
>>>>asked to sleep for MAX_INT number of jiffies or more, i.e. when
>>>>jiffies overflows? My notion, above, it that it is clearly an error.
>>>
>>>
>>>My suggestion (in order of preference):
>>> - sleep the max amount, and then restart as if a signal had happened
>>
>>I think this will require a 64-bit expire in the timer_struct
>>(actually it would not be treated as such, but the struct would still
>>need the added bits). Is this ok?
>>
>>I will look at the problem in detail and see if there might be another
>>way without the need of the added bits.
>
>
> Is it not possible to just sit in a loop, sleeping for 0x7fffffff jiffies
> on each iteration? (Until the final partial bit of course)

Seems reasonable. I will have a look.

-g
>
>
>>Hm... I changed it to what it is to make it easier to track down
>>problems in the test code... and this was user code. My thinking was
>>that such large values are clear errors, and having the code "hang" in
>>the sleep just hides the problem. But then, I NEVER make a system
>>call without checking for errors.... And, I was making a LOT of sleep
>>calls and wanted to know which one(s) were wrong.
>
>
> If an app wants to sleep forever, calling
>
> while (1)
> sleep(MAX_INT);
>
> seems like a reasonable approach. I'd expect quite a lot of applications
> would be doing that.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-03-10 22:55:11

by Felipe Alfaro Solana

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

----- Original Message -----
From: Andrew Morton <[email protected]>
Date: Mon, 10 Mar 2003 14:29:44 -0800
To: george anzinger <[email protected]>
Subject: Re: Runaway cron task on 2.5.63/4 bk?

> If an app wants to sleep forever, calling
>
> while (1)
> sleep(MAX_INT);
>
> seems like a reasonable approach. I'd expect quite a lot of applications
> would be doing that.

why not sleep(0)?

Felipe

--
______________________________________________
http://www.linuxmail.org/
Now with e-mail forwarding for only US$5.95/yr

Powered by Outblaze

2003-03-10 23:30:18

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Tue, 11 Mar 2003, Felipe Alfaro Solana wrote:
>
> why not sleep(0)?

I think a much more likely (and correct) usage for big sleep values is
more something like this:

do_with_timeout(xxx, int timeout)
{
struct timespec ts;

... set up some async event ..
ts.tv_nsec = 0;
ts.tv_sec = timeout;
while (nanosleep(&ts, &ts)) {
if (async event happened)
return happy;
}
.. tear down the async event if it didn't happen ..
}

and here the natural thing to do in user space is to just make the "no
timeout" case be a huge value.

At which point it is a _bug_ in the kernel if we return early with some
random error code.

Linus

2003-03-11 10:10:23

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

diff -urP -I '\$Id:.*Exp \$' -X /usr/src/patch.exclude linux-2.5.64-kb/Documentation/scaled_math.txt linux/Documentation/scaled_math.txt
--- linux-2.5.64-kb/Documentation/scaled_math.txt 1969-12-31 16:00:00.000000000 -0800
+++ linux/Documentation/scaled_math.txt 2003-03-10 16:41:52.000000000 -0800
@@ -0,0 +1,169 @@
+This file gives a bit of information on scaling integers, the math
+involved, and the considerations needed to decide if and how to use
+scaled math.
+
+What is it anyway?
+
+Scaled math is a method of doing integer math which allows you to:
+
+A.) Work with fractions in integer math.
+B.) Use MPY instead of DIV (much faster).
+C.) To reduce round off (or digitizing) errors.
+
+Basically, in scaled math you would replace an equation of this sort:
+
+r = foo / bar or r = foo * top / bar
+
+with this:
+
+r = (foo * SC) / (bar * SC) or r = (foo * top *SC) / (bar * SC)
+
+Regrouping these:
+
+r = foo * (SC / bar) / SC or r = foo * ((top * SC) / bar) / SC
+
+SC is the scale factor. We choose SC carefully to retain the most bits
+in the calculation and to make the math easy. We make the math easy by
+making SC be a power of 2 so the * SC and / SC operations are just
+shifts.
+
+
+How does it accomplish all this?
+
+The best way to show the benefits of scaled math is to go through an
+example. Here is a common problem: During boot up of an 800MHZ machine
+we measure the speed of the machine against the time base. In the case
+of the i386 machine with a TSC this means we program the PIT for a fixed
+time interval and measure the difference in TSC values over this
+interval. Suppose we measure this over 0.010 seconds (10 ms). With
+this particular machine we would get something like 8,000,000 (call this
+TSC-M). We want to use this number to convert TSC counts to micro
+seconds. So the math is:
+
+usec = tsccount * 0.01 * 1000000 / TSC-M or
+usec = tsccount * 10000 / TSC-M
+
+Now the first thing to notice here is that (10000 / TSC-M) is less than
+one and, if we precalculated it we would loose big time. We might try
+precalculating (TSC-M / 100000) but this also has problems. First it
+would require a "div" each time we wanted to convert a "tsccount" to
+micro seconds. The second problem is that the result of this
+precalculation would be on the order of 80, or only 6 to 7 bits so we
+would loose a lot of precision.
+
+The scaled way to do this is to precalculate ((10000 * SC) / TSC-M). In
+fact this is what is done in the i386 time code. In this case SC is
+(2**32). This allows micro seconds to be calculated as:
+
+usec = tsccount * ((10000 * SC) / TSC-M) / SC
+
+where ((10000 * SC) / TSC-M) is a precalculated constant and "/ SC"
+becomes a right shift of 32 bits. The easy way to do this is to do a
+simple "mul" instruction and take the high order bits as the result.
+I.e. the right shift doesn't even need to be done in this case.
+
+What have we gained here?
+
+The precision is much higher. The constant will be on the order of
+5,368,709 so we have about 1 bit in 5 million of precision, not bad.
+
+We now do a multiply to do the conversion which is much faster than the
+divide. Note, also, if we happen to want nano seconds instead of micro
+seconds we could just change the constant term to:
+
+(10000000 * SC) / TSC-M
+
+which has even more precision. Note here that we are really trying to
+multiply the TSC count by something like 1.25 (or divide it by 0.8) both
+of which are fractions that would loose most of there precision with out
+scaling.
+
+How to choose SC:
+
+SC has to be chosen carefully to avoid both underflow and overflow.
+
+With the routines provided in the sc_math.h file, the calculations expand
+for a brief moment to 64-bits and then are reduced back to 32-bits. For
+example, the calculation of ((10000 * SC) / TSC-M) will shift "10000"
+left 32 bits to form a 64-bit numerator for the divide by TSC-M. SC
+must be chosen so that the result of the divide is no more than 32-bits
+(i.e. does not overflow). In our example, this defines the slowest
+machine we can handle (10,000 tsc counts in 0.01 sec or 1MHZ).
+
+Like wise, if SC is too small, the result will be too small and
+precision will be lost.
+
+Notes on the sc_math.h functions:
+
+What the sc_math.h functions do is to provide routines that allow usage
+of the ability of the hardware to multiply two integers and return a
+result that is twice the length of the original integers. At the same
+time it provides access to the divide instruction which can divide a
+64-bit numerator by a 32-bit denominator and return a 32-bit quotient and
+remainder.
+
+In addition, to help with the scaling, routines are provide that combine
+the common scaling shift operation with the multiply or divide.
+
+Since (2**32) is a common scaling, functions to deal with it most
+efficiently are provided.
+
+Functions that allow easy calculation of conversion constants at compile
+time are also provided.
+
+Details:
+
+All the functions work with unsigned long or unsigned long long. We
+leave it for another day to do the signed versions. Also, we have
+provided a generic sc_math.h file. This begs for each 32-bit arch to
+supply an asm version which will be much more efficient.
+
+SC_32(x) given an integer (or long) returns (unsigned long long)x<<32
+SC_n(n,x) given an integer (or long) returns (unsigned long long)x<<n
+
+These may be used to form constants at compile time, e.g.:
+
+unsigned long sc_constant = SC_n(24, (constant expression)) / constant;
+
+mpy_sc32(unsigned long a, unsigned long b); returns (a * b) >> 32
+
+mpy_sc24(unsigned long a, unsigned long b); returns (a * b) >> 24
+
+mpy_sc_n(const N, unsigned long a, unsigned long b); returns (a * b) >> N
+
+Note: N must be a constant here.
+
+div_sc32(unsigned long a, unsigned long b); returns (a << 32) / b
+
+div_sc24(unsigned long a, unsigned long b); returns (a << 24) / b
+
+div_sc_n(const N, unsigned long a, unsigned long b); returns (a << N) / b
+
+Note: N must be a constant here.
+
+In addition, the following functions provide access to the mpy and div
+instructions:
+
+mpy_l_X_l_ll(unsigned long mpy1, unsigned long mpy2);
+returns (unsigned long long)(mpy1 * mpy2)
+
+mpy_l_X_l_h(unsigned long mpy1, unsigned long mpy2, unsigned long *hi);
+returns (unsigned long)(mpy1 * mpy2) & 0xffffffff and
+ (unsigned long)(mpy1 * mpy2) >> 32 in hi
+
+mpy_ll_X_l_ll(unsigned long long mpy1, unsigned long mpy2);
+returns (unsigned long long)(mpy1 * mpy2)
+
+Note: The long long mpy1, this routine allows a string of mpys where it
+is undetermined where the result becomes long long.
+
+div_ll_X_l_rem(unsigned long long divs, unsigned long div, unsigned long *rem);
+returns (unsigned long)(divs/div)
+with the remainder in rem
+
+div_long_long_rem() is an alias for the above.
+
+div_h_or_l_X_l_rem(unsigned long divh, unsigned long divl,
+ unsigned long div, unsigned long *rem)
+returns (unsigned long)((divh << 32) | divl) / div
+with the remainder in rem
diff -urP -I '\$Id:.*Exp \$' -X /usr/src/patch.exclude linux-2.5.64-kb/include/asm-generic/sc_math.h linux/include/asm-generic/sc_math.h
--- linux-2.5.64-kb/include/asm-generic/sc_math.h 1969-12-31 16:00:00.000000000 -0800
+++ linux/include/asm-generic/sc_math.h 2003-03-10 16:41:52.000000000 -0800
@@ -0,0 +1,311 @@
+#ifndef SC_MATH_GENERIC
+#define SC_MATH_GENERIC
+/*
+ * These are the generic scaling functions for machines which
+ * do not yet have the arch asm versions (or for the 64-bit
+ * long systems)
+ */
+/*
+ * Pre scaling defines
+ */
+#define SC_32(x) ((unsigned long long)(x) << 32)
+#define SC_n(n,x) (((unsigned long long)(x)) << (n))
+
+#if (BITS_PER_LONG < 64)
+
+#define SCC_SHIFT 16
+#define SCC_MASK ((1 << SCC_SHIFT) -1)
+/*
+ * mpy a long by a long and return a long long
+ */
+
+extern inline long long mpy_l_X_l_ll(unsigned long mpy1,unsigned long mpy2)
+{
+ unsigned long low1 = (unsigned) (mpy1 & SCC_MASK);
+ unsigned long high1 = (mpy1 >> SCC_SHIFT);
+ unsigned long low2 = (unsigned) (mpy2 & SCC_MASK);
+ unsigned long high2 = (mpy2 >> SCC_SHIFT);
+
+ unsigned long long cross = (low1 * high2) + (high1 * low2);
+
+ return (((long long)(high1 * high2)) << (SCC_SHIFT + SCC_SHIFT)) +
+ (cross << SCC_SHIFT) +
+ (low1 * low2);
+
+}
+/*
+ * mpy a long long by a long and return a long long
+ */
+
+extern inline long long mpy_ll_X_l_ll(unsigned long long mpy1,
+ unsigned long mpy2)
+{
+ long long result = mpy_l_X_l_ll((unsigned long)mpy1, mpy2);
+ result += (mpy_l_X_l_ll((long)(mpy1 >> 32), mpy2) << 32);
+ return result;
+}
+/*
+ * mpy a long by a long and return the low part and a seperate hi part
+ */
+
+
+extern inline unsigned long mpy_l_X_l_h(unsigned long mpy1,
+ unsigned long mpy2,
+ unsigned long *hi)
+{
+ unsigned long long it = mpy_l_X_l_ll(mpy1, mpy2);
+ *hi = (unsigned long)(it >> 32);
+ return (unsigned long)it;
+
+}
+/*
+ * This routine preforms the following calculation:
+ *
+ * X = (a*b)>>32
+ * we could, (but don't) also get the part shifted out.
+ */
+extern inline unsigned long mpy_sc32(unsigned long a,unsigned long b)
+{
+ return (unsigned long)(mpy_l_X_l_ll(a, b) >> 32);
+}
+/*
+ * X = (a/b)<<32 or more precisely x = (a<<32)/b
+ */
+#include <asm/div64.h>
+#if 0 // maybe one day we will do signed numbers...
+/*
+ * do_div doesn't handle signed numbers, so:
+ */
+#define do_div_signed(result, div) \
+({ \
+ long rem, flip = 0; \
+ if (result < 0){ \
+ result = -result; \
+ flip = 2; /* flip rem & result sign*/ \
+ if (div < 0){ \
+ div = -div; \
+ flip--; /* oops, just flip rem */ \
+ } \
+ } \
+ rem = do_div(result,div); \
+ rem = flip ? -rem : rem; \
+ if ( flip == 2) \
+ result = -result; \
+ rem; \
+})
+#endif
+
+extern inline unsigned long div_sc32(unsigned long a, unsigned long b)
+{
+ unsigned long long result = SC_32(a);
+ do_div(result, b);
+ return (unsigned long)result;
+}
+/*
+ * X = (a*b)>>24
+ * we could, (but don't) also get the part shifted out.
+ */
+
+#define mpy_sc24(a,b) mpy_sc_n(24,(a),(b))
+/*
+ * X = (a/b)<<24 or more precisely x = (a<<24)/b
+ */
+#define div_sc24(a,b) div_sc_n(24,(a),(b))
+
+/*
+ * The routines allow you to do x = ((a<< N)/b) and
+ * x=(a*b)>>N for values of N from 1 to 32.
+ *
+ * These are handy to have to do scaled math.
+ * Scaled math has two nice features:
+ * A.) A great deal more precision can be maintained by
+ * keeping more signifigant bits.
+ * B.) Often an in line div can be replaced with a mpy
+ * which is a LOT faster.
+ */
+
+/* x = (aa * bb) >> N */
+
+
+#define mpy_sc_n(N,aa,bb) ({(unsigned long)(mpy_l_X_l_ll((aa), (bb)) >> (N));})
+
+/* x = (aa << N / bb) */
+#define div_sc_n(N,aa,bb) ({unsigned long long result = SC_n((N), (aa)); \
+ do_div(result, (bb)); \
+ (long)result;})
+
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ * Well, it would if done in asm, this code just truncates..
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem((a),(b),(c))
+
+/* x = divs / div; *rem = divs % div; */
+extern inline unsigned long div_ll_X_l_rem(unsigned long long divs,
+ unsigned long div,
+ unsigned long * rem)
+{
+ unsigned long long it = divs;
+ *rem = do_div(it, div);
+ return (unsigned long)it;
+}
+/*
+ * same as above, but no remainder
+ */
+extern inline unsigned long div_ll_X_l(unsigned long long divs,
+ unsigned long div)
+{
+ unsigned long long it = divs;
+ do_div(it, div);
+ return (unsigned long)it;
+}
+/*
+ * (long)X = (((long)divh<<32) | (long)divl) / (long)div
+ * (long)rem = (((long)divh<<32) % (long)divl) / (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ * Well, it would if done in asm, this code just truncates..
+ */
+extern inline unsigned long div_h_or_l_X_l_rem(unsigned long divh,
+ unsigned long divl,
+ unsigned long div,
+ unsigned long* rem)
+{
+ unsigned long long result = SC_32(divh) + (divl);
+
+ return div_ll_X_l_rem(result, (div), (rem));
+
+}
+#else
+/* The 64-bit long version */
+
+/*
+ * The 64-bit long machine can do most of these in native C. We assume that
+ * the "long long" of 32-bit machines is typedefed away so the we need only
+ * deal with longs. This code should be tight enought that asm code is not
+ * needed.
+ */
+
+/*
+ * mpy a long by a long and return a long
+ */
+
+extern inline unsigned long mpy_l_X_l_ll(unsigned long mpy1, unsigned long mpy2)
+{
+
+ return (mpy1) * (mpy2);
+
+}
+/*
+ * mpy a long by a long and return the low part and a separate hi part
+ * This code always returns 32 values... may not be what you want...
+ */
+
+
+extern inline unsigned long mpy_l_X_l_h(unsigned long mpy1,
+ unsigned long mpy2,
+ unsigned long *hi)
+{
+ unsigned long it = mpy1 * mpy2;
+ *hi = (it >> 32);
+ return it & 0xffffffff;
+
+}
+/*
+ * This routine preforms the following calculation:
+ *
+ * X = (a*b)>>32
+ * we could, (but don't) also get the part shifted out.
+ */
+extern inline unsigned long mpy_sc32(unsigned long a, unsigned long b)
+{
+ return (mpy1 * mpy2) >> 32);
+}
+/*
+ * X = (a/b)<<32 or more precisely x = (a<<32)/b
+ */
+
+extern inline long div_sc32(long a, long b)
+{
+ return SC_32(a) / (b);
+}
+/*
+ * X = (a*b)>>24
+ * we could, (but don't) also get the part shifted out.
+ */
+
+#define mpy_sc24(a,b) mpy_sc_n(24,a,b)
+/*
+ * X = (a/b)<<24 or more precisely x = (a<<24)/b
+ */
+#define div_sc24(a,b) div_sc_n(24,a,b)
+
+/*
+ * The routines allow you to do x = ((a<< N)/b) and
+ * x=(a*b)>>N for values of N from 1 to 32.
+ *
+ * These are handy to have to do scaled math.
+ * Scaled math has two nice features:
+ * A.) A great deal more precision can be maintained by
+ * keeping more signifigant bits.
+ * B.) Often an in line div can be replaced with a mpy
+ * which is a LOT faster.
+ */
+
+/* x = (aa * bb) >> N */
+
+
+#define mpy_sc_n(N,aa,bb) ((aa) * (bb)) >> N)
+
+/* x = (aa << N / bb) */
+#define div_sc_n(N,aa,bb) (SC_n((N), (aa)) / (bb))
+
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ * Well, it would if done in asm, this code just truncates..
+ */
+#define div_long_long_rem(a,b,c) div_ll_X_l_rem(a, b, c)
+
+/* x = divs / div; *rem = divs % div; */
+extern inline unsigned long div_ll_X_l_rem(unsigned long divs,
+ unsigned long div,
+ unsigned long * rem)
+{
+ *rem = divs % div;
+ return divs / div;
+}
+/*
+ * same as above, but no remainder
+ */
+extern inline unsigned long div_ll_X_l(unsigned long divs,
+ unsigned long div)
+{
+ return divs / div;
+}
+/*
+ * (long)X = (((long)divh<<32) | (long)divl) / (long)div
+ * (long)rem = (((long)divh<<32) % (long)divl) / (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ * Well, it would if done in asm, this code just truncates..
+ */
+extern inline unsigned long div_h_or_l_X_l_rem(unsigned long divh,
+ unsigned long divl,
+ unsigned long div,
+ unsigned long* rem)
+{
+ long result = SC_32(divh) + divl;
+
+ return div_ll_X_l_rem(result, div, rem);
+
+}
+#endif // else(BITS_PER_LONG < 64)
+#endif
diff -urP -I '\$Id:.*Exp \$' -X /usr/src/patch.exclude linux-2.5.64-kb/include/asm-i386/sc_math.h linux/include/asm-i386/sc_math.h
--- linux-2.5.64-kb/include/asm-i386/sc_math.h 1969-12-31 16:00:00.000000000 -0800
+++ linux/include/asm-i386/sc_math.h 2003-03-10 16:41:52.000000000 -0800
@@ -0,0 +1,152 @@
+#ifndef SC_MATH
+#define SC_MATH
+
+#define MATH_STR(X) #X
+#define MATH_NAME(X) X
+
+
+/*
+ * Pre scaling defines
+ */
+#define SC_32(x) ((unsigned long long)x<<32)
+#define SC_n(n,x) (((unsigned long long)x)<<n)
+/*
+ * This routine preforms the following calculation:
+ *
+ * X = (a*b)>>32
+ * we could, (but don't) also get the part shifted out.
+ */
+extern inline unsigned long
+mpy_sc32(unsigned long a, unsigned long b)
+{
+ long edx;
+ __asm__("mull %2":"=a"(a), "=d"(edx)
+ : "rm"(b), "0"(a));
+ return edx;
+}
+/*
+ * X = (a/b)<<32 or more precisely x = (a<<32)/b
+ */
+
+extern inline unsigned long
+div_sc32(unsigned long a, unsigned long b)
+{
+ unsigned long dum;
+ __asm__("divl %2":"=a"(b), "=d"(dum)
+ : "r"(b), "0"(0), "1"(a));
+
+ return b;
+}
+/*
+ * X = (a*b)>>24
+ * we could, (but don't) also get the part shifted out.
+ */
+
+#define mpy_sc24(a,b) mpy_sc_n(24,a,b)
+/*
+ * X = (a/b)<<24 or more precisely x = (a<<24)/b
+ */
+#define div_sc24(a,b) div_sc_n(24,a,b)
+
+/*
+ * The routines allow you to do x = (a/b) << N and
+ * x=(a*b)>>N for values of N from 1 to 32.
+ *
+ * These are handy to have to do scaled math.
+ * Scaled math has two nice features:
+ * A.) A great deal more precision can be maintained by
+ * keeping more signifigant bits.
+ * B.) Often an in line div can be repaced with a mpy
+ * which is a LOT faster.
+ */
+
+#define mpy_sc_n(N,aa,bb) ({unsigned long edx, a=aa, b=bb; \
+ __asm__("mull %2\n\t" \
+ "shldl $(32-"MATH_STR(N)"), %0, %1" \
+ :"=a" (a), "=d" (edx)\
+ :"rm" (b), \
+ "0" (a)); edx;})
+
+#define div_sc_n(N,aa,bb) ({unsigned long dum=aa, dum2, b=bb; \
+ __asm__("shrdl $(32-"MATH_STR(N)"), %4, %3\n\t" \
+ "sarl $(32-"MATH_STR(N)"), %4\n\t" \
+ "divl %2" \
+ :"=a" (dum2), "=d" (dum) \
+ :"rm" (b), "0" (0), "1" (dum)); dum2;})
+
+/*
+ * (long)X = ((long long)divs) / (long)div
+ * (long)rem = ((long long)divs) % (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+#define div_long_long_rem(a, b, c) div_ll_X_l_rem(a, b, c)
+
+extern inline unsigned long
+div_ll_X_l_rem(unsigned long long divs, unsigned long div, unsigned long *rem)
+{
+ unsigned long dum2;
+ __asm__("divl %2":"=a"(dum2), "=d"(*rem)
+ : "rm"(div), "A"(divs));
+
+ return dum2;
+
+}
+/*
+ * same as above, but no remainder
+ */
+extern inline unsigned long
+div_ll_X_l(unsigned long long divs, unsigned long div)
+{
+ unsigned long dum;
+ return div_ll_X_l_rem(divs, div, &dum);
+}
+/*
+ * (long)X = (((long)divh<<32) | (long)divl) / (long)div
+ * (long)rem = (((long)divh<<32) % (long)divl) / (long)div
+ *
+ * Warning, this will do an exception if X overflows.
+ */
+extern inline unsigned long
+div_h_or_l_X_l_rem(unsigned long divh, unsigned long divl,
+ unsigned long div, unsigned long *rem)
+{
+ unsigned long dum2;
+ __asm__("idivl %2":"=a"(dum2), "=d"(*rem)
+ : "rm"(div), "0"(divl), "1"(divh));
+
+ return dum2;
+
+}
+extern inline unsigned long long
+mpy_l_X_l_ll(unsigned long mpy1, unsigned long mpy2)
+{
+ unsigned long long eax;
+ __asm__("mull %1\n\t":"=A"(eax)
+ : "rm"(mpy2), "a"(mpy1));
+
+ return eax;
+
+}
+extern inline unsigned long
+mpy_l_X_l_h(unsigned long mpy1, unsigned long mpy2, unsigned long *hi)
+{
+ long eax;
+ __asm__("mull %2\n\t":"=a"(eax), "=d"(*hi)
+ : "rm"(mpy2), "0"(mpy1));
+
+ return eax;
+
+}
+/*
+ * mpy a unsigned long long by a unsigned long and return a unsigned long long
+ */
+extern inline unsigned long long
+mpy_ll_X_l_ll(unsigned long long mpy1, unsigned long mpy2)
+{
+ unsigned long long result = mpy_l_X_l_ll((unsigned long)mpy1, mpy2);
+ result += (mpy_l_X_l_ll((long)(mpy1 >> 32), mpy2) << 32);
+ return result;
+}
+
+#endif
diff -urP -I '\$Id:.*Exp \$' -X /usr/src/patch.exclude linux-2.5.64-kb/include/linux/thread_info.h linux/include/linux/thread_info.h
--- linux-2.5.64-kb/include/linux/thread_info.h 2002-12-11 06:25:32.000000000 -0800
+++ linux/include/linux/thread_info.h 2003-03-10 16:39:52.000000000 -0800
@@ -12,7 +12,7 @@
*/
struct restart_block {
long (*fn)(struct restart_block *);
- unsigned long arg0, arg1, arg2;
+ unsigned long arg0, arg1, arg2, arg3;
};

extern long do_no_restart_syscall(struct restart_block *parm);
diff -urP -I '\$Id:.*Exp \$' -X /usr/src/patch.exclude linux-2.5.64-kb/kernel/posix-timers.c linux/kernel/posix-timers.c
--- linux-2.5.64-kb/kernel/posix-timers.c 2003-03-05 15:10:40.000000000 -0800
+++ linux/kernel/posix-timers.c 2003-03-11 00:38:41.000000000 -0800
@@ -23,6 +23,7 @@
#include <linux/compiler.h>
#include <linux/idr.h>
#include <linux/posix-timers.h>
+#include <asm/sc_math.h>

#ifndef div_long_long_rem
#include <asm/div64.h>
@@ -183,7 +184,7 @@
__initcall(init_posix_timers);

static inline int
-tstojiffie(struct timespec *tp, int res, unsigned long *jiff)
+tstojiffie(struct timespec *tp, int res, u64 *jiff)
{
unsigned long sec = tp->tv_sec;
long nsec = tp->tv_nsec + res - 1;
@@ -203,7 +204,7 @@
* below. Here it is enough to just discard the high order
* bits.
*/
- *jiff = HZ * sec;
+ *jiff = mpy_l_X_l_ll(HZ, sec);
/*
* Do the res thing. (Don't forget the add in the declaration of nsec)
*/
@@ -221,9 +222,12 @@
static void
tstotimer(struct itimerspec *time, struct k_itimer *timer)
{
+ u64 result;
int res = posix_clocks[timer->it_clock].res;
- tstojiffie(&time->it_value, res, &timer->it_timer.expires);
- tstojiffie(&time->it_interval, res, &timer->it_incr);
+ tstojiffie(&time->it_value, res, &result);
+ timer->it_timer.expires = (unsigned long)result;
+ tstojiffie(&time->it_interval, res, &result);
+ timer->it_incr = (unsigned long)result;
}

static void
@@ -1020,6 +1024,9 @@
* Note also that the while loop assures that the sub_jiff_offset
* will be less than a jiffie, thus no need to normalize the result.
* Well, not really, if called with ints off :(
+
+ * HELP, this code should make an attempt at resolution beyond the
+ * jiffie. Trouble is this is "arch" dependent...
*/

int
@@ -1208,6 +1215,7 @@
struct timespec t;
struct timer_list new_timer;
struct abs_struct abs_struct = { .list = { .next = 0 } };
+ u64 rq_time = 0;
int abs;
int rtn = 0;
int active;
@@ -1226,11 +1234,12 @@
* time and continue.
*/
restart_block->fn = do_no_restart_syscall;
- if (!restart_block->arg2)
- return -EINTR;

- new_timer.expires = restart_block->arg2;
- if (time_before(new_timer.expires, jiffies))
+ rq_time = restart_block->arg3;
+ rq_time = (rq_time << 32) + restart_block->arg2;
+ if (!rq_time)
+ return -EINTR;
+ if (rq_time <= get_jiffies_64())
return 0;
}

@@ -1243,37 +1252,37 @@
}
do {
t = *tsave;
- if ((abs || !new_timer.expires) &&
- !(rtn = adjust_abs_time(&posix_clocks[which_clock],
- &t, abs))) {
- /*
- * On error, we don't set up the timer so
- * we don't arm the timer so
- * del_timer_sync() will return 0, thus
- * active is zero... and so it goes.
- */
+ if (abs || !rq_time){
+ adjust_abs_time(&posix_clocks[which_clock], &t, abs);

- tstojiffie(&t,
- posix_clocks[which_clock].res,
- &new_timer.expires);
+ tstojiffie(&t, posix_clocks[which_clock].res, &rq_time);
}
- if (new_timer.expires) {
- current->state = TASK_INTERRUPTIBLE;
- add_timer(&new_timer);
-
- schedule();
+#if (BITS_PER_LONG < 64)
+ if ((rq_time - get_jiffies_64()) > MAX_JIFFY_OFFSET){
+ new_timer.expires = MAX_JIFFY_OFFSET;
+ }else
+#endif
+ {
+ new_timer.expires = (long)rq_time;
}
+ current->state = TASK_INTERRUPTIBLE;
+ add_timer(&new_timer);
+
+ schedule();
}
- while ((active = del_timer_sync(&new_timer)) &&
+ while ((active = del_timer_sync(&new_timer) ||
+ rq_time > get_jiffies_64()) &&
!test_thread_flag(TIF_SIGPENDING));

+
if (abs_struct.list.next) {
spin_lock_irq(&nanosleep_abs_list_lock);
list_del(&abs_struct.list);
spin_unlock_irq(&nanosleep_abs_list_lock);
}
if (active) {
- unsigned long jiffies_f = jiffies;
+ s64 left;
+ unsigned long rmd;

/*
* Always restart abs calls from scratch to pick up any
@@ -1282,20 +1291,19 @@
if (abs)
return -ERESTARTNOHAND;

- jiffies_to_timespec(new_timer.expires - jiffies_f, tsave);
+ left = rq_time - get_jiffies_64();
+ if (left < 0)
+ return 0;
+
+ tsave->tv_sec = div_long_long_rem(left, HZ, &rmd);
+ tsave->tv_nsec = rmd * (NSEC_PER_SEC / HZ);

- while (tsave->tv_nsec < 0) {
- tsave->tv_nsec += NSEC_PER_SEC;
- tsave->tv_sec--;
- }
- if (tsave->tv_sec < 0) {
- tsave->tv_sec = 0;
- tsave->tv_nsec = 1;
- }
restart_block->fn = clock_nanosleep_restart;
restart_block->arg0 = which_clock;
restart_block->arg1 = (unsigned long)tsave;
- restart_block->arg2 = new_timer.expires;
+ restart_block->arg2 = rq_time & 0xffffffffLL;
+ restart_block->arg3 = rq_time >> 32;
+
return -ERESTART_RESTARTBLOCK;
}

Binary files linux-2.5.64-kb/lib/gen_crc32table and linux/lib/gen_crc32table differ
Binary files linux-2.5.64-kb/scripts/docproc and linux/scripts/docproc differ
Binary files linux-2.5.64-kb/scripts/fixdep and linux/scripts/fixdep differ
Binary files linux-2.5.64-kb/scripts/kallsyms and linux/scripts/kallsyms differ
Binary files linux-2.5.64-kb/scripts/mk_elfconfig and linux/scripts/mk_elfconfig differ
Binary files linux-2.5.64-kb/scripts/modpost and linux/scripts/modpost differ
Binary files linux-2.5.64-kb/usr/gen_init_cpio and linux/usr/gen_init_cpio differ
Binary files linux-2.5.64-kb/usr/initramfs_data.cpio.gz and linux/usr/initramfs_data.cpio.gz differ

Attachments:

posix-longsleep-2.5.64-1.0.patch (24.95 kB)

2003-03-11 22:39:06

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

2003-03-11 22:54:15

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Tue, 11 Mar 2003, Andrew Morton wrote:
>
> gcc will generate 64bit * 64bit multiplies without resorting to
> any library code

However, gcc is unable to do-the-right-thing and generate 32x32->64
multiplies, or 32x64->64 multiplies, even though those are both a _lot_
faster than the full 64x64->64 case.

And in quite a _lot_ of cases, that's actually what you want. It might
actually make sense to add a "do_mul()" thing to allow architectures to do
these cases right, since gcc doesn't.

> and you can probably do the division with do_div().

Yes. This is the same issue - gcc will always promote a 64-bit divide to
be _fully_ 64-bit, even if the mixed-size 64/32 -> [64,32] case is much
faster and simpler. Which is why do_div() exists in the first place.

Linus

2003-03-11 23:03:30

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Linus Torvalds <[email protected]> wrote:
>
>
> On Tue, 11 Mar 2003, Andrew Morton wrote:
> >
> > gcc will generate 64bit * 64bit multiplies without resorting to
> > any library code
>
> However, gcc is unable to do-the-right-thing and generate 32x32->64
> multiplies, or 32x64->64 multiplies, even though those are both a _lot_
> faster than the full 64x64->64 case.

2.95.3 and 3.2.1 seem to do it right?

long a;
long b;
long long c;

void foo(void)
{
c = a * b;
}

.file "t.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl foo
.type foo,@function
foo:
pushl %ebp
movl %esp,%ebp
movl a,%eax
imull b,%eax
movl %eax,c
cltd
movl %edx,c+4
.L2:
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size foo,.Lfe1-foo
.comm a,4,4
.comm b,4,4
.comm c,8,4
.ident "GCC: (GNU) 2.95.3 20010315 (release)"

2003-03-11 23:24:57

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Tue, 11 Mar 2003, Andrew Morton wrote:
>
> 2.95.3 and 3.2.1 seem to do it right?

Try the "64x32->64" version. gcc didn't use to get that one right, but
maybe it does now.

Linus

2003-03-11 23:25:52

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Linus Torvalds wrote:
>
> On Tue, 11 Mar 2003, Andrew Morton wrote:
>
>>gcc will generate 64bit * 64bit multiplies without resorting to
>>any library code
>
>
> However, gcc is unable to do-the-right-thing and generate 32x32->64
> multiplies, or 32x64->64 multiplies, even though those are both a _lot_
> faster than the full 64x64->64 case.
>
> And in quite a _lot_ of cases, that's actually what you want. It might
> actually make sense to add a "do_mul()" thing to allow architectures to do
> these cases right, since gcc doesn't.
>
>
>>and you can probably do the division with do_div().
>
>
> Yes. This is the same issue - gcc will always promote a 64-bit divide to
> be _fully_ 64-bit, even if the mixed-size 64/32 -> [64,32] case is much
> faster and simpler. Which is why do_div() exists in the first place.

Often the 64/32 -> [32,32] is all that is needed and that is even
faster if we could get to it.
>
> Linus
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-03-11 23:28:59

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Linus Torvalds <[email protected]> wrote:
>
>
> On Tue, 11 Mar 2003, Andrew Morton wrote:
> >
> > 2.95.3 and 3.2.1 seem to do it right?
>
> Try the "64x32->64" version. gcc didn't use to get that one right, but
> maybe it does now.
>

long a;
long long b;
long long c;

void foo(void)
{
c = a * b;
}

It seems to get that wrong. At least, there are three multiplies in there.
3.2.1 is similar.

.file "t.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl foo
.type foo,@function
foo:
pushl %ebp
movl %esp,%ebp
subl $16,%esp
pushl %esi
pushl %ebx
movl a,%eax
movl %eax,%ebx
movl %eax,%esi
sarl $31,%esi
movl %ebx,%eax
mull b
movl %eax,-8(%ebp)
movl %edx,-4(%ebp)
movl %ebx,%eax
imull b+4,%eax
addl %eax,%edx
movl %edx,-4(%ebp)
movl b,%eax
imull %esi,%eax
addl %eax,-4(%ebp)
movl -8(%ebp),%eax
movl -4(%ebp),%edx
movl %eax,c
movl %edx,c+4
popl %ebx
popl %esi
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size foo,.Lfe1-foo
.comm a,4,4
.comm b,8,4
.comm c,8,4
.ident "GCC: (GNU) 2.95.3 20010315 (release)"

2003-03-11 23:36:08

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Andrew Morton wrote:
> Linus Torvalds <[email protected]> wrote:
>
>>
>>On Tue, 11 Mar 2003, Andrew Morton wrote:
>>
>>>2.95.3 and 3.2.1 seem to do it right?
>>
>>Try the "64x32->64" version. gcc didn't use to get that one right, but
>>maybe it does now.
>>
>
>
>
> long a;
> long long b;
> long long c;
>
> void foo(void)
> {
> c = a * b;
> }
>
> It seems to get that wrong. At least, there are three multiplies in there.
> 3.2.1 is similar.
>

You might try it unsigned. There are issues with the signed version.
>
> .file "t.c"
> .version "01.01"
> gcc2_compiled.:
> .text
> .align 4
> .globl foo
> .type foo,@function
> foo:
> pushl %ebp
> movl %esp,%ebp
> subl $16,%esp
> pushl %esi
> pushl %ebx
> movl a,%eax
> movl %eax,%ebx
> movl %eax,%esi
> sarl $31,%esi
> movl %ebx,%eax
> mull b
> movl %eax,-8(%ebp)
> movl %edx,-4(%ebp)
> movl %ebx,%eax
> imull b+4,%eax
> addl %eax,%edx
> movl %edx,-4(%ebp)
> movl b,%eax
> imull %esi,%eax
> addl %eax,-4(%ebp)
> movl -8(%ebp),%eax
> movl -4(%ebp),%edx
> movl %eax,c
> movl %edx,c+4
> popl %ebx
> popl %esi
> movl %ebp,%esp
> popl %ebp
> ret
> .Lfe1:
> .size foo,.Lfe1-foo
> .comm a,4,4
> .comm b,8,4
> .comm c,8,4
> .ident "GCC: (GNU) 2.95.3 20010315 (release)"
>
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-03-11 23:38:06

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Tue, 11 Mar 2003, Andrew Morton wrote:
>
> 2.95.3 and 3.2.1 seem to do it right?

Well, they do, but your test case is wrong:

> long a;
> long b;
> long long c;
>
> void foo(void)
> {
> c = a * b;

This is just a 32*32->32 multiply, with a final sign extension to 64 bits.

You need to do

c = (long long) a * b;

to get a 32*32->64 multiply. And yes, gcc gets that case right.

Linus

2003-03-12 00:38:25

by Matti Aarnio

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

On Tue, Mar 11, 2003 at 03:02:31PM -0800, Linus Torvalds wrote:
> On Tue, 11 Mar 2003, Andrew Morton wrote:
> > gcc will generate 64bit * 64bit multiplies without resorting to
> > any library code
>
> However, gcc is unable to do-the-right-thing and generate 32x32->64
> multiplies, or 32x64->64 multiplies, even though those are both a _lot_
> faster than the full 64x64->64 case.
>
> And in quite a _lot_ of cases, that's actually what you want. It might
> actually make sense to add a "do_mul()" thing to allow architectures to do
> these cases right, since gcc doesn't.

Some architectures have a bit stricter limitations -- S390 limits divisor
to 2^31-1, for example.

A number of systems simply flaunt the task, and instead implement
mere 32/32 division in do_div().
(arm, cris, m68knommu, sh (?), sparc(32), v850)

The original pure C code to do 64/32 division to 64/32 results is
very much gone in favour of architecture specific assembly codes
(where system isn't 64 bit one already..)
Ah, include/asm-parisc/div64.h still has it in 2.5.64 sources..

> > and you can probably do the division with do_div().

If you need arbitrary divisions at all. Filesystems for example
can (in most cases) do with power-of-two divisions, e.g.: LL >> count
You may, perhaps, need to pre-calculate a number of those shift-counts
when mounting a filesystem.

> Yes. This is the same issue - gcc will always promote a 64-bit divide to
> be _fully_ 64-bit, even if the mixed-size 64/32 -> [64,32] case is much
> faster and simpler. Which is why do_div() exists in the first place.

Originally it was lib/vsprintf.c's internal (and very portable)
divide numerator by small base, produce changed numerator, and
remainder... The innermost element in arbitrary base number printing.

In 2.5 there is some odd: #define sector_div(a, b) do_div(a, b)
(only with CONFIG_LDB), and usage in jiffie-to-clock conversion...
... and all over the code in various odd nooks, XFS filesystem included...

> Linus

/Matti Aarnio

2003-03-12 01:52:20

by Jamie Lokier

[permalink] [raw]

Subject: Re: Runaway cron task on 2.5.63/4 bk?

Andrew Morton wrote:
> > However, gcc is unable to do-the-right-thing and generate 32x32->64
> > multiplies, or 32x64->64 multiplies, even though those are both a _lot_
> > faster than the full 64x64->64 case.
>
> 2.95.3 and 3.2.1 seem to do it right?
>
> long a;
> long b;
> long long c;
>
> void foo(void)
> {
> c = a * b;
> }

Your code is wrong for this test. It does a 32x32->32 multiply, and
then sign extends the result to 64 bits.

The correct test has "c = (long long) a * b;".

-- Jamie

2003-03-12 03:35:27

[permalink] [raw]

Subject: [PATCH] Re: Runaway cron task on 2.5.63/4 bk?

Attachments:

hrtimers-large-2.5.64-1.1.patch (4.86 kB)

2003-03-12 04:46:20

[permalink] [raw]

Subject: Re: [PATCH] Re: Runaway cron task on 2.5.63/4 bk?

george anzinger <[email protected]> wrote:
>
> Ok, here is what I have. I changed nano sleep to use a local 64-bit
> value for the target expire time in jiffies. As much as MAX-INT/2-1
> will be put in the timer at any one time. It loops till the target
> time is met or exceeded. The changes affect (clock)nanosleep only and
> not timers (they still error out for large values).

Seem sane.

> I now use the simple u64=(long long) a * b for the mpy so I have
> dropped the sc_math.h stuff (I will bring that round again :).

Resistance shall be unflagging!

> What do you think?

Sorry, but this little bit:

while ((active = del_timer_sync(&new_timer) ||
rq_time > get_jiffies_64()) &&
!test_thread_flag(TIF_SIGPENDING));

if (abs_struct.list.next) {
spin_lock_irq(&nanosleep_abs_list_lock);
list_del(&abs_struct.list);
spin_unlock_irq(&nanosleep_abs_list_lock);
}
if (active) {

should be dragged out and mercifully shot. Is it possible to make that while
loop a little clearer?

The abs_list exactly duplicates the kernel's existing waitqueue
functionality. You can use prepare_to_wait()/finish_wait() there.

posix_timers_id, posix_clocks[], nanosleep_abs_list_lock and
nanosleep_abs_list should be static to posix-timers.c.

2003-03-12 09:59:38