2002-08-20 09:22:31

by Johan Adolfsson

[permalink] [raw]
Subject: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

The cris architecture don't have any tsc, but it has a couple of
timer registers that can be used to get better than jiffie resolution.

I set the time to a 40 us resolution counter with a slight
"jump" since lower 8 bit only counts from 0 to 249,
the patch does not take wrapping of the register into account either
to save some cycles, is that a problem or a good thing?

The num is xor:d with the value from 2 timer registers,
which in turn contains different fields breifly described below.

Does the patch below look sane?

/Johan Adolfsson


--- random.c 7 Dec 2001 16:53:17 -0000 1.10
+++ random.c 20 Aug 2002 09:10:04 -0000
@@ -746,6 +746,15 @@ static void add_timer_randomness(struct
__u32 high;
rdtsc(time, high);
num ^= high;
+#elif defined (__CRIS__)
+ /* R_TIMER0_DATA, 8 bit, 40 us resolution, counting down from 250 */
+ /* R_TIMER_DATA, 4*8 bit, timer1, timer0, 38.4kHz, 7.3728MHz */
+ /* R_PRESCALE_STATUS, upper 16 bit: 320ns resolution,
+ lower 16 bit: 40 ns resolution, ~10 bits used,
+ counting down from 1000 */
+ time = jiffies << 8;
+ time |= (TIMER0_DIV - *R_TIMER0_DATA);
+ num ^= *R_PRESCALE_STATUS ^ *R_TIMER_DATA;
#else
time = jiffies;
#endif



2002-08-20 13:59:50

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

On Tue, Aug 20, 2002 at 11:31:10AM +0200, [email protected] wrote:
> The cris architecture don't have any tsc, but it has a couple of
> timer registers that can be used to get better than jiffie resolution.
>
> I set the time to a 40 us resolution counter with a slight
> "jump" since lower 8 bit only counts from 0 to 249,
> the patch does not take wrapping of the register into account either
> to save some cycles, is that a problem or a good thing?

That should be fine. More important is actually scaling the entropy
count based on the timing granularity of the source. Keyboards and
mice tend to have a granularity of about 1khz so timestamps better
than milliseconds 'invent' entropy in the current code.

> The num is xor:d with the value from 2 timer registers,
> which in turn contains different fields breifly described below.
>
> Does the patch below look sane?

Looks fine, but I think we want to come up with a cleaner scheme of
having per-arch high-res timestamps. I'd hate to have that grow to
several pages of ifdefs and not have it available anywhere else.

> +++ random.c 20 Aug 2002 09:10:04 -0000
> @@ -746,6 +746,15 @@ static void add_timer_randomness(struct
> __u32 high;
> rdtsc(time, high);
> num ^= high;
> +#elif defined (__CRIS__)
> + /* R_TIMER0_DATA, 8 bit, 40 us resolution, counting down from 250 */
> + /* R_TIMER_DATA, 4*8 bit, timer1, timer0, 38.4kHz, 7.3728MHz */
> + /* R_PRESCALE_STATUS, upper 16 bit: 320ns resolution,
> + lower 16 bit: 40 ns resolution, ~10 bits used,
> + counting down from 1000 */
> + time = jiffies << 8;
> + time |= (TIMER0_DIV - *R_TIMER0_DATA);
> + num ^= *R_PRESCALE_STATUS ^ *R_TIMER_DATA;
> #else
> time = jiffies;
> #endif

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-08-20 16:23:51

by Johan Adolfsson

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())


From: "Oliver Xymoron" <[email protected]>
> On Tue, Aug 20, 2002 at 11:31:10AM +0200, [email protected] wrote:
> > The cris architecture don't have any tsc, but it has a couple of
> > timer registers that can be used to get better than jiffie resolution.
> >
> > I set the time to a 40 us resolution counter with a slight
> > "jump" since lower 8 bit only counts from 0 to 249,
> > the patch does not take wrapping of the register into account either
> > to save some cycles, is that a problem or a good thing?
>
> That should be fine. More important is actually scaling the entropy
> count based on the timing granularity of the source. Keyboards and
> mice tend to have a granularity of about 1khz so timestamps better
> than milliseconds 'invent' entropy in the current code.

The ETRAX chips where the cris architecture is used is typically used in
headless embedded devices connected to a network. Currently I don't think
we use SA_RANDOM anywhere in our device drivers although it would be
nice to be able to use network and other interfaces as entropy/randomness
source (serial, parallel etc.) without to much concerns.

> > The num is xor:d with the value from 2 timer registers,
> > which in turn contains different fields breifly described below.
> >
> > Does the patch below look sane?
>
> Looks fine, but I think we want to come up with a cleaner scheme of
> having per-arch high-res timestamps. I'd hate to have that grow to
> several pages of ifdefs and not have it available anywhere else.

Yes, I've seen the discussion before.
Any idea of how such a solution should look like?
Put an inline function or macro in asm/timex.h (?) together with an
ARCH_HAS_RANDOM_TIMESTAMP define?

E.g. like this for i386:
#define ARCH_HAS_RANDOM_TIMESTAMP
#define RANDOM_TIMESTAMP(time, num) do{\
if ( test_bit(X86_FEATURE_TSC, &boot_cpu_data.x86_capability) ) { \
__u32 high; \
rdtsc(time, high); \
num ^= high; \
} else { \
time = jiffies; \
} \
}while(0)

And then in random.c:
ifdef ARCH_HAS_RANDOM_TIMESTAMP
RANDOM_TIMESTAMP(time, num);
#else
time = jiffies;
#endif

/Johan


2002-08-20 17:02:07

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

On Tue, Aug 20, 2002 at 06:32:29PM +0200, [email protected] wrote:
>
> From: "Oliver Xymoron" <[email protected]>
> > On Tue, Aug 20, 2002 at 11:31:10AM +0200, [email protected] wrote:
> > > The cris architecture don't have any tsc, but it has a couple of
> > > timer registers that can be used to get better than jiffie resolution.
> > >
> > > I set the time to a 40 us resolution counter with a slight
> > > "jump" since lower 8 bit only counts from 0 to 249,
> > > the patch does not take wrapping of the register into account either
> > > to save some cycles, is that a problem or a good thing?
> >
> > That should be fine. More important is actually scaling the entropy
> > count based on the timing granularity of the source. Keyboards and
> > mice tend to have a granularity of about 1khz so timestamps better
> > than milliseconds 'invent' entropy in the current code.
>
> The ETRAX chips where the cris architecture is used is typically used in
> headless embedded devices connected to a network. Currently I don't think
> we use SA_RANDOM anywhere in our device drivers although it would be
> nice to be able to use network and other interfaces as entropy/randomness
> source (serial, parallel etc.) without to much concerns.

See my recent (lengthy) posts on this subject.

> > > The num is xor:d with the value from 2 timer registers,
> > > which in turn contains different fields breifly described below.
> > >
> > > Does the patch below look sane?
> >
> > Looks fine, but I think we want to come up with a cleaner scheme of
> > having per-arch high-res timestamps. I'd hate to have that grow to
> > several pages of ifdefs and not have it available anywhere else.
>
> Yes, I've seen the discussion before.
> Any idea of how such a solution should look like?
> Put an inline function or macro in asm/timex.h (?) together with an
> ARCH_HAS_RANDOM_TIMESTAMP define?

I don't think we want to make it specific to random. I think we just
want to call it hires.

> E.g. like this for i386:
> #define ARCH_HAS_RANDOM_TIMESTAMP
> #define RANDOM_TIMESTAMP(time, num) do{\
> if ( test_bit(X86_FEATURE_TSC, &boot_cpu_data.x86_capability) ) { \
> __u32 high; \
> rdtsc(time, high); \
> num ^= high; \
> } else { \
> time = jiffies; \
> } \
> }while(0)
>
> And then in random.c:
> ifdef ARCH_HAS_RANDOM_TIMESTAMP
> RANDOM_TIMESTAMP(time, num);
> #else
> time = jiffies;
> #endif

Again, too random-specific. And we need a way to get the timescale.
Perhaps something like:

speed=get_timestamp_khz;
lowbits=get_hires_timestamp();

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-08-20 17:25:31

by Johan Adolfsson

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

> > Put an inline function or macro in asm/timex.h (?) together with an
> > ARCH_HAS_RANDOM_TIMESTAMP define?
>
> I don't think we want to make it specific to random. I think we just
> want to call it hires.
>
> > E.g. like this for i386:
> > #define ARCH_HAS_RANDOM_TIMESTAMP
> > #define RANDOM_TIMESTAMP(time, num) do{\
..
> > And then in random.c:
> > ifdef ARCH_HAS_RANDOM_TIMESTAMP
> > RANDOM_TIMESTAMP(time, num);
> > #else
> > time = jiffies;
> > #endif
>
> Again, too random-specific. And we need a way to get the timescale.
> Perhaps something like:
>
> speed=get_timestamp_khz;
> lowbits=get_hires_timestamp();

But isn't the "num ^= high;" a way to improve the randomness
and the high value doesn't really need to be linear to the time?
Thus it would be random-specific and not just a timestamp.

/Johan


2002-08-20 17:59:02

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

On Tue, Aug 20, 2002 at 07:34:02PM +0200, [email protected] wrote:
> > > Put an inline function or macro in asm/timex.h (?) together with an
> > > ARCH_HAS_RANDOM_TIMESTAMP define?
> >
> > I don't think we want to make it specific to random. I think we just
> > want to call it hires.
> >
> > > E.g. like this for i386:
> > > #define ARCH_HAS_RANDOM_TIMESTAMP
> > > #define RANDOM_TIMESTAMP(time, num) do{\
> ..
> > > And then in random.c:
> > > ifdef ARCH_HAS_RANDOM_TIMESTAMP
> > > RANDOM_TIMESTAMP(time, num);
> > > #else
> > > time = jiffies;
> > > #endif
> >
> > Again, too random-specific. And we need a way to get the timescale.
> > Perhaps something like:
> >
> > speed=get_timestamp_khz;
> > lowbits=get_hires_timestamp();
>
> But isn't the "num ^= high;" a way to improve the randomness
> and the high value doesn't really need to be linear to the time?

No, the high order bits aren't very interesting at all. Don't worry
about that bit, it's just cuteness.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-08-20 19:12:16

by Johan Adolfsson

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

> > > Perhaps something like:
> > >
> > > speed=get_timestamp_khz;
> > > lowbits=get_hires_timestamp();
> >
> > But isn't the "num ^= high;" a way to improve the randomness
> > and the high value doesn't really need to be linear to the time?
>
> No, the high order bits aren't very interesting at all. Don't worry
> about that bit, it's just cuteness.

ok:-)
But if the the timestamp doesn't have to be entirely accurate
the name should perhaps reflect that, since at least on the cris arch
you can save some cycles if you can accept a glitch now and then.
We want to be able to separate it from similar functions used
for high-res timers etc. which need a more accurate function.

/Johan


2002-08-20 19:30:11

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

On Tue, Aug 20, 2002 at 09:20:50PM +0200, [email protected] wrote:
> > > > Perhaps something like:
> > > >
> > > > speed=get_timestamp_khz;
> > > > lowbits=get_hires_timestamp();
> > >
> > > But isn't the "num ^= high;" a way to improve the randomness
> > > and the high value doesn't really need to be linear to the time?
> >
> > No, the high order bits aren't very interesting at all. Don't worry
> > about that bit, it's just cuteness.
>
> ok:-)
> But if the the timestamp doesn't have to be entirely accurate
> the name should perhaps reflect that, since at least on the cris arch
> you can save some cycles if you can accept a glitch now and then.

Describe this glitch again? Bear in mind that resolution is a very
different matter than accuracy. Jiffies are not accurate in any
absolute sense, but they are monotonic. And unless you've got
interrupts shut off, even get_cycles can only giving you the time
when it stores the timestamp in the register, which may have little to
do with the time the next instruction that uses it executes.

> We want to be able to separate it from similar functions used
> for high-res timers etc. which need a more accurate function.

Bear in mind that most arches won't have this sort of difficulty so in
the bigger picture, this might be overkill. Put a comment next to the
generic function that says "might not be accurate" and when a user of
the interface comes along that actually cares about accuracy, the
situation can be reevaluated.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-08-20 22:08:56

by Johan Adolfsson

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())


> > > No, the high order bits aren't very interesting at all. Don't worry
> > > about that bit, it's just cuteness.
> >
> > ok:-)
> > But if the the timestamp doesn't have to be entirely accurate
> > the name should perhaps reflect that, since at least on the cris arch
> > you can save some cycles if you can accept a glitch now and then.
>
> Describe this glitch again? Bear in mind that resolution is a very
> different matter than accuracy. Jiffies are not accurate in any
> absolute sense, but they are monotonic. And unless you've got
> interrupts shut off, even get_cycles can only giving you the time
> when it stores the timestamp in the register, which may have little to
> do with the time the next instruction that uses it executes.

I'l try...
timer0 counts from 250 down to 1 and generates the timer interrupt
at 100Hz and starts counting from 250 again.
It's value can be read to find out when in a jiffie you are with 40 us
resolution.
The prescaler is running at 25 MHz and counting from 1000 to 1
and is used as clock input to timer0.

When reading timer0, it might have wrapped before jiffies is updated
causing the time to go back unless handled.
Unfortunatly the prescaler value is in a different register then
timer0 so that value can wrap as well compared to the timer0 value.
It's possible to solve but it takes some extra code which might be
bad to have in e.g. the network interrupt path.
So my idea was to shift the jiffies value 8 bits, add/or the 8 bit (0-250)
from the timer0 register and perhaps also 10 bits (0-1000) from
the prescaler value to get 40 ns resolution (the CPU runs at 100MHz)
and ignore the proper math to scale it and the wraps as well.
In a way the wraps will only add uncertainty anyway so I thought
it would be okey.

I just compared the generated asm:
Accurate timestamp scaled to ns: 45 instructions (resolution actually 40 ns)
Approximate 40 ns resolution: 21 instructions
Approximate 40 us resolution: 9 instructions
For comparison one syscall path (gettimeofday()) is approx 400 instructions
and the add_timer_randomness() function that only uses jiffies is 76
instructions, so mayby I'm microoptimising here?
Is it worth the cycles to get 40 ns resolution instead of 40us ?

> > We want to be able to separate it from similar functions used
> > for high-res timers etc. which need a more accurate function.
>
> Bear in mind that most arches won't have this sort of difficulty so in
> the bigger picture, this might be overkill. Put a comment next to the
> generic function that says "might not be accurate" and when a user of
> the interface comes along that actually cares about accuracy, the
> situation can be reevaluated.

Perhaps, but if some (most) architectures has a cheap accurate
implementation
my bet is that those will be assumed to be accurate for all archs and
get used all over the place and then we're in trouble...;-)

/Johan


2002-08-20 22:56:26

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())

On Wed, Aug 21, 2002 at 12:17:26AM +0200, [email protected] wrote:

> I just compared the generated asm:
> Accurate timestamp scaled to ns: 45 instructions (resolution actually 40 ns)
> Approximate 40 ns resolution: 21 instructions
> Approximate 40 us resolution: 9 instructions
> For comparison one syscall path (gettimeofday()) is approx 400 instructions
> and the add_timer_randomness() function that only uses jiffies is 76
> instructions, so mayby I'm microoptimising here?
> Is it worth the cycles to get 40 ns resolution instead of 40us ?

Seems like it's probably worth the effort. In practice, such
difference often are lost in the noise compared to cache flushes, etc.
Does the 'correct' code suffer branch penalties or the like that might
make it significantly worse than the quick code? If not, then I'd say
definitely use it.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-08-21 09:02:42

by Johan Adolfsson

[permalink] [raw]
Subject: Re: [RFC] Improved add_timer_randomness for __CRIS__ (instead of rdtsc())


From: "Oliver Xymoron" <[email protected]>
> On Wed, Aug 21, 2002 at 12:17:26AM +0200, [email protected] wrote:
>
> > I just compared the generated asm:
> > Accurate timestamp scaled to ns: 45 instructions (resolution actually 40
ns)
> > Approximate 40 ns resolution: 21 instructions
> > Approximate 40 us resolution: 9 instructions
> > For comparison one syscall path (gettimeofday()) is approx 400
instructions
> > and the add_timer_randomness() function that only uses jiffies is 76
> > instructions, so mayby I'm microoptimising here?
> > Is it worth the cycles to get 40 ns resolution instead of 40us ?
>
> Seems like it's probably worth the effort. In practice, such
> difference often are lost in the noise compared to cache flushes, etc.
> Does the 'correct' code suffer branch penalties or the like that might
> make it significantly worse than the quick code? If not, then I'd say
> definitely use it.

The correct code has two potential branches instead of one and also
need to stack one register when I have them in functions, but that might
change if the function is inlined in add_timer_randomness().
I can shave off a few instruction if I don't scale it to ns but instead
transform it to a plain 25MHz counter which is what we want anyway I guess.

BTW: I think the trust_pct approach looks nice adn I hope it gets included.

/Johan