Is there any way to measure (with microsecond accuracy) the time of a
program execution (without using Machine Specific Registers)?
I've already tried getrusage(), times() and clock() but they all have
10 millisecond accuracy, even though they claim to have microsecond
accuracy.
The only thing that seems to work is to use one of the tools that measure
performance through accessing the machine specific registers. They give you
the ability to measure the clock cycles used, but their accuracy is also
very low from what I have seen up to now.
Thank you very much in advance
--) Vangelis
Hi,
How about TSC? I know this has disadvantages such as:
a) not all machines have TSC
b) not all machines that claim to have TSC have a usable one.
c) on SMP the kernel makes a best effort to synchronize TSC but this may
or may not be guaranteed
d) you still need a userspace implementation to correctly map TSC cycles
to (micro)seconds using the appropriate MHz-specific ratios. I think
someone I know has already done this work, but I will let him say whether
he wants to release it to the public.
Other than the above, TSC (the rdtsc instruction) is perfectly available to
userspace applications without special privileges. And it is 64-bit, so it
won't easily wrap around...
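Point (d) and the rdtsc remark can be sketched roughly as follows. This is only an illustration, not the implementation alluded to above: the names wall_usec(), read_cycles(), calibrate_ticks_per_usec() and cycles_to_usec() are made up here, calibrating against gettimeofday() is an assumption, and the sketch assumes the counter ticks at a constant rate while the process stays on one CPU. On non-x86 builds it falls back to gettimeofday() so that it still compiles.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Wall-clock time in microseconds, used only as the calibration reference. */
static uint64_t wall_usec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000u + (uint64_t)tv.tv_usec;
}

/* Raw counter value: the 64-bit TSC on x86, otherwise (assumption, so the
 * sketch builds elsewhere) the wall clock in microseconds. */
static uint64_t read_cycles(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return wall_usec();
#endif
}

/* Estimate counter ticks per microsecond by spinning for roughly
 * interval_usec of wall-clock time and dividing the two deltas. */
static double calibrate_ticks_per_usec(uint64_t interval_usec)
{
    uint64_t t0, t1, c0, c1;

    if (interval_usec == 0)
        interval_usec = 1;          /* avoid dividing by a zero interval */
    t0 = wall_usec();
    c0 = read_cycles();
    do {
        t1 = wall_usec();
    } while (t1 - t0 < interval_usec);
    c1 = read_cycles();
    return (double)(c1 - c0) / (double)(t1 - t0);
}

/* Convert a cycle delta to microseconds using a calibrated ratio. */
static double cycles_to_usec(uint64_t cycles, double ticks_per_usec)
{
    return (double)cycles / ticks_per_usec;
}
```

A caller would calibrate once, for example with calibrate_ticks_per_usec(100000), then take read_cycles() before and after the code of interest and pass the difference to cycles_to_usec(). A longer calibration interval gives a steadier ratio at the cost of startup time.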
regards,
Tigran.
On Thu, 7 Dec 2000, Kotsovinos Vangelis wrote:
>
> Is there any way to measure (with microsecond accuracy) the time of a
> program execution (without using Machine Specific Registers)?
> I've already tried getrusage(), times() and clock() but they all have
> 10 millisecond accuracy, even though they claim to have microsecond
> accuracy.
> The only thing that seems to work is to use one of the tools that measure
> performance through accessing the machine specific registers. They give you
> the ability to measure the clock cycles used, but their accuracy is also
> very low from what I have seen up to now.
>
> Thank you very much in advance
>
> --) Vangelis
On Thu, 7 Dec 2000, Tigran Aivazian wrote:
> Hi,
>
> How about TSC? I know this has disadvantages such as:
>
> a) not all machines have TSC
while we are on this subject, please let me emphasize that you should
_not_ be using the cpuid instruction to detect the presence of TSC but should
parse the /proc/cpuinfo file. There are many valid reasons why Linux's
idea of TSC presence may not be the same as hardware's (cpuid
instruction) idea.
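For illustration, parsing the file for the "tsc" flag might look like the sketch below. The function names line_has_flag() and cpuinfo_has_flag() are hypothetical, the 4096-byte line buffer is an assumption (a longer flags line would be missed), and the code accepts a field called either "flags" or "features" because the field name is not something to rely on across kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` is a "flags" or "features" line that lists `flag`
 * as a whole word, else 0. The line buffer is modified by strtok(). */
static int line_has_flag(char *line, const char *flag)
{
    char *p, *tok;

    if (strncmp(line, "flags", 5) != 0 && strncmp(line, "features", 8) != 0)
        return 0;
    p = strchr(line, ':');
    if (!p)
        return 0;
    for (tok = strtok(p + 1, " \t\n"); tok; tok = strtok(NULL, " \t\n"))
        if (strcmp(tok, flag) == 0)
            return 1;
    return 0;
}

/* Scan a cpuinfo-style stream. Returns 1 if any CPU entry lists `flag`,
 * 0 if none does, -1 if the stream is NULL. */
static int cpuinfo_has_flag(FILE *f, const char *flag)
{
    char line[4096];

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        if (line_has_flag(line, flag))
            return 1;
    return 0;
}
```

Typical use would be to fopen("/proc/cpuinfo", "r"), call cpuinfo_has_flag(f, "tsc"), and fclose() the stream. Matching whole words matters, since a substring search for "tsc" would also match longer flag names that contain it.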
Regards,
Tigran
Followup to: <[email protected]>
By author: Tigran Aivazian <[email protected]>
In newsgroup: linux.dev.kernel
>
> while we are on this subject, please let me emphasize that you should
> _not_ be using the cpuid instruction to detect the presence of TSC but should
> parse the /proc/cpuinfo file. There are many valid reasons why Linux's
> idea of TSC presence may not be the same as hardware's (cpuid
> instruction) idea.
>
Unfortunately the most important instance of the in-kernel flag -- the
global one in the somewhat misnamed boot_cpu_data.x86_features --
isn't actually readable in the /proc/cpuinfo file. It is perfectly
possible (e.g. the "notsc" option) for ALL the CPUs to report this
capability, but the global capability to still be off.
I would like to have exported the global capabilities into
/proc/cpuinfo, but I'm worried about breaking software (the "flags"
versus "features" issue was bad enough -- unfortunately, in cases like
this, there probably is no "good" solution.)
-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
On 7 Dec 2000, H. Peter Anvin wrote:
> Unfortunately the most important instance of the in-kernel flag -- the
> global one in the somewhat misnamed boot_cpu_data.x86_features --
> isn't actually readable in the /proc/cpuinfo file. It is perfectly
> possible (e.g. the "notsc" option) for ALL the CPUs to report this
> capability, but the global capability to still be off.
Hmm, I recall I implemented and explicitly verified switching the
/proc/cpuinfo "tsc" flag (as well as the userland access to the TSC) off
when I wrote the code to handle the "notsc" option. Has it changed since
then? I recall you modified the code a bit -- I looked at the changes
then but I was pretty confident the semantics were preserved.
There is no possibility to have TSC and non-TSC chips mixed in a single
SMP system (due to existing hardware, even though it's possible in
theory), so there is no problem with such an asymmetry. All chips have
the "tsc" flag in /proc/cpuinfo either on or off.
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
Kotsovinos Vangelis wrote:
>
> Is there any way to measure (with microsecond accuracy) the time of a
> program execution (without using Machine Specific Registers)?
> I've already tried getrusage(), times() and clock() but they all have
> 10 millisecond accuracy, even though they claim to have microsecond
> accuracy.
> The only thing that seems to work is to use one of the tools that measure
> performance through accessing the machine specific registers. They give you
> the ability to measure the clock cycles used, but their accuracy is also
> very low from what I have seen up to now.
Can you not just use something like gettimeofday()? Do two consecutive calls to
find the execution time of the instruction itself, and then do two calls on
either side of the program execution. Subtract the instruction execution time
from the delta, and that should give a pretty good idea of execution time.
On a 400MHz G4, gettimeofday() consistently takes 2 microseconds to run.
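A minimal sketch of that idea follows. The helper names are made up for illustration, taking the smallest gap over several samples is an added assumption (to reduce noise from interrupts), and busy_loop() is only a stand-in workload. Note that this measures elapsed wall-clock time.

```c
#include <stddef.h>
#include <sys/time.h>

/* Difference b - a in microseconds. */
static long long tv_delta_usec(const struct timeval *a, const struct timeval *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000LL
         + (long long)(b->tv_usec - a->tv_usec);
}

/* Cost of the timing call itself: the smallest gap seen between two
 * back-to-back gettimeofday() calls over `samples` attempts. */
static long long gettimeofday_overhead_usec(int samples)
{
    long long best = -1;
    int i;

    for (i = 0; i < samples; i++) {
        struct timeval a, b;
        long long d;

        gettimeofday(&a, NULL);
        gettimeofday(&b, NULL);
        d = tv_delta_usec(&a, &b);
        if (d >= 0 && (best < 0 || d < best))
            best = d;
    }
    return best < 0 ? 0 : best;
}

/* Time one call of fn(arg), subtracting the measured overhead. */
static long long time_call_usec(void (*fn)(void *), void *arg, long long overhead)
{
    struct timeval a, b;
    long long d;

    gettimeofday(&a, NULL);
    fn(arg);
    gettimeofday(&b, NULL);
    d = tv_delta_usec(&a, &b) - overhead;
    return d < 0 ? 0 : d;
}

/* Stand-in workload: spin for *(long *)arg iterations. */
static void busy_loop(void *arg)
{
    volatile long n = *(long *)arg;

    while (n > 0)
        n--;
}
```

One would call gettimeofday_overhead_usec() once up front and then pass its result to time_call_usec() for each measurement, e.g. time_call_usec(busy_loop, &count, overhead) with a long count.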
--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]
On Thu, 7 Dec 2000, Maciej W. Rozycki wrote:
> On 7 Dec 2000, H. Peter Anvin wrote:
>
> > Unfortunately the most important instance of the in-kernel flag -- the
> > global one in the somewhat misnamed boot_cpu_data.x86_features --
> > isn't actually readable in the /proc/cpuinfo file. It is perfectly
> > possible (e.g. the "notsc" option) for ALL the CPUs to report this
> > capability, but the global capability to still be off.
>
> Hmm, I recall I implemented and explicitly verified switching the
> /proc/cpuinfo "tsc" flag (as well as the userland access to the TSC) off
> when I wrote the code to handle the "notsc" option. Has it changed since
> then? I recall you modified the code a bit -- I looked at the changes
> then but I was pretty confident the semantics were preserved.
The present situation is inconsistent: "notsc" removes cpuinfo's
"tsc" flag in the UP case (when cpu_data[0] is boot_cpu_data), but
not in the SMP case. I don't believe HPA's recent mods affected that
behaviour, but it is made consistent (cleared in SMP case too) by the
patch I sent him a couple of days ago, below updated for test12-pre7...
I didn't test userland access to the TSC, but my reading of the code
was that prior to this patch, it would be disallowed on the boot cpu,
but still allowed on auxiliaries - because disable_tsc set X86_CR4_TSD
if cpu_has_tsc, but initing boot cpu forces cpu_has_tsc to !cpu_has_tsc.
My patch description was:
identify_cpu() re-evaluates x86_capability, which left cpu_has_tsc true
(and cpu MHz shown as 0.000) in non-SMP "notsc" case: #ifdef CONFIG_TSC
was bogus. And set X86_CR4_TSD here when testing this cpu's capability,
not where cpu_init() tests cpu_has_tsc (boot_cpu's adjusted capability).
I have removed the "FIX-HPA" comment line: of course, that's none of my
business, but if you approve the patch I imagine you'd want that to go too
(I agree it's a bit ugly there, but safest to disable cpu_has_tsc soonest).
Hugh
--- test12-pre7/arch/i386/kernel/setup.c Thu Dec 7 17:25:55 2000
+++ linux/arch/i386/kernel/setup.c Thu Dec 7 17:56:35 2000
@@ -1999,10 +1999,14 @@
* we do "generic changes."
*/
+#ifndef CONFIG_X86_TSC
/* TSC disabled? */
-#ifdef CONFIG_TSC
- if ( tsc_disable )
- clear_bit(X86_FEATURE_TSC, &c->x86_capability);
+ if ( test_bit(X86_FEATURE_TSC, &c->x86_capability) ) {
+ if (tsc_disable || !cpu_has_tsc) {
+ clear_bit(X86_FEATURE_TSC, &c->x86_capability);
+ set_in_cr4(X86_CR4_TSD);
+ }
+ }
#endif
/* Disable the PN if appropriate */
@@ -2218,9 +2222,7 @@
#ifndef CONFIG_X86_TSC
if (tsc_disable && cpu_has_tsc) {
printk("Disabling TSC...\n");
- /**** FIX-HPA: DOES THIS REALLY BELONG HERE? ****/
clear_bit(X86_FEATURE_TSC, boot_cpu_data.x86_capability);
- set_in_cr4(X86_CR4_TSD);
}
#endif
On Thu, 7 Dec 2000, Hugh Dickins wrote:
> The present situation is inconsistent: "notsc" removes cpuinfo's
> "tsc" flag in the UP case (when cpu_data[0] is boot_cpu_data), but
> not in the SMP case. I don't believe HPA's recent mods affected that
> behaviour, but it is made consistent (cleared in SMP case too) by the
> patch I sent him a couple of days ago, below updated for test12-pre7...
My original code was specifically tested on a SMP system -- having no
suitable system I wrote it mainly to make sure TSC-less SMP systems (i.e.
486 ones) run fine. If it doesn't work as expected anymore, then an error
slipped in somehow since then.
> I didn't test userland access to the TSC, but my reading of the code
> was that prior to this patch, it would be disallowed on the boot cpu,
> but still allowed on auxiliaries - because disable_tsc set X86_CR4_TSD
> if cpu_has_tsc, but initing boot cpu forces cpu_has_tsc to !cpu_has_tsc.
Note that identify_cpu() rereads feature flags, so everything should be
fine.
> I have removed the "FIX-HPA" comment line: of course, that's none of my
> business, but if you approve the patch I imagine you'd want that to go too
> (I agree it's a bit ugly there, but safest to disable cpu_has_tsc soonest).
It could probably be done in identify_cpu() but do we want to fiddle with
cr4 there?
Well, it appears an error slipped in, indeed. The following change is
the key one. Everything should be fine once it's changed.
Peter would you accept the patch (see below)?
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: [email protected], PGP key available +
diff -up --recursive --new-file linux-2.4.0-test11.macro/arch/i386/kernel/setup.c linux-2.4.0-test11/arch/i386/kernel/setup.c
--- linux-2.4.0-test11.macro/arch/i386/kernel/setup.c Mon Nov 20 07:03:47 2000
+++ linux-2.4.0-test11/arch/i386/kernel/setup.c Thu Dec 7 20:43:24 2000
@@ -1959,7 +1959,7 @@ void __init identify_cpu(struct cpuinfo_
*/
/* TSC disabled? */
-#ifdef CONFIG_TSC
+#ifndef CONFIG_X86_TSC
if ( tsc_disable )
clear_bit(X86_FEATURE_TSC, &c->x86_capability);
#endif
You might want to try the Linux Trace Toolkit. It'll give you microsecond
accuracy on program execution time measurement.
Check it out:
http://www.opersys.com/LTT
Karim
Kotsovinos Vangelis wrote:
>
> Is there any way to measure (with microsecond accuracy) the time of a
> program execution (without using Machine Specific Registers)?
> I've already tried getrusage(), times() and clock() but they all have
> 10 millisecond accuracy, even though they claim to have microsecond
> accuracy.
> The only thing that seems to work is to use one of the tools that measure
> performance through accessing the machine specific registers. They give you
> the ability to measure the clock cycles used, but their accuracy is also
> very low from what I have seen up to now.
>
> Thank you very much in advance
>
> --) Vangelis
--
===================================================
Karim Yaghmour
[email protected]
Operating System Consultant
(Linux kernel, real-time and distributed systems)
===================================================
Ok, I'll check it out...
Thank you very much,
--) Vangelis
On Thu, 7 Dec 2000, Karim Yaghmour wrote:
>
> You might want to try the Linux Trace Toolkit. It'll give you microsecond
> accuracy on program execution time measurement.
>
> Check it out:
> http://www.opersys.com/LTT
>
> Karim
>
> Kotsovinos Vangelis wrote:
> >
> > Is there any way to measure (with microsecond accuracy) the time of a
> > program execution (without using Machine Specific Registers)?
> > I've already tried getrusage(), times() and clock() but they all have
> > 10 millisecond accuracy, even though they claim to have microsecond
> > accuracy.
> > The only thing that seems to work is to use one of the tools that measure
> > performance through accessing the machine specific registers. They give you
> > the ability to measure the clock cycles used, but their accuracy is also
> > very low from what I have seen up to now.
> >
> > Thank you very much in advance
> >
> > --) Vangelis
>
> --
> ===================================================
> Karim Yaghmour
> [email protected]
> Operating System Consultant
> (Linux kernel, real-time and distributed systems)
> ===================================================
>
On Thu, 7 Dec 2000, Christopher Friesen wrote:
> Kotsovinos Vangelis wrote:
> >
> > Is there any way to measure (with microsecond accuracy) the time of a
> > program execution (without using Machine Specific Registers)?
> > I've already tried getrusage(), times() and clock() but they all have
> > 10 millisecond accuracy, even though they claim to have microsecond
> > accuracy.
> > The only thing that seems to work is to use one of the tools that measure
> > performance through accessing the machine specific registers. They give you
> > the ability to measure the clock cycles used, but their accuracy is also
> > very low from what I have seen up to now.
>
> Can you not just use something like gettimeofday()? Do two consecutive calls to
> find the execution time of the instruction itself, and then do two calls on
> either side of the program execution. Subtract the instruction execution time
> from the delta, and that should give a pretty good idea of execution time.
Well, it is a pretty complex program that I want to measure, with more
than one module, running one after another... they sleep and use signals
to wake each other up, they use semaphores etc. What I want to measure is
the time the program is running (not waiting for other processes or
waiting for a signal etc).
Also, there are other processes running on the
system (for example, my program needs about 50 seconds of real time to
execute and I estimate the time it is "running" to be about 5000-10000
microseconds)...
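To make the distinction concrete: the quantity wanted here is CPU time charged to the process (user plus system), not elapsed time. A sketch of reading it via getrusage() is below; the helper names are made up, and although the fields are expressed in microseconds, the granularity the kernel actually delivers may be far coarser (10 ms in the tests mentioned at the start of the thread).

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to a microsecond count. */
static long long timeval_to_usec(struct timeval tv)
{
    return (long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec;
}

/* CPU time (user + system) charged to the calling process so far, in
 * microseconds, or -1 on failure. Time spent sleeping or blocked on a
 * signal or semaphore is not counted. */
static long long process_cpu_usec(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return timeval_to_usec(ru.ru_utime) + timeval_to_usec(ru.ru_stime);
}
```

Calling process_cpu_usec() at the start and end of each module and summing the differences would give the "running" total, subject to that granularity. For modules that run as child processes, RUSAGE_CHILDREN reports the usage of children that have terminated and been waited for.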
Thanks anyway,
Vangelis
>
> The present situation is inconsistent: "notsc" removes cpuinfo's
> "tsc" flag in the UP case (when cpu_data[0] is boot_cpu_data), but
> not in the SMP case. I don't believe HPA's recent mods affected that
> behaviour, but it is made consistent (cleared in SMP case too) by the
> patch I sent him a couple of days ago, below updated for test12-pre7...
>
Great. You've taken something that was somewhat broken in the UP case and
introduced massive braindamage in the SMP case. What really needs to happen is
that the global enables (boot_cpu_data) get exposed.
-hpa