2007-12-11 23:33:00

by Rene Herman

Subject: [RFT] Port 0x80 I/O speed

/* gcc -W -Wall -O2 -o port80 port80.c */

#include <stdlib.h>
#include <stdio.h>

#include <sys/io.h>

#define LOOPS 10000

inline unsigned long long rdtsc(void)
{
        unsigned long long tsc;

        asm volatile ("rdtsc": "=A" (tsc));

        return tsc;
}

inline void serialize(void)
{
        asm volatile ("cpuid": : : "eax", "ebx", "ecx", "edx");
}

int main(void)
{
        unsigned long long start;
        unsigned long long overhead;
        unsigned long long output;
        unsigned long long input;
        int i;

        if (iopl(3) < 0) {
                perror("iopl");
                return EXIT_FAILURE;
        }

        asm volatile ("cli");
        start = rdtsc();
        for (i = 0; i < LOOPS; i++) {
                serialize();
                serialize();
        }
        overhead = rdtsc() - start;

        start = rdtsc() + overhead;
        for (i = 0; i < LOOPS; i++) {
                serialize();
                asm volatile ("outb %al, $0x80");
                serialize();
        }
        output = rdtsc() - start;

        start = rdtsc() + overhead;
        for (i = 0; i < LOOPS; i++) {
                serialize();
                asm volatile ("inb $0x80, %%al": : : "al");
                serialize();
        }
        input = rdtsc() - start;
        asm volatile ("sti");

        output /= LOOPS;
        input /= LOOPS;
        printf("cycles: out %llu, in %llu\n", output, input);

        return EXIT_SUCCESS;
}


Attachments:
port80.c (1.11 kB)
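
The serialize() helper in the program above executes cpuid without setting %eax first, so the leaf that gets queried depends on whatever the register happens to hold. Below is a minimal sketch of a variant that pins the leaf to 0 and passes the registers as operands instead of clobbers. The name serialize_leaf0 is hypothetical and not from the posted program, and whether the choice of leaf changes the measured numbers is an assumption that this thread does not test. It is x86-only and uses GCC-style inline assembly. On 32-bit PIC builds with older GCC releases the "=b" operand may be rejected.

```c
/* Sketch only: cpuid as a serializing instruction with a fixed leaf.
 * serialize_leaf0 is an illustrative name, not part of port80.c. */
static inline unsigned int serialize_leaf0(void)
{
        unsigned int eax = 0;   /* leaf 0: vendor string / max basic leaf */
        unsigned int ebx;
        unsigned int ecx = 0;   /* sub-leaf, unused by leaf 0 but kept defined */
        unsigned int edx;

        __asm__ __volatile__ ("cpuid"
                              : "+a" (eax), "=b" (ebx), "+c" (ecx), "=d" (edx)
                              :
                              : "memory");

        /* After leaf 0, eax holds the highest supported basic leaf. */
        return eax;
}
```

The return value is only there so a caller can check that the instruction ran. A timing loop would ignore it.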

2007-12-11 23:41:19

by Maxim Levitsky

Subject: Re: [RFT] Port 0x80 I/O speed

On Wednesday 12 December 2007 01:31:18 Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80 thread
> which had a serialising problem. This one should as far as I can see measure
> the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>
> Rene.
>


Sure,

maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1767, in 1147
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1774, in 1148
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1769, in 1150
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1769, in 1150
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1777, in 1150
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1766, in 1149
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1768, in 1148
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1765, in 1147


Core 2 Duo system (ICH8/Intel DG965RY motherboard)

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 00:31:18 +0100
Rene Herman <[email protected]> wrote:

> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80 thread
> which had a serialising problem. This one should as far as I can see measure
> the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>
> Rene.

On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 I get:

cycles: out 1844674407370808, in 1844674407369087

It is not constant but variations are not significant afaics

2007-12-11 23:46:20

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 00:40, Maxim Levitsky wrote:

> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1767, in 1147
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1774, in 1148
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1769, in 1150
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1769, in 1150
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1777, in 1150
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1766, in 1149
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1768, in 1148
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1765, in 1147
>
>
> Core 2 Duo system (ICH8/Intel DG965RY motherboard)

1.8 GHz?

Rene.

2007-12-11 23:53:00

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 00:43, Alejandro Riveira Fernández wrote:

> On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
>
> cycles: out 1844674407370808, in 1844674407369087
>
> It is not constant but variations are not significant afaics

Eh, oh, I guess you need to compile as a 32-bit binary...

Rene.
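
A value such as 1844674407370808 is what results when a slightly negative cycle difference is stored in an unsigned long long and then divided by LOOPS: 2^64 / 10000 is about 1844674407370955. The sketch below reproduces that arithmetic. The helper names are hypothetical, and the input difference of -1471616 cycles is chosen only because it yields the reported figure, not because it was measured.

```c
#include <stdint.h>

/* Per-iteration cost computed the way the posted program does it:
 * unsigned subtraction followed by unsigned division. */
static uint64_t per_loop_unsigned(uint64_t end, uint64_t start, uint64_t loops)
{
        return (end - start) / loops;
}

/* Same quantity with the difference reinterpreted as signed, so a
 * negative result stays recognisably negative instead of wrapping.
 * Relies on the two's-complement conversion GCC performs. */
static int64_t per_loop_signed(uint64_t end, uint64_t start, int64_t loops)
{
        return (int64_t)(end - start) / loops;
}
```

With end = 0 and start = 1471616, the unsigned helper returns 1844674407370808 while the signed one returns -147. That makes it easier to see that the subtraction went negative rather than that an access took 10^15 cycles.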

2007-12-11 23:55:26

by Nigel Cunningham

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and
> run the attached program? This is about testing how long I/O port access
> to port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I can
> see measure the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing
> with an -O2 compile should be most useful. Thanks!

(AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)

nigel@home:~/Downloads$ gcc port80.c -o port80
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1235, in 1207
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1238, in 1205
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1237, in 1209
nigel@home:~/Downloads$ gcc -O2 port80.c -o port80
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1844674407370794, in 1844674407369408
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1844674407370795, in 1844674407369404
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1844674407370795, in 1844674407369409
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1844674407370798, in 1844674407369407
nigel@home:~/Downloads$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 36
model name : AMD Turion(tm) 64 Mobile Technology ML-34
stepping : 2
cpu MHz : 800.000
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt
lm 3dnowext 3dnow rep_good pni lahf_lm
bogomips : 1592.87
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc


Regards,

Nigel

2007-12-12 00:02:42

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 00:55, Nigel Cunningham wrote:

> (AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)
>
> nigel@home:~/Downloads$ gcc port80.c -o port80
> nigel@home:~/Downloads$ sudo ./port80
> cycles: out 1235, in 1207

Looking good.

> nigel@home:~/Downloads$ gcc -O2 port80.c -o port80
> nigel@home:~/Downloads$ sudo ./port80
> cycles: out 1844674407370794, in 1844674407369408

Obviously not. I suppose this changes with -m32 on the GCC command line?
(sorry for missing that, I have no 64-bit machines).

Rene.

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 00:51:25 +0100
Rene Herman <[email protected]> wrote:

> On 12-12-07 00:43, Alejandro Riveira Fernández wrote:
>
> > On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
> >
> > cycles: out 1844674407370808, in 1844674407369087
> >
> > It is not constant but variations are not significant afaics
>
> Eh, oh, I guess you need to compile as a 32-bit binary...

I tried without -O2, as Nigel Cunningham did...

cycles: out 1562, in 865
cycles: out 1562, in 866
cycles: out 1555, in 858
cycles: out 1562, in 866

With -m32 -O2
cycles: out 1566, in 876
cycles: out 1555, in 865
cycles: out 1594, in 931
cycles: out 1559, in 874
...
>
> Rene.

2007-12-12 00:15:06

by Maxim Levitsky

Subject: Re: [RFT] Port 0x80 I/O speed

On Wednesday 12 December 2007 01:44:42 Rene Herman wrote:
> On 12-12-07 00:40, Maxim Levitsky wrote:
>
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1767, in 1147
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1774, in 1148
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1769, in 1150
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1769, in 1150
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1777, in 1150
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1766, in 1149
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1768, in 1148
> > maxim@MAIN:~/tmp$ sudo ./port800
> > cycles: out 1765, in 1147
> >
> >
> > Core 2 Duo system (ICH8/Intel DG965RY motherboard)
>
> 1.8 Ghz?
>
> Rene.
>
>

2.1 GHz, but usually reduced by EIST to 1.5 GHz
(It can only take the two values above.)

I did the tests again:


CPU frequency locked to 2128.000 MHz:
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1650, in 1065
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1648, in 1066
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1649, in 1065
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1648, in 1065
maxim@MAIN:~/tmp$


CPU frequency locked to 1596.000 MHz:
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1730, in 1138
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1730, in 1138
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1735, in 1140
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1734, in 1139
maxim@MAIN:~/tmp$ sudo ./port800
cycles: out 1734, in 1138
maxim@MAIN:~/tmp$


A bit strange, isn't it?
Regards,
Maxim Levitsky

2007-12-12 00:17:42

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 01:09, Alejandro Riveira Fernández wrote:

>>> On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
>>>
>>> cycles: out 1844674407370808, in 1844674407369087
>>>
>>> It is not constant but variations are not significant afaics
>> Eh, oh, I guess you need to compile as a 32-bit binary...
>
> I tried without -O2 as Nigel Cunningham...
>
> cycles: out 1562, in 865
> cycles: out 1562, in 866
> cycles: out 1555, in 858
> cycles: out 1562, in 866
>
> With -m32 -O2
> cycles: out 1566, in 876
> cycles: out 1555, in 865
> cycles: out 1594, in 931
> cycles: out 1559, in 874

Great, thanks much for reporting. Sort of interesting in itself that without
-O2 you do still get correct results on 64-bit but for some other time.

You're the first one to go significantly below 1 us it seems.

Rene.
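
Converting a cycle count to time only needs the clock rate: cycles divided by MHz gives microseconds. A minimal sketch, assuming the TSC ticks at the 2000 MHz that Alejandro reported (the helper name cycles_to_us is not from the thread):

```c
/* Convert a TSC cycle count to microseconds.
 * cpu_mhz is the rate at which the TSC is assumed to tick;
 * 1 MHz is one cycle per microsecond, so the division is direct. */
static double cycles_to_us(unsigned long long cycles, double cpu_mhz)
{
        return (double)cycles / cpu_mhz;
}
```

With Alejandro's figures, 1562 cycles for the write comes to about 0.78 us and 865 cycles for the read to about 0.43 us.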

2007-12-12 00:29:10

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 01:14, Maxim Levitsky wrote:

> CPU frequency locked to 2128.000 Mhz:
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1650, in 1065

> CPU frequency locked to: 1596.000 Mhz
> maxim@MAIN:~/tmp$ sudo ./port800
> cycles: out 1730, in 1138

> A bit strange, isn't it?

Well, yes. Don't know what that effect is. A bus-clock divided from the CPU
clock comes to mind, but I believe that shouldn't happen to LPC.

Anyways, we're looking for an upper bound and that's still nicely below 2 us
on everything up to now, so I guess it doesn't matter all that much. Thanks!

Rene.
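
One thing that may be worth checking when comparing runs at different clock settings, although nobody in this thread confirms that it explains Maxim's numbers: on CPUs that report the constant_tsc flag in /proc/cpuinfo, the TSC counts at a fixed rate regardless of the current core clock, so the program's "cycles" are closer to a unit of wall time. Below is a small sketch of a whole-token flag check. has_cpu_flag is a hypothetical helper, and the caller is assumed to have read the "flags" line already.

```c
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-delimited token in
 * `flags_line` (a "flags : ..." line from /proc/cpuinfo), else 0.
 * Whole-token matching matters: "tsc" must not match "constant_tsc". */
static int has_cpu_flag(const char *flags_line, const char *flag)
{
        size_t len = strlen(flag);
        const char *p = flags_line;

        if (len == 0)
                return 0;

        while ((p = strstr(p, flag)) != NULL) {
                int starts = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
                int ends = p[len] == '\0' || p[len] == ' ' ||
                           p[len] == '\t' || p[len] == '\n';

                if (starts && ends)
                        return 1;
                p += len;       /* keep scanning after this partial hit */
        }
        return 0;
}
```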

2007-12-12 01:19:18

by Alistair John Strachan

Subject: Re: [RFT] Port 0x80 I/O speed

On Tuesday 11 December 2007 23:31:18 Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to
> port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.

cycles: out 2712, in 2606

1.5 GHz C7, VIA chipset, 32-bit OS.

--
Cheers,
Alistair.

137/1 Warrender Park Road, Edinburgh, UK.

2007-12-12 01:30:29

by Randy Dunlap

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 00:31:18 +0100 Rene Herman wrote:

> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80 thread
> which had a serialising problem. This one should as far as I can see measure
> the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!

(-m32 build on x86_64)

midway:/home/rddunlap/src # ./port80
cycles: out 2702, in 1903
midway:/home/rddunlap/src # ./port80
cycles: out 2688, in 1893
midway:/home/rddunlap/src # ./port80
cycles: out 2703, in 1909
midway:/home/rddunlap/src # ./port80
cycles: out 2687, in 1893
midway:/home/rddunlap/src # ./port80
cycles: out 2687, in 1893
midway:/home/rddunlap/src # ./port80
cycles: out 2701, in 1907
midway:/home/rddunlap/src # ./port80
cycles: out 2701, in 1919
midway:/home/rddunlap/src # ./port80
cycles: out 2687, in 1893
midway:/home/rddunlap/src # ./port80
cycles: out 2701, in 1909
midway:/home/rddunlap/src # ./port80
cycles: out 2706, in 1906

/proc/cpuinfo says CPU speed is
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) D CPU 3.00GHz
stepping : 4
cpu MHz : 2999.988


---
~Randy

2007-12-12 01:41:46

by Mike Lampard

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 10:01:18 am Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to
> port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I can see
> measure the right thing though. Please yell if you disagree...

cycles: out 1399, in 303
cycles: out 1347, in 297
cycles: out 1235, in 251
cycles: out 1342, in 249
cycles: out 1393, in 274
cycles: out 1241, in 261
cycles: out 1238, in 251
cycles: out 1383, in 277
cycles: out 1228, in 252
cycles: out 1413, in 303
cycles: out 1394, in 268
cycles: out 1378, in 292
cycles: out 1239, in 265

-m32 build on x64
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 107
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 1
cpu MHz : 2300.000

2007-12-12 02:08:11

by Nigel Cunningham

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> On 12-12-07 00:55, Nigel Cunningham wrote:
>
>> (AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)
>>
>> nigel@home:~/Downloads$ gcc port80.c -o port80
>> nigel@home:~/Downloads$ sudo ./port80
>> cycles: out 1235, in 1207
>
> Looking good.
>
>> nigel@home:~/Downloads$ gcc -O2 port80.c -o port80
>> nigel@home:~/Downloads$ sudo ./port80
>> cycles: out 1844674407370794, in 1844674407369408
>
> Obviously not. I suppose this changes with -m32 on the GCC command line?
> (sorry for missing that, I have no 64-bit machines).

Yes, it does:

nigel@home:~/Downloads$ gcc -m32 -o port80 port80.c
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1231, in 1208
nigel@home:~/Downloads$ sudo ./port80
cycles: out 1233, in 1210

Incidentally:

nigel@home:~/Downloads$ processor_speed

(A little script I made because my lappy does a solid lock every now and
then that seems to be cpu-freq related - locking it to one frequency
makes the lock far less common).

Speed is now 1800000.
nigel@home:~/Downloads$ sudo ./port80
cycles: out 2472, in 2505
nigel@home:~/Downloads$ sudo ./port80
cycles: out 2489, in 2515
nigel@home:~/Downloads$ sudo ./port80
cycles: out 2481, in 2503
nigel@home:~/Downloads$ sudo ./port80
cycles: out 2476, in 2507

So the same effect Maxim reported is seen here.

Regards,

Nigel

2007-12-12 05:01:17

by Chris Holvenstot

Subject: Re: [RFT] Port 0x80 I/O speed


cycles: out 1296, in 1243
cycles: out 1312, in 1245
cycles: out 1289, in 1239
cycles: out 1309, in 1245
cycles: out 1308, in 1244
cycles: out 1325, in 1239
cycles: out 1310, in 1245
cycles: out 1289, in 1239
cycles: out 1301, in 1252
cycles: out 1325, in 1249
cycles: out 1307, in 1249
cycles: out 1304, in 1247

AMD 64 X2 4600+ / 2 GB memory, running the 32-bit version of
2.6.24-rc5-git1

From /proc/cpuinfo

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 43
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
stepping : 1
cpu MHz : 2412.378
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips : 4827.69
clflush size : 64


2007-12-12 05:23:17

by Kyle McMartin

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
> asm volatile ("rdtsc": "=A" (tsc));

rdtsc returns a 64-bit value in two 32-bit regs, you need to do

inline unsigned long long rdtsc(void)
{
        unsigned int lo, hi;
        asm volatile ("rdtsc": "=a" (lo), "=d" (hi));
        return (unsigned long long)hi << 32 | lo;
}

as in msr.h, otherwise you'll only be looking at the value in %rax.

cheers,
Kyle

2007-12-12 07:20:10

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 06:23, Kyle McMartin wrote:

> On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
>> asm volatile ("rdtsc": "=A" (tsc));
>
> rdtsc returns a 64-bit value in two 32-bit regs, you need to do
>
> inline unsigned long long rdtsc(void)
> {
> unsigned int lo, hi;
> asm volatile ("rdtsc": "=a" (lo), "=d" (hi));
> return (unsigned long long)hi << 32 | lo;
> }
>
> as in msr.h, otherwise you'll only be looking at the value in %rax.

On 32-bit, "=A" is edx:eax. Not sure what the point is in not letting it be
that on 64-bit in fact, but yes, the thing should be compiled as 32-bit.

Rene.
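
Following Kyle's suggestion, a version of rdtsc() that reads the two halves through separate operands builds the same 64-bit value on both 32-bit and 64-bit targets, since it no longer depends on how "=A" is interpreted. This is a sketch using GCC-style inline assembly on x86 only. combine_tsc and rdtsc_portable are names introduced here for illustration and are not part of the posted program.

```c
#include <stdint.h>

/* Join the two 32-bit halves that rdtsc leaves in edx:eax. */
static inline uint64_t combine_tsc(uint32_t lo, uint32_t hi)
{
        return ((uint64_t)hi << 32) | lo;
}

/* Read the time stamp counter. Using "=a" and "=d" as separate
 * outputs works for both -m32 and -m64 builds; on x86-64 the upper
 * halves of rax and rdx are zeroed by rdtsc, so the 32-bit views
 * carry the full result. */
static inline uint64_t rdtsc_portable(void)
{
        uint32_t lo, hi;

        __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));

        return combine_tsc(lo, hi);
}
```

Swapping this in for the original rdtsc() would remove the need to force a 32-bit build for the sake of the counter read. The rest of the program is left as posted.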

2007-12-12 08:17:29

by Paolo Ornati

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 00:31:18 +0100
Rene Herman <[email protected]> wrote:
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>

### Core2 Duo 1.8 GHz ###

X86_64 -m32 -O2:

$ for i in `seq 5`; do sudo ./port80; sleep 1; done
cycles: out 1498, in 964
cycles: out 1498, in 964
cycles: out 1499, in 964
cycles: out 1498, in 964
cycles: out 1498, in 965

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz
stepping : 6
cpu MHz : 1864.805
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 3731.82
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

[...]


--
Paolo Ornati
Linux 2.6.24-rc4-g94545bad on x86_64

2007-12-12 08:35:29

by Dave Young

Subject: Re: [RFT] Port 0x80 I/O speed

On Dec 12, 2007 7:31 AM, Rene Herman <[email protected]> wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80 thread
> which had a serialising problem. This one should as far as I can see measure
> the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>
> Rene.
>
> /* gcc -W -Wall -O2 -o port80 port80.c */
>
> #include <stdlib.h>
> #include <stdio.h>
>
> #include <sys/io.h>
>
> #define LOOPS 10000
>
> inline unsigned long long rdtsc(void)
> {
> unsigned long long tsc;
>
> asm volatile ("rdtsc": "=A" (tsc));
>
> return tsc;
> }
>
> inline void serialize(void)
> {
> asm volatile ("cpuid": : : "eax", "ebx", "ecx", "edx");
> }
>
> int main(void)
> {
> unsigned long long start;
> unsigned long long overhead;
> unsigned long long output;
> unsigned long long input;
> int i;
>
> if (iopl(3) < 0) {
> perror("iopl");
> return EXIT_FAILURE;
> }
>
> asm volatile ("cli");
> start = rdtsc();
> for (i = 0; i < LOOPS; i++) {
> serialize();
> serialize();
> }
> overhead = rdtsc() - start;
>
> start = rdtsc() + overhead;
> for (i = 0; i < LOOPS; i++) {
> serialize();
> asm volatile ("outb %al, $0x80");
> serialize();
> }
> output = rdtsc() - start;
>
> start = rdtsc() + overhead;
> for (i = 0; i < LOOPS; i++) {
> serialize();
> asm volatile ("inb $0x80, %%al": : : "al");
> serialize();
> }
> input = rdtsc() - start;
> asm volatile ("sti");
>
> output /= LOOPS;
> input /= LOOPS;
> printf("cycles: out %llu, in %llu\n", output, input);
>
> return EXIT_SUCCESS;
> }
>
>

dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2522, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2522, in 1795
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2523, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2516, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2516, in 1798
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2523, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2518, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2517, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2523, in 1788
dave@darkstar:~/work/tmp$ sudo ./port80
cycles: out 2517, in 1788

dave@darkstar:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) D CPU 2.80GHz
stepping : 7
cpu MHz : 2793.194
cache size : 1024 KB

2007-12-12 08:38:50

by Edwin de Caluwé

Subject: Re: [RFT] Port 0x80 I/O speed

Exactly constant timing in every iteration:

cycles: out 667, in 305

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Celeron (Coppermine)
stepping : 3
cpu MHz : 497.582
cache size : 128 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov pat pse36 mmx fxsr sse up
bogomips : 996.21
clflush size : 32


On 12/12/07, Paolo Ornati <[email protected]> wrote:
> On Wed, 12 Dec 2007 00:31:18 +0100
> Rene Herman <[email protected]> wrote:
> >
> > and on a PII 400 (Intel 440BX chipset) a constant:
> >
> > rene@6bap:~/src/port80$ su -c ./port80
> > cycles: out 553, in 251
> >
> > Results are (mostly) independent of compiler optimisation, but testing with
> > an -O2 compile should be most useful. Thanks!
> >
>
> ### Core2 Duo 1.8 GHz ###
>
> X86_64 -m32 -O2:
>
> $ for i in `seq 5`; do sudo ./port80; sleep 1; done
> cycles: out 1498, in 964
> cycles: out 1498, in 964
> cycles: out 1499, in 964
> cycles: out 1498, in 964
> cycles: out 1498, in 965
>
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz
> stepping : 6
> cpu MHz : 1864.805
> cache size : 2048 KB
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
> constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
> ssse3 cx16 xtpr lahf_lm
> bogomips : 3731.82
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> [...]
>
>
> --
> Paolo Ornati
> Linux 2.6.24-rc4-g94545bad on x86_64

2007-12-12 08:48:55

by Jiri Slaby

Subject: Re: [RFT] Port 0x80 I/O speed

On 12/12/2007 12:31 AM, Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and
> run the attached program? This is about testing how long I/O port access

model name : Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz

cycles: out 6490, in 3135
cycles: out 6484, in 3126
cycles: out 6511, in 3128
cycles: out 6500, in 3135
cycles: out 6492, in 3154
cycles: out 6506, in 3136
cycles: out 6516, in 3144
cycles: out 6489, in 3140
cycles: out 6492, in 3129
cycles: out 6507, in 3130

2007-12-12 09:24:56

by Juergen Beisert

Subject: Re: [RFT] Port 0x80 I/O speed

$ for i in `seq 5`; do ./port80; sleep 1; done
cycles: out 5260, in 2372
cycles: out 5260, in 2384
cycles: out 5260, in 2323
cycles: out 5270, in 2382
cycles: out 5259, in 2323

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 501.208
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 mmx fxsr sse
bogomips : 993.28

#########

$ for i in `seq 5`; do ./port80; sleep 1; done
cycles: out 1214, in 1095
cycles: out 1215, in 1094
cycles: out 1214, in 1095
cycles: out 1216, in 1095
cycles: out 1213, in 1093

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 11
model name : Intel(R) Celeron(TM) CPU 1000MHz
stepping : 1
cpu MHz : 1002.285
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat
pse36 mmx fxsr sse
bogomips : 1998.84

2007-12-12 10:02:44

by Luciano Rocha

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>

$ sudo ./port80
cycles: out 2729, in 1872
$ sudo ./port80
cycles: out 2729, in 1872
$ sudo ./port80
cycles: out 2711, in 1856
$ sudo ./port80
cycles: out 2711, in 1856

model name : Intel(R) Pentium(R) 4 CPU 1.90GHz
stepping : 2
cpu MHz : 1917.229
cache size : 256 KB

--
Luciano Rocha <[email protected]>
Eurotux Informática, S.A. <http://www.eurotux.com/>



2007-12-12 10:28:24

by Peter Zijlstra

Subject: Re: [RFT] Port 0x80 I/O speed


On Wed, 2007-12-12 at 00:31 +0100, Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80 thread
> which had a serialising problem. This one should as far as I can see measure
> the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!

Since a lot of people reported timings for all the fancy new x86_64
hardware, I've not included those. Timings for my ancient machines still
in service:

$ gcc -o port80 -O2 port80.c
$ sudo ./port80
cycles: out 1736, in 1735
$ sudo ./port80
cycles: out 1831, in 1827
$ sudo ./port80
cycles: out 1735, in 1735
$ sudo ./port80
cycles: out 1743, in 1737
$ sudo ./port80
cycles: out 1737, in 1734

$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) MP 1800+
stepping : 2
cpu MHz : 1533.420
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov
pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow ts
bogomips : 3069.09
clflush size : 32

processor : 1
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 1533.420
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow ts
bogomips : 3067.07
clflush size : 32


---

# ./port80
cycles: out 812, in 354
# ./port80
cycles: out 811, in 354
# ./port80
cycles: out 811, in 354
# ./port80
cycles: out 811, in 354

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 1
cpu MHz : 672.071
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips : 1331.20


---


# ./port80
cycles: out 116, in 47
# ./port80
cycles: out 116, in 47
# ./port80
cycles: out 116, in 47
# ./port80
cycles: out 116, in 47

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 5
model : 2
model name : Pentium 75 - 200
stepping : 12
cpu MHz : 99.476
fdiv_bug : no
hlt_bug : no
f00f_bug : yes
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8
bogomips : 198.24


P133 clocked at 100MHz

2007-12-12 10:34:41

by Dave Haywood

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile
> and run the attached program? This is about testing how long I/O port
> access to port 0x80 takes. It measures in CPU cycles so CPU speed is
> crucial in reporting.

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 6
cpu MHz : 930.347
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1861.99
clflush size : 32

cycles: out 1183, in 706

cycles: out 1183, in 707

cycles: out 1182, in 706

cycles: out 1182, in 706

cycles: out 1183, in 706

cycles: out 1182, in 706

cycles: out 1182, in 706

cycles: out 1183, in 706

cycles: out 1183, in 707

cycles: out 1183, in 707

cycles: out 1183, in 706

cycles: out 1183, in 706

cycles: out 1183, in 706

cycles: out 1182, in 706

cycles: out 1183, in 707

cycles: out 1183, in 707

cycles: out 1183, in 706

cycles: out 1183, in 706

cycles: out 1182, in 706

cycles: out 1183, in 706

cycles: out 1183, in 706

cycles: out 1183, in 706

Dave.

2007-12-12 11:25:31

by Jiri Slaby

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On Dec 12, 2007 9:48 AM, Jiri Slaby <[email protected]> wrote:
> On 12/12/2007 12:31 AM, Rene Herman wrote:
> > Good day.
> >
> > Would some people on x86 (both 32 and 64) be kind enough to compile and
> > run the attached program? This is about testing how long I/O port access

model name : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz

cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898
cycles: out 3217, in 1898

model name : Dual Core AMD Opteron(tm) Processor 275
cycles: out 5508, in 5524
cycles: out 5509, in 5525
cycles: out 5510, in 5522
cycles: out 5512, in 5522
cycles: out 5512, in 5522
cycles: out 5510, in 5524
cycles: out 5510, in 5522
cycles: out 5511, in 5525
cycles: out 5513, in 5521
cycles: out 5509, in 5522

2007-12-12 11:29:30

by George Spelvin

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

Here are a variety of machines:

600 MHz PIII (Katmai), 440BX chipset, 82371AB/EB/MB PIIX4 ISA bridge:
cycles: out 794, in 348
cycles: out 791, in 348
cycles: out 791, in 349
cycles: out 791, in 348
cycles: out 791, in 348

433 MHz Celeron (Mendocino), 440 BX chipset, same ISA bridge:
cycles: out 624, in 297
cycles: out 623, in 296
cycles: out 624, in 297
cycles: out 623, in 297
cycles: out 623, in 296

1100 MHz Athlon, nForce2 chipset, nForce2 ISA bridge:
cycles: out 1295, in 1162
cycles: out 1295, in 1162
cycles: out 1295, in 1162
cycles: out 1295, in 1162
cycles: out 1295, in 1162

800 MHz Transmeta Crusoe TM5800, Transmeta/ALi M7101 chipset.
cycles: out 1212, in 388
cycles: out 1195, in 375
cycles: out 1197, in 377
cycles: out 1196, in 376
cycles: out 1196, in 377

2200 MHz Athlon 64, K8T890 chipset, VT8237 ISA bridge:
cycles: out 1844674407370814, in 1844674407365758
cycles: out 1844674407370813, in 1844674407365756
cycles: out 1844674407370805, in 1844674407365750
cycles: out 1844674407370813, in 1844674407365755
cycles: out 1844674407370814, in 1844674407365756

Um, huh? That's gcc 4.2.3 (Debian version 4.2.2-4), -O2. Very odd.

I can run it with -O0:
cycles: out 4894, in 4894
cycles: out 4905, in 4917
cycles: out 4910, in 4896
cycles: out 4909, in 4896
cycles: out 4894, in 4898
cycles: out 4911, in 4898

or with -O2 -m32:
cycles: out 4914, in 4927
cycles: out 4913, in 4927
cycles: out 4913, in 4913
cycles: out 4914, in 4913
cycles: out 4913, in 4929
cycles: out 4912, in 4912
cycles: out 4913, in 4915

With -O2, the cycle counts come out (before division) as
out: 0xFFFFFFFFFFEA6F4F
in: 0xFFFFFFFFFCE68BB6
I think the "A" constraint doesn't work quite the same in
64-bit code. The compiler seems to be using %rdx rather than
%edx:%eax.

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, 12 Dec 2007 01:16:06 +0100
Rene Herman <[email protected]> wrote:

> On 12-12-07 01:09, Alejandro Riveira Fernández wrote:

[...]

>
> Great, thanks much for reporting. Sort of interesting in itself that without
> -O2 you do still get correct results on 64-bit but for some other time.
>
> You're the first one to go significantly below 1 us it seems.

:( I have seen the other msg and I have to say that the tests were done at
1GHz, not at full speed. At full speed I see

cycles: out 3025, in 1653
cycles: out 3040, in 1708
cycles: out 3044, in 1650
cycles: out 3034, in 1652
cycles: out 3035, in 1652
cycles: out 3037, in 1652
cycles: out 3043, in 1709
cycles: out 3032, in 1648
cycles: out 3039, in 1652
cycles: out 3041, in 1652
cycles: out 3048, in 1704
cycles: out 3040, in 1650
cycles: out 3023, in 1631
cycles: out 3036, in 1652
cycles: out 3042, in 1706
cycles: out 3047, in 1708
cycles: out 3047, in 1711
cycles: out 3036, in 1652




>
> Rene.

2007-12-12 12:27:29

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to
> port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.

vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping : 6

cpu MHz : 2394.000
cycles: out 1830, in 1166

cpu MHz : 1596.000
cycles: out 1925, in 1266

##

vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10

cpu MHz : 850.000
cycles: out 1142, in 475

cpu MHz : 700.000
cycles: out 914, in 406

##

vendor_id : AuthenticAMD
cpu family : 6
model : 7
model name : AMD Duron(tm) processor
stepping : 1

cpu MHz : 1300.108
cycles: out 2562, in 2562

##

vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2

cpu MHz : 449.242
cycles: out 607, in 272

--
Ville Syrjälä
[email protected]
http://www.sci.fi/~syrjala/

2007-12-12 12:36:18

by Paolo Ornati

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12 Dec 2007 06:20:49 -0500
[email protected] wrote:

> With -O2, the cycle counts come out (before division) as
> out: 0xFFFFFFFFFFEA6F4F
> in: 0xFFFFFFFFFCE68BB6
> I think the "A" constraint doesn't work quite the same in
> 64-bit code. The compiler seems to be using %rdx rather than
> %edx:%eax.

In another message he says to compile it with "-m32" :)

--
Paolo Ornati
Linux 2.6.24-rc5-g4af75653 on x86_64

2007-12-12 13:21:54

by Rene Herman

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 13:59, linux-os (Dick Johnson) wrote:

> On Tue, 11 Dec 2007, Alejandro Riveira Fernández wrote:

>> On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
>>
>> cycles: out 1844674407370808, in 1844674407369087
>>
>> It is not constant but variations are not significant afaics
>>
>
> It looks as though this hardware does not have a port 0x80
> and its access is trapped by the hardware with a long time-out!
> This may be the reason when the _p was called "harmful" on this
> platform!
>
> I'm not sure the "rules" for port access allow for this kind of
> behavior. This design may be defective, needing to be brought
> to the attention of the vendor. A decent vendor would update
> a FPGA and provide code to burn a new BIOS.

I'm afraid it's just the test that is "defective" as 64-bit code. For some
reason "=A" doesn't mean edx:eax on amd64 even though it's a useful register
pair to be able to name there as well. Didn't catch that being without amd64
machines myself.

Oh well. gcc -m32 fixes it...

Rene.

2007-12-12 13:24:47

by linux-os (Dick Johnson)

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed


On Wed, 12 Dec 2007, Rene Herman wrote:

> On 12-12-07 13:59, linux-os (Dick Johnson) wrote:
>
>> On Tue, 11 Dec 2007, Alejandro Riveira Fernández wrote:
>
>>> On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
>>>
>>> cycles: out 1844674407370808, in 1844674407369087
>>>
>>> It is not constant but variations are not significant afaics
>>>
>>
>> It looks as though this hardware does not have a port 0x80
>> and its access is trapped by the hardware with a long time-out!
>> This may be the reason when the _p was called "harmful" on this
>> platform!
>>
>> I'm not sure the "rules" for port access allow for this kind of
>> behavior. This design may be defective, needing to be brought
>> to the attention of the vendor. A decent vendor would update
>> a FPGA and provide code to burn a new BIOS.
>
> I'm afraid it's just the test that is "defective" as 64-bit code. For some
> reason "=A" doesn't mean edx:eax on amd64 even though it's a useful register
> pair to be able to name there as well. Didn't catch that being without amd64
> machines myself.
>
> Oh well. gcc -m32 fixes it...
>
> Rene.
>

Yep. My bad. I actually believed the returned value!


Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/

2007-12-12 14:32:34

by Rene Herman

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 09:59, Juergen Beisert wrote:

> $ for i in `seq 5`; do ./port80; sleep 1; done
> cycles: out 5260, in 2372
> cycles: out 5260, in 2384
> cycles: out 5260, in 2323
> cycles: out 5270, in 2382
> cycles: out 5259, in 2323
>
> $ cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 7
> model name : Pentium III (Katmai)
> stepping : 3
> cpu MHz : 501.208

This one's really wrong. This would be 10 microsecs. I suppose the
machine/bus was heavily loaded or something?

> $ for i in `seq 5`; do ./port80; sleep 1; done
> cycles: out 1214, in 1095
> cycles: out 1215, in 1094
> cycles: out 1214, in 1095
> cycles: out 1216, in 1095
> cycles: out 1213, in 1093
>
> $ cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 11
> model name : Intel(R) Celeron(TM) CPU 1000MHz

This one's normal again...

Rene.

2007-12-12 14:49:37

by Rene Herman

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 12:20, [email protected] wrote:

> Here are a variety of machines:

Thanks much for all! Collecting all data now...

> With -O2, the cycle counts come out (before division) as
> out: 0xFFFFFFFFFFEA6F4F
> in: 0xFFFFFFFFFCE68BB6
> I think the "A" constraint doesn't work quite the same in
> 64-bit code. The compiler seems to be using %rdx rather than
> %edx:%eax.

Yes indeed, that tripped me up. Have been using the "=A" locally for a while
for similar timing tests. Will use a manual "=a" (lo), "=d" (hi) I guess for
amd64 compatibility from now on.

If I'd care deeply I'd probably categorize this as a backwards compatibility
bug in GCC though. Things were probably never guaranteed but they certainly
worked that way...

Rene.

2007-12-12 15:12:37

by Juergen Beisert

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On Wednesday 12 December 2007 15:30, Rene Herman wrote:
> On 12-12-07 09:59, Juergen Beisert wrote:
> > $ for i in `seq 5`; do ./port80; sleep 1; done
> > cycles: out 5260, in 2372
> > cycles: out 5260, in 2384
> > cycles: out 5260, in 2323
> > cycles: out 5270, in 2382
> > cycles: out 5259, in 2323
> >
> > $ cat /proc/cpuinfo
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 7
> > model name : Pentium III (Katmai)
> > stepping : 3
> > cpu MHz : 501.208
>
> This one's really wrong. This would be 10 microsecs. I suppose the
> machine/bus was heavily loaded or something?

Oops, you are right:

$ cat /proc/acpi/processor/CPU0/throttling
state count: 2
active state: T1
states:
T0: 00%
*T1: 50%

With T0 I get:

$ for i in `seq 5`; do ./port80; sleep 1; done
cycles: out 684, in 280
cycles: out 684, in 280
cycles: out 684, in 280
cycles: out 684, in 280
cycles: out 684, in 280

Juergen

2007-12-12 15:47:32

by Romano Giannetti

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed



On Wed, 2007-12-12 at 00:31 +0100, Rene Herman wrote:
> cc -W -Wall -O2 -o port80 port80.c

On a laptop with a CoreDuo T2080/1.73GHz, but running on battery at
800 MHz (on-demand):

(0)rukbat:~/tmp% for i in {1..10}; do
sudo ./port80
done
cycles: out 3575, in 2844
cycles: out 3589, in 2923
cycles: out 3672, in 2864
cycles: out 3575, in 2843
cycles: out 3607, in 2859
cycles: out 3623, in 2877
cycles: out 3604, in 2848
cycles: out 3575, in 2849
cycles: out 3598, in 2861
cycles: out 3613, in 2861

With a cpu-hog running, cpufreq reporting 1.73GHz:


(0)rukbat:~/tmp% for i in {1..10}; do
sudo ./port80
done
cycles: out 3446, in 2652
cycles: out 3499, in 2668
cycles: out 3395, in 2578
cycles: out 3452, in 2662
cycles: out 3448, in 2662
cycles: out 3622, in 2754
cycles: out 3457, in 2662
cycles: out 3451, in 2660
cycles: out 3581, in 2850
cycles: out 3477, in 2696

HTH,
Romano


2007-12-12 15:54:13

by Rene Herman

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 12:25, Jiri Slaby wrote:

Thanks for reporting! You have two results that are somewhat off:

> Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz
> cycles: out 6500, in 3135

6500 / 3000 = 2.17. Fairly high...

> Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
> cycles: out 3217, in 1898

3217 / 2000 = 1.61. Okay.

> Dual Core AMD Opteron(tm) Processor 275
> cycles: out 5508, in 5524

2200 MHz right?

5508 / 2200 = 2.50. Very high.

Nothing was dozing off in some ACPI sleep state or something?

Rene.

2007-12-12 16:17:21

by John Stoffel

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed


My results, PIII, Dual 550MHz Xeon.

jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 775, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 774, in 332
jfsnew:~/src> sudo ./port80
cycles: out 775, in 335


jfsnew:~/src> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 547.216
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips : 1095.01
clflush size : 32

2007-12-12 16:29:35

by Rene Herman

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 10:57, Romano Giannetti wrote:

> On a laptop with a CoreDuo T2080/1.73GHz, but running on battery at
> 800 MHz (on-demand):

> cycles: out 3575, in 2844

Okay, I'm going to ignore this one. This would be 4 microsecs but there are
sleep states involved, and if a piece of hardware would work only when the
system went to sleep I'd call it broken...

> With a cpu-hog running, cpufreq reporting 1.73GHz:

> cycles: out 3446, in 2652

So this one will do I guess. The test program disables interrupts and as
such, the cpu-hog shouldn't have interfered with the measurement. Thanks!

Rene.

2007-12-12 16:39:38

by Oliver Pinter

[permalink] [raw]
Subject: [RFT] Port 0x80 I/O speed

pancs:/tmp# for((i=0;i<20;i++)); do ./port80; done
cycles: out 4098, in 2532
cycles: out 3951, in 2389
cycles: out 4043, in 2485
cycles: out 4058, in 2393
cycles: out 4056, in 2509
cycles: out 4063, in 2394
cycles: out 4076, in 2508
cycles: out 4143, in 2395
cycles: out 4062, in 2502
cycles: out 4084, in 2383
cycles: out 4055, in 2510
cycles: out 4228, in 2410
cycles: out 4071, in 2508
cycles: out 4062, in 2502
cycles: out 3982, in 2398
cycles: out 4103, in 2391
cycles: out 4160, in 2404
cycles: out 4010, in 2462
cycles: out 4105, in 2449
cycles: out 4028, in 2477

---------

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 9
cpu MHz : 3150.045
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 6303.37
clflush size : 64

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 9
cpu MHz : 3150.045
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 6300.14
clflush size : 64

2007-12-12 16:54:17

by Ondrej Zary

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

On Wednesday 12 December 2007 00:31:18 Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to
> port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I can see
> measure the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>
> Rene.

Cyrix MII PR300 (225MHz), i430TX: cycles: out 263, in 93
Pentium MMX 166MHz @133MHz, VIA VPX: cycles: out 163, in 163
Celeron 433MHz, i440BX: cycles: out 620, in 305
Celeron 1.3GHz, i440BX: cycles: out 2114, in 849
Celeron 1.7GHz (P4-based), i845: cycles: out 2178, in 1651
Pentium 4 3.2GHz, i925X: cycles: out 2824, in 1899
Xeon E5310 1.6GHz, Dell PE1950 cycles: out 2631, in 1606
Xeon 3050 2.13GHz, Dell PE860 cycles: out 3367, in 1959

--
Ondrej Zary

2007-12-12 16:59:21

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [RFT] Port 0x80 I/O speed

[Rene Herman - Wed, Dec 12, 2007 at 12:31:18AM +0100]
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to
> port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I can see
> measure the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing with
> an -O2 compile should be most useful. Thanks!
>
> Rene.

| /* gcc -W -Wall -O2 -o port80 port80.c */
|
| #include <stdlib.h>
| #include <stdio.h>
|
| #include <sys/io.h>
|
| #define LOOPS 10000
|
| inline unsigned long long rdtsc(void)
| {
| unsigned long long tsc;
|
| asm volatile ("rdtsc": "=A" (tsc));
|
| return tsc;
| }
|
| inline void serialize(void)
| {
| asm volatile ("cpuid": : : "eax", "ebx", "ecx", "edx");
| }
|
| int main(void)
| {
| unsigned long long start;
| unsigned long long overhead;
| unsigned long long output;
| unsigned long long input;
| int i;
|
| if (iopl(3) < 0) {
| perror("iopl");
| return EXIT_FAILURE;
| }
|
| asm volatile ("cli");
| start = rdtsc();
| for (i = 0; i < LOOPS; i++) {
| serialize();
| serialize();
| }
| overhead = rdtsc() - start;
|
| start = rdtsc() + overhead;
| for (i = 0; i < LOOPS; i++) {
| serialize();
| asm volatile ("outb %al, $0x80");
| serialize();
| }
| output = rdtsc() - start;
|
| start = rdtsc() + overhead;
| for (i = 0; i < LOOPS; i++) {
| serialize();
| asm volatile ("inb $0x80, %%al": : : "al");
| serialize();
| }
| input = rdtsc() - start;
| asm volatile ("sti");
|
| output /= LOOPS;
| input /= LOOPS;
| printf("cycles: out %llu, in %llu\n", output, input);
|
| return EXIT_SUCCESS;
| }

Here we go (run 1000 times) ;)

---
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1427, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1449
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1427, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1449
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1449
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1449
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1429, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
cycles: out 1428, in 1450
---

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 36
model name : AMD Turion(tm) 64 Mobile Technology ML-30
stepping : 2
cpu MHz : 800.000
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm ts fid vid ttp tm stc
bogomips : 1601.04
clflush size : 64

Cyrill

2007-12-12 17:06:06

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Ondrej Zary wrote:
>
> Cyrix MII PR300 (225MHz), i430TX: cycles: out 263, in 93
> Pentium MMX 166MHz @133MHz, VIA VPX: cycles: out 163, in 163
> Celeron 433MHz, i440BX: cycles: out 620, in 305
> Celeron 1.3GHz, i440BX: cycles: out 2114, in 849
> Celeron 1.7GHz (P4-based), i845: cycles: out 2178, in 1651
> Pentium 4 3.2GHz, i925X: cycles: out 2824, in 1899
> Xeon E5310 1.6GHz, Dell PE1950 cycles: out 2631, in 1606
> Xeon 3050 2.13GHz, Dell PE860 cycles: out 3367, in 1959
>

So between 0.88 and 1.6 µs for the out case; 0.4 and 1.0 µs for the in case.
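
As a rough sketch of the arithmetic behind these figures: cycles divided by the clock in MHz gives microseconds. The function name cycles_to_us is made up for illustration, and it assumes the TSC ticks at the nominal CPU clock quoted in each report.

```c
#include <stdio.h>

/*
 * Convert a cycle count to microseconds.
 * 1 MHz is one cycle per microsecond, so cycles / MHz = microseconds.
 * Assumption: the TSC runs at cpu_mhz (not true if the clock was scaled).
 */
double cycles_to_us(unsigned long long cycles, double cpu_mhz)
{
    return (double)cycles / cpu_mhz;
}

/* Print one report line in microseconds instead of cycles. */
void print_report_us(const char *name, double cpu_mhz,
                     unsigned long long out_cycles,
                     unsigned long long in_cycles)
{
    printf("%s: out %.2f us, in %.2f us\n", name,
           cycles_to_us(out_cycles, cpu_mhz),
           cycles_to_us(in_cycles, cpu_mhz));
}
```

For the Pentium 4 line above, cycles_to_us(2824, 3200.0) comes to roughly 0.88 µs, and the Xeon 3050's 3367 cycles at 2130 MHz to roughly 1.58 µs.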

-hpa

2007-12-12 17:28:06

by Török Edwin

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile
> and run the attached program? This is about testing how long I/O port
> access to port 0x80 takes. It measures in CPU cycles so CPU speed is
> crucial in reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I
> can see measure the right thing though. Please yell if you disagree...
>
Hi,

Tested on 2 systems.

System I
------------
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 47
model name : AMD Athlon(tm) 64 Processor 3200+
stepping : 2
cpu MHz : 2000.000
cache size : 512 KB
fpu : yes
fpu_exception : yes

Motherboard: Asus A9N-E


With -m32:

edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 619, in 583
edwin@lightspeed2:~$ sudo ./port80
cycles: out 619, in 583
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067

After changing rdtsc() to use
__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
I get this with a 64-bit build:
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067
edwin@lightspeed2:~$ sudo ./port80
cycles: out 618, in 583
edwin@lightspeed2:~$ sudo ./port80
cycles: out 618, in 583
edwin@lightspeed2:~$ sudo ./port80
cycles: out 1107, in 1067

If I stop cpudyn I get a constant 618/583.
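
Since the figures move with the frequency governor, it may help to record the clock the kernel reports at the time of each run. The following is only a sketch: parse_cpu_mhz and current_cpu_mhz are hypothetical helpers, and it assumes a Linux /proc/cpuinfo with a "cpu MHz" line as in the dumps in this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/cpuinfo line.
 * Returns 1 and stores the value in *mhz if the line is a "cpu MHz" line,
 * otherwise returns 0 and leaves *mhz untouched.
 */
int parse_cpu_mhz(const char *line, double *mhz)
{
    const char *colon;

    if (strncmp(line, "cpu MHz", 7) != 0)
        return 0;
    colon = strchr(line, ':');
    if (colon == NULL)
        return 0;
    return sscanf(colon + 1, "%lf", mhz) == 1;
}

/* Return the first "cpu MHz" value from /proc/cpuinfo, or -1.0 on failure. */
double current_cpu_mhz(void)
{
    char line[256];
    double mhz = -1.0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1.0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_cpu_mhz(line, &mhz))
            break;
    }
    fclose(f);
    return mhz;
}
```

Printing current_cpu_mhz() next to each "cycles:" line would show whether a low reading coincides with a lower clock. The value is only a snapshot, so it can still change during the run.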

System II
------------

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 14
model name : Genuine Intel(R) CPU T2300 @ 1.66GHz
stepping : 8
cpu MHz : 1667.000
cache size : 2048 KB
physical id : 0
siblings : 2

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 14
model name : Genuine Intel(R) CPU T2300 @ 1.66GHz
stepping : 8
cpu MHz : 1667.000
cache size : 2048 KB
physical id : 0
siblings : 2


Dell Inspiron 6400, Intel Core Duo (ICH7 chipset)

thunder:/home/edwin# ./port80
cycles: out 2480, in 1867
thunder:/home/edwin# ./port80
cycles: out 2482, in 1865
thunder:/home/edwin# ./port80
cycles: out 2968, in 1893
thunder:/home/edwin# ./port80
cycles: out 1991, in 1372
thunder:/home/edwin# ./port80
cycles: out 1979, in 1366
thunder:/home/edwin# ./port80
cycles: out 2473, in 1865
thunder:/home/edwin# ./port80
cycles: out 2484, in 1869

After setting CPU governor to performance:

# ./port80
cycles: out 2368, in 1783
thunder:/home/edwin# ./port80
cycles: out 2377, in 1783
thunder:/home/edwin# ./port80
cycles: out 2367, in 1774
thunder:/home/edwin# ./port80
cycles: out 2370, in 1780
thunder:/home/edwin# ./port80
cycles: out 2365, in 1782
thunder:/home/edwin# ./port80
cycles: out 2369, in 1774
thunder:/home/edwin# ./port80
cycles: out 2366, in 1784
thunder:/home/edwin# ./port80
cycles: out 2379, in 1786
thunder:/home/edwin# ./port80
cycles: out 2367, in 1773
thunder:/home/edwin# ./port80
cycles: out 2376, in 1783
thunder:/home/edwin# ./port80
cycles: out 2360, in 1784
thunder:/home/edwin# ./port80
cycles: out 2367, in 1783
thunder:/home/edwin# ./port80
cycles: out 2370, in 1783
thunder:/home/edwin# ./port80
cycles: out 2382, in 1782

Also tried in a loop, but values are not constant:
while true; do ./port80; done
cycles: out 2415, in 1818
cycles: out 2405, in 1817
cycles: out 2414, in 1810
cycles: out 2411, in 1819
cycles: out 2407, in 1821
cycles: out 2410, in 1820
cycles: out 2418, in 1821
cycles: out 2408, in 1847
cycles: out 2404, in 1805
cycles: out 2411, in 1858
cycles: out 2395, in 1765
cycles: out 2377, in 1786
cycles: out 2378, in 1813
cycles: out 2395, in 1800
cycles: out 2381, in 1793
cycles: out 2382, in 1790
cycles: out 2399, in 1835
cycles: out 1928, in 1327
cycles: out 2410, in 1781
cycles: out 1996, in 1287
cycles: out 2369, in 1768
cycles: out 2401, in 1805
cycles: out 2395, in 1802
cycles: out 2389, in 1786
cycles: out 2359, in 1768
cycles: out 2495, in 1858
cycles: out 2408, in 1809
cycles: out 2919, in 1859
cycles: out 2404, in 1798
cycles: out 2393, in 1791
cycles: out 2882, in 1797
cycles: out 2404, in 1789
cycles: out 2406, in 1785
cycles: out 2393, in 1840
cycles: out 2498, in 1818
cycles: out 2402, in 1805
cycles: out 2888, in 1858
cycles: out 2397, in 1802
cycles: out 2411, in 1810
cycles: out 2396, in 1788
cycles: out 2362, in 1780
cycles: out 2861, in 1785
cycles: out 2380, in 1780
cycles: out 2357, in 1785
cycles: out 2342, in 1783
cycles: out 1916, in 1294
cycles: out 2358, in 1768
cycles: out 2371, in 1763
cycles: out 2386, in 1783
cycles: out 1919, in 1320
cycles: out 2355, in 1782
cycles: out 2330, in 1787
cycles: out 2350, in 1781
cycles: out 1881, in 1269
cycles: out 2378, in 1768
cycles: out 2381, in 1739
cycles: out 2365, in 1768
cycles: out 2362, in 1759
cycles: out 2368, in 1739
cycles: out 2354, in 1775
cycles: out 2375, in 1783
cycles: out 2369, in 1785
cycles: out 2361, in 1769
cycles: out 2382, in 1785
cycles: out 2370, in 1783


--Edwin

2007-12-12 18:40:00

by SL Baur

Subject: Re: [RFT] Port 0x80 I/O speed

On 12/11/07, Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.

model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
cpu MHz : 3201.345

cycles: out 3026, in 2204
cycles: out 3031, in 2182
cycles: out 3019, in 2196
cycles: out 3030, in 2201
cycles: out 3013, in 2186

-sb

2007-12-12 18:45:39

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Jiri Slaby wrote:
> On Dec 12, 2007 9:48 AM, Jiri Slaby wrote:
>> On 12/12/2007 12:31 AM, Rene Herman wrote:
>>> Good day.
>>>
>>> Would some people on x86 (both 32 and 64) be kind enough to compile and
>>> run the attached program? This is about testing how long I/O port access
>
> model name : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
> cycles: out 3217, in 1898

1.6 µs, on the high end....

> model name : Dual Core AMD Opteron(tm) Processor 275
> cycles: out 5508, in 5524

Definitely an outlier; 2.5 µs here.

-hpa

2007-12-12 18:54:55

by David P. Reed

Subject: Re: [RFT] Port 0x80 I/O speed

I have been having a fun time testing this on my AMD64x2 system.

Since outs to port 80 hang the system hard after a while, I can run a
test just after booting, but the next run will typically hang it.

I did also test two ports thought to be unused. They do *not* hang the
system. Thus apparently there is some device responding to port 80!

Running the (slightly modified to test 3 ports instead of just port 80)
test when the CPU is running at 800 MHz, here's what I get for port 80
and port ec and port ef.

port 80: cycles: out 1430, in 792
port ef: cycles: out 1431, in 1378
port ec: cycles: out 1432, in 1372

[Note: port 80, when it doesn't hang, which is very often, responds to a
read twice as fast as a port which is "not there". Typically though,
the second time I run the test, the system freezes solid. Seems like
evidence of a present device, not a missing one. Also, port 80 responds
when booted with acpi=off, but never seems to hang - sounds like ACPI
causes it to be active in some way]
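
The modified program itself was not posted. One possible way to time an arbitrary port is sketched below; it is not the code used for the numbers above. All names (port_op, op_nop, op_outb, op_inb, time_port_op) are invented for illustration, and it assumes x86 Linux with glibc's <sys/io.h>.

```c
#include <stdint.h>
#include <sys/io.h>   /* outb(), inb(): x86 Linux only */

/* An operation to be timed against a given I/O port. */
typedef void (*port_op)(unsigned short port);

unsigned long nop_calls;  /* number of baseline calls made so far */

/* Baseline: touches no port, used to estimate loop overhead. */
void op_nop(unsigned short port)  { (void)port; nop_calls++; }
/* Real accesses: need iopl(3) (root) before they may be called. */
void op_outb(unsigned short port) { outb(0, port); }
void op_inb(unsigned short port)  { (void)inb(port); }

uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

/* cpuid is a serializing instruction; the results are discarded. */
void serialize(void)
{
    uint32_t a = 0, b, c, d;

    __asm__ __volatile__ ("cpuid"
                          : "+a" (a), "=b" (b), "=c" (c), "=d" (d)
                          : : "memory");
    (void)a; (void)b; (void)c; (void)d;
}

/*
 * Average TSC ticks per op(port) call over `loops` iterations.
 * Loop and cpuid overhead is included; subtract the op_nop result
 * to approximate the cost of the port access alone.
 */
uint64_t time_port_op(port_op op, unsigned short port, int loops)
{
    uint64_t start = read_tsc();

    for (int i = 0; i < loops; i++) {
        serialize();
        op(port);
        serialize();
    }
    return (read_tsc() - start) / (uint64_t)loops;
}
```

A caller would first check iopl(3), then compare time_port_op(op_outb, p, 10000) against time_port_op(op_nop, p, 10000) for p in 0x80, 0xec and 0xef. Given the hangs described above, port 0x80 is probably best tried last.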
----------------------------

System info: HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
/proc/cpuinfo

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 72
model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping : 2
cpu MHz : 800.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy
bogomips : 1608.35
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 72
model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping : 2
cpu MHz : 800.000
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy
bogomips : 1608.35
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

2007-12-12 19:08:29

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 19:39, SL Baur wrote:

> model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
> cpu MHz : 3201.345
>
> cycles: out 3026, in 2204
> cycles: out 3031, in 2182
> cycles: out 3019, in 2196
> cycles: out 3030, in 2201
> cycles: out 3013, in 2186

Thank you. I just posted the combined results, although I see it's not made
it to the list yet, probably due to a somewhat overly broad CC...

These results fit the pattern.

Rene.

2007-12-12 19:21:14

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 12-12-07 19:44, H. Peter Anvin wrote:

>> model name : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
>> cycles: out 3217, in 1898
>
> 1.6 µs, on the high end....
>
>> model name : Dual Core AMD Opteron(tm) Processor 275
>> cycles: out 5508, in 5524
>
> Definitely an outlier; 2.5 µs here.

Jah, I don't trust that one. Just posted the complete results:

http://lkml.org/lkml/2007/12/12/309

Rene.

2007-12-12 19:30:12

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Kyle McMartin wrote:
> On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
>> asm volatile ("rdtsc": "=A" (tsc));
>
> rdtsc returns a 64-bit value in two 32-bit regs, you need to do
>
> inline unsigned long long rdtsc(void)
> {
> unsigned int lo, hi;
> asm volatile ("rdtsc": "=a" (lo), "=d" (hi));
> return (unsigned long long)hi << 32 | lo;
> }
>
> as in msr.h, otherwise you'll only be looking at the value in %rax.
>

"=A" works on 32-bit systems (only), obviously, and gcc will generally
produce slightly better code as a result (gcc could really use a
register renaming/copy propagation step *after* multi-register entities
are broken apart, at least on architectures which don't have register
pairs as a hardware constraint.)
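
One way to keep the "=A" form for 32-bit builds while still getting a correct value from a 64-bit build is to pick the constraint at compile time. This is only a sketch: read_tsc and tsc_combine are names invented here, and it assumes gcc or a compatible compiler that defines __i386__ or __x86_64__.

```c
#include <stdint.h>

/* Combine the two 32-bit halves that rdtsc leaves in edx:eax. */
uint64_t tsc_combine(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

#if defined(__i386__)
/* 32-bit: "A" names the edx:eax pair, so one 64-bit output is enough. */
uint64_t read_tsc(void)
{
    uint64_t tsc;

    __asm__ __volatile__ ("rdtsc" : "=A" (tsc));
    return tsc;
}
#elif defined(__x86_64__)
/* 64-bit: "A" no longer means the pair, so read both halves explicitly. */
uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return tsc_combine(lo, hi);
}
#endif
```

The two-register form is also correct on 32-bit, so the #if is only there to keep the possibly tighter "=A" code where it is valid.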

-hpa

2007-12-12 19:32:19

by Andi Kleen

Subject: Re: [RFT] Port 0x80 I/O speed

> "=A" works on 32-bit systems (only), obviously, and gcc will generally
> produce slightly better code as a result (gcc could really use a
> register renaming/copy propagation step *after* multi-register entities

I believe gcc 4.3 (or maybe 4.2) does that already -- it splits them much
earlier.

-Andi

2007-12-12 19:43:53

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Andi Kleen wrote:
>> "=A" works on 32-bit systems (only), obviously, and gcc will generally
>> produce slightly better code as a result (gcc could really use a
>> register renaming/copy propagation step *after* multi-register entities
>
> I believe gcc 4.3 (or maybe 4.2) does that already -- it splits them much
> earlier.

Probably 4.3, then. I don't see 4.2 doing that, or at least not very
successfully. Which is fine, I'm sure they had bigger fish to fry.
Good to hear it has gotten some attention recently.

-hpa

2007-12-12 21:19:21

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> On 12-12-07 01:09, Alejandro Riveira Fernández wrote:
>
>>>> On my AMD 3800 X2 (2000MHz) ULi M1697 2.6.24-rc5 i get:
>>>>
>>>> cycles: out 1844674407370808, in 1844674407369087
>>>>
>>>> It is not constant but variations are not significant afaics
>>> Eh, oh, I guess you need to compile as a 32-bit binary...
>>
>> I tried without -O2 as Nigel Cunningham...
>>
>> cycles: out 1562, in 865
>> cycles: out 1562, in 866
>> cycles: out 1555, in 858
>> cycles: out 1562, in 866
>>
>> With -m32 -O2
>> cycles: out 1566, in 876
>> cycles: out 1555, in 865
>> cycles: out 1594, in 931
>> cycles: out 1559, in 874
>
> Great, thanks much for reporting. Sort of interesting in itself that
> without -O2 you do still get correct results on 64-bit but for some
> other time.
>
> You're the first one to go significantly below 1 us it seems.

Make sure the CPU is actually running at full frequency.

It probably would have been better to have used gettimeofday() around a
sufficiently big loop, so that we would have gotten wall time rather
than cycles.
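
A minimal sketch of that wall-time approach, assuming a hypothetical helper
time_loop_us() that is not part of port80.c. It brackets a loop with
gettimeofday() and reports microseconds per iteration. For the real test the
callback would issue the outb to port 0x80 (which needs iopl(3) and root);
a stand-in callback is used here so the sketch runs unprivileged.

```c
#include <stddef.h>
#include <sys/time.h>

/* Microseconds elapsed between two gettimeofday() samples. */
double elapsed_us(struct timeval start, struct timeval end)
{
	return (end.tv_sec - start.tv_sec) * 1e6 +
	       (end.tv_usec - start.tv_usec);
}

/*
 * Hypothetical helper: call fn() `loops` times and return the average
 * wall time per call in microseconds.
 */
double time_loop_us(void (*fn)(void), unsigned long loops)
{
	struct timeval start, end;
	unsigned long i;

	gettimeofday(&start, NULL);
	for (i = 0; i < loops; i++)
		fn();
	gettimeofday(&end, NULL);

	return elapsed_us(start, end) / loops;
}

/* Stand-in for the port access; the real callback would do the outb. */
void dummy_access(void)
{
	static volatile unsigned long sink;

	sink++;
}
```

The loop count passed to time_loop_us() should be large enough that the
microsecond resolution of gettimeofday() does not dominate the result.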

-hpa

2007-12-12 21:33:04

by Jesper Juhl

Subject: Re: [RFT] Port 0x80 I/O speed

On 12/12/2007, Rene Herman <[email protected]> wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run
> the attached program? This is about testing how long I/O port access to port
> 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>

root@dragon:/home/juhl# uname -a
Linux dragon 2.6.24-rc3-g2ffbb837 #7 SMP PREEMPT Mon Nov 19 22:16:27
CET 2007 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
AuthenticAMD GNU/Linux
root@dragon:/home/juhl# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 35
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 2
cpu MHz : 2200.000
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips : 4401.67
clflush size : 64

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 35
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 2
cpu MHz : 2200.000
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips : 4399.52
clflush size : 64

root@dragon:/home/juhl# for i in $(seq 1 10); do ./port80 ; sleep 1 ; done
cycles: out 1839, in 1715
cycles: out 1728, in 1604
cycles: out 1762, in 1714
cycles: out 1770, in 1722
cycles: out 1759, in 1723
cycles: out 1764, in 1723
cycles: out 1762, in 1712
cycles: out 1761, in 1723
cycles: out 1771, in 1715
cycles: out 1770, in 1709


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2007-12-12 23:54:57

by Jan Engelhardt

Subject: Re: [RFT] Port 0x80 I/O speed


On Dec 12 2007 00:31, Rene Herman wrote:
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and run the
> attached program? This is about testing how long I/O port access to port 0x80
> takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>

Transmeta TM5800 CPU with nominal frequency 933 MHz, but it has a
hardware(!) 'ondemand' governor over the range of frequencies that
the user allowed scaling over, irrespective of the software governor.
(That is, if the CPU can do 300, 533 and 933 MHz, setting the
min/max to 300/533 will cause the hardware to 'ondemand' between 300
and 533 only, even if 'performance' is set on the software
side. So the only way to enforce 'performance' is to actually set
min=933.)

00:46 takeshi:../cpu0/cpufreq # echo 300000 >scaling_min_freq
00:46 takeshi:../cpu0/cpufreq # echo 933000 >scaling_max_freq
00:42 takeshi:/dev/shm # for ((x=1;x<=10;++x)); do ./port80 ; done
cycles: out 1514, in 584
cycles: out 1371, in 516
cycles: out 1291, in 472
cycles: out 1304, in 437
cycles: out 1308, in 410
cycles: out 1315, in 419
cycles: out 1315, in 419
cycles: out 1314, in 419
cycles: out 1315, in 420
cycles: out 1315, in 419
00:44 takeshi:/dev/shm # for ((x=1;x<=10;++x)); do ./port80 ; done
cycles: out 1511, in 596
cycles: out 1373, in 517
cycles: out 1293, in 470
cycles: out 1294, in 424
cycles: out 1304, in 414
cycles: out 1315, in 420
cycles: out 1313, in 418
cycles: out 1313, in 418
cycles: out 1314, in 420
cycles: out 1314, in 419

Increasing x above 10 yields only 1313-1315 values, so nothing more to see.
Note the values < 1313!! They only happen during scaling.
(I think scaling is the most important side exercise in this test,
even on hardware which does not have hardware governors - e.g. your
average Intel/AMD CPUs.)

Single frequencies:
00:46 takeshi:../cpu0/cpufreq # echo 933000 >scaling_min_freq
00:46 takeshi:../cpu0/cpufreq # echo 933000 >scaling_max_freq
00:46 takeshi:../cpu0/cpufreq # for ((x=1;x<=10;++x)); do /dev/shm/port80 ; done
cycles: out 1314, in 419
cycles: out 1315, in 420
cycles: out 1314, in 419
cycles: out 1312, in 417
cycles: out 1315, in 420
cycles: out 1314, in 419
cycles: out 1313, in 419
cycles: out 1313, in 419
cycles: out 1314, in 419
cycles: out 1312, in 418

00:47 takeshi:../cpu0/cpufreq # echo 533000 >scaling_min_freq
00:47 takeshi:../cpu0/cpufreq # echo 533000 >scaling_max_freq
00:49 takeshi:../cpu0/cpufreq # for ((x=1;x<=10;++x)); do /dev/shm/port80 ; done
cycles: out 1372, in 508
cycles: out 1370, in 509
cycles: out 1372, in 515
cycles: out 1372, in 503
cycles: out 1372, in 503
cycles: out 1370, in 509
cycles: out 1368, in 513
cycles: out 1372, in 516
cycles: out 1372, in 504
cycles: out 1372, in 516

00:47 takeshi:../cpu0/cpufreq # echo 300000 >scaling_min_freq
00:47 takeshi:../cpu0/cpufreq # echo 300000 >scaling_max_freq
00:48 takeshi:../cpu0/cpufreq # for ((x=1;x<=10;++x)); do /dev/shm/port80 ; done
cycles: out 1515, in 650
cycles: out 1516, in 649
cycles: out 1517, in 651
cycles: out 1513, in 645
cycles: out 1514, in 649
cycles: out 1516, in 644
cycles: out 1517, in 651
cycles: out 1516, in 644
cycles: out 1512, in 647
cycles: out 1514, in 649

2007-12-13 00:14:05

by Jan Engelhardt

Subject: Re: [RFT] Port 0x80 I/O speed


On Dec 13 2007 00:54, Jan Engelhardt wrote:
>
>Transmeta TM5800 CPU with nominal frequency 933 MHz, but it has a
>hardware(!) 'ondemand' governor over the range of frequencies that
>the user allowed scaling over, irrespective of the software governor.
>(That is, if the CPU can do 300,533 and 933 MHz, and setting the
>min/max to 300/533 will cause the hardware to 'ondemand' between 300
>and 533 only. And that even if 'performance' is set on the software
>side - so the only way to enforce 'performance' is to actually set
>min=933.)
>
>00:46 takeshi:../cpu0/cpufreq # echo 300000 >scaling_min_freq
>00:46 takeshi:../cpu0/cpufreq # echo 933000 >scaling_max_freq
>00:42 takeshi:/dev/shm # for ((x=1;x<=10;++x)); do ./port80 ; done
>cycles: out 1514, in 584
^ most likely 300 MHz state
>cycles: out 1371, in 516
^ 533 MHz state
>cycles: out 1291, in 472
^ ??
>cycles: out 1304, in 437
^ ??
>cycles: out 1308, in 410
^ ??
>cycles: out 1315, in 419
^ 933 MHz state
>cycles: out 1315, in 419
>cycles: out 1314, in 419
>cycles: out 1315, in 420
>cycles: out 1315, in 419

...this is in stark contrast to the values of other users, where,
skimming through the mails, higher CPU frequencies yield
higher out/in values, but not so on the TM5800. Highly interesting.

2007-12-13 02:08:45

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Jan Engelhardt wrote:
> On Dec 12 2007 00:31, Rene Herman wrote:
>> Would some people on x86 (both 32 and 64) be kind enough to compile and run the
>> attached program? This is about testing how long I/O port access to port 0x80
>> takes. It measures in CPU cycles so CPU speed is crucial in reporting.
>>
>
> Transmeta TM5800 CPU with nominal frequency 933 MHz, but it has a
> hardware(!) 'ondemand' governor over the range of frequencies that
> the user allowed scaling over, irrespective of the software governor.
> (That is, if the CPU can do 300,533 and 933 MHz, and setting the
> min/max to 300/533 will cause the hardware to 'ondemand' between 300
> and 533 only. And that even if 'performance' is set on the software
> side - so the only way to enforce 'performance' is to actually set
> min=933.)
>

Actually it has two different ones (economy and performance), you can
use the "longrun" utility to select which one.

Either way, all Transmeta processors have a fixed TSC, so there is no issue.
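
One way to see why a fixed TSC matters when reading the numbers above: with a
fixed-rate TSC the ratio of two cycle counts is directly the ratio of elapsed
times, whereas a TSC that scaled with the core clock would need each count
divided by its own frequency first. A rough sketch, with function names made
up for illustration, using the 300 MHz and 933 MHz "out" figures from the
single-frequency runs:

```c
/* Fixed-rate TSC: elapsed time is proportional to the cycle count. */
double time_ratio_fixed_tsc(double cycles_slow, double cycles_fast)
{
	return cycles_slow / cycles_fast;
}

/* Hypothetical scaling TSC: each count is divided by its own clock first. */
double time_ratio_scaling_tsc(double cycles_slow, double mhz_slow,
			      double cycles_fast, double mhz_fast)
{
	return (cycles_slow / mhz_slow) / (cycles_fast / mhz_fast);
}
```

For 1515 cycles at 300 MHz against 1314 cycles at 933 MHz, the fixed-TSC
reading says the access takes about 1.15 times as long at the low clock; the
scaling-TSC reading would put it at about 3.6 times.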

-hpa

2007-12-13 02:45:20

by H. Peter Anvin

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> On 12-12-07 06:23, Kyle McMartin wrote:
>
>> On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
>>> asm volatile ("rdtsc": "=A" (tsc));
>>
>> rdtsc returns a 64-bit value in two 32-bit regs, you need to do
>>
>> inline unsigned long long rdtsc(void)
>> {
>> unsigned int lo, hi;
>> asm volatile ("rdtsc": "=a" (lo), "=d" (hi));
>> return (unsigned long long)hi << 32 | lo;
>> }
>>
>> as in msr.h, otherwise you'll only be looking at the value in %rax.
>
> On 32-bit, "=A" is edx:eax. Not sure what the point is in not letting it
> be that on 64-bit in fact, but yes, the thing should be compiled as 32-bit.

On 64-bit, "=A" is rdx:rax, which means that a 64-bit value ends up in
rax only (a 128-bit value, which gcc does support on 64-bit platforms,
would be in rdx:rax.)

gcc can't deal internally with values that span partial registers.
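
To illustrate, a sketch of reading the TSC without relying on the A
constraint: the two 32-bit halves are fetched with separate constraints and
combined by hand, along the lines of the msr.h variant quoted above. The
names tsc_from_halves() and rdtsc_portable() are made up for this example,
and the rdtsc wrapper is only compiled on x86.

```c
#include <stdint.h>

/* Combine EDX (high half) and EAX (low half) into one 64-bit count. */
uint64_t tsc_from_halves(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

#if defined(__i386__) || defined(__x86_64__)
/* Same source for -m32 and -m64: no 64-bit register pair is assumed. */
uint64_t rdtsc_portable(void)
{
	uint32_t lo, hi;

	__asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
	return tsc_from_halves(lo, hi);
}
#endif
```

Going by the explanation above, a 64-bit build that uses the A constraint
effectively loses one of the two halves, which is the step
tsc_from_halves() makes explicit.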

-hpa

2007-12-13 15:20:06

by Jiri Slaby

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman napsal(a):
> On 12-12-07 19:44, H. Peter Anvin wrote:
>
>>> model name : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
>>> cycles: out 3217, in 1898
>>
>> 1.6 µs, on the high end....
>>
>>> model name : Dual Core AMD Opteron(tm) Processor 275
>>> cycles: out 5508, in 5524
>>
>> Definitely an outlier; 2.5 µs here.

Now, it's around
cycles: out 4763, in 5515

It's
cpu MHz : 2205.209
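
For comparison with the microsecond figures quoted above, the conversion is
just cycles divided by the clock in MHz. A small sketch, assuming the TSC
ticks at the reported "cpu MHz" rate (the function name is illustrative):

```c
/* Convert a TSC cycle count to microseconds; mhz is cycles per microsecond. */
double cycles_to_us(unsigned long long cycles, double mhz)
{
	return (double)cycles / mhz;
}
```

With the numbers above, cycles_to_us(4763, 2205.209) is about 2.16 µs for
out and cycles_to_us(5515, 2205.209) is about 2.50 µs for in.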

2007-12-13 16:42:52

by Ville Syrjälä

Subject: Re: [RFT] Port 0x80 I/O speed

On Wed, Dec 12, 2007 at 01:57:01PM +0200, Ville Syrjälä wrote:
> On Wed, Dec 12, 2007 at 12:31:18AM +0100, Rene Herman wrote:
> > Good day.
> >
> > Would some people on x86 (both 32 and 64) be kind enough to compile and run
> > the attached program? This is about testing how long I/O port access to
> > port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> > reporting.
>
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
> stepping : 6
>
> cpu MHz : 2394.000
> cycles: out 1830, in 1166
>
> cpu MHz : 1596.000
> cycles: out 1925, in 1266

BTW that was my home desktop. I just noticed that my work desktop gives
significantly different results.

vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
stepping : 6

cpu MHz : 2667.000
cycles: out 4263, in 2487

cpu MHz : 1600.000
cycles: out 4391, in 2623

There seems to be more variation in the results between runs on this system
too. I suppose the difference must be down to the chipset.

My home system:
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)

My work system:
00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)

--
Ville Syrjälä
[email protected]
http://www.sci.fi/~syrjala/

2007-12-13 17:30:11

by James Kosin

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and
> run the attached program? This is about testing how long I/O port access
> to port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.
>
> Posted a previous incarnation of this before, buried in the outb 0x80
> thread which had a serialising problem. This one should as far as I can
> see measure the right thing though. Please yell if you disagree...
>
> For me, on a Duron 1300 (AMD756 chipset) I have a constant:
>
> rene@7ixe4:~/src/port80$ su -c ./port80
> cycles: out 2400, in 2400
>
> and on a PII 400 (Intel 440BX chipset) a constant:
>
> rene@6bap:~/src/port80$ su -c ./port80
> cycles: out 553, in 251
>
> Results are (mostly) independent of compiler optimisation, but testing
> with an -O2 compile should be most useful. Thanks!
>
> Rene.
>
[root@beta jkosin]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 499.156
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov pat pse36 mmx fxsr sse
bogomips : 996.14

[root@beta jkosin]# ./a.out
cycles: out 683, in 299
[root@beta jkosin]# ./a.out
cycles: out 683, in 299
[root@beta jkosin]# ./a.out
cycles: out 683, in 299
[root@beta jkosin]#

--
Scanned by ClamAV - http://www.clamav.net

2007-12-13 22:08:49

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 13-12-07 17:27, James Kosin wrote:

> model name : Pentium III (Katmai)
> cpu MHz : 499.156

> [root@beta jkosin]# ./a.out
> cycles: out 683, in 299

Thanks much, you made spot 32 ;-)

http://lkml.org/lkml/2007/12/12/309

By the way, if anyone in this/these thread(s) wrote something today (*) they
would like me to reply to -- my horseshit ISP is completely crapping out on
me and I haven't seen your message. Resends of course welcome.

(*) [now - 16 hours, now]

Rene.

2007-12-13 22:31:01

by Jesper Juhl

Subject: Re: [RFT] Port 0x80 I/O speed

On 13/12/2007, Rene Herman <[email protected]> wrote:
> On 13-12-07 17:27, James Kosin wrote:
>
> > model name : Pentium III (Katmai)
> > cpu MHz : 499.156
>
> > [root@beta jkosin]# ./a.out
> > cycles: out 683, in 299
>
> Thanks much, you made spot 32 ;-)
>
> http://lkml.org/lkml/2007/12/12/309
>
> By the way, if anyone in this/these thread(s) wrote something today (*) they
> would like me to reply to -- my horseshit ISP is completely crapping out on
> me and I haven't seen your message. Resends of course welcome.
>
Don't know if you saw mine (at least it's not on your list), but it's
archived here: http://lkml.org/lkml/2007/12/12/399


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2007-12-13 22:39:17

by Rene Herman

Subject: Re: [RFT] Port 0x80 I/O speed

On 13-12-07 23:30, Jesper Juhl wrote:

> On 13/12/2007, Rene Herman <[email protected]> wrote:

>> http://lkml.org/lkml/2007/12/12/309
>>
>> By the way, if anyone in this/these thread(s) wrote something today (*) they
>> would like me to reply to -- my horseshit ISP is completely crapping out on
>> me and I haven't seen your message. Resends of course welcome.
>>
> Don't know if you saw mine (at least it's not on your list), but it's
> archived here: http://lkml.org/lkml/2007/12/12/399

Nope didn't, thanks. Same CPU as the unexpected 59 spot it seems, but yours
with an expected 0.80 figure.
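
The 0.80 figure appears to be microseconds per out. As a rough cross-check
(a sketch only; mean_cycles() and out_runs are names invented here),
averaging the ten "out" values Jesper posted earlier in the thread and
dividing by his 2200 MHz clock lands at about that number:

```c
#include <stddef.h>

/* Arithmetic mean of a series of per-run cycle counts. */
double mean_cycles(const unsigned int *runs, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += runs[i];
	return n ? sum / n : 0.0;
}

/* "out" column of the ten runs reported for the Athlon 64 X2 4400+. */
const unsigned int out_runs[] = {
	1839, 1728, 1762, 1770, 1759, 1764, 1762, 1761, 1771, 1770
};
```

mean_cycles(out_runs, 10) is 1768.6 cycles, and 1768.6 / 2200 is roughly
0.80 µs.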

Rene.

2007-12-14 14:08:18

by James Kosin

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote:
> On 13-12-07 17:27, James Kosin wrote:
>
>> model name : Pentium III (Katmai)
>> cpu MHz : 499.156
>
>> [root@beta jkosin]# ./a.out
>> cycles: out 683, in 299
>
> Thanks much, you made spot 32 ;-)
>
> http://lkml.org/lkml/2007/12/12/309
>
> By the way, if anyone in this/these thread(s) wrote something today (*)
> they would like me to reply to -- my horseshit ISP is completely
> crapping out on me and I haven't seen your message. Resends of course
> welcome.
>
> (*) [now - 16 hours, now]
>
> Rene.
Maybe I should have added the lspci output to the list of information.
The kernel version is the latest 2.4 kernel.

[root@beta root]# lspci
00:00.0 Host bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host bridge
(rev 03)
00:01.0 PCI bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge
(rev 03)
00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
(rev 10)
00:0a.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
00:0b.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro
215GP (rev 5c)


2007-12-22 22:27:17

by Bauke Jan Douma

Subject: Re: [RFT] Port 0x80 I/O speed

Rene Herman wrote on 12-12-07 00:31:
> Good day.
>
> Would some people on x86 (both 32 and 64) be kind enough to compile and
> run the attached program? This is about testing how long I/O port access
> to port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
> reporting.

Compiled as you wished.
And here are the results from 10 iterations...:

cycles: out 1965, in 1263
cycles: out 1968, in 1251
cycles: out 1957, in 1257
cycles: out 1992, in 1253
cycles: out 1959, in 1248
cycles: out 1965, in 1264
cycles: out 1957, in 1256
cycles: out 1959, in 1248
cycles: out 1962, in 1298
cycles: out 1962, in 1275

Linux skyscraper 2.6.23.11 #1 SMP Sun Dec 16 11:54:12 CET 2007 i686 GNU/Linux

32-bit system

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
stepping : 7
cpu MHz : 2400.182
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr s
bogomips : 4802.73
clflush size : 64

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
stepping : 7
cpu MHz : 2400.182
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr s
bogomips : 10614.49
clflush size : 64

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
stepping : 7
cpu MHz : 2400.182
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr s
bogomips : 4800.17
clflush size : 64

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
stepping : 7
cpu MHz : 2400.182
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr s
bogomips : 4800.17
clflush size : 64