2006-05-19 06:17:39

by John Richard Moser

Subject: Stealing ur megahurts (no, really)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I saw someone bench Windows XP on a 66MHz CPU with 20MB RAM (Pentium
clocked down), and had an interesting thought.. try to bear with me on
this one.

Oftentimes software engineers will seek out old machines, like 486DX-based
or Pentium Pro boxes, or whatever they can find. This is because of
something apparently called "Requirements Engineering," which a colleague
of mine explained to me as a bunch of programming students stopping by his
dorm to use his slow-ass machine so they can figure out what the
"Minimum System Requirements" for their projects should be.

Scrambling for an old machine is ridiculous. Down-clocking makes sense
because you can adjust to varied levels; but it's difficult and usually
infeasible. Pulling memory and mixing and matching is not much better.

On Linux we have mem= to toy with memory, which I personally HAVE used
to evaluate how various distributions and releases of GNOME operate
under memory pressure. This is a lot more convenient than pulling chips
and trying to find the right combination. This option was, apparently,
designed for situations where actual system memory capacity is
mis-detected (Mandrake 7.2 and its insistence that a 256M memory stick
is 255M....); but it is very useful in this application too.

This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
Obviously we can't do this directly, as convenient as this would be; but
the idea warrants some thought, and some thought I gave it. What I came
up with was simple: Adjust time slice length and place a delay between
time slices so they're evenly spaced.

The idea here would be to determine first the ratio of CPU real MHz to
target MHz and use that for adjustment. We need to calculate a number
of things to do this:

- The normal length of a time slice, in ms. I believe this is 20ms.
- The normal number of time slices per second. This should be 50.
- The ratio of real_mhz to target_mhz
- The new length of a time slice, (target_mhz/real_mhz) * 20ms
- The spacing between time slices, 20 - time_slice (*)

(*) (1000ms - (time_slice * time_slices)) / time_slices
    == 1000ms/time_slices - ((time_slice*time_slices) / time_slices)
    == 1000/time_slices - time_slice
    == 20 - time_slice WHERE time_slices = 50 per second

What this means is the scheduler sits idle between time slices, and uses
shorter time slices that reflect what a full time slice would have time
to do on a lower clocked CPU. This of course isn't perfect, but it's close.

A modern 2GHz CPU with cpumhz=200 would result in the following data:

time_slice = 2ms
time_slices = 50 per second
delay_between_slices = 18ms

This means 1/10 of the time (2 of every 20ms) is allotted to a time
slice, and the CPU sits idle for the other 9/10. This would somewhat
simulate a 200MHz system. Probably not very well though.
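
The arithmetic above can be written out as a small C sketch. It is only
an illustration, not kernel code: plan_throttle(), print_plan() and
struct throttle_plan are names made up for the example, the 20ms base
slice is passed in by the caller, and no real cpumhz= parameter exists.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result type; nothing like this exists in the kernel. */
struct throttle_plan {
    double slice_ms;        /* shortened time slice */
    double delay_ms;        /* idle gap inserted after each slice */
    double slices_per_sec;  /* unchanged: 1000 / base slice length */
};

/*
 * Scale the base time slice by target_mhz / real_mhz and fill the rest
 * of each base-slice period with idle time, as proposed above.
 */
struct throttle_plan plan_throttle(double real_mhz, double target_mhz,
                                   double base_slice_ms)
{
    struct throttle_plan p;

    assert(real_mhz > 0 && target_mhz > 0 && target_mhz <= real_mhz);
    assert(base_slice_ms > 0);

    p.slice_ms = (target_mhz / real_mhz) * base_slice_ms;
    p.delay_ms = base_slice_ms - p.slice_ms;
    p.slices_per_sec = 1000.0 / base_slice_ms;
    return p;
}

/* Print a plan in the same shape as the worked example. */
void print_plan(const struct throttle_plan *p)
{
    printf("time_slice = %.1fms\n", p->slice_ms);
    printf("time_slices = %.0f per second\n", p->slices_per_sec);
    printf("delay_between_slices = %.1fms\n", p->delay_ms);
}
```

Calling plan_throttle(2000.0, 200.0, 20.0) reproduces the 2GHz,
cpumhz=200 case: a 2ms slice, an 18ms delay, 50 slices per second.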

Questions? Comments? Particular ideas on what would happen?

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

Creative brains are a valuable, limited resource. They shouldn't be
wasted on re-inventing the wheel when there are so many fascinating
new problems waiting out there.
-- Eric Steven Raymond

We will enslave their women, eat their children and rape their
cattle!
-- Bosc, Evil alien overlord from the fifth dimension
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBRG1h7As1xW0HCTEFAQKeug//ei2tmEZpwexhhm1mh3VE693VZWE2rryG
Dnfq8g08RT0SnmS5GPXLcux/tyjLnt4aSI4MthCWpPrz32C+j3dkwTHz+HAW8/hT
Dl4om6lHL1cFGJG7asbCfjjA6TEYU2yIAWBbCYyrdoFU64Xt4Itb6oBwLkJcVwVJ
VqYbKmF2uc6VBblWlDUqSFs+imlwMzlli6BiUyUC1cm8y4GibQhFkcvoNBMv9b3d
Zzag+Azhs/Bc7RT69tSFOzlRlImAMARrhtZVrANCMo3cf8klvtXunBQe3YS1ygFq
yB/GdZC/tzWkds7aSpTniAeuzNC4/kfCFdV9p/V3ZBTBM9xkcque+OE/p5E3YGZY
b3HqNXXBqP+AjPba5vFti4wQMM7ZzPFouw5CmIGlOdjIRPzGb22HWGMikMtmcrnt
uD+2olJHIk9cJ5wK9A7ForY9eldyELTJTa+Vn5rUe7jZAEUPil17E0gktchYC1es
NPrerDP1KDANdnIFHjhA9uZPcehtj594Dbk3Mgrq55Rsn5IuKmGntfSnc4aI1YiH
uqi5J++pH/UfhlkdUOwZlRbiHxl+wjPN+4PGGUqXnknJphLHiSSRzEAzmlm9TDP3
bovPzD+Q27ddXjs3s28W94QvSHzdAP2dyqrWb55E9LmAUIBtLBLt+jrHdfhcRIJc
Zxr5jX514Y8=
=Mc6F
-----END PGP SIGNATURE-----


2006-05-19 10:10:09

by Matti Aarnio

Subject: Re: Stealing ur megahurts (no, really)

On Fri, May 19, 2006 at 02:13:02AM -0400, John Richard Moser wrote:
...
> On Linux we have mem= to toy with memory, which I personally HAVE used
> to evaluate how various distributions and releases of GNOME operate
> under memory pressure. This is a lot more convenient than pulling chips
> and trying to find the right combination. This option was, apparently,
> designed for situations where actual system memory capacity is
> mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
> is 255M....); but is very useful in this application too.
>
> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
> Obviously we can't do this directly, as convenient as this would be; but
> the idea warrants some thought, and some thought I gave it. What I came
> up with was simple: Adjust time slice length and place a delay between
> time slices so they're evenly spaced.
...
> Questions? Comments? Particular ideas on what would happen?

Modern machines have the ability to be "speed controlled". Perhaps
they can only cut their speed by 1/3 or 1/2, but they do run slower
in the name of energy conservation.


Another approach (not thinking about multiprocessor systems now)
is to somehow gobble up system performance into some "hoarder"
(highest scheduling priority, eats up 90% of time slices doing
an excellent job of wasting CPU resources..)

Combine that with the internal timer ticking at 1000 or 1024 Hz, and
you do get a fairly good approximation of a machine running at 1/10
of its real speed.
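
A user-space version of such a hoarder might look like the sketch
below. It is an illustration under stated assumptions, not anything
from the kernel: hoard_cpu() and elapsed_ns() are made-up names, the
loop spins for a chosen percentage of each period and sleeps for the
rest, and it relies on POSIX clock_gettime() and nanosleep(). Raising
the task to the highest scheduling priority is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Nanoseconds elapsed on the monotonic clock since *start. */
long long elapsed_ns(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000000000LL
         + (now.tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical hoarder: burn busy_pct percent of every period_ns
 * window in a busy loop, then sleep for the remainder.
 * Returns the number of periods completed.
 */
int hoard_cpu(int busy_pct, long period_ns, int periods)
{
    assert(busy_pct >= 0 && busy_pct <= 100);
    assert(period_ns > 0 && period_ns < 1000000000L);

    long long busy_ns = (long long)period_ns * busy_pct / 100;
    struct timespec idle = {
        .tv_sec = 0,
        .tv_nsec = (long)(period_ns - busy_ns),
    };
    int done = 0;

    for (; done < periods; done++) {
        struct timespec start;

        clock_gettime(CLOCK_MONOTONIC, &start);
        while (elapsed_ns(&start) < busy_ns)
            ;                       /* waste CPU on purpose */
        nanosleep(&idle, NULL);     /* leave the rest to other tasks */
    }
    return done;
}
```

For example, hoard_cpu(90, 10000000L, 100) would spin for 9ms out of
every 10ms for roughly one second.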

Kernel IO tasks might skew statistics a bit, but that is another story.


In multiprocessor systems, similar hoarders work when combined with
CPU affinity: one hoarder for each processor.

/Matti Aarnio

2006-05-19 11:02:11

by Panagiotis Issaris

Subject: Re: Stealing ur megahurts (no, really)

Hi,

John Richard Moser wrote:

>[...]
>Scrambling for an old machine is ridiculous. Down-clocking makes sense
>because you can adjust to varied levels; but it's difficult and usually
>infeasible. Pulling memory and mix and matching is not much better.
>
>On Linux we have mem= to toy with memory, which I personally HAVE used
>to evaluate how various distributions and releases of GNOME operate
>under memory pressure. This is a lot more convenient than pulling chips
>and trying to find the right combination. This option was, apparently,
>designed for situations where actual system memory capacity is
>mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
>is 255M....); but is very useful in this application too.
>[...]
>
>
An easier way might be to use a system emulator like Qemu.
You can specify the amount of memory the emulated system has,
and if you do not use the kernel accelerating module (kqemu)
it slows down considerably.

Of course, it would be nicer if you could actually specify performance
levels. Another issue with this approach is that it does not uniformly
scale down performance: I think IO emulation performance is a lot worse
than CPU emulation performance (in Qemu).

With friendly regards,
Takis

2006-05-19 11:22:24

by Dr. David Alan Gilbert

Subject: Re: Stealing ur megahurts (no, really)

* John Richard Moser ([email protected]) wrote:

> Scrambling for an old machine is ridiculous. Down-clocking makes sense
> because you can adjust to varied levels; but it's difficult and usually
> infeasible. Pulling memory and mix and matching is not much better.

<...>

> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
> Obviously we can't do this directly, as convenient as this would be; but
> the idea warrants some thought, and some thought I gave it. What I came
> up with was simple: Adjust time slice length and place a delay between
> time slices so they're evenly spaced.

<...>

Hi John,
While CPU downclocking helps a bit, it would be hopelessly inaccurate
for figuring out if your app would run fast enough on the given
ancient machine. A lot else has happened to the world since the days
of the 200MHz CPU:
* Faster memory
* Larger caches
* Faster PCI busses
* Instruction set additions (various more levels of SSE etc)
* Faster discs
* Changes to the CPU architecture/implementation

Still, it would be interesting to see the difference in performance
of a downclocked modern processor and its 10 year old clock equivalent.

Dave

--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/

2006-05-19 11:43:52

by linux-os (Dick Johnson)

Subject: Re: Stealing ur megahurts (no, really)


On Fri, 19 May 2006, Dr. David Alan Gilbert wrote:

> * John Richard Moser ([email protected]) wrote:
>
>> Scrambling for an old machine is ridiculous. Down-clocking makes sense
>> because you can adjust to varied levels; but it's difficult and usually
>> infeasible. Pulling memory and mix and matching is not much better.
>
> <...>
>
>> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
>> Obviously we can't do this directly, as convenient as this would be; but
>> the idea warrants some thought, and some thought I gave it. What I came
>> up with was simple: Adjust time slice length and place a delay between
>> time slices so they're evenly spaced.
>
> <...>
>
> Hi John,
> While cpu downclocking helps a bit, it would be hopelessly inaccurate
> for figuring out if your app would run fast enough on the given
> ancient machine. A lot else has happened to the world since the days
> of the 200MHz CPU:
> * Faster memory
> * Larger caches
> * Faster PCI busses
> * Instruction set additions (various more levels of SSE etc)
> * Faster discs
> * Changes to the CPU architecture/implementation
>
> Still, it would be interesting to see the difference in performance
> of a downclocked modern processor and its 10 year old clock equivalent.
>
> Dave
>
> --
> -----Open up your eyes, open up your mind, open up your code -------
> / Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
> \ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
> \ _________________________|_____ http://www.treblig.org |_______/
> -

You can readily slow down a machine by creating a bunch of tasks
that just do:

/* Busy loop: never blocks, so it burns its whole time slice. */
int main(void)
{
        for (;;)
                ;       /* spin until preempted */
}

They will use their entire time slice until preempted. If you want
it slower, create more such tasks. FYI, `top` should show them
all getting the same amount of CPU time. If they don't, the
scheduler is broken!
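
How many such tasks are needed for a given slowdown can be estimated
if one assumes (and this is an assumption, not something the scheduler
guarantees) that N equal-priority spinners plus the program under test
each get 1/(N+1) of a single CPU. The helper names below are made up
for this sketch; it uses plain POSIX fork(), kill() and waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Spinners needed so that one extra task sees about target/real of
 * the CPU: share = 1 / (n + 1), so n = real/target - 1, rounded up.
 */
int spinners_needed(int real_mhz, int target_mhz)
{
    assert(real_mhz > 0 && target_mhz > 0 && target_mhz <= real_mhz);
    return (real_mhz + target_mhz - 1) / target_mhz - 1;
}

/* Fork up to n busy-loop children; returns how many were started. */
int start_spinners(pid_t *pids, int n)
{
    int started = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid < 0)
            break;              /* out of processes: stop early */
        if (pid == 0) {
            for (;;)
                ;               /* child: spin until killed */
        }
        pids[started++] = pid;
    }
    return started;
}

/* Kill and reap the children started by start_spinners(). */
void stop_spinners(const pid_t *pids, int n)
{
    for (int i = 0; i < n; i++) {
        kill(pids[i], SIGKILL);
        waitpid(pids[i], NULL, 0);
    }
}
```

Under that fair-share assumption, spinners_needed(2000, 200) gives 9:
nine spinners leave the tested program about a tenth of a 2GHz CPU.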

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.4 on an i686 machine (5592.89 BogoMips).
New book: http://www.AbominableFirebug.com/
_



2006-05-19 15:07:15

by Lexington Luthor

Subject: Re: Stealing ur megahurts (no, really)

Panagiotis Issaris wrote:
> An easier way might be to use a system emulator like Qemu.
> You can specify the amount of memory the emulated system has,
> and if you do not use the kernel accelerating module (kqemu)
> it slows down considerably.
>
> Of course, it would be nicer if you could actually specify performance
> levels and an issue with this approach is that it does not uniformly
> scale down performance: I think IO emulation performance is a lot worse
> then CPU emulation performance (in Qemu).
>

I have in the past used Bochs to precisely control the speed at
which I run an emulated machine. Though this was for some DOS app which
insisted on a ~20MHz CPU, there is no reason this would not work for a
Linux system emulated by Bochs.

http://bochs.sourceforge.net/

Regards,
LL

2006-05-19 17:25:48

by John Richard Moser

Subject: Re: Stealing ur megahurts (no, really)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Matti Aarnio wrote:
> On Fri, May 19, 2006 at 02:13:02AM -0400, John Richard Moser wrote:
> ...
>> On Linux we have mem= to toy with memory, which I personally HAVE used
>> to evaluate how various distributions and releases of GNOME operate
>> under memory pressure. This is a lot more convenient than pulling chips
>> and trying to find the right combination. This option was, apparently,
>> designed for situations where actual system memory capacity is
>> mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
>> is 255M....); but is very useful in this application too.
>>
>> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
>> Obviously we can't do this directly, as convenient as this would be; but
>> the idea warrants some thought, and some thought I gave it. What I came
>> up with was simple: Adjust time slice length and place a delay between
>> time slices so they're evenly spaced.
> ...
>> Questions? Comments? Particular ideas on what would happen?
>
> Modern machines have ability to be "speed controlled" - Perhaps
> they can cut their speed by 1/3 or 1/2, but run slower anyway
> in the name of energy conservation.
>

Not fine-grained enough. A 1.8GHz desktop Athlon 64 can run at 600MHz or
1.8GHz. A laptop CPU may run at 2.0GHz, 1.4GHz, 600MHz, or 400MHz.
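
To put a number on the granularity problem, the sketch below picks the
closest available step for a wanted clock rate. The frequency tables
are the ones quoted in this message; nearest_step() is a made-up helper
and does not query cpufreq or any real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Frequency steps in MHz, taken from the examples in this message. */
const int desktop_steps_mhz[] = { 1800, 600 };
const int laptop_steps_mhz[]  = { 2000, 1400, 600, 400 };

/* Return the step closest to target_mhz (the first one wins a tie). */
int nearest_step(const int *steps_mhz, size_t n, int target_mhz)
{
    assert(steps_mhz != NULL && n > 0);

    int best = steps_mhz[0];

    for (size_t i = 1; i < n; i++) {
        if (abs(steps_mhz[i] - target_mhz) < abs(best - target_mhz))
            best = steps_mhz[i];
    }
    return best;
}
```

For a 200MHz target the closest the laptop table offers is 400MHz, and
the desktop table 600MHz, so frequency scaling alone overshoots by a
factor of two or three.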

>
> Another approach (not thinking on multiprocessor systems now)
> is to somehow gobble up system performance into some "hoarder"
> (highest scheduling priority, eats up 90% of time slices doing
> excellent waste of CPU resources..)

Possible, but it could create other issues.

>
> Combine that with internal timer ticking at 1000 or 1024 Hz, and
> you do get fairly good approximation of a machine running at 1/10
> of its real speed.
>
> Kernel IO tasks might skew statistics a bit, but that is another story.
>

Yeah, also a thought. With my approach you would still have interrupts
et al. to account for, since even on a slow system we should still have
<10us response time there.

>
> In multiprocessor systems similar hoarders do work combined with
> CPU Affinity - one hoarder for each processor.
>
> /Matti Aarnio
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBRG3+gQs1xW0HCTEFAQKK0g/+OPCuhqnT+CC0hhcu3fEAcrW9KlAi1/P5
W1zkx1R3zxIsnDmByNigcRSRKdDPQw/qczWtyiQFT3oAr1P2wrYJhv7IsAc1UQ5n
v7xbQt9lXVIZMolpkoctP8Xdv2oINnHjQFbvKPyNINvigff7Ow5E9ADsi7igbfjG
NmHVWe6a8DhEs9SP1Q6HLkHGvaNSS8S7KXGgT2UlwWtx4AlQL7OLq8nhIgHDWtZq
ltw9NDDgLjG2CTeEW3TNMgDZ2QOE1nGRsk44b4En5+iGXW7d8cLa9HFkeDA6Skpz
6wf3R8XRRpxD9dKB7n4ex6Qq4YK45z4xSvHRsLKQTnxh9UMmeZpCaTjrGm+0abak
CLJXVmzvj20f3wB47J9kSOphdBAX5hQSso1d9V+YWh7WQ9Kkp0LSXyOWdZOAFzCX
Hlgxv9djmNic85IOdnvd++zKf/EeqHBz2/Mk6Fpb5+Sh6YfYrcHnqlFswCn5guR8
GxBXYR1toCT3eeDhbVJXD0oqgLSLh7SMwkDQhERj2nHTiBfmtUDO8er9NqZl6Yf/
AH3/HibLFYYNIAkNGsCxVJ8exoA/sz283kwtYVgG+qJzoMGQaqcBqLeIwd7mp0XX
CJYjENB3uGe6RoWUNHPYoAG68G7WgI9L9U2DRokXkAldX1Fc+u7Ed7GhWrgbisqK
sEDXdtnkMbg=
=FcTe
-----END PGP SIGNATURE-----

2006-05-19 17:26:40

by John Richard Moser

Subject: Re: Stealing ur megahurts (no, really)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Panagiotis Issaris wrote:
> Hi,
>
> John Richard Moser wrote:
>
>> [...]
>> Scrambling for an old machine is ridiculous. Down-clocking makes sense
>> because you can adjust to varied levels; but it's difficult and usually
>> infeasible. Pulling memory and mix and matching is not much better.
>>
>> On Linux we have mem= to toy with memory, which I personally HAVE used
>> to evaluate how various distributions and releases of GNOME operate
>> under memory pressure. This is a lot more convenient than pulling chips
>> and trying to find the right combination. This option was, apparently,
>> designed for situations where actual system memory capacity is
>> mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
>> is 255M....); but is very useful in this application too.
>> [...]
>>
>>
> An easier way might be to use a system emulator like Qemu.
> You can specify the amount of memory the emulated system has,
> and if you do not use the kernel accelerating module (kqemu)
> it slows down considerably.
>

Yes but it slows down to like a 25MHz system :)

> Of course, it would be nicer if you could actually specify performance
> levels and an issue with this approach is that it does not uniformly
> scale down performance: I think IO emulation performance is a lot worse
> then CPU emulation performance (in Qemu).

Yep.

>
> With friendly regards,
> Takis
>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBRG3+tgs1xW0HCTEFAQJ2ow/+KOKJNdX5d1pSZ36u+lT6AIs8yN0LSbS5
VVKFJLMrZWf0BfRgvxyGjazlEXbIaYeaSXYGr5+Gkbztr0cJ1/WyyO9dhEvVS8uq
YJP9GAjfcsmrtLZT49jM/9XimK2xGgAFExmbxmEXFtGrcVanzFA/zvSiqbJmUMEt
z+4BR7wKX/Q+iBKrSLibCTLlzpstI8YXhZxMVR2ZOfFU18nl7Pv5Y9sUB6EUKu2V
6B1eT0pBQ+bLtKhsNbOIOvGUpzkpe/bHAqBzxYLugBclmyM3SFoncHDXpg08qoVm
LIXJi0Y/QJQovQlbzRz+Xse0IibBCPd4+jGNjk6/fkIvVqZVvGsLFHrmb5S0v4W7
qBB4atl7w7tb29J/gzPqAlqaqc/eNI3ZtNG/KEfvEqjaTHuc6mIXjs6OHyimXDgD
WPvNZYrxwLoXhCCSkzlZ8BgEjL59DXtR8YZoEI3tSI9k9HnuKe4sv5dbCntsdsTR
d0o5Kvcil6aZKJWx9St8BafOpcGff2D3YpzgcmBhUhYoX/Ni1+1fgnuha+Oo9W7i
+TMOf8DwV+oUBV4hHbLXQzkDBu/dT60LrieYRANQxHx1uDh55S66Uk1pmhu74iw1
NCxZbok20007Hzle3Se9qL6f7r4XdDQHI6bUcu5JgJyv+pVQcOdkLZSClVYAQkVn
pwYsqzp/KJk=
=Cmvr
-----END PGP SIGNATURE-----

2006-05-19 17:28:29

by John Richard Moser

Subject: Re: Stealing ur megahurts (no, really)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Dr. David Alan Gilbert wrote:
> * John Richard Moser ([email protected]) wrote:
>
>> Scrambling for an old machine is ridiculous. Down-clocking makes sense
>> because you can adjust to varied levels; but it's difficult and usually
>> infeasible. Pulling memory and mix and matching is not much better.
>
> <...>
>
>> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
>> Obviously we can't do this directly, as convenient as this would be; but
>> the idea warrants some thought, and some thought I gave it. What I came
>> up with was simple: Adjust time slice length and place a delay between
>> time slices so they're evenly spaced.
>
> <...>
>
> Hi John,
> While cpu downclocking helps a bit, it would be hopelessly inaccurate
> for figuring out if your app would run fast enough on the given
> ancient machine. A lot else has happened to the world since the days
> of the 200MHz CPU:
> * Faster memory
> * Larger caches
> * Faster PCI busses
> * Instruction set additions (various more levels of SSE etc)
> * Faster discs
> * Changes to the CPU architecture/implementation
>

Skews and fuzz. Imperfections, but at least we get a general idea. ;)

> Still, it would be interesting to see the difference in performance
> of a downclocked modern processor and its 10 year old clock equivalent.
>

Yes. Too bad CPUs can't be uniformly underclocked by design; they have
at most 3-5 different levels of CPU frequency you can pick from at run time.

> Dave
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBRG3/Iws1xW0HCTEFAQIjJg//Z/IGDjhXqE9Cca7LCcnHBCcQ8Rts7moW
L0e1sCb6zrNWBTWo5h6OrzAJh5aTzeeciKpDufkuvvR+BREchiCTIm61IxokHCCS
2EQ8qfDJWD6ZkOi42tt4t/LftFaUbu7zxpalf5hA5qbCid1CjdqEiYZREDaDbqrf
uPNVk/w8TTaK5B8/+xWAxSNCnslGW7LRsYkLoQw2eTM4xKcNf4L76rCj/0SXoMcm
v56tx40CsfFtqzK5D+4y80hMzqGQ+ll3aenkgZIaD61rhcGL/QZPPAGC3F3rg+94
2iyaimu9582m6P9sdFHVrYVfCqLg8AKOIammBFxwPPmFaqaLeIjmsoQ5T+QMJbLJ
JZlsTFLG3FeeXuwGEOlO+dqZLKkF3ubfveFi3iUMkJkv7QnBbPAMRVwQL0Evl3WW
Ltegi6b8QxriFhNrkNAVv9L4IlhQkhGe4sff3xQNj3ZBms1RW85QhDDDUBX5eNHo
G8/Xdd9QcAVEBKt+welYsYcMS366dXir4STq9wANhks3S6sSWJUpEA5RrF8s2fN7
aNCWvO14sl9dscI4+w1vGQB9eGFcfIYWf+M1doQyKJgtx+bVRiE+mEWW2SZoKPCT
oTCEhPNOJenxVV6zqOsQT0wjyhRyONbwQJiv0sMr+9PLCe8A7u9VHUvOoQQ6bQOA
oBFc3EGABK4=
=dCD2
-----END PGP SIGNATURE-----

2006-05-19 17:37:32

by Dr. David Alan Gilbert

Subject: Re: Stealing ur megahurts (no, really)

* John Richard Moser ([email protected]) wrote:
> That I wrote
> > <...>
> >
> > Hi John,
> > While cpu downclocking helps a bit, it would be hopelessly inaccurate
> > for figuring out if your app would run fast enough on the given
> > ancient machine. A lot else has happened to the world since the days
> > of the 200MHz CPU:
> > * Faster memory
> > * Larger caches
> > * Faster PCI busses
> > * Instruction set additions (various more levels of SSE etc)
> > * Faster discs
> > * Changes to the CPU architecture/implementation
> >
>
> Skews and fuzz. Imperfections, but at least we get a general idea. ;)

Really? I bet there is a factor of at least 2 in that lot when you
put them together (depending on what you are running).
Remember, the reason you are scrabbling around for this ancient machine
is to answer a question along the lines of 'is my program usable on a
.....?'
Also, you want to make sure you haven't made an assumption about an actual
feature (did you leave a cmov in somewhere? Did you assume AGP? LBA block
addressing? etc.).

Dave
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/

2006-05-19 17:57:03

by David Lang

Subject: Re: Stealing ur megahurts (no, really)

On Fri, 19 May 2006, Dr. David Alan Gilbert wrote:

>> Skews and fuzz. Imperfections, but at least we get a general idea. ;)
>
> Really? I bet there is a factor of 2 at least in that lot when you
> put them together? (Depending on what you are running)

However, in most cases the difference between the native machine and the
thing you are testing for is a factor >20, so a fuzz factor of 2 still
gets you pretty close.

> Remember the reason you are scrabbling around for this ancient machine
> is to answer a question along the lines of 'is my program useable on a
> .....' ?
> Also you want to make sure you haven't made an assumption about an actual
> feature (you left a cmov in somewhere? You assumed AGP? LBA block addressing
> etc).

If you need this level of detail, you need the actual hardware or a
hardware emulator. However, there's a lot of 'is the performance
reasonable' type testing that needs to be done (and usually needs to be
done repeatedly, either with different performance settings or after
fixing performance issues) before you have to go to this step.

David Lang



--
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare

2006-05-19 20:08:25

by Joel Jaeggli

Subject: Re: Stealing ur megahurts (no, really)

Matti Aarnio wrote:
> On Fri, May 19, 2006 at 02:13:02AM -0400, John Richard Moser wrote:
> ...
>> On Linux we have mem= to toy with memory, which I personally HAVE used
>> to evaluate how various distributions and releases of GNOME operate
>> under memory pressure. This is a lot more convenient than pulling chips
>> and trying to find the right combination. This option was, apparently,
>> designed for situations where actual system memory capacity is
>> mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
>> is 255M....); but is very useful in this application too.
>>
>> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
>> Obviously we can't do this directly, as convenient as this would be; but
>> the idea warrants some thought, and some thought I gave it. What I came
>> up with was simple: Adjust time slice length and place a delay between
>> time slices so they're evenly spaced.
> ...
>> Questions? Comments? Particular ideas on what would happen?

The other thing I would observe is that clock speed is only part of the
equation. It's one thing to soak up some CPU cycles, but the CPU may be
a lot more superscalar (pipelining, SIMD instructions, multiple cores,
etc.) than the one you're trying to simulate, and it probably also has a
lot more cache and much faster memory. So while you can certainly soak
up a lot of CPU pretty easily, there are other considerations that might
affect simulating the performance of, say, a 100MHz Pentium on, say, an
Athlon 64 X2.

Emulation would probably go a lot further as an approach.

> Modern machines have ability to be "speed controlled" - Perhaps
> they can cut their speed by 1/3 or 1/2, but run slower anyway
> in the name of energy conservation.
>
>
> Another approach (not thinking on multiprocessor systems now)
> is to somehow gobble up system performance into some "hoarder"
> (highest scheduling priority, eats up 90% of time slices doing
> excellent waste of CPU resources..)
>

<snip>

--
-------------------------------------------------
Joel Jaeggli ([email protected])
GPG Key Fingerprint:
5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2

2006-05-20 18:35:45

by Antonio Vargas

Subject: Re: Stealing ur megahurts (no, really)

On 5/19/06, Joel Jaeggli <[email protected]> wrote:
> Matti Aarnio wrote:
> > On Fri, May 19, 2006 at 02:13:02AM -0400, John Richard Moser wrote:
> > ...
> >> On Linux we have mem= to toy with memory, which I personally HAVE used
> >> to evaluate how various distributions and releases of GNOME operate
> >> under memory pressure. This is a lot more convenient than pulling chips
> >> and trying to find the right combination. This option was, apparently,
> >> designed for situations where actual system memory capacity is
> >> mis-detected (mandrake 7.2 and its insistence that a 256M memory stick
> >> is 255M....); but is very useful in this application too.
> >>
> >> This brings the idea of a cpumhz= parameter to adjust CPU clock rate.
> >> Obviously we can't do this directly, as convenient as this would be; but
> >> the idea warrants some thought, and some thought I gave it. What I came
> >> up with was simple: Adjust time slice length and place a delay between
> >> time slices so they're evenly spaced.
> > ...
> >> Questions? Comments? Particular ideas on what would happen?
>
> The other thing I would observe is that clock speed is only part of the
> equation, it's one thing to soak up some cpu cycles, but the cpu may be
> a lot more superscalar (pipelineing, simd instructions, multiple cores
> etc) than the one you're trying to simulate, probably it also has a lot
> more cache and much faster memory. So that while you can certainly soak
> up a lot of cpu pretty easily there are other considerations that might
> effect simulating the performance of say a 100mhz pentium on say an
> athlon 64x2.
>
> emulation would probably go a lot further as an approach
>
> > Modern machines have ability to be "speed controlled" - Perhaps
> > they can cut their speed by 1/3 or 1/2, but run slower anyway
> > in the name of energy conservation.
> >
> >
> > Another approach (not thinking on multiprocessor systems now)
> > is to somehow gobble up system performance into some "hoarder"
> > (highest scheduling priority, eats up 90% of time slices doing
> > excellent waste of CPU resources..)
> >
>
> <snip>
>

What is really needed is to be able to throttle the memory latency and
bandwidth. Even with +++GHz processors, if there is no memory speed, all
the rest is lacking. This would enable better testing of memory-intensive
algorithms.


--
Greetz, Antonio Vargas aka winden of network

http://wind.codepixel.com/
[email protected]
[email protected]

Every day, every year
you have to work
you have to study
you have to scene.