2002-01-31 22:56:27

by Greg Boyce

Subject: Machines misreporting Bogomips

kernel folk,

I've got a strange issue that I've been struggling to find the solution to
for some time now.

I work in a group that helps manage a large number of deployed Linux
boxes running variants of the 2.2 kernel. The machines themselves are all
pretty standard. There are slight variations in vendor, CPU speed, etc.,
but they all use the same motherboard.

Every once in a while we come across an individual machine that runs a
lot slower than it should and misreports its speed in bogomips under
/proc/cpuinfo. Reinstalling the OS and changing kernel versions don't
appear to make any difference.

I was wondering if anyone would be able to provide me with a starting
point to hunt this down. The only solution we had found in the past was
to replace the machines, but some of them are located out of the country
and that would be expensive.

Here is the output from /proc/cpuinfo for two machines. The first machine
is normal, the second is affected by this bug. They're both running the
same hardware, although the first machine's CPU is 650 MHz instead of
500 MHz.

Machine 1:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 3
cpu MHz : 645.676332
cache size : 0 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 psn mmx osfxsr kni
bogomips : 643.89

Machine 2:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 496.677416
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr kni
bogomips : 4.06

Let me know if there's anything else I can provide to help with the
diagnosis. The machine itself is an IBM Netfinity 4000R.

--
Gregory Boyce
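
For anyone wanting to screen a fleet for this symptom, here is a minimal
sketch in Python that compares the bogomips value with the reported clock
speed, using only figures from the two listings above. The function names
and the 0.1 threshold are illustrative assumptions, not anything taken from
the kernel. The normal ratio differs between kernel versions and CPU
families, so the cutoff would need tuning for other hardware.

```python
def parse_cpuinfo(text):
    """Parse a single-processor /proc/cpuinfo-style listing into a dict."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and wrapped continuation lines
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


def bogomips_ratio(info):
    """Return bogomips divided by the reported clock speed in MHz."""
    return float(info["bogomips"]) / float(info["cpu MHz"])


def looks_crippled(info, min_ratio=0.1):
    """Flag a machine whose bogomips is far below its clock speed.

    min_ratio is an arbitrary illustrative threshold (an assumption).
    """
    return bogomips_ratio(info) < min_ratio


# Excerpts of the two listings from the original message.
MACHINE_1 = """cpu MHz : 645.676332
cache size : 0 KB
bogomips : 643.89"""

MACHINE_2 = """cpu MHz : 496.677416
cache size : 512 KB
bogomips : 4.06"""

if __name__ == "__main__":
    for name, text in (("Machine 1", MACHINE_1), ("Machine 2", MACHINE_2)):
        info = parse_cpuinfo(text)
        print(name, round(bogomips_ratio(info), 3), looks_crippled(info))
```

On a live system the text would come from reading /proc/cpuinfo; note that
this sketch handles only one processor block.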



2002-01-31 23:22:07

by Matthew Dharm

Subject: Re: Machines misreporting Bogomips

I don't know if this will help much, but I have seen that type of
performance drop on systems when the entire cache is disabled.

Matt


--
Matthew Dharm Home: [email protected]
Maintainer, Linux USB Mass Storage Driver

G: Baaap booop BAHHHP.
Mir: 9600 Baud?
Mik: No, no! 9600 goes baap booop, not booop bahhhp!
-- Greg, Miranda and Mike
User Friendly, 12/31/1998



2002-01-31 23:34:17

by Roger Larsson

Subject: Re: Machines misreporting Bogomips

On Friday 1 February 2002 00.21, Matthew Dharm wrote:
> I don't know if this will help much, but I have seen that type of
> performance drop on systems when the entire cache is disabled.
>
> Matt

From the report below...
> > cache size : 0 KB

/RogerL

>
> On Thu, Jan 31, 2002 at 05:55:57PM -0500, Greg Boyce wrote:
> > kernel folk,
> >
> > I've got a strange issue that I've been struggling to find the solution
> > to for some time now.
> >
> > I work in a group that assists in the managing of large numbers of
> > deployed linux boxes running variants of the 2.2 kernel on them. The
> > machines themselves are all pretty standard. There are slight variances
> > on vendors, cpu speeds, etc., but they're all running from the same
> > motherboards.
> >
> > Every once in a while we come across single machines which are running a
> > lot slower than they should be, and are misreporting their speed in
> > bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
> > of the kernel don't appear to affect the machines themselves at all.
> >
> > I was wondering if anyone would be able to provide me with a starting
> > point to hunt this down. The only solution we had found in the past was
> > to replace the machines, but some of them are located out of the country
> > and that would be expensive.
> >
> > Here is the output from /proc/cpuinfo for machines. The first machine is
> > normal, the second is affected by this bug. They're both running the
> > same hardware, although the first machine's CPU is 650mhz instead of
> > 500mhz.
> >
> > Machine 1:
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 8
> > model name : Pentium III (Coppermine)
> > stepping : 3
> > cpu MHz : 645.676332
> > cache size : 0 KB
> > fdiv_bug : no
> > hlt_bug : no
> > sep_bug : no
> > f00f_bug : no
> > coma_bug : no
> > fpu : yes
> > fpu_exception : yes
> > cpuid level : 3
> > wp : yes
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> > cmov pat pse36 psn mmx osfxsr kni
> > bogomips : 643.89
> >
> > Machine 2:
> > processor : 0
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 7
> > model name : Pentium III (Katmai)
> > stepping : 3
> > cpu MHz : 496.677416
> > cache size : 512 KB
> > fdiv_bug : no
> > hlt_bug : no
> > sep_bug : no
> > f00f_bug : no
> > coma_bug : no
> > fpu : yes
> > fpu_exception : yes
> > cpuid level : 2
> > wp : yes
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> > cmov pat pse36 mmx osfxsr kni
> > bogomips : 4.06
> >
> > Let me know if there's anything else I can provide to help with the
> > diagnosis. The machine itself is an IBM Netfinity 4000R.
> >
> > --
> > Gregory Boyce
> >
> >

--
Roger Larsson
Skellefteå
Sweden

2002-02-01 00:11:37

by Greg Boyce

Subject: Re: Machines misreporting Bogomips

The machine reporting cache size of 0KB is the one that's behaving
normally. The misbehaving machine is the one reporting 512K cache.

> > > cache size : 512 KB
...
> > > bogomips : 4.06

On Fri, 1 Feb 2002, Roger Larsson wrote:

> On Friday 1 February 2002 00.21, Matthew Dharm wrote:
> > I don't know if this will help much, but I have seen that type of
> > performance drop on systems when the entire cache is disabled.
> >
> > Matt
>
> From the report below...
> > > cache size : 0 KB
>
> /RogerL
>
> --
> Roger Larsson
> Skellefteå
> Sweden
>

2002-02-01 09:59:58

by Horst von Brand

Subject: Re: Machines misreporting Bogomips

Greg Boyce <[email protected]> said:

[...]

> Every once in a while we come across single machines which are running a
> lot slower than they should be, and are misreporting their speed in
> bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
> of the kernel don't appear to affect the machines themselves at all.

Just misrepresented bogomips or is the machine really slower? Perhaps the
CPU is being underclocked?
--
Horst von Brand http://counter.li.org # 22616

2002-02-01 17:11:25

by Greg Boyce

Subject: Re: Machines misreporting Bogomips

On Fri, 1 Feb 2002, Horst von Brand wrote:

> Greg Boyce <[email protected]> said:
>
> [...]
>
> > Every once in a while we come across single machines which are running a
> > lot slower than they should be, and are misreporting their speed in
> > bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
> > of the kernel don't appear to affect the machines themselves at all.
>
> Just misrepresented bogomips or is the machine really slower? Perhaps the
> CPU is being underclocked?
> --
> Horst von Brand http://counter.li.org # 22616
>

The machine is actually slower. That's how I noticed the problem.

Underclocking doesn't seem likely given the size of the difference. It's 4
bogomips instead of 500. The machine is running at about the speed of a
386 (I believe that's about right). It almost seems as if someone turned
off the turbo button. But of course I haven't seen one of those since my
old 486 :)

--
Greg Boyce


2002-02-01 20:36:14

by Gerhard Mack

Subject: Re: Machines misreporting Bogomips



On Fri, 1 Feb 2002, Greg Boyce wrote:

> Date: Fri, 1 Feb 2002 12:11:07 -0500 (EST)
> From: Greg Boyce <[email protected]>
> To: Horst von Brand <[email protected]>
> Cc: [email protected]
> Subject: Re: Machines misreporting Bogomips
>
> On Fri, 1 Feb 2002, Horst von Brand wrote:
>
> > Greg Boyce <[email protected]> said:
> >
> > [...]
> >
> > > Every once in a while we come across single machines which are running a
> > > lot slower than they should be, and are misreporting their speed in
> > > bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
> > > of the kernel don't appear to affect the machines themselves at all.
> >
> > Just misrepresented bogomips or is the machine really slower? Perhaps the
> > CPU is being underclocked?
> > --
> > Horst von Brand http://counter.li.org # 22616
> >
>
> The machine is actually slower. That's how I noticed the problem.
>
> Underclocking doesn't seem likely due to the difference in speed. It's 4
> bogomips instead of 500. The machine is running at about the speed of a
> 386 (I believe that's about right). It almost seems as if someone turned
> off the turbo button. But of course I haven't seen one of those since my
> old 486 :)
>
> --
> Greg Boyce
>
>

Could they be running with cache disabled in the BIOS?




--
Gerhard Mack

[email protected]

<>< As a computer I find your faith in technology amusing.

2002-02-01 20:53:58

by Greg Boyce

Subject: Re: Machines misreporting Bogomips

> > The machine is actually slower. That's how I noticed the problem.
> >
> > Underclocking doesn't seem likely due to the difference in speed. It's 4
> > bogomips instead of 500. The machine is running at about the speed of a
> > 386 (I believe that's about right). It almost seems as if someone turned
> > off the turbo button. But of course I haven't seen one of those since my
> > old 486 :)
> >
> > --
> > Greg Boyce
> >
> >
>
> Could they be running with cache disabled in the bios?
>

The machine is reporting that the cache is enabled. Even if it were
actually disabled, I have trouble believing that turning on the cache
would result in a 50,000% increase in speed (4 bogomips compared to 500).

I have a feeling that I'm going to have to chalk this one up to hardware
failure. Another kind soul suggested it could be ECC memory reporting a
continual string of 1-bit failures.

--
Greg Boyce
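
As a quick check on the percentage quoted above: going from 4 to 500
bogomips is a factor of 125, which works out to an increase of 12,400%
rather than 50,000%. That is still a huge gap. The arithmetic is easy to
sketch in Python (the function names are just for illustration):

```python
def speedup_factor(old, new):
    """How many times larger the new value is than the old one."""
    return new / old


def percent_increase(old, new):
    """Increase from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Round figures from the message: 4 bogomips versus 500.
    print(speedup_factor(4, 500))    # 125.0
    print(percent_increase(4, 500))  # 12400.0
```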



2002-02-01 23:29:53

by Alan

Subject: Re: Machines misreporting Bogomips

> The machine is reporting that the cache is enabled. Even if this was
> true, I have trouble believing that turning on the cache would result in a
> 50,000% increase in speed (4 bogomips compared to 500).

L1 and L2 cache both disabled comes up as about 2.5 bogomips typically on
a Pentium II/III.

2002-02-01 23:35:13

by Greg Boyce

Subject: Re: Machines misreporting Bogomips

On Fri, 1 Feb 2002, Alan Cox wrote:

> > The machine is reporting that the cache is enabled. Even if this was
> > true, I have trouble believing that turning on the cache would result in a
> > 50,000% increase in speed (4 bogomips compared to 500).
>
> L1 and L2 cache both disabled comes up as about 2.5 bogomips typically on
> a Pentium II/III.
>

Ahh. I was working with someone else trying to figure out if the cache
would affect the calculated bogomips. Looks like it would.

The machine I'm reporting shows 512K of cache though. I included a second
machine as a comparison, and apparently chose poorly. That was the
machine reporting no cache.

Would a machine with L1 cache disabled, but with 512K of L2 cache report
around 4 Bogomips, or would the performance hit not be that strong?
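
On why the cache can affect the number itself: bogomips comes from timing a
busy-wait delay loop at boot, so anything that slows down instruction
execution lowers it. The Python sketch below reproduces only the final
rounding and formatting step, assuming the 2.2-era calibrate_delay() in
init/main.c divides its loops-per-second count by 500000. That formula is
an assumption that should be checked against the actual kernel source, and
the loop counts used here were simply worked backwards from the two values
reported in this thread.

```python
def bogomips_string(loops_per_sec):
    """Format a loops-per-second count as a BogoMIPS figure.

    Assumed formula (verify against the kernel source): the count is
    rounded, divided by 500000, and printed with two decimal places.
    """
    rounded = loops_per_sec + 2500
    whole = rounded // 500000
    frac = (rounded // 5000) % 100
    return "%d.%02d" % (whole, frac)


if __name__ == "__main__":
    # Hypothetical loop counts chosen to match the two reported values.
    print(bogomips_string(321945000))  # 643.89
    print(bogomips_string(2030000))    # 4.06
```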

2002-02-01 23:47:44

by Alan

Subject: Re: Machines misreporting Bogomips

> > L1 and L2 cache both disabled comes up as about 2.5 bogomips typically on
> > a Pentium II/III.
>
> The machine I'm reporting shows 512K of cache though. I included a second
> machine as a comparison, and apparently chose poorly. That was the
> machine reporting no cache.

It isn't the amount of cache; it is whether the cache is enabled. You can
have 512K of cache showing that is disabled in software.
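
To illustrate the distinction between cache that is present and cache that
is enabled: on x86 the enable state is controlled in part by two bits in
the CR0 control register, CD (bit 30) and NW (bit 29), while the size shown
in /proc/cpuinfo is derived from what the CPU reports about itself. The
following Python sketch only decodes a CR0 value. The sample values are
hypothetical, and actually reading CR0 requires kernel-mode code, which is
not shown. Other mechanisms (BIOS settings, memory type ranges) can also
leave memory uncached, so this covers just one possibility.

```python
CR0_NW = 1 << 29  # Not Write-through
CR0_CD = 1 << 30  # Cache Disable


def cache_state(cr0):
    """Describe the caching mode implied by the CD and NW bits of CR0."""
    cd = bool(cr0 & CR0_CD)
    nw = bool(cr0 & CR0_NW)
    if not cd and not nw:
        return "caching enabled"
    if cd and not nw:
        return "caching disabled"
    if cd and nw:
        return "caching disabled, coherency not maintained"
    return "invalid combination (CD=0, NW=1)"


if __name__ == "__main__":
    # Hypothetical CR0 values for illustration only.
    print(cache_state(0x80050033))  # caching enabled
    print(cache_state(0xC0050033))  # caching disabled
```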

2002-02-03 07:37:22

by watermodem

Subject: Re: Machines misreporting Bogomips

Horst von Brand wrote:

> Greg Boyce <[email protected]> said:
>
> [...]
>
>
>>Every once in a while we come across single machines which are running a
>>lot slower than they should be, and are misreporting their speed in
>>bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
>>of the kernel don't appear to affect the machines themselves at all.
>>
>
> Just misrepresented bogomips or is the machine really slower? Perhaps the
> CPU is being underclocked?
>

If they are Intel CPUs and the heatsink <-> CPU connection is poor
(no heatsink compound, heatsink loose), the fan is dead or dying, or
airflow is poor due to dust, this is reasonable. Intel CPUs slow down
when they get hot as a safety measure.

Clean the machine and remove the heatsink/fan. Replace the fan if needed
(they are cheap, so replace). Clean or replace the heatsink. Apply a
good heatsink compound such as Arctic Silver [just a small dab].
Make sure you have proper contact and pressure. Re-run the speed test.

If the video card, case or motherboard have fans, check them too.
Note that some of these heatsinks are epoxied on with a conductive
epoxy, so it is unlikely that you could safely remove those. Replacing
the fans is OK though. If you are unsure of the process, go to
one of the hardware or overclocking web pages.






2002-02-03 21:40:49

by Barry K. Nathan

Subject: Re: Machines misreporting Bogomips

watermodem wrote:
> If they are Intel CPUs and the heatsink <-> CPU connection is poor
> (no heatsink compound, heatsink loose), the fan is dead or dying, or
> airflow is poor due to dust, this is reasonable. Intel CPUs slow down
> when they get hot as a safety measure.

Note that this only applies to Pentium 4s. I believe Coppermine Pentium
IIIs will simply stop running (i.e., the computer freezes altogether
instead of slowing down) to prevent overheating. I'm not sure off the top
of my head what older Pentiums do, except that they certainly don't have
the slowdown trick that the Pentium 4 has.

Also, some BIOSes have an option called "CPU Speed at Boot" or something
like that, which has a Low and a High setting -- this serves the same
purpose as those old Turbo switches. (Note that I'm *not* talking about
the SpeedStep settings that newer laptops have in their BIOSes.)

-Barry K. Nathan <[email protected]>

2002-02-03 22:05:48

by Juhan Ernits

Subject: Re: Machines misreporting Bogomips



On Fri, 1 Feb 2002, Greg Boyce wrote:

> On Fri, 1 Feb 2002, Alan Cox wrote:
>
> > > The machine is reporting that the cache is enabled. Even if this was
> > > true, I have trouble believing that turning on the cache would result in a
> > > 50,000% increase in speed (4 bogomips compared to 500).
> >
> > L1 and L2 cache both disabled comes up as about 2.5 bogomips typically on
> > a Pentium II/III.
> >

> Would a machine with L1 cache disabled, but with 512K of L2 cache report
> around 4 Bogomips, or would the performance hit not be that strong?


Maybe the following experiences are relevant:

- a Pentium MMX 200 MHz with a burned L1 cache. It was worse than a 386
with the turbo switch turned off (didn't have time to measure bogomips,
just had it replaced).

- a PIII 850 with L2 cache turned off appeared comparable to a 433
MHz Celeron (860 BogoMips), while with L2 cache turned on it showed 1684
BogoMips.

So that would confirm that the snail-like behaviour of your systems has
something to do with either a _burned_ or a disabled L1 cache, not L2
cache.

Best regards,

Juhan Ernits

2002-02-08 17:14:00

by Andrew Scott

Subject: Re: Machines misreporting Bogomips

On 31 Jan 2002 at 17:55, Greg Boyce wrote:

> kernel folk,
>
> I've got a strange issue that I've been struggling to find the solution to
> for some time now.
>
> I work in a group that assists in the managing of large numbers of
> deployed linux boxes running variants of the 2.2 kernel on them. The
> machines themselves are all pretty standard. There are slight variances
> on vendors, cpu speeds, etc., but they're all running from the same
> motherboards.
>
> Every once in a while we come across single machines which are running a
> lot slower than they should be, and are misreporting their speed in
> bogomips under /proc/cpuinfo. Reinstalling the OS and changing versions
> of the kernel don't appear to affect the machines themselves at all.
>
> I was wondering if anyone would be able to provide me with a starting
> point to hunt this down. The only solution we had found in the past was
> to replace the machines, but some of them are located out of the country
> and that would be expensive.

It seems to me that there was an issue with timers not being set up
properly, or changing their settings during startup, which could cause a
machine to behave like it was running slow. On more recent 2.2.x kernels
you would see a line like 'timer configuration lost' in dmesg, which meant
that the computer had the problem, and a workaround was being implemented.

On kernels that didn't detect the timer problem you could sometimes boot
with no problem, but other times you'd get a kernel that seemed to run very
slowly.

I don't remember if it affected the bogomips reporting, but I would think
that it could.

BTW, I think that the kernels I had the problems with were pre-2.2.17,
though I'm not positive. 2.2.19 and 2.2.20 do not exhibit the problem,
i.e. they detect the problem and work around it.
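
A quick way to check a batch of machines for the message mentioned above is
to search saved dmesg output for it. This is a minimal Python sketch. The
search string is taken from the description above rather than from the
kernel source, so the exact wording on a given kernel may differ, and the
sample text is a made-up stand-in, not real kernel output.

```python
def find_timer_warnings(log_text, needle="timer configuration lost"):
    """Return the log lines that contain the needle, ignoring case."""
    needle = needle.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


# Hypothetical stand-in for saved dmesg output (not real kernel text).
SAMPLE_LOG = "first line\nexample: Timer configuration lost\nlast line"

if __name__ == "__main__":
    for hit in find_timer_warnings(SAMPLE_LOG):
        print(hit)
```

An empty result only means the kernel did not report the condition; as
noted above, older kernels may have the problem without detecting it.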




_
/ \ / [email protected]
/ \ \ /
/ \_/