2002-12-16 02:27:07

by Anders Gustafsson

Subject: Re: [XFS] add missing file xfs_iomap.c


On Sun, Dec 15, 2002 at 10:07:43PM +0000, Linux Kernel Mailing List wrote:

> # This patch includes the following deltas:
> # ChangeSet 1.940.1.1 -> 1.940.1.2
> # (new) -> 1.1 fs/xfs/xfs.ko
> #
>
> xfs.ko |binary
> 1 files changed
>
>
> Binary files a/fs/xfs/xfs.ko and b/fs/xfs/xfs.ko differ

A .ko in the bitkeeper?

This can't be correct?

--
Anders Gustafsson - [email protected] - http://0x63.nu/


2002-12-16 03:48:45

by Scott Robert Ladd

Subject: /proc/cpuinfo and hyperthreading

Hello:

When I cat /proc/cpuinfo on my Pentium 4 system, it says:

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Pentium 4 (Northwood)
stepping : 7
cpu MHz : 2783.753
cache size : 512 KB
physical id : 0
siblings : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 5488.64

Am I correct to infer that the "siblings" entry refers to the 2-way
hyperthreading on my CPU?

During boot, the system reports:

Dec 15 14:30:34 Tycho kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Dec 15 14:30:34 Tycho kernel: Processor #0 15:2 APIC version 16
Dec 15 14:30:34 Tycho kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Dec 15 14:30:34 Tycho kernel: Processor #1 15:2 APIC version 16
Dec 15 14:30:34 Tycho kernel: Building zonelist for node : 0
Dec 15 14:30:34 Tycho kernel: Kernel command line: BOOT_IMAGE=smp ro root=306
Dec 15 14:30:34 Tycho kernel: Found and enabled local APIC!
Dec 15 14:30:34 Tycho kernel: Initializing CPU#0
Dec 15 14:30:34 Tycho kernel: Detected 2783.753 MHz processor.
Dec 15 14:30:34 Tycho kernel: Console: colour VGA+ 80x25
Dec 15 14:30:34 Tycho kernel: Calibrating delay loop... 5488.64 BogoMIPS
Dec 15 14:30:34 Tycho kernel: Memory: 255916k/261888k available (1411k kernel code, 5216k reserved, 567k data, 276k init, 0k highmem)
Dec 15 14:30:34 Tycho kernel: Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Dec 15 14:30:34 Tycho kernel: Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Dec 15 14:30:34 Tycho kernel: Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Dec 15 14:30:34 Tycho kernel: -> /dev
Dec 15 14:30:34 Tycho kernel: -> /dev/console
Dec 15 14:30:34 Tycho kernel: -> /root
Dec 15 14:30:34 Tycho kernel: CPU: Before vendor init, caps: bfebfbff 00000000 00000000, vendor = 0
Dec 15 14:30:34 Tycho kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Dec 15 14:30:34 Tycho kernel: CPU: L2 cache: 512K
Dec 15 14:30:34 Tycho kernel: CPU: Physical Processor ID: 0
Dec 15 14:30:34 Tycho kernel: CPU: After vendor init, caps: bfebfbff 00000000 00000000 00000000
Dec 15 14:30:34 Tycho kernel: CPU: After generic, caps: bfebfbff 00000000 00000000 00000000
Dec 15 14:30:34 Tycho kernel: CPU: Common caps: bfebfbff 00000000 00000000 00000000
Dec 15 14:30:34 Tycho kernel: Enabling fast FPU save and restore... done.
Dec 15 14:30:34 Tycho kernel: Enabling unmasked SIMD FPU exception support... done.
Dec 15 14:30:34 Tycho kernel: Checking 'hlt' instruction... OK.
Dec 15 14:30:34 Tycho kernel: POSIX conformance testing by UNIFIX
Dec 15 14:30:34 Tycho kernel: CPU0: Intel Pentium 4 (Northwood) stepping 07
Dec 15 14:30:34 Tycho kernel: per-CPU timeslice cutoff: 1462.97 usecs.
Dec 15 14:30:34 Tycho kernel: task migration cache decay timeout: 2 msecs.
Dec 15 14:30:34 Tycho kernel: SMP motherboard not detected.
Dec 15 14:30:34 Tycho kernel: enabled ExtINT on CPU#0
Dec 15 14:30:34 Tycho kernel: ESR value before enabling vector: 00000000
Dec 15 14:30:34 Tycho kernel: ESR value after enabling vector: 00000000
Dec 15 14:30:34 Tycho kernel: Using local APIC timer interrupts.
Dec 15 14:30:34 Tycho kernel: calibrating APIC timer ...
Dec 15 14:30:34 Tycho kernel: ..... CPU clock speed is 2783.0885 MHz.
Dec 15 14:30:34 Tycho kernel: ..... host bus clock speed is 132.0565 MHz.
Dec 15 14:30:34 Tycho kernel: Starting migration thread for cpu 0
Dec 15 14:30:34 Tycho kernel: CPUS done 2

I just want to be sure that hyperthreading is, in fact, working in 2.5.51.

..Scott

--
Scott Robert Ladd
Coyote Gulch Productions, http://www.coyotegulch.com
No ads -- just very free (and somewhat unusual) code.

2002-12-16 03:55:07

by Robert Love

Subject: Re: /proc/cpuinfo and hyperthreading

On Sun, 2002-12-15 at 22:58, Scott Robert Ladd wrote:

> Am I correct to infer that the "siblings" entry refers to the 2-way
> hyperthreading on my CPU?

Yep, the 'siblings' value is the number of virtual processors in the
physical package.

Do you only see one processor listing in /proc/cpuinfo, though? You
should see one for each (virtual) processor. That means two in a single
HT-enabled P4, each with the same physical id.

So it seems your chip works... is the kernel compiled for SMP?

Robert Love
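
As a rough illustration of the check Robert describes, the sketch below counts `processor` entries per `physical id` in /proc/cpuinfo-style text. The function name and the sample text are made up for illustration; the sample is trimmed to the fields that matter here and assumes entries are separated by blank lines, as in the usual x86 output.

```python
from collections import Counter

# Hypothetical, trimmed sample of what a working HT setup would list:
# two logical processors that share the same physical id.
SAMPLE = (
    "processor : 0\n"
    "physical id : 0\n"
    "siblings : 2\n"
    "\n"
    "processor : 1\n"
    "physical id : 0\n"
    "siblings : 2\n"
)


def logical_cpus_per_package(cpuinfo_text):
    """Return {physical id: number of 'processor' entries} for cpuinfo-style text."""
    counts = Counter()
    physical_id = None
    seen_processor = False
    # The extra "" makes sure the last entry is flushed even without a trailing blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            # A blank line ends the current entry.
            if seen_processor:
                counts[physical_id] += 1
            physical_id, seen_processor = None, False
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            seen_processor = True
        elif key == "physical id":
            physical_id = value
    return dict(counts)


print(logical_cpus_per_package(SAMPLE))
```

On a real system the contents of /proc/cpuinfo could be passed in place of the sample. A count of 1 for a package whose `siblings` field says 2 would match the symptom reported at the start of this thread.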

2002-12-16 04:03:50

by Scott Robert Ladd

Subject: RE: /proc/cpuinfo and hyperthreading

Robert Love wrote:
> Yep, the 'siblings' value is the number of virtual processors in the
> physical package.
>
> Do you only see one processor listing in /proc/cpuinfo, though? You
> should see one for each (virtual) processor. That means two in a single
> HT-enabled P4, each with the same physical id.

That's what I expected!

> So it seems your chip works... is the kernel compiled for SMP?

Yup, it's compiled for SMP -- or, at least, I selected that option in make
menuconfig... ;) The boot reports:

Dec 15 11:51:18 Tycho kernel: Linux version 2.5.51 (root@Tycho) (gcc version 2.95.4 20011002 (Debian prerelease)) #11 SMP Sat Dec 14 21:40:42 EST 2002

But later in the boot, it also states:

Dec 15 11:51:18 Tycho kernel: SMP motherboard not detected.

Something just doesn't look right about this.

..Scott

--
Scott Robert Ladd
Coyote Gulch Productions, http://www.coyotegulch.com
No ads -- just very free (and somewhat unusual) code.

2002-12-16 04:30:32

by Linus Torvalds

Subject: Re: [XFS] add missing file xfs_iomap.c



On Mon, 16 Dec 2002, Anders Gustafsson wrote:
>
> This can't be correct?

It wasn't. It should be fixed in the final 2.5.52 (not cleanly, but..)

Linus

2002-12-16 06:18:09

by Zwane Mwaikambo

Subject: RE: /proc/cpuinfo and hyperthreading

On Sun, 15 Dec 2002, Scott Robert Ladd wrote:

> But later in the boot, it also states:
>
> Dec 15 11:51:18 Tycho kernel: SMP motherboard not detected.
>
> Something just doesn't look right about this.

That's just the MP table parsing code whining. Which is ok since you're
using ACPI... hmm then again...

	if (!smp_found_config) {
		printk(KERN_NOTICE "SMP motherboard not detected.\n");
		smpboot_clear_io_apic_irqs();
		phys_cpu_present_map = 1;
		if (APIC_init_uniprocessor())


--
function.linuxpower.ca

2002-12-16 06:25:22

by Zwane Mwaikambo

Subject: RE: /proc/cpuinfo and hyperthreading

On Mon, 16 Dec 2002, Zwane Mwaikambo wrote:

> On Sun, 15 Dec 2002, Scott Robert Ladd wrote:
>
> > But later in the boot, it also states:
> >
> > Dec 15 11:51:18 Tycho kernel: SMP motherboard not detected.
> >
> > Something just doesn't look right about this.
>
> That's just the MP table parsing code whining. Which is ok since you're
> using ACPI... hmm then again...
>
> 	if (!smp_found_config) {
> 		printk(KERN_NOTICE "SMP motherboard not detected.\n");
> 		smpboot_clear_io_apic_irqs();
> 		phys_cpu_present_map = 1;
> 		if (APIC_init_uniprocessor())

Dec 15 14:30:34 Tycho kernel: CPUS done 2

It's ok.

(I feel like Jekyll & Hyde)...

--
function.linuxpower.ca

2002-12-16 13:28:44

by Scott Robert Ladd

Subject: re: /proc/cpuinfo and hyperthreading

Zwane Mwaikambo wrote:
> It's ok.

I'm not so sure.

To get the most benefit from two logical CPUs, don't I need the kernel to
operate as a 2-CPU SMP system?

Windows XP initializes the system as SMP with two CPUs; when I run an OpenMP
application under Windows, it reports two CPUs and a maximum of two threads.
Under Linux,

Linux SMP should initialize based on the number of logical CPUs, not the
physical number of chips; thus, I should be seeing two CPUs in
/proc/cpuinfo, not one.

..Scott

2002-12-16 13:43:26

by Brian Jackson

Subject: Re: /proc/cpuinfo and hyperthreading

You could always boot once with nosmp and run some benchmarks and then
reboot (with smp) and run some more benchmarks, and see if there is a
difference.

--Brian Jackson


Scott Robert Ladd writes:

> Zwane Mwaikambo wrote:
>> It's ok.
>
> I'm not so sure.
>
> To get the most benefit from two logical CPUs, don't I need the kernel to
> operate as a 2-CPU SMP system?
>
> Windows XP initializes the system as SMP with two CPUs; when I run an OpenMP
> application under Windows, it reports two CPUs and a maximum of two threads.
> Under Linux,
>
> Linux SMP should initialize based on the number of logical CPUs, not the
> physical number of chips; thus, I should be seeing two CPUs in
> /proc/cpuinfo, not one.
>
> ..Scott

2002-12-16 14:01:32

by Dave Jones

Subject: Re: /proc/cpuinfo and hyperthreading

On Sun, Dec 15, 2002 at 10:58:12PM -0500, Scott Robert Ladd wrote:
> During boot, the system reports:
> Dec 15 14:30:34 Tycho kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> Dec 15 14:30:34 Tycho kernel: Processor #0 15:2 APIC version 16
> Dec 15 14:30:34 Tycho kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> Dec 15 14:30:34 Tycho kernel: Processor #1 15:2 APIC version 16
> Dec 15 14:30:34 Tycho kernel: Building zonelist for node : 0

Looks like you're either missing some ACPI config options, or
you haven't updated the BIOS yet. On 2.5.51 with latest BIOS on
the same box, I get..

Dec 12 16:28:55 tetrachloride kernel: Processor #1 15:2 APIC version 16
Dec 12 16:28:55 tetrachloride kernel: ACPI: LAPIC_NMI (acpi_id[0x01] polarity[0x0] trigger[0x0] lint[0x1])
Dec 12 16:28:55 tetrachloride kernel: ACPI: LAPIC_NMI (acpi_id[0x02] polarity[0x0] trigger[0x0] lint[0x1])
Dec 12 16:28:55 tetrachloride kernel: ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
Dec 12 16:28:55 tetrachloride kernel: IOAPIC[0]: Assigned apic_id 2
Dec 12 16:28:55 tetrachloride kernel: IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, IRQ 0-23
Dec 12 16:28:55 tetrachloride kernel: ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x1] trigger[0x3])
Dec 12 16:28:55 tetrachloride kernel: ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x0])
Dec 12 16:28:55 tetrachloride kernel: Using ACPI (MADT) for SMP configuration information
Dec 12 16:28:55 tetrachloride kernel: Building zonelist for node: 0

Dave

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-12-16 14:00:51

by Richard B. Johnson

Subject: Re: /proc/cpuinfo and hyperthreading

On Mon, 16 Dec 2002, Brian Jackson wrote:

> You could always boot once with nosmp and run some benchmarks and then
> reboot (with smp) and run some more benchmarks, and see if there is a
> difference.
>
> --Brian Jackson
>
>
> Scott Robert Ladd writes:
>
> > Zwane Mwaikambo wrote:
> >> It's ok.
> >
> > I'm not so sure.
> >
> > To get the most benefit from two logical CPUs, don't I need the kernel to
> > operate as a 2-CPU SMP system?
> >
> > Windows XP initializes the system as SMP with two CPUs; when I run an OpenMP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How do you know this? How can I learn what Windows does with
Win/2000/professional? The only way I know I have two CPUs is when the
machine fails to reboot because the file-system has been completely
trashed by the two CPUs banging on it at the same time. The solution has
been to remove one CPU. M$ claims; "Windows will over-power the system
if two CPUs are present...." Direct quote. If you have two logical
CPUs, you can't remove one, therefore, unless M$ has fixed the problem(s)
in XP, you can't use Windows with two logical CPUs, i.e., hyperthreading.


> > application under Windows, it reports two CPUs and a maximum of two threads.
> > Under Linux,
> >


Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2002-12-16 14:26:47

by Scott Robert Ladd

Subject: RE: /proc/cpuinfo and hyperthreading

Dave Jones wrote:
> Looks like you're either missing some ACPI config options, or
> you haven't updated the BIOS yet. On 2.5.51 with latest BIOS on
> the same box, I get..

Everything is fixed. No, I hadn't upgraded the BIOS; when I asked a contact
at Intel about the problem, I was told that the BIOS was the latest.

I should have known better than to believe them!

Thank you very much; cat /proc/cpuinfo now reports two CPUs.

..Scott

2002-12-16 14:47:50

by Scott Robert Ladd

Subject: RE: /proc/cpuinfo and hyperthreading

Richard Johnson asked:
> How do you know this? How can I learn what Windows does with
> Win/2000/professional?

Run the Windows Task Manager and select the Performance tab; on my system,
it shows two separate graphs, one for each logical CPU.

> if two CPUs are present...." Direct quote. If you have two logical
> CPUs, you can't remove one, therefore, unless M$ has fixed the problem(s)
> in XP, you can't use Windows with two logical CPUs, i.e., hyperthreading.

The machine came with Windows XP pre-installed; I ran it a couple of times,
then blew it away (do I hear cheers?) when I installed Linux. I probably
didn't run it long-enough to hit any bugs.

..Scott

--
Scott Robert Ladd
Coyote Gulch Productions, http://www.coyotegulch.com
No ads -- just very free (and somewhat unusual) code.

2002-12-16 15:03:38

by Måns Rullgård

Subject: Re: /proc/cpuinfo and hyperthreading

"Scott Robert Ladd" <[email protected]> writes:

> > How do you know this? How can I learn what Windows does with
> > Win/2000/professional?
>
> Run the Windows Task Manager and select the Performance tab; on my system,
> it shows two separate graphs, one for each logical CPU.

It's easy to write a program that displays any number of graphs
vaguely related to the system load. How do we know that the
performance meter isn't lying?

--
Måns Rullgård
[email protected]

2002-12-16 15:35:12

by Scott Robert Ladd

Subject: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

Måns Rullgård wrote:
> It's easy to write a program that displays any number of graphs
> vaguely related to the system load. How do we know that the
> performance meter isn't lying?

We don't.

All I can say is that the performance meter seems (note the weasel-word)
proper when running Win2K SMP on a dual PIII-933 box at one of my client
sites. However, such experience does *not* guarantee that WinXP is reporting
valid numbers for a P4 with HT.

Here's a little test I ran this morning, now that my new system is
operational. My benchmark is a full "make bootstrap" compile of gcc-3.2.1,
with and without the -j 2 make switch that enables two threads of
compilation. Using the 2.5.51 SMP kernel, I see the following compile times:

SMP w/o -j 2: 28m11s
"nosmp" with -j 2: 27m32s
SMP with -j 2: 24m21s

HT appears to give a very tiny benefit even without an SMP kernel -- and
*with* an SMP kernel, I get a 16% improvement in my compile time. That
pretty much matches my expectation (i.e., a HT processor is *not* equal to
dual processor, but it *is* better than a non-HT processor).

Just some food for collective thought.

..Scott
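
For readers checking the arithmetic, here is a small sketch that converts the three compile times above to seconds and derives the gain two ways. The helper name `to_seconds` is invented for this note; the timings are the ones Scott posted. The 16% figure appears to correspond to the ratio of the two times, while the wall-clock time itself drops by a little under 14%.

```python
def to_seconds(mmss):
    """Convert a string such as '28m11s' to a number of seconds."""
    minutes, seconds = mmss.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


# Timings exactly as posted in the message above.
timings = {
    "SMP w/o -j 2": to_seconds("28m11s"),
    '"nosmp" with -j 2': to_seconds("27m32s"),
    "SMP with -j 2": to_seconds("24m21s"),
}

baseline = timings["SMP w/o -j 2"]
best = timings["SMP with -j 2"]

speedup = baseline / best        # how many times faster the -j 2 SMP run is
reduction = 1 - best / baseline  # fraction of wall-clock time saved

print(f"speedup: {speedup:.3f}x, time saved: {reduction:.1%}")
```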

2002-12-16 15:48:29

by Andrew Theurer

Subject: Re: /proc/cpuinfo and hyperthreading

On Monday 16 December 2002 07:54, Brian Jackson wrote:
> You could always boot once with nosmp and run some benchmarks and then
> reboot (with smp) and run some more benchmarks, and see if there is a
> difference.

Yes, but wouldn't booting a UP kernel be a better comparison? After all, why
incur any possible overhead of an SMP kernel if you don't need to? Some
benchmarks may not show a difference, but some definitely will.

-Andrew Theurer

2002-12-16 22:31:16

by J.A. Magallon

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)


On 2002.12.16 Scott Robert Ladd wrote:
>Måns Rullgård wrote:
>> It's easy to write a program that displays any number of graphs
>> vaguely related to the system load. How do we know that the
>> performance meter isn't lying?
>
>We don't.
>
>All I can say is that the performance meter seems (note the weasel-word)
>proper when running Win2K SMP on a dual PIII-933 box at one of my client
>sites. However, such experience does *not* guarantee that WinXP is reporting
>valid numbers for a P4 with HT.
>
>Here's a little test I ran this morning, now that my new system is
>operational. My benchmark is a full "make bootstrap" compile of gcc-3.2.1,
>with and without the -j 2 make switch that enables two threads of
>compilation. Using the 2.5.51 SMP kernel, I see the following compile times:
>
> SMP w/o -j 2: 28m11s
> "nosmp" with -j 2: 27m32s
> SMP with -j 2: 24m21s
>
>HT appears to give a very tiny benefit even without an SMP kernel -- and
>*with* an SMP kernel, I get a 16% improvement in my compile time. That
>pretty much matches my expectation (i.e., a HT processor is *not* equal to
>dual processor, but it *is* better than a non-HT processor).
>

HT can give no benefit in UP case, nobody knows that the sibling exists
and the P4 does not parallelize itself. The gain you see is due to
computation-io overlap.

This is my render code, implemented with posix threads, running on a dual
[email protected]. Work is just dynamic structures walk-through and floating
point calculation, no IO. In this example the database is tiny, so there
is no swap, and the box is 'all mine', no other process eating CPU.

Processes do not bounce between cpus and ht-aware scheduler
prefers a processor in different physical package when two cpu intensive
threads are running, so in the 2-threads case they run on different
packages:

Number of threads   Elapsed time   User Time   System Time
1                   53:216         53:220      00:000
2                   29:272         58:180      00:320
3                   27:162         1:21:450    00:540
4                   25:094         1:41:080    01:250

Elapsed is measured by the parent thread, that is not doing anything
but wait on a pthread_join. User and system times are the sum of
times for all the children threads, that do real work.

The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
I have my cpus doubled but each one has half the pipelining for floating
point...see the user cpu time increased due to 'worse' processors and
cache pollution on each package.

So, IMHO and for my apps, HyperThreading is just a bad joke.
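
To put numbers on the table above, the sketch below computes the elapsed-time speedup and the growth in total user time relative to the single-thread run. It assumes the times are written as [minutes:]seconds:milliseconds, which the post does not state explicitly; the helper name `parse_time` is invented for this note.

```python
def parse_time(text):
    """Parse '[minutes:]seconds:milliseconds' into seconds (assumed format)."""
    parts = [int(p) for p in text.split(":")]
    msec = parts.pop()
    sec = parts.pop()
    minutes = parts.pop() if parts else 0
    return minutes * 60 + sec + msec / 1000.0


# threads -> (elapsed, user), copied from the table above.
rows = {
    1: ("53:216", "53:220"),
    2: ("29:272", "58:180"),
    3: ("27:162", "1:21:450"),
    4: ("25:094", "1:41:080"),
}

elapsed = {n: parse_time(e) for n, (e, _) in rows.items()}
user = {n: parse_time(u) for n, (_, u) in rows.items()}

# Speedup in wall-clock time and growth in summed user time, both vs. 1 thread.
speedup = {n: elapsed[1] / elapsed[n] for n in rows}
user_growth = {n: user[n] / user[1] for n in rows}

for n in sorted(rows):
    print(f"{n} threads: speedup {speedup[n]:.2f}x, user time x{user_growth[n]:.2f}")
```

Under that reading, going from 2 to 4 threads shortens the elapsed time by roughly 4 seconds while the summed user time nearly doubles, which is the trade-off the message complains about.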

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.20-jam1 (gcc 3.2 (Mandrake Linux 9.1 3.2-4mdk))

2002-12-16 23:13:17

by Scott Robert Ladd

Subject: RE: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

J.A. Magallon wrote:
> HT can give no benefit in UP case, nobody knows that the sibling exists
> and the P4 does not parallelize itself. The gain you see is due to
> computation-io overlap.

I see the light! Thank you.

> This is my render code, implemented with posix threads, running on a dual
> [email protected].

> Number of threads   Elapsed time   User Time   System Time
> 1                   53:216         53:220      00:000
> 2                   29:272         58:180      00:320
> 3                   27:162         1:21:450    00:540
> 4                   25:094         1:41:080    01:250
>
> Elapsed is measured by the parent thread, that is not doing anything
> but wait on a pthread_join. User and system times are the sum of
> times for all the children threads, that do real work.
>
> The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
> I have my cpus doubled but each one has half the pipelining for floating
> point...see the user cpu time increased due to 'worse' processors and
> cache pollution on each package.

From what I can see, HT provides a 0-15% increase in performance, depending
heavily on the type of code being run. In other words, HT helps, but it is
*no* substitute for true multiple processors. And it is ONLY of value when
an SMP kernel is in use.

What you're seeing meshes with my results: our performance gains from HT are
about the same. HT didn't lose either of us anything, but it sure as heck
didn't make the kind of difference the hype seems to imply.

As for REAL SMP: I posted some more numbers on my web site (URL below),
using the same gcc compile test on my dual-proc with PIII-600s. Using a
single process, the compile took just under 100 minutes, while with two
processes, it finished in 58.5 minutes. Real SMP reduced the time by 40%
(again, similar to your numbers).

..Scott

--
Scott Robert Ladd
Coyote Gulch Productions, http://www.coyotegulch.com
No ads -- just very free (and somewhat unusual) code.

2002-12-16 23:20:24

by J.A. Magallon

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)


On 2002.12.17 Scott Robert Ladd wrote:
[...]
>
>From what I can see, HT provides a 0-15% increase in performance, depending
>heavily on the type of code being run. In other words, HT helps, but it is
>*no* substitute for true multiple processors. And it is ONLY of value when
>an SMP kernel is in use.
>

What I don't like is that Intel sells it like the best thing since sliced
bread, and gets money for it, see the price of Xeons compared to normal P4s...

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.20-jam1 (gcc 3.2 (Mandrake Linux 9.1 3.2-4mdk))

2002-12-16 23:42:27

by H. Peter Anvin

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

Followup to: <[email protected]>
By author: "Scott Robert Ladd" <[email protected]>
In newsgroup: linux.dev.kernel
>
> From what I can see, HT provides a 0-15% increase in performance, depending
> heavily on the type of code being run. In other words, HT helps, but it is
> *no* substitute for true multiple processors. And it is ONLY of value when
> an SMP kernel is in use.
>

It would be interesting to compare an UP kernel with HT off to an SMP
kernel with the HT on...

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-12-17 06:12:18

by Denis Vlasenko

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

On 16 December 2002 21:27, J.A. Magallon wrote:
> On 2002.12.17 Scott Robert Ladd wrote:
> [...]
>
> > From what I can see, HT provides a 0-15% increase in performance,
> > depending heavily on the type of code being run. In other words, HT
> > helps, but it is *no* substitute for true multiple processors. And it
> > is ONLY of value when an SMP kernel is in use.
>
> What I don't like is that Intel sells it like the best thing since
> sliced bread, and gets money for it, see the price of Xeons compared
> to normal P4s...

What did you expect? They are making processors for money, and have
to push the sales.

As to HT, it's definitely a good thing. Multiple CPUs on a chip is
a logical step. HT in P4 is rather weak, but future processors will
likely have more advanced cores.

I never heard about HT from AMD camp. I'm curious what they do. ;)
--
vda

2002-12-17 19:20:54

by Bill Davidsen

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

On Mon, 16 Dec 2002, J.A. Magallon wrote:

> Number of threads   Elapsed time   User Time   System Time
> 1                   53:216         53:220      00:000
> 2                   29:272         58:180      00:320
> 3                   27:162         1:21:450    00:540
> 4                   25:094         1:41:080    01:250
>
> Elapsed is measured by the parent thread, that is not doing anything
> but wait on a pthread_join. User and system times are the sum of
> times for all the children threads, that do real work.
>
> The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
> I have my cpus doubled but each one has half the pipelining for floating
> point...see the user cpu time increased due to 'worse' processors and
> cache pollution on each package.
>
> So, IMHO and for my apps, HyperThreading is just a bad joke.

I must be misreading this, it looks to me as though having threads running
HT is reducing the clock time, and frankly that's what I want. It may not
be as good as having more processors, but it certainly is better than
nothing, even for your application. I read that as about 10% faster, and I
know people who spend more on fans to o/c their CPU than the premium for a
Xeon.

More to the point, since you have no choice if you want to go fast or
have >2 CPUs, you get HT included. Clearly if you want good latency you
don't run SMP at all due to the extra locking, that's a kernel issue, not HT.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-12-17 20:36:44

by H. Peter Anvin

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

Followup to: <[email protected]>
By author: Denis Vlasenko <[email protected]>
In newsgroup: linux.dev.kernel
>
> As to HT, it's definitely a good thing. Multiple CPUs on a chip is
> a logical step. HT in P4 is rather weak, but future processors will
> likely have more advanced cores.
>

SMT and SMP-on-chip are two very different things.

> I never heard about HT from AMD camp. I'm curious what they do. ;)

Not have insanely long pipelines, so that a single thread can actually
use the processor functional units?

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-12-18 17:48:57

by Andrew Burgess

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)

>Number of threads   Elapsed time   User Time   System Time
>1                   53:216         53:220      00:000
>2                   29:272         58:180      00:320
>3                   27:162         1:21:450    00:540
>4                   25:094         1:41:080    01:250

>Elapsed is measured by the parent thread, that is not doing anything
>but wait on a pthread_join. User and system times are the sum of
>times for all the children threads, that do real work.

>The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
>I have my cpus doubled but each one has half the pipelining for floating
>point...see the user cpu time increased due to 'worse' processors and
>cache pollution on each package.

>So, IMHO and for my apps, HyperThreading is just a bad joke.

Why do you care about user time? The elapsed time went down by
4 minutes (2->4 threads), if that's a joke I don't get it :-)

New Intel Ad: "What are you going to do with your 4 minutes today?"

2002-12-19 21:56:59

by J.A. Magallon

Subject: Re: HT Benchmarks (was: /proc/cpuinfo and hyperthreading)


On 2002.12.18 Andrew Burgess wrote:
>>Number of threads   Elapsed time   User Time   System Time
>>1                   53:216         53:220      00:000
>>2                   29:272         58:180      00:320
>>3                   27:162         1:21:450    00:540
>>4                   25:094         1:41:080    01:250
>
>>Elapsed is measured by the parent thread, that is not doing anything
>>but wait on a pthread_join. User and system times are the sum of
>>times for all the children threads, that do real work.
>
>>The jump from 1->2 threads is fine, the one from 2->4 is ridiculous...
>>I have my cpus doubled but each one has half the pipelining for floating
>>point...see the user cpu time increased due to 'worse' processors and
>>cache pollution on each package.
>
>>So, IMHO and for my apps, HyperThreading is just a bad joke.
>
>Why do you care about user time? The elapsed time went down by
>4 minutes (2->4 threads), if that's a joke I don't get it :-)
>
>New Intel Ad: "What are you going to do with your 4 minutes today?"
>

Of course I gain something. The problem is the price you pay for the
gain.

Prices in Spain: a P4 with 512 KB cache, 210 euros. Equal features (freq,
cache), but Xeon version, 320 euros. So you pay 50% more money for
10% more performance. Not too fair...
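
The price argument can be worked through in a few lines. This is only a sketch using the two prices quoted above and the poster's own 10% performance estimate; the variable names are invented for this note.

```python
# Prices as quoted above, in euros.
p4_price = 210
xeon_price = 320

# Performance gain as estimated by the poster (his round figure, not a measurement here).
performance_gain = 0.10

price_premium = xeon_price / p4_price - 1        # extra cost as a fraction of the P4 price
premium_per_gain = price_premium / performance_gain

print(f"price premium: {price_premium:.1%}")
print(f"premium paid per unit of performance gained: {premium_per_gain:.1f}x")
```

The premium works out to slightly over the "50%" in the message, so on these figures each percent of extra performance costs about five percent of extra price.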

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.20-jam2 (gcc 3.2 (Mandrake Linux 9.1 3.2-4mdk))