Is there a canonical way for user-space software to determine how many
real CPUs are present in a system (as opposed to HyperThreaded or
otherwise virtual CPUs)?
We have an application that for performance reasons wants to run one
process per CPU. However, on a HyperThreaded system /proc/cpuinfo
lists two CPUs, and running two processes in this case is the wrong
thing to do. (Hyperthreading ends up degrading our performance,
perhaps due to cache or bus contention).
Please CC replies.
Thanks,
Dan Maas
On Mon, Mar 21, 2005 at 08:27:26PM -0500, Dan Maas wrote:
> Is there a canonical way for user-space software to determine how many
> real CPUs are present in a system (as opposed to HyperThreaded or
> otherwise virtual CPUs)?
>
> We have an application that for performance reasons wants to run one
> process per CPU. However, on a HyperThreaded system /proc/cpuinfo
> lists two CPUs, and running two processes in this case is the wrong
> thing to do. (Hyperthreading ends up degrading our performance,
> perhaps due to cache or bus contention).
Compare the 'physical id' fields of /proc/cpuinfo, and count
how many unique values you get.
E.g., on my dual-CPU box with HT, I see:
physical id : 0
physical id : 0
physical id : 3
physical id : 3
This indicates 2 real CPUs, each split in two.
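A minimal sketch of that counting step, assuming the x86 /proc/cpuinfo layout shown above. The function name count_physical_ids is made up for illustration, and the approach is only as good as the 'physical id' values the kernel reports:

```python
def count_physical_ids(cpuinfo_text):
    """Count distinct 'physical id' values in /proc/cpuinfo-style text."""
    ids = set()
    for line in cpuinfo_text.splitlines():
        # Each field looks like "physical id : 0"; split on the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "physical id":
            ids.add(value.strip())
    return len(ids)
```

On a live system the text would come from reading /proc/cpuinfo, e.g. count_physical_ids(open("/proc/cpuinfo").read()).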
Dave
On 03.22, Dan Maas wrote:
> Is there a canonical way for user-space software to determine how many
> real CPUs are present in a system (as opposed to HyperThreaded or
> otherwise virtual CPUs)?
>
This is a 2x Xeon HT box, i.e. 4 CPUs on 2 packages:
cat /proc/cpuinfo:
processor : 0
...
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
processor : 1
...
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
processor : 2
...
physical id : 3
siblings : 2
core id : 3
cpu cores : 1
processor : 3
...
physical id : 3
siblings : 2
core id : 3
cpu cores : 1
So something like:
grep 'core id' /proc/cpuinfo | sort -u | wc -l
would give you the number of packages or 'real cpus' (sort -u rather than
plain uniq, so that duplicates are dropped even when they are not adjacent).
Then you have to choose which ones are unrelated. Usually evens are siblings
of odds, but I wouldn't rely on it...
> We have an application that for performance reasons wants to run one
> process per CPU. However, on a HyperThreaded system /proc/cpuinfo
> lists two CPUs, and running two processes in this case is the wrong
> thing to do. (Hyperthreading ends up degrading our performance,
> perhaps due to cache or bus contention).
>
I always hear people talk about HT 'degrading' performance. Obviously you don't
get 200%, but it is always better than 100%. With my simulation code,
in which I did nothing special for HT (it uses my 4 CPUs as 'real' ones),
I usually get a 125-130% gain. So the theoretical performance gain looks real.
Your application's behaviour has to be really nasty to run slower with 2 threads
on an HT P4 than with one thread.
Hope this helps.
--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandrakelinux release 10.2 (Cooker) for i586
Linux 2.6.11-jam6 (gcc 3.4.3 (Mandrakelinux 10.2 3.4.3-6mdk)) #1
Dan Maas wrote:
> Is there a canonical way for user-space software to determine how many
> real CPUs are present in a system (as opposed to HyperThreaded or
> otherwise virtual CPUs)?
>
> We have an application that for performance reasons wants to run one
> process per CPU. However, on a HyperThreaded system /proc/cpuinfo
> lists two CPUs, and running two processes in this case is the wrong
> thing to do. (Hyperthreading ends up degrading our performance,
> perhaps due to cache or bus contention).
>
> Please CC replies.
>
> Thanks,
> Dan Maas
The simplest thing to do would be to boot with the "noht" parameter.
Or you can use "schedtool" (google or freshmeat) to set the CPU-affinity
at runtime to avoid HyperThreading the processes.
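The same idea can be sketched from inside the application. This is not schedtool itself; it swaps in Python's os.sched_setaffinity (Linux only). The helper names and the sibling grouping passed in are assumptions for illustration; the grouping would have to come from something like /proc/cpuinfo:

```python
import os


def one_cpu_per_core(sibling_groups):
    """Pick the lowest-numbered logical CPU from each group of HT siblings."""
    return sorted(min(group) for group in sibling_groups)


def pin_worker(cpu, pid=0):
    """Restrict process `pid` (0 = the caller) to one logical CPU (Linux only)."""
    os.sched_setaffinity(pid, {cpu})
```

For the 2x Xeon HT box quoted earlier in the thread, where processors 0 and 1 share one core and 2 and 3 share the other, one_cpu_per_core([[0, 1], [2, 3]]) selects CPUs 0 and 2, and each worker process would then call pin_worker() with one of them.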
Daniel Andersen
--
>This is 2xXeonHT, is, 4 cpus on 2 packages:
>
>cat /proc/cpuinfo:
>
>processor : 0
>...
>physical id : 0
>siblings : 2
>core id : 0
>cpu cores : 1
>
>processor : 1
>...
>physical id : 0
>siblings : 2
>core id : 0
>cpu cores : 1
>
>processor : 2
>...
>physical id : 3
>siblings : 2
>core id : 3
>cpu cores : 1
>
>processor : 3
>...
>physical id : 3
>siblings : 2
>core id : 3
>cpu cores : 1
>
>So something like:
>
>cat /proc/cpuinfo | grep 'core id' | uniq | wc -l
>
>would give you the number of packages or 'real cpus'. Then you have to
>choose which ones are unrelated. Usually evens are siblings of
>odds, but
>I won't trust on it...
>
The number of unique "physical id" values tells you the number of physical
CPU packages in the system.
The number of unique "core id" values tells you the total number of CPU cores
in the system.
The number of "processor" entries tells you the total number of logical CPUs
in the system.
To find the matching pairs:
- HT siblings on a core: processors that have the same "core id" are HT
siblings in one core.
- CPUs in a package: processors that have the same "physical id" all belong
to the same physical package.
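The rules above can be sketched roughly as follows. The function names are hypothetical. One deliberate deviation, labeled as an assumption: cores are keyed on the (physical id, core id) pair rather than on "core id" alone, which gives the same answer when core ids are unique system-wide and stays safe if they repeat across packages. A processor without a "core id" field is treated as its own core.

```python
from collections import defaultdict


def parse_cpuinfo(text):
    """Return one dict of fields per 'processor' stanza."""
    cpus = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines, "..." elisions, wrapped flag lines
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpus.append({})  # a new stanza starts here
        if cpus:
            cpus[-1][key] = value
    return cpus


def topology(text):
    """Group logical CPU numbers by package and by core."""
    cpus = parse_cpuinfo(text)
    packages = defaultdict(list)
    cores = defaultdict(list)
    for cpu in cpus:
        n = int(cpu["processor"])
        pkg = cpu.get("physical id", "0")
        # Assumption: no "core id" means this processor is a core of its own.
        core = cpu.get("core id", "cpu%d" % n)
        packages[pkg].append(n)
        cores[(pkg, core)].append(n)
    return {"logical": len(cpus),
            "packages": dict(packages),
            "cores": dict(cores)}
```

The lists under "cores" are the HT sibling groups, and the lists under "packages" are the CPUs sharing a physical package.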
Thanks,
Venki
* Dave Jones ([email protected]) wrote:
> Compare the 'physical id' fields of /proc/cpuinfo, and count
> how many unique values you get.
> Ie, on my dual+ht, I see..
>
> physical id : 0
> physical id : 0
> physical id : 3
> physical id : 3
>
> Which indicates 2 real CPUs split in two.
Is this guaranteed to be safe on all architectures? Parsing
/proc/cpuinfo across different architectures can be a bit hairy;
I'm thinking that when non-x86 architectures start to have multiple hardware
threads, people might want to do the same thing.
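One way to sidestep per-architecture /proc/cpuinfo parsing, sketched under a loud assumption: the kernel exposes /sys/devices/system/cpu/cpuN/topology/ with physical_package_id and core_id files. Kernels newer than the 2.6.11 systems shown in this thread provide that directory, but it is not guaranteed to be present or populated on every architecture, so the hypothetical helper below returns None when it finds nothing.

```python
import glob
import os


def count_cores_sysfs(base="/sys/devices/system/cpu"):
    """Count distinct (package, core) pairs; None if no topology info is found."""
    cores = set()
    # cpu[0-9]* matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for topo in glob.glob(os.path.join(base, "cpu[0-9]*", "topology")):
        try:
            with open(os.path.join(topo, "physical_package_id")) as f:
                pkg = f.read().strip()
            with open(os.path.join(topo, "core_id")) as f:
                core = f.read().strip()
        except OSError:
            continue  # skip CPUs with incomplete topology entries
        cores.add((pkg, core))
    return len(cores) or None
```

A caller could fall back to counting "processor" entries in /proc/cpuinfo whenever this returns None.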
Dave
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
On Mon, Mar 21, 2005 at 08:56:03PM -0500, Dave Jones wrote:
> Compare the 'physical id' fields of /proc/cpuinfo, and count
> how many unique values you get.
It doesn't work for Opteron, at least. These two CPUs are in two sockets, but
the physical id is the same.
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 5
model name : AMD Opteron(tm) Processor 244
stepping : 8
cpu MHz : 1792.493
cache size : 1024 KB
physical id : 0
siblings : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm 3dnowext
3dnow
bogomips : 3514.36
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 5
model name : AMD Opteron(tm) Processor 244
stepping : 10
cpu MHz : 1792.493
cache size : 1024 KB
physical id : 0
siblings : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm 3dnowext
3dnow
bogomips : 3571.71
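A possible fallback for output like the above, where "physical id" cannot be trusted, is to lean on the "siblings" field instead. This is only a heuristic sketch. It assumes "siblings" is the number of logical CPUs per package and "cpu cores" (taken as 1 when absent) is the number of cores per package, which is an assumption about these fields rather than something established in this thread. The function name is hypothetical.

```python
from fractions import Fraction


def estimate_real_cpus(cpuinfo_text):
    """Estimate the core count as the sum of 'cpu cores' / 'siblings'."""
    stanzas = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and wrapped 'flags' continuation lines
        key, value = key.strip(), value.strip()
        if key == "processor":
            stanzas.append({})
        if stanzas:
            stanzas[-1][key] = value
    total = Fraction(0)
    for s in stanzas:
        siblings = int(s.get("siblings", 1))
        cores = int(s.get("cpu cores", 1))
        # Each logical CPU contributes its share of one physical core.
        total += Fraction(cores, siblings)
    return int(total)
```

With "siblings : 1" on both processors, as in the dump above, every logical CPU counts as a full core; with "siblings : 2" and "cpu cores : 1", two logical CPUs add up to one core.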
--
Tom Vier <[email protected]>
DSA Key ID 0x15741ECE
Pallipadi, Venkatesh wrote:
>
>
>
>>This is 2xXeonHT, is, 4 cpus on 2 packages:
>>
>>cat /proc/cpuinfo:
>>
>>processor : 0
>>...
>>physical id : 0
>>siblings : 2
>>core id : 0
>>cpu cores : 1
>>
>>processor : 1
>>...
>>physical id : 0
>>siblings : 2
>>core id : 0
>>cpu cores : 1
>>
>>processor : 2
>>...
>>physical id : 3
>>siblings : 2
>>core id : 3
>>cpu cores : 1
>>
>>processor : 3
>>...
>>physical id : 3
>>siblings : 2
>>core id : 3
>>cpu cores : 1
>>
>>So something like:
>>
>>cat /proc/cpuinfo | grep 'core id' | uniq | wc -l
>>
>>would give you the number of packages or 'real cpus'. Then you have to
>>choose which ones are unrelated. Usually evens are siblings of
>>odds, but
>>I won't trust on it...
>>
>
>
> Number of unique physical id will tell you the number of physical CPU
> packages in the system.
For some Intel processors... Tom Vier just posted his cpuinfo which
shows all of his processors, which he notes are in separate sockets, are
identified as physical zero. I didn't find any Intel systems which
lacked unique physical ID, but clearly that's not true everywhere.
It's not clear if that's bizarre practice on AMD system boards or if
it's mis-reported. Of course Tom may be running a NUMA setup, in which
case I won't guess what's expected to be displayed. I've added him to
the CC list, in hopes of comment.
> Number of unique core id will tell you the total number of CPU cores in
> the system.
> Number of processor will tell you the total number of logical CPUs on
> the system.
>
> Then to find out the matching pairs,
> - to pair up all HT siblings on a core: Processors that have same "core
> id" are HT siblings in a core.
> - to pair up all CPUs in a package: Processors that have same "physical
> id" are all the CPUs belonging to the same physical package.
--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me
> For some Intel processors... Tom Vier just posted his cpuinfo which shows all
> of his processors, which he notes are in separate sockets, are identified as
> physical zero. I didn't find any Intel systems which lacked unique physical ID,
> but clearly that's not true everywhere.
Hmm, what will we do once there is a dual-PS3 (*) chip?
(*) Currently pure imagination, but what I mean is a machine with two or more
CPU chips, each of which contains 8 CPUs. The latter (8 CPUs on one chip) is
already reality, according to a PlayStation 3 article about its new CPU that
caught my eye.
Jan Engelhardt
--
On Tue, Mar 22, 2005 at 04:26:47PM -0500, Bill Davidsen wrote:
> It's not clear if that's bizarre practice on AMD system boards or if
> it's mis-reported. Of course Tom may be running a NUMA setup, in which
> case I won't guess what's expected to be displayed. I've added him to
> the CC list, in hopes of comment.
It's numa (two cores, one ram ctrlr per core, one core per package). I'm
running an x86 kernel, btw, not 64bit. I have CONFIG_X86_HT set, and it
looks like it gets the pkg id from the apic (there's only one in multicore
packages?), but i might be reading it wrong.
My dmesg overflows before syslog starts, so all I could gather is:
Mar 23 12:04:25 zero kernel: Brought up 2 CPUs
Mar 23 12:04:25 zero kernel: CPU0 attaching sched-domain:
Mar 23 12:04:25 zero kernel: domain 0: span 3
Mar 23 12:04:25 zero kernel: groups: 1 2
Mar 23 12:04:25 zero kernel: CPU1 attaching sched-domain:
Mar 23 12:04:25 zero kernel: domain 0: span 3
Mar 23 12:04:25 zero kernel: groups: 2 1
I don't know how the scheduling domains work, and i'm too busy to look it up
right now.
--
Tom Vier <[email protected]>
DSA Key ID 0x15741ECE
Tom Vier wrote:
> On Tue, Mar 22, 2005 at 04:26:47PM -0500, Bill Davidsen wrote:
>
>>It's not clear if that's bizarre practice on AMD system boards or if
>>it's mis-reported. Of course Tom may be running a NUMA setup, in which
>>case I won't guess what's expected to be displayed. I've added him to
>>the CC list, in hopes of comment.
>
>
> It's numa (two cores, one ram ctrlr per core, one core per package). I'm
> running an x86 kernel, btw, not 64bit. I have CONFIG_X86_HT set, and it
> looks like it gets the pkg id from the apic (there's only one in multicore
> packages?), but i might be reading it wrong.
>
> My dmesg overflows before syslog starts, so all i could gather is:
Thanks, Tom. I suspect that NUMA has issues of its own in this area. I
always set up the kernel buffer size and run dmesg with -s200000 or so
on machines which tend to be overly verbose.
I leave it to someone really expert to figure out how to tell the actual
number of sockets, cores, and siblings. I think the system scheduler may
want this info, but I certainly don't intend to start second-guessing the
kernel on this stuff. The current scheduler does a pretty good job of
handling HT now; I'm not about to try to do better.
Interesting thought: I believe the IBM chip for the PS3 not only has many
cores, but one article said they are not all the same. That, and
thoughts of someone running an SMP system with chips that have
non-identical sibling counts, make me glad someone else is doing the
scheduler.
>
> Mar 23 12:04:25 zero kernel: Brought up 2 CPUs
> Mar 23 12:04:25 zero kernel: CPU0 attaching sched-domain:
> Mar 23 12:04:25 zero kernel: domain 0: span 3
> Mar 23 12:04:25 zero kernel: groups: 1 2
> Mar 23 12:04:25 zero kernel: CPU1 attaching sched-domain:
> Mar 23 12:04:25 zero kernel: domain 0: span 3
> Mar 23 12:04:25 zero kernel: groups: 2 1
>
> I don't know how the scheduling domains work, and i'm too busy to look it up
> right now.
>
--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me