I have an AMD Athlon64 X2 4400+ CPU (dual core, not SMT capable, so
two cores and two threads) - running a 32bit kernel.
Linux dragon 2.6.16-rc3-git7 #4 SMP PREEMPT Thu Feb 16 22:14:55 CET
2006 i686 athlon-4 i386 GNU/Linux
I just noticed that the number of core siblings reported through sysfs
is wrong. The number of thread siblings is correct and so is the info
reported via /proc/cpuinfo - only sysfs seems to get it wrong.
juhl@dragon:~$ cat /sys/devices/system/cpu/cpu1/topology/thread_siblings
2
juhl@dragon:~$ cat /sys/devices/system/cpu/cpu1/topology/core_siblings
3
two thread siblings makes perfect sense, but 3 core siblings?
Did my cores start to reproduce? - I know CPU's get hot, but I didn't
know that was the reason ;-)
Proc nicely reports 2 cores with same physical id and 2 siblings :
juhl@dragon:~$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 35
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 2
cpu MHz : 2200.724
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips : 4405.02
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 35
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 2
cpu MHz : 2200.724
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips : 4399.55
I tried adding some printk's to arch/i386/kernel/smpboot.c to see how
cpu_core_map got initialized, but couldn't spot any errors.
In set_cpu_sibling_map() there's a loop that runs for_each_cpu_mask(i,
cpu_sibling_setup_map) and does
cpu_set(i, cpu_core_map[cpu]);
cpu_set(cpu, cpu_core_map[i]);
I tried adding printk(KERN_WARNING "DEBUG: i = %d, cpu = %d\n", i, cpu);
just before those cpuset() calls, and that prints out
DEBUG: i = 0, cpu = 0
DEBUG: i = 0, cpu = 1
DEBUG: i = 1, cpu = 1
So we'll set bits 0 & 1 for each of the two cores in cpu_core_map[] ,
which looks sane to me.
So where does the extra core sibling come from that's reported via sysfs?
As far as I can tell, the
#define topology_core_siblings(cpu) (cpu_core_map[cpu])
in include/asm-i386/topology.h , will cause this bit of code :
#define define_siblings_show_func(name) \
static ssize_t show_##name(struct sys_device *dev, char *buf) \
{ \
ssize_t len = -1; \
unsigned int cpu = dev->id; \
len = cpumask_scnprintf(buf, NR_CPUS+1, topology_##name(cpu)); \
return (len + sprintf(buf + len, "\n")); \
}
...
#ifdef topology_core_siblings
define_siblings_show_func(core_siblings);
...
in drivers/base/topology.c to print out the nr of core siblings based
on what's in cpu_core_map - which as far as I can tell is OK.
Obviously something is wrong, but I just can't seem to spot it. Any clues?
--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
On Thursday 16 February 2006 22:46, Jesper Juhl wrote:
> Obviously something is wrong, but I just can't seem to spot it. Any clues?
It's a bitmap. 3 = 0b11
-Andi
On 2/16/06, Andi Kleen <[email protected]> wrote:
> On Thursday 16 February 2006 22:46, Jesper Juhl wrote:
>
> > Obviously something is wrong, but I just can't seem to spot it. Any clues?
>
> It's a bitmap. 3 = 0b11
>
When I was reading the smpboot code my brain *was* actually in the
"this is a bitmap" mode, but when I then looked at the sysfs code it
for some reason switched to "this wants to just print the number of
siblings as an integer" mode - which was obviously where I went wrong.
If it's being treated as a bitmap when it's created why would that
change when it gets printed - D'OH!
Thank you very much for that hit with the clue stick Andi.
--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
On Thu, 2006-02-16 at 23:08 +0100, Jesper Juhl wrote:
> On 2/16/06, Andi Kleen <[email protected]> wrote:
> > On Thursday 16 February 2006 22:46, Jesper Juhl wrote:
> >
> > > Obviously something is wrong, but I just can't seem to spot it. Any clues?
> >
> > It's a bitmap. 3 = 0b11
> >
>
> When I was reading the smpboot code my brain *was* actually in the
> "this is a bitmap" mode, but when I then looked at the sysfs code it
> for some reason switched to "this wants to just print the number of
> siblings as an integer" mode - which was obviously where I went wrong.
> If it's being treated as a bitmap when it's created why would that
> change when it gets printed - D'OH!
>
> Thank you very much for that hit with the clue stick Andi.
While looking around for other examples, I ran across
cpufreq/affected_cpus. Shouldn't cpufreq.c:show_affected_cpus() be
using cpumask_scnprintf instead of "%u"?
But anyway...
It seems to me that this could be confusing for a lot of people who are
casually browsing through sysfs. Why not name core_siblings something
like core_sibling_bitmap? Using _units isn't unprecedented, we have
read_ahead_kb. I think it's nice not having to look it up to know
read_ahead is in kb and not bytes or sectors. Likewise, it'd be nice to
know it's a bitmap and not a count just by looking at the name.
More alternatively, bitmaps seem ugly. Why not one of these options
instead?
- a space separated list of bitmap offsets: "0 1" instead of 3
- a directory of symlinks to devices/system/cpu/cpu[N]:
devices/system/cpu/cpu1/topology/core_siblings/0 -> ../../../cpu0
devices/system/cpu/cpu1/topology/core_siblings/1 -> ../../../cpu1
--
Zan Lynx <[email protected]>
On Friday 17 February 2006 03:10, Zan Lynx wrote:
> It seems to me that this could be confusing for a lot of people who are
> casually browsing through sysfs. Why not name core_siblings something
> like core_sibling_bitmap?
Because it's already a fixed ABI that is put in stone.
The only way to do what you want would be to add a new field
and keep the old one alone, but frankly your rationale for
it ("could be confusing to someone") doesn't seem convincing enough
for such a thing. Especially since each sysfs field can
cost considerable memory when a dentry and a inode have to be allocated
for it.
I guess if you worry about such people a better way to help
them would be to write them a frontend that displays
the information in there in nice form.
-Andi
On Fri, 2006-02-17 at 03:18 +0100, Andi Kleen wrote:
> On Friday 17 February 2006 03:10, Zan Lynx wrote:
>
> > It seems to me that this could be confusing for a lot of people who are
> > casually browsing through sysfs. Why not name core_siblings something
> > like core_sibling_bitmap?
>
> Because it's already a fixed ABI that is put in stone.
Hardly "fixed in stone." I checked before making the suggestion to
change it, the topology sysfs entries are only in 2.6.16-rc kernels, so
they've never been released in an official mainline kernel. Fixed in
clay that hasn't been fired yet, perhaps. Maybe by -rc3 it's already
baking in the kiln, but its still changeable.
> The only way to do what you want would be to add a new field
> and keep the old one alone, but frankly your rationale for
> it ("could be confusing to someone") doesn't seem convincing enough
> for such a thing. Especially since each sysfs field can
> cost considerable memory when a dentry and a inode have to be allocated
> for it.
Now is hardly the time to worry about efficiency in sysfs. It's already
a maze of twisty symlinks. And the whole rationale of sysfs is easy
shell-tool access to kernel data about the system.
> I guess if you worry about such people a better way to help
> them would be to write them a frontend that displays
> the information in there in nice form.
Nice frontends processing efficient but hard to read data would be best
implemented in binary through netlink or system calls where there would
be no need to format to text, create an inode, then have userspace open
directories, follow links, open a file, read it, parse it BACK into
binary, etc.
So obviously efficiency isn't a primary sysfs consideration. That would
leave ... ease of use? :-)
Your two objections aren't convincing to me, but I don't suppose it
matters since I don't have a major interest in the topology code or how
it works. So continue on! It was only a helpful suggestion.
--
Zan Lynx <[email protected]>