2002-07-30 16:07:56

by Steven Cole

Subject: 2.5.29, CPU#1 not working with CONFIG_SMP=y, 2.5.28 OK.

On my dual P3 system with an Intel STL2 motherboard, linux-2.5.29 does not
bring up the second CPU. And yes, I double-checked that I really had
CONFIG_SMP=y for 2.5.29, and rebuilt to make sure. SMP worked fine with
2.5.28.

Here are snippets from the dmesg output from 2.5.28 and 2.5.29:

2.5.28 dmesg snippet:

Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 999.0611 MHz.
..... host bus clock speed is 133.0281 MHz.
cpu: 0, clocks: 133281, slice: 4038
CPU0<T0:133280,T1:129232,D:10,S:4038,C:133281>
cpu: 1, clocks: 133281, slice: 4038
CPU1<T0:133280,T1:125200,D:4,S:4038,C:133281>
checking TSC synchronization across CPUs: passed.
migration_task 0 on cpu=0
migration_task 1 on cpu=1
Linux NET4.0 for Linux 2.4

2.5.29 dmesg snippet:

Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 999.0634 MHz.
..... host bus clock speed is 133.0284 MHz.
cpu: 0, clocks: 133284, slice: 4038
CPU0<T0:133280,T1:129232,D:10,S:4038,C:133284>
checking TSC synchronization across 2 CPUs: passed.
Bringing up 3
CPUS done 4294967295
Linux NET4.0 for Linux 2.4
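
The 4294967295 on the "CPUS done" line equals 2^32 - 1, which is what an
unsigned 32-bit "no limit" value looks like when printed with %u. That
reading fits the later report in this thread that booting with maxcpus=2
changes the line to "CPUS done 2". A minimal sketch of the check, assuming
the kernel's default CPU limit is UINT_MAX (not confirmed in this thread);
cpus_done_is_unlimited() is a hypothetical helper, not kernel code:

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse a "CPUS done <n>" dmesg line.
 * Returns 1 if <n> equals UINT_MAX (read here as "no CPU limit set"),
 * 0 if <n> is any other value, and -1 if the line has a different form.
 * Assumes a 32-bit unsigned int, as on i386.
 */
static int cpus_done_is_unlimited(const char *line)
{
	unsigned int n;

	if (sscanf(line, "CPUS done %u", &n) != 1)
		return -1;
	return n == UINT_MAX;
}
```

So the large number by itself probably reflects an unset limit rather than
a corrupted CPU count.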

Earlier in the 2.5.29 dmesg output, the second CPU is initialized:

Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 1994.75 BogoMIPS
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU1: Intel 00/08 stepping 06
Total of 2 processors activated (3969.02 BogoMIPS).

The output of /proc/cpuinfo shows two CPUs for 2.5.28 and only
one for 2.5.29:

[steven@spc5 steven]$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : 00/08
stepping : 6
cpu MHz : 1000.127
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 mmx fxsr sse
bogomips : 1974.27
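
One way to make this check repeatable is to count the "processor" entries
instead of reading the listing by eye. A minimal sketch, assuming each CPU
the kernel reports contributes exactly one line starting with "processor";
count_processors() is a hypothetical helper that takes the file contents as
a string:

```c
#include <string.h>

/*
 * Count the lines in /proc/cpuinfo-style text that begin with
 * "processor". Each such line is assumed to introduce one CPU entry.
 */
static int count_processors(const char *text)
{
	int count = 0;
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, "processor", 9) == 0)
			count++;
		line = strchr(line, '\n');	/* find the end of this line */
		if (line)
			line++;			/* step past the newline */
	}
	return count;
}
```

On the listing above this would give 1, where a working SMP boot on this
machine should give 2.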

And indeed, the second CPU is not being used. I noticed this when a
benchmark run under 2.5.29 came out much lower, at about half speed.

I also tried Craig Kulesa's patches for rmap and slabLRU for 2.5.29,
since the readme mentioned an SMP fix, but that kernel shows the same
problem as vanilla 2.5.29.

Steven




2002-07-31 02:02:34

by Rusty Russell

Subject: Re: 2.5.29, CPU#1 not working with CONFIG_SMP=y, 2.5.28 OK.

In message <[email protected]> you write:
> Rusty,
>
> I sent the following to lkml before I realized that this should have
> been cc'ed to you. In the meantime, I looked at changes to init/main.c,
> so I tried rebooting 2.5.29 with maxcpus=2 on the command line at boot.
> That changed the line from dmesg which read
> CPUS done 4294967295
> to
> CPUS done 2
> but still I have the same result in that only CPU#0 is running.

> 2.5.29 dmesg snippet:
>
> Using local APIC timer interrupts.
> calibrating APIC timer ...
> ..... CPU clock speed is 999.0634 MHz.
> ..... host bus clock speed is 133.0284 MHz.
> cpu: 0, clocks: 133284, slice: 4038
> CPU0<T0:133280,T1:129232,D:10,S:4038,C:133284>
> checking TSC synchronization across 2 CPUs: passed.
> Bringing up 3

Hmm... this is the hint, here. Please try the patch below (trivial,
but untested).

Please tell the results!
Rusty.

diff -urpN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.29/include/asm-i386/smp.h working-2.5.29-smpfix/include/asm-i386/smp.h
--- linux-2.5.29/include/asm-i386/smp.h Sat Jul 27 15:24:39 2002
+++ working-2.5.29-smpfix/include/asm-i386/smp.h Wed Jul 31 11:28:04 2002
@@ -85,7 +85,9 @@ extern volatile int logical_apicid_to_cp
*/
#define smp_processor_id() (current_thread_info()->cpu)

-#define cpu_possible(cpu) (phys_cpu_present_map & (1<<(cpu)))
+extern volatile unsigned long cpu_callout_map;
+
+#define cpu_possible(cpu) (cpu_callout_map & (1<<(cpu)))
#define cpu_online(cpu) (cpu_online_map & (1<<(cpu)))

extern inline unsigned int num_online_cpus(void)
@@ -113,7 +115,6 @@ static __inline int logical_smp_processo
return GET_APIC_LOGICAL_ID(*(unsigned long *)(APIC_BASE+APIC_LDR));
}

-extern volatile unsigned long cpu_callout_map;
/* We don't mark CPUs online until __cpu_up(), so we need another measure */
static inline int num_booting_cpus(void)
{

--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
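
The "Bringing up 3" line fits the patch above if the two bitmaps are indexed
differently. A sketch of why that would matter, under assumptions not
confirmed in this thread: that phys_cpu_present_map is indexed by physical
APIC ID, that cpu_callout_map is indexed by logical CPU number, and that the
second CPU on this board has APIC ID 3. The mask values and
first_secondary_cpu() are hypothetical and only illustrate the mismatch:

```c
#define NR_CPUS 32

/*
 * Hypothetical masks for a two-CPU board whose second CPU has physical
 * APIC ID 3 (an assumption inferred from the "Bringing up 3" line).
 */
static const unsigned long phys_present_mask = (1UL << 0) | (1UL << 3); /* by APIC ID */
static const unsigned long callout_mask      = (1UL << 0) | (1UL << 1); /* by logical CPU */

/*
 * Return the first logical CPU number other than the boot CPU (0) that
 * the given mask reports as possible, or -1 if there is none. This
 * mimics a bring-up loop that tests cpu_possible(cpu) for each number.
 */
static int first_secondary_cpu(unsigned long possible_mask)
{
	int cpu;

	for (cpu = 1; cpu < NR_CPUS; cpu++)
		if (possible_mask & (1UL << cpu))
			return cpu;
	return -1;
}
```

Under these assumptions the APIC-ID-indexed mask makes the scan pick 3, which
is not the logical number of the second CPU, while the logical mask makes it
pick 1.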

2002-07-31 14:05:26

by Steven Cole

Subject: Re: 2.5.29, CPU#1 not working with CONFIG_SMP=y, 2.5.28 OK.

On Tue, 2002-07-30 at 19:34, Rusty Russell wrote:
[snipped]
>
> Hmm... this is the hint, here. Please try the patch below (trivial,
> but untested).
>
> Please tell the results!
> Rusty.

Yes, that worked. Thanks.

I had also previously applied the "Fix ksoftirqd and migration threads
initcalls" changeset 1.476.1.13 to my 2.5.29 tree.

I tested this with and without appending maxcpus=2 to the boot line.

Steven