2009-04-17 17:03:59

by Maciej Rutecki

Subject: Spontaneous reboots since 2.6.29-rc*

I sometimes see spontaneous reboots while the kernel is booting, right
after this message:
[...]
[ 0.024996] ... PM-Timer delta = 357949
[ 0.024996] ... PM-Timer result ok
[ 0.024996] ..... delta 1249983
[ 0.024996] ..... mult: 53694522
[ 0.024996] ..... calibration result: 199997
[ 0.024996] ..... CPU clock speed is 1999.0972 MHz.
[ 0.024996] ..... host bus clock speed is 199.0997 MHz.
[ 0.024996] Booting processor 1 APIC 0x1 ip 0x6000

On a successful boot, the next lines are:
[ 0.000999] Initializing CPU#1
[ 0.000999] masked ExtINT on CPU#1
[ 0.000999] Calibrating delay using timer specific routine.. 4067.56 BogoMIPS (lpj=2033784)
[ 0.000999] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 0.000999] CPU: L2 cache: 1024K


I also found this message in dmesg:
[ 0.000000] 4 Processors exceeds NR_CPUS limit of 2
[ 0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs


I only have two processors:
maciek@zlom:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Pentium(R) Dual CPU E2180 @ 2.00GHz
stepping : 13
cpu MHz : 1200.000
cache size : 1024 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx
lm constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl est
tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 4423.96
clflush size : 64
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Pentium(R) Dual CPU E2180 @ 2.00GHz
stepping : 13
cpu MHz : 1200.000
cache size : 1024 KB
physical id : 0
siblings : 1
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx
lm constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl est
tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 4067.56
clflush size : 64
power management:

I use the latest BIOS. Unfortunately the reboot happens only
occasionally, so I cannot bisect it.

Config:
http://unixy.pl/maciek/download/kernel/2.6.30-rc2/pc/config-2.6.30-rc2

Dmesg:
http://unixy.pl/maciek/download/kernel/2.6.30-rc2/pc/dmesg-2.6.30-rc2.txt

Regards

--
Maciej Rutecki
http://www.maciek.unixy.pl


2009-04-20 05:15:50

by Yinghai Lu

Subject: Re: Spontaneous reboots since 2.6.29-rc*

On Fri, Apr 17, 2009 at 10:03 AM, Maciej Rutecki wrote:
> I sometimes see spontaneous reboots while the kernel is booting, right
> after this message:
> [...]
> [ 0.024996] ... PM-Timer delta = 357949
> [ 0.024996] ... PM-Timer result ok
> [ 0.024996] ..... delta 1249983
> [ 0.024996] ..... mult: 53694522
> [ 0.024996] ..... calibration result: 199997
> [ 0.024996] ..... CPU clock speed is 1999.0972 MHz.
> [ 0.024996] ..... host bus clock speed is 199.0997 MHz.
> [ 0.024996] Booting processor 1 APIC 0x1 ip 0x6000
>
> On a successful boot, the next lines are:
> [ 0.000999] Initializing CPU#1
> [ 0.000999] masked ExtINT on CPU#1
> [ 0.000999] Calibrating delay using timer specific routine.. 4067.56 BogoMIPS (lpj=2033784)
> [ 0.000999] CPU: L1 I cache: 32K, L1 D cache: 32K
> [ 0.000999] CPU: L2 cache: 1024K
>
>
> I also found this message in dmesg:
> [ 0.000000] 4 Processors exceeds NR_CPUS limit of 2
> [ 0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs

Your config has
CONFIG_NR_CPUS=2

and the ACPI MADT reports:
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)

Only two of the four LAPIC entries are enabled, so that should be OK.
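
To make that check concrete, here is a minimal Python sketch that counts the LAPIC entries in the MADT lines above and compares the enabled ones with CONFIG_NR_CPUS. The regular expression and the helper name count_lapics are illustrative assumptions, not anything from the kernel; the "4 Processors" warning appears to count the disabled entries as well.

```python
import re

# LAPIC lines copied from the dmesg excerpt above (timestamps dropped).
MADT_LINES = """\
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
"""

# Hypothetical pattern for this log format; adjust if your dmesg differs.
LAPIC_RE = re.compile(
    r"ACPI: LAPIC \(acpi_id\[0x[0-9a-f]+\] lapic_id\[0x[0-9a-f]+\] "
    r"(enabled|disabled)\)"
)


def count_lapics(text):
    """Return (total, enabled) LAPIC entries found in dmesg text."""
    states = [m.group(1) for m in LAPIC_RE.finditer(text)]
    return len(states), states.count("enabled")


NR_CPUS = 2  # CONFIG_NR_CPUS from the posted config

total, enabled = count_lapics(MADT_LINES)
print(f"{total} LAPIC entries, {enabled} enabled, NR_CPUS={NR_CPUS}")
if enabled <= NR_CPUS:
    print("enabled CPUs fit within NR_CPUS")
else:
    print("enabled CPUs exceed NR_CPUS")
```

To run it against a real log, pass the full dmesg text to count_lapics() instead of MADT_LINES.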

But the MTRR setup looks weird:

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000bf5e0000 (usable)
[ 0.000000] BIOS-e820: 00000000bf5e0000 - 00000000bf5e3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000bf5e3000 - 00000000bf5f0000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000bf5f0000 - 00000000bf600000 (reserved)
[ 0.000000] BIOS-e820: 00000000c0000000 - 00000000d0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] DMI 2.4 present.
[ 0.000000] last_pfn = 0xbf5e0 max_arch_pfn = 0x100000
[ 0.000000] MTRR default type: uncachable
[ 0.000000] MTRR fixed ranges enabled:
[ 0.000000] 00000-9FFFF write-back
[ 0.000000] A0000-BFFFF uncachable
[ 0.000000] C0000-CAFFF write-protect
[ 0.000000] CB000-EFFFF uncachable
[ 0.000000] F0000-FFFFF write-through
[ 0.000000] MTRR variable ranges enabled:
[ 0.000000] 0 base 000000000 mask F00000000 write-back
[ 0.000000] 1 base 0C0000000 mask FC0000000 uncachable
[ 0.000000] 2 base 0BF800000 mask FFF800000 uncachable
[ 0.000000] 3 base 0BF700000 mask FFFF00000 uncachable
[ 0.000000] 4 base 0BF600000 mask FFFF00000 uncachable
[ 0.000000] 5 disabled
[ 0.000000] 6 disabled
[ 0.000000] 7 disabled
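
The base/mask pairs are easier to judge once they are turned into address ranges. The Python sketch below does that for the five enabled variable registers. It assumes 36-bit physical addresses (inferred from the 9-hex-digit masks in the log, not stated anywhere in it) and contiguous masks; mtrr_range is a made-up helper name.

```python
# Assumption: the 9-hex-digit masks in the log imply 36-bit physical addresses.
PHYS_BITS = 36

# (register, base, mask, type) copied from the dmesg excerpt above.
VARIABLE_MTRRS = [
    (0, 0x000000000, 0xF00000000, "write-back"),
    (1, 0x0C0000000, 0xFC0000000, "uncachable"),
    (2, 0x0BF800000, 0xFFF800000, "uncachable"),
    (3, 0x0BF700000, 0xFFFF00000, "uncachable"),
    (4, 0x0BF600000, 0xFFFF00000, "uncachable"),
]


def mtrr_range(base, mask, phys_bits=PHYS_BITS):
    """Return (start, end_inclusive) for a contiguous base/mask pair."""
    full = (1 << phys_bits) - 1
    size = (~mask & full) + 1  # the zero bits of the mask give the size
    return base, base + size - 1


for reg, base, mask, mtype in VARIABLE_MTRRS:
    start, end = mtrr_range(base, mask)
    size_mb = (end - start + 1) >> 20
    print(f"reg {reg}: {start:#011x}-{end:#011x} {size_mb:5d} MB {mtype}")
```

Decoded this way, register 0 marks the whole first 4 GB write-back, and registers 1 to 4 carve uncachable holes from 0xBF600000 up to 4 GB. That leaves write-back only below 0xBF600000, just above the end of usable RAM at 0xbf5e0000 in the e820 map.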

[ 17.421843] [drm] Initialized drm 1.1.0 20060810
[ 17.431172] pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 17.431179] pci 0000:00:02.0: setting latency timer to 64
[ 17.433356] mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
[ 17.433360] [drm] MTRR allocation failed. Graphics performance may suffer.
[ 17.433388] pci 0000:00:02.0: irq 27 for MSI/MSI-X
[ 17.433409] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[ 17.452114] mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
[ 17.749084] set status page addr 0x00043000
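
As a rough illustration of why the write-combining request is refused, this Python sketch lists the variable MTRRs from the log that overlap the requested range d0000000,10000000. It assumes 36-bit physical addresses (inferred from the mask width) and contiguous masks; overlapping_registers is a hypothetical helper, and the kernel's real checks are more involved than this.

```python
# Assumption: 36-bit physical addresses, inferred from the 9-digit masks.
PHYS_BITS = 36

# (register, base, mask, type) copied from the MTRR dump in the same dmesg.
VARIABLE_MTRRS = [
    (0, 0x000000000, 0xF00000000, "write-back"),
    (1, 0x0C0000000, 0xFC0000000, "uncachable"),
    (2, 0x0BF800000, 0xFFF800000, "uncachable"),
    (3, 0x0BF700000, 0xFFFF00000, "uncachable"),
    (4, 0x0BF600000, 0xFFFF00000, "uncachable"),
]

# Range from "mtrr: type mismatch for d0000000,10000000".
REQ_BASE, REQ_SIZE = 0xD0000000, 0x10000000


def overlapping_registers(start, size, regs=VARIABLE_MTRRS):
    """Return [(register, type)] for every MTRR range that overlaps the request."""
    full = (1 << PHYS_BITS) - 1
    hits = []
    for reg, base, mask, mtype in regs:
        reg_size = (~mask & full) + 1
        # Two half-open intervals overlap if each starts before the other ends.
        if base < start + size and start < base + reg_size:
            hits.append((reg, mtype))
    return hits


for reg, mtype in overlapping_registers(REQ_BASE, REQ_SIZE):
    print(f"request {REQ_BASE:#x}+{REQ_SIZE:#x} overlaps reg {reg} ({mtype})")
```

Under these assumptions the request lands inside both register 0 (write-back) and register 1 (uncachable), which is consistent with the "old: write-back" in the message.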

maybe you could enable
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1

YH

2009-04-20 06:18:14

by Maciej Rutecki

Subject: Re: Spontaneous reboots since 2.6.29-rc*

2009/4/20 Yinghai Lu:

>
> maybe you could enable
> CONFIG_MTRR_SANITIZER=y
> CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
> CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
>

I will check whether the reboots still occur with the latest git.

Regards
--
Maciej Rutecki
http://www.maciek.unixy.pl