2007-11-29 21:00:53

by Guennadi Liakhovetski

Subject: [Timers SMP] can this machine be helped?

Hi,

I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still
using. It has many peculiarities, so, I wouldn't be surprised if the
answer to my questions would be "sorry, the patient is rather dead than
alive".

Some of the problems lie in the ACPI area. I tried some time ago to fix the
ACPI tables for this machine, but never found enough time for that. So I'm
still booting with acpi=noirq.

Another problem is that its battery is dead, and it is soldered directly to
the mainboard (Compaq)...

It might also have some problems with one of its 3 SCSI busses.

I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
CONFIG_HIGH_RES_TIMERS. To get the system to boot at least sometimes, I have
to specify nohz=off. Then I get

* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources

Without this parameter it hangs usually between

Time: acpi_pm clocksource has been installed.

and

Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 0

I tried booting with clocksource=tsc, and then I got

Marking TSC unstable due to: possible TSC halt in C2.

And then a few of these:

BUG: soft lockup - CPU#0 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc2-g8c086340 #3)
EIP: 0060:[<c0233d33>] EFLAGS: 00000283 CPU: 0
EIP is at acpi_processor_idle+0x2ae/0x477
EAX: 00000000 EBX: fffffeab ECX: 00000001 EDX: 00000001
ESI: c7c5f2d0 EDI: 00122d9f EBP: c03ddfa8 ESP: c03ddf90
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c01053fa>] show_trace_log_lvl+0x1a/0x30
[<c0105f42>] show_trace+0x12/0x20
[<c01024fc>] show_regs+0x1c/0x20
[<c014fabb>] softlockup_tick+0x11b/0x150
[<c01311f2>] run_local_timers+0x12/0x20
[<c013168f>] update_process_times+0x2f/0x60
[<c014597a>] tick_sched_timer+0x6a/0xe0
[<c013fba0>] hrtimer_interrupt+0x120/0x1a0
[<c0119ff5>] smp_apic_timer_interrupt+0x55/0x90
[<c0104e70>] apic_timer_interrupt+0x28/0x30
[<c0102624>] cpu_idle+0x84/0xf0
[<c0316a7d>] rest_init+0x5d/0x60
[<c03e1a7f>] start_kernel+0x2af/0x2f0
[<00000000>] run_init_process+0x3feff000/0x20
=======================

So, is there any way I can still reasonably use this system? Which
configuration / command-line parameters should I try?

If needed, I can provide a complete dmesg (with nohz=off or with
clocksource=tsc) and .config.

Thanks
Guennadi
---
Guennadi Liakhovetski
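
Archive note: the message above switches between the acpi_pm and tsc clocksources at boot. On a running system, the selected and available clocksources can usually be read from sysfs. The following is a minimal Python sketch, assuming the standard sysfs directory /sys/devices/system/clocksource/clocksource0 is present; the function name read_clocksources and its base parameter are illustrative and not taken from the thread.

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls on Linux.
# Assumption: the running kernel exposes this directory.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksource names read from `base`.

    `base` is a parameter only so the function can be pointed at a
    different directory, for example a copy taken from another machine.
    """
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    try:
        current, available = read_clocksources()
    except OSError as exc:
        print(f"cannot read clocksource info: {exc}")
    else:
        print(f"current:   {current}")
        print(f"available: {' '.join(available)}")
```

If tsc has been marked unstable, as in the log above, it may no longer be selected even though it was requested on the command line.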


2007-12-02 19:00:05

by Pavel Machek

Subject: Re: [Timers SMP] can this machine be helped?

Hi!

> I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still
> using. It has many peculiarities, so, I wouldn't be surprised if the
> answer to my questions would be "sorry, the patient is rather dead than
> alive".
>
> Some of the problems lie in ACPI area, I tried some time ago to fix the
> ACPI tables for these machine, but never got enough time for that. So I'm
> still booting with acpi=noirq
>
> Another problem is its battery is dead and it's hard soldered to the
> mainboard (Compaq)...
>
> It might also have some problems with one of its 3 SCSI busses.
>
> I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
> CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have
> to specify nohz=off. Then I get

Try highres=off, too... Hehe, and even idle=poll might help.

> Pid: 0, comm: swapper Not tainted (2.6.24-rc2-g8c086340 #3)
> EIP: 0060:[<c0233d33>] EFLAGS: 00000283 CPU: 0
> EIP is at acpi_processor_idle+0x2ae/0x477
> EAX: 00000000 EBX: fffffeab ECX: 00000001 EDX: 00000001
> ESI: c7c5f2d0 EDI: 00122d9f EBP: c03ddfa8 ESP: c03ddf90
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> [<c0105f42>] show_trace+0x12/0x20

...and disable softlockup watchdog, too...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
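
Archive note: the reply above suggests adding highres=off and idle=poll to the nohz=off already in use. To see which of these are missing from a given kernel command line, something like the following Python sketch could be used. The helper names (parse_cmdline, missing_suggestions) and the SUGGESTED table are hypothetical, and the parser is deliberately simple: it splits on whitespace and does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' map to None. Quoted values containing spaces
    are not handled; this is a simplification.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Parameters proposed in this thread, with the proposed values.
SUGGESTED = {"nohz": "off", "highres": "off", "idle": "poll"}


def missing_suggestions(cmdline, suggested=SUGGESTED):
    """Return the suggested 'name=value' settings absent from `cmdline`."""
    params = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in suggested.items() if params.get(k) != v]


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            running = f.read()
    except OSError as exc:
        print(f"cannot read /proc/cmdline: {exc}")
    else:
        print("not set:", " ".join(missing_suggestions(running)) or "(none)")
```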

2007-12-03 21:45:00

by Guennadi Liakhovetski

Subject: Re: [Timers SMP] can this machine be helped?

On Sun, 2 Dec 2007, Pavel Machek wrote:

> > I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
> > CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have
> > to specify nohz=off. Then I get
>
> Try highres=off, too... Hehe, and even idle=poll might help.

Ok, for now I've compiled a kernel with all "advanced" features off like
hrt, nohz. Will see how it behaves. But thanks for the hints.

> > Pid: 0, comm: swapper Not tainted (2.6.24-rc2-g8c086340 #3)
> > EIP: 0060:[<c0233d33>] EFLAGS: 00000283 CPU: 0
> > EIP is at acpi_processor_idle+0x2ae/0x477
> > EAX: 00000000 EBX: fffffeab ECX: 00000001 EDX: 00000001
> > ESI: c7c5f2d0 EDI: 00122d9f EBP: c03ddfa8 ESP: c03ddf90
> > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff0ff0 DR7: 00000400
> > [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> > [<c0105f42>] show_trace+0x12/0x20
>
> ...and disable softlockup watchdog, too...

So, you think those BUGs are bogus?

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-12-03 21:47:31

by Pavel Machek

Subject: Re: [Timers SMP] can this machine be helped?

On Mon 2007-12-03 22:45:06, Guennadi Liakhovetski wrote:
> On Sun, 2 Dec 2007, Pavel Machek wrote:
>
> > > I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
> > > CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have
> > > to specify nohz=off. Then I get
> >
> > Try highres=off, too... Hehe, and even idle=poll might help.
>
> Ok, for now I've compiled a kernel with all "advanced" features off like
> hrt, nohz. Will see how it behaves. But thanks for the hints.

Let us know...

> > > CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
> > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > DR6: ffff0ff0 DR7: 00000400
> > > [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> > > [<c0105f42>] show_trace+0x12/0x20
> >
> > ...and disable softlockup watchdog, too...
>
> So, you think those BUGs are bogus?

Well.... you want to make an old machine usable, right? Disabling
warnings is fair game.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-12-03 21:57:18

by Guennadi Liakhovetski

Subject: Re: [Timers SMP] can this machine be helped?

On Mon, 3 Dec 2007, Pavel Machek wrote:

> On Mon 2007-12-03 22:45:06, Guennadi Liakhovetski wrote:
> > On Sun, 2 Dec 2007, Pavel Machek wrote:
>
> > > > CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
> > > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > DR6: ffff0ff0 DR7: 00000400
> > > > [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> > > > [<c0105f42>] show_trace+0x12/0x20
> > >
> > > ...and disable softlockup watchdog, too...
> >
> > So, you think those BUGs are bogus?
>
> Well.... you want to make old machine usable, right? Disabling
> warnings is fair game.

Ouch, but not if a CPU __really__ gets stuck for 13 seconds...

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-12-05 02:19:51

by Robert Hancock

Subject: Re: [Timers SMP] can this machine be helped?

Guennadi Liakhovetski wrote:
> Hi,
>
> I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still
> using. It has many peculiarities, so, I wouldn't be surprised if the
> answer to my questions would be "sorry, the patient is rather dead than
> alive".
>
> Some of the problems lie in ACPI area, I tried some time ago to fix the
> ACPI tables for these machine, but never got enough time for that. So I'm
> still booting with acpi=noirq
>
> Another problem is its battery is dead and it's hard soldered to the
> mainboard (Compaq)...
>
> It might also have some problems with one of its 3 SCSI busses.
>
> I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
> CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have
> to specify nohz=off. Then I get
>
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
>
> Without this parameter it hangs usually between
>
> Time: acpi_pm clocksource has been installed.
>
> and
>
> Switched to high resolution mode on CPU 1
> Switched to high resolution mode on CPU 0
>
> Tried booting with clocksource=tsc then I've got
>
> Marking TSC unstable due to: possible TSC halt in C2.
>
> And then a few of these:
>
> BUG: soft lockup - CPU#0 stuck for 13s! [swapper:0]
>
> Pid: 0, comm: swapper Not tainted (2.6.24-rc2-g8c086340 #3)
> EIP: 0060:[<c0233d33>] EFLAGS: 00000283 CPU: 0
> EIP is at acpi_processor_idle+0x2ae/0x477
> EAX: 00000000 EBX: fffffeab ECX: 00000001 EDX: 00000001
> ESI: c7c5f2d0 EDI: 00122d9f EBP: c03ddfa8 ESP: c03ddf90
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 000002d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> [<c0105f42>] show_trace+0x12/0x20
> [<c01024fc>] show_regs+0x1c/0x20
> [<c014fabb>] softlockup_tick+0x11b/0x150
> [<c01311f2>] run_local_timers+0x12/0x20
> [<c013168f>] update_process_times+0x2f/0x60
> [<c014597a>] tick_sched_timer+0x6a/0xe0
> [<c013fba0>] hrtimer_interrupt+0x120/0x1a0
> [<c0119ff5>] smp_apic_timer_interrupt+0x55/0x90
> [<c0104e70>] apic_timer_interrupt+0x28/0x30
> [<c0102624>] cpu_idle+0x84/0xf0
> [<c0316a7d>] rest_init+0x5d/0x60
> [<c03e1a7f>] start_kernel+0x2af/0x2f0
> [<00000000>] run_init_process+0x3feff000/0x20
> =======================
>
> so, is there any way I can still reasonably use this system? Which
> configuration / command-line parameters should I try?
>
> If needed can provide complete dmesg (with nohz=off or with
> clocksource=tsc) and .config.

How about disabling ACPI entirely, acpi=off on kernel command line? I
wouldn't be surprised to see a lot of ACPI stuff broken on an older
machine like that..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-12-05 06:47:42

by Guennadi Liakhovetski

Subject: Re: [Timers SMP] can this machine be helped?

On Tue, 4 Dec 2007, Robert Hancock wrote:

> Guennadi Liakhovetski wrote:
> >
> > I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still using.
> > It has many peculiarities, so, I wouldn't be surprised if the answer to my
> > questions would be "sorry, the patient is rather dead than alive".
>
> How about disabling ACPI entirely, acpi=off on kernel command line? I wouldn't
> be surprised to see a lot of ACPI stuff broken on an older machine like that..

See above - it's an SMP.

Thanks
Guennadi
---
Guennadi Liakhovetski

2007-12-05 14:34:33

by Robert Hancock

Subject: Re: [Timers SMP] can this machine be helped?

Guennadi Liakhovetski wrote:
> On Tue, 4 Dec 2007, Robert Hancock wrote:
>
>> Guennadi Liakhovetski wrote:
>>> I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still using.
>>> It has many peculiarities, so, I wouldn't be surprised if the answer to my
>>> questions would be "sorry, the patient is rather dead than alive".
>> How about disabling ACPI entirely, acpi=off on kernel command line? I wouldn't
>> be surprised to see a lot of ACPI stuff broken on an older machine like that..
>
> See above - it's an SMP.
>
> Thanks
> Guennadi
> ---
> Guennadi Liakhovetski
>

On a machine that old, you shouldn't need ACPI to detect both CPUs, it
should be able to use MPS..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-12-05 15:04:58

by Guennadi Liakhovetski

Subject: Re: [Timers SMP] can this machine be helped?

On Mon, 3 Dec 2007, Pavel Machek wrote:

> On Mon 2007-12-03 22:45:06, Guennadi Liakhovetski wrote:
> > On Sun, 2 Dec 2007, Pavel Machek wrote:
> >
> > > > I compiled a .24-ish kernel for it with CONFIG_NO_HZ and
> > > > CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have
> > > > to specify nohz=off. Then I get
> > >
> > > Try highres=off, too... Hehe, and even idle=poll might help.
> >
> > Ok, for now I've compiled a kernel with all "advanced" features off like
> > hrt, nohz. Will see how it behaves. But thanks for the hints.
>
> Let us know...

Ok, it cold-booted fine once (after a power-off, which is usually the most
difficult case for this machine). On another boot it was veeeeery slow. I
have a dmesg from that boot; here are some of the first differences from a
subsequent normal boot:

-apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
+apm: BIOS version 1.2 Flags 0x0b (Driver version 1.16ac)

...

Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
+serial 00:08: activated
00:08: ttyS0 at I/O 0x3e8 (irq = 4) is a 16550A
+serial 00:09: activated
00:09: ttyS1 at I/O 0x2e8 (irq = 3) is a NS16550A

...

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller (0x8086:0x7111 rev 0x01) at PCI slot 0000:00:14.1
PIIX4: not 100% native mode: will probe irqs later
-PIIX4: IDE port disabled
+ ide0: BM-DMA at 0x58c0-0x58c7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0x58c8-0x58cf, BIOS settings: hdc:pio, hdd:pio
+Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...

"+" is the slow boot. A reboot helped. And I've now got two soft lockups
on that new boot, each for 13s (hm, it's always 13s...:-(), and this time,
unlike last time, I didn't specify "clocksource=tsc".

On Wed, 5 Dec 2007, Robert Hancock wrote:

> On a machine that old, you shouldn't need ACPI to detect both CPUs, it should
> be able to use MPS..

I don't know the details, but I remember it didn't work without ACPI. The
only solution was acpi=noirq.

Thanks
Guennadi
---
Guennadi Liakhovetski
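
Archive note: the last message compares the dmesg of a slow boot against a normal one, with "+" marking lines from the slow boot. A comparison of that kind can be sketched in Python with the standard difflib module. This is only an illustration, not the method used in the thread; the function names are hypothetical, and the timestamp pattern assumes the optional "[   12.345678]" printk prefix, which may not be present in every log.

```python
import difflib
import re
import sys

# Optional printk timestamp prefix such as "[   12.345678] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamp(line):
    """Remove a leading printk timestamp and the trailing newline."""
    return _TIMESTAMP.sub("", line.rstrip("\n"))


def diff_boot_logs(normal, slow):
    """Return unified-diff lines; '+' marks lines only in the slow boot."""
    a = [strip_timestamp(line) for line in normal]
    b = [strip_timestamp(line) for line in slow]
    return list(difflib.unified_diff(a, b, "normal", "slow", n=1, lineterm=""))


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: dmesgdiff.py NORMAL_DMESG SLOW_DMESG")
    with open(sys.argv[1]) as f_normal, open(sys.argv[2]) as f_slow:
        for line in diff_boot_logs(f_normal.readlines(), f_slow.readlines()):
            print(line)
```

Stripping the timestamps first keeps the diff from flagging every line merely because the two boots ran at different speeds.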