2003-06-28 20:36:21

by Edward Tandi

Subject: Linux 2.4.22-pre2 and AthlonMP

On the Tyan Tiger board (2460) fitted with two processors, if I use
'noapic' in the lilo boot options I get the following in messages (and
eventually a crash):


Jun 28 17:54:36 machine syslogd 1.4.1: restart.
Jun 28 17:54:36 machine kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jun 28 17:54:36 machine kernel: Inspecting /boot/System.map
Jun 28 17:54:36 machine partmon: Checking if partitions have enough free diskspace:
Jun 28 17:54:36 machine kernel: 40(40)
Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:36 machine last message repeated 1127 times
Jun 28 17:54:36 machine partmon: ^[[65G[
Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:36 machine partmon:
Jun 28 17:54:36 machine rc: Starting partmon: succeeded
Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:37 machine last message repeated 18 times
Jun 28 17:54:37 machine nfslock: rpc.lockd startup succeeded
Jun 28 17:54:37 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:37 machine last message repeated 5 times
Jun 28 17:54:37 machine rpc.statd[744]: Version 1.0.1 Starting
Jun 28 17:54:37 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:37 machine nfslock: rpc.statd startup succeeded
Jun 28 17:54:37 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:54:37 machine last message repeated 66 times
...
Jun 28 17:56:12 machine smbd[1925]: Got SIGHUP
Jun 28 17:56:12 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:56:43 machine last message repeated 3057 times
Jun 28 17:57:44 machine last message repeated 6149 times
Jun 28 17:58:01 machine last message repeated 1808 times
Jun 28 17:58:01 machine ntpd[1023]: kernel time discipline status change 41
Jun 28 17:58:01 machine kernel: APIC error on CPU1: 40(40)
Jun 28 17:58:32 machine last message repeated 3058 times
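
To put a number on how fast these errors arrive, the log above can be tallied with a short script. This is only a sketch: it assumes standard syslog lines of the form "Mon DD HH:MM:SS host message", one entry per line (not wrapped by a mail client), and it credits each "last message repeated N times" line to the message logged immediately before it. The function name count_messages is made up for illustration.

```python
import re
from collections import Counter

# "Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)"
LINE = re.compile(r"^(\w{3}\s+\d+\s\d\d:\d\d:\d\d)\s(\S+)\s(.*)$")
REPEAT = re.compile(r"^last message repeated (\d+) times$")


def count_messages(lines):
    """Count occurrences of each message, expanding syslogd repeat lines."""
    counts = Counter()
    previous = None  # the most recent non-repeat message seen
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        if not match:
            continue  # skip blank or malformed lines
        message = match.group(3)
        repeat = REPEAT.match(message)
        if repeat:
            # A repeat line stands for N more copies of the previous message.
            if previous is not None:
                counts[previous] += int(repeat.group(1))
            continue
        counts[message] += 1
        previous = message
    return counts


if __name__ == "__main__":
    sample = [
        "Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)",
        "Jun 28 17:54:36 machine last message repeated 1127 times",
        "Jun 28 17:54:36 machine partmon: ^[[65G[",
        "Jun 28 17:54:36 machine kernel: APIC error on CPU1: 40(40)",
    ]
    for message, total in count_messages(sample).most_common():
        print(total, message)
```

Feeding it the whole of /var/log/messages instead of the four sample lines gives a per-message total for the boot.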


The system boots, but hangs shortly thereafter. Removing the noapic
option, I get the following new messages:


Jun 28 18:27:46 machine kernel: ACPI: PCI Interrupt Link [LNKA] (IRQs 3 5 10 11, enabled at IRQ 9)
Jun 28 18:27:46 machine kernel: ACPI: PCI Interrupt Link [LNKB] (IRQs 3 *5 10 11)
Jun 28 18:27:46 machine kernel: ACPI: PCI Interrupt Link [LNKC] (IRQs 3 5 *10 11)
Jun 28 18:27:46 machine kernel: ACPI: PCI Interrupt Link [LNKD] (IRQs 3 5 10 *11)
Jun 28 18:27:46 machine kernel: ACPI-0352: *** Error: Looking up [Z00Q] in namespace, AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: search_node c1596ac0 start_node c1596ac0 return_node 00000000
Jun 28 18:27:46 machine kernel: ACPI-1121: *** Error: Method execution failed [\_SB_.PCI0.ISA_.SIO_.COM1._STA] (Node c1596ac0), AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: ACPI-0352: *** Error: Looking up [Z00Q] in namespace, AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: search_node c1596dc0 start_node c1596dc0 return_node 00000000
Jun 28 18:27:46 machine kernel: ACPI-1121: *** Error: Method execution failed [\_SB_.PCI0.ISA_.SIO_.COM2._STA] (Node c1596dc0), AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: ACPI-0352: *** Error: Looking up [Z00Q] in namespace, AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: search_node c1594300 start_node c1594300 return_node 00000000
Jun 28 18:27:46 machine kernel: ACPI-1121: *** Error: Method execution failed [\_SB_.PCI0.ISA_.SIO_.LPT_._STA] (Node c1594300), AE_NOT_FOUND
Jun 28 18:27:46 machine kernel: PCI: Probing PCI hardware
Jun 28 18:27:46 machine partmon: ^[[65G[
Jun 28 18:27:46 machine kernel: PCI: Using ACPI for IRQ routing
Jun 28 18:27:46 machine kernel: PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
Jun 28 18:27:46 machine kernel: BIOS failed to enable PCI standards compliance, fixing this error.


I presume this last set of messages is due to the recent ACPI changes.
If I try booting with the suggested 'pci=noacpi', the machine hangs
during boot before it gets to the SCSI driver. Setting 'acpi=off' gets
rid of the messages and the box appears to run OK.

Ed-T.



2003-06-28 21:39:35

by Alan

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sad, 2003-06-28 at 21:50, Edward Tandi wrote:
> Jun 28 18:27:46 machine kernel: PCI: Using ACPI for IRQ routing
> Jun 28 18:27:46 machine kernel: PCI: if you experience problems, try
> using option 'pci=noacpi' or even 'acpi=off'
> Jun 28 18:27:46 machine kernel: BIOS failed to enable PCI standards
> compliance, fixing this error.

Start by upgrading to their current BIOS

2003-06-28 22:35:22

by Edward Tandi

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sat, 2003-06-28 at 22:51, Alan Cox wrote:
> On Sad, 2003-06-28 at 21:50, Edward Tandi wrote:
> > Jun 28 18:27:46 machine kernel: PCI: Using ACPI for IRQ routing
> > Jun 28 18:27:46 machine kernel: PCI: if you experience problems, try
> > using option 'pci=noacpi' or even 'acpi=off'
> > Jun 28 18:27:46 machine kernel: BIOS failed to enable PCI standards
> > compliance, fixing this error.
>
> Start by upgrading to their current BIOS

Believe it or not, it _is_ the latest BIOS for that board
(Tyan S2460 BIOS v1.05, 2nd Jan 2003).

Ed-T.


2003-06-28 23:06:58

by Alan

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sad, 2003-06-28 at 23:50, Edward Tandi wrote:
> > > using option 'pci=noacpi' or even 'acpi=off'
> > > Jun 28 18:27:46 machine kernel: BIOS failed to enable PCI standards
> > > compliance, fixing this error.
> >
> > Start by upgrading to their current BIOS
>
> Believe it or not, it _is_ the latest BIOS for that board
> (Tyan S2460 BIOS v1.05, 2nd Jan 2003).

Then I guess you have a problem. We try and fix up BIOS problems but there
is a limit to what we can do, and if it has problems like the one that is
logged I'd be worried what else it might do - eg I suspect Nvidia 4x AGP cards
aren't too solid on it.

The APIC errors also suggest something isn't happy at all at the hardware
layer. Are you using MP processors ?
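
For reference, the "40(40)" in those messages can be decoded as a sketch like the following. It assumes the two numbers are the local APIC Error Status Register printed in hexadecimal (read once before and once after the register is cleared), and that the bit meanings follow the list given in the comment in the kernel's APIC error handler. The names ESR_BITS and decode_esr are illustrative only, not kernel symbols.

```python
# Bit meanings of the local APIC Error Status Register, as assumed from the
# comment in the kernel's APIC error interrupt handler.
ESR_BITS = {
    0: "Send CS error",
    1: "Receive CS error",
    2: "Send accept error",
    3: "Receive accept error",
    4: "Reserved",
    5: "Send illegal vector",
    6: "Received illegal vector",
    7: "Illegal register address",
}


def decode_esr(hex_value):
    """Return the names of the error bits set in a hex ESR string such as '40'."""
    value = int(hex_value, 16)
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # The value reported in this thread: "APIC error on CPU1: 40(40)"
    print(decode_esr("40"))
```

Under those assumptions, 0x40 has only bit 6 set, i.e. CPU1 keeps receiving an interrupt with an illegal vector number.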

2003-06-28 23:37:29

by Edward Tandi

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sun, 2003-06-29 at 00:17, Alan Cox wrote:
> On Sad, 2003-06-28 at 23:50, Edward Tandi wrote:
> > > > using option 'pci=noacpi' or even 'acpi=off'
> > > > Jun 28 18:27:46 machine kernel: BIOS failed to enable PCI standards
> > > > compliance, fixing this error.
> > >
> > > Start by upgrading to their current BIOS
> >
> > Believe it or not, it _is_ the latest BIOS for that board
> > (Tyan S2460 BIOS v1.05, 2nd Jan 2003).
>
> Then I guess you have a problem. We try and fix up BIOS problems but there
> is a limit to what we can do, and if it has problems like the one that is
> logged I'd be worried what else it might do - eg I suspect Nvidia 4x AGP cards
> aren't too solid on it.

It does have an AGP NVidia card in it. I'm using the standard XFree
drivers with it at the moment but I have played UT on it for hours
before (using NVidia drivers) without problems. It might be an AGP x2
card though. The computer is now mostly a back-end server and I haven't
really pushed it on the graphics side recently.

Could the problem be caused by some BIOS setting? I could spend some
time looking at them.

> The APIC errors also suggest something isn't happy at all at the hardware
> layer. Are you using MP processors ?

Yes, MP processors. This is not a new machine; it has been running quite
nicely for nearly two years. There have been some kernel releases in the
past that have shown some instability, but I can usually find a fairly
recent version and tweak the kernel-build settings so that it becomes
stable.

The version running prior to this one was 2.4.21-rc3. This version
allowed me to specify noapic.

Ed-T.


2003-06-29 10:30:50

by Alan

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sul, 2003-06-29 at 00:52, Edward Tandi wrote:
> It does have an AGP NVidia card in it. I'm using the standard XFree
> drivers with it at the moment but I have played UT on it for hours
> before (using NVidia drivers) without problems. It might be an AGP x2
> card though. The computer is now mostly a back-end server and I haven't
> really pushed it on the graphics side recently
>
> Could the problem be caused by some BIOS setting? I could spend some
> time looking at them.

The BIOS has magic tuning tables for AMD76x chipsets for various video
cards. It's one of the reasons that new BIOSes sometimes make AGP 4x
work, or work more reliably.

> The version running prior to this one was 2.4.21-rc3. This version
> allowed me to specify noapic.

Out of interest, compile out ACPI support and see what it does

2003-06-29 15:32:05

by Edward Tandi

Subject: Re: Linux 2.4.22-pre2 and AthlonMP

On Sun, 2003-06-29 at 11:42, Alan Cox wrote:
> On Sul, 2003-06-29 at 00:52, Edward Tandi wrote:
> > It does have an AGP NVidia card in it. I'm using the standard XFree
> > drivers with it at the moment but I have played UT on it for hours
> > before (using NVidia drivers) without problems. It might be an AGP x2
> > card though. The computer is now mostly a back-end server and I haven't
> > really pushed it on the graphics side recently
> >
> > Could the problem be caused by some BIOS setting? I could spend some
> > time looking at them.
>
> The BIOS has magic tuning tables for AMD76x chipsets for various video
> cards. It's one of the reasons that new BIOSes sometimes make AGP 4x
> work, or work more reliably.
>
> > The version running prior to this one was 2.4.21-rc3. This version
> > allowed me to specify noapic.
>
> Out of interest, compile out ACPI support and see what it does

OK, compiled without ACPI. The system boots and runs fine with or
without noapic. No nasty warnings.

The AMD76x power management has been off in all tests to date.

Ed-T.