2005-04-01 23:24:09

by Christopher Allen Wing

Subject: clock runs at double speed on x86_64 system w/ATI RS200 chipset

I'm testing a system based on an ATI Radeon Xpress 200 motherboard.
(host bridge PCI device 1002:5950)

Something is causing the timer interrupt to be received twice as often as
desired; this makes the clock run at double normal speed.

I first noticed the problem when testing Red Hat 2.4 and 2.6 kernels;
however, I just reproduced it on the latest kernel.org release (2.6.11.6).


While messing around with the 2.4 kernel, I managed to make the problem go
away by getting the timer interrupt to be delivered via the XT-PIC instead
of the APIC. I don't know enough about how the interrupt routing/ACPI
works to figure out what's wrong.


At first I thought the problem seemed similar to the one discussed in June
2004 on lkml under the subject "linux-2.6.7-bk2 runs faster than
linux-2.6.7 ;)"; see:

http://marc.theaimsgroup.com/?w=2&r=1&s=linux-2.6.7-bk2+runs+faster&q=t


However the problem still exists in 2.6.11.6, as well as older Red Hat 2.4
and 2.6 kernels, so I'm guessing that it is an unrelated problem.

In short the timer interrupt gets received twice as many times as it
should:


$ cat /proc/interrupts; sleep 10; cat /proc/interrupts
0: 2812271 IO-APIC-edge timer
LOC: 1405962

(only 5 seconds of real time elapse, not 10)

0: 2822285 IO-APIC-edge timer
LOC: 1410969



Note that this corresponds to 1000 local APIC timer ints/sec, but 2000
'timer' ints/second.
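
The arithmetic behind those two figures can be sketched as a small helper. This is only an illustration: the function name irq_rate is made up here, and the 5-second interval is the approximate real-time figure quoted above, not a precise measurement.

```c
#include <assert.h>

/*
 * Interrupts per second between two samples of one /proc/interrupts
 * counter. elapsed_sec must be real (wall-clock) time measured
 * independently of the system clock, since that clock is what is
 * suspected of running fast.
 */
double irq_rate(unsigned long before, unsigned long after, double elapsed_sec)
{
    assert(elapsed_sec > 0.0 && after >= before);
    return (double)(after - before) / elapsed_sec;
}
```

With the samples above, irq_rate(2812271, 2822285, 5.0) comes out near 2003 for IRQ 0, and irq_rate(1405962, 1410969, 5.0) near 1001 for LOC.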



Here's the dmesg log from booting:

http://www-personal.engin.umich.edu/~wingc/code/dmesg-2.6.11.6

I also see messages from the kernel like:

APIC error on CPU0: 00(40)
APIC error on CPU0: 40(40)

so I'd guess that something is wrong in the way that the machine is set
up. Perhaps the BIOS or ACPI tables are just defective.


I'd appreciate it if anyone familiar with how ACPI and the interrupt
routing work could suggest a way to figure out what's going on.


Thanks,

Chris Wing


2005-04-02 11:05:50

by Mikael Pettersson

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

On Fri, 1 Apr 2005 18:24:00 -0500 (EST), Christopher Allen Wing wrote:
>I also see messages from the kernel like:
>
> APIC error on CPU0: 00(40)
> APIC error on CPU0: 40(40)
>
>so I'd guess that something is wrong in the way that the machine is set
>up. Perhaps the BIOS or ACPI tables are just defective.

Those are "received illegal vector" errors, and they
typically indicate hardware flakiness or BIOS issues.

Could be inadequate power supply, inadequate cooling,
a BIOS bug (please check for updates), a too new CPU
(again, check for a BIOS update), or simply a poorly-
designed mainboard.
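
For reference, here is a rough sketch of how a value such as 0x40 maps onto an error name. It assumes the two hex numbers in the message are readings of the local APIC Error Status Register, and it uses the classic low-byte bit layout of that register; the helper names are invented for this illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Name of one bit in the low byte of the local APIC Error Status Register. */
const char *esr_bit_name(unsigned int bit)
{
    static const char *const names[8] = {
        "send checksum error",
        "receive checksum error",
        "send accept error",
        "receive accept error",
        "reserved",
        "send illegal vector",
        "received illegal vector",
        "illegal register address",
    };
    assert(bit < 8);
    return names[bit];
}

/* Name of the lowest set error bit in esr, or NULL if no bit is set. */
const char *esr_first_error(unsigned int esr)
{
    for (unsigned int bit = 0; bit < 8; bit++)
        if (esr & (1u << bit))
            return esr_bit_name(bit);
    return NULL;
}
```

Under that assumption, 0x40 is bit 6, which is the "received illegal vector" case, and 0x00 has no error bit set.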

/Mikael

2005-04-02 18:20:22

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset



On Sat, 2 Apr 2005, Mikael Pettersson wrote:

> > APIC error on CPU0: 00(40)
> > APIC error on CPU0: 40(40)
>
> Those are "received illegal vector" errors, and they
> typically indicate hardware flakiness or BIOS issues.
>
> Could be inadequate power supply, inadequate cooling,
> a BIOS bug (please check for updates), a too new CPU
> (again, check for a BIOS update), or simply a poorly-
> designed mainboard.


Thanks. I tried the latest BIOS for the board but that did not resolve the
problem. The clock still runs at double speed (2000 timer
interrupts/second instead of 1000) and I still get the APIC errors.

I'll enter a support request with the manufacturer.



I was able to get the problem to go away by using a BIOS option to
"disable APIC mode". When I do this the kernel outputs at boot:

ACPI: Using PIC for interrupt routing

and the output of /proc/interrupts reads 'XT-PIC' for everything.
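
A check like this can be scripted. The sketch below is a userspace illustration only: it assumes the single-CPU line layout shown earlier in this thread ("0: 2812271 IO-APIC-edge timer"), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one single-CPU /proc/interrupts line of the form
 *   "<irq>: <count> <controller> <name>"
 * Returns 1 if all four fields were found, 0 otherwise
 * (for example on the "LOC:" line, which has no numeric IRQ).
 */
int parse_irq_line(const char *line, int *irq, unsigned long *count,
                   char chip[32], char name[32])
{
    assert(line && irq && count && chip && name);
    return sscanf(line, "%d: %lu %31s %31s", irq, count, chip, name) == 4;
}

/* Return 1 if the line parses and its controller column equals want. */
int irq_line_uses(const char *line, const char *want)
{
    int irq;
    unsigned long count;
    char chip[32], name[32];

    return parse_irq_line(line, &irq, &count, chip, name)
        && strcmp(chip, want) == 0;
}
```

On a multi-CPU system each line carries one count per CPU, so the format string would need adjusting.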


If anyone has a suggestion for debugging the clock problem in APIC mode
I'd be interested. I'm guessing that something is causing the timer
interrupt to be mapped twice. Are there any tools for looking at the ACPI
tables that may help, or are there kernel boot options to give more detail
about how the interrupt routing is being set up?


Thanks,

Chris Wing

2005-04-03 12:32:24

by Mikael Pettersson

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

On Sat, 2 Apr 2005 13:19:44 -0500 (EST), Christopher Allen Wing wrote:
>On Sat, 2 Apr 2005, Mikael Pettersson wrote:
>
>> > APIC error on CPU0: 00(40)
>> > APIC error on CPU0: 40(40)
>>
>> Those are "received illegal vector" errors, and they
>> typically indicate hardware flakiness or BIOS issues.
>>
>> Could be inadequate power supply, inadequate cooling,
>> a BIOS bug (please check for updates), a too new CPU
>> (again, check for a BIOS update), or simply a poorly-
>> designed mainboard.
>
>
>Thanks. I tried the latest BIOS for the board but that did not resolve the
>problem. The clock still runs at double speed (2000 timer
>interrupts/second instead of 1000) and I still get the APIC errors.
>
>I'll enter a support request with the manufacturer.
>
>
>
>I was able to get the problem to go away by using a BIOS option to
>"disable APIC mode". When I do this the kernel outputs at boot:
>
> ACPI: Using PIC for interrupt routing
>
>and the output of /proc/interrupts reads 'XT-PIC' for everything.
>
>
>If anyone has a suggestion for debugging the clock problem in APIC mode
>I'd be interested. I'm guessing that something is causing the timer
>interrupt to be mapped twice- are there any tools for looking at the ACPI
>tables that may help, or are there kernel boot options to give more detail
>about how the interrupt routing is being set up?

Well, first step is to try w/o ACPI. ACPI is inherently fragile
and bugs there can easily explain your timer problems. Either
recompile with CONFIG_ACPI=n, or boot with "acpi=off pci=noacpi".

/Mikael

2005-04-04 14:53:46

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset



On Sun, 3 Apr 2005, Mikael Pettersson wrote:

> Well, first step is to try w/o ACPI. ACPI is inherently fragile
> and bugs there can easily explain your timer problems. Either
> recompile with CONFIG_ACPI=n, or boot with "acpi=off pci=noacpi".


When I boot without ACPI (I used 'acpi=off pci=noacpi') the system fails
to come up all the way; it hangs after loading the SATA driver. (but
before the SATA driver finishes probing the disks)

I'm guessing that the interrupt from the SATA controller is not getting
through? Anyway, I assumed that ACPI was basically required for x86_64
systems to work, is this not really the case?


Thanks,

Chris

2005-04-04 15:32:38

by Mikael Pettersson

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

Christopher Allen Wing writes:
>
>
> On Sun, 3 Apr 2005, Mikael Pettersson wrote:
>
> > Well, first step is to try w/o ACPI. ACPI is inherently fragile
> > and bugs there can easily explain your timer problems. Either
> > recompile with CONFIG_ACPI=n, or boot with "acpi=off pci=noacpi".
>
>
> When I boot without ACPI (I used 'acpi=off pci=noacpi') the system fails
> to come up all the way; it hangs after loading the SATA driver. (but
> before the SATA driver finishes probing the disks)
>
> I'm guessing that the interrupt from the SATA controller is not getting
> through? Anyway, I assumed that ACPI was basically required for x86_64
> systems to work, is this not really the case?

In principle ACPI shouldn't be needed, but in its absence the
BIOS must provide an MP table and the x86-64 kernel must still
have code to parse it -- otherwise I/O APIC mode won't work.
I don't know if that's the case or not.

I suggest you boot normally (with ACPI fully enabled) and send a
bug report to LKML and the ACPI list with the interrupt routing
info from the kernel log.

2005-04-04 21:50:41

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset



On Mon, 4 Apr 2005, Mikael Pettersson wrote:

> In principle ACPI shouldn't be needed, but in its absence the
> BIOS must provide an MP table and the x86-64 kernel must still
> have code to parse it -- otherwise I/O APIC mode won't work.
> I don't know if that's the case or not.

Thanks. What I meant is that I thought distributions are enabling ACPI by
default because the mptable is likely to be broken.

> I suggest you boot normally (with ACPI fully enabled) and send a
> bug report to LKML and the ACPI list with the interrupt routing
> info from the kernel log.

I entered a bug report under ACPI on the kernel bugzilla:

http://bugme.osdl.org/show_bug.cgi?id=4442

containing the relevant information. It looks like booting with 'noapic'
on the command line will be an acceptable workaround for now.


Thanks,

Chris

2005-04-05 17:45:50

by Andi Kleen

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

Christopher Allen Wing writes:

> On Sun, 3 Apr 2005, Mikael Pettersson wrote:
>
>> Well, first step is to try w/o ACPI. ACPI is inherently fragile
>> and bugs there can easily explain your timer problems. Either
>> recompile with CONFIG_ACPI=n, or boot with "acpi=off pci=noacpi".
>
>
> When I boot without ACPI (I used 'acpi=off pci=noacpi') the system fails
> to come up all the way; it hangs after loading the SATA driver. (but
> before the SATA driver finishes probing the disks)
>
> I'm guessing that the interrupt from the SATA controller is not getting
> through? Anyway, I assumed that ACPI was basically required for x86_64
> systems to work, is this not really the case?

Alternatively you can try to boot with noapic. Does that help?

-Andi

2005-04-05 18:06:05

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

On Tue, 5 Apr 2005, Andi Kleen wrote:

> Alternatively you can try to boot with noapic. Does that help?


Yes, with 'noapic' the system boots normally and the clock runs at normal
speed.

dmesg of 2.6.11.6 without any command line options (default: ACPI
enabled, APIC enabled):

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apic

/proc/interrupts on 2.6.11.6 with ACPI enabled, APIC enabled:

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic
(clock runs at double speed)


dmesg of 2.6.11.6 with 'noapic' command line option:

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-noapic

/proc/interrupts on 2.6.11.6 with 'noapic':

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-noapic
(clock runs normally)



Are you thinking of blacklisting the APIC on this system until we figure
out what's going on?

-Chris

2005-04-05 18:18:30

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset (with APIC enabled)

I booted with 'apic=debug' in case this is useful to find out what's
wrong.


dmesg of 2.6.11.6 with ACPI enabled, APIC enabled, 'apic=debug':

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug
(clock runs at double speed)

dmesg of 2.6.11.6 with ACPI enabled, but disabled for PCI routing, APIC
enabled ('pci=noacpi apic=debug'):

http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-nopciacpi-apicdebug
(system hangs when loading SATA driver)



One difference I see between ACPI IRQ routing and 'pci=noacpi' is this:

(with ACPI IRQ routing)
..TIMER: vector=0x31 pin1=2 pin2=-1

(with 'pci=noacpi')
..TIMER: vector=0x31 pin1=2 pin2=0


so it would seem that mp_irqs[] differs between the ACPI case and
'pci=noacpi' for mp_ExtINT.


-Chris


On Tue, 5 Apr 2005, Andi Kleen wrote:

> Alternatively you can try to boot with noapic. Does that help?
>
> -Andi

2005-04-05 18:33:19

by Andi Kleen

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

On Tue, Apr 05, 2005 at 02:02:20PM -0400, Christopher Allen Wing wrote:
>
> Are you thinking of blacklisting the APIC on this system until we figure
> out what's going on?

Some more debugging first might be good. Perhaps it is the same issue
many Nvidia boards have with the APIC timer override being wrong.
In that case the timer should rather not tick at all, but it might
still be worth a try.
Try booting with acpi_skip_timer_override

-Andi

2005-04-05 19:17:27

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

here's the patch for x86_64
The kernel is compiling... I'll try it when it finishes.

-Chris



--- linux-2.6.11.6/arch/x86_64/kernel/setup.c.orig 2005-03-25 22:28:14.000000000 -0500
+++ linux-2.6.11.6/arch/x86_64/kernel/setup.c 2005-04-05 15:05:47.656886736 -0400
@@ -333,6 +333,12 @@
else if (!memcmp(from, "acpi=strict", 11)) {
acpi_strict = 1;
}
+
+#ifdef CONFIG_X86_IO_APIC
+ else if (!memcmp(from, "acpi_skip_timer_override", 24))
+ acpi_skip_timer_override = 1;
+#endif
+
#endif

if (!memcmp(from, "nolapic", 7) ||




On Tue, 5 Apr 2005, Andi Kleen wrote:

> Try booting with acpi_skip_timer_override
>
> -Andi

2005-04-05 18:52:59

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset



On Tue, 5 Apr 2005, Andi Kleen wrote:

> Some more debugging first might be good. Perhaps it is the same issue
> many Nvidia boards have with the APIC timer override being wrong.
> In that case the timer should rather not tick at all, but it might
> still be worth a try.
> Try booting with acpi_skip_timer_override

That doesn't work on x86_64, because unfortunately I think
arch/x86_64/kernel/setup.c is missing the code to parse for that option.


I'll add in the code from arch/i386/kernel/setup.c, rebuild the kernel and
see what happens.
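
For anyone following along, the kind of check involved is a plain prefix match against the command line. Below is a userspace sketch of the idea, not the kernel code itself: the function name is made up, and it uses strncmp() instead of the kernel's memcmp() with a hard-coded length, so that it never reads past the end of a short string.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the command-line fragment starting at from begins with
 * the option name opt, 0 otherwise. This is a prefix match only, so a
 * longer option that starts with the same characters also matches.
 */
int cmdline_has_prefix(const char *from, const char *opt)
{
    assert(from && opt);
    return strncmp(from, opt, strlen(opt)) == 0;
}
```

The option name "acpi_skip_timer_override" is 24 characters long, which is the length a memcmp()-style check would have to pass.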

-Chris

2005-04-05 19:51:58

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset



On Tue, 5 Apr 2005, Andi Kleen wrote:

> Try booting with acpi_skip_timer_override


Nope, this doesn't fix the problem. Here's the dmesg of 2.6.11.6 with
'acpi_skip_timer_override apic=debug':
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug-acpi_skip_timer_override

Here's /proc/interrupts:
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apicdebug-acpi_skip_timer_override


The clock still runs at double speed. The IRQ assignments all seem to have
been permuted, though, with 'acpi_skip_timer_override'.


Thanks,
Chris

2005-04-06 20:37:54

by Christopher Allen Wing

Subject: Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset

I noticed that the x86_64 kernel has 4 different ways of configuring the
timer interrupt in APIC mode:

arch/x86_64/kernel/io_apic.c :

/* style 1 */
if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?
*/

/* style 2 */
apic_printk(APIC_VERBOSE,KERN_INFO "...trying to set up timer (IRQ0) through the 8259A ... ");
if (pin2 != -1) {
apic_printk(APIC_VERBOSE,"\n..... (found pin %d) ...", pin2);
/*
* legacy devices should be connected to IO APIC #0
*/

/* style 3 */
apic_printk(APIC_VERBOSE, KERN_INFO "...trying to set up timer as Virtual Wire IRQ...");


/* style 4 */
apic_printk(APIC_VERBOSE, KERN_INFO "...trying to set up timer as ExtINT IRQ...");


I hacked the kernel with the following patch to try using all 4 timer
configurations. (by overriding 'pin1' and 'pin2', and by bypassing the
code that sets up 'Virtual Wire IRQ')

Unfortunately I wasn't able to change the behavior in any case. I couldn't
get the last configuration ('trying to set up timer as ExtINT IRQ') to
work; the machine just hung. I'm guessing that the code in
io_apic.c::unlock_ExtINT_logic() may never have been tested on AMD chips?

No matter what I did, the clock still ran at double normal speed. Perhaps
we are just programming the APIC incorrectly for this board in some way?
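
To make the order of the four styles explicit, here is a simplified model of the fallback sequence. It is a sketch, not the kernel's check_timer(): the enum, the function name and the works[] array are invented for illustration, and "works" here only means that timer ticks were seen at all, not that they arrived at the right rate.

```c
#include <assert.h>

enum timer_route {
    ROUTE_IOAPIC_PIN1,   /* style 1: IRQ0 through the IO-APIC (mp_INT pin) */
    ROUTE_8259A_PIN2,    /* style 2: through the 8259A (mp_ExtINT pin) */
    ROUTE_VIRTUAL_WIRE,  /* style 3: Virtual Wire IRQ */
    ROUTE_EXTINT,        /* style 4: ExtINT IRQ */
    ROUTE_NONE           /* nothing worked; the kernel panics here */
};

/*
 * pin1 and pin2 are -1 when no such pin was found.
 * works[i] is nonzero if ticks were observed after trying style i + 1.
 */
enum timer_route pick_timer_route(int pin1, int pin2, const int works[4])
{
    assert(works);
    if (pin1 != -1 && works[0])
        return ROUTE_IOAPIC_PIN1;
    if (pin2 != -1 && works[1])
        return ROUTE_8259A_PIN2;
    if (works[2])
        return ROUTE_VIRTUAL_WIRE;
    if (works[3])
        return ROUTE_EXTINT;
    return ROUTE_NONE;
}
```

In this model, pin1=2 with pin2=-1 selects style 1, pin1=-1 with pin2=0 selects style 2, and pin1=-1 with pin2=-1 falls through to style 3.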


booting with standard options (ACPI enabled, 'apic=debug'); this uses
method 1:
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic

booting with 'force_apic_timer=-1,0 apic=debug' to force it to use method
#2 to route the timer interrupt:
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug-forcetimer=-1,0
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic-forcetimer=-1,0

booting with 'force_apic_timer=-1,-1 apic=debug' to force it to use method
#3:
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug-forcetimer=-1,-1
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic-forcetimer=-1,-1

(note that /proc/interrupts says 'local-APIC-edge' for timer
interrupt, but it still receives twice as many interrupts)

booting with 'force_apic_timer=-1,-1 novwtimer apic=debug' to force it to
use method #4:

(machine just hangs when trying to set up the timer)
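
The 'force_apic_timer=' values used above are a "pin1,pin2" pair. A userspace sketch of parsing such a pair is shown below; it uses sscanf() rather than the kernel's get_options(), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a "pin1,pin2" string such as "-1,0" into two ints.
 * Returns 1 if exactly two values were read, 0 otherwise,
 * which mirrors the "must specify pin1,pin2" check in the patch.
 */
int parse_pin_pair(const char *str, int *pin1, int *pin2)
{
    assert(str && pin1 && pin2);
    return sscanf(str, "%d,%d", pin1, pin2) == 2;
}
```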


-Chris



--- linux-2.6.11.6/arch/x86_64/kernel/io_apic.c.orig 2005-03-25 22:28:21.000000000 -0500
+++ linux-2.6.11.6/arch/x86_64/kernel/io_apic.c 2005-04-06 16:28:25.120441232 -0400
@@ -1564,6 +1564,10 @@
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
+static int apic_timer_forced = 0;
+static int force_pin1, force_pin2;
+static int force_novwtimer = 0;
+
static inline void check_timer(void)
{
int pin1, pin2;
@@ -1587,8 +1591,13 @@
init_8259A(1);
enable_8259A_irq(0);

- pin1 = find_isa_irq_pin(0, mp_INT);
- pin2 = find_isa_irq_pin(0, mp_ExtINT);
+ if (apic_timer_forced) {
+ pin1 = force_pin1;
+ pin2 = force_pin2;
+ } else {
+ pin1 = find_isa_irq_pin(0, mp_INT);
+ pin2 = find_isa_irq_pin(0, mp_ExtINT);
+ }

apic_printk(APIC_VERBOSE,KERN_INFO "..TIMER: vector=0x%02X pin1=%d pin2=%d\n", vector, pin1, pin2);

@@ -1639,6 +1648,7 @@
nmi_watchdog = 0;
}

+ if (!force_novwtimer) {
apic_printk(APIC_VERBOSE, KERN_INFO "...trying to set up timer as Virtual Wire IRQ...");

disable_8259A_irq(0);
@@ -1652,6 +1662,7 @@
}
apic_write_around(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | vector);
apic_printk(APIC_VERBOSE," failed.\n");
+ }

apic_printk(APIC_VERBOSE, KERN_INFO "...trying to set up timer as ExtINT IRQ...");

@@ -1669,6 +1680,34 @@
panic("IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter\n");
}

+static int __init force_apic_timer(char *str)
+{
+ int timer_irqs[3];
+
+ get_options(str, ARRAY_SIZE(timer_irqs), timer_irqs);
+ if (timer_irqs[0] != 2) {
+ printk(KERN_WARNING "force_apic_timer must specify pin1,pin2\n");
+ goto out;
+ }
+
+ apic_timer_forced = 1;
+ force_pin1 = timer_irqs[1];
+ force_pin2 = timer_irqs[2];
+
+out:
+ return 1;
+}
+
+static int __init novwtimer(char *str)
+{
+ force_novwtimer = 1;
+ return 1;
+}
+
+__setup("force_apic_timer=", force_apic_timer);
+__setup("novwtimer", novwtimer);
+
+
/*
*
* IRQ's that are handled by the PIC in the MPS IOAPIC case.

2005-04-06 22:13:34

by Christopher Allen Wing

Subject: [PATCH] Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset (workaround for APIC mode?)

The attached patch gets the clock to work normally for me without
disabling APIC mode entirely. But I'm still not sure what's going on.


dmesg of 2.6.11.6 with default options (ACPI, APIC, 'apic=debug'):
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic

dmesg with patch, and 'timerhack apic=debug':
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/dmesg-2.6.11.6-acpi-apicdebug-timerhack
http://www-personal.engin.umich.edu/~wingc/apictimer/dmesg/interrupts-2.6.11-6-acpi-apic-timerhack


The patch causes the timer to be routed via the "Virtual Wire IRQ" mode,
and I see in /proc/interrupts:

0: 376947 local-APIC-edge timer

instead of 'IO-APIC-edge'. I no longer get duplicate timer interrupts; it
seems to track the 'LOC' interrupt count normally.


The crucial part of the patch, besides skipping the attempts to set up the
timer IRQ through the APIC (mp_INT or mp_ExtINT), is:

clear_IO_APIC_pin(0, pin1)


Without this function call, I still get duplicate timer interrupts when
using Virtual Wire to route the timer.


I'm still seeing 'APIC error on CPU0: 00(40)' messages from time to time.


-Chris



--- linux-2.6.11.6/arch/x86_64/kernel/io_apic.c.orig 2005-03-25 22:28:21.000000000 -0500
+++ linux-2.6.11.6/arch/x86_64/kernel/io_apic.c 2005-04-06 17:56:46.486511088 -0400
@@ -1564,6 +1564,8 @@
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
+static int timer_hack = 0;
+
static inline void check_timer(void)
{
int pin1, pin2;
@@ -1592,6 +1594,10 @@

apic_printk(APIC_VERBOSE,KERN_INFO "..TIMER: vector=0x%02X pin1=%d pin2=%d\n", vector, pin1, pin2);

+ if (timer_hack) {
+ /* for some reason this stops duplicate timer IRQ? */
+ clear_IO_APIC_pin(0, pin1);
+ } else {
if (pin1 != -1) {
/*
* Ok, does IRQ0 through the IOAPIC work?
@@ -1633,6 +1639,7 @@
clear_IO_APIC_pin(0, pin2);
}
printk(" failed.\n");
+ }

if (nmi_watchdog) {
printk(KERN_WARNING "timer doesn't work through the IO-APIC - disabling NMI Watchdog!\n");
@@ -1669,6 +1676,14 @@
panic("IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter\n");
}

+static int __init timerhack(char *str)
+{
+ timer_hack = 1;
+ return 1;
+}
+__setup("timerhack", timerhack);
+
+
/*
*
* IRQ's that are handled by the PIC in the MPS IOAPIC case.

2005-04-07 07:37:20

by Andi Kleen

Subject: Re: [PATCH] Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset (workaround for APIC mode?)

>
> I'm still seeing 'APIC error on CPU0: 00(40)' messages from time to time.

Thanks for the analysis. The clear_IO_APIC_pin looks quite hackish,
I am not sure I want to put that into the mainline kernel.
The APIC errors are also suspicious.

I don't want to blacklist ATI from just a single report,
but if there are more it is probably best to just disable
the IO-APIC by default there for now.

-Andi


2005-04-07 17:25:06

by Christopher Allen Wing

Subject: Re: [PATCH] Re: clock runs at double speed on x86_64 system w/ATI RS200 chipset (workaround for APIC mode?)



On Thu, 7 Apr 2005, Andi Kleen wrote:

> >
> > I'm still seeing 'APIC error on CPU0: 00(40)' messages from time to time.
>
> Thanks for the analysis. The clear_IO_APIC_pin looks quite hackish,
> I am not sure I want to put that into the mainline kernel.

Of course. The patch was a simplification; the idea was just to prevent it
from using the default routing. Here's a patch that's functionally
equivalent for me:



--- arch/x86_64/kernel/io_apic.c.orig 2005-03-25 22:28:21.000000000 -0500
+++ arch/x86_64/kernel/io_apic.c 2005-04-07 13:13:58.813193024 -0400
@@ -1564,6 +1564,8 @@
* is so screwy. Thanks to Brian Perkins for testing/hacking this beast
* fanatically on his truly buggy board.
*/
+static int timer_hack = 0;
+
static inline void check_timer(void)
{
int pin1, pin2;
@@ -1597,7 +1599,7 @@
* Ok, does IRQ0 through the IOAPIC work?
*/
unmask_IO_APIC_irq(0);
- if (timer_irq_works()) {
+ if ((!timer_hack) && timer_irq_works()) {
nmi_watchdog_default();
if (nmi_watchdog == NMI_IO_APIC) {
disable_8259A_irq(0);
@@ -1669,6 +1671,14 @@
panic("IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter\n");
}

+static int __init timerhack(char *str)
+{
+ timer_hack = 1;
+ return 1;
+}
+__setup("timerhack", timerhack);
+
+
/*
*
* IRQ's that are handled by the PIC in the MPS IOAPIC case.




With that patch I get the same behavior; the timer interrupt is labeled
'local-APIC-edge' and it ticks at the correct rate.



> The APIC errors are also suspicious.
>
> I don't want to blacklist ATI from just a single report,
> but if there are more it is probably best to just disable
> the IO-APIC by default there for now.

It will be interesting to see if anyone else has problems when systems
with this ATI integrated chipset (Radeon Xpress 200) become more common.


Thanks,

Chris