2004-06-16 19:32:30

by Peter Cordes

Subject: x86-64: double timer interrupts in recent 2.4.x


Nobody replied to this message on debian-amd64 or discuss@x86-64.
Hopefully I've found the right places to send it this
time around. Actually, Roland Fehrenbacher saw my message in a list archive
and mailed me to confirm that he saw the same double-speed clock problem on
two different machines, so it's not just Tyan S2880 boards. He suggested I
mail Andi and lkml, so here goes. (I haven't tested again with anything more
recent than 2.4.27-pre2, so if this is fixed, sorry.)

-----

I just noticed that on my Opteron cluster, the nodes that are running 64bit
kernels have their clocks ticking at double speed. This happens with
Linux 2.4.26, and 2.4.27-pre2, compiled with gcc 3.3.3 (Debian 20040401) in a
Debian pure64 chroot. Linux 2.4.25, compiled on Debian Woody + bi-arch gcc
3.3.2 20030908, does _not_ have the problem. The config options were pretty
much the same for all kernels, and all the kernels are plain vanilla flavour
from http://www.ca.kernel.org.

If I run ntpdate to set the clock, then 10 seconds later it will be 10
seconds fast. Running date(1), the system time advances 20 seconds in 10
seconds of real time. (I haven't done anything weird with adjtimex(8).)
time sleep 10 takes 5 seconds, but bash reports its real time as 10 seconds.
The timer interrupt counter is increasing at a rate of 200/real second, so
it seems like the system is getting timer interrupts twice as fast as it
should. (With 2.4.25, it is 100/sec, same as HZ).
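
The numbers above are consistent with each other, which a few lines of arithmetic can show. This is a minimal Python sketch, assuming the kernel advances its clock by one tick of 1/HZ seconds per timer interrupt; the function names are invented for illustration and the only inputs are the HZ=100 and 200/sec figures quoted in this message.

```python
HZ = 100  # nominal tick rate mentioned in the message (100/sec on 2.4.25)

def clock_speedup(observed_irq_per_sec, hz=HZ):
    """Factor by which system time runs fast if every interrupt counts as one tick."""
    return observed_irq_per_sec / hz

def apparent_elapsed(real_seconds, speedup):
    """System time that passes during real_seconds of real time."""
    return real_seconds * speedup

def real_duration_of_sleep(requested_seconds, speedup):
    """Real time taken by a sleep that the kernel measures in ticks."""
    return requested_seconds / speedup

speedup = clock_speedup(200)                 # 200 interrupts per real second
print(apparent_elapsed(10, speedup))         # date(1) advance over 10 real seconds
print(real_duration_of_sleep(10, speedup))   # real time taken by "sleep 10"
```

With 200 interrupts per real second the sketch gives a speedup of 2, matching both the 20-seconds-in-10 drift and the 5-second "sleep 10".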

Linux says it is using the PIT and TSC timers. I have HPET enabled in my
Linux config, but I guess Tyan's S2880 mobo doesn't have one. This is a
dual-Opteron 240 machine, BTW.

i386 Linux on the same machines has no problems with timekeeping. (But I
haven't tested versions later than 2.4.25 in legacy mode.)

I spent some time poking around the timer code that increments xtime, but I
guess the fact that the timer irqs are coming at double speed indicates that
the problem lies elsewhere. Maybe the code that sets up the timer?

$ uname -a
Linux node6.cs.dal.ca 2.4.26 #2 SMP Fri May 14 14:46:42 ADT 2004 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/interrupts
           CPU0       CPU1
  0:    4415908          0    IO-APIC-edge  timer
  1:          2          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  9:          0          0   IO-APIC-level  acpi
 14:      17861          1    IO-APIC-edge  ide0
 19:          0          0   IO-APIC-level  usb-ohci, usb-ohci
 24:     563942          0   IO-APIC-level  eth0
 25:     564331          0   IO-APIC-level  eth1
NMI:      19097      19097
LOC:    2211090    2211095
ERR:          0
MIS:          0

Only CPU0 is getting the timer interrupt, so at least we know the doubling
isn't caused by both CPUs receiving it. (Both CPUs get 100 LOC:
(local APIC) interrupts/sec, but that happens on the non-buggy 2.4.25, too.)
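
The doubling is also visible directly in the counters above: IRQ 0 has fired about twice as often as the per-CPU LOC interrupt, which is stated to run at 100/sec. The following is a minimal Python sketch of that comparison. The parser and function names are invented for illustration, the sample is copied from the output above, and it assumes both counters have been running since boot.

```python
SAMPLE = """\
           CPU0       CPU1
  0:    4415908          0    IO-APIC-edge  timer
  1:          2          0    IO-APIC-edge  keyboard
 19:          0          0   IO-APIC-level  usb-ohci, usb-ohci
NMI:      19097      19097
LOC:    2211090    2211095
ERR:          0
"""

def parse_interrupts(text):
    """Return {label: [per-CPU counts]} from /proc/interrupts-style text."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the "CPU0 CPU1" header line has no colon
        values = []
        for token in rest.split():
            if not token.isdigit():
                break  # counts end where the controller name starts
            values.append(int(token))
        if values:
            counts[label.strip()] = values
    return counts

def timer_to_loc_ratio(text, cpu=0):
    """Ratio of all IRQ 0 interrupts to the local APIC count on one CPU."""
    counts = parse_interrupts(text)
    return sum(counts["0"]) / counts["LOC"][cpu]

print(f"{timer_to_loc_ratio(SAMPLE):.2f}")
```

A ratio near 1 would be expected on a healthy kernel where both sources tick at HZ; on a live system the same function can be fed the contents of /proc/interrupts.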

Thanks for any help,

I'm not subscribed to the lkml, so please CC me on any followups.

--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BC



2004-06-16 19:42:53

by Andi Kleen

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Wed, 16 Jun 2004 16:28:26 -0300
Peter Cordes wrote:

>
> Nobody replied to this message on debian-amd64 or discuss@x86-64.
> Hopefully I've found the right places to send it this
> time around. Actually, Roland Fehrenbacher saw my message in a list archive
> and mailed me to confirm that he saw the same double-speed clock problem on
> two different machines, so it's not just Tyan S2880 boards. He suggested I
> mail Andi and lkml, so here goes. (I haven't tested again with anything more
> recent than 2.4.27-pre2, so if this is fixed, sorry.)

It would be a good start if you could track down which kernel started
causing this behaviour (best with -bk* kernels, -pre* is not fine grained
enough).

-Andi

2004-06-17 08:54:14

by Mikael Pettersson

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> I just noticed that on my Opteron cluster, the nodes that are running 64bit
>kernels have their clocks ticking at double speed. This happens with
>Linux 2.4.26, and 2.4.27-pre2

I had the same problem: 2.4 x86-64 kernels ticking the clock
twice its normal speed, unless I booted with pci=noacpi.

This got fixed very recently I believe, in a 2.4.27-pre kernel.

/Mikael

2004-06-17 10:26:57

by Andi Kleen

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
Mikael Pettersson wrote:

> On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> > I just noticed that on my Opteron cluster, the nodes that are running 64bit
> >kernels have their clocks ticking at double speed. This happens with
> >Linux 2.4.26, and 2.4.27-pre2
>
> I had the same problem: 2.4 x86-64 kernels ticking the clock
> twice its normal speed, unless I booted with pci=noacpi.
>
> This got fixed very recently I believe, in a 2.4.27-pre kernel.

In which one exactly? Most likely it was an ACPI problem/fix.
Len, do you remember fixing such an issue?

-Andi

2004-06-17 10:50:03

by Mikael Pettersson

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

Andi Kleen writes:
> On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
> Mikael Pettersson wrote:
>
> > On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> > > I just noticed that on my Opteron cluster, the nodes that are running 64bit
> > >kernels have their clocks ticking at double speed. This happens with
> > >Linux 2.4.26, and 2.4.27-pre2
> >
> > I had the same problem: 2.4 x86-64 kernels ticking the clock
> > twice its normal speed, unless I booted with pci=noacpi.
> >
> > This got fixed very recently I believe, in a 2.4.27-pre kernel.
>
> In which one exactly? Most likely it was an ACPI problem/fix.
> Len, do you remember fixing such an issue?

I'm away from my K8 at the moment, but I can check this on Saturday.

/Mikael

2004-06-17 14:27:12

by Brown, Len

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Thu, 2004-06-17 at 06:26, Andi Kleen wrote:
> On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
> Mikael Pettersson wrote:
>
> > On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> > > I just noticed that on my Opteron cluster, the nodes that are running 64bit
> > >kernels have their clocks ticking at double speed. This happens with
> > >Linux 2.4.26, and 2.4.27-pre2
> >
> > I had the same problem: 2.4 x86-64 kernels ticking the clock
> > twice its normal speed, unless I booted with pci=noacpi.
> >
> > This got fixed very recently I believe, in a 2.4.27-pre kernel.
>
> In which one exactly? Most likely it was an ACPI problem/fix.
> Len, do you remember fixing such an issue?

No, I don't remember this symptom.

-Len


2004-06-18 07:43:59

by Peter Cordes

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Thu, Jun 17, 2004 at 12:26:45PM +0200, Andi Kleen wrote:
> On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
> Mikael Pettersson wrote:
>
> > On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> > > I just noticed that on my Opteron cluster, the nodes that are running 64bit
> > >kernels have their clocks ticking at double speed. This happens with
> > >Linux 2.4.26, and 2.4.27-pre2
> >
> > I had the same problem: 2.4 x86-64 kernels ticking the clock
> > twice its normal speed, unless I booted with pci=noacpi.
> >
> > This got fixed very recently I believe, in a 2.4.27-pre kernel.
>
> In which one exactly? Most likely it was an ACPI problem/fix.
> Len, do you remember fixing such an issue?

It's fixed in 2.4.27-pre3 and later. Coincidentally or not, it was
released only 4 days after I mentioned the bug on debian-amd64 and
discuss@x86-64. I'd narrow it down further, but kernel.org doesn't have -bk
patches for 2.4, and I don't know where to download more fine-grained patch
versions.

(BTW, 2.4.27-pre6 doesn't compile without declaring
struct task_struct *tsk; in rwsem-spinlock.c:__rwsem_wake_one_writer.)

Thanks for the help.

--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BC



2004-06-19 12:39:29

by Mikael Pettersson

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Fri, 18 Jun 2004 04:41:32 -0300, Peter Cordes wrote:
>On Thu, Jun 17, 2004 at 12:26:45PM +0200, Andi Kleen wrote:
>> On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
>> Mikael Pettersson wrote:
>>
>> > On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
>> > > I just noticed that on my Opteron cluster, the nodes that are running 64bit
>> > >kernels have their clocks ticking at double speed. This happens with
>> > >Linux 2.4.26, and 2.4.27-pre2
>> >
>> > I had the same problem: 2.4 x86-64 kernels ticking the clock
>> > twice its normal speed, unless I booted with pci=noacpi.
>> >
>> > This got fixed very recently I believe, in a 2.4.27-pre kernel.
>>
>> In which one exactly? Most likely it was an ACPI problem/fix.
>> Len, do you remember fixing such an issue?
>
> It's fixed in 2.4.27-pre3 and later.

Confirmed: pre2 has the bug, pre3 and later do not.

For reference, here's how pre2 and pre3 dmesg outputs
differ on my K8 (MSI K8T Neo-FIS2R). There are several
changes related to IRQ0.

--- dmesg-2.4.27-pre2 2004-06-19 13:56:57.000000000 +0200
+++ dmesg-2.4.27-pre3 2004-06-19 13:59:14.000000000 +0200
@@ -30,7 +30,6 @@
ACPI: FADT (v001 AMIINT VIA_K8 0x00000011 MSFT 0x00000097) @ 0x000000003fff0030
ACPI: MADT (v001 AMIINT VIA_K8 0x00000009 MSFT 0x00000097) @ 0x000000003fff00c0
ACPI: DSDT (v001 VIA VIA_K8 0x00001000 MSFT 0x0100000d) @ 0x0000000000000000
-ACPI: Parsing Local APIC info in MADT
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:4 APIC version 16
@@ -39,6 +38,9 @@
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, IRQ 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
+ACPI: IRQ0 used by override.
+ACPI: IRQ2 used by override.
+ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
Kernel command line: BOOT_IMAGE=bzimage apic
Initializing CPU#0
@@ -59,8 +61,8 @@
testing NMI watchdog ... OK.
ENABLING IO-APIC IRQs
init IO_APIC IRQs
- IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
-..TIMER: vector=0x31 pin1=0 pin2=-1
+ IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
+..TIMER: vector=0x31 pin1=2 pin2=-1
Using local APIC timer interrupts.
Detected 12.500 MHz APIC timer.
cpu: 0, clocks: 2000140, slice: 1000070
@@ -80,31 +82,23 @@
ACPI: Power Resource [LPTP] (off)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 12 14 15)
-ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15)
+ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 *12 14 15)
-ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15)
-ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15)
-ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15)
-ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 10 11 12 14 15)
+ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
+ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
+ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
+ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
PCI: Using configuration type 1
PCI: Probing PCI hardware
-IOAPIC[0]: Set PCI routing entry (2-16 -> 0xa9 -> IRQ 16 Mode:1 Active:1)
-00:00:01[A] -> 2-16 -> vector 0xa9 -> IRQ 16
-IOAPIC[0]: Set PCI routing entry (2-17 -> 0xb1 -> IRQ 17 Mode:1 Active:1)
-00:00:01[B] -> 2-17 -> vector 0xb1 -> IRQ 17
-IOAPIC[0]: Set PCI routing entry (2-20 -> 0xb9 -> IRQ 20 Mode:1 Active:1)
-00:00:11[A] -> 2-20 -> vector 0xb9 -> IRQ 20
-IOAPIC[0]: Set PCI routing entry (2-22 -> 0xc1 -> IRQ 22 Mode:1 Active:1)
-00:00:11[C] -> 2-22 -> vector 0xc1 -> IRQ 22
-IOAPIC[0]: Set PCI routing entry (2-18 -> 0xc9 -> IRQ 18 Mode:1 Active:1)
-00:00:05[C] -> 2-18 -> vector 0xc9 -> IRQ 18
-IOAPIC[0]: Set PCI routing entry (2-19 -> 0xd1 -> IRQ 19 Mode:1 Active:1)
-00:00:05[D] -> 2-19 -> vector 0xd1 -> IRQ 19
-IOAPIC[0]: Set PCI routing entry (2-21 -> 0xd9 -> IRQ 21 Mode:1 Active:1)
-00:00:10[A] -> 2-21 -> vector 0xd9 -> IRQ 21
-IOAPIC[0]: Set PCI routing entry (2-23 -> 0xe1 -> IRQ 23 Mode:1 Active:1)
-00:00:12[A] -> 2-23 -> vector 0xe1 -> IRQ 23
-number of MP IRQ sources: 16.
+00:00:01[A] -> 2-16 -> vector 0xa9 -> IRQ 16 level low
+00:00:01[B] -> 2-17 -> vector 0xb1 -> IRQ 17 level low
+00:00:11[A] -> 2-20 -> vector 0xb9 -> IRQ 20 level low
+00:00:11[C] -> 2-22 -> vector 0xc1 -> IRQ 22 level low
+00:00:05[C] -> 2-18 -> vector 0xc9 -> IRQ 18 level low
+00:00:05[D] -> 2-19 -> vector 0xd1 -> IRQ 19 level low
+00:00:10[A] -> 2-21 -> vector 0xd9 -> IRQ 21 level low
+00:00:12[A] -> 2-23 -> vector 0xe1 -> IRQ 23 level low
+number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

@@ -117,7 +111,7 @@
....... : IO APIC version: 0003
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
- 00 001 01 0 0 0 0 0 1 1 31
+ 00 000 00 1 0 0 0 0 0 0 00
01 001 01 0 0 0 0 0 1 1 39
02 001 01 0 0 0 0 0 1 1 31
03 001 01 0 0 0 0 0 1 1 41
@@ -142,7 +136,7 @@
16 001 01 1 1 0 1 0 1 1 C1
17 001 01 1 1 0 1 0 1 1 E1
IRQ to pin mappings:
-IRQ0 -> 0:0-> 0:2
+IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4

2004-06-19 13:57:06

by Brown, Len

Subject: Re: x86-64: double timer interrupts in recent 2.4.x

On Sat, 2004-06-19 at 08:39, Mikael Pettersson wrote:
> On Fri, 18 Jun 2004 04:41:32 -0300, Peter Cordes wrote:
> >On Thu, Jun 17, 2004 at 12:26:45PM +0200, Andi Kleen wrote:
> >> On Thu, 17 Jun 2004 10:54:00 +0200 (MEST)
> >> Mikael Pettersson wrote:
> >>
> >> > On Wed, 16 Jun 2004 16:28:26 -0300, Peter Cordes wrote:
> >> > > I just noticed that on my Opteron cluster, the nodes that are running 64bit
> >> > >kernels have their clocks ticking at double speed. This happens with
> >> > >Linux 2.4.26, and 2.4.27-pre2
> >> >
> >> > I had the same problem: 2.4 x86-64 kernels ticking the clock
> >> > twice its normal speed, unless I booted with pci=noacpi.
> >> >
> >> > This got fixed very recently I believe, in a 2.4.27-pre kernel.
> >>
> Confirmed: pre2 has the bug, pre3 and later do not.
>
> For reference, here's how pre2 and pre3 dmesg outputs
> differ on my K8 (MSI K8T Neo-FIS2R).

All the changes you highlight were made by me.
If it is really important to figure out why this system
failed in the past and works now, send me the complete dmesg
for each kernel, /proc/interrupts and output from dmidecode.


> --- dmesg-2.4.27-pre2 2004-06-19 13:56:57.000000000 +0200
> +++ dmesg-2.4.27-pre3 2004-06-19 13:59:14.000000000 +0200

> @@ -39,6 +38,9 @@
> IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, IRQ 0-23
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
> +ACPI: IRQ0 used by override.
> +ACPI: IRQ2 used by override.
> +ACPI: IRQ9 used by override.

believe it or not, these printks were added to work
around an x86_64 gcc bug.

> Using ACPI (MADT) for SMP configuration information
> Kernel command line: BOOT_IMAGE=bzimage apic

If this is a VIA system, IIRC you should no longer need
the "apic" cmdline parameter.

> Initializing CPU#0
> @@ -59,8 +61,8 @@
> testing NMI watchdog ... OK.
> ENABLING IO-APIC IRQs
> init IO_APIC IRQs
> - IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
> -..TIMER: vector=0x31 pin1=0 pin2=-1
> + IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
> +..TIMER: vector=0x31 pin1=2 pin2=-1

Timer seems to be happy now on IRQ/pin2
Not immediately clear why it was on pin0+pin2 before.

> Using local APIC timer interrupts.
> Detected 12.500 MHz APIC timer.
> cpu: 0, clocks: 2000140, slice: 1000070

> +number of MP IRQ sources: 15.
> number of IO-APIC #2 registers: 24.
> testing the IO APIC.......................
>
> @@ -117,7 +111,7 @@
> ....... : IO APIC version: 0003
> .... IRQ redirection table:
> NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
> - 00 001 01 0 0 0 0 0 1 1 31
> + 00 000 00 1 0 0 0 0 0 0 00
> 01 001 01 0 0 0 0 0 1 1 39
> 02 001 01 0 0 0 0 0 1 1 31

notice double mapping on vector 31 is now gone.

> 03 001 01 0 0 0 0 0 1 1 41
> @@ -142,7 +136,7 @@
> 16 001 01 1 1 0 1 0 1 1 C1
> 17 001 01 1 1 0 1 0 1 1 E1
> IRQ to pin mappings:
> -IRQ0 -> 0:0-> 0:2

This one was broken. Whenever you see a double IRQ mapping
like this in ACPI mode it indicates a bug.

> +IRQ0 -> 0:2
> IRQ1 -> 0:1
> IRQ3 -> 0:3
> IRQ4 -> 0:4
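
The two symptoms pointed out above (two unmasked IO-APIC pins sharing one vector, and one IRQ mapped to two pins) can be checked mechanically in a saved dmesg. This is a minimal Python sketch, assuming the dmesg layout shown in this thread. The function names are invented for illustration and are not part of any kernel tool, and the sample rows are copied from the pre2 side of the diff.

```python
import re

HEX = re.compile(r"[0-9A-Fa-f]+")

def _strip_prefix(line):
    # Drop diff markers ("-", "+") and mail quote markers (">").
    return line.lstrip("-+> \t")

def shared_vectors(dmesg_text):
    """Return {vector: [pins]} for unmasked redirection entries sharing a vector."""
    by_vector = {}
    for line in dmesg_text.splitlines():
        cols = _strip_prefix(line).split()
        # Rows look like: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect
        if len(cols) != 11 or not all(HEX.fullmatch(c) for c in cols):
            continue
        pin, mask, vector = cols[0], cols[3], cols[10]
        if mask == "0":  # only unmasked pins can deliver interrupts
            by_vector.setdefault(vector, []).append(pin)
    return {v: pins for v, pins in by_vector.items() if len(pins) > 1}

def double_pin_mappings(dmesg_text):
    """Return {irq: [pins]} for IRQs listed with more than one apic:pin pair."""
    doubled = {}
    for line in dmesg_text.splitlines():
        m = re.match(r"IRQ(\d+)\s*->(.*)", _strip_prefix(line))
        if m:
            pins = re.findall(r"\d+:\d+", m.group(2))
            if len(pins) > 1:
                doubled[int(m.group(1))] = pins
    return doubled

PRE2 = """\
 00 001 01  0    0    0   0   0    1    1    31
 01 001 01  0    0    0   0   0    1    1    39
 02 001 01  0    0    0   0   0    1    1    31
IRQ0 -> 0:0-> 0:2
IRQ1 -> 0:1
"""

print(shared_vectors(PRE2))
print(double_pin_mappings(PRE2))
```

On the pre2 rows both checks flag IRQ0 (pins 0 and 2 on vector 31); on the pre3 rows, where pin 0 is masked and IRQ0 maps only to 0:2, both return empty results.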

anyway, if you see problems going forward, please let me know.

cheers,
-Len


2004-06-21 10:46:29

by Milan Gabor

Subject: Re: [discuss] x86-64: double timer interrupts in recent 2.4.x

Hi!

I have SuSE 9.0 on a dual-Opteron MSI K8T Master 2 motherboard.
I also get timer interrupts on only one CPU and my clock is ticking strangely,
so I have to synchronize it with an NTP server frequently.

This is from my system:
           CPU0       CPU1
  0:      30434   16139843    IO-APIC-edge  timer
  1:        944          0    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
 14:         30          1    IO-APIC-edge  ide0
 16:     657371          0   IO-APIC-level  eth0
 20:     261267          0   IO-APIC-level  libata
NMI:     694146     873271
LOC:   16167676   16167576
ERR:          1
MIS:          0

Linux www 2.4.21-226-smp #1 SMP Tue Jun 15 09:14:10 UTC 2004 x86_64 x86_64 x86_64 GNU/Linux


I am also running irq_balance, with acpi=off set from the grub boot menu.
Without acpi=off the system never boots.

Is there any solution, so that the clock works correctly and interrupts go to
both CPUs?

Milan


Peter Cordes wrote:

> Nobody replied to this message on debian-amd64 or discuss@x86-64.
> Hopefully I've found the right places to send it this
> time around. Actually, Roland Fehrenbacher saw my message in a list archive
> and mailed me to confirm that he saw the same double-speed clock problem on
> two different machines, so it's not just Tyan S2880 boards. He suggested I
> mail Andi and lkml, so here goes. (I haven't tested again with anything more
> recent than 2.4.27-pre2, so if this is fixed, sorry.)
>
> -----
>
> I just noticed that on my Opteron cluster, the nodes that are running 64bit
> kernels have their clocks ticking at double speed. This happens with
> Linux 2.4.26, and 2.4.27-pre2, compiled with gcc 3.3.3 (Debian 20040401) in a
> Debian pure64 chroot. Linux 2.4.25, compiled on Debian Woody + bi-arch gcc
> 3.3.2 20030908, does _not_ have the problem. The config options were pretty
> much the same for all kernels, and all the kernels are plain vanilla flavour
> from http://www.ca.kernel.org.
>
> If I run ntpdate to set the clock, then 10 seconds later it will be 10
> seconds fast. Running date(1), the system time advances 20 seconds in 10
> seconds of real time. (I haven't done anything weird with adjtimex(8).)
> time sleep 10 takes 5 seconds, but bash reports its real time as 10 seconds.
> The timer interrupt counter is increasing at a rate of 200/real second, so
> it seems like the system is getting timer interrupts twice as fast as it
> should. (With 2.4.25, it is 100/sec, same as HZ).
>
> Linux says it is using the PIT and TSC timers. I have HPET enabled in my
> Linux config, but I guess Tyan's S2880 mobo doesn't have one. This is a
> dual-Opteron 240 machine, BTW.
>
> i386 Linux on the same machines has no problems with timekeeping. (But I
> haven't tested versions later than 2.4.25 in legacy mode.)
>
> I spent some time poking around the timer code that increments xtime, but I
> guess the fact that the timer irqs are coming at double speed indicates that
> the problem lies elsewhere. Maybe the code that sets up the timer?
>
> $ uname -a
> Linux node6.cs.dal.ca 2.4.26 #2 SMP Fri May 14 14:46:42 ADT 2004 x86_64 x86_64 x86_64 GNU/Linux
> $ cat /proc/interrupts
> CPU0 CPU1
> 0: 4415908 0 IO-APIC-edge timer
> 1: 2 0 IO-APIC-edge keyboard
> 2: 0 0 XT-PIC cascade
> 9: 0 0 IO-APIC-level acpi
> 14: 17861 1 IO-APIC-edge ide0
> 19: 0 0 IO-APIC-level usb-ohci, usb-ohci
> 24: 563942 0 IO-APIC-level eth0
> 25: 564331 0 IO-APIC-level eth1
> NMI: 19097 19097
> LOC: 2211090 2211095
> ERR: 0
> MIS: 0
>
> Only CPU0 is getting the timer interrupt, but at least we know it's not
> that both CPUs are getting the timer interrupt. (Both CPUs get 100 LOC:
> (local APIC) interrupts/sec, but that happens on the non-buggy 2.4.25, too.)
>
> Thanks for any help,
>
> I'm not subscribed to the lkml, so please CC me on any followups.
>

2004-06-21 11:04:22

by Vojtech Pavlik

Subject: Re: [discuss] x86-64: double timer interrupts in recent 2.4.x

On Mon, Jun 21, 2004 at 12:45:51PM +0200, Milan Gabor wrote:
> Hi!
>
> I have Suse 9.0 and dual Opteron on MSI K8T Master 2 motherboard.
> I also get interrupts only on one cpu and my clock is ticking strange,
> so I have to synchronize it with NTP server frequently.
>
> This is from my system:
> CPU0 CPU1
> 0: 30434 16139843 IO-APIC-edge timer
> 1: 944 0 IO-APIC-edge keyboard
> 2: 0 0 XT-PIC cascade
> 14: 30 1 IO-APIC-edge ide0
> 16: 657371 0 IO-APIC-level eth0
> 20: 261267 0 IO-APIC-level libata
> NMI: 694146 873271
> LOC: 16167676 16167576
> ERR: 1
> MIS: 0
>
> Linux www 2.4.21-226-smp #1 SMP Tue Jun 15 09:14:10 UTC 2004 x86_64
> x86_64 x86_64 GNU/Linux
>
>
> I am also running irq_balance and acpi=off set from grub boot menu.
> Without acpi=off system never boots.
>
> Is there any solution, so clock will work OK and interrupts will be on
> both CPUs?
>
> MIlan

This patch could fix that (replace i386 with x86_64):

http://marc.theaimsgroup.com/?l=linux-kernel&m=108774225111967&w=2


--
Vojtech Pavlik
SuSE Labs, SuSE CR