2003-08-10 07:26:50

by Ryan C. Underwood

Subject: PCI parallel card causes erratic timekeeping? (2.4.21)


Hi,

Here's a strange one for the kernel time gurus out there!

We have a file server with an Intel PR440FX dual PPro mainboard (2x200MHz
PPro CPU). The PR440FX has 3 PCI slots and 1 PCI/ISA shared slot, in
addition to onboard PIIX3 IDE, AIC-7880 SCSI, and Intel 82557 ethernet.

In the 3 PCI slots, we have an old Matrox video card, an AHA-2940UW, and
a Promise PDC20262 ATA-66 controller card.

In the ISA slot, up until recently, we had a dual parallel port card
that was attached to the network printers. However, the printers and
the card were fried in a storm recently; luckily, the server survived.
We replaced the card with a dual PCI parallel port card (Netmos
NM9715CV) and the printers are now working fine.

However, a new problem emerged. The software clock of the system is
crazy! Sometimes it is very fast, other times very slow. NTP
constantly loses synchronization, and since this machine is also a
Kerberos KDC, Kerberos tickets are flakey. :( This problem exists
independently of whether a driver is loaded for the card or not; if it
is in the system at all, the clock runs screwy. If I remove the card,
the clock is back to normal as far as I can tell.

What do you all think could be causing this problem, and what other
information would I need to provide to find a solution? The only
strange thing about that slot is that it is the only PCI slot that is
not master capable in the PR440FX. The card received its own IRQ (19,
from the IO-APIC).

I have heard about strange problems with timekeeping on SMP machines,
but there's no problem with this box except when that card is inserted.
I checked out messing with the kernel ticks (using tickadj), but it
seems that would only help with a clock that is consistently skewed one
way or the other, not one that is as erratic as this one.
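
To make the distinction concrete: a constant skew shows up as clock
offsets that lie on a straight line over time, which a fixed tick
adjustment can cancel, while an erratic clock leaves large residuals
around any such line. A rough sketch, assuming offsets (in seconds,
against some trusted reference) have been logged by hand; the function
name and sample values are hypothetical:

```python
def fit_drift(times, offsets):
    """Fit a least-squares line through (time, offset) samples.

    Returns (slope, max_residual): slope is the steady drift in seconds
    of offset per second of elapsed time, max_residual is the largest
    deviation of any sample from the fitted line.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    slope = cov / var_t
    intercept = mean_o - slope * mean_t
    max_residual = max(
        abs(o - (intercept + slope * t)) for t, o in zip(times, offsets)
    )
    return slope, max_residual


# Made-up samples taken one minute apart.
steady = fit_drift([0, 60, 120, 180], [0.0, 0.6, 1.2, 1.8])
erratic = fit_drift([0, 60, 120, 180], [0.0, 2.0, -1.0, 3.0])
```

A small residual with a nonzero slope is the case a tick adjustment
could plausibly address; a large residual is the erratic case
described here.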

Can simply inserting a card generally make a system clock act screwy
like this? Should I try to find a different card?

Thanks,

--
Ryan Underwood, <nemesis at icequake.net>, icq=10317253


2003-08-12 21:14:48

by john stultz

Subject: Re: PCI parallel card causes erratic timekeeping? (2.4.21)

On Sun, 2003-08-10 at 00:26, Ryan Underwood wrote:
> In the ISA slot, up until recently, we had a dual parallel port card
> that was attached to the network printers. However, the printers and
> the card were fried in a storm recently; luckily, the server survived.
> We replaced the card with a dual PCI parallel port card (Netmos
> NM9715CV) and the printers are now working fine.
>
> However, a new problem emerged. The software clock of the system is
> crazy! Sometimes it is very fast, other times very slow. NTP
> constantly loses synchronization, and since this machine is also a
> Kerberos KDC, Kerberos tickets are flakey. :( This problem exists
> independently of whether a driver is loaded for the card or not; if it
> is in the system at all, the clock runs screwy. If I remove the card,
> the clock is back to normal as far as I can tell.

[snip]

> Can simply inserting a card generally make a system clock act screwy
> like this? Should I try to find a different card?

Sounds like the card is blocking the timer interrupt from being handled.
This would cause a loss in time as well as time inconsistencies. I'm a
bit curious about time running too fast, though. It could be NTP trying
to compensate for the slowness and then overcompensating once the card
stops misbehaving. It's also possible calibrate_tsc is being confused by
the card's funkiness. It would be interesting to see what BogoMIPS value
is reported while the system is acting up.
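
One rough way to check for lost timer interrupts is to take two
snapshots of /proc/interrupts and compare the growth of the IRQ 0
(timer) count against the expected tick rate. This is only a sketch:
the helper names are hypothetical, HZ=100 is assumed (the usual value
on i386 2.4 kernels), and the elapsed time has to come from a reference
other than the suspect system clock, such as a watch or another
machine.

```python
EXPECTED_HZ = 100  # assumption: default timer frequency on i386 2.4 kernels


def timer_ticks(interrupts_text):
    """Sum the per-CPU counts on the IRQ 0 line of /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "0:":
            # Numeric fields are the per-CPU counts; the rest are labels.
            return sum(int(f) for f in fields[1:] if f.isdigit())
    raise ValueError("no IRQ 0 line found")


def tick_rate(before_text, after_text, elapsed_s):
    """Timer interrupts handled per second between two snapshots.

    elapsed_s must be measured with an independent clock.
    """
    return (timer_ticks(after_text) - timer_ticks(before_text)) / elapsed_s


def ticks_lost(before_text, after_text, elapsed_s, hz=EXPECTED_HZ):
    """Expected ticks minus handled ticks over the interval."""
    handled = timer_ticks(after_text) - timer_ticks(before_text)
    return hz * elapsed_s - handled
```

A rate noticeably below the expected HZ while the card is installed
would be consistent with timer interrupts being lost.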

Sounds like a bad card to me.

thanks
-john