2005-05-02 19:28:48

by Oskar Liljeblad

Subject: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

I'm running 2.6.11.8 on a server with two Promise Ultra133 TX2 (PDC20269)
PCI cards, same hardware revision (judging from stickers on the cards).
I'm using the CONFIG_BLK_DEV_PDC202XX_NEW driver.
Each card has two connected hard drives. Whenever I read from a disk
on one of the cards (e.g. using 'dd if=/dev/hdX of=/dev/null bs=1M'), and
at the same time read from a disk on the other card, there is heavy
software clock drift. It drifts about 2-5 seconds per minute.

This does not happen if I read from two drives connected on the same
card, or if I read from a drive connected to the motherboard IDE
(VIA vt8233a) and a drive on either of the Promise cards.

Oskar Liljeblad ([email protected])


2005-05-03 14:22:18

by Drew Winstel

Subject: RE: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

Hello,


>I'm running 2.6.11.8 on a server with two Promise Ultra133 TX2 (PDC20269)
>PCI cards, same hardware revision (judging from stickers on the cards).
>I'm using the CONFIG_BLK_DEV_PDC202XX_NEW driver.
>Each card has two connected hard drives. Whenever I read from a disk
>on one of the cards (e.g. using 'dd if=/dev/hdX of=/dev/null bs=1M'), and
>at the same time read from a disk on the other card, there is heavy
>software clock drift. It drifts about 2-5 seconds per minute.

>This does not happen if I read from two drives connected on the same
>card, or if I read from a drive connected to the motherboard IDE
>(VIA vt8233a) and a drive on either of the Promise cards.

>Oskar Liljeblad ([email protected])

Just to verify your setup:

You have a total of four hard drives connected to your PDC20269, hde, hdg,
hdi, and hdk, correct?

Are all four drives running in DMA mode?

Please post the output of lspci -vv and hdparm run on each of the four hard
drives.

Also, you may want to try downloading and using Albert Lee's pata_pdc2027x
driver (part of libata-dev-2.6 tree). See info at my thread from earlier:
http://marc.theaimsgroup.com/?l=linux-ide&m=110902518625384&w=2

Download the latest libata-dev patch set at
http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/

Hope this gets you started.

Drew

2005-05-03 14:48:09

by Oskar Liljeblad

Subject: Re: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

On Tuesday, May 03, 2005 at 09:58, Drew Winstel wrote:

Hi & Thanks for your quick reply!

> Just to verify your setup:
>
> You have a total of four hard drives connected to your PDC20269, hde, hdg,
> hdi, and hdk, correct?

Yes, two drives per physical PDC20269 card.

> Are all four drives running in DMA mode?

Yes, all are running udma6. Here's what hdparm tells me for all drives
(only difference is serial number):

# hdparm /dev/hde

/dev/hde:
multcount = 0 (off)
IO_support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
geometry = 30515/255/63, sectors = 490234752, start = 0
# hdparm -i /dev/hde

/dev/hde:

Model=Maxtor 6B250R0, FwRev=BAH41BM0, SerialNo=<...>
Config={ Fixed }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
BuffType=DualPortCache, BuffSize=16384kB, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=yes: disabled (255) WriteCache=enabled
Drive conforms to: (null):

* signifies the current active mode

#

> Please post the output of lspci -vv and hdparm run on each of the four hard
> drives.

See attached file for lspci output.

> Also, you may want to try downloading and using Albert Lee's pata_pdc2027x
> driver (part of libata-dev-2.6 tree). See info at my thread from earlier:
> http://marc.theaimsgroup.com/?l=linux-ide&m=110902518625384&w=2

Interesting, I will try them out ASAP and return to you!

> Download the latest libata-dev patch set at
> http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/
>
> Hope this gets you started.

Thanks again!

Regards,

Oskar Liljeblad ([email protected])


Attachments:
(No filename) (1.89 kB)
lspci.out (8.12 kB)

2005-05-03 15:12:56

by Drew Winstel

Subject: RE: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards


> geometry = 30515/255/63, sectors = 490234752, start = 0

>/dev/hde:

> Model=Maxtor 6B250R0, FwRev=BAH41BM0, SerialNo=<...>
> Config={ Fixed }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
> BuffType=DualPortCache, BuffSize=16384kB, MaxMultSect=16, MultSect=off
> CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
> IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
> PIO modes: pio0 pio1 pio2 pio3 pio4
> DMA modes: mdma0 mdma1 mdma2
> UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
> AdvancedPM=yes: disabled (255) WriteCache=enabled
> Drive conforms to: (null):

> * signifies the current active mode

Hmm... that puzzles me, although for no other reason than I'm not familiar
with how Maxtor drives report themselves. Having the BIOS-reported LBA
sectors not equal to the OS-reported geometry may not be a problem, but
I must defer to the experts on that one.

As an FYI just in case, the new libata-based driver will treat your drives
as SCSI drives, so you'll see the drives as sda, sdb, and so forth instead of
hd?.

Good luck!

Drew
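
A note on the mismatch discussed above: 268435455 is exactly 2^28 - 1, the largest sector count a 28-bit LBA field can express, so a drive bigger than roughly 137 GB would be expected to show that capped value in LBAsects while the kernel's larger figure comes from the 48-bit count. The following is only a sketch of that arithmetic in Python, using the two numbers from the hdparm output; the 512-byte sector size is an assumption, not something stated in the thread.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors

lba28_reported = 268435455   # LBAsects value from `hdparm -i /dev/hde`
kernel_sectors = 490234752   # sectors value from `hdparm /dev/hde`


def capacity_gb(sectors, sector_bytes=SECTOR_BYTES):
    """Return the capacity in decimal gigabytes (10**9 bytes)."""
    return sectors * sector_bytes / 1e9


# True when the reported value sits exactly at the 28-bit LBA ceiling.
lba28_is_capped = lba28_reported == 2**28 - 1

print("LBA28 field at its maximum:", lba28_is_capped)
print("LBA28 ceiling: %.1f GB" % capacity_gb(lba28_reported))
print("Kernel-reported size: %.1f GB" % capacity_gb(kernel_sectors))
```

The kernel-reported size works out to about 251 GB, which is consistent with the 6B250R0 model name, so the two figures do not by themselves point to a fault.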

2005-05-03 16:02:35

by Oskar Liljeblad

Subject: Re: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

On Tuesday, May 03, 2005 at 10:17, Drew Winstel wrote:
>
> Hmm... that puzzles me, although for no other reason than I'm not familiar
> with how Maxtor drives report themselves. Having the BIOS-reported LBA
> sectors not equal to the OS-reported geometry may not be a problem, but
> I must defer to the experts on that one.
>
> As an FYI just in case, the new libata-based driver will treat your drives
> as SCSI drives, so you'll see the drives as sda, sdb, and so forth instead of
> hd?.

Hm, I patched the kernel with 2.6.11-libata-dev1, compiled it with

# CONFIG_BLK_DEV_PDC202XX_NEW is not set
CONFIG_BLK_DEV_VIA82CXXX=y (for the motherboard IDE)
CONFIG_SCSI=y
CONFIG_SCSI_SATA=y
CONFIG_SCSI_PATA_PDC2027X=y

and rebooted. SCSI is initialized and the pata_pdc2027x driver is
loaded, but it doesn't seem to find any devices. Or maybe it doesn't
look for devices at all. I can tell that it's loaded by the existence
of /sys/bus/pci/drivers/pata_pdc2027x (a directory which is empty).

/proc/scsi/scsi is also empty besides the "Attached devices:" line.
During startup the kernel does say "Probing IDE interface ide0"
through "ide5" (finding only devices on ide0). I also tried compiling
pata_pdc2027x as a module, with same result.

What's wrong here?

Regards,

Oskar Liljeblad ([email protected])

2005-05-03 16:07:49

by Drew Winstel

Subject: RE: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

>Hm, I patched the kernel with 2.6.11-libata-dev1, compiled it with

># CONFIG_BLK_DEV_PDC202XX_NEW is not set
>CONFIG_BLK_DEV_VIA82CXXX=y (for the motherboard IDE)
>CONFIG_SCSI=y
>CONFIG_SCSI_SATA=y
>CONFIG_SCSI_PATA_PDC2027X=y

>and rebooted. SCSI is initialized and the pata_pdc2027x driver is
>loaded, but it doesn't seem to find any devices. Or maybe it doesn't
>look for devices at all. I can tell that it's loaded by the existence
>of /sys/bus/pci/drivers/pata_pdc2027x (a directory which is empty).

>/proc/scsi/scsi is also empty besides the "Attached devices:" line.
>During startup the kernel does say "Probing IDE interface ide0"
>through "ide5" (finding only devices on ide0). I also tried compiling
>pata_pdc2027x as a module, with same result.

>What's wrong here?

I think I know what the problem is.

In include/linux/libata.h, make sure the preprocessor declarations are as
follows. I think the defaults have ATA_ENABLE_PATA undefined.

#define ATA_ENABLE_ATAPI /* undefine to disable ATAPI support */
#define ATA_ENABLE_PATA /* define to enable PATA support in some
* low-level drivers */

Sorry I forgot to mention this earlier.

Drew
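
Before rebuilding, it may help to confirm which way the two macros are actually set in the patched tree. The helper below is a minimal sketch in Python: `macro_state` is a hypothetical name, it only looks at plain `#define` and `#undef` lines, and it ignores any surrounding `#if` conditionals, so treat its answer as a hint rather than a definitive reading of the header.

```python
import re


def macro_state(header_text, name):
    """Return 'defined', 'undefined', or 'absent' for a macro name.

    Only top-level #define/#undef lines are considered; if the name
    appears more than once, the last matching line wins.
    """
    pattern = re.compile(
        r"^\s*#\s*(define|undef)\s+" + re.escape(name) + r"\b"
    )
    state = "absent"
    for line in header_text.splitlines():
        match = pattern.match(line)
        if match:
            state = "defined" if match.group(1) == "define" else "undefined"
    return state
```

For example, reading include/linux/libata.h into a string and calling `macro_state(text, "ATA_ENABLE_PATA")` should return 'defined' once the change described above has been made.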

2005-05-03 17:28:53

by Oskar Liljeblad

Subject: Re: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

On Tuesday, May 03, 2005 at 11:53, Drew Winstel wrote:
> I think I know what the problem is.
>
> In include/linux/libata.h, make sure the preprocessor declarations are as
> follows. I think the defaults have ATA_ENABLE_PATA undefined.
>
> #define ATA_ENABLE_ATAPI /* undefine to disable ATAPI support */
> #define ATA_ENABLE_PATA /* define to enable PATA support in some
> * low-level drivers */

Thanks, now it loads correctly. Unfortunately the clock drift still occurs
with pata_pdc2027x. I'm guessing here, but can clock drift have anything
to do with IRQs? Also, is it normal to see errors in /proc/interrupts?

# cat /proc/interrupts
CPU0
0: 954189 XT-PIC timer
2: 0 XT-PIC cascade
8: 16 XT-PIC rtc
9: 0 XT-PIC acpi
10: 1630 XT-PIC eth1
11: 103147 XT-PIC libata
12: 9990 XT-PIC eth0
14: 5993 XT-PIC ide0
15: 145866 XT-PIC libata
NMI: 0
LOC: 0
ERR: 23672
MIS: 0

Regards,

Oskar Liljeblad ([email protected])
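
To see whether the ERR counter climbs only while both cards are busy, two snapshots of /proc/interrupts can be compared. What follows is a rough Python sketch; `parse_interrupts` and `count_delta` are hypothetical helper names, and the parser assumes the simple layout shown above (a label, one count per CPU, then an optional description).

```python
def parse_interrupts(text):
    """Map each row label to (total count over all CPUs, description)."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header row such as "CPU0" has no colon
        fields = rest.split()
        total = 0
        used = 0
        # Leading all-digit fields are per-CPU counts; sum them.
        while used < len(fields) and fields[used].isdigit():
            total += int(fields[used])
            used += 1
        if used == 0:
            continue  # no numeric count on this row
        counts[label.strip()] = (total, " ".join(fields[used:]))
    return counts


def count_delta(before, after, label):
    """Return how much one counter grew between two parsed snapshots."""
    return after[label][0] - before[label][0]
```

Taking one snapshot before starting the two dd readers and another a minute later, then calling `count_delta(before, after, "ERR")` and `count_delta(before, after, "0")`, would show how the error and timer counters move during the test.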

2005-05-03 18:29:43

by Drew Winstel

Subject: RE: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

> Thanks, now it loads correctly. Unfortunately the clock drift still occurs
> with pata_pdc2027x. I'm guessing here, but can clock drift have anything
> to do with IRQs? Also, is it normal to see errors in /proc/interrupts?

I've never noticed any errors before, but that could just be a result of me
never actually bothering to look.

> # cat /proc/interrupts

<snip>
> ERR: 23672

Hmmm....

I'm grasping at straws here.

Let's do some poking into your kernel config. What do you have set under
"Processor type and features"?

I experienced a similar situation once in the past, but that was a result of
drives losing DMA and doing simultaneous activity on eight drives and four
controllers. Moving to the pata_pdc2027x driver seems to have cleared it
right up.

Sorry I can't provide a quick cure-all!

Drew

2005-05-04 18:01:28

by Bill Davidsen

Subject: Re: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

Oskar Liljeblad wrote:
> On Tuesday, May 03, 2005 at 11:53, Drew Winstel wrote:
>
>>I think I know what the problem is.
>>
>>In include/linux/libata.h, make sure the preprocessor declarations are as
>>follows. I think the defaults have ATA_ENABLE_PATA undefined.
>>
>>#define ATA_ENABLE_ATAPI /* undefine to disable ATAPI support */
>>#define ATA_ENABLE_PATA /* define to enable PATA support in some
>> * low-level drivers */
>
>
> Thanks, now it loads correctly. Unfortunately the clock drift still occurs
> with pata_pdc2027x. I'm guessing here, but can clock drift have anything
> to do with IRQs? Also, is it normal to see errors in /proc/interrupts?

Are you by chance running ntpd? And have you checked to see if it dies
as soon as it starts? There's a bug in the ntp M/L wrt execstack.

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2005-05-04 20:12:30

by Oskar Liljeblad

Subject: Re: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

On Tuesday, May 03, 2005 at 13:45, Drew Winstel wrote:
> > Thanks, now it loads correctly. Unfortunately the clock drift still occurs
> > with pata_pdc2027x. I'm guessing here, but can clock drift have anything
> > to do with IRQs? Also, is it normal to see errors in /proc/interrupts?
>
> I've never noticed any errors before, but that could just be a result of me
> never actually bothering to look.

Are interrupt errors serious? Can anyone tell? The error count was higher
when I had a fifth PCI card in the computer (natsemi ethernet NIC).
Could there be some kind of PCI card conflict? Maybe I should try to
remove a few of them...

[..]
> Let's do some poking into your kernel config. What do you have set under
> "Processor type and features"?

CONFIG_X86_PC=y
CONFIG_MK7=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y

In other words, Athlon/Duron/K7 + HPET + Local APIC + IO-Apic
(though I have tried with both XT-PIC and Local APIC, with same
drift).

And no, the clock drift occurs no matter if ntpd is running or not.
It's having a very hard time synchronizing with the remote servers,
because the clock drift is too high. (It also says somewhere in the
NTP documentation that it doesn't handle too high clock drift.)

Regards,

Oskar Liljeblad ([email protected])
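
To put the reported drift in the units NTP uses, the 2-5 seconds per minute from the first message can be converted to parts per million. This is only a back-of-the-envelope sketch in Python; the 500 ppm figure is an assumption taken from the frequency tolerance usually quoted for ntpd, not a number from this thread.

```python
NTPD_MAX_PPM = 500.0  # assumption: commonly documented ntpd frequency limit


def drift_ppm(drift_seconds, interval_seconds):
    """Convert an observed drift over an interval to parts per million."""
    return drift_seconds / interval_seconds * 1e6


low = drift_ppm(2, 60)    # 2 seconds lost or gained per minute
high = drift_ppm(5, 60)   # 5 seconds lost or gained per minute

print("drift range: %.0f to %.0f ppm" % (low, high))
print("times over the assumed limit: %.0f to %.0f"
      % (low / NTPD_MAX_PPM, high / NTPD_MAX_PPM))
```

Under that assumption the drift is tens of thousands of ppm, far beyond what ntpd could be expected to discipline, which fits the trouble synchronizing described above.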

2005-05-05 14:45:19

by Drew Winstel

Subject: RE: clock drift with two Promise Ultra133 TX2 (PDC 20269) cards

> Are interrupt errors serious? Can anyone tell? The error count was higher
> when I had a fifth PCI card in the computer (natsemi ethernet NIC).
> Could there be some kind of PCI card conflict? Maybe I should try to
> remove a few of them...

It's worth a shot. Just for the record, I haven't had any such errors on my
machine.
root@linux /proc # cat interrupts
CPU0
0: 46494411 XT-PIC timer
1: 13644 XT-PIC i8042
2: 0 XT-PIC cascade
5: 1210194 XT-PIC ohci_hcd, SiS SI7012
7: 19 XT-PIC parport0
9: 3492898 XT-PIC acpi, ohci_hcd, nvidia
10: 0 XT-PIC ohci_hcd
11: 29194 XT-PIC ehci_hcd, eth0
12: 295381 XT-PIC i8042
14: 567754 XT-PIC ide0
15: 48 XT-PIC ide1
NMI: 0
ERR: 0
root@linux /proc # uptime
09:38:16 up 12:54, 5 users, load average: 0.08, 0.03, 0.03

> In other words, Athlon/Duron/K7 + HPET + Local APIC + IO-Apic
> (though I have tried with both XT-PIC and Local APIC, with same
> drift).

So much for that thought. The kernel config is solid (as expected).

> And no, the clock drift occurs no matter if ntpd is running or not.
> It's having a very hard time synchronizing with the remote servers,
> because the clock drift is too high. (It also says somewhere in the
> NTP documentation that it doesn't handle too high clock drift.)

I just had an idea. Using the old PDC20269 IDE driver, try repeating
your test to trigger clock drift. While the test is in progress, run
hdparm (no need for -i) on both drives. I wonder if it's dropping out
of DMA mode temporarily for no apparent reason.

Thanks,
Drew
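
The check suggested above can be automated so that a brief drop out of DMA is not missed between manual runs. Below is a minimal Python sketch, not a tested tool: `dma_enabled` and `poll` are hypothetical names, the regular expression assumes the `using_dma = 1 (on)` line format shown earlier in the thread, and running hdparm normally requires root.

```python
import re
import subprocess
import time


def dma_enabled(hdparm_output):
    """Return True/False from hdparm's using_dma line, or None if absent."""
    match = re.search(r"using_dma\s*=\s*(\d+)", hdparm_output)
    if match is None:
        return None
    return match.group(1) == "1"


def poll(devices, interval=1.0, samples=60):
    """Print a timestamped DMA state for each device, once per interval."""
    for _ in range(samples):
        for dev in devices:
            # Plain `hdparm <device>` prints the settings block seen above.
            result = subprocess.run(
                ["hdparm", dev], capture_output=True, text=True
            )
            print(time.strftime("%H:%M:%S"), dev, dma_enabled(result.stdout))
        time.sleep(interval)
```

While the two dd readers are running, calling `poll` with one drive from each card (for instance `poll(["/dev/hde", "/dev/hdi"])`, assuming those two sit on different cards) would log any moment where using_dma reads as off.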