2004-04-23 21:45:51

by Sebastian Witt

Subject: PROBLEM: Oops when using both channels of the PDC20262

If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux server3 2.6.5 #2 SMP Sat Feb 14 01:52:40 CET 2004 i686 unknown unknown GNU/Linux

Gnu C 3.2.2
Gnu make 3.80
binutils 2.13.2
util-linux 2.12
mount 2.12
module-init-tools 0.9.10
e2fsprogs 1.32
xfsprogs 2.6.0
quota-tools 3.10.
Linux C Library 2.3.1
Dynamic linker (ldd) 2.3.1
Linux C++ Library 5.0.2
Procps 3.1.9
Net-tools 1.60
Kbd 1.08
Sh-utils 5.0
Modules Loaded raid5 xor md ipt_LOG ipt_limit ipt_state ip_conntrack iptable_filter ip_tables md5 ipv6 bridge ns83820 3c59x


Attachments:
log1.log (1.25 kB)
log2.log (11.34 kB)
oops1.log (2.43 kB)
oops2.log (4.35 kB)
ver (821.00 B)

2004-04-24 19:42:52

by Denis Vlasenko

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

On Saturday 24 April 2004 00:30, Sebastian Witt wrote:
> Hello,
>
> I'm getting some Oopses with kernel 2.6.5 when there is high load on
> both channels of a Promise PDC20262 (Ultra66) card on a SMP machine
> (Tyan S1834, Via Apollo Pro chipset).

I recall a similar report. The reporter found that there is a #define
in the source which can be enabled to make the driver serialize access
to the channels. That 'fixed' (most probably worked around, though)
the problem.

I can't say whether it was a hardware or driver problem,
I didn't look into it.

> There are no problems when I use 2.6.1, but I have this problem
> since 2.6.2.
> It only occurs when I use the PDC20262, not when using the onboard
> IDE-controller.
>
> It is reproducible after a few seconds when I use 'dd if=/dev/hde
> of=/dev/hdh bs=512'.
> Using of=/dev/null also triggers it, but it takes longer.
>
> Mostly it reports smp_apic_timer_interrupt+1c/140, but the last time
> I tried it, it also reports <__mask_IO_APIC_irq+40/e0>.
>
> I've attached the logs and the ksymoops trace.
--
vda

by Bartlomiej Zolnierkiewicz

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

On Saturday 24 of April 2004 21:42, Denis Vlasenko wrote:
> On Saturday 24 April 2004 00:30, Sebastian Witt wrote:
> > Hello,
> >
> > I'm getting some Oopses with kernel 2.6.5 when there is high load on
> > both channels of a Promise PDC20262 (Ultra66) card on a SMP machine
> > (Tyan S1834, Via Apollo Pro chipset).
>
> I recall a similar report. The reporter found that there is a #define
> in the source which can be enabled to make the driver serialize access
> to the channels. That 'fixed' (most probably worked around, though)
> the problem.

Denis, you are talking about hpt366.c not pdc202xx_old.c. ;-)

> I can't say whether it was a hardware or driver problem,
> I didn't look into it.
>
> > There are no problems when I use 2.6.1, but I have this problem
> > since 2.6.2.
> > It only occurs when I use the PDC20262, not when using the onboard
> > IDE-controller.

There was a change to the pdc202xx_old.c driver in 2.6.2.
Please revert this patch and report whether it helps.

diff -Nru a/drivers/ide/pci/pdc202xx_old.c b/drivers/ide/pci/pdc202xx_old.c
--- a/drivers/ide/pci/pdc202xx_old.c Tue Feb 3 19:45:42 2004
+++ b/drivers/ide/pci/pdc202xx_old.c Tue Feb 3 19:45:42 2004
@@ -361,16 +361,38 @@
return ((u8)(CIS & mask));
}

+/*
+ * Set the control register to use the 66MHz system
+ * clock for UDMA 3/4/5 mode operation when necessary.
+ *
+ * It may also be possible to leave the 66MHz clock on
+ * and readjust the timing parameters.
+ */
+static void pdc_old_enable_66MHz_clock(ide_hwif_t *hwif)
+{
+ unsigned long clock_reg = hwif->dma_master + 0x11;
+ u8 clock = hwif->INB(clock_reg);
+
+ hwif->OUTB(clock | (hwif->channel ? 0x08 : 0x02), clock_reg);
+}
+
+static void pdc_old_disable_66MHz_clock(ide_hwif_t *hwif)
+{
+ unsigned long clock_reg = hwif->dma_master + 0x11;
+ u8 clock = hwif->INB(clock_reg);
+
+ hwif->OUTB(clock & ~(hwif->channel ? 0x08 : 0x02), clock_reg);
+}
+
static int config_chipset_for_dma (ide_drive_t *drive)
{
struct hd_driveid *id = drive->id;
ide_hwif_t *hwif = HWIF(drive);
struct pci_dev *dev = hwif->pci_dev;
u32 drive_conf = 0;
- u8 mask = hwif->channel ? 0x08 : 0x02;
u8 drive_pci = 0x60 + (drive->dn << 2);
u8 test1 = 0, test2 = 0, speed = -1;
- u8 AP = 0, CLKSPD = 0, cable = 0;
+ u8 AP = 0, cable = 0;

u8 ultra_66 = ((id->dma_ultra & 0x0010) ||
(id->dma_ultra & 0x0008)) ? 1 : 0;
@@ -394,21 +416,6 @@
BUG();
}

- CLKSPD = hwif->INB(hwif->dma_master + 0x11);
-
- /*
- * Set the control register to use the 66Mhz system
- * clock for UDMA 3/4 mode operation. If one drive on
- * a channel is U66 capable but the other isn't we
- * fall back to U33 mode. The BIOS INT 13 hooks turn
- * the clock on then off for each read/write issued. I don't
- * do that here because it would require modifying the
- * kernel, separating the fop routines from the kernel or
- * somehow hooking the fops calls. It may also be possible to
- * leave the 66Mhz clock on and readjust the timing
- * parameters.
- */
-
if ((ultra_66) && (cable)) {
#ifdef DEBUG
printk(KERN_DEBUG "ULTRA 66/100/133: %s channel of Ultra 66/100/133 "
@@ -416,29 +423,12 @@
hwif->channel ? "Secondary" : "Primary");
printk(KERN_DEBUG " Switching to Ultra33 mode.\n");
#endif /* DEBUG */
- /* Primary : zero out second bit */
- /* Secondary : zero out fourth bit */
- hwif->OUTB(CLKSPD & ~mask, (hwif->dma_master + 0x11));
printk(KERN_WARNING "Warning: %s channel requires an 80-pin cable for operation.\n", hwif->channel ? "Secondary":"Primary");
printk(KERN_WARNING "%s reduced to Ultra33 mode.\n", drive->name);
- } else {
- if (ultra_66) {
- /*
- * check to make sure drive on same channel
- * is u66 capable
- */
- if (hwif->drives[!(drive->dn%2)].id) {
- if (hwif->drives[!(drive->dn%2)].id->dma_ultra & 0x0078) {
- hwif->OUTB(CLKSPD | mask, (hwif->dma_master + 0x11));
- } else {
- hwif->OUTB(CLKSPD & ~mask, (hwif->dma_master + 0x11));
- }
- } else { /* udma4 drive by itself */
- hwif->OUTB(CLKSPD | mask, (hwif->dma_master + 0x11));
- }
- }
}

+ pdc_old_disable_66MHz_clock(drive->hwif);
+
drive_pci = 0x60 + (drive->dn << 2);
pci_read_config_dword(dev, drive_pci, &drive_conf);
if ((drive_conf != 0x004ff304) && (drive_conf != 0x004ff3c4))
@@ -536,6 +526,8 @@

static int pdc202xx_old_ide_dma_begin(ide_drive_t *drive)
{
+ if (drive->current_speed > XFER_UDMA_2)
+ pdc_old_enable_66MHz_clock(drive->hwif);
if (drive->addressing == 1) {
struct request *rq = HWGROUP(drive)->rq;
ide_hwif_t *hwif = HWIF(drive);
@@ -569,6 +561,8 @@
clock = hwif->INB(high_16 + 0x11);
hwif->OUTB(clock & ~(hwif->channel ? 0x08:0x02), high_16+0x11);
}
+ if (drive->current_speed > XFER_UDMA_2)
+ pdc_old_disable_66MHz_clock(drive->hwif);
return __ide_dma_end(drive);
}

@@ -757,10 +751,7 @@

hwif->speedproc = &pdc202xx_tune_chipset;

- if (!hwif->dma_base) {
- hwif->drives[0].autotune = hwif->drives[1].autotune = 1;
- return;
- }
+ hwif->drives[0].autotune = hwif->drives[1].autotune = 1;

hwif->ultra_mask = 0x3f;
hwif->mwdma_mask = 0x07;
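
The two helpers added by the patch above differ only in whether they set or clear one bit of the byte at dma_master + 0x11. Below is a minimal standalone sketch of that bit logic, with no hardware access. The function names are hypothetical and not part of the driver. It assumes, as the masks in the patch suggest, that both channels keep their 66MHz enable bit in the same byte (0x02 for primary, 0x08 for secondary).

```c
#include <assert.h>
#include <stdint.h>

/* Bit in the clock control byte (dma_master + 0x11 in the patch) that
 * selects the 66MHz clock: 0x02 for channel 0 (primary), 0x08 for
 * channel 1 (secondary). */
static uint8_t pdc_clock_mask(unsigned channel)
{
	assert(channel <= 1);
	return channel ? 0x08 : 0x02;
}

/* Hypothetical helper: value to write back to turn the 66MHz clock on
 * for one channel, leaving all other bits as they were. */
static uint8_t pdc_clock_enable(uint8_t clock, unsigned channel)
{
	return (uint8_t)(clock | pdc_clock_mask(channel));
}

/* Hypothetical helper: value to write back to turn the 66MHz clock off
 * for one channel, leaving all other bits as they were. */
static uint8_t pdc_clock_disable(uint8_t clock, unsigned channel)
{
	return (uint8_t)(clock & ~pdc_clock_mask(channel));
}
```

With a register value of 0x02 (primary enabled), enabling the secondary channel yields 0x0a, and disabling the primary again yields 0x08. Each update is a read-modify-write of a byte that carries state for both channels.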


> > It is reproducible after a few seconds when I use 'dd if=/dev/hde
> > of=/dev/hdh bs=512'.
> > Using of=/dev/null also triggers it, but it takes longer.
> >
> > Mostly it reports smp_apic_timer_interrupt+1c/140, but the last time
> > I tried it, it also reports <__mask_IO_APIC_irq+40/e0>.
> >
> > I've attached the logs and the ksymoops trace.

Strange, it looks like an IO-APIC problem.
Have you tried booting with the "noapic" kernel parameter?

Regards,
Bartlomiej

2004-04-25 00:23:40

by Sebastian Witt

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

Bartlomiej Zolnierkiewicz wrote:
>
> There was a change to the pdc202xx_old.c driver in 2.6.2.
> Please revert this patch and report whether it helps.

I've tested 2.6.2 with the patch reverted and it seems to work.
Normally it takes <1min to crash the system when I access the discs
on the controller; with the patched driver it runs >1 hour without an error.

I've also copied the driver to the 2.6.5 tree and am now testing that
(after disabling the procfs code, which seems to have changed).

>
> Strange, it looks like an IO-APIC problem.
> Have you tried booting with the "noapic" kernel parameter?

Yes, then /proc/interrupts shows that it uses the XT-PIC, but it still
crashes (now without an Oops; the machine freezes completely).

by Bartlomiej Zolnierkiewicz

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

On Sunday 25 of April 2004 02:20, Sebastian Witt wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > There was a change to the pdc202xx_old.c driver in 2.6.2.
> > Please revert this patch and report whether it helps.
>
> I've tested 2.6.2 with the patch reverted and it seems to work.
> Normally it takes <1min to crash the system when I access the discs
> on the controller; with the patched driver it runs >1 hour without an
> error.

Ok, so now I know I screwed something up but don't know what (yet). :-)

Please go back to 2.6.5 and try this patch; it disables PIO autotune.
It fixed hangs for people disabling Promise BIOS but...

linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN drivers/ide/pci/pdc202xx_old.c~pdc_old_notune drivers/ide/pci/pdc202xx_old.c
--- linux-2.6.6-rc2-bk1/drivers/ide/pci/pdc202xx_old.c~pdc_old_notune 2004-04-25 03:25:49.962088200 +0200
+++ linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c 2004-04-25 03:26:59.749478904 +0200
@@ -728,7 +728,7 @@ static void __init init_hwif_pdc202xx (i

hwif->speedproc = &pdc202xx_tune_chipset;

- hwif->drives[0].autotune = hwif->drives[1].autotune = 1;
+// hwif->drives[0].autotune = hwif->drives[1].autotune = 1;

hwif->ultra_mask = 0x3f;
hwif->mwdma_mask = 0x07;

_

If it doesn't help try this one instead
(reverts old way of setting 66MHz clock):

linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c | 13 ++++++------
1 files changed, 7 insertions(+), 6 deletions(-)

diff -puN drivers/ide/pci/pdc202xx_old.c~pdc_old_fix drivers/ide/pci/pdc202xx_old.c
--- linux-2.6.6-rc2-bk1/drivers/ide/pci/pdc202xx_old.c~pdc_old_fix 2004-04-25 03:09:26.692567888 +0200
+++ linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c 2004-04-25 03:21:30.607516088 +0200
@@ -405,10 +405,15 @@ static int config_chipset_for_dma (ide_d
if (ultra_66 && cable) {
printk(KERN_WARNING "Warning: %s channel requires an 80-pin cable for operation.\n", hwif->channel ? "Secondary":"Primary");
printk(KERN_WARNING "%s reduced to Ultra33 mode.\n", drive->name);
+ pdc_old_disable_66MHz_clock(drive->hwif);
}

- if (dev->device != PCI_DEVICE_ID_PROMISE_20246)
- pdc_old_disable_66MHz_clock(drive->hwif);
+ if (ultra_66 && !cable) {
+ if (hwif->drives[!(drive->dn%2)].id->dma_ultra & 0x0078)
+ pdc_old_enable_66MHz_clock(drive->hwif);
+ else
+ pdc_old_disable_66MHz_clock(drive->hwif);
+ }

drive_pci = 0x60 + (drive->dn << 2);
pci_read_config_dword(dev, drive_pci, &drive_conf);
@@ -507,8 +512,6 @@ static int pdc202xx_quirkproc (ide_drive

static int pdc202xx_old_ide_dma_begin(ide_drive_t *drive)
{
- if (drive->current_speed > XFER_UDMA_2)
- pdc_old_enable_66MHz_clock(drive->hwif);
if (drive->addressing == 1) {
struct request *rq = HWGROUP(drive)->rq;
ide_hwif_t *hwif = HWIF(drive);
@@ -542,8 +545,6 @@ static int pdc202xx_old_ide_dma_end(ide_
clock = hwif->INB(high_16 + 0x11);
hwif->OUTB(clock & ~(hwif->channel ? 0x08:0x02), high_16+0x11);
}
- if (drive->current_speed > XFER_UDMA_2)
- pdc_old_disable_66MHz_clock(drive->hwif);
return __ide_dma_end(drive);
}


_
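
The second patch restores the old rule for the 66MHz clock: it is decided once in config_chipset_for_dma() from the drive's UDMA capability, the cable flag, and the other drive on the same channel. Here is a rough standalone sketch of that decision, not driver code. The struct, enum and function names are hypothetical. Two assumptions: `cable` being nonzero means no suitable cable was detected (judging only by the warning printed in that branch), and a missing partner drive counts as "drive by itself", which is how the pre-2.6.2 code shown earlier in the thread handled it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the single field of struct hd_driveid used here. */
struct drive_id {
	uint16_t dma_ultra;	/* bit n set: UDMA mode n supported */
};

enum clock_action { CLOCK_LEAVE, CLOCK_OFF, CLOCK_ON };

/* Index (0 or 1) of the other drive on the same channel, mirroring
 * hwif->drives[!(drive->dn % 2)] in the patch. */
static unsigned partner_index(unsigned dn)
{
	return !(dn % 2);
}

/* Decide what to do with the channel's 66MHz clock bit.
 * partner may be NULL when there is no second drive on the channel. */
static enum clock_action clock_decision(const struct drive_id *self,
					const struct drive_id *partner,
					int cable)
{
	assert(self != NULL);

	/* ultra_66 in the driver: drive supports UDMA 3 or UDMA 4 */
	if (!(self->dma_ultra & 0x0018))
		return CLOCK_LEAVE;	/* neither branch touches the clock */
	if (cable)
		return CLOCK_OFF;	/* "reduced to Ultra33 mode" */
	if (partner == NULL)
		return CLOCK_ON;	/* UDMA 3/4 drive by itself */
	/* the partner must support at least one of UDMA 3..6 */
	return (partner->dma_ultra & 0x0078) ? CLOCK_ON : CLOCK_OFF;
}
```

Unlike the patch as posted, this sketch checks for a missing partner before reading its dma_ultra field.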

Thanks,
Bartlomiej

2004-04-25 19:24:49

by Sebastian Witt

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

Bartlomiej Zolnierkiewicz wrote:
>
> Please go back to 2.6.5 and try this patch; it disables PIO autotune.
> It fixed hangs for people disabling Promise BIOS but...
>
> linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c | 2 +-
> 1 files changed, 1 insertion(+), 1 deletion(-)
> ...

Thanks, this patch works. I've now tested a 2.6.5 kernel multiple times
with and without this patch. The BIOS of my controller is also disabled,
in case this is important...

PDC20262: IDE controller at PCI slot 0000:00:11.0
PDC20262: chipset revision 1
PDC20262: 100% native mode on irq 17
PDC20262: (U)DMA Burst Bit DISABLED Primary PCI Mode Secondary PCI Mode.
ide2: BM-DMA at 0xdc00-0xdc07, BIOS settings: hde:DMA, hdf:DMA
ide3: BM-DMA at 0xdc08-0xdc0f, BIOS settings: hdg:DMA, hdh:DMA

Ultra66 Chipset.
------------------------------- General Status ---------------------------------
Burst Mode                           : disabled
Host Mode                            : Normal
Bus Clocking                         : 33 PCI Internal
IO pad select                        : 4 mA
Status Polling Period                : 9
Interrupt Check Status Polling Delay : 9
--------------- Primary Channel ---------------- Secondary Channel -------------
                enabled                          enabled
66 Clocking     disabled                         disabled
Mode            PCI                              PCI
                FIFO Empty                       FIFO Empty
--------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:    yes              yes             yes               yes
DMA Mode:       UDMA 4           UDMA 4          UDMA 4            UDMA 4
PIO Mode:       PIO ?            PIO ?           PIO ?             PIO ?

-------------

If you need some more testing etc. I'm available.

Thanks,
Sebastian

by Bartlomiej Zolnierkiewicz

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

On Sunday 25 of April 2004 20:02, Sebastian Witt wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > Please go back to 2.6.5 and try this patch; it disables PIO
> > autotune. It fixed hangs for people disabling Promise BIOS but...
> >
> > linux-2.6.6-rc2-bk1-bzolnier/drivers/ide/pci/pdc202xx_old.c | 2 +-
> > 1 files changed, 1 insertion(+), 1 deletion(-)
> > ...
>
> Thanks, this patch works. I've now tested a 2.6.5 kernel multiple times
> with and without this patch. The BIOS of my controller is also disabled,
> in case this is important...

Thanks. Can you retest with the BIOS enabled?

Please also send me the output of 'cat /proc/ide/ide2/config'
and 'cat /proc/ide/ide3/config' before and after applying this patch.

Cheers,
Bartlomiej

2004-04-25 20:21:00

by Sebastian Witt

Subject: Re: PROBLEM: Oops when using both channels of the PDC20262

pci bus 00 device 88 vendor 105a device 4d38 channel 1
5a 10 38 4d 07 00 00 02 01 00 04 01 00 20 00 00
01 cc 00 00 01 d0 00 00 01 d4 00 00 01 d8 00 00
01 dc 00 00 00 00 00 d9 00 00 00 00 5a 10 33 4d
00 00 00 00 00 00 00 00 00 00 00 00 11 01 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ca 33 00 00 00 00 00 00 00 00 00 00 00 00 00 00
11 24 41 00 11 24 41 00 11 24 41 00 11 24 41 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
5a 10 38 4d 07 00 00 02 01 00 04 01 00 20 00 00
01 cc 00 00 01 d0 00 00 01 d4 00 00 01 d8 00 00
01 dc 00 00 00 00 00 d9 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 11 01 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ca 33 00 00 00 00 00 00 00 00 00 00 00 00 00 00
11 24 41 00 11 24 41 00 11 24 41 00 11 24 41 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
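
For reading a dump like the one above: PCI configuration space is little-endian, so the leading bytes 5a 10 38 4d are vendor 105a and device 4d38, matching the header line. The per-drive dword the driver reads at drive_pci = 0x60 + (drive->dn << 2) falls in the row beginning 11 24 41 00. Below is a small sketch of that decoding. The helper names are hypothetical, and the sample array holds only the two rows needed for illustration, not the full dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a config-space byte dump. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Read a little-endian 32-bit value from a config-space byte dump. */
static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/* Offset of the per-drive dword, as computed in the driver:
 * drive_pci = 0x60 + (drive->dn << 2), for dn in 0..3. */
static size_t drive_conf_offset(unsigned dn)
{
	assert(dn <= 3);
	return 0x60 + ((size_t)dn << 2);
}

/* Partial copy of the dump: bytes 0x00..0x03 and the row at 0x60.
 * Everything in between is left as zero in this sketch. */
static const uint8_t sample_cfg[0x70] = {
	[0x00] = 0x5a, 0x10, 0x38, 0x4d,
	[0x60] = 0x11, 0x24, 0x41, 0x00, 0x11, 0x24, 0x41, 0x00,
		 0x11, 0x24, 0x41, 0x00, 0x11, 0x24, 0x41, 0x00,
};
```

On this sample, cfg_read32(sample_cfg, drive_conf_offset(dn)) gives 0x00412411 for all four drives. That is neither of the two values (0x004ff304, 0x004ff3c4) that config_chipset_for_dma() compares drive_conf against in the patch context.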


Attachments:
ide2-patched (823.00 B)
ide2-unpatched (823.00 B)
ide3-patched (823.00 B)
ide3-unpatched (823.00 B)