Brief Synopsis:
IDE initialization on non-DMA controllers causes OOPS during boot
due to dereference of null function pointers.
Description:
This patch fixes an issue where null function pointers associated with
DMA are called during initialization of IDE hard drive controllers
(causing a kernel OOPS on boot). The problem only occurs on
controllers which do not support DMA. It has been tested successfuly
against a non-DMA IDE controller on x86.
I am not subscribed, so please CC me on any followup e-mails.
Glenn.
diff -Naur linux-2.4.24/drivers/ide/ide-dma.c linux-2.4.24-patched/drivers/ide/ide-dma.c
--- linux-2.4.24/drivers/ide/ide-dma.c 2003-08-25 11:44:41.000000000 +0000
+++ linux-2.4.24-patched/drivers/ide/ide-dma.c 2004-01-23 03:23:08.000000000 +0000
@@ -566,6 +566,33 @@
}
/**
+ * __ide_dma_no_op - Empty DMA function.
+ *
+ * Empty DMA function for controllers that do not support DMA.
+ */
+
+int __ide_dma_no_op (void)
+{
+ return 0;
+}
+
+EXPORT_SYMBOL(__ide_dma_no_op);
+
+/**
+ * __ide_dma_unsupported - Warning function for DMA operation.
+ *
+ * Warning function for DMA operation on unsupported hardware.
+ */
+
+int __ide_dma_unsupported (ide_hwif_t *hwif)
+{
+ printk(KERN_WARNING "DMA not supported by %s\n", hwif->name );
+ return 1;
+}
+
+EXPORT_SYMBOL(__ide_dma_unsupported);
+
+/**
* __ide_dma_host_off - Generic DMA kill
* @drive: drive to control
*
@@ -1214,3 +1241,17 @@
}
EXPORT_SYMBOL_GPL(ide_setup_dma);
+
+/*
+ * For IDE interfaces that do not support DMA, we still need to
+ * initialize some pointers to dummy functions during initialization.
+ */
+void default_hwif_dmaops (ide_hwif_t *hwif)
+{
+ hwif->ide_dma_on = __ide_dma_unsupported;
+ hwif->ide_dma_off_quietly = (int (*)(ide_drive_t *))&__ide_dma_no_op;
+ hwif->ide_dma_host_off = (int (*)(ide_drive_t *))&__ide_dma_no_op;
+ hwif->ide_dma_host_on = (int (*)(ide_drive_t *))&__ide_dma_no_op;
+}
+
+EXPORT_SYMBOL_GPL(default_hwif_dmaops);
diff -Naur linux-2.4.24/drivers/ide/ide.c linux-2.4.24-patched/drivers/ide/ide.c
--- linux-2.4.24/drivers/ide/ide.c 2003-11-28 18:26:20.000000000 +0000
+++ linux-2.4.24-patched/drivers/ide/ide.c 2004-01-23 03:32:24.000000000 +0000
@@ -251,6 +251,7 @@
hwif->sata = 0; /* assume PATA */
default_hwif_iops(hwif);
+ default_hwif_dmaops(hwif);
default_hwif_transport(hwif);
for (unit = 0; unit < MAX_DRIVES; ++unit) {
ide_drive_t *drive = &hwif->drives[unit];
diff -Naur linux-2.4.24/include/linux/ide.h linux-2.4.24-patched/include/linux/ide.h
--- linux-2.4.24/include/linux/ide.h 2003-11-28 18:26:21.000000000 +0000
+++ linux-2.4.24-patched/include/linux/ide.h 2004-01-23 03:24:08.000000000 +0000
@@ -1691,8 +1691,12 @@
extern void ide_destroy_dmatable(ide_drive_t *);
extern ide_startstop_t ide_dma_intr(ide_drive_t *);
extern int ide_release_dma(ide_hwif_t *);
+extern void default_hwif_dmaops(ide_hwif_t *);
extern void ide_setup_dma(ide_hwif_t *, unsigned long, unsigned int);
+extern void ide_setup_dma_off(ide_hwif_t *);
+extern int __ide_dma_no_op(void);
+extern int __ide_dma_unsupported(ide_hwif_t *);
extern int __ide_dma_host_off(ide_drive_t *);
extern int __ide_dma_off_quietly(ide_drive_t *);
extern int __ide_dma_off(ide_drive_t *);
@@ -1712,6 +1716,7 @@
extern int __ide_dma_lostirq(ide_drive_t *);
extern int __ide_dma_timeout(ide_drive_t *);
#else
+static inline void default_hwif_dmaops(ide_hwif_t *x) {;}
static inline void ide_setup_dma(ide_hwif_t *x, unsigned long y, unsigned int z) {;}
static inline void ide_release_dma(ide_hwif_t *x) {;}
#endif
On Fri, Jan 23, 2004 at 01:32:45PM -0500, Glenn Wurster wrote:
>
> Brief Synopsis:
>
> IDE initialization on non-DMA controllers causes OOPS during boot
> due to dereference of null function pointers.
Linus - I am ok with this but for 2.6 Bart needs to look at it I guess
> > Brief Synopsis:
> >
> > IDE initialization on non-DMA controllers causes OOPS during boot
> > due to dereference of null function pointers.
>
> Linus - I am ok with this but for 2.6 Bart needs to look at it I guess
I tried out the 2.6.1 kernel quickly and it did not exhibit the same
obvious problems oopsing with dma and the ide controller as the latest
2.4 kernels (on my hardware at least). It booted up nicely without a
problem on unmodified source. Whether or not the problem occurs for
other types of hardware I can't say.
Glenn.
On Friday 23 of January 2004 23:09, Glenn Wurster wrote:
> > > Brief Synopsis:
> > >
> > > IDE initialization on non-DMA controllers causes OOPS during boot
> > > due to dereference of null function pointers.
> >
> > Linus - I am ok with this but for 2.6 Bart needs to look at it I guess
>
> I tried out the 2.6.1 kernel quickly and it did not exhibit the same
> obvious problems oopsing with dma and the ide controller as the latest
> 2.4 kernels (on my hardware at least). It booted up nicely without a
> problem on unmodified source. Whether or not the problem occurs for
> other types of hardware I can't say.
Hi,
[ Sorry for delay. ]
Glenn, your patch hides potential problems - these functions shouldn't be
called if host doesn't support DMA. However there is one place when
->ide_dma_host_off() shouldn't be called unconditionally, here is a patch.
It is not pretty but at least consistent - we check hwif->ide_dma_check
to see if DMA is supported in other places too. Does it fix the problem?
Are you sure that it doesn't happen on 2.6.1? Maybe you've used a bit
different config (ie. compiled without DMA support)?
Cheers,
--bart
--- ide-iops.c.orig 2003-12-06 17:47:27.000000000 +0100
+++ ide-iops.c 2004-01-24 00:17:32.129567576 +0100
@@ -912,7 +912,8 @@
// ide_delay_50ms();
#if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
- hwif->ide_dma_host_off(drive);
+ if (hwif->ide_dma_check) /* check if host supports DMA */
+ hwif->ide_dma_host_off(drive);
#endif /* (CONFIG_BLK_DEV_IDEDMA) && !(CONFIG_DMA_NONPCI) */
/*
> Glenn, your patch hides potential problems - these functions shouldn't be
> called if host doesn't support DMA. However there is one place when
> ->ide_dma_host_off() shouldn't be called unconditionally, here is a patch.
> It is not pretty but at least consistent - we check hwif->ide_dma_check
> to see if DMA is supported in other places too. Does it fix the problem?
Not quite. If we go forward with a patch like that, then it must be
updated to include at least two other places that I know of
immediately. The updated patch would be something similar to the one
at the bottom of this e-mail.
On a further note, how should hdparm -d behave on my controller, the
relivant lines from dmesg are:
SIS5513: IDE controller at PCI slot 00:01.1
SIS5513: chipset revision 8
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS551x ATA 16 controller
ide0: BM-DMA at 0xf870-0xf877, BIOS settings: hda:pio, hdb:pio
SIS5513: simplex device: DMA disabled
ide1: SIS5513 Bus-Master DMA disabled (BIOS)
hda: WDC AC21000H, ATA DISK drive
hdc: Maxtor 6Y080L0, ATA DISK drive
"hdparm -d 1 /dev/hdc" returns an operation not permitted, but "hdparm
-d 1 /dev/hda" is successful and results in future calls to "hdparm -d
1 /dev/hdc" seemingly locking the computer.
> Are you sure that it doesn't happen on 2.6.1? Maybe you've used a bit
> different config (ie. compiled without DMA support)?
Oops, on further research it did exhibit the exact same problem.
Sorry about the misdiagnosis.
Let me know how I can continue to help.
Glenn.
[Sorry about my tardy response]
--- linux-2.4.24-orig/drivers/ide/ide-iops.c 2003-11-28 18:26:20.000000000 +0000
+++ linux-2.4.24/drivers/ide/ide-iops.c 2004-01-27 05:19:41.000000000 +0000
@@ -912,7 +912,8 @@
// ide_delay_50ms();
#if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
- hwif->ide_dma_host_off(drive);
+ if (hwif->ide_dma_check) /* Check if host supports DMA */
+ hwif->ide_dma_host_off(drive);
#endif /* (CONFIG_BLK_DEV_IDEDMA) && !(CONFIG_DMA_NONPCI) */
/*
@@ -980,10 +981,13 @@
drive->id->dma_1word &= ~0x0F00;
#if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
- if (speed >= XFER_SW_DMA_0)
- hwif->ide_dma_host_on(drive);
- else
- hwif->ide_dma_off_quietly(drive);
+ if (speed >= XFER_SW_DMA_0) {
+ if (hwif->ide_dma_check) /* Check if host supports DMA */
+ hwif->ide_dma_host_on(drive);
+ } else {
+ if (hwif->ide_dma_check) /* Check if host supports DMA */
+ hwif->ide_dma_off_quietly(drive);
+ }
#endif /* (CONFIG_BLK_DEV_IDEDMA) && !(CONFIG_DMA_NONPCI) */
switch(speed) {
On Tuesday 27 of January 2004 06:52, Glenn Wurster wrote:
> > Glenn, your patch hides potential problems - these functions shouldn't be
> > called if host doesn't support DMA. However there is one place when
> > ->ide_dma_host_off() shouldn't be called unconditionally, here is a
> > patch. It is not pretty but at least consistent - we check
> > hwif->ide_dma_check to see if DMA is supported in other places too. Does
> > it fix the problem?
>
> Not quite. If we go forward with a patch like that, then it must be
> updated to include at least two other places that I know of
Doh. I overlooked one place.
IMO this check needs to be added only to two places.
> immediately. The updated patch would be something similar to the one
> at the bottom of this e-mail.
Did you test this patch?
> On a further note, how should hdparm -d behave on my controller, the
> relivant lines from dmesg are:
>
> SIS5513: IDE controller at PCI slot 00:01.1
> SIS5513: chipset revision 8
> SIS5513: not 100% native mode: will probe irqs later
> SIS5513: SiS551x ATA 16 controller
> ide0: BM-DMA at 0xf870-0xf877, BIOS settings: hda:pio, hdb:pio
> SIS5513: simplex device: DMA disabled
> ide1: SIS5513 Bus-Master DMA disabled (BIOS)
> hda: WDC AC21000H, ATA DISK drive
> hdc: Maxtor 6Y080L0, ATA DISK drive
>
> "hdparm -d 1 /dev/hdc" returns an operation not permitted, but "hdparm
> -d 1 /dev/hda" is successful and results in future calls to "hdparm -d
> 1 /dev/hdc" seemingly locking the computer.
I suspect that this is caused by unfinished handling of simplex devices
in setup-pci.c (simplex host - one DMA engine but two channels).
> @@ -980,10 +981,13 @@
> drive->id->dma_1word &= ~0x0F00;
>
> #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
> - if (speed >= XFER_SW_DMA_0)
> - hwif->ide_dma_host_on(drive);
> - else
> - hwif->ide_dma_off_quietly(drive);
> + if (speed >= XFER_SW_DMA_0) {
> + if (hwif->ide_dma_check) /* Check if host supports DMA */
> + hwif->ide_dma_host_on(drive);
I've seen this and decided that it is not needed.
If we try to program drives to DMA on non-DMA host
something is going wrong and it is better to just OOPS.
> + } else {
> + if (hwif->ide_dma_check) /* Check if host supports DMA */
> + hwif->ide_dma_off_quietly(drive);
> + }
Yep, I've missed this one.
Thanks,
--bart
> Doh. I overlooked one place.
> IMO this check needs to be added only to two places.
True, I added the check three times to emphasise the three different
calls which could potentially OOPS (at least upon initial
observation). It could be optimized into 2 checks.
> Did you test this patch?
Yes.
> I've seen this and decided that it is not needed.
>
> If we try to program drives to DMA on non-DMA host
> something is going wrong and it is better to just OOPS.
This makes sense. Did you want to update the patch for those proposed
changes (You're more familiar with the code than I - I'm reluctant to
play too much with code unless I understand what it is doing)? I'd be
willing to test an updated patch.
> I suspect that this is caused by unfinished handling of simplex
> devices in setup-pci.c (simplex host - one DMA engine but two
> channels).
I'm really not familiar with the complexities behind DMA
programming, especially when it comes to simplex devices so I'm
probably not in much of a position to finish up handling of simplex
devices.
Glenn.