2003-03-22 13:55:23

by Dominik Brodowski

Subject: 2.5.65-ac2 -- hda/ide trouble on ICH4

Hi Alan,

unfortunately 2.5.65-ac2 does not boot:

ICH4: IDE controller at PCI slot 00:1f.1
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
...
hda: IC35L080AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
...
hda: host protected area => 1
hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >

and *deadlock*...

in plain 2.5.65 I was seeing strange error messages like:

Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
Mar 19 20:29:55 mondschein kernel:

which are not repeatable when I switch back to 2.5.62.

lspci:
00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 01)
00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corp. 82801DB ISA Bridge (LPC) (rev 01)
00:1f.1 IDE interface: Intel Corp. 82801DB ICH4 IDE (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE]
02:03.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
02:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)

parts of dmesg in 2.5.62:
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
hda: IC35L080AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: HL-DT-STDVD-ROM GDR8161B, ATAPI CD/DVD-ROM drive
hdd: HL-DT-ST GCE-8480B, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: host protected area => 1
hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >


Dominik


2003-03-22 15:11:57

by Alan

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, 2003-03-22 at 14:03, Dominik Brodowski wrote:
> hda: host protected area => 1
> hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
> hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >
>
> and *deadlock*...

Where is the lock, what does the NMI oopser show ?

> in plain 2.5.65 I was seeing strange error messages like:
>
> Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
> Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
> Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
> Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
> Mar 19 20:29:55 mondschein kernel:

I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
code on 2.4 so far. Which is odd but I don't yet understand what is going on.

2003-03-22 16:14:09

by Dominik Brodowski

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> On Sat, 2003-03-22 at 14:03, Dominik Brodowski wrote:
> > hda: host protected area => 1
> > hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
> > hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >
> >
> > and *deadlock*...
>
> Where is the lock, what does the NMI oopser show ?

The lock is directly "below" that line -- and the NMI oopser isn't
triggered, AFAICT

> > in plain 2.5.65 I was seeing strange error messages like:
> >
> > Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
> > Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
> > Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
> > Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
> > Mar 19 20:29:55 mondschein kernel:
>
> I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> code on 2.4 so far. Which is odd but I don't yet understand what is going on.
/me neither, unfortunately :-(

Dominik

2003-03-22 16:18:52

by Alan

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
> > Where is the lock, what does the NMI oopser show ?
>
> The lock is directly "below" that line -- and the NMI oopser isn't
> triggered, AFAICT

Anything useful off right-alt scroll-lock etc ?

2003-03-22 16:28:04

by Jan Dittmer

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

Alan Cox wrote:
> On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
>
>>>Where is the lock, what does the NMI oopser show ?
>>
>>The lock is directly "below" that line -- and the NMI oopser isn't
>>triggered, AFAICT
>
>
> Anything useful off right-alt scroll-lock etc ?
>
I'm seeing the same problem. VIA KT133A platform, nothing after
partition detection. No NMI-Watchdog, no sysrq magic. Just dead.
Is there any particular patch I could try to revert?

Jan

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
ide0: BM-DMA at 0x9000-0x9007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x9008-0x900f, BIOS settings: hdc:DMA, hdd:DMA
hda: IC35L060AVV207-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: SONY CD-RW CRX175E2, ATAPI CD/DVD-ROM drive
hdd: Pioneer DVD-ROM ATAPIModel DVD-104S 012, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: host protected area => 1
hda: 120103200 sectors (61493 MB) w/1821KiB Cache, CHS=7476/255/63, UDMA(100)
/dev/ide/host0/bus0/target0/lun0: p1 p2 p3 < p5 p6 p7 >

Subject: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4


On Sat, 22 Mar 2003, Dominik Brodowski wrote:
> On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> >
> > I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> > code on 2.4 so far. Which is odd but I don't yet understand what is going on.
> /me neither, unfortunately :-(


Alan, I can trigger the same dma_intr bug under 2.4.21-pre5-ac3 but not
2.4.20-ac2 (VIA vt8235 + WD800JB so it is not controller/disk related).

Dominik could you try attached patch with vanilla 2.5.65?
It reverts previous logic of handling masked_irq argument of ide_do_request().

Previously callers called it with masked_irq=0 and disabling/enabling
hwif->irq code wasn't executed, now ide_do_request() is called with
masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.
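
To make the difference concrete, here is a minimal sketch of the guard as
described above. It is only a model: the helper name masks_hwif_irq() is
hypothetical, and it assumes the check around the disable/enable code is a
plain truthiness test on masked_irq combined with a comparison against
hwif->irq. The real ide_do_request() may be structured differently.

```c
/*
 * Hypothetical model of the guard around the IRQ disable/enable code.
 * ASSUMPTION: the guard tests masked_irq for non-zero and compares it
 * with the interface IRQ; this is not copied from the kernel source.
 */
static int masks_hwif_irq(int masked_irq, int hwif_irq)
{
	/* 1 means "the disable/enable path for hwif->irq would run" */
	return masked_irq && hwif_irq != masked_irq;
}

/* Convenience wrappers for the cases discussed, with ide0 on irq 14 */
int old_callers(void)      { return masks_hwif_irq(0, 14);  }
int no_irq_callers(void)   { return masks_hwif_irq(-1, 14); }
int same_irq_callers(void) { return masks_hwif_irq(14, 14); }
```

Under that assumption a caller passing 0 skips the masking code, a caller
passing -1 always runs it, and a caller passing the interface's own IRQ
skips it again.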

And no, I don't know wtf is exactly going on there :\.


[ Alan, please forget about yesterday's mail, I hit dma_intr again today
with yesterday's patch, with attached patch I hope it is gone now :-) ]

BTW 2.5.64-ac4 deadlocks for me the same way Dominik has described.


Greets
--
Bartlomiej


Attachments:
2.5.65-dma_intr-fix.diff (1.31 kB)

2003-03-22 22:04:45

by Alan

Subject: Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, 2003-03-22 at 22:03, Bartlomiej Zolnierkiewicz wrote:
> Previously callers called it with masked_irq=0 and disabling/enabling
> hwif->irq code wasn't executed, now ide_do_request() is called with
> masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.


You are right - I botched the simplification of that. With a bit more thought
the logic is actually cleaner than what I did - IDE_NO_IRQ can go away and
we should be using hwif->irq as the argument.


Subject: Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4


On 22 Mar 2003, Alan Cox wrote:

> On Sat, 2003-03-22 at 22:03, Bartlomiej Zolnierkiewicz wrote:
> > Previously callers called it with masked_irq=0 and disabling/enabling
> > hwif->irq code wasn't executed, now ide_do_request() is called with
> > masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.
>
> You are right - I botched the simplification of that. The logic is actually
> cleaner than I did with a bit more thought - IDE_NO_IRQ can go away and
> we should be using hwif->irq as the argument.

I think so.

--
Bartlomiej

2003-03-23 00:51:35

by Dominik Brodowski

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, Mar 22, 2003 at 05:42:02PM +0000, Alan Cox wrote:
> On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
> > > Where is the lock, what does the NMI oopser show ?
> >
> > The lock is directly "below" that line -- and the NMI oopser isn't
> > triggered, AFAICT
>
> Anything useful off right-alt scroll-lock etc ?

not from this debugging source - USB wireless keyboard :) - however, ~1000
printks later I've found out the following: the kernel spins in the while()
loop in ide_register_driver (drivers/ide/ide.c):

	while (!list_empty(&list)) {
		ide_drive_t *drive = list_entry(list.next, ide_drive_t,
						list);
		list_del_init(&drive->list);
		if (drive->present)
			ata_attach(drive);
	}


ide_register_driver was itself called by idedisk_init.

Dominik

2003-03-23 09:01:08

by Dominik Brodowski

Subject: Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sat, Mar 22, 2003 at 11:03:33PM +0100, Bartlomiej Zolnierkiewicz wrote:
>
> On Sat, 22 Mar 2003, Dominik Brodowski wrote:
> > On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> > >
> > > I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> > > code on 2.4 so far. Which is odd but I don't yet understand what is going on.
> > /me neither, unfortunately :-(
>
>
> Alan, I can trigger the same dma_intr bug under 2.4.21-pre5-ac3 but not
> 2.4.20-ac2 (VIA vt8235 + WD800JB so it is not controller/disk related).
>
> Dominik could you try attached patch with vanilla 2.5.65?
> It reverts previous logic of handling masked_irq argument of ide_do_request().

Seems to work fine over here...

> BTW 2.5.64-ac4 deadlocks for me the same way Dominik has described.

plain 2.5.65 does not, but 2.5.65-bk-current does.

Dominik

2003-03-23 14:24:25

by Alan

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sun, 2003-03-23 at 01:03, Dominik Brodowski wrote:
> while (!list_empty(&list)) {
> ide_drive_t *drive = list_entry(list.next, ide_drive_t,
> list);
> list_del_init(&drive->list);
> if (drive->present)
> ata_attach(drive);

Can you boot and printk the drive name out each iteration to see if the list
is hosed somewhere.

list is the list of all the drives. We pull the drive off the list
and attach it to the relevant device driver (or ide-default if none
is known).

The behaviour you see certainly sounds like the list got corrupted
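
For reference, the drain loop described here can be modelled in userspace.
This is only a sketch: the list helpers below are simplified stand-ins for
the kernel's <linux/list.h>, struct drive is a hypothetical cut-down
ide_drive_t, and counting present drives stands in for ata_attach().

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down stand-in for ide_drive_t */
struct drive {
	const char *name;
	int present;
	struct list_head list;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

/* Pull every drive off the private list; "attach" the ones that exist */
static int drain_and_attach(struct list_head *list)
{
	int attached = 0;

	while (!list_empty(list)) {
		struct drive *d = list_entry(list->next, struct drive, list);

		list_del_init(&d->list);
		if (d->present)
			attached++;	/* stand-in for ata_attach(drive) */
	}
	return attached;
}

int demo_drain(void)
{
	struct drive d[4] = {
		{ "hda", 1, { NULL, NULL } },
		{ "hdb", 0, { NULL, NULL } },	/* empty bay */
		{ "hdc", 1, { NULL, NULL } },
		{ "hdd", 1, { NULL, NULL } },
	};
	struct list_head list;
	int i;

	init_list_head(&list);
	for (i = 0; i < 4; i++)
		list_add_tail(&d[i].list, &list);
	return drain_and_attach(&list);
}
```

With an intact list the loop ends once every entry has been removed, so a
loop that keeps printing the same drive means something is putting entries
back on the list or the links are damaged.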

2003-03-23 14:44:47

by Dominik Brodowski

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sun, Mar 23, 2003 at 03:47:54PM +0000, Alan Cox wrote:
> On Sun, 2003-03-23 at 01:03, Dominik Brodowski wrote:
> > while (!list_empty(&list)) {
> > ide_drive_t *drive = list_entry(list.next, ide_drive_t,
> > list);
> > list_del_init(&drive->list);
> > if (drive->present)
> > ata_attach(drive);
>
> Can you boot and printk the drive name out each iteration see if the list
> is hosed somewhere.

printk("%4s\n", drive->name) prints out "hdd" all the time.

hda is an ATA disk drive
hdb is empty
hdc is an ATAPI CD/DVD-ROM drive
hdd is an ATAPI CD-ROM CD-R/RW drive

Dominik

2003-03-23 17:17:35

by Alan

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

> printk("%4s\n", drive->name) prints out "hdd" all the time.
>
> hda is an ATA disk drive
> hdb is empty
> hdc is an ATAPI CD/DVD-ROM drive
> hdd is an ATAPI CD-ROM CD-R/RW drive

This gets weirder by the minute, and I still can't get it to happen here
annoyingly.

The list that's breaking is a private list. We delete the drive from that
list; if it's present (it may be an empty bay) we then use ata_attach
to add it to a device list (or back to ata_unused).

I find it hard to believe something like this is a compiler bug, but right
now I don't see how stuff is reappearing on the list.

2003-03-23 18:04:38

by Dominik Brodowski

Subject: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4

On Sun, Mar 23, 2003 at 06:41:10PM +0000, Alan Cox wrote:
> > printk("%4s\n", drive->name) prints out "hdd" all the time.
> >
> > hda is an ATA disk drive
> > hdb is empty
> > hdc is an ATAPI CD/DVD-ROM drive
> > hdd is an ATAPI CD-ROM CD-R/RW drive
>
> This gets weirder by the minute, and I still can't get it to happen here
> annoyingly.
>
> The list thats breaking is a private list. We delete the drive from that
> list, if its present (it may be an empty bay) we then use ata_attach
> to add it to a device list (or back to ata_unused).
>
> I find it hard to believe something like this is a compiler bug, but right
> now I don't see how stuff is reappearing on the list.

Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
suspicious. Might be better to use list_for_each_safe() which is designed
exactly for this purpose. I'm currently recompiling
2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
it works with this kernel, too...

Dominik

--- linux/drivers/ide/ide.c.original 2003-03-23 19:08:40.000000000 +0100
+++ linux/drivers/ide/ide.c 2003-03-23 19:10:25.000000000 +0100
@@ -2392,6 +2392,8 @@
int ide_register_driver(ide_driver_t *driver)
{
struct list_head list;
+ struct list_head *list_loop;
+ struct list_head *tmp_storage;

spin_lock(&drivers_lock);
list_add(&driver->drivers, &drivers);
@@ -2402,8 +2404,8 @@
list_splice_init(&ata_unused, &list);
spin_unlock(&drives_lock);

- while (!list_empty(&list)) {
- ide_drive_t *drive = list_entry(list.next, ide_drive_t, list);
+ list_for_each_safe(list_loop, tmp_storage, &list) {
+ ide_drive_t *drive = container_of(list_loop, ide_drive_t, list);
list_del_init(&drive->list);
if (drive->present)
ata_attach(drive);
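
As a rough userspace illustration of why the safe iterator behaves
differently: it saves the next pointer before the body runs, so each entry
that was on the list at the start is visited once, even if the body moves
that entry onto another list. The helpers below are simplified stand-ins
for <linux/list.h>, and struct drive and the "attached" list are
hypothetical; this is a sketch, not the kernel code.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the kernel macro: n always holds the saved successor */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

struct drive {			/* hypothetical stand-in for ide_drive_t */
	const char *name;
	int present;
	struct list_head list;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

static int list_len(const struct list_head *h)
{
	const struct list_head *p;
	int n = 0;

	for (p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Returns the number of loop iterations; *attached_out gets the new list size */
int demo_safe_walk(int *attached_out)
{
	struct drive d[4] = {
		{ "hda", 1, { NULL, NULL } },
		{ "hdb", 0, { NULL, NULL } },
		{ "hdc", 1, { NULL, NULL } },
		{ "hdd", 1, { NULL, NULL } },
	};
	struct list_head list, attached, *pos, *tmp;
	int i, visits = 0;

	init_list_head(&list);
	init_list_head(&attached);
	for (i = 0; i < 4; i++)
		list_add_tail(&d[i].list, &list);

	list_for_each_safe(pos, tmp, &list) {
		struct drive *drv = list_entry(pos, struct drive, list);

		list_del_init(&drv->list);
		if (drv->present)	/* stand-in for ata_attach(drive) */
			list_add_tail(&drv->list, &attached);
		visits++;
	}
	*attached_out = list_len(&attached);
	return visits;
}
```

Note that this only bounds the number of iterations; it does not by itself
explain how entries reappear on the original list.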

2003-03-23 18:15:05

by Dominik Brodowski

Subject: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

On Sun, Mar 23, 2003 at 07:15:33PM +0100, Dominik Brodowski wrote:
> On Sun, Mar 23, 2003 at 06:41:10PM +0000, Alan Cox wrote:
> > > printk("%4s\n", drive->name) prints out "hdd" all the time.
> > >
> > > hda is an ATA disk drive
> > > hdb is empty
> > > hdc is an ATAPI CD/DVD-ROM drive
> > > hdd is an ATAPI CD-ROM CD-R/RW drive
> >
> > This gets weirder by the minute, and I still can't get it to happen here
> > annoyingly.
> >
> > The list thats breaking is a private list. We delete the drive from that
> > list, if its present (it may be an empty bay) we then use ata_attach
> > to add it to a device list (or back to ata_unused).
> >
> > I find it hard to believe something like this is a compiler bug, but right
> > now I don't see how stuff is reappearing on the list.
>
> Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
> suspicious. Might be better to use list_for_each_safe() which is designed
> exactly for this purpouse. I'm currently recompiling
> 2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
> it works with this kernel, too...

Yes, it also works with 2.5.65-bkX.

--- linux/drivers/ide/ide.c.original 2003-03-23 19:08:40.000000000 +0100
+++ linux/drivers/ide/ide.c 2003-03-23 19:10:25.000000000 +0100
@@ -2392,6 +2392,8 @@
int ide_register_driver(ide_driver_t *driver)
{
struct list_head list;
+ struct list_head *list_loop;
+ struct list_head *tmp_storage;

spin_lock(&drivers_lock);
list_add(&driver->drivers, &drivers);
@@ -2402,8 +2404,8 @@
list_splice_init(&ata_unused, &list);
spin_unlock(&drives_lock);

- while (!list_empty(&list)) {
- ide_drive_t *drive = list_entry(list.next, ide_drive_t, list);
+ list_for_each_safe(list_loop, tmp_storage, &list) {
+ ide_drive_t *drive = container_of(list_loop, ide_drive_t, list);
list_del_init(&drive->list);
if (drive->present)
ata_attach(drive);

2003-03-23 22:05:35

by Jan Dittmer

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

Dominik Brodowski wrote:
> On Sun, Mar 23, 2003 at 07:15:33PM +0100, Dominik Brodowski wrote:
>>Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
>>suspicious. Might be better to use list_for_each_safe() which is designed
>>exactly for this purpouse. I'm currently recompiling
>>2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
>>it works with this kernel, too...
>
>
> Yes, it also works with 2.5.65-bkX.
>

Yes, my system also boots again :)

Thanks,

Jan

2003-03-24 07:41:28

by Alexander Atanasov

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

Hello,

On Sun, 23 Mar 2003, Dominik Brodowski wrote:

> Yes, it also works with 2.5.65-bkX.

Dominik, can you try this patch on top of 2.5.65-ac3/bk?
I can't reproduce the hang, but it seems that drives without a driver can end
up on both ata_unused and idedefault_driver.drives, and the lists go nuts.
The patch kills ata_unused and uses idedefault_driver.drives only;
it boots fine here. I'd guess you have ide-cd as a module, and the two drives
handled by it cause the trouble: the first joins the lists, the second causes
the loop.

--
have fun,
alex

===== drivers/ide/ide.c 1.52 vs edited =====
--- 1.52/drivers/ide/ide.c Sun Mar 23 02:00:50 2003
+++ edited/drivers/ide/ide.c Mon Mar 24 08:48:54 2003
@@ -469,7 +469,6 @@
return -ENXIO;
}

-static LIST_HEAD(ata_unused);
static spinlock_t drives_lock = SPIN_LOCK_UNLOCKED;
static spinlock_t drivers_lock = SPIN_LOCK_UNLOCKED;
static LIST_HEAD(drivers);
@@ -1440,9 +1439,6 @@
spin_unlock(&drivers_lock);
if(idedefault_driver.attach(drive) != 0)
panic("ide: default attach failed");
- spin_lock(&drives_lock);
- list_add_tail(&drive->list, &ata_unused);
- spin_unlock(&drives_lock);
return 1;
}

@@ -2399,7 +2395,7 @@

spin_lock(&drives_lock);
INIT_LIST_HEAD(&list);
- list_splice_init(&ata_unused, &list);
+ list_splice_init(&idedefault_driver.drives, &list);
spin_unlock(&drives_lock);

while (!list_empty(&list)) {

2003-03-24 09:57:38

by norbert_wolff

Subject: PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers

Hi !

I tried linux-2.5.65-ac3 with ide-disk and ide-scsi drivers both built into
the kernel.
Two ide-disks attached to ide0, two CDROMS attached to ide1.
I'm using the sis5513-PCI-IDE-Driver, but configuring it out makes no difference.

The System hangs while booting with last messages
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15

It seems that all Drives went twice in the driver-list, so ata_attach
gets called twice for them, which leads to the hang.

Dirty Hack that works for me below.

Regards,

Norbert


--- drivers/ide/ide.c.orig 2003-03-24 10:52:40.000000000 +0000
+++ drivers/ide/ide.c 2003-03-24 10:57:52.000000000 +0000
@@ -1423,6 +1423,14 @@
{
struct list_head *p;
spin_lock(&drivers_lock);
+#define _DEBUG 1
+#ifdef _DEBUG
+ printk("ata_attach called for %s\n", drive->name);
+#endif
+ if (drive->already_attached) {
+ printk ("ata_attach: already attached for %s !\n", drive->name);
+ return 0;
+ }
list_for_each(p, &drivers) {
ide_driver_t *driver = list_entry(p, ide_driver_t, drivers);
if (!try_module_get(driver->owner))
@@ -1431,12 +1439,14 @@
if (driver->attach(drive) == 0) {
module_put(driver->owner);
drive->gendev.driver = &driver->gen_driver;
+ drive->already_attached = 1;
return 0;
}
spin_lock(&drivers_lock);
module_put(driver->owner);
}
drive->gendev.driver = &idedefault_driver.gen_driver;
+ drive->already_attached = 1;
spin_unlock(&drivers_lock);
if(idedefault_driver.attach(drive) != 0)
panic("ide: default attach failed");
--- include/linux/ide.h.orig 2003-03-24 11:01:41.000000000 +0000
+++ include/linux/ide.h 2003-03-24 11:02:33.000000000 +0000
@@ -791,6 +791,7 @@
int forced_lun; /* if hdxlun was given at boot */
int lun; /* logical unit */
int crc_count; /* crc counter to reduce drive speed */
+ int already_attached; /* Dirty Hack ... */
struct list_head list;
struct device gendev;
struct gendisk *disk;

2003-03-24 12:30:50

by Alan

Subject: Re: PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers

On Mon, 2003-03-24 at 11:08, Norbert Wolff wrote:
> Hi !
>
> I tried linux-2.5.65-ac3 with ide-disk and ide-scsi drivers both built in
> the Kernel.
> Two ide-disks attached to ide0, two CDROMS attached to ide1.
> Im using the sis5513-PCI-IDE-Driver, but configuring it out makes no difference.

See the -ac4 tree for a cleaner fix from Dominik

2003-03-24 12:35:47

by Alan

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

On Mon, 2003-03-24 at 09:55, Alexander Atanasov wrote:
> i can't reproduce the hang but it seems that drives without driver can get
> both in ata_unused and idedefault_driver.drives and lists go nuts.
> It kills ata_unused and uses idedefault_driver.drives only,
> boots fine here. I'd guess you have ide-cd as module, and the two drives
> handled by it couse the trouble - first joins the lists second couses the
> loop.

We need to know the difference between the two really so I would much rather
ensure we don't end up on both lists at once (which is a bug) than lose a
list

2003-03-24 15:55:26

by Alexander Atanasov

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

Hello,

On 24 Mar 2003 13:59:33 +0000
Alan Cox <[email protected]> wrote:

> On Mon, 2003-03-24 at 09:55, Alexander Atanasov wrote:
> > i can't reproduce the hang but it seems that drives without driver
> > can get both in ata_unused and idedefault_driver.drives and lists go
> > nuts. It kills ata_unused and uses idedefault_driver.drives only,
> > boots fine here. I'd guess you have ide-cd as module, and the two
> > drives handled by it couse the trouble - first joins the lists
> > second couses the loop.
>
> We need to know the difference between the two really so I would much
> rather ensure we don't end up on both lists at once (which is a bug)
> than lose a list
>

I don't understand: what's the difference, and how is the list lost?
ata_unused used to hold all drives that were not claimed by any driver;
now idedefault_driver claims all those drives, and every drive goes on the
.list of its driver. ide_register_driver wants to take all unused drives and
attach them to the newly registered driver, so it takes all drives that use
idedefault_driver and tries; if they fail to find a driver they end up
again with the same driver and list (idedefault_driver). I think
idedefault_driver.list and ata_unused have become duplicates, and the proper
place to hold drives with no real driver is idedefault_driver, hence the
patch. The list from ata_unused becomes idedefault_driver.list, and does
exactly the same as ata_unused. I want to understand where I'm wrong,
please?

The bug is there, and waiting to explode, keeping both lists would mean to
add one more list head in ide_drive_t, is that the fix you want?

--
have fun,
alex

2003-03-24 16:16:23

by Alan

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> I don't understand, what's the difference and how the list is lost?
> ata_unused used to hold all drives that were not claimed by any driver,
> now idedefault_driver claims all that drives, all drives go in the .list

ata_unused -> unattached device slots, new hotplug discoveries
idedefault_driver -> attached/known devices with no driver
other list -> driven by that driver

> The bug is there, and waiting to explode, keeping both lists would mean to
> add one more list head in ide_drive_t, is that the fix you want?

I don't see where stuff is ending up on both lists yet. I've not had time to look
hard at it though

2003-03-24 17:20:22

by Alexander Atanasov

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

Hello, Alan!

On 24 Mar 2003 17:40:08 +0000
Alan Cox <[email protected]> wrote:

> On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > I don't understand, what's the difference and how the list is
> > lost?
> > ata_unused used to hold all drives that were not claimed by any
> > driver, now idedefault_driver claims all that drives, all drives go
> > in the .list
>
> ata_unused -> unattached device slots, new hotplug discoveries

Ok.

> idedefault_driver -> attached/known devices with no driver
> other list -> driven by that driver
>
> > The bug is there, and waiting to explode, keeping both lists would
> > mean to add one more list head in ide_drive_t, is that the fix
> > you want?
>
> I don't see where stuff is ending up on both lists yet. I've not had
> time to look hard at it though
>

It happens this way:
ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&drive->list, &driver->drives) ->
return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
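
The effect of that sequence can be reproduced in a small userspace model.
This is a sketch under assumptions: the list helpers are simplified
stand-ins for <linux/list.h>, the two heads are hypothetical stand-ins for
idedefault_driver.drives and ata_unused, and bounded_len() is a helper
invented here to detect a walk that never returns to its head.

```c
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_insert(struct list_head *n,
			struct list_head *prev, struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

static void list_add(struct list_head *n, struct list_head *head)
{
	list_insert(n, head, head->next);
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	list_insert(n, head->prev, head);
}

/*
 * Count entries on a list, giving up after 'limit' steps.
 * Returns -1 if the walk never gets back to the head.
 */
static int bounded_len(const struct list_head *head, int limit)
{
	const struct list_head *p = head->next;
	int n = 0;

	while (p != head) {
		if (++n > limit)
			return -1;
		p = p->next;
	}
	return n;
}

/* One node added to two lists with no list_del() in between */
int demo_double_add(int *unused_len)
{
	struct list_head default_drives, unused, drive;

	init_list_head(&default_drives);
	init_list_head(&unused);

	list_add(&drive, &default_drives);	/* ide_register_subdriver step */
	list_add_tail(&drive, &unused);		/* back in ata_attach */

	*unused_len = bounded_len(&unused, 16);
	return bounded_len(&default_drives, 16);
}
```

After the second add, the first head still points at the node, but the
node's links now lead into the second list, so a walk that starts from the
first head cycles between the node and the second head and never terminates.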

--
have fun,
alex

2003-03-25 04:08:14

by Andre Hedrick

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]


This is one thing all of you don't get about hotplug.

You are not allowed on PATA, only SATA.

The BIOS and setup on the HBA's need a kick to come alive.
There are basic hooks that do not permit post boot hotplug in PATA.

Cheers,

On Mon, 24 Mar 2003, Alexander Atanasov wrote:

> Hello, Alan!
>
> On 24 Mar 2003 17:40:08 +0000
> Alan Cox <[email protected]> wrote:
>
> > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > I don't understand, what's the difference and how the list is
> > > lost?
> > > ata_unused used to hold all drives that were not claimed by any
> > > driver, now idedefault_driver claims all that drives, all drives go
> > > in the .list
> >
> > ata_unused -> unattached device slots, new hotplug discoveries
>
> Ok.
>
> > idedefault_driver -> attached/known devices with no driver
> > other list -> driven by that driver
> >
> > > The bug is there, and waiting to explode, keeping both lists would
> > > mean to add one more list head in ide_drive_t, is that the fix
> > > you want?
> >
> > I don't see where stuff is ending up on both lists yet. I've not had
> > time to look hard at it though
> >
>
> It happens this way:
> ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
>
> --
> have fun,
> alex
>

Andre Hedrick
LAD Storage Consulting Group

2003-03-25 12:36:03

by Alan

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]

On Tue, 2003-03-25 at 04:16, Andre Hedrick wrote:
> This is one thing all of you don't get about hotplug.
>
> You are not allowed on PATA, only SATA.
>
> The BIOS and setup on the HBA's need a kick to come alive.
> There are basic hooks that do not permit post boot hotplug in PATA.

Several vendors support bus tristate handling. We now do error
handling on that. It's a first step towards being able to rescan
the bus.

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]


On Mon, 24 Mar 2003, Alexander Atanasov wrote:
> Hello, Alan!
>
> On 24 Mar 2003 17:40:08 +0000
> Alan Cox <[email protected]> wrote:
>
> > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > I don't understand, what's the difference and how the list is
> > > lost?
> > > ata_unused used to hold all drives that were not claimed by any
> > > driver, now idedefault_driver claims all that drives, all drives go
> > > in the .list
> >
> > ata_unused -> unattached device slots, new hotplug discoveries
>
> Ok.
>
> > idedefault_driver -> attached/known devices with no driver
> > other list -> driven by that driver
> >
> > > The bug is there, and waiting to explode, keeping both lists would
> > > mean to add one more list head in ide_drive_t, is that the fix
> > > you want?
> >
> > I don't see where stuff is ending up on both lists yet. I've not had
> > time to look hard at it though
> >
>
> It happens this way:
> ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> return to ata_attach -> list_add_tail(&drive->list, &ata_unused);

Exactly.

Alan, if we want to keep ata_unused, we should remove
list_add_tail(&drive->list, &ata_unused) from ata_attach()
and perhaps use (after fixing it to handle idedefault_driver)
ide_replace_subdriver() for driver switching for drives owned
by idedefault_driver.

BTW in ide_register_driver() we don't use ide_drives lock to protect
drive->list changes, why?

--
bzolnier

> --
> have fun,
> alex
>

Subject: Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]


On Tue, 25 Mar 2003, Bartlomiej Zolnierkiewicz wrote:

>
> On Mon, 24 Mar 2003, Alexander Atanasov wrote:
> > Hello, Alan!
> >
> > On 24 Mar 2003 17:40:08 +0000
> > Alan Cox <[email protected]> wrote:
> >
> > > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > > I don't understand, what's the difference and how the list is
> > > > lost?
> > > > ata_unused used to hold all drives that were not claimed by any
> > > > driver, now idedefault_driver claims all that drives, all drives go
> > > > in the .list
> > >
> > > ata_unused -> unattached device slots, new hotplug discoveries
> >
> > Ok.
> >
> > > idedefault_driver -> attached/known devices with no driver
> > > other list -> driven by that driver
> > >
> > > > The bug is there, and waiting to explode, keeping both lists would
> > > > mean to add one more list head in ide_drive_t, is that the fix
> > > > you want?
> > >
> > > I don't see where stuff is ending up on both lists yet. I've not had
> > > time to look hard at it though
> > >
> >
> > It happens this way:
> > ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> > return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
>
> Exactly.
>
> Alan, if we want to keep ata_unused, we should remove
> list_add_tail(&drive->list, &ata_unused) from ata_attach()
> and perhaps use (after fixing it to handle idedefault_driver)
> ide_replace_subdriver() for driver switching for drives owned
> by idedefault_driver.
>
> BTW in ide_register_driver() we don't use ide_drives lock to protect
> drive->list changes, why?

drives_lock lock of course, writing from memory :-)

>
> --
> bzolnier
>
> > --
> > have fun,
> > alex