Subject: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

Juergen, David; please review this patch, and recommend that Marcelo apply
this patch if you think it is okay... Thanks!

Attached is a patch against linux-2.4.22-pre6, to fix a hang problem when an
Adaptec SlimSCSI (PCMCIA) adapter is ejected from a PCMCIA card reader (e.g.
yenta/TI1225 based).

The fix involves:

1. A change to the common aha152x driver to ignore an interrupt in the
top-half handler if it cannot read valid data from the I/O ports (possibly
due to a bad host-adapter chip or an ejected PCMCIA card). This way, a
shared interrupt handler (e.g. yenta) can pick up the interrupt if the IRQ
is really meant for it. This is where the original hang was taking place;
the aha152x bottom half was getting into an infinite loop, though the
SlimSCSI card had been ejected, and the actual IRQ was meant for yenta.

2. A change to the aha152x_cs stub driver to not use the SCSI error-handling
thread code. The aha152x_cs driver calls scsi_unregister_module() as a
queued timer task when it gets a CS_EVENT_CARD_REMOVAL event, which causes
scsi_unregister_host() to do a down() on a semaphore, calling schedule(),
when executing the timer_bh for the timer.
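
The check in item 1 can be sketched as a small user-space simulation. This is not the attached patch: sim_inb(), the port numbers, and every other sim_* name are hypothetical stand-ins, and the sketch assumes an absent card reads back 0xFF on every port. It only illustrates the idea that the top half returns early, without scheduling its bottom half, when the registers look invalid, so that the other handler on the shared line deals with the interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port numbers, for illustration only. */
enum { SIM_PORT_A = 0x100, SIM_PORT_B = 0x101 };

bool sim_card_present = true;  /* flipped to false to model an eject */
int sim_scsi_bh_scheduled;     /* times the SCSI bottom half was scheduled */
int sim_bridge_handled;        /* times the bridge handler ran */

/* Stand-in for inb(): assume an empty port reads back all ones. */
uint8_t sim_inb(uint16_t port)
{
    (void)port;
    return sim_card_present ? 0x01 : 0xFF;
}

/* SCSI top half: ignore the interrupt if the registers look invalid. */
void sim_scsi_top_half(void)
{
    if (sim_inb(SIM_PORT_A) == 0xFF && sim_inb(SIM_PORT_B) == 0xFF)
        return;                /* card gone: do not schedule the bottom half */
    sim_scsi_bh_scheduled++;
}

/* The other handler sharing the IRQ line (e.g. the card bridge). */
void sim_bridge_handler(void)
{
    sim_bridge_handled++;
}

typedef void (*sim_irq_handler)(void);

/* Deliver one interrupt to every handler registered on the shared line. */
void sim_raise_shared_irq(void)
{
    static const sim_irq_handler chain[] = {
        sim_scsi_top_half,
        sim_bridge_handler,
    };

    for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++) {
        assert(chain[i] != NULL);
        chain[i]();
    }
}
```

With the card present, one call to sim_raise_shared_irq() schedules the SCSI bottom half once; after sim_card_present is cleared, further interrupts reach only the bridge handler.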

Thanks!

- Bhavesh
--
Bhavesh P. Davda E-mail : [email protected]
Avaya Inc. Phone/Fax : (303) 538-4438
Room B3-B03, 1300 West 120th Avenue
Westminster, CO 80234


Attachments:
linux-2.4.22-aha152x.patch (1.73 kB)

Subject: RE: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

Whoops! I sent it without proofreading. I AM SORRY! Please ignore the
previous patch; here is the correct one ...

Thanks
- Bhavesh

--
Bhavesh P. Davda E-mail : [email protected]
Avaya Inc. Phone/Fax : (303) 538-4438
Room B3-B03, 1300 West 120th Avenue
Westminster, CO 80234

> -----Original Message-----
> From: Bhavesh P. Davda
> Sent: Tuesday, July 15, 2003 3:57 PM
> To: [email protected]
> Cc: [email protected]; [email protected]; Marcelo Tosatti
> Subject: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel
> 2.4.22-pre6
>
>
> Juergen, David; please review this patch, and recommend that
> Marcelo apply this patch if you think it is okay... Thanks!
>
> Attached is a patch against linux-2.4.22-pre6, to fix a hang
> problem when an Adaptec SlimSCSI (PCMCIA) adapter is ejected from
> a PCMCIA card reader (e.g. yenta/TI1225 based).
>
> The fix involves:
>
> 1. A change to the common aha152x driver to ignore an interrupt
> in the top-half handler if it cannot read valid data from the I/O
> ports (possibly due to a bad host-adapter chip or an ejected
> PCMCIA card). This way, a shared interrupt handler (e.g. yenta)
> can pick up the interrupt if the IRQ is really meant for it. This
> is where the original hang was taking place; the aha152x bottom
> half was getting into an infinite loop, though the SlimSCSI card
> had been ejected, and the actual IRQ was meant for yenta.
>
> 2. A change to the aha152x_cs stub driver to not use the SCSI
> error-handling thread code. The aha152x_cs driver calls
> scsi_unregister_module() as a queued timer task when it gets a
> CS_EVENT_CARD_REMOVAL event, which causes scsi_unregister_host()
> to do a down() on a semaphore, calling schedule(), when executing
> the timer_bh for the timer.
>
> Thanks!
>
> - Bhavesh
> --
> Bhavesh P. Davda E-mail : [email protected]
> Avaya Inc. Phone/Fax : (303) 538-4438
> Room B3-B03, 1300 West 120th Avenue
> Westminster, CO 80234


Attachments:
linux-2.4.22-aha152x.patch (1.98 kB)

2003-07-16 13:04:23

by Alan

Subject: Re: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

On Tue, 2003-07-15 at 22:56, Bhavesh P. Davda wrote:
> 2. A change to the aha152x_cs stub driver to not use the SCSI error-handling
> thread code. The aha152x_cs driver calls scsi_unregister_module() as a
> queued timer task when it gets a CS_EVENT_CARD_REMOVAL event, which causes
> scsi_unregister_host() to do a down() on a semaphore, calling schedule(),
> when executing the timer_bh for the timer.

Right - scsi_unregister should not be called on a timer event, instead
it needs to kick off a task queue

Subject: RE: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

> -----Original Message-----
> From: Alan Cox [mailto:[email protected]]
> Sent: Wednesday, July 16, 2003 7:16 AM
> To: Bhavesh P. Davda
> Cc: Linux Kernel Mailing List; [email protected];
> [email protected]; Marcelo Tosatti
> Subject: Re: [PATCH] AHA152x driver hangs on PCMCIA card eject,
> kernel 2.4.22-pre6
>
>
> On Tue, 2003-07-15 at 22:56, Bhavesh P. Davda wrote:
> > 2. A change to the aha152x_cs stub driver to not use the SCSI
> > error-handling thread code. The aha152x_cs driver calls
> > scsi_unregister_module() as a queued timer task when it gets a
> > CS_EVENT_CARD_REMOVAL event, which causes scsi_unregister_host()
> > to do a down() on a semaphore, calling schedule(), when executing
> > the timer_bh for the timer.
>
> Right - scsi_unregister should not be called on a timer event, instead
> it needs to kick off a task queue


Thanks. I will give this a shot for the aha152x_cs stub driver, and post
another patch (unless David Hinds or Juergen Fischer get around to it
first).

However, the same problem exists in all the PCMCIA SCSI stub drivers, which
have all chosen NOT to use the SCSI error-handling thread, by setting
use_new_eh_code to 0 in the Scsi_Host_Template. I don't feel comfortable
posting patches for those stub drivers (fdomain, nsp, qlogic) to do the same,
since I don't have the hardware to test with.

Also, I wanted to warn you of a couple of leaps of faith I had to make to
answer questions I had for this patch:

1. Can the AIC-6360 host adapter ever have all 1's in the REV and DMACNTRL0
registers? I am guessing not, but waiting on specs from Adaptec to answer
this question.

2. What happens if there is no physical device hanging off an I/O port
address? I am guessing, that on an i386 host, the inb returns 0xFF, but am
not sure what happens on other architectures. I have a question outstanding
to Intel for this.

Thanks

- Bhavesh
--
Bhavesh P. Davda E-mail : [email protected]
Avaya Inc. Phone/Fax : (303) 538-4438
Room B3-B03, 1300 West 120th Avenue
Westminster, CO 80234

2003-07-17 17:40:50

by David Hinds

Subject: Re: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

On Thu, Jul 17, 2003 at 08:15:39AM -0600, Bhavesh P. Davda wrote:

> > Right - scsi_unregister should not be called on a timer event, instead
> > it needs to kick off a task queue

The removal timers need to be taken out from most *_cs drivers; they
are a holdover from when card removal events were delivered in
interrupt context, and when that was changed to an event handler
thread, drivers were not changed accordingly. The removal routine
should now just be called in-line instead of firing up a timer.
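
A rough user-space model of the difference, for illustration only: the sim_* names are hypothetical and nothing here is taken from the *_cs drivers. It assumes the removal routine may sleep, that timer callbacks run in bottom-half context where sleeping is not allowed, and that the event handler thread runs in process context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sim_ctx { SIM_CTX_PROCESS, SIM_CTX_BH };

enum sim_ctx sim_current_ctx = SIM_CTX_PROCESS;
int sim_sleep_violations;  /* attempts to sleep outside process context */
int sim_releases_done;     /* removals that completed */

/* Stand-in for a removal routine that takes a semaphore and may sleep. */
void sim_release(void)
{
    if (sim_current_ctx != SIM_CTX_PROCESS) {
        sim_sleep_violations++;  /* would sleep in bottom-half context */
        return;
    }
    sim_releases_done++;
}

void (*sim_pending_timer)(void);  /* one-slot model of a timer queue */

/* Old pattern: the removal event only arms a timer. */
void sim_on_removal_deferred(void)
{
    sim_pending_timer = sim_release;
}

/* Timers expire in bottom-half context in this model. */
void sim_run_timers(void)
{
    void (*fn)(void) = sim_pending_timer;

    if (fn == NULL)
        return;
    sim_pending_timer = NULL;
    sim_current_ctx = SIM_CTX_BH;
    fn();
    sim_current_ctx = SIM_CTX_PROCESS;
}

/* Suggested pattern: the event thread calls the removal routine directly. */
void sim_on_removal_inline(void)
{
    assert(sim_current_ctx == SIM_CTX_PROCESS);
    sim_release();
}
```

In this model the deferred path records a sleep violation when the timer fires, while the in-line path completes the removal, because the caller is already in a context that is allowed to sleep.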

> 2. What happens if there is no physical device hanging off an I/O port
> address? I am guessing, that on an i386 host, the inb returns 0xFF, but am
> not sure what happens on other architectures. I have a question outstanding
> to Intel for this.

On most but not all x86 systems floating ports return 0xff. Checking
for that or other "impossible" register values should be at least
harmless on other architectures.

-- Dave

Subject: RE: [PATCH] AHA152x driver hangs on PCMCIA card eject, kernel 2.4.22-pre6

> -----Original Message-----
> From: David Hinds [mailto:[email protected]]
> Sent: Thursday, July 17, 2003 11:56 AM
> To: Bhavesh P. Davda
> Cc: Alan Cox; Linux Kernel Mailing List; [email protected];
> [email protected]; Marcelo Tosatti
> Subject: Re: [PATCH] AHA152x driver hangs on PCMCIA card eject,
> kernel 2.4.22-pre6
>
>
> On Thu, Jul 17, 2003 at 08:15:39AM -0600, Bhavesh P. Davda wrote:
>
> > > Right - scsi_unregister should not be called on a timer event, instead
> > > it needs to kick off a task queue
>
> The removal timers need to be taken out from most *_cs drivers; they
> are a holdover from when card removal events were delivered in
> interrupt context, and when that was changed to an event handler
> thread, drivers were not changed accordingly. The removal routine
> should now just be called in-line instead of firing up a timer.
>
> > 2. What happens if there is no physical device hanging off an I/O port
> > address? I am guessing, that on an i386 host, the inb returns 0xFF,
> > but am not sure what happens on other architectures. I have a
> > question outstanding to Intel for this.
>
> On most but not all x86 systems floating ports return 0xff. Checking
> for that or other "impossible" register values should be at least
> harmless on other architectures.
>
> -- Dave

Thank you, Dave!

Attached is a patch that takes into account comments I have received in
response to the original patch I posted. Please peruse it, and if it looks
okay, please recommend that Marcelo pick it up for 2.4.22.

FYI, I have tested this patch on a PIII/440BX with a TI1225-based PCMCIA
card reader and an Adaptec SlimSCSI 1460D SCSI adapter card connected to a
Fujitsu SCSI MO drive.

Thanks!

- Bhavesh
--
Bhavesh P. Davda E-mail : [email protected]
Avaya Inc. Phone/Fax : (303) 538-4438
Room B3-B03, 1300 West 120th Avenue
Westminster, CO 80234


Attachments:
linux-2.4.22-aha152x.patch (6.31 kB)