2002-01-29 23:28:45

by Erik Andersen

[permalink] [raw]
Subject: Adaptec 1480b SlimSCSI vs hotplug

I recently came into posession of an Adaptec 1480 cardbus SCSI
adaptor. Using kernel 2.4.18-pre7 this adaptor does not work
properly with hotplug. I find that the following change allows
the hotplug driver to properly load the aic7xxx driver when a
hotplug "add" event occurs. Removing the card, and then
re-inserting it fails to properly reinitialize the device (unless
the aic7xxx module has been manually unloaded in between), but
this happens with or without this patch. At least with this
patch, the device will load in the first place. Does this look
agreeable?

--- linux-2.4.18-pre7.orig/drivers/scsi/aic7xxx/aic7xxx_linux_pci.c Tue Jan 29 05:20:08 2002
+++ linux/drivers/scsi/aic7xxx/aic7xxx_linux_pci.c Tue Jan 29 05:20:08 2002
@@ -62,12 +62,12 @@
/* We do our own ID filtering. So, grab all SCSI storage class devices. */
static struct pci_device_id ahc_linux_pci_id_table[] = {
{
- 0x9004, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
- PCI_CLASS_STORAGE_SCSI << 8, 0xFFFF00, 0
+ PCI_VENDOR_ID_ADAPTEC, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
+ ((PCI_CLASS_STORAGE_SCSI << 8) | 0x00), ~0, 0
},
{
- 0x9005, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
- PCI_CLASS_STORAGE_SCSI << 8, 0xFFFF00, 0
+ PCI_VENDOR_ID_ADAPTEC2, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
+ ((PCI_CLASS_STORAGE_SCSI << 8) | 0x00), ~0, 0
},
{ 0 }
};

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--


2002-01-30 00:49:33

by Justin T. Gibbs

[permalink] [raw]
Subject: Re: Adaptec 1480b SlimSCSI vs hotplug

>Does this look agreeable?

The only thing you've really changed is the class_mask. I don't
understand why testing against *more bits* of the class allows your
card to be detected. Can you explain why the old code fail?

>--- linux-2.4.18-pre7.orig/drivers/scsi/aic7xxx/aic7xxx_linux_pci.c Tue Jan
> 29 05:20:08 2002
>+++ linux/drivers/scsi/aic7xxx/aic7xxx_linux_pci.c Tue Jan 29 05:20:08 200
>2
>@@ -62,12 +62,12 @@
> /* We do our own ID filtering. So, grab all SCSI storage class devices. */
> static struct pci_device_id ahc_linux_pci_id_table[] = {
> {
>- 0x9004, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
>- PCI_CLASS_STORAGE_SCSI << 8, 0xFFFF00, 0
>+ PCI_VENDOR_ID_ADAPTEC, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
>+ ((PCI_CLASS_STORAGE_SCSI << 8) | 0x00), ~0, 0
> },
> {
>- 0x9005, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
>- PCI_CLASS_STORAGE_SCSI << 8, 0xFFFF00, 0
>+ PCI_VENDOR_ID_ADAPTEC2, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
>+ ((PCI_CLASS_STORAGE_SCSI << 8) | 0x00), ~0, 0
> },
> { 0 }
> };
>
> -Erik

--
Justin

2002-01-30 02:52:36

by Erik Andersen

[permalink] [raw]
Subject: Re: Adaptec 1480b SlimSCSI vs hotplug

On Tue Jan 29, 2002 at 05:48:53PM -0700, Justin T. Gibbs wrote:
> >Does this look agreeable?
>
> The only thing you've really changed is the class_mask. I don't
> understand why testing against *more bits* of the class allows your
> card to be detected. Can you explain why the old code fail?

Exactly, the class_mask is the significant bit. The rest I just
tidied since I hate seeing magic numbers. Anyways, I started off
with the simple observation that it didn't work. Watching
/sbin/hotplug (diethotplug with debugging enabled) closely during
add events showed me the following:


Jan 29 19:34:59 sage kernel: cs: cb_alloc(bus 3): vendor 0x9004, device 0x6075
Jan 29 19:34:59 sage kernel: PCI: Enabling device 03:00.0 (0000 -> 0003)
Jan 29 19:34:59 sage hotplug: pci_handler: action = add
[---------snip--------]
Jan 29 19:34:59 sage hotplug: match_vendor: vendor = 9004, device = 6075, subvendor = 9004, subdevice = 7560
[---------snip--------]
Jan 29 19:34:59 sage hotplug: match_vendor: looking at aic7xxx
Jan 29 19:34:59 sage hotplug: match_vendor: loading aic7xxx
Jan 29 19:34:59 sage hotplug: load_module: loading module aic7xxx
Jan 29 19:34:59 sage hotplug: match_vendor: looking at aic7xxx
Jan 29 19:34:59 sage hotplug: match_vendor: vendor check failed 9005 != 9004

Here we can see it is looking at aic7xxx twice, once for vendor
0x9004, where it notices that the vendor matches, but then fails
to match due to the 0xFFFF00 class_mask filter, and once for
vendor 9005 which of course doesn't match. After changing the
class_mask to ~0 I now see:


Jan 29 19:44:52 sage kernel: cs: cb_alloc(bus 3): vendor 0x9004, device 0x6075
Jan 29 19:44:52 sage kernel: PCI: Enabling device 03:00.0 (0000 -> 0003)
Jan 29 19:44:52 sage hotplug: pci_handler: action = add
[---------snip--------]
Jan 29 19:44:52 sage hotplug: match_vendor: vendor = 9004, device = 6075, subvendor = 9004, subdevice = 7560
[---------snip--------]
Jan 29 19:44:52 sage hotplug: match_vendor: looking at aic7xxx
Jan 29 19:44:52 sage hotplug: match_vendor: loading aic7xxx
Jan 29 19:44:52 sage hotplug: load_module: loading module aic7xxx
Jan 29 19:44:52 sage hotplug: match_vendor: looking at aic7xxx
Jan 29 19:44:52 sage hotplug: match_vendor: vendor check failed 9005 != 9004
Jan 29 19:44:52 sage cardmgr[561]: product info: "Adaptec", "APA-1480 SCSI Host Adapter", "Version 1.10 ", ""
Jan 29 19:44:52 sage cardmgr[561]: manfid: 0x012f, 0xcb01 function: 8 (SCSI)
Jan 29 19:44:52 sage cardmgr[561]: PCI id: 0x9004, 0x6075

So let me turn the question back to you: What is the intended
purpose of masking out part of the class space?

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

2002-01-30 04:54:18

by Justin T. Gibbs

[permalink] [raw]
Subject: Re: Adaptec 1480b SlimSCSI vs hotplug

>On Tue Jan 29, 2002 at 05:48:53PM -0700, Justin T. Gibbs wrote:
>> >Does this look agreeable?
>>
>> The only thing you've really changed is the class_mask. I don't
>> understand why testing against *more bits* of the class allows your
>> card to be detected. Can you explain why the old code fail?
>
>Exactly, the class_mask is the significant bit. The rest I just
>tidied since I hate seeing magic numbers.

It would make your patch easier to review if you didn't intermingle
style changes with content changes.

>Anyways, I started off
>with the simple observation that it didn't work. Watching
>/sbin/hotplug (diethotplug with debugging enabled) closely during
>add events showed me the following:

[...]

>Here we can see it is looking at aic7xxx twice, once for vendor
>0x9004, where it notices that the vendor matches, but then fails
>to match due to the 0xFFFF00 class_mask filter, and once for
>vendor 9005 which of course doesn't match. After changing the
>class_mask to ~0 I now see:

[...]

Changing the mask may have the desired effect, but it doesn't
show where the true bug is.

>So let me turn the question back to you: What is the intended
>purpose of masking out part of the class space?

I'm only trying to match on the base and sub class fields. In
the kernel (see drivers/pci/pci.c), the pci device field is
three bytes wide with the programming interface stored in the
lowest byte and class information in bytes 1 and 2. Since I only
place the expected class values in my table, not the value for
the programming interface, the low byte needs to be masked away prior
to doing the comparison. The logic in the kernel handles the
mask correctly:

((ids->class ^ dev->class) & ids->class_mask) == 0

My guess is that diethotplug is not handling the mask correctly
and thus doesn't work with a partial mask.

--
Justin

2002-01-31 13:20:09

by Erik Andersen

[permalink] [raw]
Subject: Re: Adaptec 1480b SlimSCSI vs hotplug

On Tue Jan 29, 2002 at 09:53:34PM -0700, Justin T. Gibbs wrote:
> to doing the comparison. The logic in the kernel handles the
> mask correctly:
>
> ((ids->class ^ dev->class) & ids->class_mask) == 0
>
> My guess is that diethotplug is not handling the mask correctly
> and thus doesn't work with a partial mask.

Upon looking closer, you are correct. diethotplug gets it wrong
here. I've fixed diethotplug, and now I can plug in my Adaptec
1480 card and have hotplug load the driver and everything works.
I can also unplug the card, and things seem to work as expected.

If I subsequently insert the card, it no longer works until I
have first manually unloaded the aic7xxx module. A quick bit of
debugging shows that ahc_linux_pci_dev_probe is getting called
the second time as it should be, and aic7xxx_detect_complete is
indeed 1 the second time, so ahc_linux_register_host gets
called... But nothing seems to actually get registered with the
scsi layer the second time around.

Here is another interesting bit. insert the card for the first
time and wait for initialization and 'cat /proc/scsi/aic7xxx/0'
and all looks normal. Now remove the card. /proc/scsi/aic7xxx/0
is still present, and 'cat /proc/scsi/aic7xxx/0' produces an
Oops... Hmm.

I see that on card removal, ahc_linux_pci_dev_remove() calls
ahc_free() which calls ahc_platform_free() which calls
scsi_unregister(), which means that /proc/scsi/aic7xxx/0
shouldn't be there anymore.... I'm missing something in
there somewhere. Ideas?

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--