2003-09-17 18:44:26

by Sean Estabrooks

[permalink] [raw]
Subject: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

[PCMCIA] Fix race condition causing cards to be incorrectly recognised

This patch that went into test5 causes my Toshiba laptop with Xircom
pcmcia nic to freeze on boot at "Socket status: 30000020".

test4 didn't have this issue, and test5-mm2 is the same as test5 vanilla.
If the Xircom nic is inserted after boot then everything works without a
problem.

Backing out this patch restores normal boot. Any tips on how i can
narrow the problem down further would be appreciated. lspci and
.config attached.

TIA,
Sean


Attachments:
lspci.laptop (7.50 kB)
.config (22.14 kB)
Download all attachments

2003-09-17 21:33:43

by Russell King

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

On Wed, Sep 17, 2003 at 02:44:06PM -0400, Sean Estabrooks wrote:
> [PCMCIA] Fix race condition causing cards to be incorrectly recognised
>
> This patch that went into test5 causes my Toshiba laptop with Xircom
> pcmcia nic to freeze on boot at "Socket status: 30000020".

Unfortunately this patch does two things:

(a) it fixes the problem with PCMCIA cards not being recognised on boot.
(b) it introduces a deadlock between the PCMCIA layer and the driver
model.

The trace I'm seeing is:

insmod D C003AC98 28008860 446 422 (NOTLB)
[<c003a9fc>] (schedule+0x0/0x368)
[<c002b4a0>] (__down+0x0/0x108)
[<bf0059bc>] (pcmcia_register_client+0x0/0x288 [pcmcia_core])
[<bf012458>] (pcmcia_bus_add_socket+0x0/0x168 [ds])
[<c00f5c48>] (class_device_add+0x0/0x10c)
[<bf00395c>] (pcmcia_register_socket+0x0/0x160 [pcmcia_core])
[<bf016558>] (yenta_probe+0x0/0x238 [yenta_socket])
[<c00cbc14>] (pci_device_probe_static+0x0/0x68)
[<c00cbd64>] (__pci_device_probe+0x0/0x58)
[<c00cbdbc>] (pci_device_probe+0x0/0x44)
[<c00f50b8>] (bus_match+0x0/0x70)
[<c00f51b8>] (driver_attach+0x0/0x80)
[<c00f53d0>] (bus_add_driver+0x0/0x84)
[<c00f5710>] (driver_register+0x0/0x40)
[<c00cc018>] (pci_register_driver+0x0/0x88)
[<c0053a00>] (sys_init_module+0x0/0x308)
pccardd D C003AC98 4290282060 448 1 418 (L-TLB)
[<c003a9fc>] (schedule+0x0/0x368)
[<c00c3de8>] (__down_write+0x0/0x98)
[<c00f52e0>] (bus_add_device+0x0/0x7c)
[<c00f4454>] (device_add+0x0/0xe4)
[<c00c8800>] (pci_bus_add_devices+0x0/0xf4)
[<bf0092e8>] (cb_alloc+0x0/0xd0 [pcmcia_core])
[<bf0042b0>] (socket_insert+0x0/0x1ec [pcmcia_core])
[<bf004694>] (socket_detect_change+0x0/0x90 [pcmcia_core])
[<bf004724>] (pccardd+0x0/0x178 [pcmcia_core])

The problem is that the semaphore which prevents ds interfering with the
sleepy card initialisation (in pccardd) is blocking insmod. However,
the driver is being called with the PCI bus semaphore held.

pccardd in turn discovered a cardbus card, so it is trying to add the
PCI devices to the PCI bus, which requires the PCI bus semaphore.

At present, I do not see a sane solution to this problem - we have two
completely contradictory requirements. The only way which may work is
to play tricks like "wait 1 second" and hope pccardd has completed type
hacks into PCMCIA to get around this, but that's somewhere I _do not_
want to go.

--
Russell King ([email protected]) http://www.arm.linux.org.uk/personal/
Linux kernel maintainer of:
2.6 ARM Linux - http://www.arm.linux.org.uk/
2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2003-09-19 02:48:07

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

On Wed, 17 Sep 2003 22:33:36 BST, Russell King said:
> On Wed, Sep 17, 2003 at 02:44:06PM -0400, Sean Estabrooks wrote:
> > [PCMCIA] Fix race condition causing cards to be incorrectly recognised
> >
> > This patch that went into test5 causes my Toshiba laptop with Xircom
> > pcmcia nic to freeze on boot at "Socket status: 30000020".
>
> Unfortunately this patch does two things:
>
> (a) it fixes the problem with PCMCIA cards not being recognised on boot.
> (b) it introduces a deadlock between the PCMCIA layer and the driver
> model.

OK... so what's different about Sean's Toshiba and my Dell (or alternatively,
the fact we both have Xircom cards) that the *old* code worked just fine?

Is it the fact that it's a multi-function card?

lspci -v says:
03:00.0 Ethernet controller: Xircom Cardbus Ethernet 10/100 (rev 03)
Subsystem: Xircom Cardbus Ethernet 10/100
Flags: bus master, medium devsel, latency 64, IRQ 9
I/O ports at 1000 [size=128]
Memory at 10800000 (32-bit, non-prefetchable) [size=2K]
Memory at 10800800 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at 10400000 [disabled] [size=16K]
Capabilities: [dc] Power Management version 1

03:00.1 Serial controller: Xircom Cardbus Ethernet + 56k Modem (rev 03) (prog-if 02 [16550])
Subsystem: Xircom CBEM56G-100 Ethernet + 56k Modem
Flags: medium devsel, IRQ 9
I/O ports at 1080 [size=8]
Memory at 10801000 (32-bit, non-prefetchable) [size=2K]
Memory at 10801800 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at 10404000 [disabled] [size=16K]
Capabilities: [dc] Power Management version 1

I could see problems if the serial controller is being added while the ethernet
controller is still getting its act together while holding locks, since it's one
physical card.

I admit not being hot on the programming model for cardbus, but I'm
quite willing to test patches.. ;)



Attachments:
(No filename) (226.00 B)

2003-09-19 06:03:40

by Jeremy T. Bouse

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

Not sure if it's related but I've had problems with PCMCIA support in
the test kernels myself... The Cisco Aironet 350 has problems and I've
been able to make any use of the Orinoco Gold wireless card...
Fortunately the ethernet card is built in on PCI bus and doesn't have
any problem but I had being corded down...

Regards,
Jeremy

On Thu, Sep 18, 2003 at 10:47:48PM -0400, [email protected] wrote:
> On Wed, 17 Sep 2003 22:33:36 BST, Russell King said:
> > On Wed, Sep 17, 2003 at 02:44:06PM -0400, Sean Estabrooks wrote:
> > > [PCMCIA] Fix race condition causing cards to be incorrectly recognised
> > >
> > > This patch that went into test5 causes my Toshiba laptop with Xircom
> > > pcmcia nic to freeze on boot at "Socket status: 30000020".
> >
> > Unfortunately this patch does two things:
> >
> > (a) it fixes the problem with PCMCIA cards not being recognised on boot.
> > (b) it introduces a deadlock between the PCMCIA layer and the driver
> > model.
>
> OK... so what's different about Sean's Toshiba and my Dell (or alternatively,
> the fact we both have Xircom cards) that the *old* code worked just fine?
>
> Is it the fact that it's a multi-function card?
>
> lspci -v says:
> 03:00.0 Ethernet controller: Xircom Cardbus Ethernet 10/100 (rev 03)
> Subsystem: Xircom Cardbus Ethernet 10/100
> Flags: bus master, medium devsel, latency 64, IRQ 9
> I/O ports at 1000 [size=128]
> Memory at 10800000 (32-bit, non-prefetchable) [size=2K]
> Memory at 10800800 (32-bit, non-prefetchable) [size=2K]
> Expansion ROM at 10400000 [disabled] [size=16K]
> Capabilities: [dc] Power Management version 1
>
> 03:00.1 Serial controller: Xircom Cardbus Ethernet + 56k Modem (rev 03) (prog-if 02 [16550])
> Subsystem: Xircom CBEM56G-100 Ethernet + 56k Modem
> Flags: medium devsel, IRQ 9
> I/O ports at 1080 [size=8]
> Memory at 10801000 (32-bit, non-prefetchable) [size=2K]
> Memory at 10801800 (32-bit, non-prefetchable) [size=2K]
> Expansion ROM at 10404000 [disabled] [size=16K]
> Capabilities: [dc] Power Management version 1
>
> I could see problems if the serial controller is being added while the ethernet
> controller is still getting its act together while holding locks, since it's one
> physical card.
>
> I admit not being hot on the programming model for cardbus, but I'm
> quite willing to test patches.. ;)
>
>



Attachments:
(No filename) (2.41 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2003-09-19 10:16:08

by Klaus Kurzmann

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

[email protected] schrieb am Donnerstag, den 18.09.2003 um 22:47:
> Is it the fact that it's a multi-function card?

> lspci -v says:
> 03:00.0 Ethernet controller: Xircom Cardbus Ethernet 10/100 (rev 03)
> Subsystem: Xircom Cardbus Ethernet 10/100
> Flags: bus master, medium devsel, latency 64, IRQ 9
> I/O ports at 1000 [size=128]
> Memory at 10800000 (32-bit, non-prefetchable) [size=2K]
> Memory at 10800800 (32-bit, non-prefetchable) [size=2K]
> Expansion ROM at 10400000 [disabled] [size=16K]
> Capabilities: [dc] Power Management version 1

> 03:00.1 Serial controller: Xircom Cardbus Ethernet + 56k Modem (rev 03) (prog-if 02 [16550])
> Subsystem: Xircom CBEM56G-100 Ethernet + 56k Modem
> Flags: medium devsel, IRQ 9
> I/O ports at 1080 [size=8]
> Memory at 10801000 (32-bit, non-prefetchable) [size=2K]
> Memory at 10801800 (32-bit, non-prefetchable) [size=2K]
> Expansion ROM at 10404000 [disabled] [size=16K]
> Capabilities: [dc] Power Management version 1

> I could see problems if the serial controller is being added while the ethernet
> controller is still getting its act together while holding locks, since it's one
> physical card.

I have the same problem with my 3Com card, which is ethernet only.

02:00.0 Ethernet controller: 3Com Corporation 3CCFE575CT Cyclone CardBus (rev 10)
Subsystem: 3Com Corporation FE575C-3Com 10/100 LAN CardBus-Fast Ethernet
Flags: bus master, medium devsel, latency 64, IRQ 9
I/O ports at 1800 [size=256]
Memory at 10800000 (32-bit, non-prefetchable) [size=128]
Memory at 10800080 (32-bit, non-prefetchable) [size=128]
Expansion ROM at 10400000 [disabled] [size=128K]
Capabilities: [50] Power Management version 2





--
Klaus Kurzmann, FluxNetz GmbH
Tel: +49 (0)89 140 79 281 eMail: [email protected]
Fax: +49 (0)89 140 79 282 www : http://www.fluxnetz.de

Bitte senden Sie mir keine Word- oder PowerPoint-Anh?nge.
Siehe http://www.fsf.org/philosophy/no-word-attachments.de.html

2003-09-20 21:22:18

by Russell King

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

On Wed, Sep 17, 2003 at 02:44:06PM -0400, Sean Estabrooks wrote:
> [PCMCIA] Fix race condition causing cards to be incorrectly recognised
>
> This patch that went into test5 causes my Toshiba laptop with Xircom
> pcmcia nic to freeze on boot at "Socket status: 30000020".
>
> test4 didn't have this issue, and test5-mm2 is the same as test5 vanilla.
> If the Xircom nic is inserted after boot then everything works without a
> problem.
>
> Backing out this patch restores normal boot. Any tips on how i can
> narrow the problem down further would be appreciated. lspci and
> .config attached.

Ok, can you try the attached patch please? It basically juggles the
initialisation so that we avoid the locking issues by moving the init
between our the socket driver and our private thread.

The patch is against Linus' tree as of last Wednesday.

Note that I haven't compile-tested this exact patch, (but one similar)
so I need feedback from both cardbus and pcmcia-using people before I
submit it.

===== drivers/pcmcia/cs.c 1.62 vs edited =====
--- 1.62/drivers/pcmcia/cs.c Mon Sep 8 23:15:21 2003
+++ edited/drivers/pcmcia/cs.c Sat Sep 20 22:13:27 2003
@@ -281,72 +281,29 @@
EXPORT_SYMBOL(pcmcia_socket_dev_resume);


-static int pccardd(void *__skt);
-#define to_class_data(dev) dev->class_data
-
-static int pcmcia_add_socket(struct class_device *class_dev)
-{
- struct pcmcia_socket *socket = class_get_devdata(class_dev);
- int ret = 0;
-
- /* base address = 0, map = 0 */
- socket->cis_mem.flags = 0;
- socket->cis_mem.speed = cis_speed;
- socket->erase_busy.next = socket->erase_busy.prev = &socket->erase_busy;
- INIT_LIST_HEAD(&socket->cis_cache);
- spin_lock_init(&socket->lock);
-
- init_completion(&socket->thread_done);
- init_waitqueue_head(&socket->thread_wait);
- init_MUTEX(&socket->skt_sem);
- spin_lock_init(&socket->thread_lock);
-
- socket->socket = dead_socket;
- socket->ops->init(socket);
-
- ret = kernel_thread(pccardd, socket, CLONE_KERNEL);
- if (ret < 0)
- return ret;
-
- wait_for_completion(&socket->thread_done);
- BUG_ON(!socket->thread);
- pcmcia_parse_events(socket, SS_DETECT);
-
- return 0;
-}
-
-static void pcmcia_remove_socket(struct class_device *class_dev)
+static void pcmcia_release_socket(struct class_device *class_dev)
{
struct pcmcia_socket *socket = class_get_devdata(class_dev);
client_t *client;

- if (socket->thread) {
- init_completion(&socket->thread_done);
- socket->thread = NULL;
- wake_up(&socket->thread_wait);
- wait_for_completion(&socket->thread_done);
- }
- release_cis_mem(socket);
while (socket->clients) {
client = socket->clients;
socket->clients = socket->clients->next;
kfree(client);
}
- socket->ops = NULL;
-}

-static void pcmcia_release_socket(struct class_device *class_dev)
-{
- struct pcmcia_socket *socket = class_get_devdata(class_dev);
complete(&socket->socket_released);
}

+static int pccardd(void *__skt);

/**
* pcmcia_register_socket - add a new pcmcia socket device
*/
int pcmcia_register_socket(struct pcmcia_socket *socket)
{
+ int ret;
+
if (!socket || !socket->ops || !socket->dev.dev)
return -EINVAL;

@@ -381,15 +338,34 @@
socket->dev.class = &pcmcia_socket_class;
snprintf(socket->dev.class_id, BUS_ID_SIZE, "pcmcia_socket%u", socket->sock);

- /* register with the device core */
- if (class_device_register(&socket->dev)) {
- down_write(&pcmcia_socket_list_rwsem);
- list_del(&socket->socket_list);
- up_write(&pcmcia_socket_list_rwsem);
- return -EINVAL;
- }
+ /* base address = 0, map = 0 */
+ socket->cis_mem.flags = 0;
+ socket->cis_mem.speed = cis_speed;
+ socket->erase_busy.next = socket->erase_busy.prev = &socket->erase_busy;
+ INIT_LIST_HEAD(&socket->cis_cache);
+ spin_lock_init(&socket->lock);
+
+ init_completion(&socket->socket_released);
+ init_completion(&socket->thread_done);
+ init_waitqueue_head(&socket->thread_wait);
+ init_MUTEX(&socket->skt_sem);
+ spin_lock_init(&socket->thread_lock);
+
+ ret = kernel_thread(pccardd, socket, CLONE_KERNEL);
+ if (ret < 0)
+ goto err;
+
+ wait_for_completion(&socket->thread_done);
+ BUG_ON(!socket->thread);
+ pcmcia_parse_events(socket, SS_DETECT);

return 0;
+
+ err:
+ down_write(&pcmcia_socket_list_rwsem);
+ list_del(&socket->socket_list);
+ up_write(&pcmcia_socket_list_rwsem);
+ return ret;
} /* pcmcia_register_socket */
EXPORT_SYMBOL(pcmcia_register_socket);

@@ -404,10 +380,13 @@

DEBUG(0, "cs: pcmcia_unregister_socket(0x%p)\n", socket->ops);

- init_completion(&socket->socket_released);
-
- /* remove from the device core */
- class_device_unregister(&socket->dev);
+ if (socket->thread) {
+ init_completion(&socket->thread_done);
+ socket->thread = NULL;
+ wake_up(&socket->thread_wait);
+ wait_for_completion(&socket->thread_done);
+ }
+ release_cis_mem(socket);

/* remove from our own list */
down_write(&pcmcia_socket_list_rwsem);
@@ -783,11 +762,22 @@
{
struct pcmcia_socket *skt = __skt;
DECLARE_WAITQUEUE(wait, current);
+ int ret;

daemonize("pccardd");
skt->thread = current;
complete(&skt->thread_done);

+ skt->socket = dead_socket;
+ skt->ops->init(skt);
+
+ /* register with the device core */
+ ret = class_device_register(&skt->dev);
+ if (ret) {
+ printk(KERN_WARNING "PCMCIA: unable to register socket 0x%p\n",
+ skt);
+ }
+
add_wait_queue(&skt->thread_wait, &wait);
for (;;) {
unsigned long flags;
@@ -823,6 +813,9 @@
}
remove_wait_queue(&skt->thread_wait, &wait);

+ /* remove from the device core */
+ class_device_unregister(&skt->dev);
+
complete_and_exit(&skt->thread_done, 0);
}

@@ -2501,12 +2494,6 @@
};
EXPORT_SYMBOL(pcmcia_socket_class);

-static struct class_interface pcmcia_socket = {
- .class = &pcmcia_socket_class,
- .add = &pcmcia_add_socket,
- .remove = &pcmcia_remove_socket,
-};
-

static int __init init_pcmcia_cs(void)
{
@@ -2514,7 +2501,6 @@
printk(KERN_INFO " %s\n", options);
DEBUG(0, "%s\n", version);
class_register(&pcmcia_socket_class);
- class_interface_register(&pcmcia_socket);

return 0;
}
@@ -2523,7 +2509,6 @@
{
printk(KERN_INFO "unloading Kernel Card Services\n");
release_resource_db();
- class_interface_unregister(&pcmcia_socket);
class_unregister(&pcmcia_socket_class);
}



--
Russell King ([email protected]) http://www.arm.linux.org.uk/personal/
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2003-09-21 16:49:37

by Sean Estabrooks

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

On Sat, 20 Sep 2003 22:22:07 +0100
Russell King <[email protected]> wrote:


> Ok, can you try the attached patch please? It basically juggles the
> initialisation so that we avoid the locking issues by moving the init
> between our the socket driver and our private thread.
>
> The patch is against Linus' tree as of last Wednesday.
>
> Note that I haven't compile-tested this exact patch, (but one similar)
> so I need feedback from both cardbus and pcmcia-using people before I
> submit it.

Russell,

Thanks for the patch! It applied cleanly against linus-bk as of
this morning and fixed the problem here.

Cheers,
Sean

2003-09-21 17:31:47

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PCMCIA] Xircom nic hang on boot since cs.c race condition patch

On Sat, 20 Sep 2003 22:22:07 BST, Russell King said:

> Ok, can you try the attached patch please? It basically juggles the
> initialisation so that we avoid the locking issues by moving the init
> between our the socket driver and our private thread.
>
> The patch is against Linus' tree as of last Wednesday.
>
> Note that I haven't compile-tested this exact patch, (but one similar)
> so I need feedback from both cardbus and pcmcia-using people before I
> submit it.

Applied clean to -test5-mm3, work as expected on a Dell Latitude C840
with yenta sockets.



Attachments:
(No filename) (226.00 B)