2007-01-26 21:37:54

by Pieter Palmers

[permalink] [raw]
Subject: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

I get the following Oops when ejecting a firewire cardbus controller.
The kernel is a 2.6.20-rc6 (git pull from a few hours ago), patched with
Ingo's -rt2 patch.

cc'ing linux1394-devel because it seems to be a problem in the ieee1394
stack.

Pieter

Jan 26 22:29:37 localhost kernel: pccard: card ejected from slot 0
Jan 26 22:29:37 localhost kernel: BUG: unable to handle kernel NULL
pointer dereference at virtual address 00000000
Jan 26 22:29:37 localhost kernel: printing eip:
Jan 26 22:29:37 localhost kernel: f91c8a6d
Jan 26 22:29:37 localhost kernel: *pde = 05c64067
Jan 26 22:29:37 localhost kernel: stopped custom tracer.
Jan 26 22:29:37 localhost kernel: Oops: 0000 [#1]
Jan 26 22:29:37 localhost kernel: PREEMPT
Jan 26 22:29:37 localhost kernel: Modules linked in: raw1394 dv1394
ohci1394 ieee1394 nfsd exportfs lockd nfs_acl autofs4 hidp rfcomm l2cap
bluetooth sunrp
c vfat fat radeon drm nvram snd_intel8x0m snd_intel8x0 snd_ac97_codec
ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_
oss snd_mixer_oss snd_pcm joydev snd_timer snd uhci_hcd i2c_i801
soundcore ipw2200 ieee80211 ieee80211_crypt 8139too mii i2c_core pcspkr
snd_page_alloc rng
_core ext3 jbd
Jan 26 22:29:37 localhost kernel: CPU: 0
Jan 26 22:29:37 localhost kernel: EIP: 0060:[<f91c8a6d>] Not
tainted VLI
Jan 26 22:29:37 localhost kernel: EFLAGS: 00010286 (2.6.20-rc6-rt2 #9)
Jan 26 22:29:37 localhost kernel: EIP is at dv1394_remove_host+0x17/0xc0
[dv1394]
Jan 26 22:29:37 localhost kernel: eax: f88fd140 ebx: 00000001 ecx:
00000000 edx: f91c8a56
Jan 26 22:29:37 localhost kernel: esi: 00000000 edi: f91c9910 ebp:
00000000 esp: f7e49ebc
Jan 26 22:29:37 localhost kernel: ds: 007b es: 007b ss: 0068
preempt: 00000001
Jan 26 22:29:37 localhost kernel: Process pccardd (pid: 284, ti=f7e48000
task=f7f68ab0 task.ti=f7e48000)
Jan 26 22:29:37 localhost kernel: Stack: f91cb540 ee588000 f91cb540
f88ea50f d4142180 ee588000 00000000 f91cb540
Jan 26 22:29:37 localhost kernel: ee588000 00000000 ee589a84
f88ea759 ee588000 f8950e94 f88ea18a f7e71400
Jan 26 22:29:37 localhost kernel: f894a213 f8950f14 ee5880bc
f7e71400 f8950e94 00000000 f7eae048 c01c0caf
Jan 26 22:29:37 localhost kernel: Call Trace:
Jan 26 22:29:37 localhost kernel: [<f88ea50f>]
__unregister_host+0x18/0x8f [ieee1394]
Jan 26 22:29:38 localhost kernel: [<f88ea759>]
highlevel_remove_host+0x21/0x42 [ieee1394]
Jan 26 22:29:38 localhost kernel: [<f88ea18a>]
hpsb_remove_host+0x37/0x55 [ieee1394]
Jan 26 22:29:38 localhost kernel: [<f894a213>]
ohci1394_pci_remove+0x47/0x1e4 [ohci1394]
Jan 26 22:29:38 localhost kernel: [<c01c0caf>] pci_device_remove+0x16/0x35
Jan 26 22:29:38 localhost kernel: [<c020fadc>]
__device_release_driver+0x71/0x87
Jan 26 22:29:38 localhost kernel: [<c020fba2>]
device_release_driver+0x18/0x21
Jan 26 22:29:38 localhost kernel: [<c020f6af>] bus_remove_device+0x70/0x80
Jan 26 22:29:38 localhost kernel: [<c020e0a9>] device_del+0x13a/0x197
Jan 26 22:29:38 localhost kernel: [<c020e10e>] device_unregister+0x8/0x10
Jan 26 22:29:38 localhost kernel: [<c01be718>] pci_stop_dev+0x20/0x4e
Jan 26 22:29:38 localhost kernel: [<c01be803>]
pci_remove_bus_device+0x28/0x8c
Jan 26 22:29:38 localhost kernel: [<c01be881>]
pci_remove_behind_bridge+0x1a/0x2d
Jan 26 22:29:38 localhost kernel: [<c022895a>] socket_shutdown+0x6d/0xb3
Jan 26 22:29:38 localhost kernel: [<c02289bc>] socket_remove+0x1c/0x26
Jan 26 22:29:38 localhost kernel: [<c0228f97>] pccardd+0x129/0x1ea
Jan 26 22:29:38 localhost kernel: [<c010e56c>]
default_wake_function+0x0/0x16
Jan 26 22:29:38 localhost kernel: [<c0228e6e>] pccardd+0x0/0x1ea
Jan 26 22:29:38 localhost kernel: [<c01216ce>] kthread+0xa0/0xca
Jan 26 22:29:38 localhost kernel: [<c012162e>] kthread+0x0/0xca
Jan 26 22:29:38 localhost kernel: [<c0102fe7>]
kernel_thread_helper+0x7/0x10
Jan 26 22:29:38 localhost kernel: =======================
Jan 26 22:29:38 localhost kernel: Code: 1c f9 e8 7f 6c 0e c7 89 da 89 9e
a8 00 00 00 eb d9 5b 5e c3 57 bf 10 99 1c f9 56 53 8b 98 b0 00 00 00 8b
80 b4 00 0
0 00 8b 70 04 <ac> ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 0f
85 8d
Jan 26 22:29:38 localhost kernel: EIP: [<f91c8a6d>]
dv1394_remove_host+0x17/0xc0 [dv1394] SS:ESP 0068:f7e49ebc


2007-01-26 22:51:15

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

Pieter Palmers wrote:
> cc'ing linux1394-devel because it seems to be a problem in the ieee1394
> stack.

Yes, this bug is in stock Linux 2.6 too. Ever has been.
http://bugzilla.kernel.org/show_bug.cgi?id=7121
--
Stefan Richter
-=====-=-=== ---= ==-=-
http://arcgraph.de/sr/

2007-01-27 08:44:29

by Pieter Palmers

[permalink] [raw]
Subject: Re: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

Stefan Richter wrote:
> Pieter Palmers wrote:
>> cc'ing linux1394-devel because it seems to be a problem in the ieee1394
>> stack.
>
> Yes, this bug is in stock Linux 2.6 too. Ever has been.
> http://bugzilla.kernel.org/show_bug.cgi?id=7121
Oops, missed that. Sorry for the noise.

Pieter

2007-01-27 09:00:57

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

Pieter Palmers wrote:
> Stefan Richter wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=7121
> Oops, missed that. Sorry for the noise.

No, it's OK. We have a pile of old open bugs, and those that are rubbed
in our face more often might even get a chance to be fixed some day... :-/
--
Stefan Richter
-=====-=-=== ---= ==-==
http://arcgraph.de/sr/

2007-01-27 09:20:32

by Pieter Palmers

[permalink] [raw]
Subject: Re: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

Stefan Richter wrote:
> Pieter Palmers wrote:
>> Stefan Richter wrote:
>>> http://bugzilla.kernel.org/show_bug.cgi?id=7121
>> Oops, missed that. Sorry for the noise.
>
> No, it's OK. We have a pile of old open bugs, and those that are rubbed
> in our face more often might even get a chance to be fixed some day... :-/

OTOH, dv1394 is scheduled for removal. Not that we should tolerate bugs
in the Linux kernel ;), but it lowers priority I guess.

For me the solution is simple: I don't need dv1394, so I just don't
build it anymore. It has the annoying tendency to auto-load when raw1394
is loaded.

Pieter

2007-01-27 09:46:12

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.20-rc6-rt2: kernel oops when ejecting cardbus firewire card

Pieter Palmers wrote:
> OTOH, dv1394 is scheduled for removal. Not that we should tolerate bugs
> in the Linux kernel ;), but it lowers priority I guess.

It does. (Plus I didn't have a Linux box with CardBus slot in operation
for a while, but now I have.)

> For me the solution is simple: I don't need dv1394, so I just don't
> build it anymore. It has the annoying tendency to auto-load when raw1394
> is loaded.

This is a correlation but not a causality. dv1394's and raw1394's
_device_id tables for driver matching both match AV/C devices.
--
Stefan Richter
-=====-=-=== ---= ==-==
http://arcgraph.de/sr/

2007-01-27 12:56:25

by Stefan Richter

[permalink] [raw]
Subject: [PATCH 2/2] ieee1394: dv1394: tidy up card removal

small coding style touch-up and terser coding

Signed-off-by: Stefan Richter <[email protected]>
---
drivers/ieee1394/dv1394.c | 36 +++++++++++++-----------------------
1 files changed, 13 insertions(+), 23 deletions(-)

Index: linux-2.6.20-rc6/drivers/ieee1394/dv1394.c
===================================================================
--- linux-2.6.20-rc6.orig/drivers/ieee1394/dv1394.c 2007-01-27 13:30:05.000000000 +0100
+++ linux-2.6.20-rc6/drivers/ieee1394/dv1394.c 2007-01-27 13:34:14.000000000 +0100
@@ -2255,47 +2255,37 @@ static int dv1394_init(struct ti_ohci *o
return 0;
}

-static void dv1394_un_init(struct video_card *video)
+static void dv1394_remove_host(struct hpsb_host *host)
{
- /* obviously nobody has the driver open at this point */
- do_dv1394_shutdown(video, 1);
- kfree(video);
-}
-
-
-static void dv1394_remove_host (struct hpsb_host *host)
-{
- struct video_card *video;
+ struct video_card *video, *tmp_video;
unsigned long flags;
- int id = host->id, found_ohci_card = 0;
+ int found_ohci_card = 0;

- /* find the corresponding video_cards */
do {
- struct video_card *tmp_vid;
-
video = NULL;
-
spin_lock_irqsave(&dv1394_cards_lock, flags);
- list_for_each_entry(tmp_vid, &dv1394_cards, list) {
- if ((tmp_vid->id >> 2) == id) {
- list_del(&tmp_vid->list);
- video = tmp_vid;
+ list_for_each_entry(tmp_video, &dv1394_cards, list) {
+ if ((tmp_video->id >> 2) == host->id) {
+ list_del(&tmp_video->list);
+ video = tmp_video;
found_ohci_card = 1;
break;
}
}
spin_unlock_irqrestore(&dv1394_cards_lock, flags);

- if (video)
- dv1394_un_init(video);
- } while (video != NULL);
+ if (video) {
+ do_dv1394_shutdown(video, 1);
+ kfree(video);
+ }
+ } while (video);

if (found_ohci_card)
class_device_destroy(hpsb_protocol_class, MKDEV(IEEE1394_MAJOR,
IEEE1394_MINOR_BLOCK_DV1394 * 16 + (host->id << 2)));
}

-static void dv1394_add_host (struct hpsb_host *host)
+static void dv1394_add_host(struct hpsb_host *host)
{
struct ti_ohci *ohci;
int id = host->id;


--
Stefan Richter
-=====-=-=== ---= ==-==
http://arcgraph.de/sr/

2007-01-27 12:57:22

by Stefan Richter

[permalink] [raw]
Subject: [PATCH 1/2] ieee1394: dv1394: fix CardBus card ejection

Fix NULL pointer dereference on hot ejection of a FireWire card while
dv1394 was loaded. http://bugzilla.kernel.org/show_bug.cgi?id=7121
I did not test card ejection with open /dev/dv1394 files yet.

Signed-off-by: Stefan Richter <[email protected]>
---
drivers/ieee1394/dv1394.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)

Index: linux-2.6.20-rc6/drivers/ieee1394/dv1394.c
===================================================================
--- linux-2.6.20-rc6.orig/drivers/ieee1394/dv1394.c 2007-01-27 13:28:39.000000000 +0100
+++ linux-2.6.20-rc6/drivers/ieee1394/dv1394.c 2007-01-27 13:30:05.000000000 +0100
@@ -2267,11 +2267,7 @@ static void dv1394_remove_host (struct h
{
struct video_card *video;
unsigned long flags;
- int id = host->id;
-
- /* We only work with the OHCI-1394 driver */
- if (strcmp(host->driver->name, OHCI1394_DRIVER_NAME))
- return;
+ int id = host->id, found_ohci_card = 0;

/* find the corresponding video_cards */
do {
@@ -2284,6 +2280,7 @@ static void dv1394_remove_host (struct h
if ((tmp_vid->id >> 2) == id) {
list_del(&tmp_vid->list);
video = tmp_vid;
+ found_ohci_card = 1;
break;
}
}
@@ -2293,8 +2290,9 @@ static void dv1394_remove_host (struct h
dv1394_un_init(video);
} while (video != NULL);

- class_device_destroy(hpsb_protocol_class,
- MKDEV(IEEE1394_MAJOR, IEEE1394_MINOR_BLOCK_DV1394 * 16 + (id<<2)));
+ if (found_ohci_card)
+ class_device_destroy(hpsb_protocol_class, MKDEV(IEEE1394_MAJOR,
+ IEEE1394_MINOR_BLOCK_DV1394 * 16 + (host->id << 2)));
}

static void dv1394_add_host (struct hpsb_host *host)


--
Stefan Richter
-=====-=-=== ---= ==-==
http://arcgraph.de/sr/