From: "goggin, edward"
To: "'Andrew Morton'", Masanari Iida
Cc: linux-kernel@vger.kernel.org, linux-usb-devel@lists.sourceforge.net, linux-scsi@vger.kernel.org
Subject: RE: oops with USB Storage on 2.6.14
Date: Tue, 8 Nov 2005 11:24:25 -0500

I've run into a bug like this several times using 2.6.14-rc4 while
testing dm-multipath's reaction to uevents generated by forcing
Fibre Channel transport failures. Those failures lead to the scsi
device being detached and the queuedata pointer in the device's
queue being reset in scsi_device_dev_release.

The fix I've used is below, and it seems to work well for me. I was
going to post this patch to dm-devel today or tomorrow anyway.

drivers/scsi/scsi_lib.c:scsi_next_command()

Call scsi_device_get and scsi_device_put around the calls to
scsi_put_command and scsi_run_queue so that the scsi host structure
will not be de-allocated between scsi_put_command and scsi_run_queue.
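As a rough illustration of the ordering described above, here is a
minimal user-space sketch in plain C. It is not kernel code: struct
device, struct command, device_new, command_new, device_get,
device_put, command_put, run_queue and next_command are hypothetical
stand-ins, and the sketch assumes that releasing a command can drop
the last reference on its device. It uses a plain int counter with no
locking, so it only models the order of operations, not concurrency.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, not the kernel's structures. */
struct queue { int runs; };

struct device {
	int refcount;		/* plain counter; no atomics or locking here */
	int deleting;		/* models a device in SDEV_CANCEL or SDEV_DEL */
	struct queue *q;	/* freed together with the device */
};

struct command { struct device *dev; };

static int devices_freed;	/* bookkeeping for the demonstration only */
static int queue_runs;

/* Take a reference; fails once the device is being torn down. */
static int device_get(struct device *d)
{
	if (d->deleting)
		return -1;
	d->refcount++;
	return 0;
}

/* Drop a reference; the last put frees the device and its queue. */
static void device_put(struct device *d)
{
	if (--d->refcount == 0) {
		free(d->q);
		free(d);
		devices_freed++;
	}
}

/* Create a device holding one reference owned by the caller. */
static struct device *device_new(void)
{
	struct device *d = calloc(1, sizeof(*d));
	struct queue *q = calloc(1, sizeof(*q));

	if (!d || !q)
		abort();
	d->refcount = 1;
	d->q = q;
	return d;
}

/* Create a command that holds its own reference on the device. */
static struct command *command_new(struct device *d)
{
	struct command *c;

	if (device_get(d))
		return NULL;
	c = calloc(1, sizeof(*c));
	if (!c)
		abort();
	c->dev = d;
	return c;
}

/* Release a command, dropping the reference it held on the device. */
static void command_put(struct command *c)
{
	struct device *d = c->dev;

	free(c);
	device_put(d);
}

static void run_queue(struct queue *q)
{
	q->runs++;
	queue_runs++;
}

/* Same ordering as the patch: get, put command, run queue, put. */
static int next_command(struct command *c)
{
	struct device *d = c->dev;
	struct queue *q = d->q;

	if (device_get(d)) {	/* device is going away: do not touch q */
		command_put(c);
		return 0;
	}
	command_put(c);		/* would have been the last reference */
	run_queue(q);		/* safe: our own reference keeps q alive */
	device_put(d);		/* the device may be freed from here on */
	return 1;
}
```

Without the device_get/device_put pair in next_command, command_put
could free the device and its queue before run_queue touches q, which
is the kind of use-after-free the oops quoted below points at.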

*** ../base/linux-2.6.14-rc4/drivers/scsi/scsi_lib.c	Mon Oct 10 20:19:19 2005
--- drivers/scsi/scsi_lib.c	Thu Nov  3 13:30:03 2005
***************
*** 592,601 ****
  
  void scsi_next_command(struct scsi_cmnd *cmd)
  {
! 	struct request_queue *q = cmd->device->request_queue;
  
  	scsi_put_command(cmd);
  	scsi_run_queue(q);
  }
  
  void scsi_run_host_queues(struct Scsi_Host *shost)
--- 592,611 ----
  
  void scsi_next_command(struct scsi_cmnd *cmd)
  {
! 	struct scsi_device *sdev = cmd->device;
! 	struct request_queue *q = sdev->request_queue;
! 
! 	// need to hold a reference on the device before we let go of the cmd
! 	if (scsi_device_get(sdev)) {
! 		scsi_put_command(cmd);
! 		return; // maybe sdev_state == SDEV_CANCEL, SDEV_DEL
! 	}
  
  	scsi_put_command(cmd);
  	scsi_run_queue(q);
+ 
+ 	// ok to remove device now
+ 	scsi_device_put(sdev);
  }
  
  void scsi_run_host_queues(struct Scsi_Host *shost)

> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org
> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Andrew Morton
> Sent: Monday, November 07, 2005 11:41 PM
> To: Masanari Iida
> Cc: linux-kernel@vger.kernel.org;
> linux-usb-devel@lists.sourceforge.net; linux-scsi@vger.kernel.org
> Subject: Re: oops with USB Storage on 2.6.14
>
> Masanari Iida wrote:
> >
> > Hello,
> > I updated my system's kernel from 2.6.13.2 to 2.6.14,
> > then it oops when I connect my Digital Camera via USB connection
> > as USB storage device.
> > I went back to 2.6.14-rc1, still the same panic happen.
> > 2.6.13.2 and before, the kernel has been worked as expected.
> >
> > CPU Intel P4(2.4Ghz)
> > USB Device Pentax Optio S40.
> >
> > Unable to handle kernel paging request at virtual address dc9d1f4c
> > printing eip:
> > c02b44cc
> > *pde = 00073067
> > *pte = 1c9d1000
> > Oops: 0000 [#1]
> > SMP DEBUG_PAGEALLOC
> > Modules linked in: autofs e100 ipt_LOG ipt_state ip_conntrack ipt_recent iptable_filter ip_tables video rtc
> > CPU: 1
> > EIP: 0060:[] Not tainted VLI
> > EFLAGS: 00010286 (2.6.14)
> > EIP is at scsi_run_queue+0xc/0xd0
> > eax: 00000001 ebx: dc9d1e3c ecx: d6b67910 edx: dc9d1e3c
> > esi: d5048eb0 edi: dc9d1e3c ebp: c1507e98 esp: c1507e84
> > ds: 007b es: 007b ss: 0068
> > Process ksoftirqd/1 (pid: 6, threadinfo=c1506000 task=dfe2dad0)
> > Stack: 00000292 de3a7bf8 dc9d1e3c d5048eb0 dc9d1e3c c1507ea8 c02b4612 dc9d1e3c
> >        da51bf60 c1507ecc c02b473f d5048eb0 00000000 00000024 00000286 00000001
> >        d5048eb0 00000000 c1507f10 c02b4b2e d5048eb0 00000000 00000024 00000001
> >
> > Call Trace:
> > [] show_stack+0x7f/0xa0
> > [] show_registers+0x162/0x1d0
> > [] die+0x100/0x1a0
> > [] do_page_fault+0x31e/0x640
> > [] error_code+0x4f/0x54
> > [] scsi_next_command+0x22/0x30
> > [] scsi_end_request+0xcf/0xf0
> > [] scsi_io_completion+0x26e/0x470
> > [] scsi_generic_done+0x37/0x50
> > [] scsi_finish_command+0x85/0xa0
> > [] scsi_softirq+0xcc/0x140
> > [] __do_softirq+0xd5/0xf0
> > [] do_softirq+0x38/0x40
> > [] ksoftirqd+0x95/0xe0
> > [] kthread+0xba/0xc0
> > [] kernel_thread_helper+0x5/0x18
> > Code: f0 8b 42 44 e8 16 7f 0e 00 89 45 ec 89 1c 24 e8 6b b7 ff ff eb aa 89 f6 8d
> > bc 27 00 00 00 00 55 89 e5 57 56 53 83 ec 08 8b 55 08 <8b> 82 10 01 00 00 8b 38
> > f6 80 85 01 00 00 80 0f 85 9e 00 00 00
> > <0>Kernel panic - not syncing: Fatal exception in interrupt
> >
>
> Has there been any progress on this?
>
> If not, can you please test the latest snapshot from
> ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots and if
> it still fails, raise a bug at bugzilla.kernel.org?
>
> Thanks.