Date: Thu, 27 Dec 2007 12:53:38 -0500
From: David Dillow
Subject: Re: list corruption on ib_srp load in v2.6.24-rc5
To: FUJITA Tomonori
Cc: tomof@acm.org, linux-kernel@vger.kernel.org, general@lists.openfabrics.org, pw@osc.edu
Organization: Oak Ridge National Laboratory
Message-id: <1198778018.9960.1.camel@lap75545.ornl.gov>
In-reply-to: <20071227115817I.fujita.tomonori@lab.ntt.co.jp>
References: <1198275532.9979.43.camel@lap75545.ornl.gov> <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> <20071227115817I.fujita.tomonori@lab.ntt.co.jp>

On Thu, 2007-12-27 at 11:58 +0900, FUJITA Tomonori wrote:
> On Wed, 26 Dec 2007 12:14:11 -0500
> David Dillow wrote:
>
> >
> > On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> > > transport_container_unregister(&i->rport_attr_cont) should not fail here.
> > >
> > > It fails because there is still a srp rport.
> > >
> > > I think that as Pete pointed out, srp_remove_one needs to call
> > > srp_remove_host.
> > >
> > > Can you try this?
> >
> > That patch oopsed in scsi_remove_host(), but reversing the order has
> > survived over 500 insert/probe/remove cycles.
>
> Thanks,
>
> Can you post the oops message? The srp class might have bugs related
> to it.

This is the oops generated by doing srp_remove_host() prior to
scsi_remove_host() in 2.6.24-rc5:

Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP:
 [] klist_del+0xa/0x46
PGD 8450d8067 PUD 843cbd067 PMD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in: sg sd_mod ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_mod ib_cm ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core ehci_hcd ohci_hcd nfs lockd nfs_acl sunrpc unionfs forcedeth
Pid: 2450, comm: rmmod Not tainted 2.6.24-rc5 #2
RIP: 0010:[]  [] klist_del+0xa/0x46
RSP: 0018:ffff81084192bd28  EFLAGS: 00010282
RAX: ffff81084600b000 RBX: 0000000000000000 RCX: ffffe2001ce562c8
RDX: 0000000000000000 RSI: ffff810447c1d000 RDI: ffff81084657f050
RBP: ffff81084657f028 R08: ffff810447c1d000 R09: ffff8108455a1800
R10: ffff8108455a1800 R11: ffff810846730808 R12: ffff81084657f050
R13: ffff810844c4a170 R14: ffff81084657f028 R15: 0000000000000880
FS:  00002afbf1b0b6e0(0000) GS:ffff810846531840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000843c56000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2450, threadinfo ffff81084192a000, task ffff810844d47620)
Stack:  ffff810844c4a000 ffff81084657f028 ffff81084657f000 ffffffff8114cbd6
 ffff810846730808 ffff810844c4a000 ffff81084657f028 ffff81084657f000
 0000000000000246 ffffffff88118322 ffff8108455a1800 ffff81084657f000
Call Trace:
 [] device_del+0x20/0x2f0
 [] :scsi_mod:scsi_target_reap_usercontext+0x53/0xbd
 [] execute_in_process_context+0x20/0x47
 [] :scsi_mod:scsi_device_dev_release_usercontext+0xd3/0x105
 [] execute_in_process_context+0x20/0x47
 [] kobject_cleanup+0x2f/0x51
 [] kobject_release+0x0/0x9
 [] kref_put+0x74/0x82
 [] :scsi_mod:scsi_forget_host+0x53/0x55
 [] :scsi_mod:scsi_remove_host+0x76/0xf7
 [] :ib_srp:srp_remove_one+0x102/0x19d
 [] :ib_core:ib_unregister_client+0x40/0xb3
 [] :ib_srp:srp_cleanup_module+0xe/0x34
 [] sys_delete_module+0x18d/0x1bc
 [] error_exit+0x0/0x51
 [] system_call+0x7e/0x83


Code: 48 8b 6b 20 48 89 df e8 b7 2f 00 00 4c 89 e7 e8 d2 ff ff ff
RIP  [] klist_del+0xa/0x46
 RSP
CR2: 0000000000000020
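
For anyone who wants to pull the call chain out of a trace like the one above, here is a minimal sketch in Python. It assumes frames are written as symbol+0xOFFSET/0xSIZE with an optional :module: prefix, which is how they appear in this oops. The names FRAME_RE and parse_frames are made up for this illustration and are not part of any kernel tool, and the "(core kernel)" label for frames without a module prefix is just my choice of wording.

```python
import re

# Matches frames such as "klist_del+0xa/0x46" or
# ":scsi_mod:scsi_remove_host+0x76/0xf7" (hypothetical helper, not a kernel tool).
FRAME_RE = re.compile(
    r"(?::(?P<module>\w+):)?"      # optional ":module:" prefix
    r"(?P<symbol>\w+)"             # function name
    r"\+0x(?P<offset>[0-9a-f]+)"   # offset into the function
    r"/0x(?P<size>[0-9a-f]+)"      # total size of the function
)


def parse_frames(text):
    """Return (module, symbol, offset, size) tuples in the order they appear."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append((
            m.group("module"),           # None when no module prefix is present
            m.group("symbol"),
            int(m.group("offset"), 16),
            int(m.group("size"), 16),
        ))
    return frames


# A few frames copied from the trace in this message.
sample = """
device_del+0x20/0x2f0
:scsi_mod:scsi_remove_host+0x76/0xf7
:ib_srp:srp_remove_one+0x102/0x19d
"""

for module, symbol, offset, size in parse_frames(sample):
    where = module or "(core kernel)"
    print(f"{where}: {symbol} at +{offset:#x} of {size:#x} bytes")
```

Feeding the whole "Call Trace:" section to parse_frames() gives the frames innermost first, so the last module-prefixed entries show which driver entry point started the teardown.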