Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262844AbUCRSZP (ORCPT ); Thu, 18 Mar 2004 13:25:15 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262846AbUCRSZO (ORCPT ); Thu, 18 Mar 2004 13:25:14 -0500 Received: from e2.ny.us.ibm.com ([32.97.182.102]:8173 "EHLO e2.ny.us.ibm.com") by vger.kernel.org with ESMTP id S262844AbUCRSYo (ORCPT ); Thu, 18 Mar 2004 13:24:44 -0500 Date: Thu, 18 Mar 2004 10:24:10 -0800 (PST) From: Sridhar Samudrala X-X-Sender: sridhar@localhost.localdomain To: rusty@rustcorp.com.au cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: OOPS when force unloading sctp with CONFIG_DEBUG_SLAB enabled In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5540 Lines: 106 I took a look at the changes to module unload code(sys_delete_module) after 2.6.3. One difference i noticed was that mod->waiter is being set to the kthread that runs __try_stop_module() instead of the thread that calls sys_delete_module(). The following patch fixed the oops. diff -Nru a/kernel/module.c b/kernel/module.c --- a/kernel/module.c Thu Mar 18 10:17:58 2004 +++ b/kernel/module.c Thu Mar 18 10:17:58 2004 @@ -591,6 +591,10 @@ /* Stop the machine so refcounts can't move and disable module. */ ret = try_stop_module(mod, flags, &forced); + /* Mark it as dying. */ + mod->waiter = current; + mod->state = MODULE_STATE_GOING; + /* Never wait if forced. */ if (!forced && module_refcount(mod) != 0) wait_for_zero_refcount(mod); I am not sure if this is the right or the complete fix, but i think that mod->waiter is definitely being set incorrectly in __try_stop_module(). Thanks Sridhar On Wed, 17 Mar 2004, Sridhar Samudrala wrote: > I am getting the following oops when force unloading sctp (rmmod -f sctp) with > 2.6.5-rc1 and 2.6.4. This happens only when CONFIG_DEBUG_SLAB is enabled. > > This used to work fine until 2.6.3. > > looks like somehow a freed task struct is getting dereferenced in > try_to_wake_up() and it causes oops when debug memory allocations is enabled. > The call sequence seems to be > sys_delete_module > cleanup_module > sock_release > module_put > wake_up_process > try_to_wake_up > > Thanks > Sridhar > > Mar 17 08:31:40 w-sridhar kernel: Unable to handle kernel paging request at virtual address 6b6b6b7b > Mar 17 08:31:40 w-sridhar kernel: printing eip: > Mar 17 08:31:40 w-sridhar kernel: c011ad39 > Mar 17 08:31:40 w-sridhar kernel: *pde = 00000000 > Mar 17 08:31:40 w-sridhar kernel: Oops: 0000 [#1] > Mar 17 08:31:40 w-sridhar kernel: PREEMPT SMP > Mar 17 08:31:40 w-sridhar kernel: CPU: 0 > Mar 17 08:31:40 w-sridhar kernel: EIP: 0060:[] Tainted: GF > Mar 17 08:31:40 w-sridhar kernel: EFLAGS: 00010086 (2.6.5-rc1) > Mar 17 08:31:40 w-sridhar kernel: EIP is at try_to_wake_up+0x29/0x310 > Mar 17 08:31:40 w-sridhar kernel: eax: 6b6b6b6b ebx: c041cc80 ecx: ccc66da4 edx: cdc258a0 > Mar 17 08:31:40 w-sridhar kernel: esi: cdc7c000 edi: c041cc80 ebp: cdc7df18 esp: cdc7def4 > Mar 17 08:31:40 w-sridhar kernel: ds: 007b es: 007b ss: 0068 > Mar 17 08:31:40 w-sridhar kernel: Process rmmod (pid: 1636, threadinfo=cdc7c000 task=cde2a140) > Mar 17 08:31:40 w-sridhar kernel: Stack: d08e2300 cdc7df14 c02bde4c cfcd9304 00000000 00000282 cdf1e5bc cdc7c000 > Mar 17 08:31:40 w-sridhar kernel: d08e2300 cdc7df2c c011b03e cdc258a0 00000007 00000000 cdc7df44 c02bac4a > Mar 17 08:31:40 w-sridhar kernel: cdf1e5bc c03843b8 d08e2300 00000a80 cdc7df54 d08d6f24 cdf1e5bc 00000a80 > Mar 17 08:31:40 w-sridhar kernel: Call Trace: > Mar 17 08:31:40 w-sridhar kernel: [] sk_free+0x6c/0xf0 > Mar 17 08:31:40 w-sridhar kernel: [] wake_up_process+0x1e/0x30 > Mar 17 08:31:40 w-sridhar kernel: [] sock_release+0xea/0xf0 > Mar 17 08:31:40 w-sridhar kernel: [] cleanup_module+0x24/0x1d5 [sctp] > Mar 17 08:31:40 w-sridhar kernel: [] sys_delete_module+0x174/0x1d0 > Mar 17 08:31:40 w-sridhar kernel: [] sys_munmap+0x58/0x80 > Mar 17 08:31:40 w-sridhar kernel: [] syscall_call+0x7/0xb > Mar 17 08:31:40 w-sridhar kernel: > Mar 17 08:31:40 w-sridhar kernel: Code: 8b 40 10 8b 14 85 20 f0 41 c0 ff 46 14 01 d7 31 c0 86 07 84 > Mar 17 08:31:40 w-sridhar kernel: <6>note: rmmod[1636] exited with preempt_count 1 > Mar 17 08:31:40 w-sridhar kernel: Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 > Mar 17 08:31:40 w-sridhar kernel: in_atomic():1, irqs_disabled():0 > Mar 17 08:31:40 w-sridhar kernel: Call Trace: > Mar 17 08:31:40 w-sridhar kernel: [] __might_sleep+0xab/0xd0 > Mar 17 08:31:40 w-sridhar kernel: [] profile_exit_task+0x22/0x60 > Mar 17 08:31:40 w-sridhar kernel: [] do_exit+0x7a/0x610 > Mar 17 08:31:40 w-sridhar kernel: [] do_divide_error+0x0/0xf0 > Mar 17 08:31:40 w-sridhar kernel: [] do_page_fault+0x215/0x58a > Mar 17 08:31:40 w-sridhar kernel: [] recalc_task_prio+0xb4/0x1f0 > Mar 17 08:31:40 w-sridhar kernel: [] schedule+0x3a9/0x7b0 > Mar 17 08:31:40 w-sridhar kernel: [] scheduler_tick+0x43/0x6a0 > Mar 17 08:31:40 w-sridhar kernel: [] do_page_fault+0x0/0x58a > Mar 17 08:31:40 w-sridhar kernel: [] error_code+0x2d/0x38 > Mar 17 08:31:40 w-sridhar kernel: [] atkbd_connect+0x19b/0x420 > Mar 17 08:31:40 w-sridhar kernel: [] try_to_wake_up+0x29/0x310 > Mar 17 08:31:40 w-sridhar kernel: [] sk_free+0x6c/0xf0 > Mar 17 08:31:40 w-sridhar kernel: [] wake_up_process+0x1e/0x30 > Mar 17 08:31:40 w-sridhar kernel: [] sock_release+0xea/0xf0 > Mar 17 08:31:40 w-sridhar kernel: [] cleanup_module+0x24/0x1d5 [sctp] > Mar 17 08:31:40 w-sridhar kernel: [] sys_delete_module+0x174/0x1d0 > Mar 17 08:31:40 w-sridhar kernel: [] sys_munmap+0x58/0x80 > Mar 17 08:31:40 w-sridhar kernel: [] syscall_call+0x7/0xb > > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/