Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751498AbWHZBYc (ORCPT ); Fri, 25 Aug 2006 21:24:32 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751578AbWHZBYc (ORCPT ); Fri, 25 Aug 2006 21:24:32 -0400
Received: from wx-out-0506.google.com ([66.249.82.238]:307 "EHLO wx-out-0506.google.com")
	by vger.kernel.org with ESMTP id S1751498AbWHZBYb (ORCPT );
	Fri, 25 Aug 2006 21:24:31 -0400
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
	h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
	b=Gkk2+hfn0Kl9YbKxEAyS0RjgWe63C5cUuNvCRtSjJ1iKnHWpUNvB8NGf1orVpQKCxUqduGpnRo8f1JsXRwnL8DZcD8XFlB0XRwdDNg2ttH35LPmhKVWswpC55edSs7XEoyC4TfzEEDv00SAIdcx5YJNZa8w4wf91g9IUoF9ODE0=
Message-ID:
Date: Fri, 25 Aug 2006 18:24:29 -0700
From: "Robert Crocombe"
To: "hui Bill Huey"
Subject: Re: rtmutex assert failure (was [Patch] restore the RCU callback...)
Cc: "Esben Nielsen", "Ingo Molnar", "Thomas Gleixner", rostedt@goodmis.org, linux-kernel
In-Reply-To: <20060825071957.GA30720@gnuppy.monkey.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20060818115934.GA29919@gnuppy.monkey.org>
	<20060822013722.GA628@gnuppy.monkey.org>
	<20060822232051.GA8991@gnuppy.monkey.org>
	<20060823202046.GA17267@gnuppy.monkey.org>
	<20060823210558.GA17606@gnuppy.monkey.org>
	<20060823210842.GB17606@gnuppy.monkey.org>
	<20060824014658.GB19314@gnuppy.monkey.org>
	<20060825071957.GA30720@gnuppy.monkey.org>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 8098
Lines: 195

On 8/25/06, hui Bill Huey wrote:
> http://mmlinux.sourceforge.net/public/against-2.6.17-rt8-2.diff

It compiled a kernel through once, but died on a 2nd try.  The trace
is the usual one, I believe.

kjournald/1061[CPU#1]: BUG in debug_rt_mutex_unlock at kernel/rtmutex-debug.c:471

Call Trace:
       {_raw_spin_lock_irqsave+29} {__WARN_ON+123}
       {__WARN_ON+54} {debug_rt_mutex_unlock+199}
       {rt_lock_slowunlock+25} {__lock_text_start+9}
       {kmem_cache_alloc+202} {mempool_alloc_slab+17}
       {mempool_alloc+75} {_raw_spin_unlock+46}
       {rt_lock_slowunlock+65} {bio_alloc_bioset+35}
       {bio_clone+29} {clone_bio+51}
       {__split_bio+399} {dm_request+453}
       {generic_make_request+375} {submit_bio+192}
       {submit_bh+262} {ll_rw_block+161}
       {journal_commit_transaction+990} {_raw_spin_unlock+46}
       {rt_lock_slowunlock+65} {__lock_text_start+9}
       {try_to_del_timer_sync+85} {kjournald+199}
       {autoremove_wake_function+0} {kjournald+0}
       {keventd_create_kthread+0} {kthread+219}
       {schedule_tail+185} {child_rip+8}
       {keventd_create_kthread+0} {kthread+0}
       {child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [] .... _raw_spin_lock+0x16/0x23
.....[] ..   ( <= rt_lock_slowunlock+0x11/0x6b)
.. [] .... _raw_spin_lock_irqsave+0x1d/0x2e
.....[] ..   ( <= __WARN_ON+0x36/0xa2)

And I should've noted this before, but in general, I use icecream

http://en.opensuse.org/Icecream

to compile stuff.  I have been able to reproduce the identical problem
without it, but it is another variable.  Sorry for the omission.

I also feel like I should be doing more, so I built a vanilla
2.6.17-rt8 kernel with the more limited set of config options I've
used on the most recent patches, but I also built this as a
uni-processor config.

On the uni-processor kernel with icecream disabled and while building
a kernel with a simple 'make', I received the following BUG, but the
machine kept on going:

BUG: scheduling while atomic: make/0x00000001/20884

Call Trace:
       {plist_check_head+60} {__schedule+155}
       {task_blocks_on_rt_mutex+643} {__mod_page_state_offset+25}
       {find_task_by_pid_type+24} {__mod_page_state_offset+25}
       {schedule+236} {rt_lock_slowlock+416}
       {rt_lock+13} {__mod_page_state_offset+25}
       {__free_pages_ok+336} {__free_pages+47}
       {free_pages+128} {free_task+26}
       {__put_task_struct+182} {thread_return+163}
       {schedule+236} {pipe_wait+111}
       {autoremove_wake_function+0} {rt_mutex_lock+50}
       {pipe_readv+785} {rt_lock_slowunlock+98}
       {__lock_text_start+9} {pipe_read+30}
       {vfs_read+171} {sys_read+71}
       {system_call+126}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [] .... __schedule+0xb3/0x57b
.....[] ..   ( <= schedule+0xec/0x11a)

Of 4 total attempts this way, two completed with no problems, one had
the BUG above, and yet another had two BUGs (below), but also
continued.

BUG: scheduling while atomic: make/0x00000001/14632

Call Trace:
       {plist_check_head+60} {__schedule+155}
       {task_blocks_on_rt_mutex+643} {__mod_page_state_offset+25}
       {find_task_by_pid_type+24} {__mod_page_state_offset+25}
       {schedule+236} {rt_lock_slowlock+416}
       {rt_lock+13} {__mod_page_state_offset+25}
       {__free_pages_ok+336} {__free_pages+47}
       {free_pages+128} {free_task+26}
       {__put_task_struct+182} {thread_return+163}
       {schedule+236} {pipe_wait+111}
       {autoremove_wake_function+0} {rt_mutex_lock+50}
       {pipe_readv+785} {rt_lock_slowunlock+98}
       {__lock_text_start+9} {pipe_read+30}
       {vfs_read+171} {sys_read+71}
       {system_call+126}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [] .... __schedule+0xb3/0x57b
.....[] ..   ( <= schedule+0xec/0x11a)

BUG: scheduling while atomic: make/0x00000001/23043

Call Trace:
       {plist_check_head+60} {__schedule+155}
       {task_blocks_on_rt_mutex+643} {free_pages_bulk+42}
       {find_task_by_pid_type+24} {free_pages_bulk+42}
       {schedule+236} {rt_lock_slowlock+416}
       {rt_lock+13} {free_pages_bulk+42}
       {__free_pages_ok+438} {__free_pages+47}
       {free_pages+128} {free_task+26}
       {__put_task_struct+182} {schedule_tail+152}
       {ret_from_fork+5}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [] .... rt_lock_slowunlock+0x16/0xba
.....[] ..   ( <= __lock_text_start+0x9/0xb)

I also tried the compile with icecream running, and the machine simply
wedged during each of three attempts, once almost instantly.  I was
able to use the sysrq keys (but nothing else: I couldn't get the
capslock light when I toggled that key), and I dumped all the task
info.  It's A LOT, since I was building with 'make -j40' or something.
I will probably try to put it up on my little webserver once I get
home (I have no mechanism here).

Is there anything else you want me to try?  I can probably wave a
chicken over the machine or get some black candles or something.

-- 
Robert Crocombe
rcrocomb@gmail.com