Date: Wed, 20 Apr 2011 10:36:16 +0200
From: Ingo Molnar
To: Dave Jones, Linux Kernel, x86@kernel.org, "Paul E. McKenney", Peter Zijlstra
Subject: Re: rcu stall.
Message-ID: <20110420083616.GA1124@elte.hu>
References: <20110420020215.GA30081@redhat.com>
In-Reply-To: <20110420020215.GA30081@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Dave Jones wrote:

> Machine was under heavy load (300 or so running processes
> calling random system calls). The rcu stall detector kicked in,
> spewed this, and then the machine completely locked up.

Without having looked at it in detail, isn't this a lockup somewhere in the
wireless code:

> [] ? simple_release_fs+0x22/0x57
> [] ? arch_local_irq_restore+0x6/0xd
> [] lock_acquired+0x20f/0x21e
> [] _raw_spin_lock+0x62/0x6a
> [] ? simple_release_fs+0x22/0x57
> [] ? _raw_spin_unlock+0x28/0x2c
> [] simple_release_fs+0x22/0x57
> [] debugfs_remove_recursive+0x11f/0x16b
> [] ieee80211_debugfs_key_remove+0x1f/0x2e [mac80211]
> [] __ieee80211_key_destroy+0x61/0x6d [mac80211]
> [] ieee80211_key_link+0x12c/0x165 [mac80211]
> [] ieee80211_add_key+0xfb/0x133 [mac80211]
> [] nl80211_new_key+0xe5/0x106 [cfg80211]
> [] ? cfg80211_get_dev_from_ifindex+0x72/0x7a [cfg80211]
> [] genl_rcv_msg+0x1dc/0x207
> [] ? genl_rcv+0x2d/0x2d
> [] netlink_rcv_skb+0x43/0x8f
> [] genl_rcv+0x26/0x2d
> [] netlink_unicast+0xec/0x156
> [] netlink_sendmsg+0x27f/0x2c0
> [] __sock_sendmsg+0x69/0x75
> [] sock_sendmsg+0xa1/0xb6
> [] ? lock_release+0x181/0x18e
> [] ? might_fault+0xa5/0xac
> [] ? might_fault+0x5c/0xac
> [] ? copy_from_user+0x2f/0x31
> [] ? copy_from_user+0x2f/0x31
> [] ? verify_iovec+0x52/0xa6
> [] sys_sendmsg+0x23a/0x2b8
> [] ? lock_acquire+0xec/0xfb
> [] ? lock_release+0x181/0x18e
> [] ? mntput+0x26/0x28
> [] ? fput+0x1e6/0x1f5
> [] ? path_put+0x1f/0x23
> [] ? audit_syscall_entry+0x11c/0x148
> [] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [] system_call_fastpath+0x16/0x1b

RCU stall detector is simply the first thing that noticed the hang. Enabling
the regular lockup detector would probably have resulted in a similar-looking
hang.

Thanks,

	Ingo