From: Rolf Eike Beer
To: Peter Zijlstra
Cc: Borislav Petkov, linux-kernel@vger.kernel.org, dhowells@redhat.com, "Paul E. McKenney"
Subject: Re: Hard lockups using 3.10.0
Date: Sun, 11 Aug 2013 08:09:19 +0200
Message-ID: <24155824.fdsQYPMaDK@donald.sf-tec.de>
In-Reply-To: <20130711105207.GE25631@dyad.programming.kicks-ass.net>
References: <8484013.LsABBJRIOx@devpool02> <20130711100721.GA28131@pd.tnic> <20130711105207.GE25631@dyad.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Peter Zijlstra wrote:
> On Thu, Jul 11, 2013 at 12:07:21PM +0200, Borislav Petkov wrote:
> > On Thu, Jul 11, 2013 at 11:38:37AM +0200, Rolf Eike Beer wrote:
> > > Hi,
> > >
> > > I'm running 3.10.0 (from openSUSE packages) on an "Intel(R) Core(TM)
> > > i7-2600 CPU @ 3.40GHz". I got a hard lockup on one of my CPUs twice,
> > > once with backtrace (see attached image). Graphics is the builtin
> > > Intel, used with X 7.6 and KDE 4.10beta2 (basically current openSUSE
> > > 12.3+KDE).
> > >
> > > I'm not aware that I had done anything special, just "normal" desktop and
> > > development usage, but no heavy compile work at the moment the lockups
> > > happened.
> >
> > Hmm, I can see commit_creds() doing some rcu pointers assignment and rcu
> > calling into the scheduler which screams about a cpu runqueue of the
> > task we're about to reschedule not being locked. Let's add some more
> > people who should know better.
>
> Ok, for the other people too lazy to bother finding the picture:
>
>   http://marc.info/?l=linux-kernel&m=137353587012001&q=p3
>
> So we bug at:
>
>   kernel/sched/core.c:519 assert_raw_spin_locked(&task_rq(p)->lock);
>
> and get there through:
>
>   resched_task()
>   check_preempt_wakeup()
>   check_preempt_curr()
>   try_to_wake_up()
>   autoremove_wake_function()
>   __call_rcu_nocb_enqueue()
>   __call_rcu()
>   commit_creds()
>   ____call_usermodehelper()
>   ret_from_fork()
>
> That don't make much sense though. Since:
>
>   try_to_wake_up()
>     ttwu_queue()
>       raw_spin_lock(&rq->lock)
>       ttwu_do_activate()
>         ttwu_do_wakeup()
>           check_preempt_curr()
>             check_preempt_wakeup()
>               resched_task(rq->curr)
>                 assert_raw_spin_locked(task_rq(p)->lock)
>
> It would somehow mean that 'task_rq(rq->curr) != rq', that's completely
> bonkers, we do after all have rq->lock locked.
>
> I must also say that I've _never_ seen this bug before.

Meanwhile I found that there was a hardware defect on this machine. So if the
lockup does not happen again, I will assume that the defect caused it.

Thanks for looking into this.

Eike
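
For readers following the quoted analysis, here is a minimal user-space sketch of the invariant it relies on: the wakeup path takes rq->lock and then checks the lock of the runqueue that rq->curr points back to, so the check can only fail if that back-pointer names a different runqueue. This is not kernel code. Every name here (toy_rq, toy_task, toy_resched_check, toy_wakeup_path) is hypothetical, and the lock is modeled as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_task;

/* Hypothetical stand-in for a per-CPU runqueue. */
struct toy_rq {
    bool locked;            /* models rq->lock being held */
    struct toy_task *curr;  /* task currently running on this runqueue */
};

/* Hypothetical stand-in for a task; rq models what task_rq(p) would return. */
struct toy_task {
    struct toy_rq *rq;
};

/* Models assert_raw_spin_locked(&task_rq(p)->lock): true if the check would pass. */
static bool toy_resched_check(const struct toy_task *p)
{
    return p->rq != NULL && p->rq->locked;
}

/*
 * Models the quoted wakeup path: lock rq, run the resched check on rq->curr,
 * then unlock. Returns whether the lock assertion would have held.
 */
static bool toy_wakeup_path(struct toy_rq *rq)
{
    bool ok;

    assert(rq != NULL && rq->curr != NULL);

    rq->locked = true;                 /* raw_spin_lock(&rq->lock) */
    ok = toy_resched_check(rq->curr);  /* resched_task(rq->curr) */
    rq->locked = false;                /* raw_spin_unlock(&rq->lock) */
    return ok;
}
```

In this model the check passes whenever rq->curr->rq == rq and fails only when the back-pointer has been changed to another, unlocked runqueue. That is the 'task_rq(rq->curr) != rq' state the quoted analysis calls impossible on a healthy machine, which fits the hardware defect reported above.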