Date: Tue, 2 Dec 2014 09:04:07 -0800
From: "Paul E. McKenney"
To: Dâniel Fraga
Cc: Linus Torvalds, Linux Kernel Mailing List
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141202170407.GK25340@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
In-Reply-To: <547dec29.c71f8c0a.33d1.11d9@mx.google.com>
References: <20141127225637.GA24019@redhat.com>
 <547b8a45.6e608c0a.20f9.1002@mx.google.com>
 <547bbe36.48548c0a.105c.779c@mx.google.com>
 <20141201191431.GA17385@linux.vnet.ibm.com>
 <547ccf74.a5198c0a.25de.26d9@mx.google.com>
 <20141201230813.GE25340@linux.vnet.ibm.com>
 <547dec29.c71f8c0a.33d1.11d9@mx.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 02, 2014 at 02:43:17PM -0200, Dâniel Fraga wrote:
> On Mon, 1 Dec 2014 15:08:13 -0800
> "Paul E. McKenney" wrote:
>
> > Well, this turned out to be way simpler than I expected. Passes
> > light rcutorture testing. Sometimes you get lucky...
>
> Linus, Paul and others, I finally got a call trace with
> only CONFIG_TREE_PREEMPT_RCU *disabled* using Paul's patch (to trigger
> it I compiled PHP with make -j8).

Is it harder to reproduce with CONFIG_PREEMPT=y and
CONFIG_TREE_PREEMPT_RCU=n?

If it is a -lot- harder to reproduce, it might be worth bisecting among
the RCU read-side critical sections.
If making a few of them be non-preemptible greatly reduces the
probability of the bug occurring, that might provide a clue about root
cause. On the other hand, if it is just a little harder to reproduce,
this RCU read-side bisection would likely be an exercise in futility.

Thanx, Paul

> Dec 2 14:24:39 tux kernel: [ 8475.941616] conftest[9730]: segfault at 0 ip 0000000000400640 sp 00007fffa67ab300 error 4 in conftest[400000+1000]
> Dec 2 14:24:40 tux kernel: [ 8476.104725] conftest[9753]: segfault at 0 ip 00007f6863024906 sp 00007fff0e31cc48 error 4 in libc-2.19.so[7f6862efe000+1a1000]
> Dec 2 14:25:54 tux kernel: [ 8550.791697] INFO: rcu_sched detected stalls on CPUs/tasks: { 4} (detected by 0, t=60002 jiffies, g=112854, c=112853, q=0)
> Dec 2 14:25:54 tux kernel: [ 8550.791702] Task dump for CPU 4:
> Dec 2 14:25:54 tux kernel: [ 8550.791703] cc1 R running task 0 14344 14340 0x00080008
> Dec 2 14:25:54 tux kernel: [ 8550.791706] 000000001bcebcd8 ffff880100000003 ffffffff810cb7f1 ffff88021f5f5c00
> Dec 2 14:25:54 tux kernel: [ 8550.791708] ffff88011bcebfd8 ffff88011bcebce8 ffffffff811fb970 ffff8802149a2a00
> Dec 2 14:25:54 tux kernel: [ 8550.791710] ffff8802149a2cc8 ffff88011bcebd28 ffffffff8103e979 ffff88020ed01398
> Dec 2 14:25:54 tux kernel: [ 8550.791712] Call Trace:
> Dec 2 14:25:54 tux kernel: [ 8550.791718] [] ? release_pages+0xa1/0x1e0
> Dec 2 14:25:54 tux kernel: [ 8550.791722] [] ? cpumask_any_but+0x30/0x40
> Dec 2 14:25:54 tux kernel: [ 8550.791725] [] ? flush_tlb_page+0x49/0xf0
> Dec 2 14:25:54 tux kernel: [ 8550.791727] [] ? lru_cache_add_active_or_unevictable+0x22/0x90
> Dec 2 14:25:54 tux kernel: [ 8550.791731] [] ? alloc_pages_vma+0x72/0x130
> Dec 2 14:25:54 tux kernel: [ 8550.791733] [] ? lru_cache_add_active_or_unevictable+0x22/0x90
> Dec 2 14:25:54 tux kernel: [ 8550.791735] [] ? handle_mm_fault+0x3a0/0xaf0
> Dec 2 14:25:54 tux kernel: [ 8550.791737] [] ? __do_page_fault+0x224/0x4c0
> Dec 2 14:25:54 tux kernel: [ 8550.791740] [] ? new_sync_write+0x7c/0xb0
> Dec 2 14:25:55 tux kernel: [ 8550.791743] [] ? fsnotify+0x27c/0x350
> Dec 2 14:25:55 tux kernel: [ 8550.791746] [] ? rcu_eqs_enter+0x93/0xa0
> Dec 2 14:25:55 tux kernel: [ 8550.791748] [] ? rcu_user_enter+0xe/0x10
> Dec 2 14:25:55 tux kernel: [ 8550.791749] [] ? do_page_fault+0x5a/0x70
> Dec 2 14:25:55 tux kernel: [ 8550.791752] [] ? page_fault+0x22/0x30
>
> If you need more info/testing, just ask.
>
> --
> Linux 3.17.0-dirty: Shuffling Zombie Juror
> http://www.youtube.com/DanielFragaBR
> http://exchangewar.info
> Bitcoin: 12H6661yoLDUZaYPdah6urZS5WiXwTAUgL
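
For anyone comparing reproduction rates across configurations, as the reply suggests, it may help to pull the stall summary out of each log mechanically. What follows is only a rough sketch, not anything from the thread: the regular expression is inferred from the single "rcu_sched detected stalls" line quoted above, the function name parse_stall and the dictionary keys are made up for illustration, and other kernel versions print this line differently. The reading of g as the current grace-period number, c as the last completed one, and q as the count of queued callbacks is my understanding of Documentation/RCU/stallwarn.txt for kernels of that era and should be checked against the tree in use.

```python
import re

# Layout assumed from the one sample line in the quoted log; this is not a
# stable kernel interface and will not match other kernel versions.
STALL_RE = re.compile(
    r"INFO: (?P<flavor>\w+) detected stalls on CPUs/tasks: "
    r"\{(?P<cpus>[^}]*)\} "
    r"\(detected by (?P<detector>\d+), t=(?P<t>\d+) jiffies, "
    r"g=(?P<g>-?\d+), c=(?P<c>-?\d+), q=(?P<q>\d+)\)"
)


def parse_stall(line):
    """Return the fields of a stall summary line as a dict, or None."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {
        "flavor": m.group("flavor"),
        # Keep only plain CPU numbers; task entries (if any) are skipped.
        "cpus": [int(tok) for tok in m.group("cpus").split() if tok.isdigit()],
        "detector": int(m.group("detector")),
        "jiffies": int(m.group("t")),
        "gp": int(m.group("g")),         # assumed: current grace period
        "completed": int(m.group("c")),  # assumed: last completed grace period
        "queued": int(m.group("q")),     # assumed: callbacks queued
    }


sample = (
    "Dec 2 14:25:54 tux kernel: [ 8550.791697] INFO: rcu_sched detected "
    "stalls on CPUs/tasks: { 4} (detected by 0, t=60002 jiffies, "
    "g=112854, c=112853, q=0)"
)
info = parse_stall(sample)
print(info)
```

Counting how many such records show up per hour of "make -j8" under each configuration would give a crude number for the "a -lot- harder" versus "a little harder" comparison.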