Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S934616AbaFSUdI (ORCPT ); Thu, 19 Jun 2014 16:33:08 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:51156 "EHLO
    aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S932071AbaFSUdH (ORCPT );
    Thu, 19 Jun 2014 16:33:07 -0400
Message-ID: <53A348E6.3050404@oracle.com>
Date: Thu, 19 Jun 2014 16:32:38 -0400
From: Sasha Levin
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: paulmck@linux.vnet.ibm.com, Thomas Gleixner
CC: Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton,
    Dave Jones, "linux-mm@kvack.org", LKML
Subject: Re: slub/debugobjects: lockup when freeing memory
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com>
In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Source-IP: ucsinet22.oracle.com [156.151.31.94]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/19/2014 04:29 PM, Paul E. McKenney wrote:
> On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote:
>> On Thu, 19 Jun 2014, Paul E. McKenney wrote:
>>
>>> On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
>>>> On Thu, 19 Jun 2014, Sasha Levin wrote:
>>>>
>>>>> [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
>>>>> [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
>>>>> [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
>>>>> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
>>>>> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
>>>>> [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> [ 690.770137] debug_object_init (lib/debugobjects.c:365)
>>>>> [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
>>>>> [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
>>>>> [ 690.770137] ? discard_slab (mm/slub.c:1486)
>>>>> [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
>>>>
>>>> __call_rcu does a slab allocation? This means __call_rcu can no longer be
>>>> used in slab allocators? What happened?
>>>
>>> My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
>>> call_rcu_bh(), or call_srcu().
>>>
>>> Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
>>> That would be unfortunate...
>>
>> Well, no. Look at the callchain:
>>
>> __call_rcu
>>   debug_object_activate
>>     rcuhead_fixup_activate
>>       debug_object_init
>>         kmem_cache_alloc
>>
>> So call rcu activates the object, but the object has no reference in
>> the debug objects code so the fixup code is called which inits the
>> object and allocates a reference ....
> OK, got it. And you are right, call_rcu() has done this for a very
> long time, so not sure what changed.

It's probably my fault. I've introduced clone() and unshare() fuzzing.
Those two are full of issues, and I had been holding off on enabling
them until the rest of the kernel could survive trinity for more than
an hour.

Thanks,
Sasha