Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1759218AbYBTGzE (ORCPT ); Wed, 20 Feb 2008 01:55:04 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1753985AbYBTGyy (ORCPT ); Wed, 20 Feb 2008 01:54:54 -0500
Received: from mga10.intel.com ([192.55.52.92]:25047 "EHLO
        fmsmga102.fm.intel.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
        with ESMTP id S1753847AbYBTGyw (ORCPT );
        Wed, 20 Feb 2008 01:54:52 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.25,380,1199692800"; d="scan'208";a="521442383"
Subject: Re: Linux 2.6.25-rc2
From: "Zhang, Yanmin"
To: Pekka Enberg
Cc: Ingo Molnar, Mathieu Desnoyers, Torsten Kaiser, Linus Torvalds,
        Linux Kernel Mailing List, Christoph Lameter
In-Reply-To: <1203473339.3248.45.camel@ymzhang>
References: <64bb37e0802161338j306c1357m25bc224f09e6b7cd@mail.gmail.com>
        <20080219061107.GA23229@elte.hu>
        <64bb37e0802182254l49b10cbblc23f8a83d189ff8e@mail.gmail.com>
        <84144f020802182321x452888bai639c71ea2a5067da@mail.gmail.com>
        <20080219140230.GA32236@Krystal>
        <84144f020802190621s509dbe7gc8e5609d94aca9b4@mail.gmail.com>
        <84144f020802190638i4a364d19o8986a457e76ec187@mail.gmail.com>
        <20080219145554.GE21176@elte.hu>
        <47BAFB42.9000806@cs.helsinki.fi>
        <1203467785.3248.42.camel@ymzhang>
        <1203473339.3248.45.camel@ymzhang>
Content-Type: text/plain; charset=utf-8
Date: Wed, 20 Feb 2008 14:53:10 +0800
Message-Id: <1203490390.3248.48.camel@ymzhang>
Mime-Version: 1.0
X-Mailer: Evolution 2.9.2 (2.9.2-2.fc7)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2862
Lines: 67

On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote:
> On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote:
> > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote:
> > > Ingo Molnar wrote:
> > > > * Pekka Enberg wrote:
> > > >
> > > >>> Yes, this can happen.
> > > >>> Are you saying it is not safe to be in the
> > > >>> lockless path when an IRQ triggers?
> > > >> Hmm. The barrier() in slab_free() looks fishy. The comment says it's
> > > >> there to make sure we've retrieved c->freelist before c->page but then
> > > >> it uses a _compiler barrier_ which doesn't affect the CPU and the
> > > >> reads may still be re-ordered... Not sure if that matters here though.
> > > >
> > > > find a fix patch for that below - most systems affected seem to be SMP
> > > > ones.
> > > >
> > > > If this (or my other patch) indeed solves the problem i'd still favor a
> > > > full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks
> > > > quite un-cooked and quite un-tested for multiple independent reasons.
> > > >
> > > > Sigh, why do i again have to be the messenger who brings the bad news to
> > > > SLUB land, and again when poor Christoph went on vacation? :-/
> > > >
> > > > 	Ingo
> > > >
> > > > -------------------------->
> > > > Subject: SLUB: barrier fix
> > > > From: Ingo Molnar
> > > >
> > > > ---
> > > >  mm/slub.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux/mm/slub.c
> > > > ===================================================================
> > > > --- linux.orig/mm/slub.c
> > > > +++ linux/mm/slub.c
> > > > @@ -1862,7 +1862,7 @@ static __always_inline void slab_free(st
> > > >  	debug_check_no_locks_freed(object, s->objsize);
> > > >  	do {
> > > >  		freelist = c->freelist;
> > > > -		barrier();
> > > > +		smp_mb();
> > > >  		/*
> > > >  		 * If the compiler would reorder the retrieval of c->page to
> > > >  		 * come before c->freelist then an interrupt could
> > >
> > > Torsten/Yamin, does this fix things for you? What about reverting commit
> > > 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c ("SLUB: Alternate fast paths
> > > using cmpxchg_local")?
> > I'm busy in another issue and will test it ASAP. Sorry.
> I tested it on my 3 x86-64 machines.
> The small fix to use smp_mb to replace
> barrier in slab_free doesn't work. Kernel still crashed at the same place.
>
> I will test the reverting patch.

The kernel with the revert patch is OK. I ran reboot/hackbench more than
10 times on each of my 3 x86-64 machines, and the kernel didn't crash.

-yanmin
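
For readers following the barrier() vs. smp_mb() point raised earlier in the
thread, here is a minimal userspace sketch of the difference. It is not the
kernel code: the struct, the function names and the two macros are
hypothetical stand-ins (assumed to roughly mirror a compiler-only barrier and
a full memory fence, using GCC/Clang inline asm and C11 atomics). It only
illustrates what each barrier constrains; as reported above, switching to
smp_mb() did not fix the crash, so this is not a description of the bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins; assumptions, not the kernel's own definitions. */
#define compiler_barrier_sketch() __asm__ __volatile__("" ::: "memory")
#define full_barrier_sketch()     atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical two-field stand-in for the per-CPU slab state. */
struct cpu_slab_sketch {
	void *freelist;
	void *page;
};

struct snapshot_sketch {
	void *freelist;
	void *page;
};

/*
 * Compiler barrier only: the compiler may not hoist the page load above
 * the freelist load, but nothing is emitted that constrains the CPU.
 */
static struct snapshot_sketch read_compiler_ordered(struct cpu_slab_sketch *c)
{
	struct snapshot_sketch s;

	s.freelist = c->freelist;
	compiler_barrier_sketch();
	s.page = c->page;
	return s;
}

/* Full fence: both the compiler and the CPU keep the two loads in order. */
static struct snapshot_sketch read_cpu_ordered(struct cpu_slab_sketch *c)
{
	struct snapshot_sketch s;

	s.freelist = c->freelist;
	full_barrier_sketch();
	s.page = c->page;
	return s;
}

/*
 * Single-threaded sanity check: with nothing racing, both variants must
 * observe the same values. Returns 1 when they agree.
 */
static int barrier_sketch_selfcheck(void)
{
	int object, page;
	struct cpu_slab_sketch c = { &object, &page };
	struct snapshot_sketch a = read_compiler_ordered(&c);
	struct snapshot_sketch b = read_cpu_ordered(&c);

	assert(a.freelist == (void *)&object && a.page == (void *)&page);
	return a.freelist == b.freelist && a.page == b.page;
}
```

The two readers differ only in the barrier between the loads. The quoted
comment in slab_free() is about an interrupt on the same CPU, where the
concern it names is the compiler reordering the loads; cross-CPU ordering
is the separate question Pekka raised.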