Date: Wed, 10 Apr 2013 12:51:15 +0200 (CEST)
From: Thomas Gleixner
To: Dave Hansen
cc: Borislav Petkov, "Srivatsa S. Bhat", LKML, Dave Jones, dhillf@gmail.com, Peter Zijlstra, Ingo Molnar
Subject: Re: [PATCH] kthread: Prevent unpark race which puts threads on the wrong cpu

On Wed, 10 Apr 2013, Thomas Gleixner wrote:

> On Tue, 9 Apr 2013, Dave Hansen wrote:
>
> > On 04/09/2013 12:30 PM, Thomas Gleixner wrote:
> > > On Tue, 9 Apr 2013, Thomas Gleixner wrote:
> > > Thought more about it and found, that the stupid binding only works
> > > when the task is really descheduled. So there is a small window left,
> > > which could lead to this. Revised patch below.
> > >
> > > Anyway a trace for that issue would be appreciated nevertheless.
> >
> > Here you go:
> >
> > http://sr71.net/~dave/linux/bigbox.1365539189.txt.gz
>
> Hmm. Unfortunately migration/146 is not in the trace.
>
> Can you please apply the patch below? That avoids the oops, but might
> hang an online operation. Though the machine should stay up and you
> should be able to retrieve the trace.
>
> Thanks,
>
> 	tglx
> ---
> Index: linux-2.6/kernel/smpboot.c
> ===================================================================
> --- linux-2.6.orig/kernel/smpboot.c
> +++ linux-2.6/kernel/smpboot.c
> @@ -131,7 +131,10 @@ static int smpboot_thread_fn(void *data)
>  			continue;
>  		}
>  
> -		BUG_ON(td->cpu != smp_processor_id());
> +		if (td->cpu != smp_processor_id()) {
> +			tracing_off();
> +			schedule();

Bah, that wants a continue. Revised patch below.

> +		}
>  
>  		/* Check for state change setup */
>  		switch (td->status) {

Index: linux-2.6/kernel/smpboot.c
===================================================================
--- linux-2.6.orig/kernel/smpboot.c
+++ linux-2.6/kernel/smpboot.c
@@ -131,7 +131,11 @@ static int smpboot_thread_fn(void *data)
 			continue;
 		}
 
-		BUG_ON(td->cpu != smp_processor_id());
+		if (td->cpu != smp_processor_id()) {
+			tracing_off();
+			schedule();
+			continue;
+		}
 
 		/* Check for state change setup */
 		switch (td->status) {
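
For readers following along, here is a rough userspace sketch of what the added continue appears to buy. It is not kernel code and not part of the patch. The struct, the sim_* helpers and run_loop() are hypothetical stand-ins, and it assumes (purely for illustration) that the thread ends up on its bound CPU after two yields. The point it tries to show: without the continue, control falls through to the state handling right after schedule() returns, even if the thread is still on the wrong CPU. With the continue, the CPU check is repeated first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sim {
	int target_cpu;      /* CPU the thread is bound to (td->cpu in the patch) */
	int current_cpu;     /* CPU the thread is running on right now */
	int yields;          /* number of simulated schedule() calls */
	int wrong_cpu_runs;  /* state handling executed on the wrong CPU */
	int right_cpu_runs;  /* state handling executed on the bound CPU */
	bool tracing;        /* stands in for the tracer being enabled */
};

/* Mock of tracing_off(): freeze the trace at the first mismatch. */
static void sim_tracing_off(struct sim *s)
{
	s->tracing = false;
}

/* Mock of schedule(): ASSUMPTION, the thread reaches its CPU on the 2nd yield. */
static void sim_schedule(struct sim *s)
{
	if (++s->yields >= 2)
		s->current_cpu = s->target_cpu;
}

/* Model of the loop body around the patched check. */
static void run_loop(struct sim *s, bool with_continue, int iterations)
{
	assert(s);

	for (int i = 0; i < iterations; i++) {
		if (s->target_cpu != s->current_cpu) {
			sim_tracing_off(s);
			sim_schedule(s);
			if (with_continue)
				continue;	/* recheck the CPU before touching state */
		}

		/* Stand-in for the "Check for state change setup" switch. */
		if (s->current_cpu == s->target_cpu)
			s->right_cpu_runs++;
		else
			s->wrong_cpu_runs++;
	}
}
```

Starting on CPU 3 with a thread bound to CPU 146 (illustrative values) and running four iterations, run_loop() with with_continue set to false executes the state handling once on the wrong CPU. With with_continue set to true it never does, at the cost of two iterations spent only yielding.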