Date: Wed, 9 Mar 2011 14:31:52 +0100
From: Martin Schwidefsky
Organization: IBM Corporation
To: Peter Zijlstra
Cc: Benjamin Herrenschmidt, "linux-kernel@vger.kernel.org", linuxppc-dev, Jesse Larrew
Subject: Re: [BUG] rebuild_sched_domains considered dangerous
Message-ID: <20110309143152.3cc6c191@mschwide.boeblingen.de.ibm.com>
In-Reply-To: <1299676769.2308.2944.camel@twins>
References: <1299639487.22236.256.camel@pasglop> <1299665998.2308.2753.camel@twins> <1299670429.2308.2834.camel@twins> <20110309141548.722e4f56@mschwide.boeblingen.de.ibm.com> <1299676769.2308.2944.camel@twins>

On Wed, 09 Mar 2011 14:19:29 +0100
Peter Zijlstra wrote:

> On Wed, 2011-03-09 at 14:15 +0100, Martin Schwidefsky wrote:
> > On Wed, 09 Mar 2011 12:33:49 +0100
> > Peter Zijlstra wrote:
> >
> > > On Wed, 2011-03-09 at 11:19 +0100, Peter Zijlstra wrote:
> > > > > It appears that this corresponds to one CPU deciding to rebuild the
> > > > > sched domains. There's various reasons why that can happen, the typical
> > > > > one in our case is the new VPNH feature where the hypervisor informs us
> > > > > of a change in node affinity of our virtual processors. s390 has a
> > > > > similar feature and should be affected as well.
> > > >
> > > > Ahh, so that's triggering it :-), just curious, how often does the HV do
> > > > that to you?
> > >
> > > OK, so Ben told me on IRC this can happen quite frequently, to which I
> > > must ask WTF were you guys smoking? Flipping the CPU topology every time
> > > the HV scheduler does something funny is quite insane. And you did that
> > > without ever talking to the scheduler folks, not cool.
> > >
> > > That is of course aside from the fact that we have a real bug there that
> > > needs fixing, but really guys, WTF!
> >
> > Just for info, on s390 the topology change events are rather infrequent.
> > They do happen e.g. after an LPAR has been activated and the LPAR
> > hypervisor needs to reshuffle the CPUs of the different nodes.
>
> But if you don't also update the cpu->node memory mappings (which I
> think it near impossible) what good is it to change the scheduler
> topology?

The memory for the different LPARs is striped over all nodes (or books as
we call them). We heavily rely on the large shared cache between the books
to hide the different memory access latencies.

--
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.