Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936389AbXKPX1k (ORCPT ); Fri, 16 Nov 2007 18:27:40 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S964975AbXKPX0o (ORCPT ); Fri, 16 Nov 2007 18:26:44 -0500 Received: from wa-out-1112.google.com ([209.85.146.178]:20124 "EHLO wa-out-1112.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964958AbXKPX0n (ORCPT ); Fri, 16 Nov 2007 18:26:43 -0500 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=foibOBvytAeuUNquWZreC5JH9+XOMTuJVVkbhjMnVnU6eNT4p1fPqmTdnHb8XqMhg+60niOIUjSDM3vyaUtUwadXq3fjEgbpJPvEkEHzGrddzOQxzy6G1wwAWTij1nk1CoG/celgCvvWQrmdTlvLdO71zpxpk0Ysnp0suv6gbLU= Message-ID: Date: Sat, 17 Nov 2007 00:26:41 +0100 From: "Dmitry Adamushko" To: "Micah Dowty" Subject: Re: High priority tasks break SMP balancer? Cc: "Ingo Molnar" , "Christoph Lameter" , "Kyle Moffett" , "Cyrus Massoumi" , "LKML Kernel" , "Andrew Morton" , "Mike Galbraith" , "Paul Menage" , "Peter Williams" In-Reply-To: <20071116221404.GC31527@vmware.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <2FAA6826-653E-482F-A037-C539BAEEA1DA@mac.com> <20071115202425.GC4914@vmware.com> <20071115213510.GA16079@vmware.com> <20071116024408.GA20322@vmware.com> <20071116060700.GD16273@elte.hu> <20071116221404.GC31527@vmware.com> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1948 Lines: 61 On 16/11/2007, Micah Dowty wrote: > [ ... ] > > or just remove bit #3 (which is responsible for 8 == 1000) here: > > > > cat /proc/sys/kernel/sched_features > > > > (this one is enabled by default in 2.6.23.1) > > Aha. Turning off bit 3 appears to instantly fix my problem while it's > occurring in an existing process, and I can't reproduce it on any new > processes afterward. humm... ok, but considering your recent summary for various kernels... I guess, it doesn't qualify as the primary suspect... it just likely affects something else. > > > cpu_1 : 2-3 nice(0) cpu-hog tasks ; > > > > both cpus may be seen with similar rq->load_cpu[]... > > When I try this, cpu0 has a cpu_load[] of over 10000 and cpu1 has a > load of 2048 or so. yeah, one of the options for 2048 would be presence of 2 nice(0) cpu-hogs (1024 is the weight for a nice(0) task). > > yeah, one would > > argue that one of the cpu hogs could be migrated to cpu_0 and consume > > remaining 'time slots' and it would not "disturb" the nice(-20) task > > as : > > it's able to preempt the lower prio task whenever it want (provided, > > fine-grained kernel preemption) and we don't care that much of > > trashing of caches here. > > Yes, that's the behaviour I expected to see (and what my application > would prefer). yep, that's what load_balance_newidle() is about... so maybe there are some factors resulting in its inconsistency/behavioral differences on different kernels. Let's say we change a pattern for the niced task: e.g. run for 100 ms. and then sleep for 300 ms. (that's ~25% of cpu load) in the loop. Any behavioral changes? > > Thanks much, > --Micah > -- Best regards, Dmitry Adamushko - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/