Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752295AbbFRGbW (ORCPT ); Thu, 18 Jun 2015 02:31:22 -0400
Received: from blu004-omc1s38.hotmail.com ([65.55.116.49]:62890
        "EHLO BLU004-OMC1S38.hotmail.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751085AbbFRGbN (ORCPT );
        Thu, 18 Jun 2015 02:31:13 -0400
X-TMN: [nYHG4mgbMqV34LB6a9lmxa1/CgijKKbJ]
X-Originating-Email: [wanpeng.li@hotmail.com]
Message-ID:
Date: Thu, 18 Jun 2015 14:31:00 +0800
From: Wanpeng Li
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0)
        Gecko/20100101 Thunderbird/31.0
MIME-Version: 1.0
To: Yuyang Du, Boqun Feng
CC: mingo@kernel.org, peterz@infradead.org, linux-kernel@vger.kernel.org,
        pjt@google.com, bsegall@google.com, morten.rasmussen@arm.com,
        vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
        len.brown@intel.com, rafael.j.wysocki@intel.com,
        fengguang.wu@intel.com, srikar@linux.vnet.ibm.com
Subject: Re: [Resend PATCH v8 0/4] sched: Rewrite runnable load and
        utilization average tracking
References: <1434396367-27979-1-git-send-email-yuyang.du@intel.com>
        <20150617030650.GB5695@fixme-laptop.cn.ibm.com>
        <20150617051501.GA7154@fixme-laptop.cn.ibm.com>
        <20150617031101.GC1244@intel.com>
In-Reply-To: <20150617031101.GC1244@intel.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 18 Jun 2015 06:31:11.0663 (UTC)
        FILETIME=[5E530BF0:01D0A990]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4045
Lines: 146

On 6/17/15 11:11 AM, Yuyang Du wrote:
> Hi,
>
> The sched_debug is informative, lets first give it some analysis.
>
> The workload is 12 CPU hogging tasks (always runnable) and 1 dbench
> task doing fs ops (70% runnable) running at the same time.
>
> Actually, these 13 tasks are in a task group /autogroup-9617, which
> has weight 1024.
>
> So the 13 tasks at most can contribute to an average of 79 (=1024/13)
> to the group entity's load_avg:
>
> cfs_rq[0]:/autogroup-9617
> .se->load.weight  : 2
> .se->avg.load_avg : 0
>
> cfs_rq[1]:/autogroup-9617
> .se->load.weight  : 80
> .se->avg.load_avg : 79
>
> cfs_rq[2]:/autogroup-9617
> .se->load.weight  : 79
> .se->avg.load_avg : 78
>
> cfs_rq[3]:/autogroup-9617
> .se->load.weight  : 80
> .se->avg.load_avg : 81
>
> cfs_rq[4]:/autogroup-9617
> .se->load.weight  : 80
> .se->avg.load_avg : 79
>
> cfs_rq[5]:/autogroup-9617
> .se->load.weight  : 79
> .se->avg.load_avg : 77
>
> cfs_rq[6]:/autogroup-9617
> .se->load.weight  : 159
> .se->avg.load_avg : 156
>
> cfs_rq[7]:/autogroup-9617
> .se->load.weight  : 64 (dbench)
> .se->avg.load_avg : 50

How did you figure out that this one is dbench?

Regards,
Wanpeng Li

>
> cfs_rq[8]:/autogroup-9617
> .se->load.weight  : 80
> .se->avg.load_avg : 78
>
> cfs_rq[9]:/autogroup-9617
> .se->load.weight  : 159
> .se->avg.load_avg : 156
>
> cfs_rq[10]:/autogroup-9617
> .se->load.weight  : 80
> .se->avg.load_avg : 78
>
> cfs_rq[11]:/autogroup-9617
> .se->load.weight  : 79
> .se->avg.load_avg : 78
>
> So this is very good runnable load avg accrued in the task group
> structure.
>
> However, why the cpu0 is very underload?
>
> The top cfs's load_avg is:
>
> cfs_rq[0]: 754
> cfs_rq[1]: 81
> cfs_rq[2]: 85
> cfs_rq[3]: 80
> cfs_rq[4]: 142
> cfs_rq[5]: 86
> cfs_rq[6]: 159
> cfs_rq[7]: 264
> cfs_rq[8]: 79
> cfs_rq[9]: 156
> cfs_rq[10]: 78
> cfs_rq[11]: 79
>
> We see cfs_rq[0]'s load_avg is 754 even it is underloaded.
>
> So the problem is:
>
> 1) The tasks in the workload have too small weight (only 79), because
> they share a task group.
>
> 2) Probably some "high" weight task even runnable a small time
> contribute "big" to cfs_rq's load_avg.
>
> The patchset does what it wants to do:
>
> 1) very precise task group's load avg tracking from group to children
> tasks and from children tasks to group.
>
> 2) the combined runnable + blocked load_avg is effective, so the blocked
> avg made its impact.
>
> I will try to figure out what makes the cfs_rq[0]'s 754 load_avg, but
> I also think that the tasks have so small weight that they are very
> easy to be fairly "imbalanced" ....
>
> Peter, Ben, and others?
>
> In addition, the util_avg sometimes is insanely big, I think I already
> found the problem.
>
> Thanks,
> Yuyang
>
> On Wed, Jun 17, 2015 at 01:15:01PM +0800, Boqun Feng wrote:
>> On Wed, Jun 17, 2015 at 11:06:50AM +0800, Boqun Feng wrote:
>>> Hi Yuyang,
>>>
>>> I've run the test as follow on tip/master without and with your
>>> patchset:
>>>
>>> On a 12-core system (Intel(R) Xeon(R) CPU X5690 @ 3.47GHz)
>>> run stress --cpu 12
>>> run dbench 1
>> Sorry, I forget to say that `stress --cpu 12` and `dbench 1` are running
>> simultaneously. Thank Yuyang for reminding me that.
>>
>> Regards,
>> Boqun
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
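
The arithmetic in the quoted analysis can be checked with a short sketch. The numbers below are copied from the sched_debug output quoted in this thread; the variable names are made up for illustration. The last step assumes that a top-level cfs_rq's load_avg is roughly the group entity's load_avg plus whatever else is queued on that CPU, which is a simplification and not something the thread confirms.

```python
# Figures copied from the quoted sched_debug output; names are illustrative only.
GROUP_WEIGHT = 1024   # weight of /autogroup-9617, as stated in the thread
NUM_TASKS = 13        # 12 CPU hogs + 1 dbench task

# Per-CPU group entity values for cfs_rq[0] .. cfs_rq[11]
se_weight = [2, 80, 79, 80, 80, 79, 159, 64, 80, 159, 80, 79]
se_load_avg = [0, 79, 78, 81, 79, 77, 156, 50, 78, 156, 78, 78]

# Top-level cfs_rq load_avg for the same CPUs
top_load_avg = [754, 81, 85, 80, 142, 86, 159, 264, 79, 156, 78, 79]

# Each task's share of the group weight: 1024 / 13
per_task_share = GROUP_WEIGHT / NUM_TASKS

# The per-CPU group entity weights should add up to about the group weight
total_se_weight = sum(se_weight)

# ASSUMPTION: top-level load_avg ~= group entity load_avg + other load on the CPU.
# The difference is then a rough estimate of load coming from outside the group.
excess = [top - se for top, se in zip(top_load_avg, se_load_avg)]

print(f"per-task share     : {per_task_share:.2f}")
print(f"sum of se weights  : {total_se_weight} (group weight {GROUP_WEIGHT})")
for cpu, value in enumerate(excess):
    print(f"cfs_rq[{cpu}] load_avg not explained by the group: {value}")
```

Under that assumption, nearly all of cfs_rq[0]'s 754 comes from outside /autogroup-9617, since the group entity on that CPU contributes a load_avg of 0.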