Subject: Re: [PATCH v2] sched/fair: update scale invariance of PELT
To: Morten Rasmussen, Vincent Guittot
References: <1491815909-13345-1-git-send-email-vincent.guittot@linaro.org> <20170428155258.GA12012@e105550-lin.cambridge.arm.com>
Cc: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, yuyang.du@intel.com, pjt@google.com, bsegall@google.com
From: Dietmar Eggemann
Message-ID: <9e629311-7da1-afab-a493-3300f11836d8@arm.com>
Date: Fri, 28 Apr 2017 18:08:36 +0100
In-Reply-To: <20170428155258.GA12012@e105550-lin.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 28/04/17 16:52, Morten Rasmussen wrote:
> Hi Vincent,

[...]

> As mentioned above, waiting time, i.e. !running && weight, is not
> scaled, which causes trouble for load.

I ran some rt-app-based tests on a system with frequency and CPU
invariance.

(1) Two periodic 20% tasks with a 12ms period on a cpu (capacity=1024)
at 625MHz (max 1100MHz), starting to run at the same time, so one task
(task1) is wakeup preempted. (I'm only considering the phase of the
test run where this is a stable condition, i.e. task1 is always wakeup
preempted by task2.)

So the runtime of a task is 0.2*12ms*1100/625 = 4.2ms.

At the beginning of the preemption period, __update_load_avg_se(task1)
is called with running=0 and weight=0, at the end with running=0 and
weight=1024.
When task1 finally runs, there are two calls with (running=1,
weight=1024) before the next wakeup preemption period for task1 starts
again with (running=0, weight=0).

Task2, which doesn't suffer from wakeup preemption, starts running with
(running=0, weight=0), then there are two calls with (running=1,
weight=1024) before it starts running again with (running=0, weight=0).

Task1 is runnable for 8.4ms and sleeps for 3.6ms, whereas task2 is
runnable for 4.2ms and sleeps for 7.8ms.

The load signal of task1 is ~600, whereas the load of task2 is ~200.

(2) Two periodic 20% tasks with a 12ms period on a cpu (capacity=1024)
at 1100MHz (max 1100MHz), starting to run at the same time, so one task
(task1) is wakeup preempted.

So the runtime of one task is 0.2*12ms*1100/1100 = 2.4ms.

Task1 is runnable for 4.8ms and sleeps for 7.2ms, whereas task2 is
runnable for 2.4ms and sleeps for 9.6ms.

The load signal of task1 is ~400, whereas the load of task2 is ~200.

Like Morten said, the scaling for load works differently on different
OPPs. Scaling for utilization looks fine.

IMHO, the implementation of your scale_time() function can't take
preemption into consideration.

I also did tests comparing the time_scaling implementation with tip
(contribution scaling), using two periodic tasks (20%/16ms at
625MHz/1100MHz and 20%/32ms at 625MHz/1100MHz). They show the same
effect as a difference between time_scaling and tip.

-- Dietmar

[...]
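
For reference, the runtime and runnable/sleep figures quoted in (1) and
(2) can be reproduced with a few lines of Python. This is only a sketch
of the arithmetic, not of PELT itself: the helper name task_windows()
and its parameters are made up for illustration, and it assumes the
preempted task waits for exactly one full runtime of the other task in
every period.

```python
def task_windows(duty, period_ms, cur_mhz, max_mhz, preempted):
    """Return (runtime, runnable, sleep) in ms for one periodic task.

    Hypothetical helper: 'duty' is the task's share of the period at
    max frequency, so the wall-clock runtime grows by max_mhz/cur_mhz
    when the cpu runs slower.
    """
    runtime = duty * period_ms * max_mhz / cur_mhz
    # Assumption: a wakeup-preempted task first waits for the other
    # task's whole runtime, then runs for its own runtime.
    runnable = 2 * runtime if preempted else runtime
    sleep = period_ms - runnable
    return runtime, runnable, sleep


for cur_mhz in (625, 1100):
    for name, preempted in (("task1", True), ("task2", False)):
        runtime, runnable, sleep = task_windows(0.2, 12, cur_mhz, 1100, preempted)
        print(f"{cur_mhz}MHz {name}: runtime={runtime:.1f}ms "
              f"runnable={runnable:.1f}ms sleep={sleep:.1f}ms")
```

The load values (~600, ~400, ~200) are measurements from the test runs
and are deliberately not modelled here.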