Message-ID: <4FC61188.8000908@parallels.com>
Date: Wed, 30 May 2012 16:24:40 +0400
From: Glauber Costa
To: Paul Turner
CC: Peter Zijlstra, Tejun Heo, "Eric W. Biederman", Serge Hallyn
Subject: Re: [PATCH v3 5/6] Also record sleep start for a task group
References: <1338371317-5980-1-git-send-email-glommer@parallels.com> <1338371317-5980-6-git-send-email-glommer@parallels.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/30/2012 03:35 PM, Paul Turner wrote:
> On Wed, May 30, 2012 at 2:48 AM, Glauber Costa wrote:
>> When we're dealing with a task group, instead of a task, also record
>> the start of its sleep time. Since the test against TASK_UNINTERRUPTIBLE
>> does not really make sense and lacks an obvious analogue, we always
>> record it as sleep_start, never block_start.
>>
>> Signed-off-by: Glauber Costa
>> CC: Peter Zijlstra
>> CC: Paul Turner
>> ---
>>  kernel/sched/fair.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index c26fe38..d932559 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1182,7 +1182,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>>  				se->statistics.sleep_start = rq_of(cfs_rq)->clock;
>>  			if (tsk->state & TASK_UNINTERRUPTIBLE)
>>  				se->statistics.block_start = rq_of(cfs_rq)->clock;
>> -		}
>> +		} else
>> +			se->statistics.sleep_start = rq_of(cfs_rq)->clock;
>
> You can't sanely account sleep on a group entity.
>
> Suppose you have 2 sleepers on 1 cpu: you account 1s/s of idle
> Suppose you have 2 sleepers now on 2 cpus: you account 2s/s of idle
>
> Furthermore, in the latter case when one wakes up you still continue
> to accrue sleep time whereas in the former you don't.
>
> Just don't report/collect this.

sleep_start is not for iowait. This is for idle.
And I know of no other way to collect idle time per cgroup than the
time during which it was off the runqueue.

What you say about the sleepers doesn't make much sense for idle,
because this information is per-cpu as well. When the se is being
dequeued, it means none of its children is running on that runqueue.

That's idle.

>>  #endif
>>  }
>>
>> --
>> 1.7.10.2
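
To make the per-cpu argument in the reply above concrete, here is a minimal user-space sketch in C. It is not kernel code: struct group_cpu_stat, child_wakeup() and child_sleep() are hypothetical names invented for illustration, and the clock is a plain integer passed in by the caller. It only models the idea that a group entity counts as idle on a given CPU from the moment its last runnable child leaves that CPU's runqueue until the next child is enqueued there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU view of one task group (not the kernel's structs). */
struct group_cpu_stat {
	unsigned int nr_running;  /* runnable children of the group on this CPU */
	int sleeping;             /* 1 while the group entity is dequeued here */
	uint64_t sleep_start;     /* clock value when the group entity was dequeued */
	uint64_t idle_total;      /* accumulated time spent off this runqueue */
};

/* A child becomes runnable on this CPU. If the group entity was dequeued,
 * it is enqueued again and the elapsed off-runqueue time is accounted. */
static void child_wakeup(struct group_cpu_stat *gs, uint64_t now)
{
	if (gs->nr_running == 0 && gs->sleeping) {
		assert(now >= gs->sleep_start);
		gs->idle_total += now - gs->sleep_start;
		gs->sleeping = 0;
	}
	gs->nr_running++;
}

/* A child leaves this CPU's runqueue. Only when the last one goes is the
 * group entity itself dequeued and sleep_start recorded. */
static void child_sleep(struct group_cpu_stat *gs, uint64_t now)
{
	assert(gs->nr_running > 0);
	if (--gs->nr_running == 0) {
		gs->sleep_start = now;
		gs->sleeping = 1;
	}
}
```

With two children on one CPU that go to sleep at t=10 and t=20, the group only starts accruing idle at t=20; a wakeup at t=50 leaves idle_total at 30. With the same two children on two CPUs, each CPU has its own group_cpu_stat and accrues independently, so summing across CPUs gives the 2s/s figure from the review, and one CPU keeps accruing after a wakeup on the other. Whether that sum is a meaningful number is exactly the point in dispute; the sketch does not settle it.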