Received: by 10.213.65.68 with SMTP id h4csp106230imn; Fri, 6 Apr 2018 16:51:42 -0700 (PDT) X-Google-Smtp-Source: AIpwx4+jRDIC8rQgT95GZPdns0gFwSukbYY4FcJjs9IOLb3b8A+lfQ08QUPlVz21usPzCKFXCHOg X-Received: by 2002:a17:902:ab98:: with SMTP id f24-v6mr28745984plr.331.1523058702940; Fri, 06 Apr 2018 16:51:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1523058702; cv=none; d=google.com; s=arc-20160816; b=ybHMDLR3lxMWRXzJX6QZ0v5hxXZH0T0UWYlWsu12ram7VkXO0CVJvKCHiHxzu04euQ WGOOgGLVu3nVqcm8bBRdhb9eNL97iVx4S8r+SBvUSJIHRTqtfQNBWPNBZbYhwcIz6Dka cA1rfzZkYDs2T5cKX7L7duhPC4+Ckg9JdmvEJEfbqrT/6U7Ak+nv8AVZWUA1VfHOmxV9 wO0+qeNs3Nsr6O5YHEjsYxlPy28k86dRuj9cszgNsRhdIOX3TzAz7h49ZMapaYZX7Tg8 FylwNRZYg/50GaVoIVUBJuBYWscly9JqAvyrf0Iu7pXcnmfMniNaei3Hz1A7Vesr/tA/ tHZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=SQyrvEKXIcWkY0Kb48JmqHVcS6FwNTrtD1PJGCnLgC8=; b=F3+jmlgYzwbfoQHMgUUj9y09isxV/Z5Bbq4Fn46UECoGjrkN2xi+a2Onbg/4FmnY34 //z+NIyh0Yn/zDxWK7r1JkWR83Ge/R7YawTtHaT4SlUmjn3musM9rtklCwjSZA0Tb8II 3DcAZc44NSIgUNlsBU1RUcD1HRPCo7bIikr35vo7sw7ijF+3A3XcgECrw0z5Hq+j3w+h 7xjF+yC0XS9U2G+kR/OisUaDNz1oJIkP+GP0LHMViH9JrvbC/AloKJbUxAPv/Jx71HtT d2DOq2oGwPeCDwu8gRtLg4SKMFnxlEUYQc/Ab23X7x2eldYpyjWi8yQ8RlJtLp3otPnH Ks3w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=AUL249++; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b61-v6si9344494plc.500.2018.04.06.16.51.05; Fri, 06 Apr 2018 16:51:42 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=AUL249++; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751802AbeDFXs0 (ORCPT + 99 others); Fri, 6 Apr 2018 19:48:26 -0400 Received: from mail-it0-f54.google.com ([209.85.214.54]:51612 "EHLO mail-it0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751417AbeDFXsZ (ORCPT ); Fri, 6 Apr 2018 19:48:25 -0400 Received: by mail-it0-f54.google.com with SMTP id b5-v6so4119333itj.1 for ; Fri, 06 Apr 2018 16:48:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=SQyrvEKXIcWkY0Kb48JmqHVcS6FwNTrtD1PJGCnLgC8=; b=AUL249++5CdT6VKof/QbjxzQW78TIQdRe4UqTf6UXUWQpC62uxvyRc6G/00TvyBq7x dED5AqFYy17c3ugrTjKihafu4OjLP8gOJqics49jI7Uk6qv/ABXulNvP15sc/h6DaSIV MfpDWf2tyKXCM7aU/5cIhnwcHIFb0TyxmytozhcCWJcMVviMEvALJCQloaUg1IMT16YQ lB/c/CYPhLz9xSrhPutfmiCEJS8I+bDHMwe5amvifqV/JwAX/dvTS3va+UJzBmEcXeGS X0r0RySZgrRbKY+BaVWTD+Vc2uva3ZPywmt8oUWGTBP5ejWyJsg3TvMGr1rjI9r70YU3 Gb2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=SQyrvEKXIcWkY0Kb48JmqHVcS6FwNTrtD1PJGCnLgC8=; b=GShVKKdyhW5/HvT/X2eauKycT1ZDgsaCn1xqJXin3fyrdiXAUzuU361XM3LVIxDiYB OK5ZNbKQs/TzTtHjpRQeII5bEz405FU21fmPQupt54Mm5LB0q5Dnf4s91xsgJx06XWfS gqYKtAsECDrgl3HfPekmBU9eG87twt/dXeCcErCzWkMk/atu2r5jlNNDhEywBxcnD9rz wezgU+7patgwcW8M7gPD2qteFYJwmMpvHTnpbiy0qh/ctupgRMrwtL0HMl5s2N1P9qRg Av+yonsTbvhEt85qE9HG4oopD+EFW9RqGkBn2w6RVL4NGMTxuPOmBJ0DTjThH/dZ7uXs /mzQ== X-Gm-Message-State: ALQs6tDtgUEgYNxQn6mwLmkVWBQO9o48y8e56rLWtibezY2OAxpFOYgD 9DfZvAsciKYxkhjrkKgGgDezdlkeKoguuTufbGM5Sw== X-Received: by 2002:a24:fd03:: with SMTP id m3-v6mr20154319ith.47.1523058504028; Fri, 06 Apr 2018 16:48:24 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.11.158 with HTTP; Fri, 6 Apr 2018 16:48:22 -0700 (PDT) In-Reply-To: <20180406172835.20078-1-patrick.bellasi@arm.com> References: <20180406172835.20078-1-patrick.bellasi@arm.com> From: Joel Fernandes Date: Fri, 6 Apr 2018 16:48:22 -0700 Message-ID: Subject: Re: [PATCH] sched/fair: schedutil: update only with all info available To: Patrick Bellasi Cc: LKML , Linux PM , Ingo Molnar , Peter Zijlstra , "Rafael J . Wysocki" , Viresh Kumar , Vincent Guittot , Juri Lelli , Steve Muckle , Dietmar Eggemann , Morten Rasmussen Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 6, 2018 at 10:28 AM, Patrick Bellasi wrote: > Schedutil is not properly updated when the first FAIR task wakes up on a > CPU and when a RQ is (un)throttled. This is mainly due to the current > integration strategy, which relies on updates being triggered implicitly > each time a cfs_rq's utilization is updated. To me that's not implicit, but is explicitly done when the util is updated. It seems that's the logical place.. > > Those updates are currently provided (mainly) via > cfs_rq_util_change() > which is used in: > - update_cfs_rq_load_avg() > when the utilization of a cfs_rq is updated > - {attach,detach}_entity_load_avg() > This is done based on the idea that "we should callback schedutil > frequently enough" to properly update the CPU frequency at every This also I didn't get, its not called "frequently enough" but at the right time (when the util is updated). > utilization change. > > Since this recent schedutil update: > > commit 8f111bc357aa ("cpufreq/schedutil: Rewrite CPUFREQ_RT support") > > we use additional RQ information to properly account for FAIR tasks > utilization. Specifically, cfs_rq::h_nr_running has to be non-zero > in sugov_aggregate_util() to sum up the cfs_rq's utilization. > > However, cfs_rq::h_nr_running is usually updated as: > > enqueue_entity() > ... > update_load_avg() > ... > cfs_rq_util_change ==> trigger schedutil update > ... > cfs_rq->h_nr_running += number_of_tasks > > both in enqueue_task_fair() as well as in unthrottle_cfs_rq(). > A similar pattern is used also in dequeue_task_fair() and > throttle_cfs_rq() to remove tasks. Maybe this should be fixed then? It seems broken to begin with... > > This means that we are likely to see a zero cfs_rq utilization when we > enqueue a task on an empty CPU, or a non-zero cfs_rq utilization when > instead, for example, we are throttling all the FAIR tasks of a CPU. > > While the second issue is less important, since we are less likely to Actually I wanted behavior like the second issue, because that means frequency will not be dropped if CPU is about to idle which is similar to a patch I sent long time ago (skip freq update if CPU about to idle). > reduce frequency when CPU utilization decreases, the first issue can > instead impact performance. Indeed, we potentially introduce a not desired This issue would be fixed by just fixing the h_nr_running bug right? > latency between a task enqueue on a CPU and its frequency increase. > > Another possible unwanted side effect is the iowait boosting of a CPU > when we enqueue a task into a throttled cfs_rq. Probably very very rare. > Moreover, the current schedutil integration has these other downsides: > > - schedutil updates are triggered by RQ's load updates, which makes > sense in general but it does not allow to know exactly which other RQ > related information has been updated (e.g. h_nr_running). It seems broken that all information that schedutil needs isn't updated _before_ the cpufreq update request, so that should be fixed instead? > > - increasing the chances to update schedutil does not always correspond > to provide the most accurate information for a proper frequency > selection, thus we can skip some updates. Again IMO its just updated at the right time already, not just frequently enough... > > - we don't know exactly at which point a schedutil update is triggered, > and thus potentially a frequency change started, and that's because > the update is a side effect of cfs_rq_util_changed instead of an > explicit call from the most suitable call path. > > - cfs_rq_util_change() is mainly a wrapper function for an already > existing "public API", cpufreq_update_util(), to ensure we actually > update schedutil only when we are updating a root RQ. Thus, especially > when task groups are in use, most of the calls to this wrapper > function are really not required. > > - the usage of a wrapper function is not completely consistent across > fair.c, since we still need sometimes additional explicit calls to > cpufreq_update_util(), for example to support the IOWAIT boot flag in > the wakeup path > > - it makes it hard to integrate new features since it could require to > change other function prototypes just to pass in an additional flag, > as it happened for example here: > > commit ea14b57e8a18 ("sched/cpufreq: Provide migration hint") > > All the above considered, let's try to make schedutil updates more > explicit in fair.c by: > > - removing the cfs_rq_util_change() wrapper function to use the > cpufreq_update_util() public API only when root cfs_rq is updated > > - remove indirect and side-effect (sometimes not required) schedutil > updates when the cfs_rq utilization is updated > > - call cpufreq_update_util() explicitly in the few call sites where it > really makes sense and all the required information has been updated > > By doing so, this patch mainly removes code and adds explicit calls to > schedutil only when we: > - {enqueue,dequeue}_task_fair() a task to/from the root cfs_rq > - (un)throttle_cfs_rq() a set of tasks up to the root cfs_rq > - task_tick_fair() to update the utilization of the root cfs_rq About the "update for root" thingy, we're already doing rq->cfs == cfs_rq in cfs_rq_util_change so its already covered? Also, I feel if you can shorten the commit message a little and only include the best reasons for this patch that'll be nice so its easier to review. > > All the other code paths, currently _indirectly_ covered by a call to > update_load_avg(), are also covered by the above three calls. I would rather we do it in as few places as possible (when util is updated for root) than spreading it around and making it "explicit". I hope I didn't miss something but I feel its explicit enough already... thanks! - Joel