MIME-Version: 1.0
In-Reply-To: <CAJWu+ooYuwKo34YiOnt3aQc=xMSWEQfCVRiO5iji3hYvam33Ew@mail.gmail.com>
References: <4366682.tsferJN35u@aspire.rjw.lan> <2185243.flNrap3qq1@aspire.rjw.lan>
 <20170320035745.GC25659@vireshk-i7> <CAKfTPtD8bp-mB=9Rjufeyj3weg6T3b1J-o+Sc2Oe2EMGX3zKzQ@mail.gmail.com>
 <20170320123416.GB27896@e110439-lin> <CAJWu+oocwa0Nxch+ShqG6BPfVRXUhS0GQwYK1qBu04kuvh6vig@mail.gmail.com>
 <CAKfTPtD=xKb1UCUL6CWFOfr8ina_sNSOdaM-11teWhKe_xmedA@mail.gmail.com> <CAJWu+ooYuwKo34YiOnt3aQc=xMSWEQfCVRiO5iji3hYvam33Ew@mail.gmail.com>
From: Vincent Guittot <vincent.guittot@linaro.org>
Date: Mon, 27 Mar 2017 08:59:05 +0200
Message-ID: <CAKfTPtAz=2WOZEryHgu-JVCzVfA-ndCHRPQaSetxfq1sky6ESg@mail.gmail.com>
Subject: Re: [RFC][PATCH 2/2] cpufreq: schedutil: Force max frequency on busy CPUs
To: Joel Fernandes <joelaf@google.com>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>,
        "Rafael J. Wysocki" <rjw@rjwysocki.net>,
        Viresh Kumar <viresh.kumar@linaro.org>,
        Linux PM <linux-pm@vger.kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        Peter Zijlstra <peterz@infradead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
        Juri Lelli <juri.lelli@arm.com>,
        Morten Rasmussen <morten.rasmussen@arm.com>,
        Ingo Molnar <mingo@redhat.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2115
Lines: 51

On 25 March 2017 at 04:48, Joel Fernandes <joelaf@google.com> wrote:
> Hi Vincent,
>
> On Thu, Mar 23, 2017 at 3:08 PM, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> [..]
>>>>
>>>>> So I'm not really aligned with the description of your problem: PELT
>>>>> metric underestimates the load of the CPU.  The PELT is just about
>>>>> tracking CFS task utilization but not whole CPU utilization and
>>>>> according to your description of the problem (time stolen by irq),
>>>>> your problem doesn't come from an underestimation of CFS task but from
>>>>> time spent in something else but not accounted in the value used by
>>>>> schedutil
>>>>
>>>> Quite likely. Indeed, it can really be that the CFS task is preempted
>>>> because of some RT activity generated by the IRQ handler.
>>>>
>>>> More in general, I've also noticed many suboptimal freq switches when
>>>> RT tasks interleave with CFS ones, because of:
>>>> - relatively long down _and up_ throttling times
>>>> - the way schedutil's flags are tracked and updated
>>>> - the callsites from where we call schedutil updates
>>>>
>>>> For example it can really happen that we are running at the highest
>>>> OPP because of some RT activity. Then we switch back to a relatively
>>>> low utilization CFS workload and then:
>>>> 1. a tick happens which produces a frequency drop
>>>
>>> Any idea why this frequency drop would happen? Say a running CFS task
>>> gets preempted by RT task, the PELT signal shouldn't drop for the
>>> duration the CFS task is preempted because the task is runnable, so
>>
>> utilization only tracks the running state but not runnable state.
>> Runnable state is tracked in load_avg
>
> Thanks. I got it now.
>
> Correct me if I'm wrong but strictly speaking utilization for a cfs_rq
> (which drives the frequency for CFS) still tracks the blocked/runnable
> time of tasks although it's decayed as time moves forward. Only when we
> migrate the rq of a cfs task is the util_avg contribution removed from
> the rq. But I can see now why running RT can decay this load tracking
> signal.

Yes, you're right.
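
To make the decay concrete, here is a rough sketch of the behaviour
discussed above. It is a simplified model, not the kernel code: it only
assumes the usual PELT geometry (periods of about 1ms, a contribution
that halves after 32 periods) and that only running time adds to
utilization. The names step() and simulate() are made up for this
illustration.

```python
# Simplified model of PELT utilization decay; NOT the kernel implementation.
HALF_LIFE_PERIODS = 32                    # a contribution halves after 32 periods
Y = 0.5 ** (1.0 / HALF_LIFE_PERIODS)      # per-period decay factor, Y**32 == 0.5
MAX_UTIL = 1024                           # full-scale utilization

def step(util, running):
    """Advance one ~1ms period; only *running* time adds contribution."""
    contrib = MAX_UTIL * (1.0 - Y) if running else 0.0
    return util * Y + contrib

def simulate(util, pattern):
    """pattern: iterable of booleans, True when a CFS task is running."""
    for running in pattern:
        util = step(util, running)
    return util

if __name__ == "__main__":
    # A CFS task runs long enough for the signal to approach full scale.
    util = simulate(0.0, [True] * 320)
    print(f"after CFS run:    {util:.1f}")
    # RT preempts for 32 periods: the CFS task is runnable but not running,
    # so the utilization only decays during that window.
    util = simulate(util, [False] * 32)
    print(f"after 32ms of RT: {util:.1f}")
```

In this model a CFS task that stays runnable while RT occupies the CPU
for 32 periods sees its utilization roughly halve, which matches the
frequency drop at the next tick described earlier in the thread.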


>
> Regards,
> Joel