Date: Tue, 11 Jun 2013 17:27:23 -0700 (PDT)
From: David Lang <david@lang.hm>
To: Daniel Lezcano <daniel.lezcano@linaro.org>
cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>,
        "Rafael J. Wysocki" <rjw@rjwysocki.net>,
        Catalin Marinas <catalin.marinas@arm.com>,
        Ingo Molnar <mingo@kernel.org>,
        Morten Rasmussen <Morten.Rasmussen@arm.com>,
        "alex.shi@intel.com" <alex.shi@intel.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Vincent Guittot <vincent.guittot@linaro.org>,
        Mike Galbraith <efault@gmx.de>, "pjt@google.com" <pjt@google.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        linaro-kernel <linaro-kernel@lists.linaro.org>,
        "arjan@linux.intel.com" <arjan@linux.intel.com>,
        "len.brown@intel.com" <len.brown@intel.com>,
        "corbet@lwn.net" <corbet@lwn.net>,
        Andrew Morton <akpm@linux-foundation.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Linux PM list <linux-pm@vger.kernel.org>
Subject: Re: power-efficient scheduling design
In-Reply-To: <51B5FE02.7040607@linaro.org>
Message-ID: <alpine.DEB.2.02.1306111722470.24968@nftneq.ynat.uz>
References: <20130530134718.GB32728@e103034-lin> <51B221AF.9070906@linux.vnet.ibm.com> <20130608112801.GA8120@MacBook-Pro.local> <1834293.MlyIaiESPL@vostro.rjw.lan> <51B3F99A.4000101@linux.vnet.ibm.com> <51B5FE02.7040607@linaro.org>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2648
Lines: 50

On Mon, 10 Jun 2013, Daniel Lezcano wrote:

> Some SoC can have a cluster of cpus sharing some resources, eg cache, so
> they must enter the same state at the same moment. Beside the
> synchronization mechanisms, that adds a dependency with the next event.
> For example, the u8500 board has a couple of cpus. In order to make them
> to enter in retention, both must enter the same state, but not necessary
> at the same moment. The first cpu will wait in WFI and the second one
> will initiate the retention mode when entering to this state.
> Unfortunately, some time could have passed while the second cpu entered
> this state and the next event for the first cpu could be too close, thus
> violating the criteria of the governor when it choose this state for the
> second cpu.
>
> Also the latencies could change with the frequencies, so there is a
> dependency with cpufreq, the lesser the frequency is, the higher the
> latency is. If the scheduler takes the decision to go to a specific
> state assuming the exit latency is a given duration, if the frequency
> decrease, this exit latency could increase also and lead the system to
> be less responsive.
>
> I don't know, how were made the latencies computation (eg. worst case,
> taken with the lower frequency or not) but we have just one set of
> values. That should happen with the current code.
>
> Another point is the timer allowing to detect bad decision and go to a
> deep idle state. With the cluster dependency described above, we may
> wake up a particular cpu, which turns on the cluster and make the entire
> cluster to wake up in order to enter a deeper state, which could fail
> because of the other cpu may not fulfill the constraint at this moment.

Nobody is saying that this sort of thing should be in the fastpath of the 
scheduler.

But if the scheduler has a table that tells it the possible states, and the cost 
to get from the current state to each of them (and to get back and/or wake up 
to full power), then the scheduler can decide what to do and invoke a routine to 
make the change. In the meantime it will not be fighting the change by trying to 
schedule processes on a core that's about to be powered off. Once the change has 
happened, the scheduler gets a new version of the table of possible states and 
costs.

This isn't in the fastpath, it's in the rebalancing logic.
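
To make the table idea concrete, here is a rough sketch in plain userspace C. 
It is not kernel code and not a proposed interface: the struct names, the 
fields, the pick_state() and core_accepts_tasks() helpers, and all the numbers 
in example_tbl are made up for illustration. It only shows the shape of the 
decision: look at the per-core table, skip states whose wake-up cost or round 
trip does not fit, and take the cheapest one that is left.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: one reachable state and the cost of using it. */
struct pstate_cost {
    const char *name;
    unsigned int enter_us;  /* time to get from the current state into this one */
    unsigned int wake_us;   /* time to get back to full power */
    unsigned int power_mw;  /* power drawn while sitting in this state */
};

/* Hypothetical per-core view handed to the rebalancing logic. */
struct core_power {
    const struct pstate_cost *tbl;
    size_t n;
    int pending;            /* nonzero while a state change is in flight */
};

/* Illustrative values only, not measured on any hardware. */
static const struct pstate_cost example_tbl[] = {
    { "running",     0,   0, 1000 },
    { "wfi",         5,  10,  300 },
    { "retention", 200, 500,   50 },
};

/*
 * Return the index of the lowest-power state whose wake-up cost stays within
 * max_wake_us and whose enter + wake round trip fits into idle_us.
 * Returns -1 if no state qualifies.
 */
int pick_state(const struct pstate_cost *tbl, size_t n,
               unsigned int idle_us, unsigned int max_wake_us)
{
    int best = -1;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].wake_us > max_wake_us)
            continue;       /* would make the system too unresponsive */
        if (tbl[i].enter_us + tbl[i].wake_us > idle_us)
            continue;       /* transition costs more time than we expect to idle */
        if (best < 0 || tbl[i].power_mw < tbl[best].power_mw)
            best = (int)i;
    }
    return best;
}

/* The balancer should not place tasks on a core that is mid-transition. */
int core_accepts_tasks(const struct core_power *c)
{
    return !c->pending;
}
```

With these made-up numbers, an expected idle period of 10000us and a 1000us 
wake-up limit selects "retention", while shrinking the idle period to 100us 
falls back to "wfi". After a transition completes, whoever owns the table would 
swap in a new one reflecting the costs from the new state.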

David Lang