by Catalin Marinas

[permalink] [raw]

Subject: Re: [11/11] system 1: Saving energy using DVFS

On Mon, Jan 20, 2014 at 05:54:32PM +0000, Pavel Machek wrote:
> On Mon 2014-01-20 17:10:29, Catalin Marinas wrote:
> > On Mon, Jan 20, 2014 at 04:49:26PM +0000, Pavel Machek wrote:
> > > > To save energy, the higher frequencies should be avoided and only used
> > > > when the application performance requirements can not be satisfied
> > > > otherwise (e.g. spread tasks across more cpus if possible).
> > >
> > > I argue this is untrue for any task where user waits for its
> > > completion with screen on. (And that's quite important subset).
> > >
> > > Lets take Nokia n900 as an example.
> > >
> > > (source http://wiki.maemo.org/N900_Hardware_Power_Consumption)
> > >
> > > Sleeping CPU: 2mA
> > > Screen on: 230mA
> > > CPU loaded: 250mA
> > >
> > > Now, lets believe your numbers and pretend system can operate at 33%
> > > of speed with 11% power consumption.
> > >
> > > Lets take task that takes 10 seconds on max frequency:
> > >
> > > ~ 10s * 470mA = 4700mAs
> > >
> > > You suggest running at 33% speed, instead; that means 30 seconds on
> > > low requency.
> > >
> > > CPU on low: 25mA (assumed).
> > >
> > > ~ 30s * 255mA = 7650mAs
> > >
> > > Hmm. So race to idle is good thing on Intel machines, and it is good
> > > thing on ARM design I have access to.
> >
> > Race to idle doesn't mean that the screen goes off as well. Let's say
> > the screen stays on for 1 min and the CPU needs to be running for 10s
> > over this minute, in the first case you have:
>
> No, it does not. I just assumed user is continuing to use his
> machine. Obviously, waiting 60 seconds with screen on will make the
> difference look smaller. But your solution still means user has to
> wait longer _and_ you consume more battery doing so.
>
> And this is for any task where user waits for result with screen
> on. Like rendering a webpage. Like opening settings screen. Like
> installing application.

Page rendering should make very little difference to power since the
reading (screen on) time is much larger than the rendering (CPU) time.
But what I'm pointing at for 10s/60s ratios are thing like games or
video playing where the CPU is running for 1/6 of the time and idle for
the other 5/6. We get better energy figures by changing the run time to
3/6 and idle at 3/6.

> There are not too many background tasks on a cellphone.

For sleep time, screen off etc. there are some background tasks but here
the run-time doesn't matter much, it's probably more expensive to take
CPUs out of deep sleep states. What we want to optimise here is which
CPU to wake (like a little vs big).

> But hey, maybe you are right and running at lowest possible frequency
> is right. Please provide concrete numbers like I did.

They've been anonymised (for many reasons) and you have the right not to
trust them. But do you really think we are making up the numbers? We
have a great interest in the Linux scheduler working efficiently on the
ARM platforms rather than optimising it for non-existent scenarios. If
at some point this argument becomes a blocking factor, I'm sure we can
share the real numbers with the relevant parties under an NDA.

--
Catalin

2014-01-20 18:25:41

Hi,

On 20.01.2014 20:25, Sebastian Reichel wrote:
> On Mon, Jan 20, 2014 at 06:54:32PM +0100, Pavel Machek wrote:
>> On Mon 2014-01-20 17:10:29, Catalin Marinas wrote:
>>> On Mon, Jan 20, 2014 at 04:49:26PM +0000, Pavel Machek wrote:
>>>>> To save energy, the higher frequencies should be avoided and only used
>>>>> when the application performance requirements can not be satisfied
>>>>> otherwise (e.g. spread tasks across more cpus if possible).
>>>>
>>>> I argue this is untrue for any task where user waits for its
>>>> completion with screen on. (And that's quite important subset).
>>>>
>>>> Lets take Nokia n900 as an example.
>>>>
>>>> (source http://wiki.maemo.org/N900_Hardware_Power_Consumption)
>>>>
>>>> Sleeping CPU: 2mA
>>>> Screen on: 230mA
>>>> CPU loaded: 250mA
>>>>
>>>> Now, lets believe your numbers and pretend system can operate at 33%
>>>> of speed with 11% power consumption.
>>>>
>>>> Lets take task that takes 10 seconds on max frequency:
>>>>
>>>> ~ 10s * 470mA = 4700mAs
>>>>
>>>> You suggest running at 33% speed, instead; that means 30 seconds on
>>>> low requency.
>>>>
>>>> CPU on low: 25mA (assumed).
>>>>
>>>> ~ 30s * 255mA = 7650mAs
>>>>
>>>> Hmm. So race to idle is good thing on Intel machines, and it is good
>>>> thing on ARM design I have access to.
>>>
>>> Race to idle doesn't mean that the screen goes off as well. Let's say
>>> the screen stays on for 1 min and the CPU needs to be running for 10s
>>> over this minute, in the first case you have:
>>
>> No, it does not. I just assumed user is continuing to use his
>> machine. Obviously, waiting 60 seconds with screen on will make the
>> difference look smaller. But your solution still means user has to
>> wait longer _and_ you consume more battery doing so.
>>
>> And this is for any task where user waits for result with screen
>> on. Like rendering a webpage. Like opening settings screen. Like
>> installing application.
>>
>> There are not too many background tasks on a cellphone.
>>
>> But hey, maybe you are right and running at lowest possible frequency
>> is right. Please provide concrete numbers like I did.
>
> So what about using the display status information for power
> management? Basically always using the lowest frequency should be ok
> on phones if the display is disabled?

Well, not really. There are a looot of devices running linux kernel, and
there are always devices and use cases that can't operate if you
hardcode something like that.

It is good to know what the problematic use cases are, but it usually
does not end well if you optimize for specifics. End users are
unpredictable in ways they utilize their devices :)

And these days audio playback power optimization in smart phones is
mostly for the product spec marketing purposes anyways :P

I think the discussion has been going into right direction:
- find what data we have to make better decisions
- find ways to utilize that data

And then in the end the big smart phone manufacturers will twist that to
their use cases in horrible ways to meet product specs on tight
schedules :D But at least the starting point will be closer to target.

Even on N900 dvfs is beneficial for audio playback. But it was not
because of CPU consuming less, it was because the peripheral bus
frequency was tied to CPU frequency. There we ended up removing a very
low 125MHz cpu operating point so that the device ran at highest
possible 250MHz CPU rate (to idle quickly) while still keeping the lower
peripheral bus speed.

So we raced to idle and used DVFS.. how nice :)

- Kalle

>
> -- Sebastian
>